r - dplyr Time Diff between rows
I have a data frame in the format below, and I'm trying to find the difference in time between each 'assigned' event and the last 'created' event that comes before it.

| accountid | time | event |
|---|---|---|
| 1 | 2016-11-08T01:54:15.000Z | created |
| 1 | 2016-11-09T01:54:15.000Z | assigned |
| 1 | 2016-11-10T01:54:15.000Z | created |
| 1 | 2016-11-11T01:54:15.000Z | called |
| 1 | 2016-11-12T01:54:15.000Z | assigned |
| 1 | 2016-11-12T01:54:15.000Z | sleep |

Currently my code is as follows; the difficulty is selecting the 'created' event that comes before each 'assigned' event:
    test <- timetable.filter %>%
      group_by(accountid) %>%
      mutate(timetoassign = ifelse(event == 'assigned',
                                   interval(ymd_hms(time),
                                            max(ymd_hms(time[event == 'created']))) %/% hours(1),
                                   NA))

I'm looking for the output to be:

| accountid | time | event | timetoassign |
|---|---|---|---|
| 1 | 2016-11-08T01:54:15.000Z | created | NA |
| 1 | 2016-11-09T01:54:15.000Z | assigned | 12 |
| 1 | 2016-11-10T01:54:15.000Z | created | NA |
| 1 | 2016-11-11T01:54:15.000Z | called | NA |
| 1 | 2016-11-12T01:54:15.000Z | assigned | 24 |
| 1 | 2016-11-12T01:54:15.000Z | sleep | NA |
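With dplyr and tidyr, you can record the row index of each 'created' event, carry it forward with fill(), and then subtract the time at that index from each 'assigned' time: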
    library(dplyr); library(tidyr); library(anytime)

    df %>%
      group_by(accountid) %>%
      # mark rows where a 'created' event occurs and parse the timestamps
      mutate(created_index = if_else(event == 'created', row_number(), NA_integer_),
             time = anytime(time)) %>%
      # carry the most recent 'created' row index down to the following rows
      fill(created_index) %>%
      # hours elapsed since that 'created' row, for 'assigned' rows only
      mutate(timetoassign = if_else(event == 'assigned',
                                    as.numeric(time - time[created_index], units = 'hours'),
                                    NA_real_)) %>%
      select(-created_index)

    # A tibble: 6 x 4
    # Groups:   accountid [1]
    #   accountid                time    event timetoassign
    #       <int>              <dttm>   <fctr>        <dbl>
    #1         1 2016-11-08 01:54:15  created           NA
    #2         1 2016-11-09 01:54:15 assigned           24
    #3         1 2016-11-10 01:54:15  created           NA
    #4         1 2016-11-11 01:54:15   called           NA
    #5         1 2016-11-12 01:54:15 assigned           48
    #6         1 2016-11-12 01:54:15    sleep           NA
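If you want to try this end to end, here is a minimal self-contained sketch of the same fill-forward idea. It makes a few assumptions that are not in the posts above: the `df` built here is a hand-typed reconstruction of the question's sample rows, `result` is just an illustrative name, and the timestamps are parsed with base R's `as.POSIXct()` (assuming UTC) instead of `anytime()` so that only dplyr and tidyr are needed.

```r
library(dplyr)
library(tidyr)

# Assumed reconstruction of the sample data from the question
df <- data.frame(
  accountid = 1L,
  time = c("2016-11-08T01:54:15.000Z", "2016-11-09T01:54:15.000Z",
           "2016-11-10T01:54:15.000Z", "2016-11-11T01:54:15.000Z",
           "2016-11-12T01:54:15.000Z", "2016-11-12T01:54:15.000Z"),
  event = c("created", "assigned", "created", "called", "assigned", "sleep"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # parse the ISO 8601 strings; treating them as UTC is an assumption
  mutate(time = as.POSIXct(time, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")) %>%
  group_by(accountid) %>%
  # row index of each 'created' event within the account, NA elsewhere
  mutate(created_index = if_else(event == "created", row_number(), NA_integer_)) %>%
  # carry the latest 'created' index down to the rows that follow it
  fill(created_index) %>%
  # hours since the preceding 'created' event, only for 'assigned' rows
  mutate(timetoassign = if_else(event == "assigned",
                                as.numeric(time - time[created_index], units = "hours"),
                                NA_real_)) %>%
  ungroup() %>%
  select(-created_index)

result$timetoassign
# NA 24 NA NA 48 NA
```

Because the index is filled within each `accountid` group, an 'assigned' row that has no earlier 'created' row in its account should simply end up with `NA`.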