R - Select first entry of a day (date and time) given the respective subject ID
I am trying to sort out multiple entries per day, selecting the first registered entry of each day, per subject ID.
I am handling a big data set; here is a snapshot of the data structure:
    df <- c(contact.id, date.time, age, gender, attendance)

       contact.id           date.time age gender attendance
    1           a 2012-07-06 18:54:48  37   male         30
    2           a 2012-07-06 20:50:18  37   male         30
    3           a 2012-08-14 20:18:44  37   male         30
    4           b 2012-03-15 16:58:15  27 female         40
    5           b 2012-04-18 10:57:02  27 female         40
    6           b 2012-04-18 17:31:22  27 female         40
    7           b 2012-04-18 18:37:00  27 female         40
    8           c 2013-10-22 17:46:07  40   male          5
    9           c 2013-10-27 11:21:00  40   male          5
    10          d 2012-07-28 14:48:33  20 female         12
I have tried a few different things, such as:
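    # Attempt 1: match on unique timestamps
    t.first <- df[match(unique(df$date.time), df$date.time), ]

    # Attempt 2: data.table, keyed by subject
    setDT(df)[, .SD[which.max(df$date.time)], keyby = df$contact.id]

    # Attempt 3: ddply() (note: ddply() is provided by plyr, not dplyr)
    library(dplyr)
    t.first <- ddply(df, "date.time", function(z) tail(z, 1))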
But none of them gives me the first entry of each day for a specific subject ID.
So, in the end, I need to be left with a data set such as:
      contact.id           date.time age gender attendance
    1          a 2012-07-06 18:54:48  37   male         29
    2          a 2012-08-14 20:18:44  37   male         29
    3          b 2012-03-15 16:58:15  27 female         38
    4          b 2012-04-18 10:57:02  27 female         38
    5          c 2013-10-22 17:46:07  40   male          5
    6          c 2013-10-27 11:21:00  40   male          5
    7          d 2012-07-28 14:48:33  20 female         12
Please help if you can; I have been stuck on this for way too long.
Another option is to use dplyr::slice(). Because it keeps exactly one row per group, it prevents duplicates.
    library(dplyr)
    library(lubridate)

    dt2 <- dt %>%
      mutate(date.time = ymd_hms(date.time)) %>%  # parse the timestamp
      mutate(date = as.Date(date.time)) %>%       # calendar day of each entry
      group_by(contact.id, date) %>%
      arrange(date.time) %>%
      slice(1) %>%                                # first entry per subject and day
      ungroup() %>%
      select(-date)
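For comparison, the same idea (group by subject and calendar day, keep the earliest row) can probably be written in base R without extra packages. This is only a sketch: the small data frame is a hypothetical sample modelled on the question's snapshot, the UTC time zone is an assumption, and the names `day`, `ord`, and `first_per_day` are made up for illustration.

```r
# Hypothetical sample modelled on the question's snapshot (values are illustrative)
df <- data.frame(
  contact.id = c("a", "a", "a", "b", "b", "b", "b"),
  date.time  = c("2012-07-06 18:54:48", "2012-07-06 20:50:18",
                 "2012-08-14 20:18:44", "2012-03-15 16:58:15",
                 "2012-04-18 10:57:02", "2012-04-18 17:31:22",
                 "2012-04-18 18:37:00"),
  stringsAsFactors = FALSE
)

# Parse the timestamps; UTC is an assumption, the question states no time zone
df$date.time <- as.POSIXct(df$date.time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Calendar day of each entry
day <- as.Date(df$date.time, tz = "UTC")

# Sort by subject, then by time, so the earliest entry of each day comes first
ord <- order(df$contact.id, df$date.time)
df_sorted  <- df[ord, ]
day_sorted <- day[ord]

# Keep only the first row seen for each (contact.id, day) pair
keep <- !duplicated(data.frame(id = df_sorted$contact.id, day = day_sorted))
first_per_day <- df_sorted[keep, ]
rownames(first_per_day) <- NULL
```

On this sample, `first_per_day` holds four rows: one for each day on which a subject has at least one entry. The sort step matters, since `duplicated()` keeps whichever row appears first.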