R - Select first entry of a day (date and time) given the respective subject ID


I am trying to sort out multiple entries per day, selecting the first registered entry of each day, per subject ID.

I am handling a big data set; here is a snapshot of the data structure:

    df <- c(contact.id, date.time, age, gender, attendance)

        contact.id            date.time  age  gender  attendance
    1            a  2012-07-06 18:54:48   37    male          30
    2            a  2012-07-06 20:50:18   37    male          30
    3            a  2012-08-14 20:18:44   37    male          30
    4            b  2012-03-15 16:58:15   27  female          40
    5            b  2012-04-18 10:57:02   27  female          40
    6            b  2012-04-18 17:31:22   27  female          40
    7            b  2012-04-18 18:37:00   27  female          40
    8            c  2013-10-22 17:46:07   40    male           5
    9            c  2013-10-27 11:21:00   40    male           5
    10           d  2012-07-28 14:48:33   20  female          12

I have tried a few different things, such as:

    t.first <- df[match(unique(df$date.time), df$date.time), ]

    setDT(df)[, .SD[which.max(df$date.time)], keyby = df$contact.id]

    library(dplyr)
    t.first <- ddply(df, "date.time", function(z) tail(z, 1))

But none of them gives me the first entry of each day for a given subject ID.

So I need to be left, at the end, with a data set such as:

       contact.id            date.time  age  gender  attendance
    1           a  2012-07-06 18:54:48   37    male          29
    2           a  2012-08-14 20:18:44   37    male          29
    3           b  2012-03-15 16:58:15   27  female          38
    4           b  2012-04-18 10:57:02   27  female          38
    5           c  2013-10-22 17:46:07   40    male           5
    6           c  2013-10-27 11:21:00   40    male           5
    7           d  2012-07-28 14:48:33   20  female          12

Please help if you can; I have been stuck on this for way too long.

Another option is to use dplyr::slice(), which also prevents duplicates.

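    library(dplyr)
    library(lubridate)

    dt2 <- dt %>%                                  # dt: the data frame from the question
      mutate(date.time = ymd_hms(date.time)) %>%   # parse the timestamps
      mutate(date = as.Date(date.time)) %>%        # calendar day of each entry
      group_by(contact.id, date) %>%               # one group per subject and day
      arrange(date.time) %>%                       # earliest entry first
      slice(1) %>%                                 # keep the first row of each group
      ungroup() %>%
      select(-date)                                # drop the helper column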
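For anyone who wants to try the idea without extra packages, here is a minimal base R sketch of the same "first row per subject and day" logic. It is only an illustration under a few assumptions: the sample rows are copied from the question, the id "a" for the first three rows is assumed (it is missing in the snapshot), the timestamps are treated as UTC, the age, gender and attendance columns are left out for brevity, and the name first_per_day is made up for this example.

```r
# Sample data mirroring the question's snapshot.
# ASSUMPTION: the first three rows belong to a subject called "a".
df <- data.frame(
  contact.id = c("a", "a", "a", "b", "b", "b", "b", "c", "c", "d"),
  date.time  = c("2012-07-06 18:54:48", "2012-07-06 20:50:18",
                 "2012-08-14 20:18:44", "2012-03-15 16:58:15",
                 "2012-04-18 10:57:02", "2012-04-18 17:31:22",
                 "2012-04-18 18:37:00", "2013-10-22 17:46:07",
                 "2013-10-27 11:21:00", "2012-07-28 14:48:33"),
  stringsAsFactors = FALSE
)

# Parse the timestamps (ASSUMPTION: UTC) and derive the calendar day.
df$date.time <- as.POSIXct(df$date.time, tz = "UTC")
df$date      <- as.Date(df$date.time, tz = "UTC")

# Sort by subject and time so the earliest entry of each day comes first,
# then drop every later row that repeats a (subject, day) pair.
df <- df[order(df$contact.id, df$date.time), ]
first_per_day <- df[!duplicated(df[c("contact.id", "date")]), ]

# Remove the helper column again.
first_per_day$date <- NULL
```

Because duplicated() marks only the second and later occurrences of each (contact.id, date) pair, exactly one row per subject and day survives, even if two entries share the same timestamp.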
