dataframe - R data.table: How to sum variables by group based on a condition?


Let's say I have the following R data.table (though I'm happy to work with base R or a data.frame as well):

    library(data.table)

    dt = data.table(category = c("first", "first", "first", "second", "third", "third", "second"),
                    frequency = c(10, 15, 5, 2, 14, 20, 3),
                    times = c(0, 0, 0, 3, 3, 1, 0))

    > dt
       category frequency times
    1:    first        10     0
    2:    first        15     0
    3:    first         5     0
    4:   second         2     3
    5:    third        14     3
    6:    third        20     1
    7:   second         3     0

If I wished to sum the frequencies by category, I would use the following:

    dt[, sum(frequency), by = category]

However, let's say I wanted to sum frequency by category if, and only if, times is non-zero and not equal to NA?

How does one make the sum conditional based on the values of a separate column?

Edit: apologies for the obvious question. A quick addition: what if the elements of a column are strings?

e.g.

    > dt
       category frequency times
    1:    first       ten     0
    2:    first       ten     0
    3:    first         5     0
    4:   second         5     3
    5:    third         5     3
    6:    third         5     1
    7:   second       ten     0

sum() would not calculate the frequencies of ten versus five.

Remember the logic of data.table: dt[i, j, by], that is, take dt, subset the rows using i, then calculate j grouped by by.

    dt[times != 0 & !is.na(times), sum(frequency), by = category]
       category V1
    1:   second  2
    2:    third 34
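
The follow-up about string values is not covered by the answer above. One possible approach, offered only as a sketch, is to count rows with data.table's `.N` instead of summing, and to add the string column to `by`. The table name `dt_str` and the result name `counts` are hypothetical, and the data simply mirrors the edited example from the question.

```r
library(data.table)

# Hypothetical data mirroring the edited question: frequency holds strings.
dt_str <- data.table(
  category  = c("first", "first", "first", "second", "third", "third", "second"),
  frequency = c("ten", "ten", "5", "5", "5", "5", "ten"),
  times     = c(0, 0, 0, 3, 3, 1, 0)
)

# i:  keep rows where times is non-zero and not NA
# j:  .N is the number of rows in each group
# by: group by category and by the string value itself
counts <- dt_str[times != 0 & !is.na(times), .N, by = .(category, frequency)]
print(counts)
```

Dropping the `i` condition (`dt_str[, .N, by = .(category, frequency)]`) should give the unconditional counts of each string per category.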
