R - Adding values allowing a maximum number of NA values using dplyr
I have this:

    df = data.frame(month=rep(1:3,3), year=rep(1998:2000,each=3),
                    a=c(NA,3,2,rep(NA,2),4,4,5,NA),
                    b=c(NA,4,5,rep(NA,4),5,6),
                    c=c(10,rep(NA,3),2:4,rep(NA,2)))
    > head(df)
      month year  a  b  c
    1     1 1998 NA NA 10
    2     2 1998  3  4 NA
    3     3 1998  2  5 NA
    4     1 1999 NA NA NA
    5     2 1999 NA NA  2
    6     3 1999  4 NA  3
And I want this:

      year  a  b  c
    1 1998  5  9 NA
    2 1999 NA NA  5
    3 2000  9 11 NA
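The above means that the sum function allows at most 1 NA value per year: a year with zero or one NA in a column is summed over its remaining values, and a year with two or more NAs gives NA.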
As a first attempt, I tried:

    library(dplyr)
    df %>%
      group_by(year) %>%
      summarise_all(function(x) sum(x, na.rm=TRUE))
But I got the following output, because I wrote na.rm=TRUE:

      year  a  b  c
    1 1998  5  9 10
    2 1999  4  0  5
    3 2000  9 11  4
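To see where the 4 and the 0 in the 1999 row come from, here is a minimal sketch of what sum does with na.rm=TRUE on vectors like the ones in that group. The variable names are only for illustration.

```r
# With na.rm = TRUE the NAs are dropped before summing, so the number
# of missing values in a group no longer affects whether a number comes back.
one_na <- sum(c(NA, 3, 2), na.rm = TRUE)    # 3 + 2
two_na <- sum(c(NA, NA, 4), na.rm = TRUE)   # only the 4 is left
all_na <- sum(c(NA, NA, NA), na.rm = TRUE)  # nothing left: the empty sum is 0

print(c(one_na, two_na, all_na))
```

So na.rm is all-or-nothing: it cannot express "tolerate one NA but not two".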
My question is: how can I pass a maximum number of NA values to the sum function in order to get the intended data frame?
I more or less solved it with complicated for and if loops, but I wonder if it can be done using vectorized functions.

Any thoughts?
    library(dplyr)

    df <- data.frame(month=rep(1:3,3), year=rep(1998:2000,each=3),
                     a=c(NA,3,2,rep(NA,2),4,4,5,NA),
                     b=c(NA,4,5,rep(NA,4),5,6),
                     c=c(10,rep(NA,3),2:4,rep(NA,2)))

    # Per year and column: return NA when more than half of the values
    # are missing, otherwise sum the non-missing values.
    df <- df %>%
      group_by(year) %>%
      summarise_all(function(x) ifelse(2 * sum(is.na(x)) > length(x), NA, sum(x, na.rm=TRUE)))

    # month was summed as well; drop it
    df$month <- NULL
    as.data.frame(df)
    #   year  a  b  c
    # 1 1998  5  9 NA
    # 2 1999 NA NA  5
    # 3 2000  9 11 NA
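Note that the condition 2 * sum(is.na(x)) > length(x) is a relative threshold (more than half of the group missing). It coincides with "at most one NA" here only because every year has exactly three rows. If an absolute cap is wanted, a minimal sketch could look like the following. The helper sum_max_na and its argument max_na are hypothetical names, not part of base R or dplyr, and the grouping uses base R's aggregate so the sketch runs without extra packages.

```r
# Hypothetical helper: sum x, but return NA as soon as more than
# max_na values are missing.
sum_max_na <- function(x, max_na = 1) {
  if (sum(is.na(x)) > max_na) NA_real_ else sum(x, na.rm = TRUE)
}

df <- data.frame(month = rep(1:3, 3), year = rep(1998:2000, each = 3),
                 a = c(NA, 3, 2, rep(NA, 2), 4, 4, 5, NA),
                 b = c(NA, 4, 5, rep(NA, 4), 5, 6),
                 c = c(10, rep(NA, 3), 2:4, rep(NA, 2)))

# Group by year and apply the helper to each value column;
# extra arguments (max_na) are forwarded to FUN.
res <- aggregate(df[c("a", "b", "c")], by = list(year = df$year),
                 FUN = sum_max_na, max_na = 1)
res
#   year  a  b  c
# 1 1998  5  9 NA
# 2 1999 NA NA  5
# 3 2000  9 11 NA
```

With dplyr, the same helper should be usable as df %>% group_by(year) %>% summarise_all(sum_max_na, max_na = 1), since summarise_all forwards additional arguments to the function. As in the answer above, the summed month column would then have to be dropped afterwards.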