r - How to properly return character values for dplyr's do?
Consider the following code:

    library(dplyr)

    foo <- function() {
      if (runif(1) < 0.5) {
        return(data.frame(result = "low"))
      } else {
        return(data.frame(result = "high"))
      }
    }

    df <- data.frame(val = c(1, 2, 3, 4, 5, 6))
    df %>% group_by(val) %>% do(foo())

The result is random, but if both "low" and "high" results are returned, you'll see warnings like this:

    Warning messages:
    1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character
    2: In bind_rows_(x, .id) :
      binding character and factor vector, coercing into character vector
    3: In bind_rows_(x, .id) :
      binding character and factor vector, coercing into character vector
    4: In bind_rows_(x, .id) :
      binding character and factor vector, coercing into character vector
    5: In bind_rows_(x, .id) :
      binding character and factor vector, coercing into character vector

I believe the first value being returned (say, "low") is converted to a factor with one level, and when the other level comes along, it incurs dplyr's wrath.
What is the proper way to code this example to avoid the warnings?
Edit: one solution is this:

    foo <- function() {
      if (runif(1) < 0.5) {
        return(data.frame(result = factor("low", levels = c("low", "high"))))
      } else {
        return(data.frame(result = factor("high", levels = c("low", "high"))))
      }
    }

But what if I don't know the factor levels ahead of time?
Also, more fundamentally, I'd like to return a character vector, not a factor.
Either:
- use stringsAsFactors=FALSE: return(data.frame(..., stringsAsFactors=FALSE))
or:
- use data_frame: return(data_frame(...))
See ?data.frame for more on how factors are treated.
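
As a minimal sketch of the first option applied to the example from the question (assuming dplyr is installed; the `label` variable and the `set.seed` call are only there for illustration), something like this should return a character column and bind without warnings. Note that from R 4.0.0 onward `stringsAsFactors` already defaults to FALSE, so the explicit argument mainly matters on older R versions.

```r
library(dplyr)

# Return a character column rather than a factor.
# stringsAsFactors = FALSE is spelled out here; it is the default from R 4.0.0 on.
foo <- function() {
  label <- if (runif(1) < 0.5) "low" else "high"
  data.frame(result = label, stringsAsFactors = FALSE)
}

df <- data.frame(val = c(1, 2, 3, 4, 5, 6))

set.seed(1)  # only to make the random labels reproducible
out <- df %>% group_by(val) %>% do(foo())

class(out$result)  # "character"
```

Because every group now returns a character column, there are no factor levels to reconcile when the per-group results are bound together, which also covers the case where the possible values are not known ahead of time.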