r - How to properly return character values for dplyr's do?
Consider the following code:
library(dplyr)

# Returns a one-row data frame labelled "low" or "high" at random
foo <- function() {
  if (runif(1) < 0.5) {
    return(data.frame(result="low"))
  } else {
    return(data.frame(result="high"))
  }
}

df = data.frame(val=c(1,2,3,4,5,6))

df %>% group_by(val) %>% do(foo())
The result is random, but if both "low" and "high" results are returned, you'll see warnings like this:
Warning messages:
1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character
2: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector
3: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector
4: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector
5: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector
I believe the first value being returned (say, "low") is converted to a factor with one level, and when the other level comes along, it incurs dplyr's wrath.
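That guess can be checked in base R without dplyr. The sketch below is only an illustration: it sets stringsAsFactors = TRUE explicitly to mimic the default of R versions before 4.0.0, and the names low and high are made up for the example.

```r
# Build the two possible return values of foo() as factor columns.
# stringsAsFactors = TRUE mimics the default in R < 4.0.0 (an assumption
# about the R version used when the warnings appear).
low  <- data.frame(result = "low",  stringsAsFactors = TRUE)
high <- data.frame(result = "high", stringsAsFactors = TRUE)

# Each column is a factor that knows only about its own single level.
levels(low$result)
levels(high$result)

# The level sets differ, which is the mismatch the warnings complain about.
identical(levels(low$result), levels(high$result))
```

Each call produces a factor with a single level, so the pieces being bound together disagree about which levels exist.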
What is the proper way to code this example to avoid the warnings?
Edit: one solution is this:
foo <- function() {
  if (runif(1) < 0.5) {
    return(data.frame(result=factor("low", levels=c("low", "high"))))
  } else {
    return(data.frame(result=factor("high", levels=c("low", "high"))))
  }
}
But what if I don't know the factor levels ahead of time?
Also, more fundamentally, I'd like to return a character vector, not a factor.
Either:
- use stringsAsFactors=FALSE: return(data.frame(..., stringsAsFactors=FALSE))
or:
- use data_frame: return(data_frame(...))
See ?data.frame for more on how factors are treated.
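Applied to the example from the question, the first option might look like the sketch below. It assumes dplyr is installed and reuses the question's foo and df. The names label and out are introduced here for illustration only.

```r
library(dplyr)

# Return the label as a character column instead of a factor.
foo <- function() {
  label <- if (runif(1) < 0.5) "low" else "high"
  data.frame(result = label, stringsAsFactors = FALSE)
}

df <- data.frame(val = c(1, 2, 3, 4, 5, 6))

out <- df %>%
  group_by(val) %>%
  do(foo())

# The combined column is character whichever labels were drawn,
# so there are no factor levels to reconcile when the rows are bound.
class(out$result)
```

Because every piece already holds a character column, this does not require knowing the possible values ahead of time. Note that since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so passing the argument explicitly mainly matters on older versions.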