dplyr - Split a column into multiple fields using R
I have a column in a CSV file named "features". The field has data in this format:
{""air conditioning"",""elevator"",""smoke detector""}
{""air conditioning"",""railing lights"",""smoke detector""}
{""air conditioning"",""washer"",""dryer"",""smoke detector""}
There are 20000 records, and the strings inside the "features" field are not in any particular order.
How can I split them into different columns in such a way that "air conditioning" falls under the 1st column, "elevators" under the 2nd, and so on?
a                 b          c               d
air conditioning  elevators  smokedetectors
air conditioning  elevators  smokedetectors  washer
air conditioning  elevators  smokedetectors  washer
A combination of separate from tidyr and mutate_at from dplyr (with gsub thrown in):
dfr <- data.frame(features = c('{""air conditioning"",""elevator"",""smoke detector""}',
                               '{""air conditioning"",""railing lights"",""smoke detector""}',
                               '{""air conditioning"",""washer"",""dryer"",""smoke detector""}'))

library(tidyr)
library(dplyr)

# remove {, }, and quotes (")
fix_txt <- function(x) gsub("[{]\"|\"|[}]", "", x)

separate(dfr, features, c("a", "b", "c"), sep = ",", extra = "merge") %>%
  mutate_at(vars(a:c), fix_txt)
gives
                 a              b                    c
1 air conditioning       elevator       smoke detector
2 air conditioning railing lights       smoke detector
3 air conditioning         washer dryer,smoke detector
Note that the extra fields are merged into the last column (as in the third record). Look at ?separate for more options.
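One of those options is fill. A minimal sketch, assuming the longest record has four features (as in the sample data above): asking for four columns with fill = "right" pads the shorter records with NA instead of merging the overflow into the last column. The column name d and the object name res are only for illustration.

```r
library(tidyr)
library(dplyr)

# Same sample data as above; "" inside single quotes is two literal double quotes
dfr <- data.frame(
  features = c('{""air conditioning"",""elevator"",""smoke detector""}',
               '{""air conditioning"",""railing lights"",""smoke detector""}',
               '{""air conditioning"",""washer"",""dryer"",""smoke detector""}'),
  stringsAsFactors = FALSE
)

# remove {, } and quotes (")
fix_txt <- function(x) gsub("[{}\"]", "", x)

# Four target columns (assumed maximum number of features per record);
# fill = "right" puts NA on the right when a record has fewer pieces
res <- separate(dfr, features, c("a", "b", "c", "d"), sep = ",", fill = "right") %>%
  mutate_at(vars(a:d), fix_txt)

print(res)
```

Here the third record ends up with washer, dryer and smoke detector in b, c and d, and the first two records have NA in d. This still splits by position, so the same feature is not guaranteed to land in the same column across records.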