dplyr - Split column to multiple fields using R -


I have a CSV file with a column named "features". Each field holds data in this format:

{""air conditioning"",""elevator"",""smoke detector""}
{""air conditioning"",""railing lights"",""smoke detector""}
{""air conditioning"",""washer"",""dryer"",""smoke detector""}

There are 20,000 records, and the strings inside the "features" field are not in any particular order.

How can I split them into different columns so that "air conditioning" falls under the 1st column, "elevators" under the 2nd, and so on?

a                 b          c               d
air conditioning  elevators  smokedetectors
air conditioning  elevators  smokedetectors  washer
air conditioning  elevators  smokedetectors  washer

A combination of separate from tidyr and mutate_at from dplyr (with gsub thrown in):

dfr <- data.frame(features = c('{""air conditioning"",""elevator"",""smoke detector""}',
                               '{""air conditioning"",""railing lights"",""smoke detector""}',
                               '{""air conditioning"",""washer"",""dryer"",""smoke detector""}'))

library(tidyr)
library(dplyr)

# remove {, } and quotes (")
fix_txt <- function(x) gsub("[{]\"|\"|[}]", "", x)

# split on commas into three columns; anything beyond the third is merged into "c"
separate(dfr, features, c("a", "b", "c"), sep = ",", extra = "merge") %>%
  mutate_at(vars(a:c), fix_txt)

This gives:

                 a              b                    c
1 air conditioning       elevator       smoke detector
2 air conditioning railing lights       smoke detector
3 air conditioning         washer dryer,smoke detector

Note that extra fields are merged into the last column (as in the third record); look at ?separate for more options.
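Because separate() fills columns by position, the same feature can land in different columns when the order varies between records. If the goal is one column per distinct feature, a rough sketch in base R (no packages needed) could look like the following. It is only an illustration under stated assumptions: the names clean, parts, all_features and wide are made up for this example, and it assumes that no feature name contains a comma.

```r
# Sample data in the same format as the question
dfr <- data.frame(
  features = c('{""air conditioning"",""elevator"",""smoke detector""}',
               '{""air conditioning"",""railing lights"",""smoke detector""}',
               '{""air conditioning"",""washer"",""dryer"",""smoke detector""}'),
  stringsAsFactors = FALSE
)

# Strip braces and quotes, then split each record on commas
clean <- gsub('[{}"]', "", dfr$features)
parts <- strsplit(clean, ",", fixed = TRUE)

# Every distinct feature, in order of first appearance
all_features <- unique(unlist(parts))

# One row per record, one column per feature:
# the feature name where present, NA where absent
wide <- t(vapply(
  parts,
  function(p) ifelse(all_features %in% p, all_features, NA_character_),
  FUN.VALUE = character(length(all_features))
))
colnames(wide) <- all_features
wide <- as.data.frame(wide, stringsAsFactors = FALSE)

print(wide)
```

With this layout "air conditioning" always sits in the same column regardless of where it appeared in the original string, and records lacking a feature get NA in that column.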

