R - Aggregate values for all combinations of factor levels, including missing ones


I'm trying to find the minimum value in a data frame based on multiple columns. I'm able to do this using the aggregate function, as shown below. However, the result does not contain the combinations of factors for which there is no data in the input data frame.

What I've got:

# All possible fruits, cities, and vegetables:
fruits <- c('apple', 'banana', 'grape')
cities <- c('new york', 'chicago', 'los angeles')
vegetables <- c('cucumber', 'mushroom')

# My input (i.e., a sample test):
inputdf <- data.frame(
  fruit = c('apple', 'apple', 'apple', 'banana', 'banana', 'banana', 'grape', 'grape', 'grape'),
  city = c('new york', 'new york', 'new york', 'new york', 'chicago', 'los angeles', 'chicago', 'chicago', 'chicago'),
  vegetable = c('cucumber', 'cucumber', 'mushroom', 'cucumber', 'mushroom', 'mushroom', 'cucumber', 'cucumber', 'cucumber'),
  value = c(5, 3, 4, 6, 5, 7, 2, 7, 4)
)

# My aggregation:
outdf <- aggregate(value ~ fruit + city + vegetable, inputdf, function(x) min(x))

The output is:

fruit   city        vegetable   value
grape   chicago     cucumber    2
apple   new york    cucumber    3
banana  new york    cucumber    6
banana  chicago     mushroom    5
banana  los angeles mushroom    7
apple   new york    mushroom    4

This is correct; however, I also want rows for the combinations of columns that didn't exist at all in the input df:

fruit   city        vegetable   value
apple   new york    cucumber    3
apple   new york    mushroom    4
apple   chicago     cucumber    NA
apple   chicago     mushroom    NA
apple   los angeles cucumber    NA
apple   los angeles mushroom    NA
banana  new york    cucumber    6
banana  new york    mushroom    NA
banana  chicago     cucumber    NA
banana  chicago     mushroom    5
banana  los angeles cucumber    NA
banana  los angeles mushroom    7
grape   new york    cucumber    NA
grape   new york    mushroom    NA
grape   chicago     cucumber    2
grape   chicago     mushroom    NA
grape   los angeles cucumber    NA
grape   los angeles mushroom    NA

I'd like to be able to do this for any number of columns to combine on. Is there a simple way to do that? The reason I want this output is that I need to transform the NAs to a specific value and then average the values over the same subsets again. Thanks!

You can do this using expand.grid to generate all the combinations, then using merge:

outdf <- aggregate(value ~ fruit + city + vegetable, inputdf, function(x) min(x))
# expand.grid names unnamed columns Var1, Var2, Var3
df <- expand.grid(fruits, cities, vegetables)
# all.y = TRUE keeps every row of df, so combinations with no data get NA
outdf <- merge(outdf, df, by.x = c('fruit', 'city', 'vegetable'), by.y = c('Var1', 'Var2', 'Var3'), all.y = TRUE)

> outdf
    fruit        city vegetable value
1   apple     chicago  cucumber    NA
2   apple     chicago  mushroom    NA
3   apple los angeles  cucumber    NA
4   apple los angeles  mushroom    NA
5   apple    new york  cucumber     3
6   apple    new york  mushroom     4
7  banana     chicago  cucumber    NA
8  banana     chicago  mushroom     5
9  banana los angeles  cucumber    NA
10 banana los angeles  mushroom     7
11 banana    new york  cucumber     6
12 banana    new york  mushroom    NA
13  grape     chicago  cucumber     2
14  grape     chicago  mushroom    NA
15  grape los angeles  cucumber    NA
16  grape los angeles  mushroom    NA
17  grape    new york  cucumber    NA
18  grape    new york  mushroom    NA
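
For any number of grouping columns, the same expand.grid plus merge idea could be wrapped in a small function. This is only a sketch: the helper name aggregate_all_combos and its arguments are made up for illustration (they are not part of base R or of the answer above), and it assumes the names of the levels list match the column names of the data frame.

```r
# Sample data from the question (characters kept as characters).
inputdf <- data.frame(
  fruit = c('apple', 'apple', 'apple', 'banana', 'banana', 'banana', 'grape', 'grape', 'grape'),
  city = c('new york', 'new york', 'new york', 'new york', 'chicago', 'los angeles', 'chicago', 'chicago', 'chicago'),
  vegetable = c('cucumber', 'cucumber', 'mushroom', 'cucumber', 'mushroom', 'mushroom', 'cucumber', 'cucumber', 'cucumber'),
  value = c(5, 3, 4, 6, 5, 7, 2, 7, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: aggregate `value_col` for every combination of the
# supplied levels, keeping combinations that have no data (value NA).
# `levels_list` is a named list; its names are assumed to be columns of `df`.
aggregate_all_combos <- function(df, levels_list, value_col, fun = min) {
  group_cols <- names(levels_list)
  # Aggregate only the combinations that actually occur in the data.
  agg <- aggregate(df[value_col], by = df[group_cols], FUN = fun)
  # Build every possible combination; columns are named after the list.
  grid <- expand.grid(levels_list, stringsAsFactors = FALSE)
  # Keep all rows of the grid so that missing combinations show up as NA.
  merge(grid, agg, by = group_cols, all.x = TRUE)
}

all_levels <- list(
  fruit = c('apple', 'banana', 'grape'),
  city = c('new york', 'chicago', 'los angeles'),
  vegetable = c('cucumber', 'mushroom')
)

result <- aggregate_all_combos(inputdf, all_levels, "value", min)
nrow(result)              # 18 rows: 3 fruits x 3 cities x 2 vegetables
sum(is.na(result$value))  # 12 combinations have no data

# Follow-up step from the question: replace NA with a chosen value
# (0 here is an arbitrary placeholder) and average per fruit.
filled <- result
filled$value[is.na(filled$value)] <- 0
aggregate(value ~ fruit, filled, mean)
```

Because the grid is built from a named list, adding a fourth grouping column only means adding another entry to all_levels; the function body stays the same.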
