dataframe - Specific group rankings in R


I have a data frame with "category", "id" and "score(t)", and I want "rank(t)":

category    id          score.08.2007   score.09.2007   rank.08.2007   rank.09.2007   ...
orange      fsgbr070n3  0.16            ...             5              ...
orange      fsgbr070n3  0.05            ...             7              ...
orange      fsgbr070n3  0.11                            6
orange      fs00008l4g  0.28                            1
orange      fs00008vld  0.27                            2
orange      fs00008vld  0.27                            2
orange      fs00008vld  0.27                            2
orange      fs00009sqx  -2.03                           8
orange      fs00009sqx  NA
orange      fsusa0a1kw  NA
orange      fsusa0a1kw  NA
orange      fsusa0a1kx  NA
orange      fsusa0a1ky  NA
orange      fs0000b389  NA
banana      fs000092gp  96.25                           1
banana      fs000092gp  96.25                           1
banana      fs000092gp  96.25                           1
banana      fs000092gp  52.33                           4
banana      fs0000atln  31.73                           5
banana      fsusa0avmf  1.38                            7
banana      fsgbr058o8  1.37                            8
banana      fsgbr05845  2.24                            6

The ranking is based on a descending sort of "score" within each "category". An additional specification, which I struggle to capture, is this: when there are identical scores (and identical ids), the next score with a different value is assigned a rank equal to the rank of the previous id plus the number of ids that shared the same score (the rank output column in the example should make this clear).

NAs should receive no ranking:

na.last = NA
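The tie rule described above reads like "minimum" tie ranking applied to the negated score, which base R's rank() supports. Here is a minimal sketch, assuming the orange scores from the example are held in a plain numeric vector; the names orange_scores and rank_desc are made up for illustration. One detail worth checking: in rank(), na.last = NA removes the NAs from the result, while na.last = "keep" leaves them in place with rank NA, so the output stays aligned with the rows.

```r
# Scores of the "orange" category from the example (only two of the NA rows shown)
orange_scores <- c(0.16, 0.05, 0.11, 0.28, 0.27, 0.27, 0.27, -2.03, NA, NA)

# Hypothetical helper: rank in descending order, tied scores share the lowest
# rank, and NAs stay NA instead of being ranked or dropped
rank_desc <- function(x) rank(-x, ties.method = "min", na.last = "keep")

orange_ranks <- rank_desc(orange_scores)
print(orange_ranks)
```

For these values the result is 5 7 6 1 2 2 2 8 NA NA, which matches the rank column in the example.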

I have started by creating a matrix for the ranks, and I know I need sort(), but I struggle to capture the time series and the additional specification... I couldn't find such a specific existing question either. Any help is appreciated!

    time_series <- c("08.2007","09.2007","10.2007",...)
    abs_ranks_mat <- as.data.frame(mat.or.vec(nrow(id),length(time_series)))
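Instead of pre-allocating a rank matrix, one option is to loop over the score columns and rank within each category. This is a base R sketch, assuming the score columns are named score.<month> as in the example. The toy data frame, its ids and the second month's values are invented for illustration only.

```r
# Toy data: ids and score.09.2007 values are invented for illustration only
df <- data.frame(
  category      = c("orange", "orange", "orange", "banana", "banana", "banana"),
  id            = c("a", "b", "c", "d", "e", "f"),
  score.08.2007 = c(0.28, 0.27, 0.27, 96.25, 1.38, NA),
  score.09.2007 = c(0.10, NA, 0.30, 2.00, 2.00, 1.00),
  stringsAsFactors = FALSE
)

# One rank.<month> column per score.<month> column
score_cols <- grep("^score\\.", names(df), value = TRUE)
rank_cols  <- sub("^score", "rank", score_cols)

for (i in seq_along(score_cols)) {
  # ave() applies the function within each category and returns the
  # results in the original row order
  df[[rank_cols[i]]] <- ave(
    df[[score_cols[i]]], df$category,
    FUN = function(x) rank(-x, ties.method = "min", na.last = "keep")
  )
}

print(df)
```

Because ave() preserves row order, no sort() or later merge is needed. Rows with an NA score simply get an NA rank.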

Here is a solution that uses dplyr. df is the example data frame from @trosendal's example. df3 is the final output.

The key is to use the min_rank function to create the rank. mutate_at allows us to specify which columns we do or do not want to rank. After that, we can change the column names and merge the result with the original data frame.

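    library(dplyr)

    # Add a row id so the ranks can be joined back to the original rows
    df <- df %>% mutate(rowid = 1:n())

    # Rank every score column in descending order within each category
    df2 <- df %>%
      group_by(category) %>%
      mutate_at(vars(-id, -rowid), ~ min_rank(desc(.))) %>%
      ungroup() %>%
      select(-category, -id) %>%
      # Rename score.* to rank.*; rowid is left unchanged
      setNames(., gsub("score", "rank", colnames(.)))

    # Attach the rank columns to the original data frame
    df3 <- df %>%
      left_join(df2, by = "rowid") %>%
      select(-rowid)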
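In current dplyr releases mutate_at() is superseded in favour of across(), which can also create the new columns directly and so avoids the rename and join steps. The following is a sketch of the same idea, assuming dplyr 1.0 or later. The toy data frame is invented for illustration, and the .names glue expression is only one possible way to derive the rank.* names.

```r
library(dplyr)

# Toy data, invented for illustration only
df <- data.frame(
  category      = c("orange", "orange", "orange", "banana", "banana"),
  id            = c("a", "b", "c", "d", "e"),
  score.08.2007 = c(0.28, 0.27, 0.27, 96.25, NA),
  stringsAsFactors = FALSE
)

df3 <- df %>%
  group_by(category) %>%
  mutate(across(
    starts_with("score"),
    ~ min_rank(desc(.x)),                     # ties share the lowest rank, NA stays NA
    .names = "{sub('score', 'rank', .col)}"   # score.08.2007 -> rank.08.2007
  )) %>%
  ungroup()

print(df3)
```

The score columns are kept and each rank.* column is added next to them.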
