r - transposing a dataframe with repeats

August 15, 2015

I have a data frame that has 2 columns: one with gene symbols and one with functional pathways. The pathways column has repeated values, since a number of genes belong to each pathway. I would like to reorder this dataset so that each column is a single pathway, and each row in those columns is a gene that belongs to that pathway.

starting dataframe:

data.frame(pathway = c("p1", "p1", "p1", "p1", "p2", "p2", "p2"),  gene.symbol = c("g1", "g2", "g3", "g4", "g33", "g43", "g10"))

desired dataframe:

data.frame(p1 = c("g1", "g2", "g3", "g4"), p2 = c("g33", "g43", "g10",  ""))

I know that not all of the columns will be the same length, and having blank values would be preferable to NAs.

Here is one option.

1. Split the data into a list, using pathway as the splitting element.
2. Get the max length across the groups, and set the other groups to the same length.
3. Turn the list into a data frame.

Here is the code.

mydf <- data.frame(pathway = c("p1", "p1", "p1", "p1", "p2", "p2", "p2"),
                   gene.symbol = c("g1", "g2", "g3", "g4", "g33", "g43", "g10"))

# function to run on each element of the list;
# it relies on max.length, which is defined in step 2.a below
set_to_max_length <- function(x) {
  length(x) <- max.length
  return(x)
}

# 1. split into a list
mydf.split <- split(mydf$gene.symbol, mydf$pathway)

# 2.a get the max length of the columns
max.length <- max(sapply(mydf.split, length))

# 2.b set each list element to the max length (shorter ones are padded with NA)
mydf.split.2 <- lapply(mydf.split, set_to_max_length)

# 3. combine into a df
data.frame(mydf.split.2)
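The code above pads the shorter column with NA, while the question asks for blanks. A minimal base R sketch of one way to pad with empty strings instead is shown here. The helper name pad_with_blanks is hypothetical, and stringsAsFactors = FALSE is an assumption so that the gene symbols are plain character vectors rather than factors.

```r
mydf <- data.frame(
  pathway = c("p1", "p1", "p1", "p1", "p2", "p2", "p2"),
  gene.symbol = c("g1", "g2", "g3", "g4", "g33", "g43", "g10"),
  stringsAsFactors = FALSE  # assumption: keep gene symbols as character
)

# split the gene symbols into one character vector per pathway
genes_by_pathway <- split(mydf$gene.symbol, mydf$pathway)

# length of the longest group
max_len <- max(lengths(genes_by_pathway))

# hypothetical helper: append "" until the vector has n elements
pad_with_blanks <- function(x, n) {
  c(x, rep("", n - length(x)))
}

# pad every group and bind the groups together as columns
wide_blank <- data.frame(
  lapply(genes_by_pathway, pad_with_blanks, n = max_len),
  stringsAsFactors = FALSE
)
```

Passing n as an argument avoids relying on a global variable inside the helper, which is the main design difference from the function above.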

Edit

Here is another option using the tidyverse, which is more succinct:

library(tidyverse)

mydf <- data.frame(pathway = c("p1", "p1", "p1", "p1", "p2", "p2", "p2"),
                   gene.symbol = c("g1", "g2", "g3", "g4", "g33", "g43", "g10"))

mydf %>%
  group_by(pathway) %>%
  mutate(rownum = row_number()) %>%
  ungroup() %>%
  spread(pathway, gene.symbol) %>%
  select(-1)
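In newer versions of tidyr, spread() is superseded by pivot_wider(), whose values_fill argument can supply the blanks the question asks for. The following is a sketch of that variant, not part of the original answer. It assumes tidyr 1.1.0 or later (where values_fill accepts a single value) and character gene symbols (stringsAsFactors = FALSE).

```r
library(dplyr)
library(tidyr)

mydf <- data.frame(
  pathway = c("p1", "p1", "p1", "p1", "p2", "p2", "p2"),
  gene.symbol = c("g1", "g2", "g3", "g4", "g33", "g43", "g10"),
  stringsAsFactors = FALSE  # assumption: keep gene symbols as character
)

wide <- mydf %>%
  group_by(pathway) %>%
  mutate(rownum = row_number()) %>%  # position of each gene within its pathway
  ungroup() %>%
  pivot_wider(
    names_from = pathway,
    values_from = gene.symbol,
    values_fill = ""                 # blanks instead of NA for shorter pathways
  ) %>%
  select(-rownum)                    # drop the helper column
```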
