loops - Iteratively concatenating and labelling data frames in R
I'm trying to write a loop in R that reads in a list of filenames from a directory, turns them into data frames, and concatenates them into one large data frame, while adding an identifier to each data frame so that I know which file the data came from when plotting. So far, I have a loop that runs a function which appends each data frame to an empty data frame that I initialise beforehand. It looks like this:
    filenames <- list.files(path="reads/metrics", pattern="*.txt", all.files=T, recursive=FALSE, full.names = TRUE)
    n = 0
    pesto = data.frame(size=character(), fcount=character(), rcount=character(), total=character(), identifier=character())
    concat = function(filename, n){
      dat = read.table(filename, header=TRUE, na.strings="empty")
      dat_i = transform(dat, identifier = rep((paste("time", n, sep="")), nrow((dat))))
      pesto <<- rbind(dat_i)
    }
    for (f in filenames) {
      n = n+1
      concat(f, n)
    }
So, two example data frames, after being read in:
    > df1 (from file of time = 1)
         size fcount rcount total
    [1,]    1      2      3     5
    [2,]    4      1      1     2
    [3,]    5      1      2     3

    > df2 (from file of time = 2)
         size fcount rcount total
    [1,]    1      3      6     9
    [2,]    3      1      5     6
    [3,]    5      1      2     3
The desired output would look like this:
    > pesto
         size fcount rcount total identifier
    [1,]    1      2      3     5      time1
    [1,]    1      3      6     9      time2
    [2,]    3      1      5     6      time2
    [2,]    4      1      1     2      time1
    [3,]    5      1      2     3      time1
    [3,]    5      1      2     3      time2
Instead, the output is only df2, labelled!
So far in debugging, I've asked the function to print(n) to make sure it is iterating in the loop correctly, and it gave me the correct output:
    [1] 1
    [1] 2
I'm at a complete loss on getting this to work, and concatenating the files by hand is a pain!
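Going only by the code as posted, a likely cause is that `rbind(dat_i)` is called with a single argument, so every call replaces `pesto` with the latest data frame instead of appending to it. Here is a minimal sketch of the loop with that one change, using two in-memory data frames (values taken from the example above) in place of the files. The name `frames` is a stand-in for illustration and is not part of the original code.

```r
# Stand-ins for the two files, with the values from the example above
df1 <- data.frame(size = c(1, 4, 5), fcount = c(2, 1, 1),
                  rcount = c(3, 1, 2), total = c(5, 2, 3))
df2 <- data.frame(size = c(1, 3, 5), fcount = c(3, 1, 1),
                  rcount = c(6, 5, 2), total = c(9, 6, 3))
frames <- list(df1, df2)  # hypothetical: replaces reading from disk

pesto <- data.frame()
for (n in seq_along(frames)) {
  # Label every row of this data frame with its position in the list
  dat_i <- transform(frames[[n]], identifier = paste0("time", n))
  # Pass the accumulated data frame as well, so rows are appended
  pesto <- rbind(pesto, dat_i)
}

# Sort by size so rows from different files interleave
pesto <- pesto[order(pesto$size), ]
print(pesto)
```

Growing a data frame inside a loop copies it on every iteration, which is fine for a handful of files but gets slow for many.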
You can do this without for loops, using lapply. (I know the *apply functions are loops in disguise, but they're considered better R code.)
    files_list <- lapply(filenames, read.table, header=TRUE, na.strings="empty")
    pesto <- lapply(seq_along(files_list), function(n){
      x <- files_list[[n]]
      x$identifier <- paste0("time", n)
      x
    })
    pesto <- do.call(rbind, pesto)
    pesto <- pesto[order(pesto$size), ]
    pesto
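As a self-contained sketch of the same idea, the snippet below writes two small files to a temporary directory and labels each data frame by its file name rather than by its position in the list. The directory, the file names and the `read_labelled` helper are made up for illustration. Note that `pattern` in `list.files()` is a regular expression, not a glob, so `"\\.txt$"` is used here.

```r
# Hypothetical demo directory holding two small input files
demo_dir <- tempfile("metrics_demo")
dir.create(demo_dir)
write.table(data.frame(size = c(1, 4, 5), fcount = c(2, 1, 1),
                       rcount = c(3, 1, 2), total = c(5, 2, 3)),
            file.path(demo_dir, "time1.txt"), row.names = FALSE, quote = FALSE)
write.table(data.frame(size = c(1, 3, 5), fcount = c(3, 1, 1),
                       rcount = c(6, 5, 2), total = c(9, 6, 3)),
            file.path(demo_dir, "time2.txt"), row.names = FALSE, quote = FALSE)

# list.files() returns the paths in alphabetical order
filenames <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file and tag its rows with the file's base name
read_labelled <- function(f) {
  x <- read.table(f, header = TRUE, na.strings = "empty")
  x$identifier <- tools::file_path_sans_ext(basename(f))
  x
}

pesto <- do.call(rbind, lapply(filenames, read_labelled))
pesto <- pesto[order(pesto$size), ]
print(pesto)
```

Taking the label from the file name keeps the identifier tied to the file itself, so it does not change if files are added or the listing order shifts.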