r - Grouping and dividing between the columns -


this question follow-up group group division

          date bal        d  1: 1/31/2013  10       na  2: 1/31/2013  11       na  3: 1/31/2013  12       na  4: 1/31/2013  13       na  5: 1/31/2013  14       na  6: 2/28/2013  20       na  7: 2/28/2013  30 3.000000  8: 2/28/2013  40 3.636364  9: 2/28/2013  50 4.166667 10: 2/28/2013  60 4.615385 11: 3/30/2013  10       na 12: 3/30/2013  11 0.550000 13: 3/30/2013  12 0.400000 14: 3/30/2013  13 0.325000 15: 3/30/2013  15 0.300000 

as per below:

library(data.table)   # cran version 1.10.4 used setdt(bb)[, d := bal / shift(bal, 6l)][seq(1l, nrow(bb), 5l), d := na][] 

now questions is:

  1. at every 4th , 5th of each group, answer should print 100%, means, 9th, 10th, 14th , 15th, , on, values under d should 100%

  2. values in d should in %

expected o/p

       date     bal    d  1: 1/31/2013  10      na  2: 1/31/2013  11      na  3: 1/31/2013  12      na  4: 1/31/2013  13 100.00  5: 1/31/2013  14 100.00  6: 2/28/2013  20      na  7: 2/28/2013  30 300.00  8: 2/28/2013  40 363.64  9: 2/28/2013  50 100.00 10: 2/28/2013  60 100.00 11: 3/30/2013  10      na 12: 3/30/2013  11  55.00 13: 3/30/2013  12  40.00 14: 3/30/2013  13 100.00 15: 3/30/2013  15 100.00 

thats expected output.

it supposed same conditions hold in previous answer, namely there same number of rows each date. observation, simple solution possible lagging value of bal 6 rows denominator. ignores groups in first place, necessary set result d na first row in each group, i.e. in every 5th row, finally.

the additional request specific rows need manually overwritten 1.0 (printed 100%) handled likewise computing respective indices.

library(data.table) setdt(bb)[, d := formattable::percent(bal / shift(bal, 6l))][seq(1l, .n, 5l), d := na][   rep(seq(4l, nrow(bb), 5l), each = 2l) + 0:1, d := 1.0][] 
         date bal       d  1: 1/31/2013  10      na  2: 1/31/2013  11      na  3: 1/31/2013  12      na  4: 1/31/2013  13 100.00%  5: 1/31/2013  14 100.00%  6: 2/28/2013  20      na  7: 2/28/2013  30 300.00%  8: 2/28/2013  40 363.64%  9: 2/28/2013  50 100.00% 10: 2/28/2013  60 100.00% 11: 3/30/2013  10      na 12: 3/30/2013  11  55.00% 13: 3/30/2013  12  40.00% 14: 3/30/2013  13 100.00% 15: 3/30/2013  15 100.00% 

note percent function formattable package used. has advantage values still numeric can used calculation printed percent.


by request of op, here version without using formattable::percent():

setdt(bb)[, d := 100.0 * bal / shift(bal, 6l)][seq(1l, .n, 5l), d := na][   rep(seq(4l, nrow(bb), 5l), each = 2l) + 0:1, d := 100.0][] 
         date bal        d  1: 1/31/2013  10       na  2: 1/31/2013  11       na  3: 1/31/2013  12       na  4: 1/31/2013  13 100.0000  5: 1/31/2013  14 100.0000  6: 2/28/2013  20       na  7: 2/28/2013  30 300.0000  8: 2/28/2013  40 363.6364  9: 2/28/2013  50 100.0000 10: 2/28/2013  60 100.0000 11: 3/30/2013  10       na 12: 3/30/2013  11  55.0000 13: 3/30/2013  12  40.0000 14: 3/30/2013  13 100.0000 15: 3/30/2013  15 100.0000 

the op has requested have dynamic version user can select rows in each group 100. have tried make full flexible version number of elements in each group dynamic (still required same across groups) , packageded function:

divide_by_group <- function(df,                              id_of_rows_in_group_to_override = na,                              val_override = 100.0) {   library(data.table)   # check parameters   checkmate::assert_data_frame(df)   checkmate::assert_names(c("date", "bal"), subset.of = names(df))   checkmate::assert_number(val_override)   # retrieve group length, verify groups have same length   l_grp <- setdt(df)[, .n, = date][     , if (any(n != first(n))) stop("differing group lengths") else first(n)]   # verify user specified row ids   checkmate::assert_integerish(id_of_rows_in_group_to_override, lower = 1l, upper = l_grp)   # compute result   result <- df[, d := 100.0 * bal / shift(bal, l_grp + 1l)][seq(1l, .n, l_grp), d := na]   # apply override   # compute rows   rn <- c(outer(id_of_rows_in_group_to_override, seq(l_grp, nrow(df) - l_grp, 5l), `+`))   # verify rn in range    checkmate::assert_integerish(rn, lower = l_grp + 1l, upper = nrow(df))   result[rn, d := val_override]   return(result[]) } 

note more 50% of code checking parameters , assumptions.

sample calls

divide_by_group(bb) 
         date bal        d  1: 1/31/2013  10       na  2: 1/31/2013  11       na  3: 1/31/2013  12       na  4: 1/31/2013  13       na  5: 1/31/2013  14       na  6: 2/28/2013  20       na  7: 2/28/2013  30 300.0000  8: 2/28/2013  40 363.6364  9: 2/28/2013  50 416.6667 10: 2/28/2013  60 461.5385 11: 3/30/2013  10       na 12: 3/30/2013  11  55.0000 13: 3/30/2013  12  40.0000 14: 3/30/2013  13  32.5000 15: 3/30/2013  15  30.0000 
divide_by_group(bb, 4:5) 
         date bal        d  1: 1/31/2013  10       na  2: 1/31/2013  11       na  3: 1/31/2013  12       na  4: 1/31/2013  13       na  5: 1/31/2013  14       na  6: 2/28/2013  20       na  7: 2/28/2013  30 300.0000  8: 2/28/2013  40 363.6364  9: 2/28/2013  50 100.0000 10: 2/28/2013  60 100.0000 11: 3/30/2013  10       na 12: 3/30/2013  11  55.0000 13: 3/30/2013  12  40.0000 14: 3/30/2013  13 100.0000 15: 3/30/2013  15 100.0000 
divide_by_group(bb, c(2, 5), -9.9) 
         date bal        d  1: 1/31/2013  10       na  2: 1/31/2013  11       na  3: 1/31/2013  12       na  4: 1/31/2013  13       na  5: 1/31/2013  14       na  6: 2/28/2013  20       na  7: 2/28/2013  30  -9.9000  8: 2/28/2013  40 363.6364  9: 2/28/2013  50 416.6667 10: 2/28/2013  60  -9.9000 11: 3/30/2013  10       na 12: 3/30/2013  11  -9.9000 13: 3/30/2013  12  40.0000 14: 3/30/2013  13  32.5000 15: 3/30/2013  15  -9.9000 

Comments

Popular posts from this blog

android - InAppBilling registering BroadcastReceiver in AndroidManifest -

python Tkinter Capturing keyboard events save as one single string -

sql server - Why does Linq-to-SQL add unnecessary COUNT()? -