r - Grouping and dividing between the columns -
this question follow-up group group division
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 na 5: 1/31/2013 14 na 6: 2/28/2013 20 na 7: 2/28/2013 30 3.000000 8: 2/28/2013 40 3.636364 9: 2/28/2013 50 4.166667 10: 2/28/2013 60 4.615385 11: 3/30/2013 10 na 12: 3/30/2013 11 0.550000 13: 3/30/2013 12 0.400000 14: 3/30/2013 13 0.325000 15: 3/30/2013 15 0.300000
as per below:
library(data.table) # cran version 1.10.4 used setdt(bb)[, d := bal / shift(bal, 6l)][seq(1l, nrow(bb), 5l), d := na][]
now questions is:
at every 4th , 5th of each group, answer should print 100%, means, 9th, 10th, 14th , 15th, , on, values under d should 100%
values in d should in %
expected o/p
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 100.00 5: 1/31/2013 14 100.00 6: 2/28/2013 20 na 7: 2/28/2013 30 300.00 8: 2/28/2013 40 363.64 9: 2/28/2013 50 100.00 10: 2/28/2013 60 100.00 11: 3/30/2013 10 na 12: 3/30/2013 11 55.00 13: 3/30/2013 12 40.00 14: 3/30/2013 13 100.00 15: 3/30/2013 15 100.00
thats expected output.
it supposed same conditions hold in previous answer, namely there same number of rows each date. observation, simple solution possible lagging value of bal
6 rows denominator. ignores groups in first place, necessary set result d
na
first row in each group, i.e. in every 5th row, finally.
the additional request specific rows need manually overwritten 1.0
(printed 100%
) handled likewise computing respective indices.
library(data.table) setdt(bb)[, d := formattable::percent(bal / shift(bal, 6l))][seq(1l, .n, 5l), d := na][ rep(seq(4l, nrow(bb), 5l), each = 2l) + 0:1, d := 1.0][]
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 100.00% 5: 1/31/2013 14 100.00% 6: 2/28/2013 20 na 7: 2/28/2013 30 300.00% 8: 2/28/2013 40 363.64% 9: 2/28/2013 50 100.00% 10: 2/28/2013 60 100.00% 11: 3/30/2013 10 na 12: 3/30/2013 11 55.00% 13: 3/30/2013 12 40.00% 14: 3/30/2013 13 100.00% 15: 3/30/2013 15 100.00%
note percent
function formattable
package used. has advantage values still numeric can used calculation printed percent.
by request of op, here version without using formattable::percent()
:
setdt(bb)[, d := 100.0 * bal / shift(bal, 6l)][seq(1l, .n, 5l), d := na][ rep(seq(4l, nrow(bb), 5l), each = 2l) + 0:1, d := 100.0][]
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 100.0000 5: 1/31/2013 14 100.0000 6: 2/28/2013 20 na 7: 2/28/2013 30 300.0000 8: 2/28/2013 40 363.6364 9: 2/28/2013 50 100.0000 10: 2/28/2013 60 100.0000 11: 3/30/2013 10 na 12: 3/30/2013 11 55.0000 13: 3/30/2013 12 40.0000 14: 3/30/2013 13 100.0000 15: 3/30/2013 15 100.0000
the op has requested have dynamic version user can select rows in each group 100. have tried make full flexible version number of elements in each group dynamic (still required same across groups) , packageded function:
divide_by_group <- function(df, id_of_rows_in_group_to_override = na, val_override = 100.0) { library(data.table) # check parameters checkmate::assert_data_frame(df) checkmate::assert_names(c("date", "bal"), subset.of = names(df)) checkmate::assert_number(val_override) # retrieve group length, verify groups have same length l_grp <- setdt(df)[, .n, = date][ , if (any(n != first(n))) stop("differing group lengths") else first(n)] # verify user specified row ids checkmate::assert_integerish(id_of_rows_in_group_to_override, lower = 1l, upper = l_grp) # compute result result <- df[, d := 100.0 * bal / shift(bal, l_grp + 1l)][seq(1l, .n, l_grp), d := na] # apply override # compute rows rn <- c(outer(id_of_rows_in_group_to_override, seq(l_grp, nrow(df) - l_grp, 5l), `+`)) # verify rn in range checkmate::assert_integerish(rn, lower = l_grp + 1l, upper = nrow(df)) result[rn, d := val_override] return(result[]) }
note more 50% of code checking parameters , assumptions.
sample calls
divide_by_group(bb)
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 na 5: 1/31/2013 14 na 6: 2/28/2013 20 na 7: 2/28/2013 30 300.0000 8: 2/28/2013 40 363.6364 9: 2/28/2013 50 416.6667 10: 2/28/2013 60 461.5385 11: 3/30/2013 10 na 12: 3/30/2013 11 55.0000 13: 3/30/2013 12 40.0000 14: 3/30/2013 13 32.5000 15: 3/30/2013 15 30.0000
divide_by_group(bb, 4:5)
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 na 5: 1/31/2013 14 na 6: 2/28/2013 20 na 7: 2/28/2013 30 300.0000 8: 2/28/2013 40 363.6364 9: 2/28/2013 50 100.0000 10: 2/28/2013 60 100.0000 11: 3/30/2013 10 na 12: 3/30/2013 11 55.0000 13: 3/30/2013 12 40.0000 14: 3/30/2013 13 100.0000 15: 3/30/2013 15 100.0000
divide_by_group(bb, c(2, 5), -9.9)
date bal d 1: 1/31/2013 10 na 2: 1/31/2013 11 na 3: 1/31/2013 12 na 4: 1/31/2013 13 na 5: 1/31/2013 14 na 6: 2/28/2013 20 na 7: 2/28/2013 30 -9.9000 8: 2/28/2013 40 363.6364 9: 2/28/2013 50 416.6667 10: 2/28/2013 60 -9.9000 11: 3/30/2013 10 na 12: 3/30/2013 11 -9.9000 13: 3/30/2013 12 40.0000 14: 3/30/2013 13 32.5000 15: 3/30/2013 15 -9.9000
Comments
Post a Comment