python - Create new columns by grouping and aggregating multicolumns in pandas -
i have dataframe 50 columns, of them period_start_time, id, speed_throughput, etc. dataframe sample:
id period_start_time speed_througput ... 0 1 2017-06-14 20:00:00 6 1 1 2017-06-14 20:00:00 10 2 1 2017-06-14 21:00:00 2 3 1 2017-06-14 21:00:00 5 4 2 2017-06-14 20:00:00 8 5 2 2017-06-14 20:00:00 12 ...
i have tried go create 2 new columns grouping 2 columns(id , period_start_time) , find avg , min of speed_trhoughput. code i've tried:
df['throughput_avg']=df.sort_values(['period_start_time'],ascending=false).groupby(['period_start_time','id'])[['speed_trhoughput']].max() df['throughput_min'] = df.groupby(['period_start_time', 'id'])[['speed_trhoughput']].min()
as can see, there 2 ways i've tried, nothing works. error message received both attempts:
typeerror:incompatible index of inserted column frame index
i suppose know output needs be, there no need post it.
option 1
use agg
in groupby
, join
attach main dataframe
df.join( df.groupby(['id', 'period_start_time']).speed_througput.agg( ['mean', 'min'] ).rename(columns={'mean': 'avg'}).add_prefix('throughput_'), on=['id', 'period_start_time'] ) id period_start_time speed_througput throughput_avg throughput_min 0 1 2017-06-14 20:00:00 6 8.0 6 1 1 2017-06-14 20:00:00 10 8.0 6 2 1 2017-06-14 21:00:00 2 3.5 2 3 1 2017-06-14 21:00:00 5 3.5 2 4 2 2017-06-14 20:00:00 8 10.0 8 5 2 2017-06-14 20:00:00 12 10.0 8
option 2
use transform
in groupby
context , use assign
add new columns
g = df.groupby(['id', 'period_start_time']).speed_througput.transform df.assign(throughput_avg=g('mean'), throughput_min=g('min')) id period_start_time speed_througput throughput_avg throughput_min 0 1 2017-06-14 20:00:00 6 8.0 6 1 1 2017-06-14 20:00:00 10 8.0 6 2 1 2017-06-14 21:00:00 2 3.5 2 3 1 2017-06-14 21:00:00 5 3.5 2 4 2 2017-06-14 20:00:00 8 10.0 8 5 2 2017-06-14 20:00:00 12 10.0 8
Comments
Post a Comment