python - Pandas if/then aggregation


I've been searching and haven't figured this out yet. I'm hoping someone can help a Python newbie solve this problem.

I'm trying to figure out how to write an if/then statement in Python and perform an aggregation based on it. The end goal: if the date is 1/7/2017, use the value in the "fake" column. If the date is anything else, average the two columns together.

Here is what I have so far:

    import pandas as pd
    import numpy as np
    import datetime

    np.random.seed(42)
    dte = pd.date_range(start=datetime.date(2017,1,1), end=datetime.date(2017,1,15))
    fake = np.random.randint(15, 100, size=15)
    fake2 = np.random.randint(300, 1000, size=15)

    so_df = pd.DataFrame({'date': dte,
                          'fake': fake,
                          'fake2': fake2})

    # row-wise mean of the two value columns
    so_df['avg'] = so_df[['fake','fake2']].mean(axis=1)
    so_df.head()

Assuming you have already computed the average column:

    so_df['fake'].where(so_df['date']=='20170107', so_df['avg'])
    Out:
    0     375.5
    1     260.0
    2     331.0
    3     267.5
    4     397.0
    5     355.0
    6      89.0
    7     320.5
    8     449.0
    9     395.5
    10    197.0
    11    438.5
    12    498.5
    13    409.5
    14    525.5
    Name: fake, dtype: float64

If not, you can replace the column reference with the same calculation:

    so_df['fake'].where(so_df['date']=='20170107', so_df[['fake','fake2']].mean(axis=1))

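To check multiple dates, you need to use the element-wise version of the or operator (which is the pipe: |). Otherwise, it will raise an error. Wrap each comparison in parentheses, because | binds more tightly than ==.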

    so_df['fake'].where((so_df['date']=='20170107') | (so_df['date']=='20170109'), so_df['avg'])
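To see why the plain `or` keyword fails here, the following sketch compares it with `|` on a small hypothetical Series of dates (the names `dates`, `first` and `second` are illustrative only). Python's `or` asks for a single truth value of a whole Series, and pandas refuses with a ValueError.

```python
import pandas as pd

# Hypothetical dates, not the data from the question.
dates = pd.Series(pd.to_datetime(['2017-01-07', '2017-01-08', '2017-01-09']))
first = dates == pd.Timestamp('2017-01-07')
second = dates == pd.Timestamp('2017-01-09')

# Plain `or` needs one truth value for the whole Series, which is ambiguous.
try:
    first or second
    raised = False
except ValueError:
    raised = True

# `|` combines the two boolean Series element by element.
combined = first | second
print(raised)             # True
print(combined.tolist())  # [True, False, True]
```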

The above checks two dates. In the case of three or more, you may want to use isin with a list:

    so_df['fake'].where(so_df['date'].isin(['20170107', '20170109', '20170112']), so_df['avg'])
    Out[42]:
    0     375.5
    1     260.0
    2     331.0
    3     267.5
    4     397.0
    5     355.0
    6      89.0
    7     320.5
    8      38.0
    9     395.5
    10    197.0
    11     67.0
    12    498.5
    13    409.5
    14    525.5
    Name: fake, dtype: float64
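The isin call above passes date strings and relies on pandas matching them against datetime values. How that matching behaves may differ between pandas versions, so a cautious variant is to convert the list with `pd.to_datetime` first. This is a sketch under that assumption, using a made-up four-row frame (`demo`, `wanted` and `mask` are illustrative names) in place of the question's random data.

```python
import pandas as pd

# Hypothetical four-row frame covering 2017-01-06 through 2017-01-09.
demo = pd.DataFrame({
    'date': pd.date_range('2017-01-06', periods=4),
    'fake': [10, 20, 30, 40],
    'fake2': [100, 200, 300, 400],
})

# Convert the wanted dates up front so isin compares datetimes with datetimes.
wanted = pd.to_datetime(['2017-01-07', '2017-01-09'])
mask = demo['date'].isin(wanted)

# Keep 'fake' on the wanted dates; use the row-wise average elsewhere.
result = demo['fake'].where(mask, demo[['fake', 'fake2']].mean(axis=1))
print(result.tolist())  # [55.0, 20.0, 165.0, 40.0]
```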
