pandas - Kernel dead running a simple average operation using Python
I am running a simple average operation on 3 columns. I am transforming monthly data into quarterly averages. The data looks like this:
2000.1  2000.2  2000.3  ...
18      15      27
I want to transform it to:
2000.q1
20
Here is what I have so far:
import pandas as pd

def convert_housing_data_to_quarters():
    '''Converts the housing data to quarters and returns it as mean
    values in a dataframe. This dataframe should be a dataframe with
    columns for 2000q1 through 2016q3, and should have a multi-index
    in the shape of ["state","regionname"].

    Note: Quarters are defined in the assignment description, they are
    not arbitrary 3 month periods.

    The resulting dataframe should have 67 columns, and 10,730 rows.
    '''
    # read in the zillow housing data
    zillow_df = pd.read_csv('city_zhvi_allhomes.csv')
    print(zillow_df.iloc[1,1])
    print(len(zillow_df))
    # slice from 2000q1 to 2016q3
    print(zillow_df.columns)
    print(zillow_df.columns[6:51])
    zillow_df.drop(zillow_df.columns[6:51],axis=1,inplace=True)
    # generate quarterly average
    y = 2000
    q = 1
    for i in range(67):
        y_q = str(y)+'q'+str(q)
        #print(y_q)
        print(zillow_df.columns[6+(i)*3])
        print(zillow_df[zillow_df.columns[6+(i)*3]])
        zillow_df[y_q]=(zillow_df[zillow_df.columns[6+(i)*3]]+zillow_df[zillow_df.columns[6+1+(i)*3]]+zillow_df[zillow_df.columns[6+2+(i)*3]])/3
        q=q+1
        if q==5:
            q=1
            y=y+1
    return zillow_df.head()
I think the code is correct, but every time I run it in an IPython notebook it says the kernel is dead. I am not sure why.
I think you need to convert the column names with to_datetime and then to month periods with to_period first. Then resample by quarters and aggregate with mean (axis=1 aggregates by column names). Last, convert the columns with strftime to strings in your format:
df.columns = pd.to_datetime(df.columns, format='%Y.%m').to_period('M')
print (df)
   2000-01  2000-02  2000-03
0       18       15       27

df = df.resample('Q', axis=1).mean()
df.columns = df.columns.strftime('%Y.q%q')
print (df)
   2000.q1
0       20
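For what it's worth, here is a small self-contained sketch of the same idea on toy data. It is a variation, not the method above: as far as I know the axis argument of resample is deprecated in recent pandas releases, so this version transposes and groups by a quarter label instead. The toy values, the names months, quarters and quarterly, and the choice to split the "year.month" labels by hand are all my own assumptions for illustration.

```python
import pandas as pd

# Toy monthly data using the same "year.month" column labels as the question.
# The values for months 4 to 6 are made up for illustration.
df = pd.DataFrame(
    [[18, 15, 27, 30, 33, 36]],
    columns=["2000.1", "2000.2", "2000.3", "2000.4", "2000.5", "2000.6"],
)

# Build one monthly Period per column by splitting each label on the dot.
months = [
    pd.Period(year=int(year), month=int(month), freq="M")
    for year, month in (label.split(".") for label in df.columns)
]

# Map every month to a quarter label such as "2000.q1".
quarters = pd.Index([f"{p.year}.q{p.quarter}" for p in months])

# Transpose so that months become rows, average within each quarter label,
# then transpose back so that quarters are columns again.
quarterly = df.T.groupby(quarters).mean().T
print(quarterly)
```

On the real data you would first move the non-date columns into the index (for example with set_index), so that only the monthly columns are averaged.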