python 3.x - How to sum the numbers in a comma-separated string in a pandas DataFrame column

May 15, 2010

I have a string that contains comma-delimited int values, such as x = "1,2,3,4,5,6". How can I calculate the sum of the values contained in x?

I tried:

    values = x.split(",").map(lambda a: int(a))
    sum(values)

AttributeError: 'list' object has no attribute 'map'
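The error arises because `str.split` returns a plain list, and Python lists have no `map` method; `map` is a built-in function that takes the callable as its first argument. A minimal sketch of a plain-Python fix, using the `x` from the question (the name `total` is only illustrative):

```python
x = "1,2,3,4,5,6"

# split() returns a list of strings; the built-in map() converts each one to int
values = list(map(int, x.split(",")))
total = sum(values)

print(total)  # 21
```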

Actually, I have a pandas DataFrame with data in this format:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'id': [100, 101, 201],
                       'prices_a': ['1,2,3', '4,5,6', '7,8,9'],
                       'prices_b': ['1,2,3', '2,6,6', '3,5,8']})

So it looks like this:

        id prices_a prices_b
    0  100    1,2,3    1,2,3
    1  101    4,5,6    2,6,6
    2  201    7,8,9    3,5,8

I want to add a new column diff that compares prices_a and prices_b: if they are the same, df['diff'] = 'match'; otherwise, df['diff'] = sum(prices_a values) - sum(prices_b values).

You can use numpy.where, and compute the sums of the columns with str.split, astype and sum per row (axis=1):

    a = df['prices_a'].str.split(',', expand=True).astype(float).sum(axis=1)
    b = df['prices_b'].str.split(',', expand=True).astype(float).sum(axis=1)

    print (a)
    0     6.0
    1    15.0
    2    24.0
    dtype: float64

    print (b)
    0     6.0
    1    14.0
    2    16.0
    dtype: float64

    df['diff'] = np.where(df['prices_a'] == df['prices_b'], 'match', a - b)
    print (df)
        id prices_a prices_b   diff
    0  100    1,2,3    1,2,3  match
    1  101    4,5,6    2,6,6    1.0
    2  201    7,8,9    3,5,8    8.0
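As an alternative sketch (not part of the answer above), the row sums can also be computed with plain Python inside `Series.apply`. This assumes every cell is a non-empty, comma-separated string of integers; the helper name `sum_csv` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'id': [100, 101, 201],
                   'prices_a': ['1,2,3', '4,5,6', '7,8,9'],
                   'prices_b': ['1,2,3', '2,6,6', '3,5,8']})


def sum_csv(cell):
    """Sum the integers in one comma-separated string (hypothetical helper)."""
    return sum(int(v) for v in cell.split(','))


# Apply the helper to each row of the two price columns
a = df['prices_a'].apply(sum_csv)
b = df['prices_b'].apply(sum_csv)

print((a - b).tolist())  # [0, 1, 8]
```

This keeps the sums as integers, at the cost of a Python-level call per row, which is typically slower than the vectorized str.split approach on large frames.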

But it is better not to mix strings and numeric values in one column.

So it is possible to use e.g. NaN instead of 'match':

    df['diff'] = np.where(df['prices_a'] == df['prices_b'], np.nan, a - b)
    print (df)
        id prices_a prices_b  diff
    0  100    1,2,3    1,2,3   NaN
    1  101    4,5,6    2,6,6   1.0
    2  201    7,8,9    3,5,8   8.0
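To illustrate why the numeric variant is preferable, the sketch below builds both results as standalone Series (the names `same`, `mixed` and `numeric` are illustrative, not from the answer). With 'match' in the mix, the differences come back as strings; with NaN they stay floats and support arithmetic directly.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [100, 101, 201],
                   'prices_a': ['1,2,3', '4,5,6', '7,8,9'],
                   'prices_b': ['1,2,3', '2,6,6', '3,5,8']})

a = df['prices_a'].str.split(',', expand=True).astype(float).sum(axis=1)
b = df['prices_b'].str.split(',', expand=True).astype(float).sum(axis=1)
same = df['prices_a'] == df['prices_b']

# Mixing a string with floats: every value in the result becomes a string
mixed = pd.Series(np.where(same, 'match', a - b))
print(isinstance(mixed[1], str))  # True

# Using NaN as the marker keeps the result numeric
numeric = pd.Series(np.where(same, np.nan, a - b))
print(numeric.sum())              # 9.0 (NaN is skipped by default)
print(numeric.isna().tolist())    # [True, False, False]
```

The matching rows can still be recovered from the numeric column with `isna()`, so no information is lost by dropping the 'match' label.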
