How to cast time columns and find timedelta with condition in python pandas -


i have column time non null object , cannot convert timedelta or datetime.

     time             msg 12:29:36.306000      setup 12:29:36.507000      alerting 12:29:38.207000      service 12:29:39.194000      setup 12:30:05.773000      alerting 12:30:06.205000      service 12:32:07.315000      setup 12:32:17.194000      service 12:32:26.889000      setup 12:36:06.274000      alerting 12:36:08.523000      service 12:37:59.200000      setup 12:47:10.652000      alerting 12:47:43.921000      setup 

when type df.info(), got 'time' column non null object , couldn't convert timedelta or datetime(for it's obvious why can't it). so, what's solution find difference between consecutive msg (time delta), if timedelta < 5sec pass. output:

     time             msg         diff 12:29:36.306000      setup          12:29:36.507000      alerting       12:29:38.207000      service 12:29:39.194000      setup 12:30:05.773000      alerting 12:30:06.205000      service 12:32:07.315000      setup 12:32:17.194000      service 12:32:26.889000      setup 12:36:06.274000      alerting    6.30*** 12:36:08.523000      service      12:37:59.200000      setup 12:47:10.652000      alerting    11.02***     12:47:43.921000      setup       

i've tried this:

df['diff'] = (df['time']df['time'].shift()).fillna(0) 

but didn't know write condition 5sec interval.

i think first need convert str , call to_timedelta.

then diff , comapre 5s.

last new column use mask mask:

df['time'] = pd.to_timedelta(df['time'].astype(str))  df['diff'] = df['time'].diff() df['mask'] = df['time'].diff() > pd.timedelta(5, unit='s') print (df)               time       msg            diff   mask 0  12:29:36.306000     setup             nat  false 1  12:29:36.507000  alerting 00:00:00.201000  false 2  12:29:38.207000   service 00:00:01.700000  false 3  12:29:39.194000     setup 00:00:00.987000  false 4  12:30:05.773000  alerting 00:00:26.579000   true 5  12:30:06.205000   service 00:00:00.432000  false 6  12:32:07.315000     setup 00:02:01.110000   true 7  12:32:17.194000   service 00:00:09.879000   true 8  12:32:26.889000     setup 00:00:09.695000   true 9  12:36:06.274000  alerting 00:03:39.385000   true 10 12:36:08.523000   service 00:00:02.249000  false 11 12:37:59.200000     setup 00:01:50.677000   true 12 12:47:10.652000  alerting 00:09:11.452000   true 13 12:47:43.921000     setup 00:00:33.269000   true 

df['time'] = pd.to_timedelta(df['time']) diff = df['time'].diff() mask = df['time'].diff() > pd.timedelta(5, unit='s') df['new'] = diff.where(mask) print (df)               time       msg             new 0  12:29:36.306000     setup             nat 1  12:29:36.507000  alerting             nat 2  12:29:38.207000   service             nat 3  12:29:39.194000     setup             nat 4  12:30:05.773000  alerting 00:00:26.579000 5  12:30:06.205000   service             nat 6  12:32:07.315000     setup 00:02:01.110000 7  12:32:17.194000   service 00:00:09.879000 8  12:32:26.889000     setup 00:00:09.695000 9  12:36:06.274000  alerting 00:03:39.385000 10 12:36:08.523000   service             nat 11 12:37:59.200000     setup 00:01:50.677000 12 12:47:10.652000  alerting 00:09:11.452000 13 12:47:43.921000     setup 00:00:33.269000 

Comments

Popular posts from this blog

PHP and MySQL WP -

android - InAppBilling registering BroadcastReceiver in AndroidManifest -

go - golang pprof for c library code -