python - Split a dataframe by time column - pandas


I want to select the right portion of a dataset, as explained by the following example:

input df:

id_b,ts_b,value
id1,2017-04-27 01:35:30,0
id1,2017-04-27 01:35:40,0
id1,2017-04-27 01:35:50,1
id1,2017-04-27 01:36:00,4
id1,2017-04-27 01:36:10,5
id1,2017-04-27 01:36:20,100
id1,2017-04-27 01:36:30,155
id1,2017-04-27 01:36:40,235
id1,2017-04-27 01:36:50,0
id1,2017-04-27 01:36:60,0
id1,2017-04-27 01:37:00,2353
id1,2017-04-27 01:37:10,221
id1,2017-04-27 01:37:20,2432
id1,2017-04-27 01:37:30,2654
id1,2017-04-27 01:37:40,12
id1,2017-04-27 01:37:50,5
id1,2017-04-27 01:38:00,5
id1,2017-04-27 01:38:10,23
id1,2017-04-27 01:38:20,5
id1,2017-04-27 01:38:30,2
id1,2017-04-27 01:38:40,2
id1,2017-04-27 01:38:50,1
id1,2017-04-27 01:39:00,0
id1,2017-04-27 01:39:10,0
id1,2017-04-27 01:39:20,0
id1,2017-04-27 01:39:30,0
id1,2017-04-27 01:39:40,0
id1,2017-04-27 01:39:50,0
id1,2017-04-27 01:40:00,0
id1,2017-04-27 01:40:10,1
id1,2017-04-27 01:40:20,5
id1,2017-04-27 01:40:30,221
id1,2017-04-27 01:40:40,2432
id1,2017-04-27 01:40:50,2654
id1,2017-04-27 01:40:60,12
id1,2017-04-27 01:41:00,5
id1,2017-04-27 01:41:10,5
id1,2017-04-27 01:41:20,23
id1,2017-04-27 01:41:30,5
id1,2017-04-27 01:41:40,2
id1,2017-04-27 01:41:50,1

Considering the following:
segment_number = 1
duration = 3 minutes

I want to select the first segment of the dataframe, starting at the first non-zero df.value and running until the last row covered by a duration of 3 minutes.

output:

id1,2017-04-27 01:35:50,1
id1,2017-04-27 01:36:00,4
id1,2017-04-27 01:36:10,5
id1,2017-04-27 01:36:20,100
id1,2017-04-27 01:36:30,155
id1,2017-04-27 01:36:40,235
id1,2017-04-27 01:36:50,0
id1,2017-04-27 01:36:60,0
id1,2017-04-27 01:37:00,2353
id1,2017-04-27 01:37:10,221
id1,2017-04-27 01:37:20,2432
id1,2017-04-27 01:37:30,2654
id1,2017-04-27 01:37:40,12
id1,2017-04-27 01:37:50,5
id1,2017-04-27 01:38:00,5
id1,2017-04-27 01:38:10,23
id1,2017-04-27 01:38:20,5
id1,2017-04-27 01:38:30,2
id1,2017-04-27 01:38:40,2
id1,2017-04-27 01:38:50,1

Considering the following:
segment_number = 2
duration = 1.40 minutes (1 minute 40 seconds)

I want to select the second segment of the dataframe, starting at the first non-zero df.value after the first segment and running until the last row covered by a duration of 1 minute 40 seconds.

output:

id1,2017-04-27 01:40:10,1
id1,2017-04-27 01:40:20,5
id1,2017-04-27 01:40:30,221
id1,2017-04-27 01:40:40,2432
id1,2017-04-27 01:40:50,2654
id1,2017-04-27 01:40:60,12
id1,2017-04-27 01:41:00,5
id1,2017-04-27 01:41:10,5
id1,2017-04-27 01:41:20,23
id1,2017-04-27 01:41:30,5
id1,2017-04-27 01:41:40,2
id1,2017-04-27 01:41:50,1

So far, I have indexed df by ts_b using `pd.to_datetime` and `set_index`, and I use a variable "last_end_point" that keeps track of the index where the previous segment ended.
I am not getting the right output.
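
The approach described above (a datetime index plus a "last_end_point" marker) could be sketched roughly as follows. This is only an illustration under assumptions: the function name `select_segment`, its parameters, and the small demo frame are hypothetical and not from the question, and the index is assumed to be a sorted DatetimeIndex.

```python
import pandas as pd


def select_segment(df, duration, last_end_point=None, threshold=0):
    """Return (segment, end_of_window) for the next segment.

    Hypothetical helper: df needs a sorted DatetimeIndex and a "value" column;
    duration is a pd.Timedelta.
    """
    # Only consider rows strictly after the previous segment's window, if any.
    rest = df if last_end_point is None else df[df.index > last_end_point]
    active = rest[rest["value"] > threshold]
    if active.empty:
        return None, last_end_point
    start = active.index[0]
    end = start + duration
    # Label slicing on a sorted DatetimeIndex includes both endpoints.
    return rest.loc[start:end], end


# Small made-up frame sampled every 10 seconds (not the data from the question).
idx = pd.date_range("2017-04-27 01:35:30", periods=12, freq=pd.Timedelta(seconds=10))
demo = pd.DataFrame({"value": [0, 0, 1, 4, 5, 0, 0, 0, 0, 2, 3, 0]}, index=idx)

seg1, end1 = select_segment(demo, pd.Timedelta(seconds=20))
seg2, end2 = select_segment(demo, pd.Timedelta(seconds=10), last_end_point=end1)
```

Passing the returned window end back in as `last_end_point` is what lets the second call skip the rows already covered by the first segment.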

Any help is appreciated.

This is the answer I formulated:

import pandas as pd
import numpy as np
import datetime

df = pd.read_csv("filename.csv")
df['ts_b'] = pd.to_datetime(df['ts_b'])


def find_the_energenies_segment(key_mapped, duration, energenie_df, threshold):
    # Index labels of all rows whose value exceeds the threshold
    non_zero_indexs = energenie_df[energenie_df["value"] > threshold].index

    # Compare against None explicitly: the first label can legitimately be 0
    first_index = non_zero_indexs[0] if len(non_zero_indexs) > 0 else None
    if first_index is None:
        return {"sub_df": None,
                "start_index": None,
                "end_index": None,
                "duration": duration}

    start_time = energenie_df.loc[first_index].ts_b

    # duration is an "HH:MM:SS" string
    hours, minutes, seconds = duration.split(":")
    end_time = start_time + datetime.timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))

    # Last row at or before end_time (relies on the default integer index);
    # fall back to the final row when the window runs past the end of the data
    later = energenie_df[energenie_df["ts_b"] > end_time].index
    last_index = later[0] - 1 if len(later) > 0 else energenie_df.index[-1]

    return {"sub_df": energenie_df.loc[first_index:last_index],
            "start_index": first_index,
            "end_index": last_index,
            "duration": duration}


out = find_the_energenies_segment("id1", "00:03:00", df, 0)
print(out)
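
As a possible simplification of the duration handling above, `pd.to_timedelta` can parse an "HH:MM:SS" string directly, and a boolean mask on the timestamp column avoids the index arithmetic. The sketch below is a variation, not the answer's code: the name `window_mask` and the sample timestamps are made up for illustration.

```python
import pandas as pd


def window_mask(ts, start, duration):
    """Boolean mask for timestamps in [start, start + duration].

    Hypothetical helper: ts is a datetime Series, duration an "HH:MM:SS" string.
    """
    end = start + pd.to_timedelta(duration)
    return (ts >= start) & (ts <= end)


# Made-up timestamps sampled every 10 seconds.
ts = pd.Series(pd.date_range("2017-04-27 01:40:00", periods=5, freq=pd.Timedelta(seconds=10)))
mask = window_mask(ts, ts.iloc[1], "00:00:20")

# The second segment's "1.40 minutes" would be written as "00:01:40".
second_duration = pd.to_timedelta("00:01:40")
```

With a frame that keeps ts_b as a column, `df[window_mask(df["ts_b"], start_time, "00:03:00")]` would select the rows of one window.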
