python - Merging two Pandas series with duplicate datetime indices
I have two pandas Series (d1 and d2) indexed by datetime, each containing one column of data with both floats and NaN. Both indices are at one-day intervals, although the time entries are inconsistent, with many periods of missing days. d1 ranges from 1974-12-16 to 2002-01-30. d2 ranges from 1997-12-19 to 2017-07-06. The period from 1997-12-19 to 2002-01-30 contains many duplicate indices between the two series. The data at duplicated indices is sometimes the same value, sometimes different values, and sometimes one value and one NaN.
I would like to combine these two series into one, prioritizing the data from d2 anytime there are duplicate indices (that is, replace the d1 data with the d2 data anytime there is a duplicated index). What is the most efficient way to do this among the many pandas tools available (merge, join, concatenate, etc.)?
Here is an example of my data:
In [7]: print d1
flddate
1974-12-16    19.0
1974-12-17    28.0
1974-12-18    24.0
1974-12-19    18.0
1974-12-20    17.0
1974-12-21    28.0
1974-12-22    28.0
1974-12-23    10.0
1974-12-24     6.0
1974-12-25     5.0
1974-12-26    12.0
1974-12-27    19.0
1974-12-28    22.0
1974-12-29    20.0
1974-12-30    16.0
1974-12-31    12.0
1975-01-01    12.0
1975-01-02    15.0
1975-01-03    14.0
1975-01-04    15.0
1975-01-05    18.0
1975-01-06    21.0
1975-01-07    22.0
1975-01-08    18.0
1975-01-09    20.0
1975-01-10    12.0
1975-01-11     8.0
1975-01-12    -2.0
1975-01-13    13.0
1975-01-14    24.0
...
2002-01-01    18.0
2002-01-02    16.0
2002-01-03     NaN
2002-01-04    24.0
2002-01-05    23.0
2002-01-06    15.0
2002-01-07    22.0
2002-01-08    34.0
2002-01-09    35.0
2002-01-10    29.0
2002-01-11    21.0
2002-01-12    24.0
2002-01-13     NaN
2002-01-14    18.0
2002-01-15    14.0
2002-01-16    10.0
2002-01-17     5.0
2002-01-18     7.0
2002-01-19     7.0
2002-01-20     7.0
2002-01-21    11.0
2002-01-22     NaN
2002-01-23     9.0
2002-01-24     8.0
2002-01-25    15.0
2002-01-26     NaN
2002-01-27     NaN
2002-01-28    18.0
2002-01-29    13.0
2002-01-30    13.0
Name: maxtempmid, dtype: float64

In [8]: print d2
flddate
1997-12-19    22.0
1997-12-20    14.0
1997-12-21    18.0
1997-12-22    16.0
1997-12-23    16.0
1997-12-24    10.0
1997-12-25    12.0
1997-12-26    12.0
1997-12-27     9.0
1997-12-28    12.0
1997-12-29    18.0
1997-12-30    23.0
1997-12-31    28.0
1998-01-01    26.0
1998-01-02    29.0
1998-01-03    27.0
1998-01-04    22.0
1998-01-05    19.0
1998-01-06    17.0
1998-01-07    14.0
1998-01-08    14.0
1998-01-09    14.0
1998-01-10    16.0
1998-01-11    20.0
1998-01-12    21.0
1998-01-13    19.0
1998-01-14    20.0
1998-01-15    16.0
1998-01-16    17.0
1998-01-17    20.0
...
2017-06-07    68.0
2017-06-08    71.0
2017-06-09    71.0
2017-06-10    59.0
2017-06-11    41.0
2017-06-12    57.0
2017-06-13    58.0
2017-06-14    36.0
2017-06-15    50.0
2017-06-16    58.0
2017-06-17    54.0
2017-06-18    53.0
2017-06-19    58.0
2017-06-20    68.0
2017-06-21    71.0
2017-06-22    71.0
2017-06-23    59.0
2017-06-24    61.0
2017-06-25    65.0
2017-06-26    68.0
2017-06-27    71.0
2017-06-28    60.0
2017-06-29    54.0
2017-06-30    48.0
2017-07-01    60.0
2017-07-02    68.0
2017-07-03    65.0
2017-07-04    73.0
2017-07-05    74.0
2017-07-06    77.0
Name: maxtempmid, dtype: float64
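Let's use combine_first, which takes the values from d2 wherever they are present and non-NaN and fills everything else from d1:
d2.combine_first(d1)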
Output:
flddate
1974-12-16    19.0
1974-12-17    28.0
1974-12-18    24.0
1974-12-19    18.0
1974-12-20    17.0
1974-12-21    28.0
1974-12-22    28.0
1974-12-23    10.0
1974-12-24     6.0
1974-12-25     5.0
1974-12-26    12.0
1974-12-27    19.0
1974-12-28    22.0
1974-12-29    20.0
1974-12-30    16.0
1974-12-31    12.0
1975-01-01    12.0
1975-01-02    15.0
1975-01-03    14.0
1975-01-04    15.0
1975-01-05    18.0
1975-01-06    21.0
1975-01-07    22.0
1975-01-08    18.0
1975-01-09    20.0
1975-01-10    12.0
1975-01-11     8.0
1975-01-12    -2.0
1975-01-13    13.0
1975-01-14    24.0
...
2017-06-07    68.0
2017-06-08    71.0
2017-06-09    71.0
2017-06-10    59.0
2017-06-11    41.0
2017-06-12    57.0
2017-06-13    58.0
2017-06-14    36.0
2017-06-15    50.0
2017-06-16    58.0
2017-06-17    54.0
2017-06-18    53.0
2017-06-19    58.0
2017-06-20    68.0
2017-06-21    71.0
2017-06-22    71.0
2017-06-23    59.0
2017-06-24    61.0
2017-06-25    65.0
2017-06-26    68.0
2017-06-27    71.0
2017-06-28    60.0
2017-06-29    54.0
2017-06-30    48.0
2017-07-01    60.0
2017-07-02    68.0
2017-07-03    65.0
2017-07-04    73.0
2017-07-05    74.0
2017-07-06    77.0
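One detail worth checking on a small case: combine_first only prefers the calling series where it holds a non-NaN value, so a NaN in d2 at a shared date is filled from d1. The sketch below uses two tiny made-up series (the dates and values are invented for illustration, not taken from the question) and also shows one possible alternative, concat plus Index.duplicated, for the case where d2 should win at every shared date even when its value is NaN. Both variable names, combined and strict, are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-ins for d1 and d2 (values invented for illustration).
d1 = pd.Series(
    [19.0, 28.0, 24.0, np.nan],
    index=pd.to_datetime(["2002-01-27", "2002-01-28", "2002-01-29", "2002-01-30"]),
    name="maxtempmid",
)
d2 = pd.Series(
    [30.0, np.nan, 13.0, 77.0],
    index=pd.to_datetime(["2002-01-28", "2002-01-29", "2002-01-30", "2002-01-31"]),
    name="maxtempmid",
)

# Option 1: d2 wins wherever it has a non-NaN value; d1 fills the remaining gaps.
# On 2002-01-29 d2 is NaN, so the result keeps d1's 24.0.
combined = d2.combine_first(d1)

# Option 2: d2 wins at every shared date, even when its value is NaN.
# Stack d1 then d2, keep only the last occurrence of each date, and re-sort.
stacked = pd.concat([d1, d2])
strict = stacked[~stacked.index.duplicated(keep="last")].sort_index()

print(combined)
print(strict)
```

Which of the two fits depends on whether a NaN in d2 should count as "no data" (option 1) or as a deliberate override of d1 (option 2).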