I have two time-series data frames:
import pandas as pd
df1 = pd.DataFrame({'Date': [pd.to_datetime('1980-01-03'), pd.to_datetime('1980-01-04'),
pd.to_datetime('1980-01-05'), pd.to_datetime('1980-01-06'),
pd.to_datetime('1980-01-07'), pd.to_datetime('1980-01-08')],
'Temp': [13.5,10,14,12,10,9]})
df1
Date Temp
0 1980-01-03 13.5
1 1980-01-04 10.0
2 1980-01-05 14.0
3 1980-01-06 12.0
4 1980-01-07 10.0
5 1980-01-08 9.0
and
df2 = pd.DataFrame({'Date': [pd.to_datetime('1980-01-01'), pd.to_datetime('1980-01-02'),
pd.to_datetime('1980-01-03'), pd.to_datetime('1980-01-04')],
'Temp': [10,17,13.5,10]})
df2
Date Temp
0 1980-01-01 10.0
1 1980-01-02 17.0
2 1980-01-03 13.5
3 1980-01-04 10.0
My task is to combine these data frames on Date so that the result contains every date that appears in either data frame, has a single entry for dates present in both, and is sorted in chronological order.
To that end, I tried the following:
df = pd.concat([df1, df2])
df.reset_index(drop=True)
Date Temp
0 1980-01-03 13.5
1 1980-01-04 10.0
2 1980-01-05 14.0
3 1980-01-06 12.0
4 1980-01-07 10.0
5 1980-01-08 9.0
6 1980-01-01 10.0
7 1980-01-02 17.0
8 1980-01-03 13.5
9 1980-01-04 10.0
But this is not the result I want: the common dates appear twice and the rows are not in date order. What I am trying to get is:
Date Temp
0 1980-01-01 10.0
1 1980-01-02 17.0
2 1980-01-03 13.5
3 1980-01-04 10.0
4 1980-01-05 14.0
5 1980-01-06 12.0
6 1980-01-07 10.0
7 1980-01-08 9.0
What can I do? Maybe pd.concat() is not the way to go?
CodePudding user response:
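A possible solution is an outer merge, which with no `on` argument joins on every column the two frames share (here both Date and Temp), followed by a sort on Date: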
pd.merge(df1, df2, how="outer").sort_values(by="Date").reset_index(drop=True)
Output:
Date Temp
0 1980-01-01 10.0
1 1980-01-02 17.0
2 1980-01-03 13.5
3 1980-01-04 10.0
4 1980-01-05 14.0
5 1980-01-06 12.0
6 1980-01-07 10.0
7 1980-01-08 9.0
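The pd.concat() attempt from the question can also be made to work, since it only lacks de-duplication and sorting. Here is a minimal sketch, assuming that when a date appears in both frames it is fine to keep the first row encountered (here the one from df1). The name `result` is just for illustration:

```python
import pandas as pd

# Same sample data as in the question.
df1 = pd.DataFrame({
    "Date": pd.to_datetime(["1980-01-03", "1980-01-04", "1980-01-05",
                            "1980-01-06", "1980-01-07", "1980-01-08"]),
    "Temp": [13.5, 10, 14, 12, 10, 9],
})
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["1980-01-01", "1980-01-02",
                            "1980-01-03", "1980-01-04"]),
    "Temp": [10, 17, 13.5, 10],
})

result = (
    pd.concat([df1, df2])
    .drop_duplicates(subset="Date")  # assumption: keep the first row per date (df1's)
    .sort_values("Date")             # chronological order
    .reset_index(drop=True)          # fresh 0..n-1 index
)
print(result)
```

Unlike the outer merge, which matches rows on both Date and Temp, this keeps exactly one row per date even if the two frames disagree on the temperature for that date.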