Home > Software design >  concat 2 df by time index without dropping rows
concat 2 df by time index without dropping rows

Time:12-18

I convert the 'time' columns to index with the setindex function. I have 2 dfs, one with quarter hourly data and time as index. The second one contains hourly data with the identical index but hourly.

Now I would like to concat the dfs so the missing rows for the hourly data is just filled up with nans.

I have tried concat and it says "Reindexing only valid with uniquely valued Index objects" I have tried merge with inner, outer, by time, by index. not the result i hoped I have tried join with left, right and all possible combinations. I have tried merge_ordered no success.

I dont know why but I just want to keep the quarter hourly time and add the hourly column so it has (see picture)

There are so many posts about it but nothing has worked, I would really appreciate somebody's help on this.

how it looks initially

desired result result

This is the original df, i would like to concat/merge/append/join whatever another column which has hourly entries. Maybe i need to interpolate the entries and generate quarter hourly data before with filling of NaN?

CodePudding user response:

Here's an example of how to do what I believe your question asks:

import pandas as pd
df1 = pd.DataFrame({
'time':pd.to_datetime(['2021-12-31 19:15:00','2021-12-31 19:30:00','2021-12-31 19:45:00','2021-12-31 20:00:00','2021-12-31 20:15:00','2021-12-31 20:30:00','2021-12-31 20:45:00','2021-12-31 21:00:00','2021-12-31 21:15:00','2021-12-31 21:30:00','2021-12-31 21:45:00']),
'IA:(Euro/MWh)':[6.22,-5.08,-44.94,12.69,14.52,12.94,2.66,35.41,51.77,34.24,1.97],
'ID1:(Euro/MWh)':[25.04,5.36,0.25,6.74,2.44,5.35,8.65,43.96,80.70,90.41,92.83],
'ID3:(Euro/MWh)':[46.57,20.20,1.49,24.27,24.10,19.24,16.52,43.60,65.80,72.87,66.44]})
print(df1)

df2 = pd.DataFrame({
'time':pd.to_datetime(['2021-12-31 20:00:00','2021-12-31 21:00:00']),
'hourly data':[33,66]})
print(df2)

res = df1.join(df2.set_index('time'), on='time')
print(res)

Input:

                  time  IA:(Euro/MWh)  ID1:(Euro/MWh)  ID3:(Euro/MWh)
0  2021-12-31 19:15:00           6.22           25.04           46.57
1  2021-12-31 19:30:00          -5.08            5.36           20.20
2  2021-12-31 19:45:00         -44.94            0.25            1.49
3  2021-12-31 20:00:00          12.69            6.74           24.27
4  2021-12-31 20:15:00          14.52            2.44           24.10
5  2021-12-31 20:30:00          12.94            5.35           19.24
6  2021-12-31 20:45:00           2.66            8.65           16.52
7  2021-12-31 21:00:00          35.41           43.96           43.60
8  2021-12-31 21:15:00          51.77           80.70           65.80
9  2021-12-31 21:30:00          34.24           90.41           72.87
10 2021-12-31 21:45:00           1.97           92.83           66.44
                 time  hourly data
0 2021-12-31 20:00:00           33
1 2021-12-31 21:00:00           66

Output:

                  time  IA:(Euro/MWh)  ID1:(Euro/MWh)  ID3:(Euro/MWh)  hourly data
0  2021-12-31 19:15:00           6.22           25.04           46.57          NaN
1  2021-12-31 19:30:00          -5.08            5.36           20.20          NaN
2  2021-12-31 19:45:00         -44.94            0.25            1.49          NaN
3  2021-12-31 20:00:00          12.69            6.74           24.27         33.0
4  2021-12-31 20:15:00          14.52            2.44           24.10          NaN
5  2021-12-31 20:30:00          12.94            5.35           19.24          NaN
6  2021-12-31 20:45:00           2.66            8.65           16.52          NaN
7  2021-12-31 21:00:00          35.41           43.96           43.60         66.0
8  2021-12-31 21:15:00          51.77           80.70           65.80          NaN
9  2021-12-31 21:30:00          34.24           90.41           72.87          NaN
10 2021-12-31 21:45:00           1.97           92.83           66.44          NaN
  • Related