Home > Blockchain >  replace dataframe values based on another dataframe
replace dataframe values based on another dataframe

Time:08-30

I have a pandas dataframe which is structured as follows:

             timestamp       y
0 2020-01-01 00:00:00   336.0
1 2020-01-01 00:15:00   544.0
2 2020-01-01 00:30:00   736.0
3 2020-01-01 00:45:00   924.0
4 2020-01-01 01:00:00  1260.0
...

The timestamp column is a datetime data type

and I have another dataframe with the following structure:

                     y
timestamp
00:00:00    625.076923
00:15:00    628.461538
00:30:00    557.692308
00:45:00    501.692308
01:00:00    494.615385
...

I this case, the time is the pandas datetime index.

Now what I want to do is replace the values in the first dataframe where the time field is matching i.e. the time of the day is matching with the second dataset.

CodePudding user response:

IIUC your first dataframe df1's timestamp is datetime type and your second dataframe (df2) has an index of type datetime as well but only time and not date.

then you can do:

df1['y'] = df1['timestamp'].dt.time.map(df2['y'])

CodePudding user response:

I wouldn't be surprised if there is a better way, but you can accomplish this by working to get the tables so that they can merge on the time. Assuming your dataframes will be df and df2.

df['time'] = df['timestamp'].dt.time
df2 = df2.reset_index()
df2['timestamp'] = pd.to_datetime(df2['timestamp'].dt.time
df_combined = pd.merge(df,df2,left_on='time',right_on='timestamp')
df_combined
    timestamp_x           y_x   time      timestamp_y   y_y
0   2020-01-01 00:00:00 336.0   00:00:00    00:00:00    625.076923
1   2020-01-01 00:15:00 544.0   00:15:00    00:15:00    628.461538
2   2020-01-01 00:30:00 736.0   00:30:00    00:30:00    557.692308
3   2020-01-01 00:45:00 924.0   00:45:00    00:45:00    501.692308
4   2020-01-01 01:00:00 1260.0  01:00:00    01:00:00    494.615385
# This clearly has more than you need, so just keep what you want and rename things back.

df_combined = df_combined[['timestamp_x','y_y']]
df_combined = df_combined.rename(columns={'timestamp_x':'timestamp','y_y':'y'})

CodePudding user response:

New answer I like way better: actually using .map()

Still need to get df2 to have the time column to match on.

df2 = df2.reset_index()
df2['timestamp'] = pd.to_datetime(df2['timestamp'].dt.time
df['y'] = df['timestamp'].dt.time.map(dict(zip(df2['timestamp',df2['y'])))
  • Related