My two dataframes:
wetter
Out[223]:
level_0 index TEMPERATURE:TOTAL SLP HOUR Time
0 0 2018-01-01 00:00:00 9.8 NaN 00 00:00
1 1 2018-01-01 01:00:00 9.8 NaN 01 01:00
2 2 2018-01-01 02:00:00 9.2 NaN 02 02:00
3 3 2018-01-01 03:00:00 8.4 NaN 03 03:00
4 4 2018-01-01 04:00:00 8.5 NaN 04 04:00
... ... ... ... ... ...
49034 49034 2018-12-31 22:40:00 8.5 NaN 22 22:40
49035 49035 2018-12-31 22:45:00 8.4 NaN 22 22:45
49036 49036 2018-12-31 22:50:00 8.4 NaN 22 22:50
49037 49037 2018-12-31 22:55:00 8.4 NaN 22 22:55
49038 49038 2018-12-31 23:00:00 8.4 NaN 23 23:00
[49039 rows x 6 columns]
df
Out[224]:
0 Time -14 -13 ... 17 18 NaN
1 00:00 1,256326635 1,218256131 ... 0,080348715 0,040194189 00:15
2 00:15 1,256564788 1,218487067 ... 0,080254367 0,039517006 00:30
3 00:30 1,260350982 1,222158528 ... 0,080219518 0,039054261 00:45
4 00:45 1,259306606 1,221145800 ... 0,080758578 0,039176953 01:00
5 01:00 1,258521518 1,220384502 ... 0,080444585 0,038164953 01:15
.. ... ... ... ... ... ... ...
92 22:45 1,253545107 1,215558891 ... 0,080164570 0,042697436 23:00
93 23:00 1,241253483 1,203639741 ... 0,078395829 0,039685235 23:15
94 23:15 1,242890274 1,205226933 ... 0,078801415 0,039170364 23:30
95 23:30 1,240459118 1,202869448 ... 0,079511294 0,039013684 23:45
96 23:45 1,236228281 1,198766818 ... 0,079186806 0,037570494 00:00
[96 rows x 35 columns]
I want to fill the SLP
column of wetter
based on TEMPERATURE:TOTAL
and Time
.
For this I want to look at the df
dataframe and fill SLP depending on the columns of df
, where the headers are temperatures.
So for the first TEMPERATURE:TOTAL
of 9.8
at 00:00
, SLP
should be filled with the value of the column that is simply called 9
in row 00:00 of Time
I have tried to do this, which is why i also created the Time columns but I am stuck. I thought of some nested loops but knowing a bit of pandas I guess there is probably a two-liner solution for this?
CodePudding user response:
Here is one way!
import numpy as np
import pandas as pd
This is me simulating your dataframes (you are free to skip this step) - next time please provide them.
wetter = pd.DataFrame()
df = pd.DataFrame()
wetter['TEMPERATURE:TOTAL'] = np.random.rand(10) * 10
wetter['SLP'] = np.nan * 10
wetter['Time'] = pd.date_range("00:00", periods=10, freq="H")
df['Time'] = pd.date_range("00:00", periods=10, freq="15T")
for i in range(-14, 18):
df[i] = np.random.rand(10)
Preprocess:
wetter['temp'] = np.floor(wetter['TEMPERATURE:TOTAL'])
wetter = wetter.astype({'temp': 'int'})
wetter.set_index('Time')
df.set_index('Time')
df = df.reset_index()
value_vars_ = list(range(-14, 18))
df_long = pd.melt(df, id_vars='Time', value_vars=value_vars_, var_name='temp', value_name="SLP")
Left-join two dataframes on Time
and temp
:
final = pd.merge(wetter.drop('SLP', axis=1), df_long, how="left", on=["Time", "temp"])