I have 2 DataFrames (signal_df
, and price_df
) that can be generated using the following code.
import pandas as pd
import numpy as np
signal_df = pd.DataFrame({
'long':[
True ,True, np.nan, True, np.nan
],
'short':[
np.nan, np.nan, True, np.nan, True
],
'date':[
'2020-01-01 19:15', '2020-01-01 20:00', '2020-01-01 22:15', '2020-01-01 22:45', '2020-01-02 00:30',
],
})
# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(signal_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
signal_df = signal_df.set_index(datetime_index)
signal_df.drop('date',axis=1,inplace=True)
print(signal_df)
price_df = pd.DataFrame({
'close_price':[
30, 2, 3, 29,
15, 6, 19, 56,
9 , 38, 41, 12,
23, 14, 15, 16,
38, 18, 19, 20,
21, 22, 23, 33,
25, 26, 10, 28
],
'date':[
'2020-01-01 19:00', '2020-01-01 19:15', '2020-01-01 19:30', '2020-01-01 19:45',
'2020-01-01 20:00', '2020-01-01 20:15', '2020-01-01 20:30', '2020-01-01 20:45',
'2020-01-01 21:00', '2020-01-01 21:15', '2020-01-01 21:30', '2020-01-01 21:45',
'2020-01-01 22:00', '2020-01-01 22:15', '2020-01-01 22:30', '2020-01-01 22:45',
'2020-01-01 23:00', '2020-01-01 23:15', '2020-01-01 23:30', '2020-01-01 23:45',
'2020-01-02 00:00', '2020-01-02 00:15', '2020-01-02 00:30', '2020-01-02 00:45',
'2020-01-02 01:00', '2020-01-02 01:15', '2020-01-02 01:30', '2020-01-02 01:45',
]
})
# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(price_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
price_df = price_df.set_index(datetime_index)
price_df.drop('date',axis=1,inplace=True)
print(price_df)
How can I create a third DataFrame (new_df
) that looks like the following?
new_df
long
2020-01-01 19:15:00 2
2020-01-01 20:00:00 15
2020-01-01 22:15:00 NaN
2020-01-01 22:45:00 16
2020-01-02 00:30:00 NaN
new_df
is created by first looking for the True
values in signal_df
's long
column, then replacing them with the corresponding values in price_df
.
CodePudding user response:
You can try mask
the True value in long
column of signal_df
out = (signal_df['long'].mask(signal_df['long'].eq(True),
price_df.loc[signal_df.index, 'close_price'])
.to_frame())
print(out)
long
2020-01-01 19:15:00 2
2020-01-01 20:00:00 15
2020-01-01 22:15:00 NaN
2020-01-01 22:45:00 16
2020-01-02 00:30:00 NaN
CodePudding user response:
You can do:
(price_df.loc[signal_df.where(signal_df.long).index].mask(
signal_df.long.ne(True)).rename(columns={'close_price':'long'}))
output:
long
2020-01-01 19:15:00 2.0
2020-01-01 20:00:00 15.0
2020-01-01 22:15:00 NaN
2020-01-01 22:45:00 16.0
2020-01-02 00:30:00 NaN