im doing some DataFrame manipulation over pandas, i made this two tables, here is a little part of the first
df2 this has a type <class 'pandas.core.frame.DataFrame'>
timestamp open high low close time equity balance position
datetime
2021-10-27 00:40:00 1635313200000 4268.13 4285.00 3890.17 3995.06 1635292800000000000 1000.0 1000.0 0
2021-10-28 00:00:00 1635397200000 3995.06 4405.50 3981.23 4355.00 1635379200000000000 1000.0 1000.0 0
2021-10-29 00:00:00 1635483600000 4355.01 4460.47 4278.70 4336.78 1635465600000000000 1000.0 1000.0 0
2021-10-30 00:00:00 1635570000000 4336.77 4399.14 4232.20 4367.59 1635552000000000000 1000.0 1000.0 0
2021-10-31 00:00:00 1635656400000 4367.58 4374.89 4146.30 4225.00 1635638400000000000 1000.0 1000.0 0
2021-11-01 00:00:00 1635746400000 4225.00 4386.58 4198.55 4365.10 1635724800000000000 1000.0 1000.0 0
2021-11-02 00:00:00 1635832800000 4365.09 4639.98 4330.02 4559.36 1635811200000000000 1000.0 1000.0 0
2021-11-03 00:00:00 1635919200000 4559.36 4665.87 4455.00 4539.73 1635897600000000000 1000.0 1000.0 0
also i got from a algoritmig trading strategy, a dataframe representing the singals here it is signals
datetime
2021-11-28 00:00:33.726288 1
2021-12-06 13:00:14.885480 1
2021-12-06 13:00:16.520445 -1
2021-11-28 00:00:42.704140 -1
2021-12-10 04:00:15.557555 1
2021-12-10 04:00:15.611935 1
2021-12-10 06:00:17.704981 -1
2021-12-10 06:00:18.404945 -1
Name: position, dtype: object
this has a type <class 'pandas.core.series.Series'>
now i am trying to change the position value on the first dataframe to 1 or -1(acording to the position column in the signals dataframe for the rows that has the datetime equal to the second dataframe, i know that it may be posible that the extact datetime can not be on the column, so i tried to make stuff like
df2['position']= signals_orders[df2['position'][signals_orders.index==df2.index]]
but as i suspect there is an length error, i hope you can helpme thanks in advance
edition:
i used the code from mozway
df2.index = pd.to_datetime(df2.index)
signals_orders.index = pd.to_datetime(signals_orders.index)
df2 = df2.sort_index()
# merge
out = (pd
.merge_asof(df2.drop(columns='position'),
signals_orders.sort_index(),
left_index=True, right_index=True, )
.assign(position=lambda d: d['position'].fillna(df2['position']))
)
but now the position column is giving a lot of -1 that should not be there
2021-10-27 00:40:00 1635313200000 4268.13 4272.00 4264.41 4272.00 2021-10-27 00:40:00 0
2021-10-27 00:45:00 1635313500000 4271.99 4277.40 4263.17 4277.39 2021-10-27 00:45:00 0
2021-10-27 00:50:00 1635313800000 4277.40 4277.92 4266.12 4270.95 2021-10-27 00:50:00 0
2021-10-27 00:55:00 1635314100000 4270.94 4280.00 4269.67 4273.46 2021-10-27 00:55:00 0
2021-10-27 01:00:00 1635314400000 4273.46 4285.00 4272.15 4273.11 2021-10-27 01:00:00 0
... ... ... ... ... ... ... ...
2021-12-11 06:40:00 1639226400000 4037.62 4048.55 4037.34 4048.55 2021-12-11 06:40:00 -1
2021-12-11 06:45:00 1639226700000 4048.55 4049.00 4035.14 4035.31 2021-12-11 06:45:00 -1
2021-12-11 06:50:00 1639227000000 4035.32 4042.38 4035.31 4037.37 2021-12-11 06:50:00 -1
2021-12-11 06:55:00 1639227300000 4037.37 4042.44 4030.90 4038.15 2021-12-11 06:55:00 -1
2021-12-11 07:00:00 1639227600000 4038.14 4043.81 4024.55 4025.42 2021-12-11 07:00:00 -1
CodePudding user response:
You should join the make your series into a dataframe with
signals_frame = signals_orders.to_frame()
Then you should join to df2 dataframe on the index
merged_df = pd.merge(df2, signals_frame)
And then you should update the values in the resultant dataframe.
merged_df['position'] = merged_df['whatever_the_column_for_your_1_or_negative_1_is']
CodePudding user response:
IIUC, you should use a merge_asof
:
# ensure datetime
df2.index = pd.to_datetime(df2.index)
signals_orders.index = pd.to_datetime(signals_orders.index)
# merge
out = (pd
.merge_asof(df2.drop(columns='position'),
signals_orders.sort_index(),
left_index=True, right_index=True, )
.assign(position=lambda d: d['position'].fillna(df2['position']))
)
Output:
timestamp open high low close time equity balance position
datetime
2021-10-27 00:40:00 1635313200000 4268.13 4285.00 3890.17 3995.06 1635292800000000000 1000.0 1000.0 0.0
2021-10-28 00:00:00 1635397200000 3995.06 4405.50 3981.23 4355.00 1635379200000000000 1000.0 1000.0 0.0
2021-10-29 00:00:00 1635483600000 4355.01 4460.47 4278.70 4336.78 1635465600000000000 1000.0 1000.0 1.0
2021-10-30 00:00:00 1635570000000 4336.77 4399.14 4232.20 4367.59 1635552000000000000 1000.0 1000.0 1.0
2021-10-31 00:00:00 1635656400000 4367.58 4374.89 4146.30 4225.00 1635638400000000000 1000.0 1000.0 1.0
2021-11-01 00:00:00 1635746400000 4225.00 4386.58 4198.55 4365.10 1635724800000000000 1000.0 1000.0 1.0
2021-11-02 00:00:00 1635832800000 4365.09 4639.98 4330.02 4559.36 1635811200000000000 1000.0 1000.0 -1.0
2021-11-03 00:00:00 1635919200000 4559.36 4665.87 4455.00 4539.73 1635897600000000000 1000.0 1000.0 -1.0
NB. I used a modified input for signals_orders:
datetime
2021-10-28 00:00:33.726288 1
2021-11-01 13:00:14.885480 -1
2021-12-06 13:00:16.520445 -1