I'm trying to complete the Sessions
column with a unique integer per session for further processing.
A session is defined by one day or a period from 9:30-16:00
Symbol Time Open High Low Close Volume LOD Sessions
2724312 AEHR 2019-09-23 09:31:00 1.42 1.42 1.42 1.42 200 NaN NaN
2724313 AEHR 2019-09-23 09:43:00 1.35 1.35 1.34 1.34 6062 NaN NaN
2724314 AEHR 2019-09-23 09:58:00 1.35 1.35 1.29 1.30 8665 NaN NaN
2724315 AEHR 2019-09-23 09:59:00 1.32 1.32 1.32 1.32 100 NaN NaN
2724316 AEHR 2019-09-23 10:00:00 1.35 1.35 1.35 1.35 400 NaN NaN
... ... ... ... ... ... ... ... ... ...
I've tried everything using loop
but I keep getting every KeyError and SettingWithCopyWarning in the book.
Edit: Error & Code added
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy self._setitem_single_block(indexer, value, name)
df = df
# Columns ['Symbol', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume', 'LOD', 'Sessions']
# Add Date column to loop through
df['Date'] = pd.to_datetime(df['Time']).dt.date
previous_session = df['Date'].iloc[0]
prev_sesh_count = 1
for i, row in df.iterrows():
current_session = df['Date'].iloc[i]
if previous_session == current_session:
df['Sessions'].iloc[i] = prev_sesh_count
else:
df['Sessions'].iloc[i] = prev_sesh_count 1
prev_sesh_count = prev_sesh_count 1
CodePudding user response:
Assuming the dataframe is sorted on Date
, we can use duplicated
along with cumsum
to assign the unqiue sessions numbers
df['Sessions'] = (~df.duplicated(['Symbol', 'Date'])).cumsum()
print(df)
Symbol Time Open High Low Close Volume LOD Sessions Date
2724312 AEHR 2019-09-23 09:31:00 1.42 1.42 1.42 1.42 200 NaN 1 2019-09-23
2724313 AEHR 2019-09-23 09:43:00 1.35 1.35 1.34 1.34 6062 NaN 1 2019-09-23
2724314 AEHR 2019-09-23 09:58:00 1.35 1.35 1.29 1.30 8665 NaN 1 2019-09-23
2724315 AEHR 2019-09-23 09:59:00 1.32 1.32 1.32 1.32 100 NaN 1 2019-09-23
2724316 AEHR 2019-09-23 10:00:00 1.35 1.35 1.35 1.35 400 NaN 1 2019-09-23