This is my dataframe:
It has 455 rows covering a sequence of days, with each row 4 hours apart.
I need to replace each 'demand' value with 0 when the hour of the timestamp is 23.
So I wrote this:
datadf['value']=datadf['timestamp'].apply(lambda x, y=datadf['value']: 0 if x.hour==23 else y)
I know the y value is wrong, but I couldn't find a way to refer to the same row's 'demand' value inside the lambda.
How can I refer to that demand value? Is there an alternative where the else branch does nothing?
CodePudding user response:
import pandas as pd
import numpy as np
#data preparation
df = pd.DataFrame()
df['date'] = pd.date_range(start='2022-06-01', periods=7, freq='4h') + pd.Timedelta(hours=3)
df['val'] = np.random.rand(7)
print(df)
>>
date val
0 2022-06-01 03:00:00 0.601889
1 2022-06-01 07:00:00 0.017787
2 2022-06-01 11:00:00 0.290662
3 2022-06-01 15:00:00 0.179150
4 2022-06-01 19:00:00 0.763534
5 2022-06-01 23:00:00 0.680892
6 2022-06-02 03:00:00 0.585380
#if your dates are not already in datetime format, convert them first
df['date'] = pd.to_datetime(df['date'])
df.loc[df['date'].dt.hour == 23, 'val'] = 0
#if you don't want to overwrite the original column, copy it first and modify the copy
#df['val_2'] = df['val']
#df.loc[df['date'].dt.hour == 23, 'val_2'] = 0
print(df)
>>
date val
0 2022-06-01 03:00:00 0.601889
1 2022-06-01 07:00:00 0.017787
2 2022-06-01 11:00:00 0.290662
3 2022-06-01 15:00:00 0.179150
4 2022-06-01 19:00:00 0.763534
5 2022-06-01 23:00:00 0.000000
6 2022-06-02 03:00:00 0.585380
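To address the original question about referring to the same row's value inside a lambda, here is a minimal sketch of two other options, assuming the question's column names ('timestamp' and 'value') and made-up sample values. Series.mask is a vectorized alternative to the df.loc assignment, and DataFrame.apply with axis=1 passes the whole row to the lambda, so the else branch can return the row's own value unchanged. The row-wise version is usually slower on large frames and is shown only because it matches what the question attempted.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question
datadf = pd.DataFrame({
    "timestamp": pd.date_range("2022-06-01 03:00", periods=7,
                               freq=pd.Timedelta(hours=4)),
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})

# Boolean mask: True for rows whose timestamp falls in hour 23
is_23 = datadf["timestamp"].dt.hour == 23

# Option 1: Series.mask replaces values where the condition is True
# and leaves every other value untouched
masked = datadf["value"].mask(is_23, 0)

# Option 2: row-wise apply; with axis=1 the lambda receives the whole row,
# so the else branch can return the row's own value
row_wise = datadf.apply(
    lambda row: 0 if row["timestamp"].hour == 23 else row["value"],
    axis=1,
)

print(masked)
print(row_wise)
```

Either result can be assigned back with datadf['value'] = masked if the column should be overwritten.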