Consider the following pandas DataFrame:
| country_ID | date | counter | value |
| -------- | ------------------------- | -------- | ------ |
| USA | 2022-03-01 09:22:29+00:00 | 1 | red |
| UK | 2022-03-01 11:21:20+00:00 | 1 | blue |
| USA | 2022-04-02 12:15:23+00:00 | 1 | red |
| ITL | 2022-04-03 11:13:31+00:00 | 1 | red |
| USA | 2022-05-05 21:04:42+00:00 | 1 | green |
| USA | 2022-05-05 22:01:51+00:00 | 1 | green |
| ITL | 2022-06-06 13:00:41+00:00 | 1 | red |
Given a date and time range (start and end) and a country_ID, I want to add 2 hours to the date of every row for that country whose date falls within the range:
Example:
add_hours('USA', '2022-03-01 09:00:00', '2022-05-05 21:30:00', 2)
| country_ID | date | counter | value |
| -------- | ------------------------- | -------- | ------ |
| USA | 2022-03-01 11:22:29+00:00 | 1 | red |
| UK | 2022-03-01 11:21:20+00:00 | 1 | blue |
| USA | 2022-04-02 14:15:23+00:00 | 1 | red |
| ITL | 2022-04-03 11:13:31+00:00 | 1 | red |
| USA | 2022-05-05 23:04:42+00:00 | 1 | green |
| USA | 2022-05-05 22:01:51+00:00 | 1 | green |
| ITL | 2022-06-06 13:00:41+00:00 | 1 | red |
CodePudding user response:
You can do this with boolean indexing (the date column must have a datetime dtype, not strings):
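import pandas as pd

def add_hours(df, country, start, end, hours=2):
    # Rows for the requested country whose date lies within [start, end]
    is_country = df['country_ID'].eq(country)
    valid_date = df['date'].between(start, end)
    # Shift the matching dates in place
    df.loc[is_country & valid_date, 'date'] += pd.Timedelta(hours=hours)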
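To check the approach end to end, here is a minimal sketch that rebuilds the sample data and applies the shift. It assumes the dates are in UTC: the conversion with `pd.to_datetime(..., utc=True)` and the reading of `start` and `end` as UTC are assumptions, not something the question states. This version of `add_hours` also returns the DataFrame for convenience.

```python
import pandas as pd

def add_hours(df, country, start, end, hours=2):
    """Shift 'date' by `hours` for rows of `country` within [start, end]."""
    # Assumption: 'date' is UTC-aware, so the bounds are interpreted as UTC too.
    start = pd.Timestamp(start, tz="UTC")
    end = pd.Timestamp(end, tz="UTC")
    mask = df["country_ID"].eq(country) & df["date"].between(start, end)
    df.loc[mask, "date"] += pd.Timedelta(hours=hours)
    return df

df = pd.DataFrame({
    "country_ID": ["USA", "UK", "USA", "ITL", "USA", "USA", "ITL"],
    "date": [
        "2022-03-01 09:22:29+00:00", "2022-03-01 11:21:20+00:00",
        "2022-04-02 12:15:23+00:00", "2022-04-03 11:13:31+00:00",
        "2022-05-05 21:04:42+00:00", "2022-05-05 22:01:51+00:00",
        "2022-06-06 13:00:41+00:00",
    ],
    "counter": [1] * 7,
    "value": ["red", "blue", "red", "red", "green", "green", "red"],
})

# Convert the string column to timezone-aware datetimes before filtering.
df["date"] = pd.to_datetime(df["date"], utc=True)

add_hours(df, "USA", "2022-03-01 09:00:00", "2022-05-05 21:30:00", 2)
print(df)
```

`between` is inclusive on both ends by default, so a row dated exactly at `start` or `end` would also be shifted.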