I want to add a column to my data frame prod_data
based on a range of dates. This is an example of the data in the column ['Mount Time']
I want to modify the new column from:
0 2022-08-17 06:07:00
1 2022-08-17 06:12:00
2 2022-08-17 06:40:00
3 2022-08-17 06:45:00
4 2022-08-17 06:47:00
The new column is named ['Week']
and I want it to run from M-S, with week 1 starting on 9/5/22, running through 9/11/22 and then week 2 the next M-S, and so on until the last week which would be 53. I would also like weeks previous to 9/5 to have negative week numbers, so 8/29/22 would be the start of week -1 and so on.
The only thing I could think of was to create 2 massive lists and use np.select
to define the parameters of the column, but there has to be a cleaner way of doing this, right?
CodePudding user response:
You can use pandas datetime objects to figure out how many days away a date is from your start date, 9/5/2022, and then use floor division to convert that to week numbers. I made the "mount_time" column just to emphasize that the original column should be a datetime object.
prod_data["mount_time"] = pd.to_datetime( prod_data[ "Mount Time" ] )
start_date = pd.to_datetime( "9/5/2022" )
days_away = prod_data.mount_time - start_date
prod_data["Week"] = ( days_away.dt.days // 7 ) 1
As intended, 9/5/2022 through 9/11/2022 will have a value of 1. 8/29/2022 would start week 0 (not -1 as you wrote) unless you want 9/5/2022 to start as week 0 (in which case just delete the 1 from the code). Some more examples:
>>> test[ ["date", "Week" ] ]
date Week
0 2022-08-05 -4
1 2022-08-14 -3
2 2022-08-28 -1
3 2022-08-29 0
4 2022-08-30 0
5 2022-09-05 1
6 2022-09-11 1
7 2022-09-12 2