Home > other >  Add column to dataframe based on date range
Add column to dataframe based on date range

Time:08-24

I want to add a column to my data frame prod_data based on a range of dates. This is an example of the data in the column ['Mount Time'] I want to modify the new column from:

0     2022-08-17 06:07:00
1     2022-08-17 06:12:00
2     2022-08-17 06:40:00
3     2022-08-17 06:45:00
4     2022-08-17 06:47:00

The new column is named ['Week'] and I want it to run from M-S, with week 1 starting on 9/5/22, running through 9/11/22 and then week 2 the next M-S, and so on until the last week which would be 53. I would also like weeks previous to 9/5 to have negative week numbers, so 8/29/22 would be the start of week -1 and so on.

The only thing I could think of was to create 2 massive lists and use np.select to define the parameters of the column, but there has to be a cleaner way of doing this, right?

CodePudding user response:

You can use pandas datetime objects to figure out how many days away a date is from your start date, 9/5/2022, and then use floor division to convert that to week numbers. I made the "mount_time" column just to emphasize that the original column should be a datetime object.

prod_data["mount_time"] = pd.to_datetime( prod_data[ "Mount Time" ] )
start_date = pd.to_datetime( "9/5/2022" )
days_away = prod_data.mount_time - start_date
prod_data["Week"] = ( days_away.dt.days // 7 )   1

As intended, 9/5/2022 through 9/11/2022 will have a value of 1. 8/29/2022 would start week 0 (not -1 as you wrote) unless you want 9/5/2022 to start as week 0 (in which case just delete the 1 from the code). Some more examples:

>>> test[ ["date", "Week" ] ]
        date  Week
0 2022-08-05    -4
1 2022-08-14    -3
2 2022-08-28    -1
3 2022-08-29     0
4 2022-08-30     0
5 2022-09-05     1
6 2022-09-11     1
7 2022-09-12     2
  • Related