I search in previous questions, and it does not resolve what i am searching, please can u help me
I have a dataset from
Date T2M Y T F H G Week_Number
0 1981-01-01 11.08 17.35 6.94 0.00 5.37 4.63 1
1 1981-01-02 10.82 16.41 7.51 0.00 5.55 2.73 1
2 1981-01-03 10.74 15.64 7.35 0.00 6.23 2.33 1
3 1981-01-04 11.17 15.99 8.46 0.00 6.16 1.66 1
4 1981-01-05 10.20 15.60 6.87 0.12 6.10 2.78 2
5 1981-01-06 10.35 16.16 5.95 0.00 6.59 3.92 2
6 1981-01-07 12.26 18.24 9.30 0.00 6.10 2.30 2
7 1981-01-08 12.76 19.23 8.72 0.00 6.29 3.96 2
8 1981-01-09 12.61 17.80 8.90 0.00 6.71 2.05 2
I already created a column of the week number using this code
df['Week_Number'] = df['Date'].dt.week
but it gives me only the first four days of the year that design the first week, maybe it means that the week start from monday. In my cases I don t give interest if it start from monday or another day, I just want to subdivise each year every seven days (group every 7 days of each year like from 1 1 1980 to 07 1 1980 FISRT WEEK, and go on, and every next year the first week starts too from 1 1 xxxx
CodePudding user response:
If you want your week numbers to start from the 1st of January, irrespective of the day of week, simply get the day of year, subtract 1 and compute the integer division by 7:
df['Date'] = pd.to_datetime(df['Date'])
df['week_number'] = df['Date'].dt.dayofyear.sub(1).floordiv(7).add(1)
NB. you do not need to add 1 if you want the first week to start with 0
output:
Date T2M Y T F H G Week_Number week_number
0 1981-01-01 11.08 17.35 6.94 0.00 5.37 4.63 1 1
1 1981-01-02 10.82 16.41 7.51 0.00 5.55 2.73 1 1
2 1981-01-03 10.74 15.64 7.35 0.00 6.23 2.33 1 1
3 1981-01-04 11.17 15.99 8.46 0.00 6.16 1.66 1 1
4 1981-01-05 10.20 15.60 6.87 0.12 6.10 2.78 2 1
5 1981-01-06 10.35 16.16 5.95 0.00 6.59 3.92 2 1
6 1981-01-07 12.26 18.24 9.30 0.00 6.10 2.30 2 1
7 1981-01-08 12.76 19.23 8.72 0.00 6.29 3.96 2 2
8 1981-01-09 12.61 17.80 8.90 0.00 6.71 2.05 2 2
Then you can use the new column to groupby
, for example:
df.groupby('week_number').agg({'Date': ['min', 'max'], 'T2M': 'sum'})
output:
Date T2M
min max sum
week_number
1 1981-01-01 1981-01-07 76.62
2 1981-01-08 1981-01-09 25.37