Home > Back-end >  week of the year aggregation using python (week starts from 01 01 YYYY)
week of the year aggregation using python (week starts from 01 01 YYYY)

Time:05-23

I search in previous questions, and it does not resolve what i am searching, please can u help me

I have a dataset from

    Date         T2M    Y       T       F        H       G     Week_Number
0   1981-01-01  11.08   17.35   6.94    0.00    5.37    4.63    1
1   1981-01-02  10.82   16.41   7.51    0.00    5.55    2.73    1
2   1981-01-03  10.74   15.64   7.35    0.00    6.23    2.33    1
3   1981-01-04  11.17   15.99   8.46    0.00    6.16    1.66    1
4   1981-01-05  10.20   15.60   6.87    0.12    6.10    2.78    2
5   1981-01-06  10.35   16.16   5.95    0.00    6.59    3.92    2
6   1981-01-07  12.26   18.24   9.30    0.00    6.10    2.30    2
7   1981-01-08  12.76   19.23   8.72    0.00    6.29    3.96    2
8   1981-01-09  12.61   17.80   8.90    0.00    6.71    2.05    2

I already created a column of the week number using this code

df['Week_Number'] = df['Date'].dt.week

but it gives me only the first four days of the year that design the first week, maybe it means that the week start from monday. In my cases I don t give interest if it start from monday or another day, I just want to subdivise each year every seven days (group every 7 days of each year like from 1 1 1980 to 07 1 1980 FISRT WEEK, and go on, and every next year the first week starts too from 1 1 xxxx

CodePudding user response:

If you want your week numbers to start from the 1st of January, irrespective of the day of week, simply get the day of year, subtract 1 and compute the integer division by 7:

df['Date'] = pd.to_datetime(df['Date'])
df['week_number'] = df['Date'].dt.dayofyear.sub(1).floordiv(7).add(1)

NB. you do not need to add 1 if you want the first week to start with 0

output:

        Date    T2M      Y     T     F     H     G  Week_Number  week_number
0 1981-01-01  11.08  17.35  6.94  0.00  5.37  4.63            1            1
1 1981-01-02  10.82  16.41  7.51  0.00  5.55  2.73            1            1
2 1981-01-03  10.74  15.64  7.35  0.00  6.23  2.33            1            1
3 1981-01-04  11.17  15.99  8.46  0.00  6.16  1.66            1            1
4 1981-01-05  10.20  15.60  6.87  0.12  6.10  2.78            2            1
5 1981-01-06  10.35  16.16  5.95  0.00  6.59  3.92            2            1
6 1981-01-07  12.26  18.24  9.30  0.00  6.10  2.30            2            1
7 1981-01-08  12.76  19.23  8.72  0.00  6.29  3.96            2            2
8 1981-01-09  12.61  17.80  8.90  0.00  6.71  2.05            2            2

Then you can use the new column to groupby, for example:

df.groupby('week_number').agg({'Date': ['min', 'max'], 'T2M': 'sum'})

output:

                  Date               T2M
                   min        max    sum
week_number                             
1           1981-01-01 1981-01-07  76.62
2           1981-01-08 1981-01-09  25.37
  • Related