Assign week number relative to month for each date in a DataFrame

Time:11-29

Consider the following pandas DataFrame.

| date          | other_columns  |...
| ------------- | -------------- |...
| 2022-02-06    | row            |...
| 2022-02-07    | row            |...
| 2022-02-08    | row            |...
| 2022-02-15    | row            |...
| 2022-02-24    | row            |...
| 2022-02-28    | row            |...

I want to add an extra week column holding the week that each date falls in. The week is found simply by grouping the days of the month into 7-day intervals and numbering them. I don't want the behavior of datetime.week (week of the year); I want the value to be relative to the month.

| date          | other_columns  |...| week     |
| ------------- | -------------- |...| -------- |
| 2022-02-06    | row            |...| 1 week   |
| 2022-02-07    | row            |...| 1 week   |
| 2022-02-08    | row            |...| 2 week   |
| 2022-02-15    | row            |...| 3 week   |
| 2022-02-24    | row            |...| 4 week   |
| 2022-02-28    | row            |...| 4 week   |

Days 1-7 correspond to the first week, 8-14 to the second, 15-21 to the third, 22-28 to the fourth, and 29-31 to the fifth. Only the day of the month matters, not the month itself.
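As a quick sanity check of that rule, here is a minimal sketch in plain Python. The function name `week_of_month` is hypothetical (it is not part of pandas or the standard library); it just applies integer division to the day of the month.

```python
def week_of_month(day: int) -> int:
    """Return the 1-based 7-day block that a day of the month falls in."""
    if not 1 <= day <= 31:
        raise ValueError(f"day must be in 1..31, got {day}")
    # Shift to 0-based, count whole 7-day blocks, shift back to 1-based.
    return (day - 1) // 7 + 1


# Days of the month taken from the sample dates in the question.
for day in (6, 7, 8, 15, 24, 28):
    print(day, week_of_month(day))
```

With this rule, day 28 still belongs to the fourth block; only days 29-31 land in the fifth.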

CodePudding user response:

Could you use something like this?

import pandas as pd
import math

# create a date range
dr = pd.date_range(
    start="2022-02-01",
    end="2022-02-28",
    freq="D"
)

# create a dataframe
df = pd.DataFrame(
    {
        "date": dr
    }
)

# define a function to get the week number
def get_week_in_month(df, date_column):
    df["day"] = df[date_column].dt.day

    df["week"] = df["day"].apply(lambda x: math.ceil(x / 7))

    del df["day"]

    return df

# transform the dataframe
df = get_week_in_month(df, "date")

This gives me the following output:

    date    week
0   2022-02-01  1
1   2022-02-02  1
2   2022-02-03  1
3   2022-02-04  1
4   2022-02-05  1
5   2022-02-06  1
6   2022-02-07  1
7   2022-02-08  2
8   2022-02-09  2
9   2022-02-10  2
10  2022-02-11  2
11  2022-02-12  2
12  2022-02-13  2
13  2022-02-14  2
14  2022-02-15  3
15  2022-02-16  3
16  2022-02-17  3
17  2022-02-18  3
18  2022-02-19  3
19  2022-02-20  3
20  2022-02-21  3
21  2022-02-22  4
22  2022-02-23  4
23  2022-02-24  4
24  2022-02-25  4
25  2022-02-26  4
26  2022-02-27  4
27  2022-02-28  4

You could then format the week number as needed.
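For instance, to get the "1 week" style labels from the question, one possible approach is to cast the number to a string and append the suffix. This is only a sketch: the column name `week_label` is an assumption, and the three dates are a subset of the question's sample data.

```python
import math

import pandas as pd

# A few of the sample dates from the question.
df = pd.DataFrame(
    {"date": pd.to_datetime(["2022-02-06", "2022-02-08", "2022-02-28"])}
)

# Numeric week within the month, same rule as in the answer above.
df["week"] = df["date"].dt.day.apply(lambda d: math.ceil(d / 7))

# "week_label" is a hypothetical column name for the formatted text.
df["week_label"] = df["week"].astype(str) + " week"

print(df)
```

Keeping the numeric `week` column alongside the label can be handy if you later need to sort or group by week.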

You could also do the whole thing in one line using the code below:

df["week"] = df["date"].dt.day.apply(lambda x: math.ceil(x / 7))

CodePudding user response:

Use:

df['date'] = pd.to_datetime(df['date'])

df['new2'] = ((df["date"].dt.day - 1) // 7 + 1).astype(str) + ' week'
print (df)
        date other_columns    new2
0 2022-02-06           row  1 week
1 2022-02-07           row  1 week
2 2022-02-08           row  2 week
3 2022-02-15           row  3 week
4 2022-02-24           row  4 week
5 2022-02-28           row  4 week
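Both answers should give the same week numbers. A small check over a 31-day month, assuming the variable names `ceil_weeks` and `floor_weeks` (they are just for illustration), could look like this:

```python
import math

import pandas as pd

# January 2022 has 31 days, so it covers every possible day of the month.
dates = pd.Series(pd.date_range("2022-01-01", "2022-01-31", freq="D"))

# First answer: round day / 7 up to the next integer.
ceil_weeks = dates.dt.day.apply(lambda d: math.ceil(d / 7))

# Second answer: vectorized integer division, no Python-level apply.
floor_weeks = (dates.dt.day - 1) // 7 + 1

# True when the two approaches agree on every day.
print((ceil_weeks == floor_weeks).all())

# Number of days that fall in each week.
print(floor_weeks.value_counts().sort_index().to_dict())
```

The integer-division version avoids calling a Python function per row, which may matter on large DataFrames.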