Consider the following pandas DataFrame.
| date | other_columns |...
| ------------- | -------------- |...
| 2022-02-06 | row |...
| 2022-02-07 | row |...
| 2022-02-08 | row |...
| 2022-02-15 | row |...
| 2022-02-24 | row |...
| 2022-02-28 | row |...
I want to add a week column giving the week that each date falls in. The days are simply grouped into 7-day intervals, and each interval gets a number. I don't want the behavior of datetime.week; I want the value to be relative to the month.
| date | other_columns |...| week |
| ------------- | -------------- |...| -------- |
| 2022-02-06 | row |...| 1 week |
| 2022-02-07 | row |...| 1 week |
| 2022-02-08 | row |...| 2 week |
| 2022-02-15 | row |...| 3 week |
| 2022-02-24 | row |...| 4 week |
| 2022-02-28 | row |...| 4 week |
Days 1-7 correspond to the first week, 8-14 to the second, 15-21 to the third, 22-28 to the fourth, and 29-31 to the fifth. Only the day of the month matters, not the month itself.
CodePudding user response:
Could you use something like this?
import pandas as pd
import math

# create a date range
dr = pd.date_range(
    start="2022-02-01",
    end="2022-02-28",
    freq="D"
)

# create a dataframe
df = pd.DataFrame(
    {
        "date": dr
    }
)

# define a function to get the week number
def get_week_in_month(df, date_column):
    df["day"] = df[date_column].dt.day
    df["week"] = df["day"].apply(lambda x: math.ceil(x / 7))
    del df["day"]
    return df

# transform the dataframe
df = get_week_in_month(df, "date")
This gives me the following output:
date week
0 2022-02-01 1
1 2022-02-02 1
2 2022-02-03 1
3 2022-02-04 1
4 2022-02-05 1
5 2022-02-06 1
6 2022-02-07 1
7 2022-02-08 2
8 2022-02-09 2
9 2022-02-10 2
10 2022-02-11 2
11 2022-02-12 2
12 2022-02-13 2
13 2022-02-14 2
14 2022-02-15 3
15 2022-02-16 3
16 2022-02-17 3
17 2022-02-18 3
18 2022-02-19 3
19 2022-02-20 3
20 2022-02-21 3
21 2022-02-22 4
22 2022-02-23 4
23 2022-02-24 4
24 2022-02-25 4
25 2022-02-26 4
26 2022-02-27 4
27 2022-02-28 4
You could then format the week number as needed.
You could also do the whole thing in one line:
df["week"] = df["date"].dt.day.apply(lambda x: math.ceil(x / 7))
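If you would rather avoid floating-point division, rounding up `day / 7` should be the same as the integer-only expression `(day - 1) // 7 + 1`. The snippet below is just a quick sanity check over every possible day of the month; the names `weeks_ceil` and `weeks_floor` are made up for illustration.

```python
import math

# Week-of-month for days 1..31, computed two ways.
weeks_ceil = [math.ceil(day / 7) for day in range(1, 32)]
weeks_floor = [(day - 1) // 7 + 1 for day in range(1, 32)]

# Both forms agree for every day a month can have.
assert weeks_ceil == weeks_floor

# Days 1, 7, 8, 28 and 31 respectively.
print(weeks_ceil[0], weeks_ceil[6], weeks_ceil[7], weeks_ceil[27], weeks_ceil[30])
# prints: 1 1 2 4 5
```

The integer form is handy with pandas because it works directly on a whole column without `apply`.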
CodePudding user response:
Use:
df['date'] = pd.to_datetime(df['date'])
df['new2'] = ((df["date"].dt.day - 1) // 7 + 1).astype(str) + ' week'
print(df)
date other_columns new2
0 2022-02-06 row 1 week
1 2022-02-07 row 1 week
2 2022-02-08 row 2 week
3 2022-02-15 row 3 week
4 2022-02-24 row 4 week
5 2022-02-28 row 4 week
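For anyone who wants to try this without an existing DataFrame, here is a self-contained sketch. It assumes the sample dates from the question, uses `"row"` as a stand-in for the other columns, and writes the result to a column named `week`; the intermediate name `week_number` is only for readability.

```python
import pandas as pd

# Sample data mirroring the question; "other_columns" is a placeholder.
df = pd.DataFrame({
    "date": ["2022-02-06", "2022-02-07", "2022-02-08",
             "2022-02-15", "2022-02-24", "2022-02-28"],
    "other_columns": ["row"] * 6,
})
df["date"] = pd.to_datetime(df["date"])

# Days 1-7 -> 1, 8-14 -> 2, 15-21 -> 3, 22-28 -> 4, 29-31 -> 5.
week_number = (df["date"].dt.day - 1) // 7 + 1

# Format as a label such as "1 week".
df["week"] = week_number.astype(str) + " week"
print(df)
```

Keeping `week_number` as a separate integer Series makes it easy to sort or group by week before turning it into a label.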