My data looks like this:
df_dict = {
'Year' : [2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022],
'Week of Year' : [1, 1, 2, 2, 10, 10, 11, 11]
}
df = pd.DataFrame(df_dict)
How can I generate a new column, say Week Order
that shows the unique Year, Week of Year
combinations in a cumulative way. The resulting data set will be like this:
Year Week of Year Week Order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
CodePudding user response:
You can use pandas.factorize
:
df['Week Order'] = df.agg(tuple, axis=1).factorize()[0] 1)
# Output :
print(df)
Year Week of Year Week Order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
CodePudding user response:
here is one way to do it
df['week order']=1
df['week order']=df['week order'].mask(df.duplicated()).cumsum().ffill().astype(int)
df
Year Week of Year week order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
CodePudding user response:
Another option, sort_values
duplicated
cumsum
, i.e. every non duplicated Year Week increases the order by one:
cols = ['Year', 'Week of Year']
df['Week Order'] = (~df.sort_values(cols).duplicated(cols)).cumsum()
df
Year Week of Year Week Order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4