i have a dataframe like this:
id category year freq
101 1 2020 1
101 1 2021 1
202 2 2020 2
202 2 2021 6
203 3 2021 2
I need to transform the dataframe based on id, category and year's value and fill the year's value with frequency for the year. The desired output is:
id category 2020 2021
101 1 1 1
202 2 2 6
203 3 0 2
i have tried using one hot encoding, but the i can't fill each year's column with frequency.
CodePudding user response:
Seems like a job for df.pivot_table
. Notice that we'll use fill_value=0
to replace missing values with 0 (to match your expected output):
>>> df.pivot_table(values="freq", index=["id", "category"], columns="year", fill_value=0)
year 2020 2021
id category
101 1 1 1
202 2 2 6
203 3 0 2