I have the following pandas DataFrame:
df = pd.DataFrame({"person": ["Joe", "Jack", "Thomas", "Jack", "Joe", "Thomas"],
"fruit": ["apple", "apple", "banana", "orange", "banana", "banana"]})
I want to transform this to a new DataFrame where each row is a person, and a column with a count for each type of fruit (the number of different fruit types is very small and known beforehand).
The desired output in this example would look like:
person apple banana orange
Joe 1 1 0
Jack 1 0 1
Thomas 0 2 0
This looks so simple and I know it has something to do with pivoting, but I could not find any example online that works in my case. How to solve this?
Thanks in advance
CodePudding user response:
This should do the job:
df = df.groupby(['person', 'fruit'])['fruit'].count().to_frame()
df = pd.crosstab(index=df.index.get_level_values('person'),
columns=df.index.get_level_values('fruit'),
aggfunc='sum',
values=df.values,
rownames=['person'],
colnames=['fruit']).fillna(0)