Home > Software design >  Get count of each unique values groupby another column and transform them into columns
Get count of each unique values groupby another column and transform them into columns

Time:02-11

I have a dataframe like below:

id name colA colB
One Ana abc xyz
One Ana abc xyz
One Ana bde xyz
One Ana bde xyz
One Ana bde yhn
One Ana bde yhn
One Ana bde qwe
One Ana teh qwe
Two Bob abc qwe
Two Bob teh qwe
Two Bob pop omg

I need to transform my dataframe as

id name abc bde teh pop xyz yhn qwe omg
One Ana 2 5 1 0 4 2 2 0
Two Bob 1 0 1 1 0 0 2 1

I wrote below code to achieve this but it do not gives me expected output and also I have no idea how to perform it for multiple columns. Please help. df = df.groupby(['id','colA']).size().reset_index(name='colA_counts')

CodePudding user response:

you could try this:

a = df[["id", "name", "A"]].copy()
b = df[["id", "name", "B"]].copy()
b.columns = ["id", "name", "A"]

       
result = pd.concat([a, b]).value_counts().reset_index()
result.columns = ["id", "name", "A", "count"]
pd.pivot_table(result, index=["id", "name"], columns=["A"], values="count").fillna(0)

What I'm basically doing is, I first create a new DataFrame, concatenated of the original one where both A and B are just called "A" then as you don't make any difference of whether it is A or B. In this I let it count the different values, i.e. combinations of id, name and value.

Then I pivot the table so, that the values in "A" are considered the columns, finally I fill up all those combinations that don't appear in the data with zeros by calling fillna(0).

What I btw don't understand is, where the 1 in the "one ana pop" in your example comes from. Your data is not providing this.

  • Related