Home > Net >  Need pandas groupby.count() or groupby.size.unstack() to output a dataframe I can use
Need pandas groupby.count() or groupby.size.unstack() to output a dataframe I can use

Time:04-17

So I need to count the number of occurrences of a value per year, per animal. I've managed to do it but it's outputting a single column kind of dataframe rather than the data being in workable cells.

I've used:

    df.groupby(["Animal", "year"])["value"].count()

and:

    df.groupby(["Animal", "year", "value"]).size().unstack(fill_value=0)

and I tried a pivot but it gave an error, in the error message it said too much data but the main error was "Index contains duplicate entries". I've about 13000 rows of data.

I have no idea what to do with the response. I can't call the columns like you would with a Dataframe, I don't know how to. I want to be able to create proportions, like value A is 10% and value B is 90% for animal 1 for year 2020, in a Dataframe.

I tried to do df.to_frame() after the count() created a series, one but it just created a one column DataFrame.

The data is like:

    Animal  Year Value
    1       2020  A
    1       2020  A
    1       2019  B
    1       2019  B
    2       2020  A

And I need it to be:

    Animal   Year  A  B
     1       2020  2  0
     1       2019  0  2
     2       2020  1  0

But in a full proper Dataframe of 4 columns, not squished into one column. Then I can create the % proportions from there.

Thanks in advance!!

CodePudding user response:

Try:

x = df.pivot_table(
    index=["Animal", "Year"], columns="Value", aggfunc="size", fill_value=0
).reset_index()
x.columns.name = None
print(x)

Prints:

   Animal  Year  A  B
0       1  2019  0  2
1       1  2020  2  0
2       2  2020  1  0
  • Related