How can I groupby and get unique value counts for each person in Dataframe and then write that to an

Time:06-02

I am fairly new to Python but would like to learn it. Basically, I would like to get a value count by grouping. My dataset looks something like the one below. I would like to group by Client and get a count of "Expired" and "Not Expired" for each person. Then I would like to write the result to an Excel file, giving each unique "Client" its own sheet. I have the code for writing to Excel but am completely lost on getting the groupby and value counts to work properly.

with pd.ExcelWriter("Test.xlsx") as writer:
    for client, data in summary2.groupby('client'):
        data.to_excel(writer, sheet_name=client)

My dataframe looks something like this:

Client |  Result    |  Result2   |
Bob    |  Expired   |  Expired   |
Bob    |  Expired   | Not Expire |
Sally  | Not Expire | Not Expire |
Bob    |  Expired   |  Expired   |
Sally  |  Expired   |  Expired   |
Doug   |  Expired   |  Expired   |
Bob    |  Expired   |  Expired   |
Doug   |  Expired   |  Expired   |

What I would like is:

Client |  Result    |  Result2   |
Bob    |  Expired   |  Expired   |
Bob    |  Expired   | Not Expire |
Bob    |  Expired   |  Expired   |
Bob    |  Expired   |  Expired   |
n  Expired |   4       |    3   |
n Not Expire|  0       |    1   |

Sally  | Not Expire | Not Expire |
Sally  |  Expired   |  Expired   |
n  Expired |    1       |    1
n Not Expire|   1       |    1

Doug   |  Expired   |  Expired   |
Doug   |  Expired   |  Expired   |
n  Expired |    2       |    2
n Not Expire|   0       |    0

CodePudding user response:

Here is one way: melt the columns (Result, Result2) into rows, then use groupby to take the count.

df2=df.melt('Client')
df2.groupby(['Client','variable','value'])['value'].count()
Client  variable  value       value
Bob     Result    Expired     4
Bob     Result2   Expired     3
Bob     Result2   Not Expire  1
Doug    Result    Expired     2
Doug    Result2   Expired     2
Sally   Result    Expired     1
Sally   Result    Not Expire  1
Sally   Result2   Expired     1
Sally   Result2   Not Expire  1
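
To get closer to the layout asked for in the question, the melted counts can be pivoted so that Result and Result2 become columns again. The following is only a sketch: the sample data is retyped from the question, and the names long_df and counts are made up for illustration.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "Client":  ["Bob", "Bob", "Sally", "Bob", "Sally", "Doug", "Bob", "Doug"],
    "Result":  ["Expired", "Expired", "Not Expire", "Expired",
                "Expired", "Expired", "Expired", "Expired"],
    "Result2": ["Expired", "Not Expire", "Not Expire", "Expired",
                "Expired", "Expired", "Expired", "Expired"],
})

# Long format: one row per (Client, variable, value).
long_df = df.melt("Client")

# Count each status per client and source column, then move the source
# column names (Result, Result2) back into columns. fill_value=0 fills
# combinations that are missing for a status the client does have.
counts = (
    long_df.groupby(["Client", "value", "variable"])
    .size()
    .unstack("variable", fill_value=0)
)
print(counts)
```

Note that a status a client never has in either column (for example "Not Expire" for Doug) gets no row at all here; it would have to be added with a reindex if a zero row is wanted.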

CodePudding user response:

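Here is the groupby. Note that it counts each (Client, Result, Result2) combination rather than each column separately: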

sample1 = pd.DataFrame(sample.groupby(['Client', 'Result', 'Result2']).size()).reset_index()
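
Neither answer shows the final step from the question, so here is a rough sketch of how per-column counts could be appended under each client's rows and written to one sheet per client. The helper names client_sheet and write_client_sheets and the STATUSES list are assumptions made for this example, not part of either answer. The column name "Client" follows the table in the question.

```python
import pandas as pd

# Status labels as spelled in the question's data (assumption: only these two occur).
STATUSES = ["Expired", "Not Expire"]

def client_sheet(data, value_cols=("Result", "Result2")):
    """Return one client's rows followed by one count row per status."""
    # Count each status per column; reindex so absent statuses show as 0.
    counts = pd.DataFrame({
        col: data[col].value_counts().reindex(STATUSES, fill_value=0)
        for col in value_cols
    })
    # Label the count rows in the Client column, e.g. "n Expired".
    counts.insert(0, "Client", ["n " + status for status in STATUSES])
    return pd.concat([data[["Client", *value_cols]], counts], ignore_index=True)

def write_client_sheets(df, path="Test.xlsx"):
    """Write one sheet per client. Requires an Excel engine such as openpyxl."""
    with pd.ExcelWriter(path) as writer:
        for client, data in df.groupby("Client"):
            client_sheet(data).to_excel(writer, sheet_name=client, index=False)

# Sample data retyped from the question.
df = pd.DataFrame({
    "Client":  ["Bob", "Bob", "Sally", "Bob", "Sally", "Doug", "Bob", "Doug"],
    "Result":  ["Expired", "Expired", "Not Expire", "Expired",
                "Expired", "Expired", "Expired", "Expired"],
    "Result2": ["Expired", "Not Expire", "Not Expire", "Expired",
                "Expired", "Expired", "Expired", "Expired"],
})

print(client_sheet(df[df["Client"] == "Bob"]))
```

Calling write_client_sheets(df) would then produce the workbook, with the count rows sitting below each client's data on that client's sheet.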
