AttributeError: 'SeriesGroupBy' object has no attribute 'tolist'

Time:07-06

In a pandas DataFrame, I want to count how many times the value 1 appears in the stroke column for each value in the Residence_type column. To count the 1s, I convert the stroke column to a list, which I thought would be easier.

So, for example, the value Rural in Residence_type has the value 1 in the stroke column 300 times, and so on.

The data is something like this:

  Residence_type  Stroke
0          Rural       1
1          Urban       1
2          Urban       0
3          Rural       1
4          Rural       0
5          Urban       0
6          Urban       0
7          Urban       1
8          Rural       0
9          Rural       1

The code:

grpby_variable = data.groupby('stroke')
grpby_variable['Residence_type'].tolist().count(1)

The final goal is to find the difference between the number of times the value 1 appears for each value in the Residence_type column (Rural or Urban).

Am I doing it right? What is this error?

CodePudding user response:

I'm not sure I understood what you need. Try filtering for Stroke == 1, then group by Residence_type and count:

df.query("Stroke==1").groupby('Residence_type')['Stroke'].agg('count').to_frame('Stroke_Count')

          

                   Stroke_Count
Residence_type              
Rural                      3
Urban                      2

You could try the following if you need the difference between the categories:

df1 = df.query("Stroke==1").groupby('Residence_type')['Stroke'].agg('count').to_frame('Stroke_Count')
df1.loc['Diff'] = abs(df1.loc['Rural']-df1.loc['Urban'])
print(df1)
    

                    Stroke_Count
Residence_type              
Rural                      3
Urban                      2
Diff                       1
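
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the ten sample rows from the question by hand (the names `df`, `stroke_counts` and `diff` are only illustrative) and uses a boolean mask with `size()` instead of `query` and `agg('count')`, which should give the same counts on this data.

```python
import pandas as pd

# Sample rows copied from the question (the real dataset is assumed to be larger)
df = pd.DataFrame({
    "Residence_type": ["Rural", "Urban", "Urban", "Rural", "Rural",
                       "Urban", "Urban", "Urban", "Rural", "Rural"],
    "Stroke": [1, 1, 0, 1, 0, 0, 0, 1, 0, 1],
})

# Keep only the rows where Stroke is 1, then count rows per residence type
stroke_counts = df[df["Stroke"] == 1].groupby("Residence_type").size()

# Absolute difference between the two categories
diff = abs(stroke_counts["Rural"] - stroke_counts["Urban"])

print(stroke_counts)
print("Difference:", diff)
```

Note that a residence type with no strokes at all would be missing from `stroke_counts` after the filter, so the lookup by label would fail in that case.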

CodePudding user response:

Assuming that Stroke only contains 1 or 0, you can do:

result_df = df.groupby('Residence_type')[['Stroke']].sum()

>>> result_df
                Stroke
Residence_type        
Rural                3
Urban                2

>>> result_df.Stroke['Rural'] - result_df.Stroke['Urban']
1
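
On the error itself: as far as I can tell, selecting a column from a groupby gives a `SeriesGroupBy` object, which is a lazy grouping rather than a Series, so Series methods such as `tolist` are not available on it until you aggregate. The question's code also groups by the stroke column and selects Residence_type, which looks like the reverse of what is wanted. The sketch below (sample data rebuilt by hand, variable names are illustrative) shows one way to get the per-group lists the question was aiming for, and checks that counting the 1s in them matches the simple sum.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Residence_type": ["Rural", "Urban", "Urban", "Rural", "Rural",
                       "Urban", "Urban", "Urban", "Rural", "Rural"],
    "Stroke": [1, 1, 0, 1, 0, 0, 0, 1, 0, 1],
})

# Group by residence type and select the Stroke column (a SeriesGroupBy)
grouped = df.groupby("Residence_type")["Stroke"]

# Aggregate each group into a plain Python list, one list per residence type
per_group_lists = grouped.apply(list)

# Count the 1s in each list, as the question intended
ones_per_group = per_group_lists.apply(lambda values: values.count(1))

# With 0/1 data the same numbers come from a plain sum, without any lists
sums_per_group = grouped.sum()

print(ones_per_group)
print(sums_per_group)
```

Building the lists is only useful if you need the raw values per group; for counting, the sum (or the filter and count from the other answer) is shorter.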