Home > Back-end >  pandas sum boolean values by rows, can contain NaN
pandas sum boolean values by rows, can contain NaN

Time:04-07

So I have a dataframe like this, and I want to find how many students have more than two school experiences.

Name Primary school Middle school High School
Alex Tampli school Compton School The Tampli high school
Peng New york public North Star School
Hu L.A public South planick school
df["Name","Primary school","Middle school","High School"].isnull()
Name Primary school Middle school High School
Alex False False False
Peng False False True
Hu False False True

We can use df.count_value() to get the column summary, but how could I get sum value of rows?

desired output

df["Enough Experience?"]= #code
Name Primary school Middle school High School Enough experience?
Alex False False False True
Peng False False True False
Hu False False True False

CodePudding user response:

Based on your description of the problem, I think you need:

df['Enough experience?'] = df.drop(columns='Name').sum(axis=1).ge(2)

You shouldn't use isnull, that checks for NaN/None, instead sum the booleans (each True counts for 1).

output:

   Name  Primary school  Middle school  High School  Enough experience?
0  Alex           False          False        False               False
1  Peng           False          False         True               False
2    Hu           False          False         True               False

To get the total:

experienced_students = df.drop(columns='Name').sum(axis=1).ge(2).sum()

output: 2

Count of schools:

df['number schools'] = df.drop(columns='Name').sum(axis=1)

   Name  Primary school  Middle school  High School  number schools
0  Alex           False          False        False               0
1  Peng           False          False         True               1
2    Hu           False          False         True               1

CodePudding user response:

Use:

print (df)
   Name  Primary school  Middle school  High School
0  Alex             1.0            NaN          NaN
1  Peng             1.0            NaN          2.0
2    Hu             2.0            7.0          0.0
3  John             NaN            NaN          NaN


# test if all non missing values 
cols = ["Primary school","Middle school","High School"]
df["Enough Experience?1"] = ~df[cols].isnull().any(axis=1)
# test at least 1 missing values 
df["Enough Experience?2"] = df[cols].isnull().any(axis=1)

# test at least 2 missing values per rows with `sum`
df["Enough Experience?3"] = df[cols].isnull().sum(axis=1).lt(2)
df["Enough Experience?4"] = df[cols].count(axis=1).ge(2)
print (df)
   Name  Primary school  Middle school  High School  Enough Experience?1  \
0  Alex             1.0            NaN          NaN                False   
1  Peng             1.0            NaN          2.0                False   
2    Hu             2.0            7.0          0.0                 True   
3  John             NaN            NaN          NaN                False   

   Enough Experience?2  Enough Experience?3  Enough Experience?4  
0                 True                False                False  
1                 True                 True                 True  
2                False                 True                 True  
3                 True                False                False  

Solutions by table ouput - testing if all values are not missing:

# test at least 1 missing values 
cols = ["Primary school","Middle school","High School"]
df["Enough Experience?1"] = df[cols].notna().all(axis=1)
df["Enough Experience?2"] = ~df[cols].isna().any(axis=1)
df["Enough Experience?3"] = df[cols].isnull().sum(axis=1)<1
print (df)
   Name  Primary school  Middle school  High School  Enough Experience?1  \
0  Alex             1.0            NaN          NaN                False   
1  Peng             1.0            NaN          2.0                False   
2    Hu             2.0            7.0          0.0                 True   
3  John             NaN            NaN          NaN                False   

   Enough Experience?2  Enough Experience?3  
0                False                False  
1                False                False  
2                 True                 True  
3                False                False  
  • Related