Access a list of columns to be evaluated in a Comprehension Expression

Time:06-14

There is a DataFrame called frame with the columns Age, Maturity, Gender and Height, each of which holds the value 'PASSED' or 'FAILED'.

I want to create a new column called result that counts how many of a subset of those columns, listed in check_columns, have the value 'PASSED'.

I tried to use a comprehension expression. In this case I expected it to evaluate to 2, since of the columns listed in check_columns, Age and Maturity have PASSED while Gender has FAILED.

frame = pd.DataFrame(
    data = {'Age':['PASSED'], 'Maturity':['PASSED'],'Gender':['FAILED'], 'Height':['PASSED']}
)
check_columns = ['Age','Maturity','Gender']
frame['result'] = sum([1  if frame[column] =='PASSED' else 0 for column in check_columns ])

I tried to use a list comprehension, but it raises this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

CodePudding user response:

Select only the desired columns, check which values are equal to 'PASSED' using DataFrame.eq, and count the True values using DataFrame.sum (pass axis=1 to sum across the columns, giving one count per row).

import pandas as pd

frame = pd.DataFrame(
    data = {'Age':['PASSED'], 'Maturity':['PASSED'],'Gender':['FAILED'], 'Height':['PASSED']}
)
check_columns = ['Age','Maturity','Gender']

frame['results'] = frame[check_columns].eq('PASSED').sum(axis=1)

Output:

>>> frame 

      Age Maturity  Gender  Height  results
0  PASSED   PASSED  FAILED  PASSED        2
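
As for why the original attempt fails: frame[column] == 'PASSED' produces a boolean Series rather than a single True or False, so it cannot be used as the condition of an if ... else. If you prefer to keep a comprehension, one possible sketch is to add the boolean Series together instead of branching on them. The second row of data below is made up purely to show that the count is computed per row.

```python
import pandas as pd

# Two rows; the second row is hypothetical data added for illustration.
frame = pd.DataFrame(
    data={
        'Age': ['PASSED', 'FAILED'],
        'Maturity': ['PASSED', 'FAILED'],
        'Gender': ['FAILED', 'PASSED'],
        'Height': ['PASSED', 'PASSED'],
    }
)
check_columns = ['Age', 'Maturity', 'Gender']

# Each frame[column].eq('PASSED') is a boolean Series (one value per row).
# Converting to int and summing the Series keeps everything element-wise,
# so no Series is ever evaluated as a single truth value.
frame['result'] = sum(
    frame[column].eq('PASSED').astype(int) for column in check_columns
)

# The vectorized form from the answer should give the same counts.
vectorized = frame[check_columns].eq('PASSED').sum(axis=1)

print(frame['result'].tolist())  # [2, 1]
```

The vectorized version is usually the better choice; the comprehension is shown only because it stays close to the code in the question.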