Home > OS >  Create columns in dataframe by applying filters to other columns
Create columns in dataframe by applying filters to other columns

Time:12-21

Following is my dataset:

class StudentName Age Height Weight PassedExam
5 Joe 10 150 56 Pass
5 Nick 10 153 54 Fail
5 Adam 10 156 61 Pass
4 Julia 9 148 50 Pass
7 George 12 156 59 Fail
5 Robert 10 157 60 Pass
6 Sam 11 155 60 Pass
6 travis 11 153 63 Pass

I want select Students if any of the below conditions apply:

Height greater than 150

Passed Exam is equal to 'Pass'

And further create two columns valid and Invalid in output as follows:

class ValidStudent InvalidStudent Weight
5 Adam,Robert Joe,Nick 51,60,56,54
6 Sam,travis 60,63
4 Julia 50
7 George 59

Following is my code to select rows with filter values but I am not able to create the columns:

df.loc[(df['PassedExam'] == 'Pass') & (df['Height'] > 150)]

CodePudding user response:

You can use:

df_valid = df[(df['PassedExam'] == 'Pass') & (df['Height'] > 150)]
df_valid = df_valid.groupby('class').agg(ValidStudent = pd.NamedAgg(column="StudentName", aggfunc=",".join)).reset_index()

df_invalid = df[(df['PassedExam'] != 'Pass') | (df['Height'] <= 150)]
df_invalid = df_invalid.groupby('class').agg(InvalidStudent = pd.NamedAgg(column="StudentName", aggfunc=",".join)).reset_index()

df_valid = df_valid.merge(df_invalid, how='outer', on='class')
df_valid = df_valid.merge(df.groupby('class').agg(Weight = pd.NamedAgg(column="Weight", aggfunc=",".join)),
                          how='left', on=['class'])

OUTPUT

   class ValidStudent InvalidStudent       Weight
0      5  Adam,Robert       Joe,Nick  56,54,61,60
1      6   Sam,travis            NaN        60,63
2      4          NaN          Julia           50
3      7          NaN         George           59
  • Related