creating conditional column in pandas based on values

Time:06-10

I have a data frame loaded from a CSV file containing 5 columns. I want to create a new column based on conditions on the values in each row. For example, my df is:

col1 col2 col3 col4
1    1    1    1
0    0    1    1
1    1    1    1
nan  nan  nan  nan

Here is my code sample:

m1 = df[['col1','col2','col3','col4']].all(axis=1)
m2 = df[['col1','col2','col3','col4']].isna().any(axis=1)
df['STATUS AUTO'] = np.select([m2, m1], ['ZD', 'FIC'],'PARTIALLY IMMUNIZED')

It never gives me "PARTIALLY IMMUNIZED", although many rows should get it. In the sample above, rows 1 and 3 should be "FIC", row 2 should be "PARTIALLY IMMUNIZED", and row 4 should be "ZD". Instead I get "ZD" where I expect "PARTIALLY IMMUNIZED". Any help, please. PS: the same code worked for another DF a few months ago, but it does not work for this one.

CodePudding user response:

The problem seems to be that the columns contain strings instead of numbers. Convert them to floats and compare explicitly with 1:

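import numpy as np

cols = ['col1','col2','col3','col4']

# convert string values to numbers so the comparisons work on numeric data
df[cols] = df[cols].astype(float)

# True where all four columns equal 1
m1 = df[cols].eq(1).all(axis=1)
# True where at least one column is missing
m2 = df[cols].isna().any(axis=1)
df['STATUS AUTO'] = np.select([m2, m1], ['ZD', 'FIC'], 'PARTIALLY IMMUNIZED')

print(df)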
   col1  col2  col3  col4          STATUS AUTO
0   1.0   1.0   1.0   1.0                  FIC
1   0.0   0.0   1.0   1.0  PARTIALLY IMMUNIZED
2   1.0   1.0   1.0   1.0                  FIC
3   NaN   NaN   NaN   NaN                   ZD
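
For anyone who wants to reproduce this without the original CSV, here is a minimal self-contained sketch. The sample frame is hypothetical: it assumes the values arrived as strings, which is only a guess about the asker's file. It also swaps in pd.to_numeric with errors='coerce' instead of astype(float), so that any value that cannot be parsed becomes NaN rather than raising an error.

```python
import numpy as np
import pandas as pd

cols = ['col1', 'col2', 'col3', 'col4']

# Hypothetical sample data: values stored as strings, None for missing cells
df = pd.DataFrame({
    'col1': ['1', '0', '1', None],
    'col2': ['1', '0', '1', None],
    'col3': ['1', '1', '1', None],
    'col4': ['1', '1', '1', None],
})

# Convert every column to numbers; unparseable or missing values become NaN
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

# True where all four columns equal 1
m1 = df[cols].eq(1).all(axis=1)
# True where at least one column is missing
m2 = df[cols].isna().any(axis=1)

# Conditions are checked in order, so m2 takes priority over m1
df['STATUS AUTO'] = np.select([m2, m1], ['ZD', 'FIC'], 'PARTIALLY IMMUNIZED')

print(df)
```

Note that m2 uses any(axis=1), so a row with even one missing value is labelled "ZD". If partially immunized rows in the real data contain missing values rather than 0, that alone could explain getting "ZD" for them; in that case isna().all(axis=1) may be closer to what is wanted, but that depends on how the data is coded.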