Home > Enterprise >  How to make a new column based on condition from multiple columns
How to make a new column based on condition from multiple columns

Time:11-02

I have a dataframe contains a few columns where the value is either 0 or 1

A B C D E
0 0 0 0 0
0 1 0 0 0
0 0 1 1 0
0 0 0 0 1

So how to create a new column "F" where the condition is :

  • if column A,B,C,D,E contains 1 so the value of F will be 1.

Here's an example of the expected output

A B C D E F
0 0 0 0 0 0
0 1 0 0 0 1
0 0 1 1 0 1
0 0 0 0 1 1

I tried using

def stress(df1):
    if 0 not in ('A', 'B', 'C', 'D', 'E'):
        return 1
    else:
        return 0
    
df1['F'] = df1.apply (stress, axis=1)
df1

but the output became like this

A B C D E F
0 0 0 0 0 1
0 1 0 0 0 1
0 0 1 1 0 1
0 0 0 0 1 1

followed by this warning message :

c:\python39\lib\site-packages\pandas\core\frame.py:3607: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._set_item(key, value)

CodePudding user response:

df['F'] = df.max(axis=1)

or

df['F'] = df.any(axis=1).astype(int)
   A  B  C  D  E  F
0  0  0  0  0  0  0
1  0  1  0  0  0  1
2  0  0  1  1  0  1
3  0  0  0  0  1  1

CodePudding user response:

pandas create new column based on values from other columns / apply a function of multiple columns, row-wise

There are multiple answers in SO. Please use something like this. df.apply (lambda row: label_race(row), axis=1).

Also u can probably do a & operation on the values of row.

This line is wrong if 0 not in ('A', 'B', 'C', 'D', 'E'): because it searches for 0 in a 5-tuple that has the characters A-E. so it is always true.

  • Related