Home > database >  How to get condition in Pandas DataFrame
How to get condition in Pandas DataFrame

Time:09-28

I try to apply an IF condition in Pandas DataFrame.

DataFrame which appears as following:

a =  {'Col1': [0,1,1,0],
      'Col2': [0,1,0,1]}
df = pd.DataFrame(data = a)

Desired result is the following:

|   col1 |  col2 |    result
--------------------------------------------------
|   0    |  0    |   other
|   1    |  1    |   col1   col2 
|   1    |  0    |   col1
|   0    |  1    |   col2

By using np.where(), I am blocked how to get columns name suppose I have multiple columns

CodePudding user response:

In [127]: df.dot(df.columns   "   ").str[:-3].replace("", "other")
Out[127]:
0          other
1    Col1   Col2
2           Col1
3           Col2
dtype: object
  • dot product the values of the frame with the columns but " " added as the joiner
    • since values are 1/0, dot product is like a selector
  • strip off the trailing " "s with [:-3]
  • if empty at this point, means came from 0-0; so replace with "other"

CodePudding user response:

I would go for a custom function for this use case converting the dataframe to Boolean, then using the row values as masking for indices:

def func_(row):
    cols = row.index[row]
    if not cols.empty:
        return "   ".join(cols)
    else:
        return "other"
    

df['result']=df.astype(bool).apply(func_, axis=1)

# df
   Col1  Col2     result
0     0     0      other
1     1     1  Col1   Col2
2     1     0       Col1
3     0     1       Col2

CodePudding user response:

(
    df.join(
        df.melt(ignore_index=False, var_name='result')
          .query('value==1')
          .groupby(level=0)['result']
          .apply(lambda x: '   '.join(x)))
    .fillna('other')
)

CodePudding user response:

Here's the np.where() method for multiple value conditions:

import pandas as pd
import numpy as np

a =  {'Col1': [0,1,1,0],
      'Col2': [0,1,0,1]}
df = pd.DataFrame(data = a)

df["result"]=np.where((df["Col1"]==0) & (df["Col2"]==0), "other",
                np.where((df["Col1"]==1) & (df["Col2"]==1), "col1   col2", 
                    np.where(df["Col1"]==1, "col1", "col2")))

print(df)

Result:

   Col1  Col2       result
0     0     0        other
1     1     1  col1   col2
2     1     0         col1
3     0     1         col2

Using nested np.where(), however, can make your code quite ugly and convoluted...

  • Related