Home > OS >  If Condition Based On 2 Columns
If Condition Based On 2 Columns

Time:07-19

Tring to conditionally execute a query, only when ColumnA = 'New' and ColumnB = 'Left' (in each individual row). I know that str.contains() works when I only have 1 condition, however, I'm getting a ValueError ("ValueError: The truth value of a Series is ambiguous..."). Can this approach not be successfully applied, for my given scenario? Alternatively, is there a better approach?

Current code:

if df1['ColumnA'].str.contains('New') and df1['ColumnB'].str.contains('Left'):
   do something...

Very basic example of the dataframe:

ColumnA Column B
New Left
Used Right
Scrap Down
New Right

First row would be the desired row to carry forward (since it meets the criteria).

CodePudding user response:

You have the right idea, however it doesn't appear like your code is expressing exactly what you want to do.

    df1['ColumnA'].str.contains('New')

will return a Series with true and false values corresponding to the indices where the condition is true, not a true or false value for whether the entire column contains 'new'. To accomplish this consider doing something like the following:

    'new' in df['ColumnA'].values

If you are trying to do it on a row by row basis then you must use the bitwise operator to compare truth values across Series (&).

This will return a boolean like you expected, hopefully this helps (:

CodePudding user response:

Use bitwise & on two mask arrays and generate another column.

>>> import pandas as pd
>>> df = pd.DataFrame({'A':['New','Used','Scrap','New'], 'B':['Left','Right','Down','Right']})
>>> df
       A      B
0    New   Left
1   Used  Right
2  Scrap   Down
3    New  Right
>>> df['C'] = df['A'].str.contains('New') & df['B'].str.contains('Left')
>>> df
       A      B      C
0    New   Left   True
1   Used  Right  False
2  Scrap   Down  False
3    New  Right  False
>>>
  • Related