If Condition Based On 2 Columns-CodePudding

Tring to conditionally execute a query, only when ColumnA = 'New' and ColumnB = 'Left' (in each individual row). I know that str.contains() works when I only have 1 condition, however, I'm getting a ValueError ("ValueError: The truth value of a Series is ambiguous..."). Can this approach not be successfully applied, for my given scenario? Alternatively, is there a better approach?

Current code:

if df1['ColumnA'].str.contains('New') and df1['ColumnB'].str.contains('Left'):
   do something...

Very basic example of the dataframe:

ColumnA	Column B
New	Left
Used	Right
Scrap	Down
New	Right

First row would be the desired row to carry forward (since it meets the criteria).

CodePudding user response：

You have the right idea, however it doesn't appear like your code is expressing exactly what you want to do.

    df1['ColumnA'].str.contains('New')

will return a Series with true and false values corresponding to the indices where the condition is true, not a true or false value for whether the entire column contains 'new'. To accomplish this consider doing something like the following:

    'new' in df['ColumnA'].values

If you are trying to do it on a row by row basis then you must use the bitwise operator to compare truth values across Series (&).

This will return a boolean like you expected, hopefully this helps (:

CodePudding user response：

Use bitwise & on two mask arrays and generate another column.

>>> import pandas as pd
>>> df = pd.DataFrame({'A':['New','Used','Scrap','New'], 'B':['Left','Right','Down','Right']})
>>> df
       A      B
0    New   Left
1   Used  Right
2  Scrap   Down
3    New  Right
>>> df['C'] = df['A'].str.contains('New') & df['B'].str.contains('Left')
>>> df
       A      B      C
0    New   Left   True
1   Used  Right  False
2  Scrap   Down  False
3    New  Right  False
>>>