Slicing python dataframe with isin

Time:03-02

I am trying something I think is simple, however I keep getting errors and I don't know why.

I am trying to set a value in a new column of df2. If the value in the df2 column "Col1" matches any value in the df1 column "Col", then write "result", otherwise "Not result".

#Create a series from df column 
series_from_df = df1['Col']
df2['new_col'] = 'result' if df2['Col1'].isin(series_from_df) else 'Not result'

The above gets me an error:

(<class 'ValueError'>, ValueError('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'), <traceback object at 0x7f9081a28f80>)

Then I tried the version below, adding square brackets around series_from_df:

#Create a series from df column 
series_from_df = df1['Col']
df2['new_col'] = 'result' if df2['Col1'].isin([series_from_df]) else 'Not result'

I get the same error as before.

What am I missing?

CodePudding user response:

df2['Col1'].isin(df1['Col']) is a boolean Series, but you're using it as the condition of an if expression, which expects a single truth value. pandas cannot reduce a multi-element Series to one True or False, so it raises the ValueError. You can use numpy.where instead, with the Series returned by isin as the condition:

import numpy as np

df2['new_col'] = np.where(df2['Col1'].isin(df1['Col']), 'result', 'Not result')
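
As a minimal, self-contained sketch of the approach above: the sample frames below are borrowed from the Setup in the other answer, and the names mask and raised are illustrative only. It first shows that the boolean mask cannot be used as a plain truth value, then labels each row with numpy.where.

```python
import numpy as np
import pandas as pd

# Sample data (assumed for illustration; same shape as in the question)
df1 = pd.DataFrame({'Col': [1, 2, 4, 5]})
df2 = pd.DataFrame({'Col1': [1, 2, 3, 4, 5]})

# isin returns one boolean per row of df2, not a single True/False
mask = df2['Col1'].isin(df1['Col'])

# Using the mask where a single truth value is expected raises ValueError
raised = False
try:
    bool(mask)
except ValueError:
    raised = True

# np.where picks 'result' where mask is True and 'Not result' elsewhere
df2['new_col'] = np.where(mask, 'result', 'Not result')
print(df2)
```

The conditional expression in the question evaluates the whole Series once, whereas numpy.where applies the choice element by element, which is what the new column needs.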

CodePudding user response:

You can also use replace to convert the boolean values into labels:

df2['new_col'] = \
    df2['Col1'].isin(df1['Col']).replace({True: 'result', False: 'Not Result'})
print(df2)

# Output
   Col1     new_col
0     1      result
1     2      result
2     3  Not Result
3     4      result
4     5      result

Setup:

import pandas as pd

df1 = pd.DataFrame({'Col': [1, 2, 4, 5]})
df2 = pd.DataFrame({'Col1': [1, 2, 3, 4, 5]})
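
For comparison, a variant using Series.map with a dictionary should give the same labels as replace. This is only a sketch on the setup data above; the labels dictionary is a name introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the Setup above
df1 = pd.DataFrame({'Col': [1, 2, 4, 5]})
df2 = pd.DataFrame({'Col1': [1, 2, 3, 4, 5]})

# Hypothetical lookup table: boolean membership flag -> label
labels = {True: 'result', False: 'Not Result'}

# map looks up each boolean from isin in the dictionary
df2['new_col'] = df2['Col1'].isin(df1['Col']).map(labels)
print(df2)
```

Unlike replace, map turns any value missing from the dictionary into NaN, so both True and False need an entry.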