I am trying something I think is simple, however I keep getting errors and I don't know why.
I am trying to set a value in a new column of df2. If the value in a column of df2 matches any value in the df1 column "Col", then write "result", otherwise "no result".
#Create a series from df column
series_from_df = df1['Col']
df2['new_col'] = 'result' if df2['Col1'].isin(series_from_df) else 'Not result'
The above gets me an error:
(<class 'ValueError'>, ValueError('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'), <traceback object at 0x7f9081a28f80>)
Then I tried the below, adding square brackets around series_from_df:
#Create a series from df column
series_from_df = df1['Col']
df2['new_col'] = 'result' if df2['Col1'].isin([series_from_df]) else 'Not result'
I get the same error as before.
What am I missing?
CodePudding user response:
df2['Col1'].isin(df1['Col']) is a boolean Series, but you're trying to use it as the condition of an if, which expects a single truth value. You can use numpy.where instead, with the Series returned by isin as the condition:
import numpy as np
df2['new_col'] = np.where(df2['Col1'].isin(df1['Col']), 'result', 'Not result')
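For reference, here is a minimal self-contained sketch of the above. It assumes the sample frames from the Setup posted in this thread; the names mask and error_message are only illustrative. It first shows why the plain if fails, then applies np.where element by element.

```python
import numpy as np
import pandas as pd

# Sample frames copied from the Setup shown in this thread.
df1 = pd.DataFrame({'Col': [1, 2, 4, 5]})
df2 = pd.DataFrame({'Col1': [1, 2, 3, 4, 5]})

# isin returns one boolean per row of df2, not a single True/False.
mask = df2['Col1'].isin(df1['Col'])

# A plain `if` needs a single truth value, so this raises ValueError.
try:
    label = 'result' if mask else 'Not result'
except ValueError as exc:
    error_message = str(exc)

# np.where picks 'result' or 'Not result' for each row of the mask.
df2['new_col'] = np.where(mask, 'result', 'Not result')
print(df2)
```

Wrapping the Series in a list, as in isin([series_from_df]), does not help, because the result of isin is still a Series being passed to if.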
CodePudding user response:
You can also use replace to map the boolean values to labels:
df2['new_col'] = \
df2['Col1'].isin(df1['Col']).replace({True: 'result', False: 'Not Result'})
print(df2)
# Output
   Col1     new_col
0     1      result
1     2      result
2     3  Not Result
3     4      result
4     5      result
Setup:
import pandas as pd
df1 = pd.DataFrame({'Col': [1, 2, 4, 5]})
df2 = pd.DataFrame({'Col1': [1, 2, 3, 4, 5]})