Using lambda to compare two columns

Time:06-16

My dataframe is like this: df = pd.DataFrame({'A': [1,2,3], 'B': [1,4,5]})

If column A has the same value as column B, output 1, else 0.

I want the output to look like this:

    A   B   is_equal
0   1   1   1
1   2   4   0
2   3   5   0

I figured out that df['is_equal'] = np.where((df['A'] == df['B']), 1, 0) works fine.

But I want to use a lambda here because I used a similar line in another case before. However, df['is_equals'] = df.apply(lambda x: 1 if df['A']==1 else 0, axis=1) does not work. It throws ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). Why does this error happen, and how can I fix the code?

Thank you in advance.

CodePudding user response:

What you are attempting is very inefficient. Do not do it. .apply should not be used when other solutions are possible. The best solution is:

df['is_equal'] = (df['A'] == df['B']).astype(int)

But if you insist:

df.apply(lambda row: int(row['A'] == row['B']), axis=1)

The latter version is 2.5 times slower. The original np.where is the fastest.
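As for why the original attempt raised the error: inside the lambda, df['A']==1 ignores the row argument x and compares the whole column, so the result is a boolean Series rather than a single True/False, and `if` cannot decide which element to use. A minimal sketch of this, using the frame from the question (the names error_message, via_apply, via_astype and via_where are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 4, 5]})

# The failing line from the question: df['A'] == 1 is a three-element
# boolean Series, and `if` needs exactly one truth value.
try:
    df.apply(lambda x: 1 if df['A'] == 1 else 0, axis=1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Working versions: the lambda uses the row it is given, so each
# comparison is between two scalars.
via_apply = df.apply(lambda row: int(row['A'] == row['B']), axis=1)

# Vectorized versions compare the whole columns at once, outside any `if`.
via_astype = (df['A'] == df['B']).astype(int)
via_where = np.where(df['A'] == df['B'], 1, 0)

df['is_equal'] = via_astype
print(df)
```

All three working variants should give the same column; they differ only in how the comparison is carried out.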

CodePudding user response:

I also agree with DYZ's opinion. But if you want to use .apply anyway, I can suggest something like this.

df['is_equal'] = df.apply(lambda x: 1 if x.A == x.B else 0, axis=1)
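If you want to check the speed difference mentioned in this thread on your own machine, something like the following rough sketch could be used. The 2000-row random frame and the names candidates and timings are assumptions made for illustration, not part of the original question, and the ratios you see will depend on the frame size and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame so that the differences are measurable.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'A': rng.integers(0, 5, 2000),
    'B': rng.integers(0, 5, 2000),
})

# The variants suggested in this thread, wrapped as zero-argument callables.
candidates = {
    'np.where': lambda: np.where(df['A'] == df['B'], 1, 0),
    'astype(int)': lambda: (df['A'] == df['B']).astype(int),
    'apply, row[...]': lambda: df.apply(
        lambda row: int(row['A'] == row['B']), axis=1),
    'apply, x.A': lambda: df.apply(
        lambda x: 1 if x.A == x.B else 0, axis=1),
}

# Best of three repeats, three calls each, reported as seconds per call.
timings = {
    name: min(timeit.repeat(fn, number=3, repeat=3)) / 3
    for name, fn in candidates.items()
}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f'{name:<16} {seconds * 1e3:.3f} ms per call')
```

The printed numbers are only indicative; the point is that the row-wise .apply variants call a Python function once per row, while the other two work on whole columns.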