When I do match = a.isin(b), the number of matches doesn't equal the number of matches in match2 = b.isin(a). Here a and b are DataFrame columns (Series), and a match is each True value in the column. I think of a.isin(b) as a function returning True for those elements in a found in b, and b.isin(a) as a function returning True for those elements in b found in a. I would expect an equal number of matches, so why are they different? I have len(match) >> len(match2). Is this possible?
CodePudding user response:
I think you are confused about what isin does. Suppose:
import pandas as pd
a = pd.Series([1, 1, 1, 2, 2, 3])
b = pd.Series([1, 2, 2, 4])
then a.isin(b) has the same length (and index) as a:
pd.Series([True, True, True, True, True, False])
while b.isin(a) has the same length (and index) as b:
pd.Series([True, True, True, False])
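To make the difference between length and number of matches concrete, here is a small sketch with the same two Series. The names match and match2 are taken from the question; note that len() counts every row, while sum() counts only the True values.

```python
import pandas as pd

a = pd.Series([1, 1, 1, 2, 2, 3])
b = pd.Series([1, 2, 2, 4])

match = a.isin(b)    # one boolean per element of a
match2 = b.isin(a)   # one boolean per element of b

# len() counts every row, True or False
print(len(match), len(match2))    # 6 4

# sum() counts only the True values
print(match.sum(), match2.sum())  # 5 3
```

Neither pair of numbers has to agree, because duplicates in a and in b are each counted on their own side.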
What will be the same? The unique values:
set(a[a.isin(b)]) == set(b[b.isin(a)])
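As a quick check of that last line, the sketch below compares the matched unique values from both sides for the example Series. The names common_a and common_b are only illustrative.

```python
import pandas as pd

a = pd.Series([1, 1, 1, 2, 2, 3])
b = pd.Series([1, 2, 2, 4])

common_a = set(a[a.isin(b)])  # unique values of a that also occur in b
common_b = set(b[b.isin(a)])  # unique values of b that also occur in a

print(common_a == common_b)   # True

# The same comparison as counts of distinct matched values
print(a[a.isin(b)].nunique(), b[b.isin(a)].nunique())  # 2 2
```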