Pandas map, check if any value in a list is inside another

Time:09-15

I have the following list

x  = [1,2,3]

And the following df

Sample df

pd.DataFrame({'UserId':[1,1,1,2,2,2,3,3,3,4,4,4],'Origins':[1,2,3,2,2,3,7,8,9,10,11,12]})

Let's say I want to return the UserIds whose grouped Origins contain any of the values in the list.

Wanted result

pd.Series({'UserId':[1,2]})

What would be the best approach? Maybe a groupby with a lambda, but I am having a little trouble formulating the condition.

CodePudding user response:

df['UserId'][df['Origins'].isin(x)].drop_duplicates()

I had considered using unique(), but that returns a numpy array. Since you wanted a series, I went with drop_duplicates().
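
For reference, here is a self-contained sketch of the same idea using the sample data from the question. The names mask and result are only illustrative, and .loc is used in place of the chained indexing above; it should select the same rows.

```python
import pandas as pd

x = [1, 2, 3]
df = pd.DataFrame({
    'UserId':  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'Origins': [1, 2, 3, 2, 2, 3, 7, 8, 9, 10, 11, 12],
})

# Boolean mask: True for rows whose Origins value appears in x
mask = df['Origins'].isin(x)

# Keep the UserId of the matching rows, one entry per user
result = df.loc[mask, 'UserId'].drop_duplicates()

print(result.tolist())  # [1, 2]
```

Note that the result keeps the original row labels (0 and 3 here); calling reset_index(drop=True) on it would renumber them from 0 if that matters.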

CodePudding user response:

IIUC, OP wants the UserIds that have at least one Origins value in list x. If that is the case, the following, using pandas.Series.isin and pandas.Series.unique, will do the work

df_new = df[df['Origins'].isin(x)]['UserId'].unique()

[Out]:
[1 2]

Assuming one wants a series, one can convert the array to a series as follows

df_new = pd.Series(df_new)

[Out]:
0    1
1    2
dtype: int64

If one wants to return a Series and do it all in one step, one can use pandas.Series.drop_duplicates instead of unique (see Steven Rumbaliski's answer).
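
Since the question mentions a groupby, here is a rough sketch of how that variant might look, assuming the same sample data. The names has_match and users are hypothetical. It is not needed for this problem, but it can be handy when a per-user True/False flag is wanted rather than just the list of matching ids.

```python
import pandas as pd

x = [1, 2, 3]
df = pd.DataFrame({
    'UserId':  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'Origins': [1, 2, 3, 2, 2, 3, 7, 8, 9, 10, 11, 12],
})

# One boolean per user: True if any of that user's Origins is in x
has_match = df['Origins'].isin(x).groupby(df['UserId']).any()

# Keep only the users flagged True and return their ids as a Series
users = pd.Series(has_match[has_match].index.to_numpy(), name='UserId')

print(users.tolist())  # [1, 2]
```

The lambda form df.groupby('UserId')['Origins'].apply(lambda s: s.isin(x).any()) should give the same per-user flags, though calling isin once on the whole column is usually the cheaper option.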
