List of Python Dataframe column values meeting criteria in another dataframe?-CodePudding

I have two dataframes, df1 and df2, which have a common column heading, Name. The Name values are unique within df1 and df2. df1's Name values are a subset of those in df2; df2 has more rows -- about 17,300 -- than df1 -- about 6,900 -- but each Name value in df1 is in df2. I would like to create a list of Name values in df1 that meet certain criteria in other columns of the corresponding rows in df2. Example:

df1:

   Name Age Hair
0  Jim   25 black
1  Mary  58 brown
3  Sue   15 purple

df2:

   Name Country phoneOS
0  Shari  GB    Android
1  Jim    US    Android
2  Alain  TZ    iOS
3  Sue    PE    iOS
4  Mary   US    Android

I would like a list of only those Name values in df1 that have df2 Country and phoneOS values of US and Android. The example result would be [Jim, Mary].

I have successfully selected rows within one dataframe that meet multiple criteria in order to copy those rows to a new dataframe. In that case pandas/Python does the iteration over rows internally. I guess I could write a "manual" iteration over the rows of df1 and access df2 on each iteration. I was hoping for a more efficient solution whereby the iteration was handled internally as in the single-dataframe case. But my searches for such a solution have been fruitless.

CodePudding user response：

try:

df_1.loc[df_1.Name.isin(df_2.loc[df_2.Country.eq('US') & \
                                 df_2.phoneOS.eq('Android'), 'Name']), 'Name']

Result:

0     Jim
1    Mary
Name: Name, dtype: object

if you want the result as a list just add .to_list() at the end

CodePudding user response：

data = df1.merge(df2, on='Name')

data.loc[((data.phoneOS == 'Android') & (data.Country == "US")), 'Name'].values.tolist()