Syntax issue in creating subsets based on column contents in pandas

My dataset looks as follows:

Country Code	Value
IRL	10
IRL	12
IRL	11
FRA	15
FRA	16
IND	9
IND	11
USA	19
USA	4
HUN	30
HUN	1
HUN	31
HUN	11

I am attempting to extract rows with specific country codes using .loc, but this doesn't seem to work when I compare the column against multiple strings at once.

My code looks as follows: subset = df.loc[df["Country Code"] == ("IRL", "FRA", "IND")]

When I do this, my code doesn't raise an error but gives me an empty subset. What is wrong with my syntax, and what is my current code actually doing?

CodePudding user response：

df["Country Code"] == ("IRL", "FRA", "IND") compares each item in the Country Code column against the whole tuple ("IRL", "FRA", "IND"). That is why it doesn't raise an error and gives you nothing: none of the values in your column is a tuple, so every comparison is False.

You want to use pd.Series.isin instead, i.e. subset = df.loc[df["Country Code"].isin(("IRL", "FRA", "IND"))]
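As a minimal sketch of the isin approach, the snippet below rebuilds the sample data from the question and filters it. The column name "Country Code" is taken from the question's code; the variable names mask and subset are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Country Code": ["IRL", "IRL", "IRL", "FRA", "FRA", "IND", "IND",
                     "USA", "USA", "HUN", "HUN", "HUN", "HUN"],
    "Value": [10, 12, 11, 15, 16, 9, 11, 19, 4, 30, 1, 31, 11],
})

# isin returns a boolean Series that is True wherever the value
# is one of the listed codes.
mask = df["Country Code"].isin(["IRL", "FRA", "IND"])

# Passing the mask to .loc keeps only the matching rows.
subset = df.loc[mask]
print(subset)
```

isin accepts any list-like (list, tuple, set), so the tuple from the question works here too.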

CodePudding user response：

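You can also do that with several comparisons inside a single loc, joined with | (element-wise OR). Each comparison needs its own parentheses, and Python's || does not exist:

display(df.loc[(df["Country Code"] == "IRL") | (df["Country Code"] == "FRA"), columns])

and so on for each further country code, where columns is the list of column names you want to keep.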
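A rough sketch of the OR-based filter, assuming the same sample data as in the question. The names mask, columns and subset_or are illustrative, and print is used instead of display so it runs outside a notebook.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Country Code": ["IRL", "IRL", "IRL", "FRA", "FRA", "IND", "IND",
                     "USA", "USA", "HUN", "HUN", "HUN", "HUN"],
    "Value": [10, 12, 11, 15, 16, 9, 11, 19, 4, 30, 1, 31, 11],
})

# Each comparison is wrapped in parentheses because | binds more
# tightly than ==; without them Python would evaluate the | first.
mask = (
    (df["Country Code"] == "IRL")
    | (df["Country Code"] == "FRA")
    | (df["Country Code"] == "IND")
)

# Optional column selection: a list of the column names to keep.
columns = ["Country Code", "Value"]
subset_or = df.loc[mask, columns]
print(subset_or)
```

This selects the same rows as isin; with more than two or three codes, isin is usually the shorter way to write it.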
