Syntax issue in creating subsets based on column contents pandas

Time:12-21

My dataset looks as follows:

Country Code Value
IRL 10
IRL 12
IRL 11
FRA 15
FRA 16
IND 9
IND 11
USA 19
USA 4
HUN 30
HUN 1
HUN 31
HUN 11

I am attempting to extract rows with specific country codes using .loc; however, this doesn't seem to work when I pass multiple strings in the comparison.

My code looks as follows: subset = df.loc[df["Country Code"] == ("IRL", "FRA", "IND")]

When I do this, my code doesn't return an error, but rather gives me an empty subset, so I am curious, what is wrong with my syntax, and what is my current code actually doing?

CodePudding user response:

df["Country Code"] == ("IRL", "FRA", "IND") compares each item in the column Country Code against the whole tuple ("IRL", "FRA", "IND") as a single value. That is why it doesn't error out: none of the values in your column is a tuple, so every comparison is False and the subset comes back empty.

You want pd.Series.isin instead, i.e. subset = df.loc[df["Country Code"].isin(("IRL", "FRA", "IND"))]
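As a minimal sketch of the isin approach, the example below rebuilds the data from the question and filters it. The variable names wanted, mask and rest are only illustrative, and the column is assumed to be named "Country Code" as in the question's code.

```python
import pandas as pd

# Data copied from the question; column name assumed to be "Country Code".
df = pd.DataFrame({
    "Country Code": ["IRL", "IRL", "IRL", "FRA", "FRA", "IND", "IND",
                     "USA", "USA", "HUN", "HUN", "HUN", "HUN"],
    "Value": [10, 12, 11, 15, 16, 9, 11, 19, 4, 30, 1, 31, 11],
})

wanted = ["IRL", "FRA", "IND"]  # illustrative name for the codes to keep

# isin returns a boolean Series: True where the code is one of `wanted`.
mask = df["Country Code"].isin(wanted)

subset = df.loc[mask]   # rows for IRL, FRA and IND
rest = df.loc[~mask]    # ~ inverts the mask: every other country

print(subset)
```

Keeping the mask in its own variable makes it easy to reuse or invert it with ~ when the complementary rows are needed.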

CodePudding user response:

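You can also do that with a single loc by combining several conditions with | (element-wise OR; Python has no || operator). Each comparison needs its own parentheses:

display(df.loc[(df["Country Code"] == "IRL") | (df["Country Code"] == "FRA")])

and so on for each additional code. To keep only some columns, pass a list of column names as the second argument to loc.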
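A runnable sketch of the OR-combination approach, assuming the same data as in the question, might look like this. The names codes, mask and subset_or are illustrative. Each comparison is parenthesised because | binds more tightly than == in Python.

```python
import pandas as pd

# Data copied from the question; column name assumed to be "Country Code".
df = pd.DataFrame({
    "Country Code": ["IRL", "IRL", "IRL", "FRA", "FRA", "IND", "IND",
                     "USA", "USA", "HUN", "HUN", "HUN", "HUN"],
    "Value": [10, 12, 11, 15, 16, 9, 11, 19, 4, 30, 1, 31, 11],
})

codes = df["Country Code"]

# Element-wise OR of three boolean Series; parentheses are required.
mask = (codes == "IRL") | (codes == "FRA") | (codes == "IND")

# Second argument to loc selects columns (here, both of them).
subset_or = df.loc[mask, ["Country Code", "Value"]]

print(subset_or)
```

This selects the same rows as isin, but it gets verbose quickly as the list of codes grows.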
