Pandas DataFrame Not Dropping Rows Based on String Value

I am having some trouble with filtering a dataset based on string values. I have tried multiple methods and none of them seem to work. I have data which looks like the following:

Some of the "CountryNames" in this dataset are "Unknown", like the following:

I would like to filter out the rows with the "Unknown" value in CountryNames. I have tried multiple methods, but each one just produces the exact same dataset as before.

Here's a snippet of my code:

import pandas as pd

data = pd.read_excel(r"C:\Users\DylanNdengu\Downloads\combined_table.xlsx", index_col=False)

located_data = data[~data["CountryNames"].isin(["Unknown"])]

data and located_data have the exact same shape, and the rows with Unknown are still there. I have also tried the following commands:

located_data = data[~data["CountryNames"].isin(["Unknown"])==True]
located_data = data[data["CountryNames"].isin(["Unknown"])==False]
located_data = data[data["CountryNames"]!="Unknown"]

None of these work either. Please tell me what I am doing wrong and how to fix this. The dtype for the CountryNames column is "object" if that helps.

CodePudding user response:

From your image examples it looks like the values in your data are actually misspelled as Uknown rather than the correct Unknown. Otherwise your code seems correct.

Try this:

located_data = data[~data["CountryNames"].isin(["Uknown"])]
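To confirm which spelling is actually present in the column, it can help to list the distinct values before filtering. The following is only a sketch: the small DataFrame and its Value column are made up to stand in for the spreadsheet, which is not shown here.

```python
import pandas as pd

# Hypothetical sample data standing in for combined_table.xlsx
data = pd.DataFrame({
    "CountryNames": ["Kenya", "Uknown", "Ghana", "Uknown", "Peru"],
    "Value": [1, 2, 3, 4, 5],
})

# Count each distinct value; a misspelling shows up as its own entry
counts = data["CountryNames"].value_counts()
print(counts)

# repr() also makes stray leading or trailing whitespace visible
distinct = [repr(v) for v in data["CountryNames"].unique()]
print(distinct)

# Filter on the spelling that is really in the data
located_data = data[~data["CountryNames"].isin(["Uknown"])]
print(located_data.shape)
```

If the shape shrinks after filtering on the spelling reported by value_counts(), the original filter was fine and only the search string was off.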

Alternatively, you could fix your data by replacing the misspelled names with the correct ones, for example by using:

data.loc[data["CountryNames"] == "Uknown", "CountryNames"] = "Unknown"
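As a rough end-to-end sketch of the rename-then-filter approach, Series.replace can correct the spelling in the CountryNames column only, leaving the other columns untouched. The sample DataFrame and its Value column are hypothetical and only there to make the example runnable.

```python
import pandas as pd

# Hypothetical sample data standing in for combined_table.xlsx
data = pd.DataFrame({
    "CountryNames": ["Kenya", "Uknown", "Ghana", "Uknown", "Peru"],
    "Value": [1, 2, 3, 4, 5],
})

# Correct the spelling in the CountryNames column only
data["CountryNames"] = data["CountryNames"].replace("Uknown", "Unknown")

# With the spelling fixed, the filter from the question behaves as expected
located_data = data[data["CountryNames"] != "Unknown"]
print(located_data)
```

Restricting the assignment to one column matters: assigning a string to whole rows selected by a boolean mask would overwrite every column in those rows, not just the country name.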