Home > Back-end >  Dataframe - filter the values of a particular column with isin()
Dataframe - filter the values of a particular column with isin()

Time:06-01

I have a pandas dataframe in which I have the column "Bio Location", I would like to filter it so that I only have the locations of my list in which there are names of cities. I have made the following code which works except that I have a problem.

For example, if the location is "Paris France" and I have Paris in my list then it will return the result. However, if I had "France Paris", it would not return "Paris". Do you have a solution? Maybe use regex? Thank u a lot!!!

  df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
  cities = [Paris, Bruxelles, Madrid]
  values = df[df['Bio Location'].isin(citiesfr)]
  values.to_csv(r'results.csv', index = False)

CodePudding user response:

Well, hello there o/

What you want here is .str.contains():

1. The DF I used to test:

df = {
    'col1':['Paris France','France Paris Test','France Paris','Madrid Spain','Spain Madrid Test','Spain Madrid'] #so tested with 1x at start, 1x in the middle and 1x at the end of a str
              }
df = pd.DataFrame(df)
df

Result:

index col1
0 Paris France
1 France Paris Test
2 France Paris
3 Madrid Spain
4 Spain Madrid Test
5 Spain Madrid

2. Then applying the code below:

Updated following comment #so tested with 1x at start, 1x in the middle and 1x at the end of a str

reg = ('Paris|Madrid')
df = df[df.col1.str.contains(reg)]
df

Result:

index col1
0 Paris France
1 France Paris Test
2 France Paris
3 Madrid Spain
4 Spain Madrid Test
5 Spain Madrid

Please accept ✅ this answer if it solved your problem, it motivates me :)

Otherwise mention me (using @) in comment while telling me what's wrong ;)

  • Related