Filtering a column: specifying case sensitivity

Time:12-17

To reduce ambiguity, I want to improve my code so that my '==' comparison matches regardless of case.

For example, Apple, APPLE, and aPpLe should all be accepted (and so on).

function = df.loc[df['Food'] == 'Apple']  # and so on...

Do I have to specify every single variation like below, or is there a cleaner alternative?

function = df.loc[df['Food'] == 'Apple|apple|APPLE']  # and so on...

CodePudding user response:

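Use Series.str.fullmatch, which accepts a case argument. fullmatch requires the entire string to match the pattern (which is treated as a regular expression), not just part of it. For partial matches you would use .str.match, which only anchors at the start of the string, or .str.contains, which matches anywhere in the string.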

import pandas as pd
df = pd.DataFrame({'Food': ['ApPlE', 'apple', 'APPLE', 'apples', 'Banana']})

# Assign the Boolean mask for illustration
df['any_case_apple'] = df['Food'].str.fullmatch('apple', case=False)

print(df)
#     Food  any_case_apple
#0   ApPlE            True
#1   apple            True
#2   APPLE            True
#3  apples           False
#4  Banana           False
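To actually filter the rows rather than add a column, the same mask can be passed to df.loc. The sketch below assumes the column may contain missing values (the None row is added here for illustration and is not in the original data), so it passes na=False to treat them as non-matches:

```python
import pandas as pd

# Sample data; the trailing None is an assumption added to show NaN handling
df = pd.DataFrame({'Food': ['ApPlE', 'apple', 'APPLE', 'apples', 'Banana', None]})

# na=False makes missing values count as non-matches, so the mask is purely Boolean
mask = df['Food'].str.fullmatch('apple', case=False, na=False)

# Keep only the rows where the whole string equals 'apple' in any case
apples = df.loc[mask]
print(apples)
```

Without na=False, missing values produce NaN in the mask, which df.loc will refuse to index with.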

CodePudding user response:

To ignore case:

df.loc[df['Food'].str.lower()=='apple']

# or

df.loc[df['Food'].str.lower()=='Apple'.lower()]
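The same lowercasing idea extends to several accepted values at once by combining it with isin. This is only a sketch: the wanted list and the sample rows are hypothetical and not part of the question.

```python
import pandas as pd

df = pd.DataFrame({'Food': ['ApPlE', 'apple', 'BANANA', 'apples', 'Cherry']})

# Hypothetical list of accepted values, written in any case
wanted = ['Apple', 'Banana']

# Lowercase both sides so the comparison ignores case
wanted_lower = [w.lower() for w in wanted]
result = df.loc[df['Food'].str.lower().isin(wanted_lower)]
print(result)
```

Because this is a plain equality test on lowercased strings, 'apples' is not matched.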

CodePudding user response:

If you want to check whether 'apple' appears anywhere in the strings, ignoring case, you can do this:

df[df['Food'].str.contains('apple', case=False)]
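Note that str.contains treats the search term as a regular expression by default and returns NaN for missing values. A minimal sketch, assuming you want a literal substring search and want missing values treated as non-matches (the sample data here is made up for illustration):

```python
import pandas as pd

# Sample data; 'Pineapple' and None are assumptions added to show the behaviour
df = pd.DataFrame({'Food': ['ApPlE', 'Pineapple', 'apples', 'Banana', None]})

# regex=False searches for the literal text; na=False treats missing values as non-matches
hits = df[df['Food'].str.contains('apple', case=False, regex=False, na=False)]
print(hits)
```

Unlike fullmatch, this also keeps 'Pineapple' and 'apples', since the text only has to appear somewhere in the string.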