To reduce ambiguity, I want my '==' comparison to match regardless of case.
For example, Apple, APPLE, and aPpLe should all be accepted (and so on).
function = df.loc[df['Food'] == 'Apple']  # and so on...
Do I have to specify every single variation like below, or is there a cleaner alternative?
function = df.loc[df['Food'] == 'Apple|apple|APPLE']  # and so on...
CodePudding user response:
Use Series.str.fullmatch, which accepts a case argument. fullmatch requires the whole string to match the pattern, not just part of it. To find the pattern anywhere in the string you would use .str.contains, and to match only from the start of the string you would use .str.match.
import pandas as pd
df = pd.DataFrame({'Food': ['ApPlE', 'apple', 'APPLE', 'apples', 'Banana']})
# Assign the Boolean mask for illustration
df['any_case_apple'] = df['Food'].str.fullmatch('apple', case=False)
print(df)
#      Food  any_case_apple
# 0   ApPlE            True
# 1   apple            True
# 2   APPLE            True
# 3  apples           False
# 4  Banana           False
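Since the question filters rows with df.loc, here is a minimal sketch of passing the same mask straight to df.loc instead of storing it in a column. The None entry and the na=False argument are additions for illustration, assuming the column may contain missing values; without na=False those rows would produce NaN in the mask rather than False.

```python
import pandas as pd

# Sample data; the trailing None is an assumed missing value, not from the question
df = pd.DataFrame({'Food': ['ApPlE', 'apple', 'APPLE', 'apples', 'Banana', None]})

# na=False treats missing values as "no match", so the mask is purely Boolean
mask = df['Food'].str.fullmatch('apple', case=False, na=False)

# Keep only the rows whose Food is exactly "apple" in any letter case
apples = df.loc[mask]
print(apples)
```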
CodePudding user response:
To ignore case:
df.loc[df['Food'].str.lower()=='apple']
# or
df.loc[df['Food'].str.lower()=='Apple'.lower()]
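One caveat, offered as a sketch rather than a requirement: for plain ASCII values like "Apple", .str.lower() is enough, but if the column might hold non-ASCII text, .str.casefold() applies the more aggressive Unicode case folding. The 'Straße' data below is a made-up example to show the difference.

```python
import pandas as pd

# Hypothetical data: the German sharp s only folds to "ss" under casefold()
df = pd.DataFrame({'Food': ['Apple', 'STRASSE', 'Straße']})
target = 'strasse'

# lower() leaves 'ß' unchanged, so 'Straße' does not compare equal
lower_hits = df.loc[df['Food'].str.lower() == target.lower()]

# casefold() maps 'ß' to 'ss', so both spellings compare equal
fold_hits = df.loc[df['Food'].str.casefold() == target.casefold()]

print(lower_hits)
print(fold_hits)
```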
CodePudding user response:
If you want to check whether Apple appears anywhere in the strings, ignoring case, you can do this:
df[df['Food'].str.contains('apple', case=False)]
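Note that a substring check also accepts values such as "apples" or "Pineapple". The sketch below, using made-up sample data, puts .str.contains next to .str.fullmatch to show the difference; regex=False is an optional addition that treats the pattern as literal text rather than a regular expression.

```python
import pandas as pd

# Made-up sample data for illustration
df = pd.DataFrame({'Food': ['ApPlE', 'apples', 'Pineapple', 'Banana']})

# Substring match: "apple" anywhere in the value, ignoring case
substring_hits = df[df['Food'].str.contains('apple', case=False, regex=False)]

# Exact match: the whole value must be "apple", ignoring case
exact_hits = df[df['Food'].str.fullmatch('apple', case=False)]

print(substring_hits)
print(exact_hits)
```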