Given the following df:
import pandas as pd
data = {'Description': ['with chicken', 'champagne', 'Chicken', 'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN']}
df = pd.DataFrame(data)
print(df)
If I search the column for a word, in this case "chicken", I would like to find its starting position in each string, if present. Here is the expected output:
data = {'Description': ['with chicken', 'champagne', 'Chicken', 'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN'],
'ChickenPosition': ['6', 'NA', '1', '13', 'NA', '5']
}
df = pd.DataFrame(data)
print(df)
Is anybody able to write something compact, without many steps? Thanks a lot in advance!
CodePudding user response:
string_to_search = "chicken"
df['ChickenPosition'] = df['Description'].apply(lambda x: x.lower().index(string_to_search.lower()) + 1 if string_to_search.lower() in x.lower() else "NA")
print(df)
Output:
           Description  ChickenPosition
0         with chicken                6
1            champagne               NA
2              Chicken                1
3  bananas and chicken               13
4         fafsa Lemons               NA
5          GIN CHICKEN                5
The + 1 is there to match the result you requested; without it you would get the string index, which counts from 0.
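As a quick illustration of that offset, here is a minimal sketch in plain Python, with no pandas involved. The variable names are only for illustration.

```python
text = "with chicken"
word = "chicken"

zero_based = text.lower().find(word)   # Python string index, counted from 0
one_based = zero_based + 1             # position counted from 1, as in the question

# find() returns -1 when the word is absent, whereas index() raises ValueError
missing = "champagne".find(word)

print(zero_based, one_based, missing)
```

This is also why the lambda above checks `in` first: calling `index()` on a string that does not contain the word would raise an error.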
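Alternative solution using pd.Series.str.find, which returns -1 when the word is absent. Note that the positions it returns are 0-based, so add 1 before replacing if you need the exact output you requested:
string_to_search = "chicken"
df['ChickenPosition'] = df['Description'].str.lower().str.find(string_to_search.lower()).replace(-1, "NA")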
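If the 1-based positions from the question are needed with str.find, one possible sketch is shown here. It assumes you are happy to store missing matches as pandas missing values (nullable `Int64` dtype) rather than the string "NA"; that choice is mine, not part of the question.

```python
import pandas as pd

data = {'Description': ['with chicken', 'champagne', 'Chicken',
                        'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN']}
df = pd.DataFrame(data)
string_to_search = "chicken"

# str.find gives the 0-based index of the first match, or -1 if there is none
pos = df['Description'].str.lower().str.find(string_to_search.lower())

# Shift to 1-based positions, blank out the misses, and keep an integer dtype
df['ChickenPosition'] = (pos + 1).where(pos >= 0).astype('Int64')
print(df)
```

Keeping the column numeric means it can still be sorted or filtered by position; if the literal text "NA" is required, the string-based versions above are the simpler route.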
CodePudding user response:
This would be a shorthand version:
import pandas as pd
import re
data = {'Description': ['with chicken', 'champagne', 'Chicken', 'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN']}
df = pd.DataFrame(data)
pattern = re.compile('chicken')
# search() returns None when there is no match; span()[0] is the 0-based start of the match
results = df['Description'].apply(lambda x: pattern.search(str(x).lower()).span()[0]
                                  if pattern.search(str(x).lower()) is not None else 'NA')
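A variation on the regex approach, as a sketch only: the helper name `find_position` is hypothetical, and it uses `re.IGNORECASE` instead of lowercasing each string, plus `re.escape` so the search word is treated literally. It returns the 1-based position asked for in the question.

```python
import re

def find_position(text, word):
    """Return the 1-based position of `word` in `text`, or 'NA' if absent."""
    # Search once per string; IGNORECASE makes 'Chicken' and 'CHICKEN' match too
    match = re.search(re.escape(word), str(text), flags=re.IGNORECASE)
    return match.start() + 1 if match else 'NA'

descriptions = ['with chicken', 'champagne', 'Chicken',
                'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN']
positions = [find_position(d, 'chicken') for d in descriptions]
print(positions)
```

With the DataFrame from the question, the same helper could be applied as `df['Description'].apply(lambda x: find_position(x, 'chicken'))`. Unlike the lambda above, it runs the search only once per row.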