python pandas search string in a column and get initial position if the string is found

Time:07-23

Given the following df:

import pandas as pd

data = {'Description':  ['with chicken', 'champagne', 'Chicken', 'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN'],}
df = pd.DataFrame(data)
print (df)

If I search the column for a word, in this case "chicken", I would like to get the position where it starts in each string, if it is present. Here is the expected output:

data = {'Description':  ['with chicken', 'champagne', 'Chicken', 'bananas and chicken','fafsa Lemons', 'GIN CHICKEN'],
       'ChickenPosition':  ['6', 'NA', '1', '13', 'NA', '5']
       }
df = pd.DataFrame(data)
print (df)

Is anybody able to write something extremely compact, without many steps? Thanks a lot in advance!

CodePudding user response:

string_to_search = "chicken"

df['ChickenPosition'] = df['Description'].apply(lambda x: x.lower().index(string_to_search.lower()) + 1 if string_to_search.lower() in x.lower() else "NA")
print(df)

Output:

           Description ChickenPosition
0         with chicken               6
1            champagne              NA
2              Chicken               1
3  bananas and chicken              13
4         fafsa Lemons              NA
5          GIN CHICKEN               5

The + 1 is there to give the result you requested; without it you would get the string index, which counts from 0.

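Alternative solution using pd.Series.str.find. Note that it returns the 0-based index (5, NA, 0, 12, NA, 4 for the data above), not the 1-based positions shown in the output:

string_to_search = "chicken"

# str.find returns -1 when the string is not found
df['ChickenPosition'] = df['Description'].str.lower().str.find(string_to_search.lower()).replace(-1, "NA")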
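
If you want the 1-based positions from the question and a numeric column instead of "NA" strings, the str.find idea can be wrapped as below. This is only a sketch: the helper name first_position is made up for illustration, and using the nullable Int64 dtype (so misses show as <NA>) is my assumption about what is convenient, not something the question asked for.

```python
import pandas as pd

def first_position(series: pd.Series, needle: str) -> pd.Series:
    """Return the 1-based position of `needle` in each string, or <NA> if absent."""
    # str.find is 0-based and yields -1 when the substring is not found
    idx = series.str.lower().str.find(needle.lower())
    # shift to 1-based, blank out the misses, then use the nullable integer dtype
    return (idx + 1).mask(idx < 0).astype("Int64")

df = pd.DataFrame({"Description": ["with chicken", "champagne", "Chicken",
                                   "bananas and chicken", "fafsa Lemons", "GIN CHICKEN"]})
df["ChickenPosition"] = first_position(df["Description"], "chicken")
print(df)
```

Keeping the column numeric means you can still filter or sort on it, e.g. df[df["ChickenPosition"].notna()].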

CodePudding user response:

This would be a shorthand version:

import pandas as pd
import re
data = {'Description':  ['with chicken', 'champagne', 'Chicken', 'bananas and chicken', 'fafsa Lemons', 'GIN CHICKEN'],}
df = pd.DataFrame(data)

pattern = re.compile('chicken')

# span()[0] is the 0-based start of the match; add 1 for the 1-based position in the question
results = df['Description'].apply(lambda x: pattern.search(str(x).lower()).span()[0] \
                                  if pattern.search(str(x).lower()) is not None else 'NA')
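
A possible variation on the regex approach, in case it helps: run the search once per row and let the pattern ignore case instead of lowering each string. The helper name match_position is hypothetical, and re.escape is added on the assumption that the search term should be treated as literal text rather than as a regular expression.

```python
import re
import pandas as pd

def match_position(text, pattern):
    """Return the 1-based start of the first match, or "NA" if there is none."""
    m = pattern.search(str(text))
    # m.start() is 0-based, so add 1 to get the position asked for in the question
    return m.start() + 1 if m else "NA"

# re.escape treats the search term literally; IGNORECASE replaces the .lower() calls
pattern = re.compile(re.escape("chicken"), flags=re.IGNORECASE)

df = pd.DataFrame({"Description": ["with chicken", "champagne", "Chicken",
                                   "bananas and chicken", "fafsa Lemons", "GIN CHICKEN"]})
df["ChickenPosition"] = df["Description"].apply(lambda s: match_position(s, pattern))
print(df)
```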