How to put pandas df column values into extract regular expression

Time:12-08

I am wondering how to pass pandas DataFrame column values into a regular expression. I have tried the code below, but I get "TypeError: 'Series' objects are mutable, thus they cannot be hashed".

I'm after the result below. (I could just use a different regex, but I was wondering how this might be done dynamically.)

Thoughts appreciated :)

to_search     search_string  search_results
ABC-T3-123    ABC            ABC-T3
ABC-T2-123    ABC            ABC-T2
DEF-T1-123    DEF            DEF-T1

import pandas as pd

# create list for data frame
data = [['ABC-T3-123', 'ABC'], ['ABC-T2-123', 'ABC'], ['DEF-T1-123', 'DEF']]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['to_search', 'search_string'])

# Attempt: build the pattern from a column (this raises the TypeError above)
df['search_results'] = df['to_search'].str.extract("(" + df['search_string'] + "-T[0-9])")

CodePudding user response:

I know that you want an efficient solution, but pandas string methods such as str.extract typically take a single pattern for the whole column, not a Series of per-row patterns. Here is an apply-based solution, which I think, besides simplifying the regular expression, is the only viable solution here:

import re
searched = df.apply(lambda row: re.search("(" + row['search_string'] + "-T[0-9])", row['to_search']).group(1), axis=1)

Output:

>>> searched
0    ABC-T3
1    ABC-T2
2    DEF-T1
dtype: object
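
A slightly more defensive variant of the same row-wise idea, as a rough sketch rather than a definitive answer. Two things here are my own assumptions and not part of the question: the helper name extract_prefix is hypothetical, and I am assuming the search strings should be matched literally (hence re.escape) and that rows with no match should yield None instead of raising an AttributeError on .group(1).

```python
import re

import pandas as pd

# Same sample data as in the question
data = [['ABC-T3-123', 'ABC'], ['ABC-T2-123', 'ABC'], ['DEF-T1-123', 'DEF']]
df = pd.DataFrame(data, columns=['to_search', 'search_string'])


def extract_prefix(text, key):
    """Hypothetical helper: return '<key>-T<digit>' from text, or None if absent."""
    # re.escape treats the key as literal text (assumption: keys are not regexes)
    match = re.search("(" + re.escape(key) + "-T[0-9])", text)
    return match.group(1) if match else None


# Pair each row's text with its own search string and build the result column
df['search_results'] = [
    extract_prefix(text, key)
    for text, key in zip(df['to_search'], df['search_string'])
]

print(df)
```

This still runs one regex search per row in Python, so it is not vectorized; it only avoids the row-object overhead of apply(axis=1) and the crash on non-matching rows.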