I am wondering how to pass pandas DataFrame column values into a regular expression. I have tried the code below but get "TypeError: 'Series' objects are mutable, thus they cannot be hashed".
I'm after the result below. (I could just use a different regex, but I was wondering how this might be done dynamically.)
Thoughts appreciated :)
to_search   search_string  search_results
ABC-T3-123  ABC            ABC-T3
ABC-T2-123  ABC            ABC-T2
DEF-T1-123  DEF            DEF-T1
```python
import pandas as pd
# Create the list of rows for the data frame
data = [['ABC-T3-123', 'ABC'], ['ABC-T2-123', 'ABC'], ['DEF-T1-123', 'DEF']]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['to_search', 'search_string'])
# This line raises the TypeError: the pattern passed to str.extract is a Series, not a string
df['search_results'] = df['to_search'].str.extract("(" + df['search_string'] + "-T[0-9])")
```
CodePudding user response:
I know that you want an efficient solution, but pandas string methods such as str.extract typically expect a single pattern string and do not accept a Series in its place. Here is an apply-based solution that builds the pattern row by row, which I think, besides simplifying the regular expression, is the only viable solution here:
import re
searched = df.apply(lambda row: re.search("(" + row['search_string'] + "-T[0-9])", row['to_search']).group(1), axis=1)
Output:
>>> searched
0    ABC-T3
1    ABC-T2
2    DEF-T1
dtype: object
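One caveat with the apply approach: re.search returns None when a row has no match, so calling .group(1) on it raises an AttributeError, and any regex metacharacters in search_string get interpreted as part of the pattern. Below is a minimal sketch of a more defensive variant, not a definitive implementation. The helper name extract_prefix and the extra 'XYZ-123' row are hypothetical, added only to show the no-match case; it uses a list comprehension over zip instead of apply, which is a swapped-in technique rather than the method shown above.

```python
import re

import pandas as pd

# Same data as in the question, plus one hypothetical row that has no "-T<digit>" part
df = pd.DataFrame(
    {
        "to_search": ["ABC-T3-123", "ABC-T2-123", "DEF-T1-123", "XYZ-123"],
        "search_string": ["ABC", "ABC", "DEF", "XYZ"],
    }
)


def extract_prefix(to_search, search_string):
    """Hypothetical helper: return '<search_string>-T<digit>' if found, else None."""
    # re.escape keeps characters such as '.' or '+' in search_string literal
    pattern = "(" + re.escape(search_string) + "-T[0-9])"
    match = re.search(pattern, to_search)
    return match.group(1) if match else None


# Build one pattern per row by pairing the two columns element-wise
df["search_results"] = [
    extract_prefix(text, key)
    for text, key in zip(df["to_search"], df["search_string"])
]

print(df)
```

The first three rows should give the same values as the apply version, and the extra row ends up with a missing value instead of raising an exception.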