I have a pandas Series of strings (job titles). I also have a string with a "target" job title.
My task is to iterate over the Series and check if the "target" is present in the elements of the Series. It must be an exact match (including white spaces).
If it is present, the element of the Series must be replaced with the "target". If there is no match, the original element stays unchanged. I must return the adjusted Series.
Here is my code and errors.
def nn(s, target):
    sn = pd.Series()
    for w in s:
        if target in w:
            sn.append(pd.Series(target))
        else:
            sn.append(pd.Series(w))
    return sn
INPUT data:
import pandas as pd
import copy
s = pd.Series(['DATA ANALYTIC SCIENTIST',
               'BEST DATA SCIENTIST',
               'DATA SCIENTIST',
               'DATA SCIENTIST - SPACE OPTIMIZATION',
               'SCIENTIST DATA'])
target = 'DATA SCIENTIST'
nn(s, target)
ERRORS (empty Series)
Series([], dtype: float64)
Thank you!!
CodePudding user response:
You want str.contains and mask:
s.mask(s.str.contains(target), target)
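One caveat: str.contains treats its pattern as a regular expression by default, so a target containing characters such as + or ( could be misinterpreted or rejected. A minimal sketch, assuming the target should be matched literally, passes regex=False. The "C++" titles below are made-up sample data, not from the question:

```python
import pandas as pd

# Hypothetical sample data (not from the question): the target contains
# the regex metacharacter "+", which the default regex=True would not
# treat as a literal plus sign.
titles = pd.Series(['SENIOR C++ DEVELOPER', 'C++ DEVELOPER', 'JAVA DEVELOPER'])
target = 'C++ DEVELOPER'

# regex=False makes str.contains perform a plain substring test.
result = titles.mask(titles.str.contains(target, regex=False), target)
print(result)
```

For the job titles in the question the default behaves the same, since 'DATA SCIENTIST' has no special regex characters.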
Or str.extract then fillna:
s.str.extract(f'({target})', expand=False).fillna(s)
Output:
0    DATA ANALYTIC SCIENTIST
1             DATA SCIENTIST
2             DATA SCIENTIST
3             DATA SCIENTIST
4             SCIENTIST DATA
dtype: object
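As for why the original loop returned an empty Series: Series.append never modified sn in place. It returned a new Series, which the loop discarded, and the method was removed entirely in pandas 2.0. If an explicit loop is still preferred, a minimal sketch is to collect the values in a list and build the Series once at the end. The name replace_matches is a placeholder for the question's nn, and the sketch assumes every element is a string:

```python
import pandas as pd

def replace_matches(s, target):
    # Plain substring test per element; titles without the target
    # pass through unchanged.
    values = [target if target in w else w for w in s]
    # Reuse the original index so the result lines up with the input.
    return pd.Series(values, index=s.index)

s = pd.Series(['DATA ANALYTIC SCIENTIST',
               'BEST DATA SCIENTIST',
               'DATA SCIENTIST',
               'DATA SCIENTIST - SPACE OPTIMIZATION',
               'SCIENTIST DATA'])
target = 'DATA SCIENTIST'

adjusted = replace_matches(s, target)
print(adjusted)
```

The vectorized versions above will usually be faster on large Series; the loop is shown only to make the fix to the original approach explicit.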