Find specific values in a column and take the next few characters after them

I have a problem with one task. I have a list of values that looks like below:

values = ["a","b","c"]

and my DF looks like below:

column_1 column_2
1        sffasdef
2        bsrewsaf
3        tyhvbsrc
4        ertyui1c
5        qwertyyu

I have to check whether any of the values in the list occurs in column_2. If one does, a new column should hold the matched value plus the next 3 characters (fewer if the string ends first); otherwise it should be NaN. The DF should look like below:

column_1 column_2  column_3
1        sffasdef  asde
2        bsrewsaf  bsre
3        tyhvbsrc  bsrc
4        ertyui1c  c
5        qwertyyu  NaN

Do you have any idea how to solve this? Regards

CodePudding user response：

Use .str.extract:

df['column_3'] = df['column_2'].str.extract(f'((?:{"|".join(values)})(?:.?){{3}})')

# OR, possibly more readable

values_re = '|'.join(values)
df['column_3'] = df['column_2'].str.extract(r'((?:' + values_re + ')(?:.?){3})')

Output:

>>> df
   column_1  column_2 column_3
0         1  sffasdef     asde
1         2  bsrewsaf     bsre
2         3  tyhvbsrc     bsrc
3         4  ertyui1c        c
4         5  qwertyyu      NaN
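
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the question; the re.escape call, the `.{0,3}` spelling of "up to 3 more characters", and expand=False are my own additions rather than part of the snippet above, so treat them as assumptions about what you might want.

```python
import re

import pandas as pd

values = ["a", "b", "c"]
df = pd.DataFrame({
    "column_1": [1, 2, 3, 4, 5],
    "column_2": ["sffasdef", "bsrewsaf", "tyhvbsrc", "ertyui1c", "qwertyyu"],
})

# Assumption: values should be matched literally, so escape any regex
# metacharacters (".", "+", "(", ...) before joining them with "|".
alternatives = "|".join(re.escape(v) for v in values)

# One capture group: a value from the list, then up to 3 more characters.
pattern = rf"((?:{alternatives}).{{0,3}})"

# expand=False returns a Series instead of a one-column DataFrame.
df["column_3"] = df["column_2"].str.extract(pattern, expand=False)
print(df)
```

Rows without any match (here "qwertyyu") end up as NaN, because str.extract returns NaN when the pattern does not match.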

CodePudding user response：

Assuming you have single characters in values:

df['column_3'] = df['column_2'].str.extract(fr'([{"".join(values)}].{{,3}})')

output:

   column_1  column_2 column_3
0         1  sffasdef     asde
1         2  bsrewsaf     bsre
2         3  tyhvbsrc     bsrc
3         4  ertyui1c        c
4         5  qwertyyu      NaN
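
To illustrate why the single-character assumption matters, here is a small sketch using plain `re` on one string. The values list and the sample text are hypothetical, chosen only to show the difference: a character class treats "ab" as "a or b", while an alternation matches "ab" as a whole.

```python
import re

values = ["ab", "c"]   # hypothetical multi-character values
text = "xxaybcdef"     # hypothetical sample string

# Character class: any single character that appears in any of the values.
char_class = re.compile("[" + "".join(map(re.escape, values)) + "].{0,3}")

# Alternation: each value has to match as a whole.
alternation = re.compile("(?:" + "|".join(map(re.escape, values)) + ").{0,3}")

print(char_class.search(text).group())   # aybc  (starts at the lone "a")
print(alternation.search(text).group())  # cdef  ("ab" never occurs, "c" does)
```

So the character-class version is fine for the single-letter values in the question, but with longer values the alternation form is probably the one you want.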