I am trying to write a function, to be used with apply, that finds the number immediately followed by the three characters alc. The expected result is 54.
import pandas as pd
import regex as re

numeros=[0,1,2,3,4,5,6,7,8,9]
i="sdASK23LJFASDFKJGHASDLKJF123HALSDKJFHASDF54 alcobas"
df=df.head(3)

def re_alcoba(i):
    i=i.replace(" ", "")
    patron_acoba=re.compile(r"alc")
    matches=patron_acoba.finditer(i)
    contador=1
    numero_alcobas=[]
    for match in matches:
        index=match.start()
        while contador < 3:
            numero=i[index-contador]
            contador+=1
            if numero in numeros:
                numero_alcobas.insert(0,numero)
    respuesta="".join(numero_alcobas)
    return respuesta

respuesta=re_alcoba(i)
CodePudding user response:
If you want the digits directly before alc, you don't need all this code. A single pattern is enough: (\d+)alc
import regex as re

i = "sdASKLJFASDFKJGHASDLKJFHALSDKJFHASDF54alcobas"
i = i.replace(" ", "")
results = re.findall(r"(\d+)alc", i)
print(results) # ['54']

i = "4asd5alc"
i = i.replace(" ", "")
results = re.findall(r"(\d+)alc", i)
print(results) # ['5']
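As a side note on the loop in the question: numeros holds integers, while a character sliced out of a string is itself a string, so the test `numero in numeros` can never be true. The short sketch below illustrates that; `numeros_str` is a name assumed here for illustration and is not part of the original code.

```python
# The list from the question holds ints.
numeros_int = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Assumed alternative: a string of digit characters.
numeros_str = "0123456789"

caracter = "5"  # a character taken from a string is a str, not an int

print(caracter in numeros_int)  # False: "5" is never equal to 5
print(caracter in numeros_str)  # True
print(caracter.isdigit())       # True: built-in check, no lookup list needed
```

Either comparing against digit characters or calling `str.isdigit()` avoids the type mismatch, although the regex above makes the manual loop unnecessary.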
CodePudding user response:
As there are spaces in your example string, you can match optional whitespace (excluding newlines) after the digits instead of removing the spaces first:
(\d+)[^\S\n]*alc
import re
pattern = r"(\d+)[^\S\n]*alc"
s = ("sdASK23LJFASDFKJGHASDLKJF123HALSDKJFHASDF54 alcobas\n"
     "4asd5alc")
print(re.findall(pattern, s))
Output
['54', '5']
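Since the question mentions apply, the same pattern could be wrapped in a small function that handles one string at a time. This is only a sketch: the names `PATRON_ALCOBAS` and `extraer_alcobas` are made up for illustration, and returning an int (or None when nothing matches) is an assumption about what is wanted.

```python
import re

# Hypothetical names; the pattern is compiled once and reused for every row.
PATRON_ALCOBAS = re.compile(r"(\d+)[^\S\n]*alc")

def extraer_alcobas(texto):
    """Return the number directly before 'alc' (spaces allowed), or None."""
    if not isinstance(texto, str):
        return None  # assumption: non-string cells (e.g. NaN) yield None
    match = PATRON_ALCOBAS.search(texto)
    return int(match.group(1)) if match else None

print(extraer_alcobas("sdASK23LJFASDFKJGHASDLKJF123HALSDKJFHASDF54 alcobas"))  # 54
print(extraer_alcobas("4asd5alc"))    # 5
print(extraer_alcobas("sin numero"))  # None
```

With pandas, this could then be applied to a text column, for example `df["descripcion"].apply(extraer_alcobas)`, where the column name `descripcion` is a placeholder for whatever column holds the text.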