numpy/pandas - find a substring by regex and replace it by selecting a random value from a list

Time:12-20

I have a list like the one below.

list=[1,2,3,4,5.....]

I also have a DataFrame (df) like the one below.

message
"2022-12-18 23:56:32,939  vlp=type rev=2 td=robert CIP=x.x.x.x motherBoard=A motherName=""A"" ns=nsA. npd=npd1 messageID=sfsdfdsfsdsa nu=nuA diui=8"
...
...

I use the code below to find the messageID value and replace it with a value picked at random from the list, but it doesn't work.

messageID = list(map(str, messageID))
df.messageID = df.messageID.str.replace(r'\s+messageID=(.*?)\s+', np.random.choice(messageID, size=len(df)) , regex=True)

Could anyone please take a look?

Thanks.

CodePudding user response:

Use a lookbehind with re.sub in a list comprehension to do the replacement:

import re

zipped = zip(df.messageID, np.random.choice(messageID, size=len(df)))
df['messageID'] = [re.sub(r'(?<=messageID=)\w+', s, r) for r, s in zipped]
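For reference, here is a minimal self-contained sketch of the same lookbehind idea that can be run as-is. The column name "message", the two sample rows, the id_pool values and the fixed seed are assumptions made for illustration only; they are not taken from the real data.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical sample data; the column name "message" is an assumption.
df = pd.DataFrame({
    "message": [
        "2022-12-18 23:56:32,939  vlp=type rev=2 npd=npd1 messageID=sfsdfdsfsdsa nu=nuA diui=8",
        "2022-12-18 23:57:01,104  vlp=type rev=2 npd=npd2 messageID=qwertyuiop nu=nuB diui=9",
    ]
})

# Hypothetical pool of replacement IDs; re.sub needs strings, so convert first.
id_pool = [str(n) for n in [1, 2, 3, 4, 5]]

# Seeded only so the sketch is reproducible; drop the seed for real use.
rng = np.random.default_rng(0)
picks = rng.choice(id_pool, size=len(df))  # one random ID per row

# The lookbehind matches only the value after "messageID=", so the key itself is kept.
pattern = re.compile(r"(?<=messageID=)\w+")
df["message"] = [pattern.sub(pick, text) for text, pick in zip(df["message"], picks)]

print(df["message"].tolist())
```

The original attempt most likely fails because Series.str.replace expects its replacement argument to be a single string or a callable, not an array with one value per row. Pairing each row with its own random pick via zip sidesteps that.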