Pandas - Extract value from column in data frame

Time:09-27

I have a pandas data frame with very long text in one column. I wanted to select all rows where that column contains ABC, and I was able to do this with the following:

 df[df['Column'].str.contains('ABC', na=False)]

What I want to do next is extract every value from this field that consists of the prefix and the next 5 characters. So after finding a matching row, I would want to get ABC1234 or ABC7899.

I hope this makes sense.

CodePudding user response:

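You can use str.extract with a regular expression that captures ABC followed by exactly five digits. Rows without a match get NaN, so they can be dropped afterwards. Change the {5} quantifier if your codes have a different number of digits.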

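import pandas as pd

df = pd.DataFrame({'Column':['ABC12345 is in this column', 'Not in this one CCD11111','Also in this one ABC99882']})
# Capture 'ABC' plus the five digits that follow; rows without a match get NaN
df['capture'] = df['Column'].str.extract(r'(ABC\d{5})', expand=False)
# Drop the rows where nothing was captured
df.dropna(subset=['capture'], inplace=True)
print(df)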

Output

                      Column   capture
0  ABC12345 is in this column  ABC12345
2   Also in this one ABC99882  ABC99882
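
Note that str.extract only returns the first match in each cell. Since the question mentions long text and wanting all values, here is a minimal sketch of two ways to collect every match, using str.findall and str.extractall. The sample data (including the row with two codes) and the names codes and all_codes are made up for illustration, and the pattern still assumes five digits after ABC.

```python
import pandas as pd

# Hypothetical sample data; the second row contains two codes.
df = pd.DataFrame({'Column': ['ABC12345 is in this column',
                              'Two here: ABC99882 and ABC00001',
                              'Not in this one CCD11111']})

# findall gives a list of every match per row (an empty list when there is none)
df['codes'] = df['Column'].str.findall(r'ABC\d{5}')

# extractall gives one row per match, indexed by (original row, match number);
# column 0 holds the first (and only) capture group
matches = df['Column'].str.extractall(r'(ABC\d{5})')[0]
all_codes = matches.tolist()
print(all_codes)
```

findall keeps the matches next to the original row, while extractall is handier when you want one flat list or Series of codes across the whole column.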