How to extract numbers at the end of the strings with repeated pattern in a Pandas column in Python?

Time:07-12

I would like to extract all the numbers at the end of the string in a column of a data frame, and make a new column out of them.

Example:

import pandas as pd
pd.DataFrame({'target': ['w1-d2','w1-d3','w1-d5','w1-d9']})

Expected result:

pd.DataFrame({'target': ['w1-d2','w1-d3','w1-d5','w1-d9'],
              'new_column':['2','3','5','9']})

CodePudding user response:

Use str.extract with a simple regex, (\d+)$:

df['new_column'] = df['target'].str.extract(r'(\d+)$')

output:

  target new_column
0  w1-d2          2
1  w1-d3          3
2  w1-d5          5
3  w1-d9          9

regex:

(    # start capturing
\d+  # match one or more digits
)    # stop capturing
$    # match end of string
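
For reference, here is a self-contained sketch of the same approach. The 'w1-d12' row is a hypothetical addition, not part of the question's data; it is only there to show that the + quantifier captures a multi-digit suffix. Passing expand=False is an optional choice that makes str.extract return a Series instead of a one-column DataFrame.

```python
import pandas as pd

# 'w1-d12' is a made-up row added to illustrate a multi-digit suffix
df = pd.DataFrame({'target': ['w1-d2', 'w1-d3', 'w1-d5', 'w1-d12']})

# Capture the run of digits anchored at the end of each string.
# expand=False returns a Series rather than a one-column DataFrame.
df['new_column'] = df['target'].str.extract(r'(\d+)$', expand=False)

print(df)
```

The extracted values are strings, as in the expected result; a string with no trailing digits would produce NaN in the new column.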


CodePudding user response:

df = pd.DataFrame({'target': ['w1-d2','w1-d3','w1-d5','w1-d9']})

# take the last character of each string (only correct for single-digit endings)
df['result'] = [i[-1] for i in df['target']]

CodePudding user response:

Provided that every item in the target column ends with a single digit, you could also use a list comprehension to extract it.

df = pd.DataFrame({'target': ['w1-d2','w1-d3','w1-d5','w1-d9']})
df['new_column'] = [i[-1] for i in df['target'].values]
print(df)
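
To see why the single-digit condition matters, the sketch below compares the last-character approach with a regex on a hypothetical value, 'w1-d10', that is not in the question's data. The column names last_char and trailing_digits are illustrative only.

```python
import pandas as pd

# 'w1-d10' is a made-up value with a two-digit suffix
df = pd.DataFrame({'target': ['w1-d2', 'w1-d10']})

# Last character only: keeps just the final digit of '10'
df['last_char'] = [s[-1] for s in df['target']]

# Regex anchored at the end: keeps the whole run of trailing digits
df['trailing_digits'] = df['target'].str.extract(r'(\d+)$', expand=False)

print(df)
```

If the data might contain suffixes longer than one digit, the regex version is probably the safer choice.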

I hope this helps.
