Current data frame:
Name | ID |
---|---|
Peter | School_09 |
John | School_23 |
How I want it:
Name | ID |
---|---|
Peter | 09 |
John | 23 |
CodePudding user response:
You can also use str.replace to strip everything up to and including the last underscore:
df["ID"] = df["ID"].str.replace(r'.*_', '', regex=True)
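For anyone who wants to try this end to end, here is a minimal sketch. It assumes the frame looks like the table in the question; the construction of `df` is my own reconstruction, not code from the asker.

```python
import pandas as pd

# Sample frame rebuilt from the question's table (assumed shape).
df = pd.DataFrame({"Name": ["Peter", "John"], "ID": ["School_09", "School_23"]})

# The greedy pattern `.*_` matches everything up to and including the last
# underscore, so only the text after it is kept.
df["ID"] = df["ID"].str.replace(r".*_", "", regex=True)

print(df["ID"].tolist())  # ['09', '23']
```

Note that the values stay strings, so the leading zero in 09 is preserved.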
CodePudding user response:
You can use str.extract with the \d+$ regex (one or more trailing digits) to collect only the trailing digits:
df['ID'] = df['ID'].str.extract(r'(\d+)$', expand=False)
output:
Name ID
0 Peter 09
1 John 23
To get a numeric type instead of strings, combine it with pd.to_numeric:
df['ID'] = pd.to_numeric(df['ID'].str.extract(r'(\d+)$', expand=False), errors='coerce')
output:
Name ID
0 Peter 9
1 John 23
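A self-contained sketch of the extract-then-convert approach, in case it helps. The first two rows come from the question; the third row ("Mary", "School_X") is a hypothetical value I added only to show what happens when an ID has no trailing digits.

```python
import pandas as pd

# First two rows are from the question; the third is hypothetical and
# exists only to illustrate the no-match case.
df = pd.DataFrame({
    "Name": ["Peter", "John", "Mary"],
    "ID": ["School_09", "School_23", "School_X"],
})

# expand=False makes str.extract return a Series for the single capture group.
# Rows where the pattern does not match come back as missing values.
digits = df["ID"].str.extract(r"(\d+)$", expand=False)

# Convert the digit strings to numbers; errors="coerce" turns anything
# unparseable into NaN instead of raising.
df["ID"] = pd.to_numeric(digits, errors="coerce")

print(df)
```

Because of the missing value, the column ends up as a float dtype here. With only the two rows from the question it would be an integer column, and in both cases the leading zero of 09 is lost.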