Breaking a python string in pandas dataframe

Time:10-25

I have a column 'released' with values like 'June 13, 1980 (United States)'.

I want to get the year from this string, so I tried the following code:

df['year_correct'] = df['released'].astype(str).str[',':'(']

But it returns NaN for every value in the new 'year_correct' column. Please help.

CodePudding user response:

A better way might be to extract the four-digit value with a regular expression, using word boundaries (\b) so that only a run of exactly four digits matches:

df['year_correct'] = df['released'].astype(str).str.extract(r'\b(\d{4})\b')

Example:

                        released year_correct
0  June 13, 1980 (United States)         1980
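
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same extraction. Only the first row comes from the question; the other two rows and the 'year' column name are made up for illustration. Passing expand=False is an optional choice here so that str.extract returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# First value is from the question; the other two rows are hypothetical.
df = pd.DataFrame({
    "released": [
        "June 13, 1980 (United States)",
        "May 1982 (hypothetical row)",
        "unknown",  # hypothetical row with no year at all
    ]
})

# \b(\d{4})\b captures a standalone run of exactly four digits.
# expand=False returns a Series instead of a one-column DataFrame.
df["year_correct"] = df["released"].astype(str).str.extract(
    r"\b(\d{4})\b", expand=False
)

# Optional: convert the extracted text to a nullable integer column
# ("year" is a hypothetical column name); rows without a match become <NA>.
df["year"] = pd.to_numeric(df["year_correct"], errors="coerce").astype("Int64")

print(df)
```

The extracted values are strings, so the numeric conversion is only needed if you plan to sort or compare the years as numbers.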