I have a column 'released' with values like 'June 13, 1980 (United States)'.
I want to extract the year from this string, so I tried the following code:
df['year_correct'] = df['released'].astype(str).str[',':'(']
But it returns NaN for every value in the new 'year_correct' column. Please help.
CodePudding user response:
A better way might be to extract the four-digit value, using word boundaries (\b) to ensure the match is exactly four digits:
df['year_correct'] = df['released'].astype(str).str.extract(r'\b(\d{4})\b')
Example:
                        released year_correct
0  June 13, 1980 (United States)         1980
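For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same regex approach. The sample rows other than the first are made up for illustration, and passing expand=False plus converting to a nullable integer dtype are optional choices of this sketch, not part of the answer above.

```python
import pandas as pd

# Illustrative data: only the first row comes from the question,
# the other rows are hypothetical.
df = pd.DataFrame({
    "released": [
        "June 13, 1980 (United States)",
        "July 2, 1980 (United States)",
        "1981 (Japan)",
        None,  # missing release info
    ]
})

# \b(\d{4})\b captures a standalone run of exactly four digits.
# expand=False returns a Series instead of a one-column DataFrame.
year = df["released"].str.extract(r"\b(\d{4})\b", expand=False)

# Optional: convert the extracted strings to integers, keeping missing values.
df["year_correct"] = pd.to_numeric(year, errors="coerce").astype("Int64")

print(df)
```

Rows where no four-digit number is found end up as missing values rather than raising an error, so it is worth checking how many of them there are before relying on the new column.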