Creating a Column from a Specific Value Contained in a Row

I have a data frame that is formatted like this:

details	col_1	col2	col3
ex1 2019 test	1	1	1
ex1 2020 review	2	2	2
example2 2021 survey	3	3	3
row3 2019 data	4	4	4

I want to create a new column called "Year", appended to the end of this data frame, that takes the year from the details value in each row. I want it to look like this:

details	col_1	col2	col3	Year
ex1 2019 test	1	1	1	2019
ex1 2020 review	2	2	2	2020
example2 2021 survey	3	3	3	2021
row3 2019 data	4	4	4	2019

The details values are unstandardized on purpose to reflect my actual data. Thanks in advance for the help!

CodePudding user response：

You can extract the first standalone four-digit number from details with str.extract and cast it to int:

df['Year'] = df['details'].str.extract(r'\b(\d{4})\b', expand=False).astype(int)

Output:

                details  col_1  col2  col3  Year
0         ex1 2019 test      1     1     1  2019
1       ex1 2020 review      2     2     2  2020
2  example2 2021 survey      3     3     3  2021
3        row3 2019 data      4     4     4  2019
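
Note that astype(int) raises if any row has no four-digit match, because the unmatched rows come back as NaN. Here is a minimal sketch of one way to handle that, assuming you want missing years left empty rather than dropped. The fifth row ("no year here") is a made-up example that is not in the question, and only col_1 is included to keep it short:

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row without a year
df = pd.DataFrame({
    "details": ["ex1 2019 test", "ex1 2020 review", "example2 2021 survey",
                "row3 2019 data", "no year here"],
    "col_1": [1, 2, 3, 4, 5],
})

# expand=False returns a Series instead of a one-column DataFrame
years = df["details"].str.extract(r"\b(\d{4})\b", expand=False)

# The nullable Int64 dtype keeps unmatched rows as <NA> instead of raising
df["Year"] = pd.to_numeric(years).astype("Int64")
print(df)
```

If every row is guaranteed to contain a year, the plain astype(int) version above is simpler and gives a regular integer column.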

CodePudding user response：

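from dateutil.parser import parse

# fuzzy=True lets the parser ignore text that is not part of a date;
# only the year of the parsed result is kept
df['Year'] = df['details'].apply(lambda s: parse(s, fuzzy=True).year)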