Creating a Column from a Specific Value Contained in a Row

Time:07-27

I have a data frame that is formatted like this:

                details  col_1  col2  col3
0         ex1 2019 test      1     1     1
1       ex1 2020 review      2     2     2
2  example2 2021 survey      3     3     3
3        row3 2019 data      4     4     4

I want to append a new column called "Year" to this data frame that holds the year value embedded in each row's "details" string. I want it to look like this:

                details  col_1  col2  col3  Year
0         ex1 2019 test      1     1     1  2019
1       ex1 2020 review      2     2     2  2020
2  example2 2021 survey      3     3     3  2021
3        row3 2019 data      4     4     4  2019

The "details" values are deliberately unstandardized to reflect my actual data. Thanks in advance for the help!

CodePudding user response:

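This extracts the first standalone four-digit number from each "details" string and converts it to an integer (it assumes every row contains one):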

df['Year'] = df.details.str.extract(r'\b(\d{4})\b').astype(int)

Output:

                details  col_1  col2  col3  Year
0         ex1 2019 test      1     1     1  2019
1       ex1 2020 review      2     2     2  2020
2  example2 2021 survey      3     3     3  2021
3        row3 2019 data      4     4     4  2019
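
If some rows might not contain a year, the plain astype(int) would fail on the missing values. Here is a minimal sketch of one way around that, assuming a nullable integer column is acceptable. The sample frame is rebuilt from the question, and the "no year here" row is a made-up example, not part of the original data:

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "details": ["ex1 2019 test", "ex1 2020 review",
                "example2 2021 survey", "row3 2019 data"],
    "col_1": [1, 2, 3, 4],
    "col2": [1, 2, 3, 4],
    "col3": [1, 2, 3, 4],
})

PATTERN = r"\b(\d{4})\b"  # a standalone run of exactly four digits

# expand=False returns a Series instead of a one-column DataFrame.
years = df["details"].str.extract(PATTERN, expand=False)

# to_numeric + nullable Int64 keeps rows without a match as <NA>
# instead of raising the way astype(int) would.
df["Year"] = pd.to_numeric(years).astype("Int64")

# Hypothetical input with a row that has no year in it.
with_gap = pd.Series(["ex1 2019 test", "no year here"])
gap_years = pd.to_numeric(
    with_gap.str.extract(PATTERN, expand=False)
).astype("Int64")
```

The word boundaries are what keep the digit in "ex1" or "row3" from being picked up, but any other standalone four-digit number in the string would match too, so this only fits data where the year is the only such number.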

CodePudding user response:

from dateutil.parser import parse
# fuzzy=True makes the parser skip tokens it does not recognize as part of a date
df['Year'] = df.apply(lambda row: parse(row.details, fuzzy=True).year, axis=1)
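
Fuzzy parsing guesses at which tokens form a date, so it can raise on strings it cannot resolve. A rough sketch of a guarded version follows, assuming a missing value is acceptable for those rows. The helper name year_or_none is made up for illustration, and the sample frame is rebuilt from the question:

```python
import pandas as pd
from dateutil.parser import parse

def year_or_none(text):
    """Return the year dateutil finds in text, or None if parsing fails."""
    try:
        return parse(text, fuzzy=True).year
    except (ValueError, OverflowError):
        # dateutil signals unparseable input with ValueError (or a subclass).
        return None

# Sample "details" column rebuilt from the question.
df = pd.DataFrame({
    "details": ["ex1 2019 test", "ex1 2020 review",
                "example2 2021 survey", "row3 2019 data"],
})

# Nullable Int64 so rows where parsing failed show up as <NA>.
df["Year"] = pd.Series(
    [year_or_none(s) for s in df["details"]],
    index=df.index,
    dtype="Int64",
)
```

Note that the parser also tries to interpret the stray digits in names like "ex1", so it is worth spot-checking the results on real data before relying on this approach.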