Home > Software engineering >  How multiple new columns which values and names are dependent on the existed ones
How multiple new columns which values and names are dependent on the existed ones

Time:02-18

I have a dataframe with many columns (in this question, i will use an example of this dataframe) and want to create new columns which shows values of the specific column when year=2017 divided by initial column´s value. More specifically, take column price as an example. The first new column price_2017 should only show value of price column when year=2017. The next column price_ratio should show value which equals price/price_2017. Those _2017 and _ratio suffixes are added to all columns except year. I know how to do it manually, however as in real dataset I have 60 columns then some optimization is needed like using loop or apply, however have not any ideas how to do it.

# importing pandas as pd
import pandas as pd
 
# dictionary of lists
dict = {'year':[2015, 2016, 2017, 2018],
        'price':[1,2,3,4],
        'degree': [10,15,22,25],
        'score':[90, 40, 80, 98]}
 
df = pd.DataFrame(dict)

The final dataset should look like enter image description here

CodePudding user response:

Set the index to year, then use loc to select the row corresponding to 2017, then divide the dataframe by this row to calculate ratio and assign the results back

s = df.set_index('year')
s.assign(**{**s.loc[2017].add_suffix('_2017'), 
            **s.div(s.loc[2017]).add_suffix('_ratio')})

Result

      price  degree  score  price_2017  degree_2017  score_2017  price_ratio  degree_ratio  score_ratio
year                                                                                                   
2015      1      10     90           3           22          80     0.333333      0.454545        1.125
2016      2      15     40           3           22          80     0.666667      0.681818        0.500
2017      3      22     80           3           22          80     1.000000      1.000000        1.000
2018      4      25     98           3           22          80     1.333333      1.136364        1.225
  • Related