Pandas - Applying formula on all column based on a value on the row

Time:11-28

Let's say I have a dataframe like the one below:

 ------ ------ ------ ------------- 
|  A   |   B  |  C   | devisor_col |
 ------ ------ ------ ------------- 
|    2 |    4 |   10 |           2 |
|    3 |    3 |    9 |           3 |
|   10 |   25 |   40 |          10 |
 ------ ------ ------ ------------- 

What would be the best way to apply a formula that uses the values from devisor_col? Note that I have thousands of columns and rows.

The result should look like this:

 ------ ------ ------ ------------- 
| A    | B    | C    | devisor_col |
 ------ ------ ------ ------------- 
|    1 |    2 |    5 |           2 |
|    1 |    1 |    3 |           3 |
|    1 |  2.5 |    4 |          10 |
 ------ ------ ------ ------------- 

I tried using applymap, but I don't know why I can't apply it to all columns.

modResult = my_df.applymap(lambda x: x/x["devisor_col"])

CodePudding user response:

IIUC, use pandas.DataFrame.divide with axis=0 (the value columns are assumed to be named Col1, Col2, Col3 here):

modResult = pd.concat(
    [my_df,
     my_df.filter(like="Col")  # select the value columns
          .divide(my_df["devisor_col"], axis=0)  # divide each row by its divisor
          .add_suffix("_div")],
    axis=1,
)

# Output :

print(modResult)

   Col1  Col2  Col3  devisor_col  Col1_div  Col2_div  Col3_div
0     2     4    10            2       1.0       2.0       5.0
1     3     3     9            3       1.0       1.0       3.0
2    10    25    40           10       1.0       2.5       4.0
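
As for why the applymap attempt in the question fails: applymap calls the function once per cell, so x is a single number rather than a row, and x["devisor_col"] has nothing to index into. The sketch below is only an illustration on the question's frame (rebuilt by hand with columns A, B, C; the names elementwise_divide, value_cols, per_column and vectorized are made up for this example). It also shows that a per-column apply should give the same numbers as divide, though divide is usually the simpler choice.

```python
import pandas as pd

# The question's frame, rebuilt by hand for illustration.
my_df = pd.DataFrame(
    {"A": [2, 3, 10], "B": [4, 3, 25], "C": [10, 9, 40], "devisor_col": [2, 3, 10]}
)


def elementwise_divide(x):
    # This mirrors the lambda in the question: x is one cell (a number),
    # not a row, so looking up "devisor_col" on it cannot work.
    return x / x["devisor_col"]


try:
    elementwise_divide(my_df.loc[0, "A"])
    scalar_lookup_failed = False
except (TypeError, IndexError):
    scalar_lookup_failed = True

# Working alternatives: both divide every value column by the row's divisor.
value_cols = my_df.columns.drop("devisor_col")
per_column = my_df[value_cols].apply(lambda col: col / my_df["devisor_col"])
vectorized = my_df[value_cols].divide(my_df["devisor_col"], axis=0)

print(scalar_lookup_failed)
print(per_column)
print(vectorized)
```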

If you only need the result of the division, use this:

modResult = my_df.filter(like="Col").divide(my_df["devisor_col"], axis=0)

print(modResult)

   Col1  Col2  Col3
0   1.0   2.0   5.0
1   1.0   1.0   3.0
2   1.0   2.5   4.0

Or, if you want to overwrite the old columns and keep devisor_col, use pandas.DataFrame.join to add it back:

modResult = (
    my_df.filter(like="Col")
         .divide(my_df["devisor_col"], axis=0)
         .join(my_df["devisor_col"])
)

   Col1  Col2  Col3  devisor_col
0   1.0   2.0   5.0            2
1   1.0   1.0   3.0            3
2   1.0   2.5   4.0           10

You can replace my_df.filter(like="Col") with my_df.loc[:, my_df.columns != "devisor_col"], which selects every column except devisor_col regardless of how the columns are named.
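
Note that the frame in the question names its columns A, B and C, so filter(like="Col") would select nothing there (the match is case-sensitive and none of those names contain "Col"). Here is a minimal sketch of the name-independent selection on that frame, assuming devisor_col is the only column to leave out. The frame is rebuilt by hand, and no_match and value_part are names chosen for this example.

```python
import pandas as pd

my_df = pd.DataFrame(
    {"A": [2, 3, 10], "B": [4, 3, 25], "C": [10, 9, 40], "devisor_col": [2, 3, 10]}
)

# filter(like="Col") keeps only labels containing the substring "Col",
# so it finds no columns in this frame.
no_match = my_df.filter(like="Col")

# Select everything except the divisor column, whatever the names are.
value_part = my_df.loc[:, my_df.columns != "devisor_col"]

# Divide each row by its divisor, then put the divisor column back.
modResult = value_part.divide(my_df["devisor_col"], axis=0).join(my_df["devisor_col"])

print(no_match.shape)
print(modResult)
```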

CodePudding user response:

You can try using .loc to overwrite the value columns in place:

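import pandas as pd

df = pd.DataFrame([[1, 2, 3, 1], [2, 3, 4, 5], [4, 5, 6, 7]], columns=['col1', 'col2', 'col3', 'divisor'])

# divide every column except 'divisor' by the row's divisor and write the result back
df.loc[:, df.columns != 'divisor'] = df.loc[:, df.columns != 'divisor'].divide(df['divisor'], axis=0)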
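
A variation on the .loc approach, offered as a sketch rather than a drop-in replacement: assigning by column labels replaces the selected columns outright, so the float results do not have to fit into the existing integer columns. On some recent pandas versions the .loc assignment may warn about writing floats into integer columns; that is an assumption about version behaviour worth checking on your install. value_cols is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 1], [2, 3, 4, 5], [4, 5, 6, 7]],
    columns=["col1", "col2", "col3", "divisor"],
)

# Labels of every column except the divisor.
value_cols = df.columns[df.columns != "divisor"]

# Assigning by label swaps in the new float columns; 'divisor' is untouched
# and the column order stays the same.
df[value_cols] = df[value_cols].divide(df["divisor"], axis=0)

print(df)
print(df.dtypes)
```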