How to divide column values by values from different row?

Time:12-09

I have the following dataframe (df):

                       value1 value2 total
Index1 Index2 Index3     
  A0     B0     C0       10     15    25
                C1       11     12    23
                C2       1      2     3
  A1     B1     C0       2      3     5
                C1       3      4     7
                C2       4      5     9
      ...

I want to divide value1 and value2 by total, but only in rows where Index3 is C0.

In rows where Index3 is C1 or C2, value1 and value2 must instead be divided by the value1 and value2 of the C0 row in the same (Index1, Index2) group, respectively.

The resulting dataframe should be:

                       value1 value2 total
Index1 Index2 Index3     
  A0     B0     C0      10/25  15/25  25
                C1      11/10  12/15  23
                C2       1/10   2/15   3
  A1     B1     C0       2/5    3/5    5
                C1       3/2    4/3    7
                C2       4/2    5/3    9
      ...

How can I achieve this?

CodePudding user response:

You can create divisor columns that follow your two rules:

  1. If Index3 == 'C0', divide by total.
  2. Otherwise, divide by the value where Index3 == 'C0'.

import numpy as np
import pandas as pd

# Reset the index so the `Index` levels can be referenced as regular columns
df.reset_index(inplace=True)

# Create catalog of divisors for cases when `Index3 == 'C0'`
cat = df.loc[
    df['Index3'].eq('C0'),
    ['Index1','Index2','value1','value2']
].rename(columns={'value1':'div1', 'value2':'div2'})

# Merge data with catalog
df = pd.merge(df, cat, on=['Index1','Index2'], how='inner')

# Where `Index3 == 'C0'`, use `total` as the divisor instead
df[['div1', 'div2']] = df[['div1', 'div2']].apply(
    lambda x: np.where(
        df['Index3'] == 'C0', df['total'], x
    )
)

# Restore multi index
df.set_index(['Index1', 'Index2', 'Index3'], inplace=True)
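
To try these steps on the sample data, something like the following sketch may help. It rebuilds the frame from the question and wraps the steps in a function that does not modify its input. How the original frame was constructed is an assumption, and `add_divisors` is a hypothetical helper name used only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; building it with
# MultiIndex.from_tuples is an assumption about the original frame.
index = pd.MultiIndex.from_tuples(
    [
        ("A0", "B0", "C0"), ("A0", "B0", "C1"), ("A0", "B0", "C2"),
        ("A1", "B1", "C0"), ("A1", "B1", "C1"), ("A1", "B1", "C2"),
    ],
    names=["Index1", "Index2", "Index3"],
)
df = pd.DataFrame(
    {
        "value1": [10, 11, 1, 2, 3, 4],
        "value2": [15, 12, 2, 3, 4, 5],
        "total": [25, 23, 3, 5, 7, 9],
    },
    index=index,
)


def add_divisors(frame):
    """Hypothetical helper: return a copy of `frame` with `div1`/`div2` added."""
    flat = frame.reset_index()
    # Divisors taken from the C0 row of each (Index1, Index2) group
    cat = flat.loc[
        flat["Index3"].eq("C0"),
        ["Index1", "Index2", "value1", "value2"],
    ].rename(columns={"value1": "div1", "value2": "div2"})
    merged = flat.merge(cat, on=["Index1", "Index2"], how="inner")
    # C0 rows are divided by `total` instead of by their own values
    is_c0 = merged["Index3"].eq("C0")
    for col in ["div1", "div2"]:
        merged[col] = np.where(is_c0, merged["total"], merged[col])
    return merged.set_index(["Index1", "Index2", "Index3"])


with_div = add_divisors(df)
print(with_div)
```

Because the helper works on a copy, the original `df` keeps its three columns and the function can be re-run safely.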

The data now looks as follows:

                      value1  value2  total  div1  div2
Index1 Index2 Index3                                   
A0     B0     C0          10      15     25    25    25
              C1          11      12     23    10    15
              C2           1       2      3    10    15
A1     B1     C0           2       3      5     5     5
              C1           3       4      7     2     3
              C2           4       5      9     2     3

So you only have to divide value1 by div1 and value2 by div2:

# Divide `value1` and `value2` by their corresponding divisor columns
df['value1'] = df['value1'] / df['div1']
df['value2'] = df['value2'] / df['div2']
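
After the division, the helper columns `div1` and `div2` are still in the frame, while the output requested in the question only has `value1`, `value2` and `total`. A minimal sketch of that last cleanup step, starting from the values in the divisor table shown above (the variable names `result`, `expected_value1` and `expected_value2` are only illustrative):

```python
import pandas as pd

# Frame with divisor columns; values copied from the divisor table in the answer.
index = pd.MultiIndex.from_tuples(
    [
        ("A0", "B0", "C0"), ("A0", "B0", "C1"), ("A0", "B0", "C2"),
        ("A1", "B1", "C0"), ("A1", "B1", "C1"), ("A1", "B1", "C2"),
    ],
    names=["Index1", "Index2", "Index3"],
)
df = pd.DataFrame(
    {
        "value1": [10, 11, 1, 2, 3, 4],
        "value2": [15, 12, 2, 3, 4, 5],
        "total": [25, 23, 3, 5, 7, 9],
        "div1": [25, 10, 10, 5, 2, 2],
        "div2": [25, 15, 15, 5, 3, 3],
    },
    index=index,
)

result = df.copy()
result["value1"] = result["value1"] / result["div1"]
result["value2"] = result["value2"] / result["div2"]

# Drop the helper columns so only the requested columns remain
result = result.drop(columns=["div1", "div2"])

# Ratios listed in the question's expected output
expected_value1 = [10 / 25, 11 / 10, 1 / 10, 2 / 5, 3 / 2, 4 / 2]
expected_value2 = [15 / 25, 12 / 15, 2 / 15, 3 / 5, 4 / 3, 5 / 3]
print(result)
```

Note that a C0 row with a value or total of 0 would lead to a division by zero, which pandas reports as `inf` or `NaN` rather than raising an error, so such rows may need separate handling.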