I have the following dataframe (df):
                      value1  value2  total
Index1 Index2 Index3
A0     B0     C0          10      15     25
              C1          11      12     23
              C2           1       2      3
A1     B1     C0           2       3      5
              C1           3       4      7
              C2           4       5      9
...
What I want to do is divide value1 and value2 by total, but only where Index3 is C0.
For the C1 and C2 rows, value1 and value2 must instead be divided by the value1 and value2 of the C0 row in the same Index1/Index2 group, respectively.
As a result, the new dataframe must be:
                      value1  value2  total
Index1 Index2 Index3
A0     B0     C0       10/25   15/25     25
              C1       11/10   12/15     23
              C2        1/10    2/15      3
A1     B1     C0         2/5     3/5      5
              C1         3/2     4/3      7
              C2         4/2     5/3      9
...
How can I achieve this operation?
CodePudding user response:
You can declare denominator columns based on your requirements:
- If Index3 == 'C0', divide by total.
- Otherwise, divide by the value where Index3 == 'C0'.
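To try the steps that follow, the sample frame can be rebuilt roughly like this. The values are copied from the question; the construction itself (a list of tuples plus `set_index`) is only an assumption about how the data might be loaded.

```python
import pandas as pd

# Sample rows copied from the question: three index levels, then the data columns
rows = [
    ("A0", "B0", "C0", 10, 15, 25),
    ("A0", "B0", "C1", 11, 12, 23),
    ("A0", "B0", "C2", 1, 2, 3),
    ("A1", "B1", "C0", 2, 3, 5),
    ("A1", "B1", "C1", 3, 4, 7),
    ("A1", "B1", "C2", 4, 5, 9),
]
df = pd.DataFrame(
    rows,
    columns=["Index1", "Index2", "Index3", "value1", "value2", "total"],
).set_index(["Index1", "Index2", "Index3"])
```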
import numpy as np
import pandas as pd

# Reset index to be able to reference `Index` columns more easily
df.reset_index(inplace=True)

# Create catalog of divisors from the rows where `Index3 == 'C0'`
cat = df.loc[
    df['Index3'].eq('C0'),
    ['Index1', 'Index2', 'value1', 'value2']
].rename(columns={'value1': 'div1', 'value2': 'div2'})

# Merge data with catalog so every row carries its group's `C0` values
df = pd.merge(df, cat, on=['Index1', 'Index2'], how='inner')

# Update divisors when `Index3 == 'C0'`: those rows divide by `total`
df[['div1', 'div2']] = df[['div1', 'div2']].apply(
    lambda x: np.where(
        df['Index3'] == 'C0', df['total'], x
    )
)

# Restore multi index
df.set_index(['Index1', 'Index2', 'Index3'], inplace=True)
The data now looks as follows:
                      value1  value2  total  div1  div2
Index1 Index2 Index3
A0     B0     C0          10      15     25    25    25
              C1          11      12     23    10    15
              C2           1       2      3    10    15
A1     B1     C0           2       3      5     5     5
              C1           3       4      7     2     3
              C2           4       5      9     2     3
So you only have to divide value1 by div1 and value2 by div2:
# Update `value1` and `value2` by dividing each by its corresponding divisor column
df['value1'] = df['value1'] / df['div1']
df['value2'] = df['value2'] / df['div2']
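For comparison, here is a minimal sketch of the same rule without the merge or the helper columns, using a group-wise transform instead. The function name `normalize_by_c0` is hypothetical, and the sketch assumes each (Index1, Index2) group has exactly one C0 row; it rebuilds the question's sample data so it can be run on its own.

```python
import numpy as np
import pandas as pd

rows = [
    ("A0", "B0", "C0", 10, 15, 25),
    ("A0", "B0", "C1", 11, 12, 23),
    ("A0", "B0", "C2", 1, 2, 3),
    ("A1", "B1", "C0", 2, 3, 5),
    ("A1", "B1", "C1", 3, 4, 7),
    ("A1", "B1", "C2", 4, 5, 9),
]
df = pd.DataFrame(
    rows,
    columns=["Index1", "Index2", "Index3", "value1", "value2", "total"],
).set_index(["Index1", "Index2", "Index3"])


def normalize_by_c0(frame, cols=("value1", "value2")):
    """Hypothetical helper: C0 rows divide by `total`, other rows by their group's C0 row."""
    cols = list(cols)
    # Boolean array marking the rows whose third index level is C0
    is_c0 = frame.index.get_level_values("Index3") == "C0"

    # Keep only the C0 values, blank out everything else ...
    c0_only = frame[cols].astype(float)
    c0_only.loc[~is_c0, :] = np.nan
    # ... then copy each group's C0 values onto every row of that group
    # ("first" picks the first non-null value per group)
    div = c0_only.groupby(level=["Index1", "Index2"]).transform("first")

    # On the C0 rows themselves the divisor is `total`
    for col in cols:
        div[col] = div[col].where(~is_c0, frame["total"])

    out = frame.copy()
    out[cols] = frame[cols] / div
    return out


result = normalize_by_c0(df)
```

Because the divisors are computed from the untouched input, the original `df` is left unchanged and no `div1`/`div2` columns need to be dropped afterwards.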