Dividing specific columns in a MultiIndexed (axis=1) DataFrame by certain other columns

I have a pd.DataFrame which has a column multiindex of the following structure:

     a1 a1 a1 a2 a2 a2 a3 a3 a3 ...
     b1 b2 b3 b1 b2 b3 b1 b2 b3 ...
row1 values ...
row2 
row3 
...

I would like to divide each a-group by the last element of its own b3 column. That is, divide columns a1-b1, a1-b2 and a1-b3 by the last value in a1-b3 (so that the last element of a1-b3 becomes 1). The next group, a2, works the same way, with its columns divided by the last value in a2-b3, and so on.

I am looking for a simple expression to do this. My current code is very cumbersome: I split the dataframe into groups stored in a dict, divide each group separately, and merge them back together. But that can't be the best way, right?

Thanks!

Best, JZ

CodePudding user response:

Use groupby:

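# Group the columns by their top level (a1, a2, ...). Within each group,
# x.iloc[-1, -1] is the last row of the last column, which is b3 only if b3 comes last.
# Note: groupby(axis=1) is deprecated since pandas 2.1.
df1 = df.groupby(level=0, axis=1).apply(lambda x: x / x.iloc[-1, -1])
print(df1)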

# Output
            a1                       a2                       a3               
            b1        b2   b3        b1        b2   b3        b1        b2   b3
row1  0.333333  0.666667  1.0  0.666667  0.833333  1.0  0.777778  0.888889  1.0
row2  0.333333  0.666667  1.0  0.666667  0.833333  1.0  0.777778  0.888889  1.0
row3  0.333333  0.666667  1.0  0.666667  0.833333  1.0  0.777778  0.888889  1.0

Setup:

import pandas as pd

data = {('a1', 'b1'): [1, 1, 1], ('a1', 'b2'): [2, 2, 2], ('a1', 'b3'): [3, 3, 3],
        ('a2', 'b1'): [4, 4, 4], ('a2', 'b2'): [5, 5, 5], ('a2', 'b3'): [6, 6, 6],
        ('a3', 'b1'): [7, 7, 7], ('a3', 'b2'): [8, 8, 8], ('a3', 'b3'): [9, 9, 9]}
# Row labels added so the frame matches the printed output above.
df = pd.DataFrame(data, index=['row1', 'row2', 'row3'])
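
A possible alternative, in case groupby over columns is not available in your pandas version or b3 is not guaranteed to be the last column of each group: select the b3 columns by name, take their last row, and divide with alignment on the top column level. This is only a sketch on the same toy data; the names last_b3 and df1 are just illustrative.

```python
import pandas as pd

# Same toy data as in the setup, with row labels row1..row3.
data = {('a1', 'b1'): [1, 1, 1], ('a1', 'b2'): [2, 2, 2], ('a1', 'b3'): [3, 3, 3],
        ('a2', 'b1'): [4, 4, 4], ('a2', 'b2'): [5, 5, 5], ('a2', 'b3'): [6, 6, 6],
        ('a3', 'b1'): [7, 7, 7], ('a3', 'b2'): [8, 8, 8], ('a3', 'b3'): [9, 9, 9]}
df = pd.DataFrame(data, index=['row1', 'row2', 'row3'])

# Pick every group's b3 column by name and take its last row.
# The result is a Series indexed by the top level: a1, a2, a3.
last_b3 = df.xs('b3', axis=1, level=1).iloc[-1]

# Divide column-wise, matching the Series index against column level 0,
# so every a1 column is divided by the a1 value, and so on.
df1 = df.div(last_b3, axis=1, level=0)
print(df1)
```

Because b3 is selected by label rather than by position, this does not depend on the column order inside each group.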