I have a DataFrame called df that looks like this:
0 | 1 | 2 |
---|---|---|
1 | 2 | 3 |
4 | 5 | 6 |
7 | 8 | 9 |
I want to sum each column and divide that column sum by the sum of each row. For example:
row 0, column 0: (1+4+7)/(1+2+3)
row 1, column 0: (1+4+7)/(4+5+6)
and so on, so that my final result looks like this:
0 | 1 | 2 |
---|---|---|
2 | 2.5 | 3 |
0.8 | 1 | 1.2 |
0.5 | 0.625 | 0.75 |
How do I do this in Python using pandas?
CodePudding user response:
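You can also do it with NumPy's outer division:
import numpy as np
import pandas as pd
a = df.to_numpy()
# divide is a NumPy ufunc (universal function); every binary ufunc has an .outer method.
# np.divide.outer(x, y)[i, j] is x[i] / y[j], so transpose the result to get
# column_sum[j] / row_sum[i] at row i, column j.
b = np.divide.outer(a.sum(0), a.sum(1)).T
out = pd.DataFrame(b, index=df.index, columns=df.columns)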
output:
     0      1     2
0  2.0  2.500  3.00
1  0.8  1.000  1.20
2  0.5  0.625  0.75
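To see which way np.divide.outer lays out its result, here is a small check on the 3x3 values from the question. The names col_sums, row_sums, raw and wanted are only for illustration. With the column sums passed first, the raw result has the column sums varying down the rows, so the table asked for in the question is its transpose.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
col_sums = a.sum(0)  # [12, 15, 18]
row_sums = a.sum(1)  # [ 6, 15, 24]

# For a binary ufunc, outer(x, y)[i, j] == op(x[i], y[j]).
raw = np.divide.outer(col_sums, row_sums)

# raw[0, 1] is col_sums[0] / row_sums[1] = 12 / 15.
# The question wants that value at row 1, column 0, i.e. the transpose.
wanted = raw.T
print(wanted)
```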
CodePudding user response:
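You can use the underlying NumPy array and broadcasting:
import pandas as pd
a = df.to_numpy()
# a.sum(0) has shape (n_cols,) and a.sum(1)[:, None] has shape (n_rows, 1),
# so the division broadcasts to column_sum[j] / row_sum[i].
out = pd.DataFrame(a.sum(0) / a.sum(1)[:, None],
                   index=df.index, columns=df.columns)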
output:
     0      1     2
0  2.0  2.500  3.00
1  0.8  1.000  1.20
2  0.5  0.625  0.75
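If you would rather let pandas handle the row alignment, the same idea can be sketched with DataFrame.div. This is only one possible variant, assuming the 3x3 frame from the question; the names col_sums, row_sums and numerators are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_sums = df.sum(axis=0)  # one value per column: 12, 15, 18
row_sums = df.sum(axis=1)  # one value per row: 6, 15, 24

# Repeat the column sums once per row so every row holds the numerators.
numerators = pd.DataFrame(
    np.tile(col_sums.to_numpy(), (len(df), 1)),
    index=df.index,
    columns=df.columns,
)

# axis=0 matches row_sums against the row index, so row i is divided by row_sums[i].
out = numerators.div(row_sums, axis=0)
print(out)
```

This is more verbose than the broadcasting version, but the division is matched on the index labels rather than on position.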