I have a dataframe called "cohort_any_action".
I want to turn its values into percentages, but when I try this code:
cohort_any_action.divide(cohort_any_action.iloc[:,0], axis=0)
the result is not what I want. I want every cell where the same dates intersect to take the value 1, just as the cell where April 25 and April 25 intersect already does.
CodePudding user response:
You're very close. You need to divide each row by the value on the diagonal, not by the first column as in your code. Here is one way to do that:
import numpy as np
import pandas as pd
# example data
df = pd.DataFrame(np.triu(np.random.rand(4,4)),columns=['a','b','c','d'])
df[df==0]=np.nan
print(df.round(2))  # random data, so the values differ on every run
#       a     b     c     d
# 0  0.99  0.40  0.85  0.90
# 1   NaN  0.06  0.97  0.11
# 2   NaN   NaN  0.30  0.31
# 3   NaN   NaN   NaN  0.12
# divide by diagonal, row-wise
diagonal = df.to_numpy().diagonal()
df_norm = df.divide(diagonal, axis=0)
print(df_norm.round(2))
#      a    b      c     d
# 0  1.0  0.4   0.86  0.91
# 1  NaN  1.0  16.17  1.83
# 2  NaN  NaN   1.00  1.03
# 3  NaN  NaN    NaN  1.00
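Applied to a table shaped like yours, the same idea might look like the sketch below. I can't see your actual data, so the dates and counts here are made up, and I'm assuming the table is square with the same dates, in the same order, on the rows and the columns. The name `cohort_pct` is just illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cohort_any_action (made-up dates and counts):
# rows are cohort dates, columns are activity dates.
dates = ["2022-04-25", "2022-04-26", "2022-04-27"]
cohort_any_action = pd.DataFrame(
    [[100.0, 40.0, 25.0],
     [np.nan, 80.0, 20.0],
     [np.nan, np.nan, 50.0]],
    index=dates,
    columns=dates,
)

# Take the diagonal by position and label it with the row index,
# so the division is matched to the rows explicitly.
diagonal = pd.Series(
    np.diag(cohort_any_action.to_numpy()),
    index=cohort_any_action.index,
)

# axis=0 aligns the Series with the row labels:
# every row is divided by its own diagonal value.
cohort_pct = cohort_any_action.divide(diagonal, axis=0)
print(cohort_pct)
```

With this, every cell where a date meets itself becomes 1, and the other cells in that row are expressed relative to it. If your rows and columns are not in the same order, the positional diagonal will pick the wrong cells, so sort or reindex first.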