I have a data frame in the format below.
variable count
a,x 20
a 100
a,y 40
I would like to get a matrix that looks like a correlation matrix but holds counts rather than correlations. The required matrix has the format below; it makes the count for each variable easy to read off. Is it possible to perform this kind of transformation with pandas? I am new to Python programming, so any lead or link is greatly appreciated.
a  40  20  100
x   0   0   20
y   0   0   40
    y   x    a
CodePudding user response:
There are probably many ways to do this; one is to split the strings, pivot, and combine the result with its transpose:
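df2 = (df['variable']
       # split 'a,x' into two columns; a lone 'a' is forward-filled to the pair (a, a)
       .str.split(',', expand=True).ffill(axis=1)
       .join(df.drop(columns='variable'))
       # keyword arguments are required by pivot in pandas 2.0 and later
       .pivot(index=0, columns=1, values='count')
      )
# fill the missing half from the transpose, then set absent pairs to 0
out = df2.combine_first(df2.T).fillna(0).astype(int)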
output:
     a   x   y
a  100  20  40
x   20   0   0
y   40   0   0
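For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same split / pivot / combine-with-transpose idea. It rebuilds the sample data from the question, and the last step reverses the column order to match the y, x, a layout asked for. The names `parts`, `pairs`, `wide`, `sym`, `labels` and `result`, and the column labels `row` and `col`, are only illustrative choices and not part of the original answer. It also fills the second label with `fillna` instead of `ffill(axis=1)`, which should be equivalent for this data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"variable": ["a,x", "a", "a,y"], "count": [20, 100, 40]})

# Split "a,x" into two label columns; a lone "a" has no second label.
parts = df["variable"].str.split(",", expand=True)

# Treat a lone "a" as the pair (a, a) by reusing the first label.
pairs = pd.DataFrame({
    "row": parts[0],
    "col": parts[1].fillna(parts[0]),
    "count": df["count"],
})

# One row per first label, one column per second label.
wide = pairs.pivot(index="row", columns="col", values="count")

# Fill the missing half from the transpose, then set absent pairs to 0.
sym = wide.combine_first(wide.T).fillna(0).astype(int)

# Reorder to the layout in the question: rows a, x, y and columns y, x, a.
labels = ["a", "x", "y"]
result = sym.loc[labels, labels[::-1]]
print(result)
```

With the sample data, `result` should hold 40, 20, 100 in row a, 0, 0, 20 in row x, and 0, 0, 40 in row y. Note that this assumes each pair of labels appears at most once; `pivot` raises an error on duplicates, in which case `pivot_table` with `aggfunc='sum'` may be a better fit.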