Assuming I have a vector of the form | a | b | c | d |, e.g.
vec = pd.Series([0.3,0.2,0.2,0.3])
What's a quick and elegant way to build a pd.DataFrame of the form:
| a*a | a*b | a*c | a*d |
| b*a | b*b | b*c | b*d |
| c*a | c*b | c*c | c*d |
| d*a | d*b | d*c | d*d |
CodePudding user response:
One option is to use DataFrame.dot, multiplying the one-column frame by its transpose:
fr = vec.to_frame()
out = fr.dot(fr.T)
Output:
0 1 2 3
0 0.09 0.06 0.06 0.09
1 0.06 0.04 0.04 0.06
2 0.06 0.04 0.04 0.06
3 0.09 0.06 0.06 0.09
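A side effect of the dot approach is that the labels of the Series carry over to both axes of the result. A minimal sketch, assuming a labelled Series (the labels "a" to "d" are made up for illustration and are not part of the original data):

```python
import pandas as pd

# Hypothetical labels, chosen only to mirror the a/b/c/d notation of the question.
vec = pd.Series([0.3, 0.2, 0.2, 0.3], index=list("abcd"))

fr = vec.to_frame()   # 4x1 frame with a single column
out = fr.dot(fr.T)    # (4x1) . (1x4) -> 4x4 frame of pairwise products

# Rows and columns are both labelled with the original index,
# so individual products can be looked up by label.
print(out.loc["a", "b"])
```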
CodePudding user response:
Use numpy broadcasting:
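import pandas as pd
vec = pd.Series([0.3,0.2,0.2,0.3])
a = vec.to_numpy()
# a has shape (4,), a[:, None] has shape (4, 1); their product broadcasts to (4, 4)
df = pd.DataFrame(a * a[:, None], index=vec.index, columns=vec.index)
print(df)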
0 1 2 3
0 0.09 0.06 0.06 0.09
1 0.06 0.04 0.04 0.06
2 0.06 0.04 0.04 0.06
3 0.09 0.06 0.06 0.09
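To see why the broadcasting version yields a square matrix, it can help to look at the shapes involved. A small sketch in plain NumPy (the names col and outer are only illustrative):

```python
import numpy as np

a = np.array([0.3, 0.2, 0.2, 0.3])

col = a[:, None]   # adds a trailing axis: shape (4,) becomes (4, 1)
outer = a * col    # (4,) against (4, 1) broadcasts to (4, 4)

# outer[i, j] is a[i] * a[j]
print(a.shape, col.shape, outer.shape)
```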
Or numpy.outer:
import numpy as np
df = pd.DataFrame(np.outer(vec, vec), index=vec.index, columns=vec.index)
print(df)
0 1 2 3
0 0.09 0.06 0.06 0.09
1 0.06 0.04 0.04 0.06
2 0.06 0.04 0.04 0.06
3 0.09 0.06 0.06 0.09
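As a quick sanity check, the three approaches in this thread can be compared directly. This is only a sketch: the variable names are illustrative, and np.allclose is used instead of exact equality because the products are floating-point values.

```python
import numpy as np
import pandas as pd

vec = pd.Series([0.3, 0.2, 0.2, 0.3])

# 1) DataFrame.dot with the transpose
fr = vec.to_frame()
via_dot = fr.dot(fr.T)

# 2) NumPy broadcasting
a = vec.to_numpy()
via_broadcast = pd.DataFrame(a * a[:, None], index=vec.index, columns=vec.index)

# 3) numpy.outer
via_outer = pd.DataFrame(np.outer(vec, vec), index=vec.index, columns=vec.index)

# Compare the underlying arrays with a floating-point tolerance.
same = (
    np.allclose(via_dot.to_numpy(), via_broadcast.to_numpy())
    and np.allclose(via_broadcast.to_numpy(), via_outer.to_numpy())
)
print(same)
```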