I'm having problems multiplying values in two different dataframes. I'm doing a PCA regression and want to multiply all my loadings with the original values.
For example:
PCA dataframe:
| | PC1 | PC2 |
|---|---|---|
| X | 0 | 1 |
| X1 | 1 | 2 |
| X2 | 2 | 1 |
| X3 | 2 | 1 |
| X4 | 3 | 2 |
| X5 | 5 | 4 |
Original dataframe:
| | A | A1 | A2 | A3 | A4 | A5 |
|---|---|---|---|---|---|---|
| 1 | 1 | 3 | 4 | 1 | 2 | 4 |
| 2 | 8 | 5 | 3 | 2 | 1 | 2 |
| 3 | 9 | 3 | 5 | 1 | 3 | 1 |
I then want to multiply PC1 with every row in the original dataframe such that:
PC1 = 0xA + 1xA1 + 2xA2 + 2xA3 + 3xA4 + 5xA5
Inserting the rows from the second dataframe:
First row: PC1 = 0x1 + 1x3 + 2x4 + 2x1 + 3x2 + 5x4 = 39
Second row: PC1 = 0x8 + 1x5 + 2x3 + 2x2 + 3x1 + 5x2 = 28
Third row: PC1 = 0x9 + 1x3 + 2x5 + 2x1 + 3x3 + 5x1 = 29
New dataframe:
| | PC1 | PC2 |
|---|---|---|
| 1 | 39 | |
| 2 | 28 | |
| 3 | 29 | |
And so on.
My PCA dataframe has the shape (14,4) and my value dataframe has the shape (159,14).
CodePudding user response:
You are looking for a dot product (a matrix product of the values with the loadings), which you can get with np.dot:
print(df)
2 3
1
X 0 1
X1 1 2
X2 2 1
X3 2 1
X4 3 2
X5 5 4
print(xf)
2 3 4 5 6 7
1
1 1 3 4 1 2 4
2 8 5 3 2 1 2
3 9 3 5 1 3 1
print(pd.DataFrame(np.dot(xf, df), columns=['PC1', 'PC2']))
PC1 PC2
0 39 32
1 28 33
2 29 31
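For reference, here is a self-contained sketch of the same computation that rebuilds the two example frames from the question. The variable names `loadings`, `values` and `scores` are only illustrative, and the index and column labels are assumed to be the ones shown in the question's tables rather than the numeric ones in the printout above.

```python
import numpy as np
import pandas as pd

# Loadings: one row per original variable, one column per component.
loadings = pd.DataFrame(
    {"PC1": [0, 1, 2, 2, 3, 5], "PC2": [1, 2, 1, 1, 2, 4]},
    index=["X", "X1", "X2", "X3", "X4", "X5"],
)

# Original values: one row per observation, one column per variable.
values = pd.DataFrame(
    [[1, 3, 4, 1, 2, 4], [8, 5, 3, 2, 1, 2], [9, 3, 5, 1, 3, 1]],
    columns=["A", "A1", "A2", "A3", "A4", "A5"],
    index=[1, 2, 3],
)

# The inner dimensions must agree: (3, 6) times (6, 2) gives (3, 2).
assert values.shape[1] == loadings.shape[0]

# Multiply the underlying arrays (labels are ignored here), then
# re-attach the row labels of `values` and the component names.
scores = pd.DataFrame(
    np.dot(values.to_numpy(), loadings.to_numpy()),
    index=values.index,
    columns=loadings.columns,
)
print(scores)  # PC1 is 39, 28, 29 and PC2 is 32, 33, 31
```

With the real data the same call should turn a (159, 14) value frame and a (14, 4) loadings frame into a (159, 4) result. Converting with `to_numpy()` matters when the labels differ: `values.dot(loadings)` aligns the columns of `values` with the index of `loadings` and raises an error when they do not match, as they would here (A, A1, ... versus X, X1, ...).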
CodePudding user response:
If the length of the first DataFrame is the same as the number of columns in the second DataFrame, it is possible to multiply by a NumPy array with