I have two data frames containing float values.
- The first one is a 1 column data frame that contains positions.
- The second one is a matrix of ncol equal to the number of IDs and nrows equal to the nrow of the first data frame.
The idea is to create a new data frame of the same size as the second one. Each value should be computed from the matching row of the first data frame and the corresponding value in each column of the second one. The loop should iterate over every row of one column before moving on to the next column.
The equation would be something like df1 * df2 / (len(df1) + 1)
Example data:
import pandas as pd
df1 = pd.DataFrame([10,20,30,40,50,60], columns=['POS'])
df2 = pd.DataFrame({"ID1" : [0,2,4,6,8,10] , "ID2" :[1,3,5,7,9,11]})
final = pd.DataFrame({"ID1" : [0, 5.714285714, 17.14285714, 34.28571429, 57.14285714, 85.71428571] , "ID2" :[1.428571429, 8.571428571, 21.42857143, 40, 64.28571429, 94.28571429]})
I think the nested loop would be something like this, but I still can't get the right answer. What am I missing?
final = pd.DataFrame([])
for i in list(range(0,len(df1))):
    for j in list(range(0,len(df2))):
        final.append(df2.iloc[i,j] * df1[0][i] / (len(df1) + 1))
In R the answer is this:
for (i in 1:nrow(df1)){
  for (j in 1:ncol(df2)){
    final[i,j] <- (df2[i,j] * df1[i,1]) / (nrow(df1) + 1)
  }
}
CodePudding user response:
In a Pandorable way, you can do it with pandas.DataFrame.squeeze together with pandas.DataFrame.mul:
result = df2.mul(df1.squeeze(), axis=0).div(len(df1) + 1)
Output :
print(result)
         ID1        ID2
0   0.000000   1.428571
1   5.714286   8.571429
2  17.142857  21.428571
3  34.285714  40.000000
4  57.142857  64.285714
5  85.714286  94.285714
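For comparison with the R loop in the question, here is a minimal sketch of the explicit nested loop. It assumes the intended formula is df2[i, j] * df1[i] / (len(df1) + 1), which is what the expected values in the question imply; the names `looped` and `vectorized` are only illustrative. The original attempt likely fails for three reasons: the inner range uses len(df2) (the number of rows) instead of the number of columns, df1[0] looks up a column labelled 0 while the only column is 'POS', and DataFrame.append returned a new object rather than modifying `final` in place (it was also removed in pandas 2.0).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["POS"])
df2 = pd.DataFrame({"ID1": [0, 2, 4, 6, 8, 10], "ID2": [1, 3, 5, 7, 9, 11]})

# Pre-allocate a float frame with the same shape and labels as df2.
looped = pd.DataFrame(np.zeros(df2.shape), index=df2.index, columns=df2.columns)

n = len(df1)
for i in range(df2.shape[0]):      # iterate over rows
    for j in range(df2.shape[1]):  # iterate over columns
        looped.iloc[i, j] = df2.iloc[i, j] * df1.iloc[i, 0] / (n + 1)

# Vectorized equivalent, for comparison with the loop.
vectorized = df2.mul(df1["POS"], axis=0).div(n + 1)
print(np.allclose(looped, vectorized))
```

The loop is mainly useful for checking the logic; on larger frames the vectorized form should be considerably faster.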
CodePudding user response:
Explicitly for the df1 * df2 / (len(df1) * df2) calculation:
pd.DataFrame((df1.values * df2.values) / (len(df1) * df2.values), columns=df2.columns)
         ID1        ID2
0        NaN   1.666667
1   3.333333   3.333333
2   5.000000   5.000000
3   6.666667   6.666667
4   8.333333   8.333333
5  10.000000  10.000000
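Note that df2 cancels out in this expression, so wherever df2 is non-zero the result is just df1 / len(df1), and the NaN in the first row of ID1 comes from 0 / 0. A small sketch to check this with the example data from the question (the names `explicit` and `simplified` are illustrative, and np.errstate is only used to silence the 0 / 0 warning):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([10, 20, 30, 40, 50, 60], columns=["POS"])
df2 = pd.DataFrame({"ID1": [0, 2, 4, 6, 8, 10], "ID2": [1, 3, 5, 7, 9, 11]})

# Same expression as above; df1.values (6, 1) broadcasts against df2.values (6, 2).
with np.errstate(invalid="ignore"):
    explicit = pd.DataFrame(
        (df1.values * df2.values) / (len(df1) * df2.values),
        columns=df2.columns,
    )

# df2 cancels, leaving POS / number of rows.
simplified = df1["POS"] / len(df1)

print(np.allclose(explicit["ID2"], simplified))  # ID2 has no zeros
print(np.isnan(explicit["ID1"].iloc[0]))         # 0 / 0 in the first row
```

If the target is the `final` frame shown in the question, the divisor would need to be len(df1) + 1 instead.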