Is it possible to calculate and add the results to an existing dataframe while the calculation bases on a lookup table with a different format?
More precise:
I have a comma separated file that I open and read in as pandas dataframe. The read-in dataframe contains 38 different data columns and an (while read in process created) additional index column over several thousand rows:
My lookup table contains values as base for a calculation. As well, it is a comma separated file read in as pandas dataframe. It contains 24 rows and 6 columns and an additional index column:
And here comes the calculation which I try to realize: In a new column "M_A" I want to write the result of a calculation like this:
while i stands for the according values of C00, C01, C02....C22, C23
While SP, FR, C00, C01, C02 [...] are a column part of the "data" dataframe, PV, W and RC_A are part of the lookup table dataframe. Common indexing parameter of the "data" and the "lookup" tables are values of the colums of C00, C01, C02 according to column "C" of the lookup table. Calculation values should be taken when data column C00, C01, C02... match lookup table row C00, C01, C02...
As iteration is not a recommended solution for datasets of this size I tried it without but do not find the right way as my lookup table has not the same length as my data table.
df_data['A_calc'] = ((df_data.T / (df_data.SF * df_data.SP)) * ((df_data.C00 * df_lookup.PV * df_lookup.W * df_lookup.RC_A) (df_data.C01 * df_lookup.PV * df_lookup.W * df_lookup.RC_A) (df_data.C02 * df_lookup.PV * df_lookup.W * df_lookup.RC_A) ...)
This leads to the Error message:
AttributeError: 'DataFrame' object has no attribute 'PTU_Airdensity_recalc'
Is there a way to realize this in Python with Pandas df? Maybe even more elegant than mine which I choose to visualize what my intention is...
Any suggestions?
Thanks, Swawa
CodePudding user response:
So for my understanding to apply the formula; for each column Ci we multiply it with values PV[i], W[i],RC_A[i] then sum over each result
result=0
for i in range(len(df_lookup)):
result=result (df_data[df_lookup.loc[i,"C"]]*df_lookup.PV.iloc[i] *
df_lookup.W.iloc[i] * df_lookup.RC_A.iloc[i])
#result is a column
#then we multiply element wise
df_data['A_calc'] = ((df_data.T / (df_data.SF * df_data.SP))*multiply(result, axis="index")
CodePudding user response:
ok, there is an approach:
I respell Ran A's answer.
it should be:
result=0
for i in range(len(df_lookup)):
result=result (df_data[df_lookup.loc[i,"C"]]*df_lookup.PV.iloc[i] *
df_lookup.W.iloc[i] * df_lookup.RC_A.iloc[i])
#result is a column
#then we multiply element wise
df_data['A_calc'] = ((df_data.T / (df_data.SF * df_data.SP)).multiply(result, axis="index")
this line is working then with "." instead of "*". But the loop I am still working on...