newbie python/coder who is trying to make data logger downloads and calculations a smoother process as a side project. Anyways I have two data frames. The first is "data" which contains the following (number of rows shortened for simplicity):
Logger Name Date and Time Battery Temp(C) Sensor Reading(dg) Sensor Temp(C) Array #
0 TDX 10/1/2021 13:35 2.93 15.59 8772.737 14.5 833
1 TDX 10/1/2021 13:36 2.93 15.59 8773.426 14.5 834
2 TDX 10/1/2021 13:36 2.93 15.59 8773.570 14.5 835
3 TDX 10/1/2021 13:37 2.93 15.59 8773.793 14.5 836
The second is "param" which has parameters which contains values that I use to make calculations:
Transducer_ID elevation_tom elevation_toc elevation_ground elevation_tos calculation gage_factor xd_zero_reading thermal_factor xd_temp_at_zero_reading piezo_elev piezo_downhole_depth
0 TDX NaN NaN 1000 NaN linear -0.04135 9138 0.003119 24.8 1600 400
1 Test NaN NaN 1000 NaN linear -0.18320 8997 -0.170100 22.6 800 200
Now what I hope the code will be able to do is make a new column in "data" called "Linear P" which populates based on this calculation that uses variables from both dataframes: [digits_zero_digits - Sensor Reading(dg)] * abs(gage_factor). Now this is not a problem if "param" only had one Transducer ID and the same number of rows as "data", but in reality it has lots of rows with different IDs.
So my question is this. What is the best way to accomplish my goal? Is it to loop over the column or is there something more efficient using the pandas library?
Thanks in advance!
edit: the output I am looking for is this
Logger Name Date and Time Battery Voltage(v) Internal Temp(C) Sensor Reading(dg) Sensor Temp(C) Array # Linear P
0 TDX 10/1/2021 13:35 2.93 15.59 8772.737 14.5 833 15.103625
1 TDX 10/1/2021 13:36 2.93 15.59 8773.426 14.5 834 15.075135
2 TDX 10/1/2021 13:36 2.93 15.59 8773.570 14.5 835 15.069181
3 TDX 10/1/2021 13:37 2.93 15.59 8773.793 14.5 836 15.059959
CodePudding user response:
The more efficient way would be based on my experience :
- join the two data frame using (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html).
- make calculation on the result dataframe (
df["Linear P"] = df["Sensor Reading(dg)"] * ... ) .
CodePudding user response:
Just figured out a way to do it that seems pretty efficient. I simply remove the data in "param" that I do not need:
z = data.iloc[0,0]
param = param[param.Transducer_ID == z]
With the data filtered I pull out only the needed values from param:
x = piezo_param.iloc[0, 7]
y = piezo_param.iloc[0, 6]
And perform the calculation: data['Linear P'] = (x - data['Sensor Reading(dg)']) * abs(y)
Let me know if this seems like the best way to get the job done!