I am having trouble/am new to trying to get this to work for me smoothly using pycharm. I have one CSV that records a Date/Time and barometric pressure every single hour. I have a another CSV file that records Date/Time and water level every ten minutes.
For my task, I need to adjust/compensate the water level data using barometric pressure, and in order to do this I will need to associate the hourly Barometric data to each of the corresponding ten minute water level readings using a script:
Pressure CSV
date, time, pressure
7/31/2021, 1200, 1013
7/31/2021, 1300, 1012.5
7/31/2021, 1400, 1013.4
7/31/2021, 1500, 1014
7/31/2021, 1600, 1016
Water Level CSV
date, time, depth
7/31/2021, 1200, 34.55
7/31/2021, 1210, 34.56
7/31/2021, 1220, 34.55
...
7/31/2021, 1540, 36.75
7/31/2021, 1550, 36.77
7/31/2021, 1600, 36.78
I am trying to get the Water Level CSV to read the Pressure CSV for matching date, and matching time (but only the hour, minutes dont matter since pressure is recoded hourly), and then create anew column and write the corresponding pressure like so
New Water Level CSV
date, time, depth, pressure
7/31/2021, 1200, 34.55, 1013
7/31/2021, 1210, 34.56, 1013
7/31/2021, 1220, 34.55, 1013
...
7/31/2021, 1540, 36.75, 1014
7/31/2021, 1550, 36.77, 1014
7/31/2021, 1600, 36.78, 1016
I will need to perform additional tasks after that but it is easy to do in excel using formulas once the pressures are lined up properly, wouldnt be bad for small data sets to manually go in and change but I have months of data being recorded at 10min intervals so it can get quite time consuming. Let me know your thoughts and Ill try and answer any additional questions
CodePudding user response:
One approach would be to first load all of the available pressure entries into a dictionary based on the datetime. This would then allow you to look up each value.
Next read in the level rows. Remove the hour
value to allow you to look up the corresponding pressure. Combine the two and write to the output csv file.
For example:
from datetime import datetime
import csv
# Load all available pressures into a dictionary
pressure = {}
with open('pressure.csv') as f_pressure:
csv_pressure = csv.reader(f_pressure)
header = next(csv_pressure)
for row in csv_pressure:
pressure[datetime.strptime(f'{row[0]} {row[1]}', '%m/%d/%Y %H%M')] = row[2]
with open('waterlevel.csv') as f_level, open('output.csv', 'w', newline='') as f_output:
csv_level = csv.reader(f_level)
header = next(csv_level)
csv_output = csv.writer(f_output)
csv_output.writerow(["date", "time", "depth", "pressure"])
for row in csv_level:
dt = datetime.strptime(f'{row[0]} {row[1]}', '%m/%d/%Y %H%M').replace(minute=0)
csv_output.writerow([*row, pressure[dt]])
For the data given, this would give you:
date,time,depth,pressure
7/31/2021,1200,34.55,1013
7/31/2021,1210,34.56,1013
7/31/2021,1220,34.55,1013
7/31/2021,1540,36.75,1014
7/31/2021,1550,36.77,1014
7/31/2021,1600,36.78,1016
If a suitable pressure value is not available, you will have to add code to perhaps use the previously available entry.
CodePudding user response:
To me, the... key