I'm very new to coding and Stack Overflow, so my apologies if my code is clunky. I'm adjusting some code from Tim Supinie (https://github.com/tsupinie/vad-plotter) to run through a given time frame and plot hodographs for these times. I've also created a CSV file of parameters in this loop. I'll include the code that I think is relevant below.
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument('radar_id', help="The 4-character identifier for the radar (e.g. KTLX, KFWS, etc.)")
    ap.add_argument('-m', '--storm-motion', dest='storm_motion', help="Storm motion vector. It takes one of two forms. The first is either 'BRM' for the Bunkers right mover vector, or 'BLM' for the Bunkers left mover vector. The second is the form DDD/SS, where DDD is the direction the storm is coming from, and SS is the speed in knots (e.g. 240/25).", default='right-mover')
    ap.add_argument('-s', '--sfc-wind', dest='sfc_wind', help="Surface wind vector. It takes the form DDD/SS, where DDD is the direction the storm is coming from, and SS is the speed in knots (e.g. 240/25).")
    ap.add_argument('-t', '--start-time', dest='start_time', help="Start time to plot. Takes the form DD/HHMM, where DD is the day, HH is the hour, and MM is the minute.")
    ap.add_argument('-e', '--end-time', dest='end_time', help="End time to plot. Takes the form DD/HHMM, where DD is the day, HH is the hour, and MM is the minute.")
    ap.add_argument('-f', '--img-name', dest='img_name', help="Name of the file produced.")
    ap.add_argument('-p', '--local-path', dest='local_path', help="Path to local data. If not given, download from the Internet.")
    ap.add_argument('-c', '--cache-path', dest='cache_path', help="Path to local cache. Data downloaded from the Internet will be cached here.")
    ap.add_argument('-w', '--web-mode', dest='web', action='store_true')
    ap.add_argument('-x', '--fixed-frame', dest='fixed', action='store_true')
    args = ap.parse_args()
    np.seterr(all='ignore')
    start_time = args.start_time
    end_time = args.end_time
    loop_time = start_time
    minute = timedelta(minutes=1)
    tmp = pd.DataFrame()
    while loop_time <= end_time:
        try:
            vad_plotter(args.radar_id,
                        storm_motion=args.storm_motion,
                        sfc_wind=args.sfc_wind,
                        time=loop_time,
                        fname=args.img_name,
                        local_path=args.local_path,
                        cache_path=args.cache_path,
                        web=args.web,
                        fixed=args.fixed
                        )
            tmp = tmp.append(params, loop_time)
        except:
            if args.web:
                print(json.dumps({'error':'error'}))
            else:
                print('This time does not exist. Continuing to next time.')
        # Advance the loop time by one minute
        loop_time_dt = datetime.strptime(loop_time, '%Y-%m-%d/%H%M')
        loop_time_dt += minute
        loop_time = datetime.strftime(loop_time_dt, '%Y-%m-%d/%H%M')
    tmp.to_csv('parameters.csv')
I have it working so that I get a csv file that looks something like this (I've shortened it for this example):
shear_mag_1000m
0 26
1 32
2 29
3 27
But I would like to have a time column that has each corresponding successful time so it looks more like this:
time shear_mag_1000m
2100 26
2200 32
2300 29
2400 27
I think the times would be the loop_time, but I don't know how to only have the successful loop times included (For example, I'd have a start time of 2100 and an end time of 2150 with an increment of 1 minute. However, there might only be data available at 2100, 2124, and 2148. These are currently the only times the hodographs are plotted for and the parameters are added to the csv file). Any help to add the time column is appreciated!
CodePudding user response:
First, keep only the rows that have a loop_time value. Then use set_index from pandas to make the loop_time column the index before saving the output to CSV. Note that set_index returns a new DataFrame, so assign the result back:
tmp = tmp.set_index('loop_time')
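As a minimal sketch of that idea, assuming the times have already been stored in a loop_time column (the values below are made up to match the question's example):

```python
import pandas as pd

# Hypothetical data shaped like the question's desired output.
tmp = pd.DataFrame({
    "loop_time": ["2100", "2200", "2300", "2400"],
    "shear_mag_1000m": [26, 32, 29, 27],
})

# set_index returns a new DataFrame, so assign the result back.
tmp = tmp.set_index("loop_time")

# With no path given, to_csv returns the CSV text; the index becomes
# the first column, with its name as the header.
csv_text = tmp.to_csv()
print(csv_text)
```

This only works once the DataFrame actually contains such a column, which the question's loop does not yet create.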
CodePudding user response:
TIP: Avoid appending to DataFrames (besides, DataFrame.append() is deprecated and was removed in pandas 2.0). Prefer appending to lists or dictionaries, which are data structures meant to "grow".
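A minimal sketch of that pattern applied to a loop like the one in the question. Here compute_params is a hypothetical stand-in for whatever returns the parameters for one time (the question does not show where params comes from), and the date and available times are invented for illustration:

```python
from datetime import datetime, timedelta

import pandas as pd

# Hypothetical stand-in for the real VAD call: pretend data exists only
# at these times, and raise for every other time.
AVAILABLE = {"2100": 26, "2124": 32, "2148": 29}

def compute_params(time_str):
    if time_str not in AVAILABLE:
        raise ValueError(f"no data at {time_str}")
    return {"shear_mag_1000m": AVAILABLE[time_str]}

start = datetime(2022, 5, 1, 21, 0)  # assumed date, for illustration only
end = datetime(2022, 5, 1, 21, 50)

rows = []  # a plain list grows cheaply; the DataFrame is built once at the end
loop_time = start
while loop_time <= end:
    label = loop_time.strftime("%H%M")
    try:
        params = compute_params(label)
        # Reached only when the call succeeded, so failed times are skipped.
        rows.append({"time": label, **params})
    except ValueError:
        pass  # no data for this minute; move on
    loop_time += timedelta(minutes=1)

df = pd.DataFrame(rows).set_index("time")
print(df)
```

Because the time is stored in the same dictionary as the parameters, only successful times end up in the output.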
Use pandas.date_range() to create an equally spaced index between the requested dates, like this (make sure your date formats are recognized):
import pandas as pd
## Request 1-minute intervals; for more `freq` values
# see https://pandas.pydata.org/docs/user_guide/timeseries.html#timeseries-offset-aliases
# ("min" is used here because the "T" alias is deprecated in recent pandas versions)
tindex = pd.date_range(start_time, end_time, freq="1min")
...and then feed this into a function requesting the NEXRAD VAD for each date. But since vad_plotter() may throw occasional exceptions (e.g. "data for time(..) do not exist"), wrap the call in a new function that handles the exception:
def fetch_nexrad_vad(timestamp, ..., errors: list):
    try:
        return vad_plotter(time=timestamp, ...)
    except Exception as ex:
        # Record the failure; the function then implicitly returns None.
        errors.append(f"{timestamp}: failed getting VAD from NEXRAD due to: {ex}")
A sane step is to use a list-comprehension to create the records for each timestamp in a list-of-lists, and then build the dataframe:
errors = []
df_records = [fetch_nexrad_vad(..., time, ...) for time in tindex]
df = pd.DataFrame.from_records(df_records)
df = df.set_index(tindex)
...but experienced pandas programmers know faster/terser ways to build the dataframe from the index function in one step.
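One such terser variant, as a sketch only: build a dictionary keyed by timestamp, skip the failed fetches, and pass it to pd.DataFrame.from_dict with orient="index". Here fetch_params is a hypothetical stand-in for a wrapper that returns None on failure, and the data is invented:

```python
import pandas as pd

# Invented data: parameters exist only for these timestamps.
AVAILABLE = {
    pd.Timestamp("2022-05-01 21:00"): 26,
    pd.Timestamp("2022-05-01 21:24"): 32,
    pd.Timestamp("2022-05-01 21:48"): 29,
}

def fetch_params(timestamp):
    """Hypothetical wrapper: return a dict of parameters, or None on failure."""
    value = AVAILABLE.get(timestamp)
    return None if value is None else {"shear_mag_1000m": value}

tindex = pd.date_range("2022-05-01 21:00", "2022-05-01 21:50", freq="1min")

# Keep only timestamps whose fetch succeeded; the dict keys become the index.
records = {t: rec for t in tindex if (rec := fetch_params(t)) is not None}
df = pd.DataFrame.from_dict(records, orient="index")
df.index.name = "time"
print(df)
```

The assignment expression (:=) needs Python 3.8 or later. Filtering before building the DataFrame means the index contains only the successful times, which is what the question asks for.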
NOTE: using the errors list to collect invalid times makes it pretty simple to convert it into JSON. Or, even better, collect just the invalid timestamps if the thrown exception is always the same.
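For instance, a small sketch of the JSON step, assuming the failed timestamps were collected as strings (the key name failed_times and the values are made up):

```python
import json

# Hypothetical timestamps for which the fetch failed.
failed_times = ["2022-05-01/2101", "2022-05-01/2102"]

# One JSON document listing every failure, rather than one message per failure.
payload = json.dumps({"failed_times": failed_times})
print(payload)
```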