I've been running into a lot of scenarios where I have to iterate through a CSV or other data file, find some data that matches a few conditions, and put that data into a single array. Pretty standard and common NumPy behavior.
My general approach is setting up a list, finding the value in a for loop, appending it to that list, then converting back to an array at the end:
import numpy as np

stats = []
for i in range(len(headers)):
    # column-wise maximum for column i
    max_value = np.max(data[:, i])
    stats.append(max_value)
all_stats = np.array(stats, dtype=float)
This just seems bloated, and it isn't robust when I also want to insert different values for different conditions. What's the best way to build up an array of values in a for loop when the size of the resulting array isn't known in advance?
Thanks!
CodePudding user response:
By the looks of your code you could have:
all_stats = np.max(data[:, :len(headers)], axis=0)
which gives you the same result (one maximum per column) in a vectorized, faster way.
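For the part about inserting different values under different conditions, the usual vectorized approach is np.where (or np.select for more than two branches) rather than appending inside a loop. A minimal sketch, assuming data is a 2-D float array; the thresholds and fill values below are made up purely for illustration:

import numpy as np

# Hypothetical example data; replace with your own array.
data = np.array([[1.0, 5.0, -2.0],
                 [3.0, 0.5,  4.0]])

# Column-wise maxima, equivalent to the loop in the question.
col_max = np.max(data, axis=0)

# Two-way condition without a loop: keep the max where it is
# non-negative, otherwise substitute 0.0.
clipped = np.where(col_max >= 0, col_max, 0.0)

# np.select handles several branches in one call.
result = np.select(
    [col_max > 4, col_max > 0],   # conditions, checked in order
    [col_max * 2, col_max],       # value used where each condition holds
    default=0.0,                  # fallback when no condition matches
)

Both calls return arrays whose size is determined by the inputs, so you never have to grow a list element by element.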