What dimensions should my NumPy array be? ObsPy traces

Time:08-04

I currently have seismic data for 175 events, with 3 traces per event (the traces are numpy arrays of seismic data). I have a classification label for each of those 175 samples saying whether the seismic data is an earthquake or not. I'm looking to format my data into numpy arrays for modelling. I've tried placing it into a dataframe of numpy arrays with each column being a different trace, so the columns would be 'Trace one', 'Trace two', 'Trace three'. This did not work. I have tried lots of different ways of arranging the data to use with Keras. I'm now looking to create a numpy matrix for the data to go into and then use that for modelling. I had thought that the shape might be (175,3,7501), i.e. (number of events, number of traces, number of samples in a trace); however, when I iterate through and try to add the three traces to the numpy matrix, it fails. I'm used to using dataframes, not numpy, for input to Keras.

newrow = np.array([[trace_copy_1],[trace_copy_2],[trace_copy_3]])
data = numpy.vstack([data, newrow])

The data shape is (175,3,7510). The newrow shape is (3,1,7510) and does not allow me to add newrow to data.

I receive the data as ObsPy streams, and each stream has 3 trace objects. Each trace object holds its trace data in a numpy array, so I have to access those arrays and append them to a dataframe for modelling, since I can't feed a stream or trace object to a Keras model.

CodePudding user response:

If I understand your data correctly, you can try one of the following methods:

  • If your data shape is (175, 3, 7510), define newrow as follows: newrow = np.array([trace_copy_1,trace_copy_2,trace_copy_3]), with each trace_copy_x being a numpy array of shape (7510,).
  • Use the reshape function, either with numpy.reshape(new_row, (3, 7510)) or new_row.reshape((3, 7510)).
  • If you're familiar with dataframes you can still use pandas dataframes by reducing the dimension of your data (you can for example add the different traces at the end of one another on the same row, something you often see when working with images). Here it could be something like pandas.DataFrame(data.reshape((175, 3*7510)))
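The shapes involved in the three options above can be checked with a short sketch. The random trace arrays are stand-ins for the real trace data and are an assumption made purely for illustration; the variable names follow the question.

```python
import numpy as np

n_samples = 7510  # samples per trace, as in the question

# Stand-ins for the three trace arrays (random data, for illustration only).
rng = np.random.default_rng(0)
trace_copy_1 = rng.standard_normal(n_samples)
trace_copy_2 = rng.standard_normal(n_samples)
trace_copy_3 = rng.standard_normal(n_samples)

# Wrapping each trace in its own list adds an extra axis: shape (3, 1, 7510).
nested = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])

# Passing the traces directly gives one event of shape (3, 7510).
newrow = np.array([trace_copy_1, trace_copy_2, trace_copy_3])

# The nested version can also be fixed after the fact with reshape.
reshaped = nested.reshape((3, n_samples))

# Flattening for a 2-D table: one row per event, traces laid end to end.
data = np.zeros((175, 3, n_samples))
flat = data.reshape((175, 3 * n_samples))
```

Here `flat` has one row per event, which is the layout a dataframe would need.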

In addition to that, I recommend using numpy.concatenate instead of numpy.vstack (it is more general).
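As a rough sketch of the concatenate approach: numpy.concatenate needs both arrays to have the same number of dimensions, so a single event of shape (3, 7510) has to be given a leading axis before it can be joined to the (175, 3, 7510) array. The placeholder values below are assumptions for illustration only.

```python
import numpy as np

n_traces, n_samples = 3, 7510

data = np.zeros((175, n_traces, n_samples))  # existing events
newrow = np.ones((n_traces, n_samples))      # one new event (placeholder values)

# Add a leading axis so the new event has shape (1, 3, 7510).
new_event = newrow[np.newaxis, ...]

# Join along axis 0, the event axis.
combined = np.concatenate([data, new_event], axis=0)
print(combined.shape)  # (176, 3, 7510)
```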

I hope it works.

Cheers

CodePudding user response:

Thanks for the answers. The way I solved this was to create a NumPy array of the desired shape: (number of events, number of traces (i.e. number of arrays), number of samples (i.e. number of values in each array)).

I then created a new row, reshaped it, and added it to the array. Following this, I split the data to remove the original rows that were there before I started appending my new data.

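import numpy as np

# trace_copy_1..3 are the 1-D numpy arrays taken from the three traces of one stream
data = np.zeros(shape=(175,3,7501))  # (events, traces, samples)
newrow = np.array([[trace_copy_1],[trace_copy_2],[trace_copy_3]])  # shape (3,1,7501)
newrow = newrow.reshape((1,3,7501))  # one event, ready to append along axis 0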
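Put together, the steps described in this answer might look roughly like the sketch below. The `events` list is a hypothetical stand-in for the real trace data, only 5 events are used to keep the example small, and joining with np.concatenate is an assumption, since the answer does not show how the row was added.

```python
import numpy as np

n_events, n_traces, n_samples = 5, 3, 7501  # 5 events here; the question has 175

# Hypothetical stand-in for the real data: one tuple of three traces per event.
rng = np.random.default_rng(0)
events = [tuple(rng.standard_normal(n_samples) for _ in range(n_traces))
          for _ in range(n_events)]

# Start from a zero-filled placeholder block, as in the answer.
data = np.zeros(shape=(n_events, n_traces, n_samples))

for trace_copy_1, trace_copy_2, trace_copy_3 in events:
    newrow = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])
    newrow = newrow.reshape((1, n_traces, n_samples))
    data = np.concatenate([data, newrow], axis=0)

# Drop the placeholder rows so only the appended events remain.
data = data[n_events:]
```

If the number of events is known up front, assigning each event into the pre-allocated array by index (data[i] = ...) would avoid both the repeated copying and the final slice.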
