Home > database >  Best way to save/load 4-dimensional array in Octave
Best way to save/load 4-dimensional array in Octave

Time:03-03

I have an Octave code that gathers data from thousands of .csv files and stores it in a 4-dimensional matrix (800x8x80x213) so I can access it with other code. The process of reading in the data takes about 10 minutes so I thought it would be a good idea to save the matrix and then I could load it into the workspace whenever I wanted to work with the data instead of waiting 10 minutes for the matrix to be created. I used Save to save the matrix and Load to load it into the workspace, however when I loaded the matrix, it took 30 minutes to complete. Is there a better/faster way to save/load this 4-D matrix? Seems ridiculous that it takes 3 times longer to load a matrix than to create it from 4000 files...

CodePudding user response:

The default 'format' option used by the save command is -text, which is human readable. For large datasets, this will take a long time to create (not to mention, it will lead to a much larger file, since it will need to represent floating point numbers via their text representations...), so it is indeed inappropriate for this kind of data. Loading from a large text format file will also take quite a long time, especially on a slow computer, for the same reasons.

Octave also supports a -binary option, which is octave's internal binary format. This is what you need. E.g.

save -binary outputfile.bin varname

In this particular case, the text file is 2.2G, whereas the binary format is the expected 872Mb (i.e. number of elements * 8 bytes per element). Saving and loading is near instant.

Alternatively, there's a bunch of other options too, corresponding to other common formats, e.g. as a commenter has also mentioned here, -hdf5, or -v7 which is matlab's .mat format.

Type help save on your octave console for more details.

  • Related