Is there a way to store a binary file to HDF5?

Time:05-18

I have a NumPy .npy file that I would like to store in my HDF5 file. I use the NumPy format because the array has dtype U22, which h5py does not support.

Is there a way to store this binary file inside the HDF5 file so that I can access it with dictionary-style indexing, like file['general/numpy_binary']? I can provide an MWE if required.

MWE:

import h5py
import numpy as np

a = np.array([['here',1,2,3], ['that',4,5,6]]).astype('S10')  # a 2x4 array

with h5py.File('dump.h5','w') as debug:
    g=debug.create_group('general')
    dset = debug.create_dataset('general/data', data=a.tolist())

This results in:

[Image: resulting dataset]

I would like to see this data as a 2x4 table. Is that possible?

CodePudding user response:

What you're showing seems to be a fault in the tool you use to view the file, not in the data that was written. Observe:

>>> import numpy as np
>>> a = np.array([['here',1,2,3], ['that',4,5,6]]).astype('S10')
>>> a
array([[b'here', b'1', b'2', b'3'],
       [b'that', b'4', b'5', b'6']], dtype='|S10')
>>> a.tolist()
[[b'here', b'1', b'2', b'3'], [b'that', b'4', b'5', b'6']]
>>> import h5py
>>> h = h5py.File('data.h5','w')
>>> h.create_dataset('data',data=a.tolist())
<HDF5 dataset "data": shape (2, 4), type "|S4">
>>> h.close()
>>> h
<Closed HDF5 file>
>>> h = h5py.File('data.h5','r')
>>> h['data']
<HDF5 dataset "data": shape (2, 4), type "|S4">
>>> h['data'][0]
array([b'here', b'1', b'2', b'3'], dtype='|S4')
>>> h['data'][1]
array([b'that', b'4', b'5', b'6'], dtype='|S4')
>>>

Note that the shape is (2,4). Both rows are present in the file.
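
As for the original question about keeping the .npy file itself inside the HDF5 file, one possible approach is sketched below. It is an assumption-laden illustration, not something taken from the session above: the .npy bytes are stored as a flat uint8 dataset under the name 'general/numpy_binary' from the question, and the temporary file path is made up for the example. The byte-string version of the array is written next to it so the 2x4 shape can be checked.

```python
import io
import os
import tempfile

import h5py
import numpy as np

# Example array with a Unicode ('U') dtype, which h5py cannot store directly.
a = np.array([['here', 1, 2, 3], ['that', 4, 5, 6]])

# Serialize the array to .npy bytes in memory (the same format np.save writes to disk).
buf = io.BytesIO()
np.save(buf, a)
raw = np.frombuffer(buf.getvalue(), dtype=np.uint8)

# Hypothetical output location, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'dump.h5')

with h5py.File(path, 'w') as f:
    # The intermediate group 'general' is created automatically.
    f.create_dataset('general/numpy_binary', data=raw)
    # Fixed-width byte strings are stored as a regular 2x4 dataset.
    f.create_dataset('general/data', data=a.astype('S10'))

with h5py.File(path, 'r') as f:
    # Rebuild the original Unicode array from the stored .npy bytes.
    restored = np.load(io.BytesIO(f['general/numpy_binary'][()].tobytes()))
    table = f['general/data'][()]
    shape = f['general/data'].shape

print(restored.dtype.kind, shape)
```

The uint8 dataset is opaque to HDF5 viewers, so the 'S10' dataset is the one to inspect as a table; the binary copy only serves to recover the exact NumPy array, dtype included.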
