Is it possible to have a 2D array as an element of a 2D dataframe?

Time:03-08

I am a student working on a project with a video encoder. I have extracted pixel values, together with other information (POC, x position, y position, height, width), into txt files. I am going to convert this data into HDF5 files with h5py, but I wonder whether pandas, or even HDF5, supports storing a 2D array (the pixel values) as an element of a 2D pandas DataFrame.

For example, let p = [[1,2],[3,4]] be my pixel values. Can my dataframe look like dataset[0] = [0(POC),0(x),0(y),p(2D array pixel values)]? And can it even be written in HDF5 format?

CodePudding user response:

Yes, you can. Here is an example:

import pandas as pd

df = pd.DataFrame({'Col1':['a','b','c'],
                   'Col2':[{'a':[1,2,3]},{'b':[[2.1],[1]]},{'c':[{'test':'hello'}]}],
                   'Col3':[[[1,2],[3,4]],[[4],[5]],[[3,5,10],[9]]]})
df

Output:

    Col1                          Col2               Col3
0      a              {'a': [1, 2, 3]}   [[1, 2], [3, 4]]
1      b           {'b': [[2.1], [1]]}         [[4], [5]]
2      c    {'c': [{'test': 'hello'}]}  [[3, 5, 10], [9]]

As you can see, a cell of a pandas DataFrame can hold lists, lists of lists, dictionaries, dictionaries of lists, lists of dictionaries, and so on. Such columns have dtype object:

df.info()

Output:


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Col1    3 non-null      object
 1   Col2    3 non-null      object
 2   Col3    3 non-null      object
dtypes: object(3)
memory usage: 200.0+ bytes

type(df.Col3[0])

Output:

list
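
The example above stores plain Python lists. For the pixel data in the question, the same idea should work with NumPy arrays, so each cell keeps its 2D shape. The following is only a minimal sketch: the column names and the two sample blocks are made up for illustration and are not from the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical block records: POC, x, y and one 2D pixel array per row.
records = [
    {"POC": 0, "x": 0, "y": 0, "pixels": np.array([[1, 2], [3, 4]])},
    {"POC": 0, "x": 2, "y": 0, "pixels": np.array([[5, 6, 7], [8, 9, 10]])},
]
df = pd.DataFrame(records)

# Height and width can be derived from the shape of each stored array.
df["height"] = df["pixels"].map(lambda a: a.shape[0])
df["width"] = df["pixels"].map(lambda a: a.shape[1])

print(df[["POC", "x", "y", "height", "width"]])
print(df["pixels"].dtype)         # the pixel column has dtype object
print(df.loc[0, "pixels"].shape)  # each cell is still a 2D NumPy array
```

Because the pixels column has dtype object, pandas treats every array as an opaque Python object, so vectorized numeric operations do not apply to that column directly.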

CodePudding user response:

Here is an example that creates an HDF5 file with h5py, stores the pixel data as a dataset, and attaches the other values as attributes of that dataset.

import h5py

# pixel data to store in a dataset
p = [[1, 2], [3, 4]]
# attribute names and values
attrs = {
    'POC': 10.,
    'x_position': 100,
    'y_position': 200,
    'height': 16.,
    'width': 5.,
}

# write the pixel data as a dataset and attach the metadata as attributes
with h5py.File('SO_71383695.h5', 'w') as h5w:
    ds = h5w.create_dataset('pixel_data', data=p)
    for name, value in attrs.items():
        ds.attrs[name] = value

# reopen the file read-only and print the dataset and its attributes
with h5py.File('SO_71383695.h5', 'r') as h5r:
    ds = h5r['pixel_data']
    p_data = ds[:]  # read into a NumPy array
    print(p_data)
    for name, value in ds.attrs.items():
        print(f"{name}: {value}")

You can view the data with HDFView. Output:

[[1 2]
 [3 4]]
POC: 10.0
height: 16.0
width: 5.0
x_position: 100
y_position: 200
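
If there are many blocks, one possible extension is to write one dataset per block into the same file, each with its own attributes. This is a sketch under assumptions, not a recommended layout: the dataset names (block_0000, ...), the file name, and the sample blocks are hypothetical, and the file goes to a temporary directory only to keep the example self-contained.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical blocks: metadata plus one 2D pixel array each.
blocks = [
    {"POC": 0, "x_position": 0, "y_position": 0,
     "pixels": np.array([[1, 2], [3, 4]])},
    {"POC": 0, "x_position": 2, "y_position": 0,
     "pixels": np.array([[5, 6, 7], [8, 9, 10]])},
]

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "blocks_sketch.h5")

with h5py.File(path, "w") as h5w:
    for i, block in enumerate(blocks):
        # One dataset per block; the metadata goes into its attributes.
        ds = h5w.create_dataset(f"block_{i:04d}", data=block["pixels"])
        for key, value in block.items():
            if key != "pixels":
                ds.attrs[key] = value
        ds.attrs["height"] = block["pixels"].shape[0]
        ds.attrs["width"] = block["pixels"].shape[1]

with h5py.File(path, "r") as h5r:
    # Read every dataset back as (pixel array, attribute dict).
    loaded = {name: (ds[:], dict(ds.attrs)) for name, ds in h5r.items()}

for name, (pixels, meta) in loaded.items():
    print(name, pixels.shape, meta)
```

Since each block is its own dataset, blocks of different sizes can live side by side without padding.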

    