the npy file I used ⬆️
https://github.com/mangomangomango0820/DataAnalysis/blob/master/NumPy/NumPyEx/NumPy_Ex1_3Dscatterplt.npy
2. after loading the npy file,
data = np.load('NumPy_Ex1_3Dscatterplt.npy')
'''
[([ 2, 2, 1920, 480],) ([ 1, 3, 1923, 480],)
......
([ 3, 3, 1923, 480],)]
⬆️ data.shape, (69,)
⬆️ data.dtype, [('f0', '<i8', (4,))]
⬆️ type(data), <class 'numpy.ndarray'>
⬆️ type(data[0]), <class 'numpy.void'>
'''
As you can see, each row, e.g. data[0], has type <class 'numpy.void'>.
I want to get an ndarray from the data above that looks like this ⬇️
[[ 2 2 1920 480]
...
[ 3 3 1923 480]]
The way I did it is ⬇️
all_rows = np.array([data[i][0] for i in range(data.shape[0])])  # renamed from `all`, which shadows the Python builtin
'''
[[ 2 2 1920 480]
...
[ 3 3 1923 480]]
'''
I am wondering if there's a smarter way to process data of the numpy.void
class and get the expected result.
CodePudding user response:
Here is the trick
data_clean = np.array(data.tolist())
print(data_clean)
print(data_clean.shape)
Output
[[[ 2 2 1920 480]]
...............
[[ 3 3 1923 480]]]
(69, 1, 4)
If you don't like the extra dimension of size 1 in the middle, you can squeeze it out like this:
data_sqz = data_clean.squeeze()
print(data_sqz)
print(data_sqz.shape)
Output
...
[ 3 3 1923 480]]
(69, 4)
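One caveat with the squeeze step above: a bare squeeze() removes every axis of length 1, so an array holding a single row would also lose its row axis. A minimal sketch, assuming a 3-row stand-in for the 69-row array (values taken from the question), that passes axis=1 to drop only the middle axis:

```python
import numpy as np

# Small stand-in for the 69-row array in the question (values copied from the post).
dt = np.dtype([("f0", "<i8", (4,))])
data = np.array(
    [([2, 2, 1920, 480],), ([1, 3, 1923, 480],), ([3, 3, 1923, 480],)], dtype=dt
)

data_clean = np.array(data.tolist())   # shape (3, 1, 4)
data_sqz = data_clean.squeeze(axis=1)  # drop only the middle axis -> (3, 4)
print(data_sqz.shape)

# With a single row, a bare squeeze() also drops the row axis.
one_row = np.array(data[:1].tolist())  # shape (1, 1, 4)
print(one_row.squeeze().shape)         # (4,)
print(one_row.squeeze(axis=1).shape)   # (1, 4)
```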
CodePudding user response:
Your data is a structured array, with a compound dtype.
https://numpy.org/doc/stable/user/basics.rec.html
I can recreate it with:
In [130]: dt = np.dtype([("f0", "<i8", (4,))])
In [131]: x = np.array(
...: [([2, 2, 1920, 480],), ([1, 3, 1923, 480],), ([3, 3, 1923, 480],)], dtype=dt
...: )
In [132]: x
Out[132]:
array([([ 2, 2, 1920, 480],), ([ 1, 3, 1923, 480],),
([ 3, 3, 1923, 480],)], dtype=[('f0', '<i8', (4,))])
This is a 1d array with one field, and the field itself contains 4 elements.
Fields are accessed by name:
In [133]: x["f0"]
Out[133]:
array([[ 2, 2, 1920, 480],
[ 1, 3, 1923, 480],
[ 3, 3, 1923, 480]])
This has integer dtype with shape (3,4).
Accessing fields by name applies to more complex structured arrays as well.
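To illustrate the point about more complex structured arrays, here is a rough sketch with a second field. The dtype and the field names "box" and "score" are made up for this example and are not part of the asker's data:

```python
import numpy as np

# Hypothetical two-field dtype, for illustration only (not the asker's data).
dt2 = np.dtype([("box", "<i8", (4,)), ("score", "<f8")])
z = np.array(
    [([2, 2, 1920, 480], 0.5), ([1, 3, 1923, 480], 0.25)], dtype=dt2
)

print(z["box"].shape)    # (2, 4): one 4-element row per record
print(z["score"].shape)  # (2,): one scalar per record
print(z.dtype.names)     # ('box', 'score')
```

Each field comes back as a plain array with its own dtype and shape, so the same name-based indexing works regardless of how many fields the record has.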
Using the tolist approach from the other answer:
In [134]: x.tolist()
Out[134]:
[(array([ 2, 2, 1920, 480]),),
(array([ 1, 3, 1923, 480]),),
(array([ 3, 3, 1923, 480]),)]
In [135]: np.array(x.tolist()) # (3,1,4) shape
Out[135]:
array([[[ 2, 2, 1920, 480]],
[[ 1, 3, 1923, 480]],
[[ 3, 3, 1923, 480]]])
In [136]: np.vstack(x.tolist()) # (3,4) shape
Out[136]:
array([[ 2, 2, 1920, 480],
[ 1, 3, 1923, 480],
[ 3, 3, 1923, 480]])
The documentation also suggests using:
In [137]: import numpy.lib.recfunctions as rf
In [138]: rf.structured_to_unstructured(x)
Out[138]:
array([[ 2, 2, 1920, 480],
[ 1, 3, 1923, 480],
[ 3, 3, 1923, 480]])
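As a quick sanity check, the approaches shown so far can be compared on the same small array. This is only a sketch on the 3-row example above, not a benchmark:

```python
import numpy as np
import numpy.lib.recfunctions as rf

dt = np.dtype([("f0", "<i8", (4,))])
x = np.array(
    [([2, 2, 1920, 480],), ([1, 3, 1923, 480],), ([3, 3, 1923, 480],)], dtype=dt
)

by_field = x["f0"]                        # field access by name
by_vstack = np.vstack(x.tolist())         # tolist, then stack the rows
by_rf = rf.structured_to_unstructured(x)  # recfunctions helper

print(np.array_equal(by_field, by_vstack))
print(np.array_equal(by_field, by_rf))

# Field access returns a view, so it shares memory with x.
print(np.shares_memory(by_field, x))
```

Since x["f0"] is a view, writing into it also changes x; call .copy() on it if an independent array is needed.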
An element of a structured array displays as a tuple, though its type is the generic np.void.
There is an older class, recarray, that is similar but adds attribute-style access to fields:
In [146]: y=x.view(np.recarray)
In [147]: y
Out[147]:
rec.array([([ 2, 2, 1920, 480],), ([ 1, 3, 1923, 480],),
([ 3, 3, 1923, 480],)],
dtype=[('f0', '<i8', (4,))])
In [148]: y.f0
Out[148]:
array([[ 2, 2, 1920, 480],
[ 1, 3, 1923, 480],
[ 3, 3, 1923, 480]])
In [149]: type(y[0])
Out[149]: numpy.record
I often refer to elements of structured arrays as records.
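For completeness, a small sketch of how a single element (a "record") behaves in both flavours, again using the 3-row example array:

```python
import numpy as np

dt = np.dtype([("f0", "<i8", (4,))])
x = np.array(
    [([2, 2, 1920, 480],), ([1, 3, 1923, 480],), ([3, 3, 1923, 480],)], dtype=dt
)

rec = x[0]
print(type(rec))  # numpy.void
print(rec["f0"])  # field by name
print(rec[0])     # field by position, as in the question's loop

y = x.view(np.recarray)
print(type(y[0]))  # numpy.record
print(y[0].f0)     # field as an attribute
```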