My goal is to simplify access to the values stored in several JSON files by nesting one recarray per file inside a global recarray, so that a value can be read as: parms.logic.myVarA.
Desired structure
parms (recarray)
[
logic (recarray)
- myVarA (int)
tool (recarray)
- myVarA (int)
]
Example:
parms.logic.myVarA
parms.tool.myVarA
I'm having a little trouble understanding numpy.recarray and am looking for help with a small piece of code. I have a class to test what I want to achieve:
import numpy as np
import simplejson, pathlib
# Loaders
class Loaders(np.recarray):
    def __new__(_, targets):
        content = [[],[]]
        for k, v in targets.items():
            # Load child
            load = None
            if(pathlib.Path(v).suffix == '.json'):
                with open(v, 'r', encoding='utf-8') as file:
                    child = [[],[]]
                    for k_, v_ in simplejson.load(file).items():
                        child[0].append(v_)
                        child[1].append((k_, type(v_)))
                    load = np.array(tuple(child[0]), dtype=np.dtype(child[1])).view(np.recarray)
            print(f'CHILD {k} {type(load)}')
            print(f'Brute {child}')
            print(f'Check {k}.myVarA{load.myVarA}\n')
            if(load):
                # Add child
                content[0].append(load)
                content[1].append((k, type(load)))
        print('------ Loaded ------')
        print(f'Brute {content}')
        return np.array(tuple(content[0]), dtype=np.dtype(content[1])).view(np.recarray)
if __name__ == '__main__':
    try:
        # FAILURE
        print('\n------ Loading ------')
        parms = Loaders({
            'logic' : './test/logic/parms.json',
            'tool' : './test/tool/parms.json'
        })
        print('\n------ Final check ------')
        print(f'parms dtypes {parms.dtype.names}')
        print(f'parms.logic {parms.logic} {type(parms.logic)}')
        print(f'Check parms.logic.myVarA{parms.logic.myVarA}')
    except Exception as e:
        print(f'Test failure {e}')
Output
CHILD logic <class 'numpy.recarray'>
Brute [[12, 44], [('myVarA', <class 'int'>), ('valB', <class 'int'>)]]
Check logic.myVarA12
CHILD tool <class 'numpy.recarray'>
Brute [[45], [('myVarA', <class 'int'>)]]
Check tool.myVarA45
------ Loaded ------
Brute [[rec.array((12, 44),
dtype=[('myVarA', '<i8'), ('valB', '<i8')]), rec.array((45,),
dtype=[('myVarA', '<i8')])], [('logic', <class 'numpy.recarray'>), ('tool', <class 'numpy.recarray'>)]]
------ Final check ------
parms dtypes ('logic', 'tool')
parms.logic (12, 44) <class 'numpy.ndarray'>
Test failure 'numpy.ndarray' object has no attribute 'myVarA'
I can see that the type of 'logic' changes once it is accessed through 'parms', but I don't understand why.
A check of the 'parms' recarray dtype shows that 'logic' and 'tool' are present, but they come back as ndarray, even though earlier in the output their type is indeed recarray:
CHILD logic <class 'numpy.recarray'>
parms dtypes ('logic', 'tool')
print(f'Check parms.logic.valA {parms.logic.valA}')
Test failure 'numpy.ndarray' object has no attribute 'valA'
If any of you have an idea what my problem is, or know a simpler way to do this, I'm interested. Thank you in advance.
Answer:
With this dtype:
In [169]: dt = np.dtype([('logic',[('myVarA',int)]),('tool',[('myVarA',int)])])
I can make a recarray (with "random" values, since np.recarray does not initialize the memory it allocates):
In [170]: arr = np.recarray(3,dt)
In [171]: arr
Out[171]:
rec.array([((1969973520,), (598,)), ((1969973584,), (598,)),
((1969973552,), (598,))],
dtype=[('logic', [('myVarA', '<i4')]), ('tool', [('myVarA', '<i4')])])
And access by attribute:
In [172]: arr.logic
Out[172]:
rec.array([(1969973520,), (1969973584,), (1969973552,)],
dtype=[('myVarA', '<i4')])
In [173]: arr.logic.myVarA
Out[173]: array([1969973520, 1969973584, 1969973552])
Or by field name (as with a structured array):
In [174]: arr['logic']['myVarA']
Out[174]: array([1969973520, 1969973584, 1969973552])
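Applied to the question's data, the same kind of nested dtype could be derived from the dictionaries that the JSON loader returns. The following is a minimal sketch, not a drop-in replacement: the helper name nested_recarray and the in-memory dictionaries standing in for the JSON files are assumptions made for illustration, and it assumes every value is a plain number (strings would need an explicit width in the dtype).

```python
import numpy as np

def nested_recarray(tables):
    """Hypothetical helper: one record with one nested structured field per table."""
    outer_fields = []   # (name, nested dtype) pairs for the outer dtype
    outer_values = []   # one tuple of values per nested field
    for name, mapping in tables.items():
        # Assumption: all values are ints, floats or bools.
        inner_dtype = np.dtype([(key, type(value)) for key, value in mapping.items()])
        outer_fields.append((name, inner_dtype))
        outer_values.append(tuple(mapping.values()))
    # A single record whose fields are themselves structured.
    record = np.array([tuple(outer_values)], dtype=np.dtype(outer_fields))
    return record.view(np.recarray)

# Stand-ins for the contents of the two parms.json files.
parms = nested_recarray({
    'logic': {'myVarA': 12, 'valB': 44},
    'tool': {'myVarA': 45},
})

print(parms.dtype.names)
print(type(parms.logic))
print(parms.logic.myVarA)
```

Because the nesting lives in the dtype rather than in stored Python objects, parms.logic stays a recarray and parms.logic.myVarA is an ordinary one-element array.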
Another way of nesting recarrays is to use object dtypes:
In [229]: dt1 = np.dtype([('logic',object),('tool',object)])
In [230]: dt2 = np.dtype([('myVarA',int)])
In [231]: arr1 = np.recarray(2, dt1)
In [232]: arr1
Out[232]:
rec.array([(None, None), (None, None)],
dtype=[('logic', 'O'), ('tool', 'O')])
The only way I can fill this with recarrays is:
In [233]: for i in range(2):
     ...:     for n in dt1.names:
     ...:         arr1[n][i] = np.recarray(0, dt2)
     ...:
In [234]: arr1
Out[234]:
rec.array([(rec.array([],
dtype=[('myVarA', '<i4')]), rec.array([],
dtype=[('myVarA', '<i4')])) ,
(rec.array([],
dtype=[('myVarA', '<i4')]), rec.array([],
dtype=[('myVarA', '<i4')])) ],
dtype=[('logic', 'O'), ('tool', 'O')])
which allows this access:
In [235]: arr1.logic[0].myVarA
Out[235]: array([], dtype=int32)
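The object-dtype variant can also hold real values instead of empty recarrays. This is only a sketch under assumptions: the dictionaries below stand in for the loaded JSON files, the names outer_dt, tables and child are made up for illustration, and all values are assumed to be plain integers.

```python
import numpy as np

# Outer recarray with one object field per file, holding a single record.
outer_dt = np.dtype([('logic', object), ('tool', object)])
parms = np.recarray(1, outer_dt)

# Stand-ins for the contents of the two parms.json files.
tables = {
    'logic': {'myVarA': 12, 'valB': 44},
    'tool': {'myVarA': 45},
}

for name, mapping in tables.items():
    inner_dt = np.dtype([(key, type(value)) for key, value in mapping.items()])
    # Each child is its own one-record recarray.
    child = np.array([tuple(mapping.values())], dtype=inner_dt).view(np.recarray)
    # Element-wise assignment stores the recarray object itself in the field.
    parms[name][0] = child

print(type(parms.logic))
print(type(parms.logic[0]))
print(parms.logic[0].myVarA)
```

Note the extra [0]: parms.logic is a plain object array, and only the element stored inside it is a recarray, so the access pattern is parms.logic[0].myVarA rather than parms.logic.myVarA.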
This may be too detailed for your purposes, but recarray is essentially just a numpy array subclass with a custom method for fetching a field. If the attribute isn't one of the standard array methods or attributes, it checks the dtype's fields for a matching name:
def __getattribute__(self, attr):
    # See if ndarray has this attr, and return it if so. (note that this
    # means a field with the same name as an ndarray attr cannot be
    # accessed by attribute).
    try:
        return object.__getattribute__(self, attr)
    except AttributeError:  # attr must be a fieldname
        pass
    # look for a field with this name
    fielddict = ndarray.__getattribute__(self, 'dtype').fields
    try:
        res = fielddict[attr][:2]
    except (TypeError, KeyError) as e:
        raise AttributeError("recarray has no attribute %s" % attr) from e
    obj = self.getfield(*res)
    # At this point obj will always be a recarray, since (see
    # PyArray_GetField) the type of obj is inherited. Next, if obj.dtype is
    # non-structured, convert it to an ndarray. Then if obj is structured
    # with void type convert it to the same dtype.type (eg to preserve
    # numpy.record type if present), since nested structured fields do not
    # inherit type. Don't do this for non-void structures though.
    if obj.dtype.names is not None:
        if issubclass(obj.dtype.type, nt.void):
            return obj.view(dtype=(self.dtype.type, obj.dtype))
        return obj
    else:
        return obj.view(ndarray)
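The last branch is the relevant one for the question: a field whose dtype is not structured comes back as a plain ndarray. That is presumably what happens to parms.logic there, assuming NumPy turns the class given as the field type into a generic object field (as in the dt1 example). A small sketch to check the three cases, with field names chosen only for illustration:

```python
import numpy as np

dt = np.dtype([
    ('nested', [('myVarA', int)]),  # structured field
    ('plain', int),                 # ordinary numeric field
    ('boxed', object),              # object field
])
# Values are irrelevant here; only the returned types are inspected.
arr = np.recarray(2, dt)

nested_type = type(arr.nested)  # structured field: stays a recarray
plain_type = type(arr.plain)    # non-structured field: viewed as ndarray
boxed_type = type(arr.boxed)    # object field: also viewed as ndarray

print(nested_type, plain_type, boxed_type)
```

Only the structured field keeps attribute access to its own sub-fields; the other two have no field names for __getattribute__ to look up.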