I have a list of objects like in the test variable below:
import dataclasses
from typing import List

import pandas as pd

@dataclasses.dataclass
class A:
    a: float
    b: float
    c: float

@dataclasses.dataclass
class B:
    prop: str
    attr: List["A"]

test = [
    B("z", [A('a', 'b', 'c'), A('d', 'l', 's')]),
    B("a", [A('s', 'v', 'c')]),
]
I want to transform it into a pandas DataFrame like this:
  prop  a  b  c
0    z  a  b  c
0    z  d  l  s
1    a  s  v  c
I can do it in several steps, but that seems unnecessary and inefficient, since I'm going through the same data multiple times:
a = pd.DataFrame(
    [obj.__dict__ for obj in test]
)
a
  prop                                              attr
0    z  [A(a='a', b='b', c='c'), A(a='d', b='l', c='s')]
1    a                          [A(a='s', b='v', c='c')]
b = a.explode('attr')
b
  prop                    attr
0    z  A(a='a', b='b', c='c')
0    z  A(a='d', b='l', c='s')
1    a  A(a='s', b='v', c='c')
b[["a", "b", "c"]] = b.apply(lambda x: [x.attr.a, x.attr.b, x.attr.c], axis=1, result_type="expand")
b
  prop                    attr  a  b  c
0    z  A(a='a', b='b', c='c')  a  b  c
0    z  A(a='d', b='l', c='s')  d  l  s
1    a  A(a='s', b='v', c='c')  s  v  c
Can this be done more efficiently?
CodePudding user response:
Use a combination of dataclasses.asdict and pd.json_normalize:
In [59]: pd.json_normalize([dataclasses.asdict(k) for k in test], 'attr', ['prop'])
Out[59]:
   a  b  c prop
0  a  b  c    z
1  d  l  s    z
2  s  v  c    a
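For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the classes from the question (annotated as str here, since the sample values are strings) and adds one step that is not part of the answer above: reordering the columns so that prop comes first, as in the desired output. The name records is only illustrative.

```python
import dataclasses
from typing import List

import pandas as pd


@dataclasses.dataclass
class A:
    # Assumption: annotated as str because the sample values are strings.
    a: str
    b: str
    c: str


@dataclasses.dataclass
class B:
    prop: str
    attr: List[A]


test = [
    B("z", [A("a", "b", "c"), A("d", "l", "s")]),
    B("a", [A("s", "v", "c")]),
]

# asdict recurses into nested dataclasses and lists, so every B becomes
# {"prop": ..., "attr": [{"a": ..., "b": ..., "c": ...}, ...]}.
records = [dataclasses.asdict(obj) for obj in test]

# record_path selects the nested list to flatten into rows;
# meta copies the parent's "prop" value onto each of those rows.
df = pd.json_normalize(records, record_path="attr", meta=["prop"])

# json_normalize appends meta columns last, so reorder to match the question.
df = df[["prop", "a", "b", "c"]]
print(df)
```

Note that the result gets a fresh 0..n-1 index rather than the repeated 0, 0, 1 index that explode produces.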
CodePudding user response:
Another version:
df = pd.DataFrame({"prop": b.prop, **a.__dict__} for b in test for a in b.attr)
Result:
  prop  a  b  c
0    z  a  b  c
1    z  d  l  s
2    a  s  v  c
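If the repeated index from the desired output (0, 0, 1) matters, one possible variation is to carry the position of each parent along with enumerate. This is only a sketch under a few assumptions: the names rows, parent and child are illustrative, the fields are annotated as str to match the sample values, and vars(child) stands in for a.__dict__ (the two are equivalent for regular dataclass instances that do not use slots).

```python
import dataclasses
from typing import List

import pandas as pd


@dataclasses.dataclass
class A:
    # Assumption: annotated as str because the sample values are strings.
    a: str
    b: str
    c: str


@dataclasses.dataclass
class B:
    prop: str
    attr: List[A]


test = [
    B("z", [A("a", "b", "c"), A("d", "l", "s")]),
    B("a", [A("s", "v", "c")]),
]

# Single pass over the data: each entry pairs the parent's position
# with the flattened row built from the parent and one child.
rows = [
    (i, {"prop": parent.prop, **vars(child)})
    for i, parent in enumerate(test)
    for child in parent.attr
]

# Split the pairs into row dicts and index labels.
df = pd.DataFrame(
    [row for _, row in rows],
    index=[i for i, _ in rows],
)
print(df)
```

The column order follows the dict's insertion order (prop first, then the fields of A), so no reordering step is needed here.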