Object to pandas dataframe

Time:11-08

I have a list of objects like in the test variable below:

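import dataclasses
from typing import List

import pandas as pd

@dataclasses.dataclass
class A:
    # The sample data below passes strings, so the fields are annotated as str.
    a: str
    b: str
    c: str

@dataclasses.dataclass
class B:
    prop: str
    attr: List["A"]

test = [
    B("z", [A('a', 'b', 'c'), A('d', 'l', 's')]),
    B("a", [A('s', 'v', 'c')]),
]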

I want to transform it into a pandas DataFrame like this:

   prop a   b   c
0   z   a   b   c
0   z   d   l   s
1   a   s   v   c

I can do it in several steps, but that seems unnecessary and inefficient, since I go through the same data multiple times:

a = pd.DataFrame([obj.__dict__ for obj in test])
a
    prop    attr
0   z   [A(a='a', b='b', c='c'), A(a='d', b='l', c='s')]
1   a   [A(a='s', b='v', c='c')]

b = a.explode('attr')
b
    prop    attr
0   z   A(a='a', b='b', c='c')
0   z   A(a='d', b='l', c='s')
1   a   A(a='s', b='v', c='c')

b[["a", "b", "c"]] = b.apply(lambda x: [x.attr.a, x.attr.b, x.attr.c], axis=1, result_type="expand")
b

    prop    attr    a   b   c
0   z   A(a='a', b='b', c='c')  a   b   c
0   z   A(a='d', b='l', c='s')  d   l   s
1   a   A(a='s', b='v', c='c')  s   v   c

Can this be done more efficiently, ideally in a single pass?

CodePudding user response:

Use a combination of dataclasses.asdict and pd.json_normalize, passing 'attr' as the record path and 'prop' as metadata to repeat on each row:

In [59]: pd.json_normalize([dataclasses.asdict(k) for k in test], 'attr', ['prop'])
Out[59]:
   a  b  c prop
0  a  b  c    z
1  d  l  s    z
2  s  v  c    a
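
For reference, here is a self-contained sketch of the same approach that can be run as a script. It assumes the dataclass fields are strings, as in the sample data. The variable name `records` and the final column reordering are additions for illustration, not part of the answer above.

```python
import dataclasses
from typing import List

import pandas as pd


@dataclasses.dataclass
class A:
    a: str
    b: str
    c: str


@dataclasses.dataclass
class B:
    prop: str
    attr: List[A]


test = [
    B("z", [A("a", "b", "c"), A("d", "l", "s")]),
    B("a", [A("s", "v", "c")]),
]

# asdict recurses into nested dataclasses and lists, producing plain dicts.
records = [dataclasses.asdict(obj) for obj in test]

# record_path="attr" emits one row per nested A;
# meta=["prop"] repeats the parent's field on each of those rows.
df = pd.json_normalize(records, record_path="attr", meta=["prop"])

# Assumption: move "prop" to the front to match the layout in the question.
df = df[["prop", "a", "b", "c"]]
print(df)
```

Note that json_normalize builds a fresh 0..n-1 index, so the repeated index from the question (0, 0, 1) is not preserved.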

CodePudding user response:

Another version, building one dict per nested item in a single generator expression:

df = pd.DataFrame({"prop": b.prop, **a.__dict__} for b in test for a in b.attr)

Result:

  prop  a  b  c
0    z  a  b  c
1    z  d  l  s
2    a  s  v  c
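
If the repeated index from the question (0, 0, 1) matters, one possible variation is to carry the parent's position along with each row. This is a sketch only: the names `rows` and `df` and the use of enumerate are assumptions for illustration, not part of the answer above.

```python
import dataclasses
from typing import List

import pandas as pd


@dataclasses.dataclass
class A:
    a: str
    b: str
    c: str


@dataclasses.dataclass
class B:
    prop: str
    attr: List[A]


test = [
    B("z", [A("a", "b", "c"), A("d", "l", "s")]),
    B("a", [A("s", "v", "c")]),
]

# One pass over the data: pair each flattened row with its parent's position.
rows = [
    (i, {"prop": obj.prop, **dataclasses.asdict(item)})
    for i, obj in enumerate(test)
    for item in obj.attr
]

# Split the pairs into an explicit index and the row dicts.
df = pd.DataFrame(
    [row for _, row in rows],
    index=[i for i, _ in rows],
)
print(df)
```

Because "prop" is the first key in each dict, the columns come out in the order prop, a, b, c without further reordering.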