Create Pandas Dataframe from a List of Objects

Time:08-10

I have a list of objects that I would like to use to create a Pandas dataframe.

I'm new to Pandas but I thought this would work:

df = pd.DataFrame.from_records(ddf_list, columns=['fixtureID', 'marketID', 'selectionID', 'competitorID'])

But it's not even close.

I can't find much information on doing this. Is it even possible? If so, how?

List:

ddf_list = [
    SelectionsForMarket(source_fixture_id='109040945', source_market_id='101609040945', source_market_type_id='10160',
                        source_selection_id='10904094552946019', trading_status='NonRunner', name='Hulton Ranger',
                        competitor_id='1052946019', ut=datetime.datetime(2022, 8, 9, 6, 30, 42, 149510), order=0,
                        max_price=-1.0, prices=[]),
    SelectionsForMarket(source_fixture_id='109040945', source_market_id='101609040945', source_market_type_id='10160',
                        source_selection_id='10904094552600648', trading_status='Trading', name='Yeeeaah',
                        competitor_id='1052600648', ut=datetime.datetime(2022, 8, 9, 6, 30, 42, 149510), order=0,
                        max_price=3.25, prices=[Price: 3.25, Bookmakers #:2, Price: 3.00, Bookmakers #:8]),
    SelectionsForMarket(source_fixture_id='109040945', source_market_id='101609040945', source_market_type_id='10160',
                        source_selection_id='10904094553052373', trading_status='Trading', name='Helm Princess',
                        competitor_id='1053052373', ut=datetime.datetime(2022, 8, 9, 6, 30, 42, 149510), order=0,
                        max_price=3.75, prices=[Price: 3.75, Bookmakers #:8, Price: 3.40, Bookmakers #:1]),
]

Desired output:

fixtureID     marketID  selectionID        competitorID
101609040945  10160     10904094552946000  1052946019
101609040945  10160     10904094552600600  1052600648
101609040945  10160     10904094553052300  1053052373

CodePudding user response:

Build a list of dicts from the attributes of each object, then pass that list to the DataFrame constructor. The attributes live directly on each object, so read them as x.source_fixture_id and so on:

L = [{'fixtureID': x.source_fixture_id,
      'marketID': x.source_market_id,
      'selectionID': x.source_selection_id,
      'competitorID': x.competitor_id} for x in ddf_list]

df = pd.DataFrame(L)
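
If you want to try this without the original class, here is a minimal runnable sketch of the same idea. The Selection class and the column_to_attr mapping are hypothetical stand-ins I made up for illustration; only the attribute names and ID values come from the question.

```python
import pandas as pd


class Selection:
    """Hypothetical stand-in for SelectionsForMarket with only four attributes."""

    def __init__(self, source_fixture_id, source_market_id,
                 source_selection_id, competitor_id):
        self.source_fixture_id = source_fixture_id
        self.source_market_id = source_market_id
        self.source_selection_id = source_selection_id
        self.competitor_id = competitor_id


items = [
    Selection("109040945", "101609040945", "10904094552946019", "1052946019"),
    Selection("109040945", "101609040945", "10904094552600648", "1052600648"),
]

# Output column name -> attribute to read from each object (assumed mapping).
column_to_attr = {
    "fixtureID": "source_fixture_id",
    "marketID": "source_market_id",
    "selectionID": "source_selection_id",
    "competitorID": "competitor_id",
}

# One dict per object; getattr reads the attribute by name.
rows = [
    {col: getattr(obj, attr) for col, attr in column_to_attr.items()}
    for obj in items
]

df = pd.DataFrame(rows)
print(df)
```

The IDs stay as strings here. That is probably what you want for the selection IDs, since they are too large to be represented exactly as floats.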

CodePudding user response:

You can convert your class to a dataclass. Each instance then keeps its fields in its __dict__, so all you have to do is loop over the list, take the dict of each object, and build the DataFrame from those dicts.

Here's how you can do it.

from dataclasses import dataclass
import datetime
from typing import List, Dict
import pandas as pd

@dataclass
class SelectionsForMarket:
    source_fixture_id: str
    source_market_id: str
    source_market_type_id: str
    source_selection_id: str
    trading_status: str
    name: str
    competitor_id: str
    ut: datetime.datetime
    order: int
    max_price: float
    prices: List[Dict]


ddf_list = [
    SelectionsForMarket(source_fixture_id='109040945', 
                        source_market_id='101609040945', 
                        source_market_type_id='10160', 
                        source_selection_id='10904094552946019', 
                        trading_status='NonRunner', 
                        name='Hulton Ranger', 
                        competitor_id='1052946019', 
                        ut=datetime.datetime(2022, 8, 9, 6, 30, 42, 149510), 
                        order=0, 
                        max_price=-1.0, 
                        prices=[]), 
    SelectionsForMarket(source_fixture_id='109040945', 
                        source_market_id='101609040945', 
                        source_market_type_id='10160', 
                        source_selection_id='10904094552600648', 
                        trading_status='Trading', 
                        name='Yeeeaah', 
                        competitor_id='1052600648', 
                        ut=datetime.datetime(2022, 8, 9, 6, 30, 42, 149510), 
                        order=0, 
                        max_price=3.25, 
                        prices=[{"Price": 3.25, "Bookmakers": 2},
                                {"Price": 3.00, "Bookmakers": 8}]),
    SelectionsForMarket(source_fixture_id='109040945', 
                        source_market_id='101609040945', 
                        source_market_type_id='10160', 
                        source_selection_id='10904094553052373', 
                        trading_status='Trading', 
                        name='Helm Princess', 
                        competitor_id='1053052373', 
                        ut=datetime.datetime(2022, 8, 9, 6, 30, 42, 149510), 
                        order=0, 
                        max_price=3.75, 
                        prices=[{"Price": 3.75, "Bookmakers": 8},
                                {"Price": 3.40, "Bookmakers": 1}])]


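# Keep only the columns needed for the output.
required_columns = ["source_fixture_id", "source_market_id", "source_selection_id", "competitor_id"]
# x.__dict__ holds the field values of each dataclass instance.
data = pd.DataFrame([x.__dict__ for x in ddf_list])[required_columns]
print(data)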
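
As a possible shortcut, recent pandas versions (1.1 and later, as far as I know) accept a list of dataclass instances directly in the DataFrame constructor, so the explicit loop over __dict__ can be skipped. The sketch below also renames the columns to the names asked for in the question. The trimmed-down Selection dataclass and the rename_map dict are hypothetical, for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List

import pandas as pd


@dataclass
class Selection:
    """Hypothetical, shortened stand-in for SelectionsForMarket."""
    source_fixture_id: str
    source_market_id: str
    source_selection_id: str
    competitor_id: str
    prices: List[Dict]


selections = [
    Selection("109040945", "101609040945", "10904094552946019", "1052946019", []),
    Selection("109040945", "101609040945", "10904094552600648", "1052600648",
              [{"Price": 3.25, "Bookmakers": 2}]),
]

# Dataclass field name -> column name wanted in the output (assumed mapping).
rename_map = {
    "source_fixture_id": "fixtureID",
    "source_market_id": "marketID",
    "source_selection_id": "selectionID",
    "competitor_id": "competitorID",
}

# The constructor turns each dataclass instance into one row,
# then we keep the wanted columns and rename them.
df = pd.DataFrame(selections)[list(rename_map)].rename(columns=rename_map)
print(df)
```

If your pandas version is older, the list comprehension over __dict__ shown above should still work.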