Home > Software design >  How to speed up python and sqlalchemy?
How to speed up python and sqlalchemy?

Time:03-01

The model in my source code is in the format below. Array in dict Array in dict Array in dict...

# data structure
user_list = [{user_name: 'A', 
  email: '[email protected]', 
  items:[{name:'a_itme1', properties:[{1....},{2....}...]}
 ]} * 100]

I'm trying to put the above data into a postgresql db with SQLAlchemy. There is a user table, an entity table, and an attribute table. And there are tables that link users and items, and items and properties respectively.

for u in user_list:
  new_user = User(user_name=u.get('user_name'),....)
  session.add(new_user)
  session.flush()
  for item in u.get('items'):
    new_item = Item(name=item.get('name'),.....)
    session.add(new_item)
    session.flush()
    new_item_link = UserItemLink(user_id=new_user.id, item_id=new_item.id,...)
    session.add(new_item_link)
    session.flush()
    for prop in item.properties:
      new_properties = Properties(name=prop.get('name'),...)
      session.add(new_properties)
      session.flush()
      new_prop_link = ItemPropLink(item_id=new_item.id, prop_id=new_properties.id,...)
      session.add(new_prop_link)
      session.flush()
session.commit()

My models look like this:

class User(Base):
    __tablename__ = 'user'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(20))
    email = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='user')

class Item(Base):
    __tablename__ = 'item'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    note = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='item')

class Properties(Base):
    __tablename__ = 'properties'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    value = Column(String(50))

    item_prop_link = relationship('ItemPropLink', back_populates='properties')

class UserItemLink(Base):
    __tablename__ = 'user_item_link'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    user_id = Column(ForeignKey('db.user.id'), nullable=False)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)

The above sources have been simplified for better understanding. When session.add() is performed sequentially with the above information, it takes a lot of time. When 100 user information is input, there is a delay of 8 seconds or more.

Please advise to improve python speed and sqlalchemy speed.

CodePudding user response:

As you have relationships configured on the models you can compose complex objects using these relationships instead of relying on ids:

with Session.begin() as s:
    for u in user_list:
        user_item_links = []
        for item in u.get('items'):
            item_prop_links = []
            for prop in item['properties']:
                item_prop_link = ItemPropLink()
                item_prop_link.properties = Properties(name=prop.get('name'), value=prop.get('value'))
                item_prop_links.append(item_prop_link)
            item = Item(name=item.get('name'), item_prop_link=item_prop_links)
            user_item_link = UserItemLink()
            user_item_link.item = item
            user_item_links.append(user_item_link)
        new_user = User(name=u.get('user_name'), email=u.get('email'), user_item_link=user_item_links)
        s.add(new_user)

SQLAlchemy will automatically set the foreign keys when the session is flushed at commit time, removing the need to manually flush.

CodePudding user response:

@snakecharmerb share part of the model definition.

class User(Base):
    __tablename__ = 'user'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(20))
    email = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='user')

class Item(Base):
    __tablename__ = 'item'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    note = Column(String(50))

    user_item_link = relationship('UserItemLink', back_populates='item')

class Properties(Base):
    __tablename__ = 'properties'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    value = Column(String(50))

    item_prop_link = relationship('ItemPropLink', back_populates='properties')

class UserItemLink(Base):
    __tablename__ = 'user_item_link'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    user_id = Column(ForeignKey('db.user.id'), nullable=False)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)

    user = relationship('User', back_populates='user_item_link')
    item = relationship('Item', back_populates='user_item_link')

class ItemPropLink(Base):
    __tablename__ = 'item_prop_link'

    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)
    properties_id = Column(ForeignKey('db.properties.id'), nullable=False)

    item = relationship('Item', back_populates='item_prop_link')
    properties = relationship('Properties', back_populates='item_prop_link')
  • Related