Why does FastAPI take upwards of 10 minutes to insert 100,000 rows into a SQL database


I've tried using SQLAlchemy as well as raw mysql.connector here, but committing an insert into a SQL database from FastAPI takes forever.

I wanted to make sure it wasn't just my DB, so I tried it on a local script and it ran in a couple seconds.

How can I work with FastAPI to make this query possible?

Thanks!

'''

@router.post('/')
def postStockData(data: List[pydanticModels.StockPrices], raw_db=Depends(get_raw_db)):
  
    # get_raw_db yields a (cursor, connection) pair
    cursor = raw_db[0]
    cnxn = raw_db[1]

    # i = 0
    # for row in data:
    #   if i % 10 == 0:
    #     print(i)
    #     db.flush()
    #   i += 1
    #   db_pricing = models.StockPricing(**row.dict())
    #   db.add(db_pricing)
    # db.commit()
    SQL = "INSERT INTO "   models.StockPricing.__tablename__   " VALUES (%s, %s, %s)"
    print(SQL)

    valsToInsert = []
    for row in data:
      rowD = row.dict()
      valsToInsert.append((rowD['date'], rowD['symbol'], rowD['value']))
    cursor.executemany(SQL, valsToInsert)
    cnxn.commit()

    return {'message':'Pricing Updated'}

'''

CodePudding user response:

You are killing performance because you are taking an "RBAR" (Row By Agonizing Row) approach, which is not suitable for an RDBMS... You use a loop and execute an SQL INSERT of only one row at a time... When the RDBMS receives a query, the sequence of execution is the following:

  • authenticating the user who issued the query
  • parsing the string to verify the syntax
  • looking up metadata (tables, columns, datatypes...)
  • checking which operations on those tables and columns the user is granted
  • creating an execution plan that sequences all the operations needed for the query
  • setting up locks for concurrency
  • executing the query (inserting only 1 row)
  • returning an error or an OK message

Every step consumes time... and you repeat all these steps 100,000 times because of your loop.
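As a rough sketch (the table name, column names and connection details below are placeholders, not the poster's actual schema), the same rows can be sent in a single multi-row INSERT so the server goes through the steps above only once:

'''
import mysql.connector

# Placeholder connection details -- substitute your own.
cnxn = mysql.connector.connect(host="localhost", user="user",
                               password="secret", database="stocks")
cursor = cnxn.cursor()

rows = [("2021-09-01", "AAPL", 152.51),
        ("2021-09-01", "MSFT", 301.83),
        ("2021-09-01", "GOOG", 2916.84)]

# One statement with many value groups: authenticate, parse, plan,
# lock and commit once instead of once per row.
placeholders = ", ".join(["(%s, %s, %s)"] * len(rows))
sql = "INSERT INTO stock_pricing VALUES " + placeholders
flat_params = [value for row in rows for value in row]

cursor.execute(sql, flat_params)
cnxn.commit()
'''

For very large batches you may need to split the rows into chunks so each statement stays under the server's max_allowed_packet limit.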

Usually, when inserting many rows into a table, there is just one query to run, even if the INSERT concerns 10,000,000,000 rows coming from a file!
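For example (a sketch only: the file path, table name and column names are assumptions, and LOAD DATA LOCAL INFILE must be enabled on both the client and the server), loading an entire file with one statement looks like this:

'''
import csv
import mysql.connector

rows = [("2021-09-01", "AAPL", 152.51),
        ("2021-09-01", "MSFT", 301.83)]

# Dump the rows to a CSV file once...
with open("/tmp/stock_prices.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

cnxn = mysql.connector.connect(host="localhost", user="user",
                               password="secret", database="stocks",
                               allow_local_infile=True)
cursor = cnxn.cursor()

# ...then load the whole file server-side with a single statement.
cursor.execute(
    "LOAD DATA LOCAL INFILE '/tmp/stock_prices.csv' "
    "INTO TABLE stock_pricing "
    "FIELDS TERMINATED BY ',' "
    "(date, symbol, value)"
)
cnxn.commit()
'''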
