I've tried using SQLAlchemy as well as raw mysql.connector here, but committing an insert into a SQL database from FastAPI takes forever.
I wanted to make sure it wasn't just my DB, so I tried the same insert from a local script and it ran in a couple of seconds.
How can I work with FastAPI to make this query run in a reasonable time?
Thanks!
'''
@router.post('/')
def postStockData(data: List[pydanticModels.StockPrices], raw_db=Depends(get_raw_db)):
    # raw_db is a (cursor, connection) pair provided by the dependency
    cursor = raw_db[0]
    cnxn = raw_db[1]

    # Earlier attempt: adding ORM rows one at a time (very slow)
    # i = 0
    # for row in data:
    #     if i % 10 == 0:
    #         print(i)
    #         db.flush()
    #     i += 1
    #     db_pricing = models.StockPricing(**row.dict())
    #     db.add(db_pricing)
    # db.commit()

    SQL = "INSERT INTO " + models.StockPricing.__tablename__ + " VALUES (%s, %s, %s)"
    print(SQL)

    valsToInsert = []
    for row in data:
        rowD = row.dict()
        valsToInsert.append((rowD['date'], rowD['symbol'], rowD['value']))

    cursor.executemany(SQL, valsToInsert)
    cnxn.commit()
    return {'message': 'Pricing Updated'}
'''
CodePudding user response:
You are killing performance because you are using an "RBAR" (row by agonizing row) approach, which is not suitable for an RDBMS... You use a loop and execute an SQL INSERT of only one row at a time. Every time the RDBMS receives a query, the sequence of execution is the following:
- is the user that issued the query authenticated?
- parsing the string to verify the syntax
- looking up metadata (tables, columns, datatypes...)
- checking which operations on those tables and columns the user is granted
- creating an execution plan to sequence all the operations needed for the query
- setting up locks for concurrency
- executing the query (inserting only 1 row)
- returning an error or an OK message
Every one of these steps consumes time... and you are repeating all of them 100,000 times because of your loop.
Usually, when inserting many rows into a table, there is just one query to run, even if the INSERT concerns 10,000,000,000 rows coming from a file!
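To make that concrete, here is a minimal sketch with mysql.connector that sends the whole batch as one statement. The StockPricing table and the (date, symbol, value) columns are taken from the question; the connection settings and sample rows are placeholders, not something from the original post. Building a single INSERT ... VALUES (...), (...), ... and executing it once means the parse/plan/lock steps listed above are paid once for the whole batch instead of once per row.
'''
import mysql.connector

# Placeholder connection settings -- adjust to your own database.
cnxn = mysql.connector.connect(host='localhost', user='user',
                               password='password', database='stocks')
cursor = cnxn.cursor()

# Example batch; in the FastAPI endpoint this would come from the request body.
rows = [
    ('2021-01-04', 'AAPL', 129.41),
    ('2021-01-04', 'MSFT', 217.69),
    ('2021-01-04', 'GOOG', 1728.24),
]

# One VALUES group per row, all inside a single INSERT statement.
placeholders = ', '.join(['(%s, %s, %s)'] * len(rows))
SQL = 'INSERT INTO StockPricing (date, symbol, value) VALUES ' + placeholders

# Flatten the row tuples into one parameter list and execute the query once.
params = [value for row in rows for value in row]
cursor.execute(SQL, params)
cnxn.commit()

cursor.close()
cnxn.close()
'''
cursor.executemany() with a plain INSERT is a reasonable alternative, since Connector/Python rewrites it into a multi-row INSERT under the hood; very large batches may need to be chunked to stay under max_allowed_packet. And if the data really does come from a file, MySQL's LOAD DATA [LOCAL] INFILE loads everything with a single statement.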