I have a data engineering pipeline that runs nightly scans of data tables to populate a main table with calculations for the frontend clients to view. The pipeline takes around 6 hours to complete.
Instead of deleting the main table and writing the calculations over a 6-hour period, I write the calculations to a temporary table (so the main table still has the prior day's calculations while today's calculations are being generated).
Once that is complete, I want the Django ORM to delete all rows from the main table and insert all of the new calculations from the temp table into the main table, in a single transaction.
Is there an efficient way using the Django ORM to do this? Thank you!
Also note that the temporary model and the main model have exactly the same fields.
CodePudding user response:
You can create a Python script inside the same application as the models. The script loops over the rows of the temporary table, reads each row's field values, and creates the corresponding row in the main table.
Because the two models have exactly the same fields, and presumably the same validators, you don't need to spell out the field names one by one when saving to the main table: build each main-table row directly from the temporary row's values and save it.
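Alternatively, you can let the database do the copy. Run inside a single transaction, the equivalent SQL is:
DELETE FROM <maintable>;
INSERT INTO <maintable>
SELECT * FROM <temptable>;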