What is the best method to update small records from a big database?


I have a database which contains all the data from a CSV that I download from a botanic website, and I want to keep it updated whenever a new version of this CSV comes out. My problem is that the CSV is extremely large: it contains at least 1 million records, and saving everything takes an entire hour on average. Any ideas on how I can perform this update without rewriting the entire thing? TYSM!

Some extra info:

  • It's a Node.js project, and I'm using Prisma as the ORM
  • It's actually a .txt file which I parse as a CSV, since the data is tab-separated
  • I'm using PostgreSQL
  • Currently I'm dropping the table and re-inserting everything from scratch

CodePudding user response:

You don't mention how you load the data into postgres, but you could use the COPY command, something like this:

COPY mytable FROM '/tmp/myfile.csv' (FORMAT csv, HEADER true);

It should be much faster than inserting rows one at a time through the ORM.

https://www.postgresql.org/docs/current/sql-copy.html
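
Since your file is tab-separated, a variant like this should work (assuming the file is readable by the PostgreSQL server process; the table name and path are placeholders):

COPY mytable FROM '/tmp/myfile.txt' (FORMAT csv, DELIMITER E'\t', HEADER true);

If the file only exists on your application machine, psql's \copy, or COPY ... FROM STDIN through your database driver, streams it over the connection instead of reading from the server's disk.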

Another option is to use file_fdw, if you only need to query the data and never update it:

https://www.postgresql.org/docs/current/file-fdw.html
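
A minimal file_fdw setup might look like the following sketch (the column names here are invented for illustration; mirror your CSV's actual columns, and note the file must live where the database server can read it):

CREATE EXTENSION file_fdw;

CREATE SERVER plant_files FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE plants_csv (
    id     text,
    name   text,
    family text
) SERVER plant_files
OPTIONS (filename '/tmp/myfile.txt', format 'csv', delimiter E'\t', header 'true');

Queries against plants_csv then read the file directly, so there is nothing to reload when a new version arrives.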

CodePudding user response:

Essentially, you need to diff the two CSV versions, identify the records that have changed, and update only those.

You could use any text/CSV diffing npm library to find the differences, and once you have the set of changed records, issue update queries for just those rows. Alternatively, you can let PostgreSQL do the comparison for you, as sketched below.
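
A rough sketch of the database-side approach: COPY the new file into a staging table, then merge it into the live table. This assumes a hypothetical plants table with a unique id column and two payload columns (adjust the names to your schema):

BEGIN;

-- Load the new file into an empty staging table with the same shape.
CREATE TEMP TABLE plants_staging (LIKE plants INCLUDING ALL);
COPY plants_staging FROM '/tmp/myfile.txt' (FORMAT csv, DELIMITER E'\t', HEADER true);

-- Insert new rows; update existing rows only when their data actually changed.
INSERT INTO plants
SELECT * FROM plants_staging
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name,
    family = EXCLUDED.family
WHERE (plants.name, plants.family)
      IS DISTINCT FROM (EXCLUDED.name, EXCLUDED.family);

-- Remove rows that no longer appear in the new file.
DELETE FROM plants p
WHERE NOT EXISTS (SELECT 1 FROM plants_staging s WHERE s.id = p.id);

COMMIT;

Only the rows that actually differ get written, which should be far cheaper than dropping the table and rewriting all million records.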
