I have a table that loads data from an SAP file containing hundreds of thousands of records. Every day a third party uploads a new version of this file (which does not mean all records changed), and I still load the whole file every day.
To update the data, I currently delete everything and then insert everything again. I think this will become a problem as the data grows, and I would like to avoid it.
For reasons outside my control, DROP and TRUNCATE are not an option.
CodePudding user response:
You can set up something with schema swapping and a bulk insert or SSIS.
- Have three schemas: one to read from, one to load into, and one to swap through.
- Truncate the table in the load schema, or drop and recreate it.
- Bulk insert the file, or load it with SSIS, into the load schema.
- After the load completes, transfer the table from the read schema to the swap schema.
- Transfer the table from the load schema to the read schema.
- Transfer the table from the swap schema to the load schema.
This way you never block readers while you are loading; they only ever see a fully loaded table. Alternatively, you can set up a full ETL pipeline with staging tables. A minimal T-SQL sketch of the swap is below.
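A minimal sketch of the load-and-swap, assuming three schemas `rd` (read), `ld` (load), and `sw` (swap), and a table `SapRecords` that already exists with identical definitions in `rd` and `ld`. The schema names, table name, file path, and delimiter options are all placeholders you would adapt to your setup:

```sql
-- 1. Empty the load-side table and bulk insert the new file.
--    Readers are untouched: they query rd.SapRecords the whole time.
TRUNCATE TABLE ld.SapRecords;

BULK INSERT ld.SapRecords
FROM 'C:\imports\sap_export.csv'   -- assumed location of the daily file
WITH (FIELDTERMINATOR = ';', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- 2. Rotate the table through the schemas in one short transaction,
--    so readers switch from the old copy to the new one atomically.
BEGIN TRANSACTION;
    ALTER SCHEMA sw TRANSFER rd.SapRecords;  -- read -> swap (old data)
    ALTER SCHEMA rd TRANSFER ld.SapRecords;  -- load -> read (new data)
    ALTER SCHEMA ld TRANSFER sw.SapRecords;  -- swap -> load (reused next run)
COMMIT TRANSACTION;
```

`ALTER SCHEMA ... TRANSFER` is a metadata-only operation, so the swap itself takes only a brief lock; the heavy bulk insert happens entirely on a table no reader touches.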
Article on this: https://www.sqlservercentral.com/articles/keeping-tables-online-during-loading-with-schema-swapping-using-ssis
Resources on bulk copy with .net: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy?view=dotnet-plat-ext-6.0#examples
CodePudding user response:
If this process runs at a regular, specified hour, you could write a script that wraps those queries and execute it as a cron job. SSIS can also do this for you. It depends on how much effort you want to put into it.
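On SQL Server specifically, the usual equivalent of a cron job is a SQL Server Agent job. A minimal sketch using the documented `msdb` stored procedures, assuming Agent is available and that the load logic is wrapped in a hypothetical procedure `dbo.usp_LoadSapFile`; the job name, database name, and schedule are placeholders:

```sql
USE msdb;
GO

-- Create the job.
EXEC dbo.sp_add_job
    @job_name = N'Daily SAP file load';

-- Single step that runs the load script.
EXEC dbo.sp_add_jobstep
    @job_name      = N'Daily SAP file load',
    @step_name     = N'Bulk insert and schema swap',
    @subsystem     = N'TSQL',
    @command       = N'EXEC dbo.usp_LoadSapFile;',  -- assumed wrapper procedure
    @database_name = N'YourDatabase';               -- placeholder database

-- Run it every day at 02:00 (active_start_time is encoded as HHMMSS).
EXEC dbo.sp_add_jobschedule
    @job_name          = N'Daily SAP file load',
    @name              = N'Daily at 02:00',
    @freq_type         = 4,        -- 4 = daily
    @freq_interval     = 1,        -- every 1 day
    @active_start_time = 020000;

-- Attach the job to the local server so it actually runs.
EXEC dbo.sp_add_jobserver
    @job_name = N'Daily SAP file load';
GO
```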