Home > OS >  Azure Synapse - Is there a way to import data from an ODBC source in batches?
Azure Synapse - Is there a way to import data from an ODBC source in batches?

Time:01-06

We are attempting to import data from Netsuite into Azure Synapse Analytics via SuiteAnalytics Connect (ODBC) - this works on all but one table (transactionLine) where the connection always times out at 2 hours

From speaking to Oracle it appears the the two hour limit is fixed and cannot be increased.

I have a Copy Data activity set up in a pipeline, when I set "Use Query" to "Table" it times out but when I set it to "Query" and limit it to a small number of rows (e.g. 100) it works fine (see screenshot)

Is there a built in way to import this data in batches, say 1000 rows at a time, without creating dozens of separate Copy Data activities?

Copy data limit to 100 rows

CodePudding user response:

For your pipeline to work when the large no of records or to create a Batches to import data you can use the following approach.

  • If there are large no of records and you want to batch them in 1000 rows first use the lookup activity and the get the count of all records and store it in the set variable with below dynamic expression or if you know the number you and directly add it in set variable activity.
#to get ftom lookup
@string(activity('Lookup1').output.count)

#to add manually
@string(5000)

enter image description here

  • I took another variable to set the starting of records processed as start.
@string(1)

enter image description here

  • I took another variable to set the ending of records processed as end.
@string(1000)

enter image description here

  • Now use until activity to process until the end value is greater than left.
@greater(variables('end'),variables('left'))

enter image description here

  • Now, use the execute pipeline activity and pass the variable values to that pipeline parameters and with help of this parameters you can query the data and import it in batches.

enter image description here

  • Now we have to update the values of start and end where we increment the value of start and and also increment the value of end.
  • Since we can't self-reference a variable, we have to take a temporary reference variable and then update it.
  • To update the value of start, I have taken as changestart variable with following value.
@string(add(int(variables('start')),1000))

enter image description here

  • Assign the above changestart value to start variable.

enter image description here

  • Follow similarly for end as well. The following is a debug output when there are 5000 records.

enter image description here

  • Related