Azure Data Factory REST API returns invalid JSON file with pagination

Time:01-26

I'm building a pipeline that copies the response from an API into a file in my storage account. The API is paginated, but that part works like a charm and I get all my data from all the pages.

A single response looks something like this:

 {"data": {
  "id": "Something",
  "value": "Some other thing"
}}

The problem is that the copy activity just appends each response to the file, which makes the file invalid JSON, and that is a big problem further down the line. The final output looks like this:

{"data": {
  "id": "22222",
  "value": "Some other thing"
}}
{"data": {
  "id": "33333",
  "value": "Some other thing"
}}

I have tried everything I could think of or find on Google, but nothing changes how the data is appended to the file, and I'm stuck with an invalid JSON file :(

As a backup plan, I'll just make a loop and create a JSON file for each page, but that seems a bit janky and really slow.

Does anyone have an idea or a solution for my problem?

CodePudding user response:

When you copy data from a REST API to Blob storage, the sink writes the data as a set of objects by default, that is, one JSON object after another with no enclosing array.

Example:

Sample source data:
{ "time": "2015-04-29T07:12:20.9100000Z", "callingimsi": "466920403025604"}

Sink data (default file pattern):
{"time":"2015-04-29T07:12:20.9100000Z","callingimsi":"466920403025604"}
{"time":"2015-04-29T07:13:21.0220000Z","callingimsi":"466922202613463"}
{"time":"2015-04-29T07:13:21.4370000Z","callingimsi":"466923101048691"}

Taken as a whole, this file is not valid JSON, because it holds several top-level objects instead of a single value.

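To work around this, set File pattern to Array of objects in the sink settings of the copy activity. The sink then writes all objects inside a single JSON array. In the pipeline JSON, this corresponds to the filePattern value arrayOfObjects under the sink's formatSettings.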
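
If a file has already been written in the set-of-objects pattern, it can still be read downstream without re-running the pipeline. The following is a minimal Python sketch, not part of Data Factory itself: the helper name `parse_concatenated_json` and the sample string are made up for illustration, and it assumes the whole file fits in memory. It uses the standard library's `json.JSONDecoder.raw_decode` to read one object at a time and then re-serializes the result as a single array.

```python
import json


def parse_concatenated_json(text):
    """Parse back-to-back JSON values (hypothetical helper) into a list."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over the
        # newlines and spaces that separate the objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Returns the decoded value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


# Sample input shaped like the appended output from the question.
raw = '''{"data": {
  "id": "22222",
  "value": "Some other thing"
}}
{"data": {
  "id": "33333",
  "value": "Some other thing"
}}'''

records = parse_concatenated_json(raw)
# One valid JSON document: an array holding every object.
as_array = json.dumps(records)
print(as_array)
```

This only repairs files after the fact. Changing the file pattern in the sink avoids the problem at the source.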
