Hello, I'm really new to Azure Data Factory. My input JSON is this:
{
  "name": "Ranjana Sinha",
  "schools": [
    {"schoolName": "ABC Institute", "schoolAddress": "123, XYZ Road"},
    {"schoolName": "AFG Primary", "schoolAddress": "1002, XYZ Road"}
  ]
}
Here I want to find "XYZ " and replace the "Road" that follows it with "Avenue". I have created the pipeline and can successfully copy the data from source to sink. Can someone point me to the functions I should use to modify the data in the process? Any documentation or help is greatly appreciated.
Answer:
Data Factory pipelines do not work directly on the data; rather, they execute activities that perform the operations. You've already done this with the Copy activity, but as you've discovered, it is fairly limited when it comes to changing values along the way.
For inline data manipulation, you'll need to use an activity inside the pipeline with that capability. In this case, you should investigate Data Flow, which executes as a Spark job at runtime and therefore has rich expression capabilities. I don't have an example handy for your specific use case, but the following pattern should work for you:
- Read the JSON as a Source (probably a Dataset with a Schema).
- Use a Derived Column to perform a string replacement.
- Output the result to a Sink.
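To make the intended transformation concrete, here is a minimal Python sketch of the logic the Derived Column step would need to apply to each element of "schools". It is not Data Factory code; the function names (replace_road_after_xyz, transform) are hypothetical and only illustrate the replace-inside-an-array pattern on the sample JSON from the question.

```python
import json
import re


def replace_road_after_xyz(address: str) -> str:
    """Replace 'Road' with 'Avenue' only when it directly follows 'XYZ '."""
    # The lookbehind keeps 'XYZ ' untouched and only swaps the word after it.
    return re.sub(r"(?<=XYZ )Road\b", "Avenue", address)


def transform(record: dict) -> dict:
    """Return a copy of the record with every schoolAddress rewritten."""
    return {
        **record,
        "schools": [
            {**school, "schoolAddress": replace_road_after_xyz(school["schoolAddress"])}
            for school in record.get("schools", [])
        ],
    }


# Sample input taken from the question.
sample = json.loads("""
{
  "name": "Ranjana Sinha",
  "schools": [
    {"schoolName": "ABC Institute", "schoolAddress": "123, XYZ Road"},
    {"schoolName": "AFG Primary", "schoolAddress": "1002, XYZ Road"}
  ]
}
""")

result = transform(sample)
print(json.dumps(result, indent=2))
```

In a Data Flow, the equivalent would most likely be a Derived Column on schools that combines the map() and replace() expression functions, for example something along the lines of map(schools, @(schoolName = #item.schoolName, schoolAddress = replace(#item.schoolAddress, 'XYZ Road', 'XYZ Avenue'))). Treat that expression as an untested assumption and check it against the Data Flow expression reference before relying on it.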