I am looking for a way to merge three separate datasets (.csv format) into one in Azure Synapse and then store the result as a new .csv in Azure Blob Storage. I am using the Union data flow based on this tutorial:
CodePudding user response:
You are getting 108 rows because the union transformation appends the rows of the 3 separate datasets into 1, so the row counts add up (3 × 36 = 108, assuming each file has 36 rows). The video on the union transformation documentation page describes this behavior.
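To get your desired results you need to use the join transformation instead. With CustomerID as your join condition, the datasets are matched row by row on that column, which keeps your row count at 36. A join combines two streams at a time, so chain two joins to bring all three datasets together.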
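To make the difference concrete, here is a minimal sketch in plain Python rather than in the Synapse data flow itself. The three datasets, their column names (`Name`, `OrderCount`, `Region`) and the helper `inner_join` are made up for illustration, and the sketch assumes each file has exactly one row per CustomerID.

```python
# Three hypothetical datasets with 36 rows each, one row per CustomerID.
customer_ids = range(1, 37)
profiles = [{"CustomerID": i, "Name": f"Customer {i}"} for i in customer_ids]
orders = [{"CustomerID": i, "OrderCount": i % 5} for i in customer_ids]
regions = [{"CustomerID": i, "Region": "EU" if i % 2 else "US"} for i in customer_ids]

# Union: rows are stacked on top of each other, so the row counts add up.
unioned = profiles + orders + regions


def inner_join(left, right, key):
    """Match rows on `key` and merge their columns.

    Assumes `key` is unique within `right` (illustrative helper, not a Synapse API).
    """
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Join: two streams at a time, so three datasets need two chained joins.
joined = inner_join(inner_join(profiles, orders, "CustomerID"), regions, "CustomerID")

print(len(unioned))  # 108
print(len(joined))   # 36
```

Each joined row carries the columns of all three inputs, which is the shape the question is after.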
One thing to watch out for is the type of join you choose. If you have customers in one file that are not in another, an inner join will drop those records. This post describes the different types of joins very well. I suggest you get a firm understanding of the different types of joins before picking one.
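The effect of the join type on unmatched customers can be sketched the same way. This is again plain Python with made-up data and a hypothetical `join` helper, not the Synapse join transformation; it assumes CustomerID is unique within each dataset.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; assumes `key` is unique in each list."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    if how == "inner":
        # Keep only keys present on both sides.
        keys = [k for k in left_by_key if k in right_by_key]
    elif how == "left":
        # Keep every key from the left side.
        keys = list(left_by_key)
    elif how == "outer":
        # Keep every key from either side.
        keys = list(left_by_key) + [k for k in right_by_key if k not in left_by_key]
    else:
        raise ValueError(f"unsupported join type: {how}")
    return [{**left_by_key.get(k, {}), **right_by_key.get(k, {})} for k in keys]


# Customer 3 has no orders row; customer 4 has no profile row.
profiles = [
    {"CustomerID": 1, "Name": "A"},
    {"CustomerID": 2, "Name": "B"},
    {"CustomerID": 3, "Name": "C"},
]
orders = [
    {"CustomerID": 1, "OrderCount": 4},
    {"CustomerID": 2, "OrderCount": 0},
    {"CustomerID": 4, "OrderCount": 7},
]

inner_ids = [r["CustomerID"] for r in join(profiles, orders, "CustomerID", "inner")]
left_ids = [r["CustomerID"] for r in join(profiles, orders, "CustomerID", "left")]
outer_ids = [r["CustomerID"] for r in join(profiles, orders, "CustomerID", "outer")]

print(inner_ids)  # [1, 2]
print(left_ids)   # [1, 2, 3]
print(outer_ids)  # [1, 2, 3, 4]
```

The inner join silently loses customers 3 and 4, while the outer join keeps them with the missing columns absent, so comparing the output row count against the input files is a quick check that nothing was dropped.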