Azure Synapse Data Flows - parquet file names not working

Time:12-01

I have created a data flow within Azure Synapse to:

  1. take data from a dedicated SQL pool
  2. perform some transformations
  3. send the resulting output to parquet files

I then create a view over the resulting parquet file using OPENROWSET, so that Power BI can read the data through the built-in serverless SQL pool.

My issue is that whatever file name I enter on the integration record, the parquet files always end up with names like part-00000-2a6168ba-6442-46d2-99e4-1f92bdbd7d86-c000.snappy.parquet, or similar.

Is there a way to have a fixed file name that is overwritten each time the pipeline runs? Alternatively, is there an automated way to update the parquet file that the view refers to each time the pipeline runs?

I am fairly new to this kind of integration, so if there is a better way to achieve this whole thing, please let me know.

CodePudding user response:

  • I reproduced the issue and got the same kind of generated file name.

To get a fixed name for the sink file:

  • Set the sink settings as follows:
File name option: Output to single file
Output to single file: tgtfile (the file name you want)

  • In the Optimize tab, select Single partition.

The output file name then matches the one given in the sink settings.
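
On the second part of the question (keeping the view pointed at the latest file), one option that may be worth trying is to let the view read the whole output folder through a wildcard, since OPENROWSET in the serverless SQL pool accepts patterns such as *.parquet in the BULK path. The sketch below only builds the T-SQL text for such a view. The function name build_view_sql, the view name and the storage URL are hypothetical placeholders, and it assumes the output folder holds only the files from the latest run (for example because the sink clears the folder before writing).

```python
def build_view_sql(view_name: str, folder_url: str) -> str:
    """Build a CREATE VIEW statement that reads every parquet file in a folder.

    Hypothetical helper: it only formats text and does not connect to Synapse.
    """
    # The wildcard matches whatever part-*.parquet name the data flow generated,
    # so the view does not depend on one specific file name.
    bulk_path = folder_url.rstrip("/") + "/*.parquet"
    return (
        f"CREATE VIEW {view_name} AS\n"
        f"SELECT *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{bulk_path}',\n"
        f"    FORMAT = 'PARQUET'\n"
        f") AS [result];"
    )


if __name__ == "__main__":
    # Placeholder view name and storage URL; replace with your own values.
    print(build_view_sql(
        "dbo.OutputView",
        "https://<account>.dfs.core.windows.net/<container>/output",
    ))
```

The generated statement would be run once against the serverless SQL pool. After that, each pipeline run only needs to replace the files in the folder, and the view itself does not have to change.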
