I have autogenerating files into a onprem folder that have column headers like "Gross Margin 10/22/22" (pictures also attached). Is there a way I can pull the date and create two separate columns, one gross margin and another date. I'm trying to do this in azure data factory using dataflow. Dataflow is the only thing I've seen so far that could potentially work. Also, I would like to export it form onprem to blob storage then sql but I have to program the pipeline so it makes these changes before sql otherwise the mapping won't work.
CodePudding user response:
I have created dataflow in Azure data factory. I have used two derived columns to pull date from column header. Image for reference:
I have taken csv file from delimetetext1 dataset. Source of my data flow:
Data preview of source:
I have created derived column using derived column option to the source file named "columnames". In this derived column I have taken all columns in an array using below expression:
columnNames()
Image for reference:
Data preview of derived column1:
I have created another derived column to derived column1 and in that I created 4 derived columns. Date column is for pull date from column header "Gross Margin 10/20/22".
I have used below expression:
split(at(columnnames, 4),' ')[3]
Image for reference:
I created NDate derived column for pull date from column header "Net Sales 10/21/22". I have used
split(at(columnnames,5),' ')[3]
Image for reference:
I created Gross derived column to separate date from column header "Gross Margin 10/20/22". I have used below expression:
substring({Gross Margin 10/20/22}, instr({Gross Margin 10/20/22}, "") 0, 12)
Image for reference
I created Gross derived column to separate date from column header "Net sales 10/21/22". I have used below expression:
substring({Net Sales 10/21/22}, instr({Net Sales 10/21/22}, "") 0, 9)
Image for reference:
Data preview of derived column 2:
I created select and retrieved 8 columns from derived column 2 I have used below expression:
Renaming derivedColumn2 to select1 with columns 'Sales Number, Store Short Name, Net Sales, Sales Trend, Date, NDate, Gross, net'
Image for reference:
Data preview of select:
I connected sink to the select :
Data preview for sink: