Home > OS >  Removing specific rows in an Excel file using Azure Data Factory
Removing specific rows in an Excel file using Azure Data Factory

Time:11-17

I have a set of excel files inside ADLS. The format looks similar to the one below:

enter image description here

The first 4 rows would always be the document header information and the last 3 will be 2 empty rows and the end of the document indicator. The number of rows for the employee information is indefinite. I would like to delete the first 4 rows and the last 3 rows using ADF.

Can any help me with what should be expressions in the Derived column / Select?

CodePudding user response:

My Excel file:

ExcelFile

Source Data set settings (give A5 in range and select first row as header): Preview Next, add a filter transformation with below expression

  !startsWith(sno,'dummy') && sno!=''

this will filter out the rows starting with dummy, in your case, end of document. Also we are ignoring the empty rows by checking sno!=''

Final Preview after filter:

Preview After Filtering footer

  • Related