Home > Blockchain >  Skip first and last x rows in CSV stored in SFTP using Azure Logic App
Skip first and last x rows in CSV stored in SFTP using Azure Logic App

Time:03-15

I have a logic app which uses 2 Compose, the first one skips the first 3 rows

take(skip(variables('CSV Data'),3),sub(length(variables('CSV Data')),1))

and the second one skips the last 2 rows

take(outputs('Skip_Header'),sub(length(outputs('Skip_Header')),2))

This works great when I initialise the CSV Data Variable with an array

[
  "rubbish1,rubbish2,rubbish3",
  "blank1,blank2,blank3",
  "header1,header2,header3",
  "data1,data2,data3",
  "data4,data5,data6",
  "data7,data8,data9"
]

Below is my working Logic App definition:

{
    "definition": {
        "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
        "actions": {
            "Initialize_CSV_Data": {
                "inputs": {
                    "variables": [
                        {
                            "name": "CSV Data",
                            "type": "array",
                            "value": [
                                "rubbish1,rubbish2,rubbish3",
                                "blank1,blank2,blank3",
                                "header1,header2,header3",
                                "data1,data2,data3",
                                "data4,data5,data6",
                                "data7,data8,data9"
                            ]
                        }
                    ]
                },
                "runAfter": {},
                "type": "InitializeVariable"
            },
            "Skip_Footer": {
                "inputs": "@take(outputs('Skip_Header'),sub(length(outputs('Skip_Header')),2))",
                "runAfter": {
                    "Skip_Header": [
                        "Succeeded"
                    ]
                },
                "type": "Compose"
            },
            "Skip_Header": {
                "inputs": "@take(skip(variables('CSV Data'),3),sub(length(variables('CSV Data')),1))",
                "runAfter": {
                    "Initialize_CSV_Data": [
                        "Succeeded"
                    ]
                },
                "type": "Compose"
            }
        },
        "contentVersion": "1.0.0.0",
        "outputs": {},
        "parameters": {},
        "triggers": {
            "manual": {
                "inputs": {},
                "kind": "Http",
                "type": "Request"
            }
        }
    },
    "parameters": {}
}

However that was just for testing, because in reality the data is actually in a CSV format stored in a SFTP, so if I change the initialise variable from array

[
  "rubbish1,rubbish2,rubbish3",
  "blank1,blank2,blank3",
  "header1,header2,header3",
  "data1,data2,data3",
  "data4,data5,data6",
  "data7,data8,data9"
]

to string

rubbish1,rubbish2,rubbish3
blank1,blank2,blank3
header1,header2,header3
data1,data2,data3
data4,data5,data6
data7,data8,data9

Then that will a truer reflection of the actual scenario I have. But what the Logic App does now is that only ignores the first 3 characters and last 2 characters instead of the first 3 rows and last 2 rows. The new definition is below:

{
    "definition": {
        "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
        "actions": {
            "Initialize_CSV_Data": {
                "inputs": {
                    "variables": [
                        {
                            "name": "CSV Data",
                            "type": "string",
                            "value": "rubbish1,rubbish2,rubbish3\nblank1,blank2,blank3\nheader1,header2,header3\ndata1,data2,data3\ndata4,data5,data6\ndata7,data8,data9"
                        }
                    ]
                },
                "runAfter": {},
                "type": "InitializeVariable"
            },
            "Skip_Footer": {
                "inputs": "@take(outputs('Skip_Header'),sub(length(outputs('Skip_Header')),2))",
                "runAfter": {
                    "Skip_Header": [
                        "Succeeded"
                    ]
                },
                "type": "Compose"
            },
            "Skip_Header": {
                "inputs": "@take(skip(variables('CSV Data'),3),sub(length(variables('CSV Data')),1))",
                "runAfter": {
                    "Initialize_CSV_Data": [
                        "Succeeded"
                    ]
                },
                "type": "Compose"
            }
        },
        "contentVersion": "1.0.0.0",
        "outputs": {},
        "parameters": {},
        "triggers": {
            "manual": {
                "inputs": {},
                "kind": "Http",
                "type": "Request"
            }
        }
    },
    "parameters": {}
}

Screenshot below also shows the difference between the 2 Logic Apps, everything is the same except for "Initialize CSV Data": enter image description here

CodePudding user response:

Considering the same sample that you have provided, Im using 3 compose Connectors in order to skip the first Rows. Here is my Logic app

enter image description here

I'm using Split To Get Rows Connector in order to convert the csv file into Arrays. Below is the expression in it

split(body('Get_blob_content_(V2)'),'

')

The above steps creates an extra empty array object In order to remove it Im using the next step Rows Array connector. Below is the expression in it

take(outputs('Split_To_Get_Rows'),sub(length(outputs('Split_To_Get_Rows')),1))

Going further In Skip First Number of Rows Connector is used to skip the first 3 rows. Here is the expression in it

take(skip(outputs('Rows_Array'),3),sub(length(outputs('Rows_Array')),1))

Here is the Final Result:

enter image description here

UPDATED ANSWER


To remove x number of rows of footer then the expression would be

take(outputs('Skip_First_Number_Of_Rows_(Header)'),length(skip(outputs('Skip_First_Number_Of_Rows_(Header)'),0)))

Consider the below to be the array :-

[
rubbish1,rubbish2,rubbish3
blank1,blank2,blank3
header1,header2,header3
data1,data2,data3
sample1,sample2,sample3
data4,data5,data6
rubbish4,rubbish5,rubbish6
footer1,footer2,footer3
]

then here is the result for the mentioned expression

enter image description here

and if you need to skip 2 rows from below then the expression would be

take(outputs('Skip_First_Number_Of_Rows_(Header)'),length(skip(outputs('Skip_First_Number_Of_Rows_(Header)'),1)))

and here will be the result

enter image description here

  • Related