Using ADF, get the Latest Folder based on Timestamp in Folder Name

Time:11-22

Let's say that in ADLS Gen2 there are multiple folders in a RootFolder, where each folder name is a timestamp.

Using Azure Data Factory, how would you get the latest folder based on the folder name (i.e. the folder whose name is the latest timestamp)? I know this could easily be done with Python or a shell script, but how would it be done specifically in ADF?

Example folder structure:

RootFolder
    |- 20210921131200
    |- 20210920120000
    |- 20210801021345

In the above case, it should return the folder 20210921131200, as it has the latest timestamp.

CodePudding user response:

  1. Create two variables (for example, check_date and latest_folder). Assign a sample minimum date value to check_date so that it can be compared with each folder's timestamp; latest_folder will store the final result.


  2. Using a Get Metadata activity with Child items in its field list, get the list of folder names under the RootFolder.

The output of Get Metadata contains a childItems array with one entry (name and type) per folder.

  3. Pass the output of the Get Metadata activity to a ForEach activity as its items:

@activity('Get Metadata1').output.childItems


  4. Inside the ForEach activity, use an If Condition activity to check whether the current folder name is greater than the check_date variable value:

@greater(int(item().name),int(variables('check_date')))


  5. If the condition is true, assign the current item's name to the check_date variable with a Set variable activity. This replaces the sample value (or the previous maximum) with the folder name.


  6. After looping over all the folders, use a Set variable activity to pass the check_date value to the latest_folder variable, which then contains the latest folder name.


After this final Set variable activity runs, the latest_folder variable holds the latest folder name.
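For readers who want to check the logic outside ADF, the steps above can be sketched in Python as a running maximum. This is only an illustration under assumptions: the get_metadata_output dictionary is a hypothetical stand-in for the Get Metadata result, and the initial check_date value is an arbitrary sample minimum, not something taken from the original pipeline. The sketch also assumes the loop processes one item at a time; in ADF that would presumably mean running the ForEach sequentially, since parallel iterations could overwrite check_date out of order.

```python
# Hypothetical stand-in for the Get Metadata output (childItems field only).
get_metadata_output = {
    "childItems": [
        {"name": "20210921131200", "type": "Folder"},
        {"name": "20210920120000", "type": "Folder"},
        {"name": "20210801021345", "type": "Folder"},
    ]
}


def latest_folder_by_loop(child_items, initial_check_date="19000101000000"):
    """Mirror the ForEach + If Condition + Set variable steps."""
    check_date = initial_check_date  # assumed sample minimum value
    for item in child_items:  # ForEach, one item at a time
        # If Condition: @greater(int(item().name), int(variables('check_date')))
        if int(item["name"]) > int(check_date):
            check_date = item["name"]  # Set variable in the true branch
    latest_folder = check_date  # final Set variable after the loop
    return latest_folder


print(latest_folder_by_loop(get_metadata_output["childItems"]))
```

If no folder name exceeds the initial value (for example, when the list is empty), the sketch returns the initial value unchanged, which is worth keeping in mind when choosing the sample minimum.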


CodePudding user response:

You can use a combination of the Get Metadata activity and a loop to find the latest folder.

Example:


Dataset for the parent folder (in your case, RootFolder):

RootFolder

1. Use the Get Metadata activity to list the child folders under the root folder.


2. Pass the returned childItems to a ForEach activity and, inside it, append each folder name as an integer to an array variable (lastmodified in this example).

@activity('Get Metadata1').output.childItems


3. Lastly, take the maximum value in the array to get the latest folder.

@string(max(variables('lastmodified')))    
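As a rough cross-check of this second approach, the same collect-then-max idea can be sketched in Python. The child_items list below is hypothetical sample data shaped like the childItems array, and the function name is made up for illustration; only the lastmodified variable name comes from the expression above.

```python
# Hypothetical sample data shaped like the childItems array from Get Metadata.
child_items = [
    {"name": "20210921131200", "type": "Folder"},
    {"name": "20210920120000", "type": "Folder"},
    {"name": "20210801021345", "type": "Folder"},
]


def latest_folder_by_max(items):
    """Mirror the Append variable loop followed by string(max(...))."""
    # Append variable inside ForEach: collect each folder name as an integer.
    lastmodified = [int(item["name"]) for item in items]
    # @string(max(variables('lastmodified')))
    return str(max(lastmodified))


print(latest_folder_by_max(child_items))
```

Because the maximum is taken only after every name has been collected, the order in which items are appended does not affect the result in this sketch.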

