Lets say in ADLS Gen2 there are multiple Folders in a RootFolder, where Folder name is Timestamp.
Using Azure Data Factory, How would you get the Latest Folder based on Folder Name(ie. where the folder name is having latest timestamp). I know this could be easily done with Python or Shell Script, But How would this be done by specifically
Example -
Folder Structure :
RootFolder
|- 20210921131200
|- 20210920120000
|- 20210801021345
In the above case, It should return Folder 20210921131200 as its the latest Timestamp.
CodePudding user response:
- Create 2 variables (ex: check_date & latest_folder) and assign a sample minimum date value in the check_date variable to compare it with the folder date, and store the result in the other variable latest_folder.
- Using
Get Metadata
activity, get the list of folder names under the RootFolder.
Output of Get Metadata:
- Pass the output of
Get Metadata
activity toForEach
activity.
@activity('Get Metadata1').output.childitems
- Inside
ForEach
activity, usingIf Condition
activity check, the current folder name of ForEach is greater than the check_date variable value.
@greater(int(item().name),int(variables('check_date')))
If condition
is true, then pass the current item to the check_date variable. This will replace the sample value with the folder name.
- After looping all the folders, using Set variable activity, pass the check_date value to the latest_folder variable to get the latest folder name.
Output of Set Metadata2 holds the latest folder value in the latest_folder variable.
CodePudding user response:
You can use a combination of Get Metadata
activity and loops to find this
Example:
Sample folders:
Dataset for ParentFolder (in your case it is RootFolder )
1. use Get Metadata
activity to list the child folders under root folder
2. For each of the received childitems
, inside foreach activity
, append the folder names to a variable as int array.
@activity('Get Metadata1').output.childItems
3. lastly... identify the max value to get the latest folder
@string(max(variables('lastmodified')))