I am trying to find the most efficient way to process files in multiple folders based on a list of allowed files.
I have a list of allowed files that I should process.
The process is as follows:
- val allowedFiles = List("File1.json","File2.json","File3.json")
- Get list of folders in directory. For this I could use:
def getListOfSubDirectories(dir: File): List[String] =
  dir.listFiles
    .filter(_.isDirectory)
    .map(_.getName)
    .toList
- Loop through each folder from step 2 and get all of its files. For this I would use:
def getListOfFiles(dir: String): List[File] = {
  val d = new File(dir)
  if (d.exists && d.isDirectory) {
    d.listFiles.filter(_.isFile).toList
  } else {
    List[File]()
  }
}
- If a file from step 3 is in the list of allowed files, call another method that processes the file.
So I need to loop through the top-level directory, get the files in each folder, check whether each file needs to be processed, and then call another function. I was thinking about a double loop, which would work, but is it the most efficient way? I know that in Scala I should be using recursive functions, but I could not get a doubly recursive function with a call to an extra method to work.
Any ideas welcome.
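The four steps above can be sketched as a single for-comprehension, which is the usual Scala spelling of a double loop. This is only a minimal sketch: `children`, `allowedFilesIn` and `processFile` are hypothetical names, and it assumes the files sit exactly one level below the starting directory.

```scala
import java.io.File

// Hypothetical helper: listFiles returns null when the path is not a
// readable directory, so wrap it to get an empty list instead of an NPE.
def children(dir: File): List[File] =
  Option(dir.listFiles).map(_.toList).getOrElse(Nil)

// Hypothetical stand-in for the real processing method.
def processFile(file: File): Unit =
  println(s"processing ${file.getPath}")

// Steps 2 to 4: sub-folders of root, files in each one, keep the allowed names.
def allowedFilesIn(root: File, allowedFiles: Set[String]): List[File] =
  for {
    folder <- children(root) if folder.isDirectory
    file   <- children(folder) if file.isFile
    if allowedFiles(file.getName)
  } yield file
```

Calling `allowedFilesIn(new File("/DataDir"), allowedFiles.toSet).foreach(processFile)` would then process each match. Converting the list to a `Set` makes each name lookup a hash lookup rather than a scan of the list.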
CodePudding user response:
Files.find() does both the recursive search and the filtering.
import java.nio.file.{Files, Paths, Path}
import scala.jdk.StreamConverters._
def getListOfFiles(dir: String, targets: Set[String]): List[Path] =
  Files.find( Paths.get(dir)
            , 999  // maximum directory depth to descend
            , (p, _) => targets(p.getFileName.toString)
            ).toScala(List)
usage:
val lof = getListOfFiles("/DataDir", allowedFiles.toSet)
But, depending on what kind of processing is required, instead of returning a List
you might just process each file as it is encountered.
import java.nio.file.{Files, Paths, Path}
def processFile(path: Path): Unit = ???
def processSelected(dir: String, targets: Set[String]): Unit =
  Files.find( Paths.get(dir)
            , 999  // maximum directory depth to descend
            , (p, _) => targets(p.getFileName.toString)
            ).forEach(processFile)
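A possible refinement, sketched under a few assumptions: the stream returned by Files.find holds open directory handles until it is closed, the matcher also sees directories, and the question only needs files one level below the starting directory. The name `findAllowed` and the default `maxDepth` of 2 are illustrative choices, not part of the code above, and `scala.util.Using` requires Scala 2.13 or later.

```scala
import java.nio.file.{Files, Path}
import java.nio.file.attribute.BasicFileAttributes
import scala.jdk.StreamConverters._
import scala.util.Using

// Hypothetical variant: depth 0 is dir itself, depth 1 its direct children,
// depth 2 the entries inside its sub-folders.
def findAllowed(dir: Path, targets: Set[String], maxDepth: Int = 2): List[Path] =
  Using.resource(
    Files.find(
      dir,
      maxDepth,
      // Test isRegularFile first so directories (including a root path,
      // whose getFileName is null) never reach the name lookup.
      (p: Path, attrs: BasicFileAttributes) =>
        attrs.isRegularFile && targets(p.getFileName.toString)
    )
  ) { stream =>
    stream.toScala(List) // materialize before Using closes the stream
  }
```

With a depth of 2 this also matches allowed files that sit directly in the starting directory. If that is not wanted, one option is to filter the result on `_.getParent != dir`.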
CodePudding user response:
You can use Files.walk.
The code would look like this (I didn't compile it, so it may have some typos)
import java.nio.file.{Files, Path}
import scala.jdk.StreamConverters._
def getFilesRecursive(initialFolder: Path, allowedFiles: Set[String]): List[Path] =
  Files
    .walk(initialFolder)
    // Compare the name as-is: a lower-cased name would never match mixed-case entries such as "File1.json"
    .filter(path => allowedFiles.contains(path.getFileName.toString))
    .toScala(List)
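If case-insensitive matching is wanted, the allowed names need the same normalization as the file names, because a lower-cased name can never equal an entry such as "File1.json". A minimal sketch, where `normalized` and `isAllowed` are hypothetical names and `Locale.ROOT` is used so the result does not depend on the machine's default locale:

```scala
import java.util.Locale

val allowedFiles = Set("File1.json", "File2.json", "File3.json")

// Lower-case the allowed names once, up front.
val normalized: Set[String] = allowedFiles.map(_.toLowerCase(Locale.ROOT))

// Lower-case each candidate name the same way before the lookup.
def isAllowed(fileName: String): Boolean =
  normalized.contains(fileName.toLowerCase(Locale.ROOT))
```

The filter would then read `.filter(path => isAllowed(path.getFileName.toString))`.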