Get list of files containing string(s) or pattern(s)

Time:02-22

Is there a Gradle pattern for retrieving the list of files in a folder or set of folders that contain a given string, set of strings, or pattern?

My project produces RPMs and is using the Nebula RPM type (great package!). There are a couple of different kinds of sets of files that need post-processing. I am trying to generate the list of files that contain the strings that are the indicators for post-processing. For example, files that contain "@doc" need to be processed by the doc generator script. Files that contain "@HOSTNAME@" and "@HOSTFQDN@" need to be processed by sed to replace the strings with the actual host name or host fqdn.

The search root in the package will be src\main\resources. With the result the build script sets up the post-install script commands - something like:

postInstall('/opt/product/bin/postprocess.sh ' + join(filesContainingDocs, " "))
postInstall('/bin/sed -i -e "s/@HOSTNAME@/$(hostname -s)/" -e "s/@HOSTFQDN@/$(hostname)/" ' + join(filesContainingHostname, " "))

I can figure out the postinstall syntax. I'm having difficulty finding the filter for any of the regular Gradle 'things' (i.e., FileTree) that operate on contents of files rather than names of files. How would I populate filesContainingDocs and filesContainingHostname - something along the lines of:

filesContainingDocs = FileTree('src/main/resources', { contents.matches('@doc') })
filesContainingHostname = FileTree('src/main/resources', { contents.matches('@(HOSTNAME|HOSTFQDN)@') })

While the post-process script could simply do the grep, the several RPMs in our product overlay each other and each RPM should only post-process the files it provides, so a general grep over the final installed folder is not workable - it would catch files provided by other RPMs. It seems to me that I ought to be able to, at build time, produce the correct static list of files from the bigger set of source files that comprise the given RPM's project.

It doesn't have to be FileTree - running a command like findstr /s /m /c:"@doc" src\main\resources\*.conf (alas, the build platform is Windows) produces the answer in stdout but I'm not sure how to get that result into an object Gradle can use to expand the result. (I also suspect there is a 'more Gradle way' to do this.)

The set of files, and the contents of those files, is generally fairly small.

Answer:

"I'm having difficulty finding the filter for any of the regular Gradle 'things' (i.e., FileTree) that operate on contents of files rather than names of files."

You can apply any filter you can imagine to a Gradle file tree; in the end it is just Groovy (or Kotlin) code running in the JVM. Each Gradle FileTree is nothing more than a (lazily evaluated) collection of Java File objects. To filter those File objects, you can read their content in the same way you would in Java. Groovy even provides a JDK enhancement for the Java class File that adds the simple method getText() for this purpose. With it, you can filter for files that contain a certain string by calling filter on the file tree:

// filter is called on the file tree itself and returns a new, lazily evaluated file collection
def filesContainingDocs = fileTree('src/main/resources').filter { file ->
    file.text.contains('@doc')
}

Using Groovy, you can call getters like .getText() in the same way as accessing fields (.text in this case).
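To connect this back to the post-install command in the question, the filtered collection can be turned into a space-separated string. The following is a minimal sketch, not a tested build script. It assumes the files under src/main/resources are installed below a single root directory; the installRoot value and the installedPaths helper are hypothetical names used for illustration and are not part of Gradle or the Nebula plugin.

```groovy
// build.gradle (Groovy DSL): a sketch under the assumptions stated above
def resourcesDir = file('src/main/resources')
def installRoot = '/opt/product'   // assumption: where the RPM installs these files

// Files whose content contains the marker string
def filesContainingDocs = fileTree(resourcesDir).filter { File f ->
    f.text.contains('@doc')
}

// Hypothetical helper: map each source file to its assumed installed path
// and join the results with spaces for use in a shell command
def installedPaths = { files ->
    files.collect { File f ->
        // Path relative to the resources directory, with forward slashes
        // because the build platform is Windows but the target is not
        def relative = resourcesDir.toPath().relativize(f.toPath()).toString().replace('\\', '/')
        "${installRoot}/${relative}"
    }.join(' ')
}

// Inside the RPM task configuration from the question, this could become:
// postInstall('/opt/product/bin/postprocess.sh ' + installedPaths(filesContainingDocs))
println installedPaths(filesContainingDocs)
```

Because the collection is evaluated lazily, the files are read when the helper iterates over them, not when the filter is declared.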

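If a simple contains check is not enough, the Groovy JDK enhancements even provide the method matches(Pattern pattern) on CharSequence/String instances to perform a regular expression check. Note that matches succeeds only if the entire file content matches the pattern, not if the pattern merely occurs somewhere in the file: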

// filter is called on the file tree itself; matches() must match the whole file content
def filesContainingDocs = fileTree('src/main/resources').filter { file ->
    file.text.matches('some regex')
}
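To test whether a pattern occurs anywhere in a file, rather than against the whole content, Groovy's find operator (=~) may be a better fit. A minimal sketch for the host name placeholders from the question, assuming the files are small enough to read fully into memory (as the question states):

```groovy
// build.gradle (Groovy DSL): a sketch, not a tested build script
def filesContainingHostname = fileTree('src/main/resources').filter { File f ->
    // =~ creates a java.util.regex.Matcher; find() is true if the pattern
    // occurs at least once anywhere in the file content
    (f.text =~ /@(HOSTNAME|HOSTFQDN)@/).find()
}

// Print the matching files, one per line
filesContainingHostname.each { println it }
```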