regex to get top parent directory without /

Time:10-13

I'm looking for a regex that, given a path such as path/to/filename.ext, returns the topmost parent directory.

Examples: 'foo/bar/baz/file.ext', 'file2.ext', 'fooz/file3.ext'

These should return 'foo' and 'fooz' (and nothing for 'file2.ext', which has no parent directory).

The language is Groovy.

CodePudding user response:

import java.nio.file.Path

// So given the inputs from the question
def inputs = [
        'foo/bar/baz/file.ext',
        'file2.ext',
        'fooz/file3.ext'
]

def results = inputs.collect {
    // Convert each of them to a Path
    Path.of(it).with {
        // If it has more than 1 element, return the first component
        // else return an empty string (it's just a filename)
        it.nameCount > 1 ? it.subpath(0, 1).toString() : ''
    }
}

// Check results
assert results == ['foo', '', 'fooz']

CodePudding user response:

In general, the proper way to extract components of a pathname is to use tools designed for just that; in the case of Groovy, that would be java.nio.file.Path and @tim_yates's answer.

But the asker requested a regular-expression-based solution, so here is one:

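// Match everything up to, but not including, the first slash
pat = ~'^[^/]*(?=/)'
for (str in ['foo/bar/baz/file.ext', 'file2.ext', 'fooz/file3.ext']) {
  print str + ': '
  // A Matcher is truthy in Groovy when the pattern is found
  def m = str =~ pat
  if (m) {
    println '"' + m[0] + '"'
  } else {
    println 'NONE'
  }
}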

Which outputs this:

foo/bar/baz/file.ext: "foo"
file2.ext: NONE
fooz/file3.ext: "fooz"

The pattern ^[^/]* matches the longest possible sequence of zero or more (*) non-slash characters ([^/]) starting at the beginning of the string (^).

That would be enough by itself, except for the requirement that a string such as file2.ext, with no slashes at all, not match. The lookahead assertion (?=/) causes the pattern to match only if the matched text is followed by a slash.
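To make the effect of the lookahead concrete, here is a minimal sketch that compares the pattern with and without it. The helper name topParent is hypothetical (it does not appear in either answer), and returning null for "no parent" is just one possible convention.

```groovy
// Hypothetical helper: returns the top-level directory of a relative path,
// or null when the path contains no slash at all.
def topParent(String path) {
    def m = path =~ '^[^/]*(?=/)'
    m.find() ? m.group() : null
}

// Without the lookahead, a bare filename matches in full,
// which is not what the question asks for.
def bare = 'file2.ext' =~ '^[^/]*'
assert bare.find() && bare.group() == 'file2.ext'

// With the lookahead, only text followed by a slash is accepted.
assert topParent('foo/bar/baz/file.ext') == 'foo'
assert topParent('file2.ext') == null
assert topParent('fooz/file3.ext') == 'fooz'
```

Because * allows zero characters, a path with a leading slash such as '/usr/lib' would yield an empty string here rather than null; whether that is acceptable depends on the paths you expect.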
