Perl script to find a word and remove all char/word on the first search

Here is my input

/prj/mct/2.5/src/mode/session.v
/prj/act/data/1.6/src/log.v

I'm trying to find the numeric value in each of the paths above and remove everything in the path that comes after it.

Expected output

/prj/mct/2.5/.
/prj/act/data/1.6/.

Can you please let me know how I should write a Perl script to do this?

CodePudding user response：

The following one-liner gives the expected output:

perl -pe 's{([^0-9]+[0-9.]+/).*}{$1.}' input.txt

-p reads the input line by line, printing each line after processing;
s{}{} is the substitution, we're not using s/// because we want to match a slash and we don't like backslashed slashes as they're hard to read;
[0-9] matches a digit, ^ negates it, i.e. [^0-9] matches anything but a digit;
+ matches one or more occurrences of the preceding construct, e.g. [^0-9]+ matches one or more non-digits;
[0-9.]+ matches digits and dots, i.e. a version;
the (...) parentheses create a capture group, here we capture the whole beginning of each line up to the slash after the version;
we replace the whole line with just the captured part and add a dot.

CodePudding user response：

At its simplest, we can match numbers \d+ with a period \. in the middle, enclosed by slashes, keep that part with \K, and discard the rest with .*:

perl -pe 's#/\d+\.\d+/\K.*#.#' path.txt

This matches your current test cases, but it requires a period in the version. If the version can be a single number with no period, we can make that part optional:

perl -pe 's#/\d+(?:\.\d+)?/\K.*#.#' path.txt

(?: ... )? is a non-capturing group (?: ... ) with a ? quantifier (match 0 or 1 times).

Using a character class is also an option, such as [\d.]+, but bear in mind that this can also match only periods, e.g. /../.

perl -pe 's#/[\d.]+/\K.*#.#' path.txt