Ideally, I'd like to avoid capturing groups and instead assert that the string starts/ends with some sequence, then directly use the value matched by the regex.
Input:
map_Ks ./CarbonFiber_T.tga
Input definition:
- start of line
- maybe some spaces
- the string map_Ks (this is the class field I want to assign a value to)
- one or more spaces
- a valid file path, anything but 0x00-0x1F, 0x7C (this is the value I want to assign to the field)
- maybe some spaces
- end of line
Attempt 1: it works but result is in a captured group
(?:^\s*map_K.\s+)([^\x00-\x1F\x7C]+)$
map_Ks ./CarbonFiber_T.tga
./CarbonFiber_T.tga
Attempt 2: it works, there are no groups but the match is the entire line (ideal usage)
(?=^\s*map_K.\s+)[^\x00-\x1F\x7C]+$
map_Ks ./CarbonFiber_T.tga
Question:
Is this possible at all, or am I asking too much of the regex engine and should I simply use capture groups?
CodePudding user response:
You need to replace the lookahead with a lookbehind and require the first char of the consumed pattern to be a non-whitespace char.
You can use
(?<=^\s*map_K.\s+)(?=\S)[^\x00-\x1F\x7C]*(?<=\S)(?=\s*$)
(?<=^\s*map_K.\s+)[^\x00-\x1F\x7C\s](?:[^\x00-\x1F\x7C]*[^\x00-\x1F\x7C\s])?(?=\s*$)
See the regex demo (or this regex demo). Details:
- (?<=^\s*map_K.\s+) - a positive lookbehind that matches a location immediately preceded by the start of string, zero or more whitespaces, map_K, any one char other than an LF char, and one or more whitespaces
- (?=\S) - a positive lookahead that requires the next char to be a non-whitespace char
- [^\x00-\x1F\x7C]* - zero or more chars other than ASCII control chars and | (0x7C)
- (?<=\S) - a positive lookbehind that requires the previous char to be a non-whitespace char
- (?=\s*$) - a positive lookahead requiring zero or more whitespaces followed by the end of string immediately on the right.
The [^\x00-\x1F\x7C\s](?:[^\x00-\x1F\x7C]*[^\x00-\x1F\x7C\s])? regex part matches one char that is neither a whitespace, an ASCII control char, nor |, followed by an optional sequence of zero or more chars other than ASCII control chars and | that ends with a single char that is neither a whitespace, an ASCII control char, nor |. This keeps leading and trailing whitespace out of the match while still allowing spaces inside the path.
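If your engine does not accept a variable-width lookbehind (Python's built-in re module, for one, only allows fixed-width lookbehinds), the same path sub-pattern can still do the trimming inside a capture group. A minimal sketch, assuming Python; the names PATH, LINE and parse_map_path are made up for illustration and are not part of the question:

```python
import re

# Path sub-pattern from the answer: first and last chars must be non-whitespace,
# no ASCII control chars or '|' anywhere, spaces allowed in the middle.
PATH = r"[^\x00-\x1F\x7C\s](?:[^\x00-\x1F\x7C]*[^\x00-\x1F\x7C\s])?"

# The lookbehind/lookahead are replaced by ordinary consuming patterns,
# so the path ends up in group 1 instead of being the whole match.
LINE = re.compile(r"^\s*map_K.\s+(" + PATH + r")\s*$")


def parse_map_path(line):
    """Return the trimmed path from a 'map_K? <path>' line, or None if it does not match."""
    m = LINE.match(line)
    return m.group(1) if m else None


print(parse_map_path("  map_Ks   ./CarbonFiber_T.tga  "))
```

With the sample line this prints ./CarbonFiber_T.tga without the surrounding spaces. In engines that do support variable-width lookbehind, the patterns above give the same text as the whole match.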
Just in case you want to adjust the file path regex part, please refer to What characters are forbidden in Windows and Linux directory names?