How to use capture groups with the `\K` reset match?

Time:12-06

I found [...]

However, when adding a capture group (i.e., $1) using the regex (a\Kb), group $1 returns ab and not b.

Given the following string:

ab
cd

Using the regex (a\Kb)|(c\Kd), I would hope group $1 to contain b and group $2 to contain d, but that is not the case.

I tried [...]

However, now the matches are both part of group $0, whereas I require them to be part of group $1 and $2, respectively. Do you have any ideas on how this can be achieved? I am using [...]

However, when applied to another example it fails. Therefore, for clarity, I am extending this question to cover the more complicated scenario discussed in the comments.

Suppose I have the following text in a markdown file:

- [x] Example task. | Task ends. [x] Another task.
- [x] ! Example task. | This ends. [x] ! Another task.

This is a sentence. [x] Task is here.
Other text. Another [x] ! Task is here.

|       | Task name     |    Plan     |   Actual    |      File      |
| :---- | :-------------| :---------: | :---------: | :------------: |
| [x]   | Task example. | 08:00-08:45 | 08:00-09:00 |  [[task-one]]  |
| [x] ! | Task example. | 08:00-08:45 | 08:00-09:00 |  [[task-one]]  |

I am interested in a single regular expression with two capture groups as follows:

  • group $1 (i.e., see selection below):

    • outside the table: capture everything after [x] (i.e., not followed by !) until a |

    • inside the table: capture everything after [x] (i.e., not followed by !) excluding the | symbols

      Matches for first capture group

  • group $2 (i.e., see selection below):

    • outside the table: capture everything after [x] ! until a |

    • inside the table: capture everything after [x] ! excluding the | symbols

      Matches for the second capture group

I have the following regex (i.e., [...])

The regex for the matches inside the table is based on Wiktor Stribiżew's answer and explained here.

CodePudding user response:

If I understand what you are trying to match, use as a regex:

(?:[^|\s]\s*\[x\](?!\s*!)\s*\K([^!|\n]*))|(?:[^|\s]\s*\[x\]\s*!\s*\K([^|\n]*))

See Regex Demo

I removed some unnecessary escaping. More importantly:

For Group 1 matches (the first alternative, before the top-level |), note that right after matching `[x]` I added the following negative lookahead assertion:

(?!\s*!)

This ensures that the [x] is not followed by zero or more whitespace characters and then an exclamation mark. Only then is everything up to the next exclamation mark, pipe, or newline captured as Group 1.
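
To see how the two alternatives fill the two groups, here is a minimal Python sketch run against the non-table lines from the question. It rests on one assumption worth stating loudly: Python's built-in re module does not support \K, so the pattern below is the regex above with \K removed. In PCRE, \K only moves the start of the overall match ($0) and leaves captured substrings alone, which is also why (a\Kb) still reports ab in $1, so the group contents should be the same. The names PATTERN, SAMPLE and collect_groups are made up for this illustration.

```python
import re

# The answer's pattern with \K removed (assumption: Python's re has no \K,
# and \K affects only the overall match, not the capture groups).
PATTERN = re.compile(
    r"(?:[^|\s]\s*\[x\](?!\s*!)\s*([^!|\n]*))"   # group 1: [x] not followed by !
    r"|(?:[^|\s]\s*\[x\]\s*!\s*([^|\n]*))"       # group 2: [x] followed by !
)

# The two non-table lines from the question.
SAMPLE = (
    "- [x] Example task. | Task ends. [x] Another task.\n"
    "- [x] ! Example task. | This ends. [x] ! Another task.\n"
)


def collect_groups(text):
    """Return (group 1 captures, group 2 captures), stripped of outer spaces."""
    plain, flagged = [], []
    for m in PATTERN.finditer(text):
        if m.group(1) is not None:      # first alternative matched
            plain.append(m.group(1).strip())
        elif m.group(2) is not None:    # second alternative matched
            flagged.append(m.group(2).strip())
    return plain, flagged


plain_tasks, flagged_tasks = collect_groups(SAMPLE)
print(plain_tasks)    # captures from lines where [x] has no !
print(flagged_tasks)  # captures from lines where [x] is followed by !
```

In a PCRE-compatible engine you would keep \K if you also want $0 to start where the capture starts; the groups themselves do not depend on it.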

CodePudding user response:

Instead of \K, try to use control verbs (*SKIP)(*F):

(a(*SKIP)(*F)|b)|(c(*SKIP)(*F)|d)

Check the test case.
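
For engines that support neither \K nor backtracking control verbs (Python's built-in re is one), a different technique gives the group contents the question asks for on the simple ab / cd example: keep the prefix outside the capture group. This is a sketch of that alternative, not the verb-based pattern above, and the variable names are only illustrative.

```python
import re

# Put only the wanted character inside each group; the prefix (a or c)
# stays outside, so no \K is needed to exclude it from the capture.
pattern = re.compile(r"a(b)|c(d)")

text = "ab\ncd"

# One (group 1, group 2) pair per match; the group that did not
# participate in a given match is None.
groups = [(m.group(1), m.group(2)) for m in pattern.finditer(text)]
print(groups)  # [('b', None), (None, 'd')]
```

The trade-off is that $0 still contains the prefix (ab and cd), so this only helps when you read the numbered groups rather than the overall match.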
