I'm trying to capture category_id for purely numeric values, and this works. I also need to capture category_name. For category_name, I need to capture up to the next space, or include spaces if the value starts with a double quote.
Sample user input string:
python c:192 c:1Stackoverflow c:"Stack Overflow2"
The desired captures should be these two values for category_name and the 192 for category_id.
Expected output:
1Stackoverflow
Stack Overflow2
The category_name must contain at least one non-digit, but can be all alpha with no digits.
This regex partially works:
/c:(?<category_name>(?:")(?!\d+)[^"]+(?:")|(?!\d+)[^ ]+)/g
It doesn't capture the input 1Stackoverflow, but it does capture the quoted one. I then need to remove the quotes:
(x.groups?.[key] ?? '').replace(/^\"/, '').replace(/\"$/, '')
The (?!\d+) is an attempt to avoid clashing with category_id, but it does not appear to be working.
How can I capture category_name in both forms (a single word and quote-delimited), without the quotes in the capture, and still allow a leading digit?
CodePudding user response:
To capture both category_id and category_name in one regex, use:
c:(?:(?<category_id>\d+\b)|(?<category_name>\w+|"[^"]*"))
RegEx Breakdown:
c: : Match c:
(?: : Start non-capture group
(?<category_id>\d+\b) : Named capture group category_id to match 1+ digits followed by a word boundary
| : OR
(?<category_name>\w+|"[^"]*") : Named capture group category_name to match 1+ word characters or a quoted text
) : End non-capture group
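As a rough sketch of how this pattern could be applied to the sample input from the question, the snippet below collects the ids and names separately. The helper name parseCategories is hypothetical and only used for illustration. It also assumes the quantifiers are written as \d+ and \w+, and that the surrounding quotes still have to be stripped afterwards, because the quoted alternative sits inside the category_name group.

```javascript
// Combined pattern with both named groups (quantifiers assumed to be `+`).
const combinedPattern = /c:(?:(?<category_id>\d+\b)|(?<category_name>\w+|"[^"]*"))/g;

// parseCategories is a hypothetical helper, not part of the answer above.
function parseCategories(input) {
  const ids = [];
  const names = [];
  for (const match of input.matchAll(combinedPattern)) {
    const { category_id, category_name } = match.groups;
    if (category_id !== undefined) {
      ids.push(category_id);
    } else if (category_name !== undefined) {
      // The quoted alternative keeps its quotes, so strip them here.
      names.push(category_name.replace(/^"|"$/g, ''));
    }
  }
  return { ids, names };
}

const result = parseCategories('python c:192 c:1Stackoverflow c:"Stack Overflow2"');
console.log(result);
```

With the sample input this should give 192 as the only id, and 1Stackoverflow and Stack Overflow2 as the names. The word boundary after \d+ is what keeps 1Stackoverflow out of category_id: there is no boundary between 1 and S, so the match falls through to the \w+ alternative.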
CodePudding user response:
If you want to remove the quotes immediately, I would suggest using two different named groups for category_name, one for quoted and one for unquoted values:
c:(?:"(?<category_name_q>[^"]+)"|(?<category_name>(?:\d*[a-zA-Z]+)))
(category_name_q contains the previously quoted matches, but without quotes)
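A minimal sketch of how the two groups could be merged back into a single list of names follows. It assumes the quotes are matched outside the category_name_q group, so that the group itself holds the unquoted text, and the helper name extractCategoryNames is made up for this example.

```javascript
// Quotes are matched outside category_name_q, so the group holds the bare text.
const namePattern = /c:(?:"(?<category_name_q>[^"]+)"|(?<category_name>\d*[a-zA-Z]+))/g;

// extractCategoryNames is a hypothetical helper used only for illustration.
function extractCategoryNames(input) {
  // Exactly one of the two groups is defined per match; take whichever it is.
  return Array.from(
    input.matchAll(namePattern),
    (m) => m.groups.category_name_q ?? m.groups.category_name
  );
}

const categoryNames = extractCategoryNames('python c:192 c:1Stackoverflow c:"Stack Overflow2"');
console.log(categoryNames);
```

For the sample input this should yield 1Stackoverflow and Stack Overflow2, while c:192 is skipped because the unquoted alternative requires at least one letter. Note that \d*[a-zA-Z]+ stops at the first digit that follows the letters, so an unquoted value such as c:Stack2 would only capture Stack.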