I'm trying to extract the ID of a Notion database from a URL. For example, in https://www.notion.so/anotioneer/d77d1d19d4a943358898f2be65499d6a?v=1dedd49c5403489ebb899a290111f858 the ID is d77d1d19d4a943358898f2be65499d6a
I can match everything after anotioneer/ with anotioneer\/(.+) and everything before the ? with .*(?=\?) but I'm struggling to combine the two expressions.
Answer:
Would something like this work?
anotioneer\/(\w+)(?:$|\?)
- Start with the path segment that precedes the ID: anotioneer\/
- Take one or more word characters (letters, digits, or underscores) as a group: (\w+)
- Match either the end of the line or the start of the query string in a non-capturing group: (?:$|\?)
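As a minimal sketch, assuming Python's re module (the question does not name a language), the pattern can be applied as follows. The helper name extract_database_id is hypothetical and only there for illustration.

```python
import re

# Pattern from the answer: literal "anotioneer/", then one or more word
# characters captured as group 1, then either end of string or a literal "?".
ID_PATTERN = re.compile(r"anotioneer\/(\w+)(?:$|\?)")


def extract_database_id(url):
    """Return the captured ID, or None if the URL does not match."""
    match = ID_PATTERN.search(url)
    return match.group(1) if match else None


# URL taken from the question.
url = (
    "https://www.notion.so/anotioneer/"
    "d77d1d19d4a943358898f2be65499d6a?v=1dedd49c5403489ebb899a290111f858"
)
print(extract_database_id(url))
```

Returning None for a non-matching URL is a design choice in this sketch; raising an exception would work equally well if a missing ID should be treated as an error.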
Here's the valid sample data I used:
/anotioneer/d77d1d19d4a943358898f2be65499d6a?v=1dedd49c5403489ebb899a290111f858
/anotioneer/d79d1d19d4a943358898f2be65499d6a?v=3dedd49c5403489ebb899a290111f858
/anotioneer/d80d1d19d4a943358898f2be65499d6a?v=4dedd49c5403489ebb899a290111f858&t=123
One that doesn't match because there's an extra path segment between anotioneer and the ID:
/anotioneer/foo/d78d1d19d4a943358898f2be65499d6a?v=2dedd49c5403489ebb899a290111f858
And one that doesn't match because there's an extra path segment after the ID:
/anotioneer/d81d1d19d4a943358898f2be65499d6a/foo
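To check the sample data above, here is a short sketch, again assuming Python's re module. It runs the pattern over all five paths and records which ones match. The variable names are illustrative.

```python
import re

pattern = re.compile(r"anotioneer\/(\w+)(?:$|\?)")

# The five sample paths from the answer, in the order they are listed.
samples = [
    "/anotioneer/d77d1d19d4a943358898f2be65499d6a?v=1dedd49c5403489ebb899a290111f858",
    "/anotioneer/d79d1d19d4a943358898f2be65499d6a?v=3dedd49c5403489ebb899a290111f858",
    "/anotioneer/d80d1d19d4a943358898f2be65499d6a?v=4dedd49c5403489ebb899a290111f858&t=123",
    "/anotioneer/foo/d78d1d19d4a943358898f2be65499d6a?v=2dedd49c5403489ebb899a290111f858",
    "/anotioneer/d81d1d19d4a943358898f2be65499d6a/foo",
]

# True where the pattern finds an ID, False otherwise.
matched = [pattern.search(path) is not None for path in samples]

for path, ok in zip(samples, matched):
    print("match   " if ok else "no match", path)
```

The last two paths fail because the character following the run of word characters is a /, which is neither the end of the string nor a ?.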
Here's what the matches look like using this pattern. Note that you'll want to take the first group, not the whole match. That's why we used a non-capturing group for the end-of-line or query string segment.
Part | Location | Contents |
---|---|---|
Match 1 | 22-66 | anotioneer/d77d1d19d4a943358898f2be65499d6a? |
Group 1 | 33-65 | d77d1d19d4a943358898f2be65499d6a |
Match 2 | 123-167 | anotioneer/d79d1d19d4a943358898f2be65499d6a? |
Group 1 | 134-166 | d79d1d19d4a943358898f2be65499d6a |
Match 3 | 224-268 | anotioneer/d80d1d19d4a943358898f2be65499d6a? |
Group 1 | 235-267 | d80d1d19d4a943358898f2be65499d6a |
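The difference between the whole match and the first group can be seen in a short sketch, assuming Python's re module and the full URL from the question as input (an assumption about the test input, chosen because it makes the offsets line up with the first two rows of the table).

```python
import re

pattern = re.compile(r"anotioneer\/(\w+)(?:$|\?)")

# Full URL from the question.
url = (
    "https://www.notion.so/anotioneer/"
    "d77d1d19d4a943358898f2be65499d6a?v=1dedd49c5403489ebb899a290111f858"
)

m = pattern.search(url)

whole = m.group(0)       # includes "anotioneer/" and the trailing "?"
database_id = m.group(1) # only the ID

print(whole, m.span(0))        # span is (22, 66)
print(database_id, m.span(1))  # span is (33, 65)
```

Because the trailing (?:$|\?) is non-capturing, the ID stays in group 1 and no extra group is created for the delimiter.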