I have a bunch of markdown files in which I want to search for Ruby's double colon (::)
outside of code formatting (i.e. places where I forgot to apply proper markdown). For example:
`foo::bar`
hello `foo::bar` test
` example::with::whitespace `
```
Proper::Formatted
```
```
Module::WithIndendation
```
```
Some::Nested::Modules
```
```ruby
CodeBlock::WithSyntax
```
# Some::Class
## Another::Class Heading
some text
The regex should only match Some::Class and Another::Class, because they lack the surrounding backticks and are also not inside a multi-line code fence block.
I have this regex, but it also matches the multi-line block:
[\s]+[^`]+(::)[^`]+[\s]?
Any idea how to exclude this?
EDIT:
It would be great if the regex worked in Ruby, JS, and on the command line with grep.
CodePudding user response:
For the original input, you may use this regex in Ruby to match a :: string that is not preceded by a backtick (`) and not preceded by a backtick followed by a whitespace character:
Regex:
(?<!`\s)(?<!`)\b\w+::\w+
RegEx Breakup:
(?<!`\s): Negative lookbehind to assert that a backtick followed by whitespace is not at the preceding position
(?<!`): Negative lookbehind to assert that a backtick is not at the preceding position
\b: Match a word boundary
\w+: Match 1+ word characters
::: Match a ::
\w+: Match 1+ word characters
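To see the two lookbehinds in action, here is a minimal Ruby sketch. The method name find_unformatted and the sample lines are illustrative assumptions, not part of the question. The input is limited to inline code spans and headings, since this pattern only looks at the one or two characters before a word and is not expected to cope with a fence that carries a language tag.

```ruby
# Regex from above: skip words directly preceded by a backtick,
# or by a backtick plus one whitespace character.
UNFORMATTED = /(?<!`\s)(?<!`)\b\w+::\w+/

# Hypothetical helper: collect every match from a list of lines.
def find_unformatted(lines)
  lines.flat_map { |line| line.scan(UNFORMATTED) }
end

# Illustrative input (an assumption): inline code and headings only.
lines = [
  "`foo::bar`",
  "hello `foo::bar` test",
  "# Some::Class",
  "## Another::Class Heading",
  "some text"
]

p find_unformatted(lines)
```

The word boundary matters here: without \b the engine could start a match at the second character of foo, where neither lookbehind sees a backtick.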
You can use this regex in JavaScript:
(?<!`\w*\s*|::)\b\w+(?:::\w+)+
For GNU grep, consider this command:
grep -ZzoP '`\w*\s*\b\w+::\w+(*SKIP)(*F)|\b\w+::\w+' file |
xargs -0 printf '%s\n'
Some::Class
Another::Class
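The grep command works by consuming the unwanted code-formatted text first so that it can no longer match. A similar effect can be sketched in Ruby with plain alternation and one capture group. This is a different technique from the lookbehind regex above, and the names PATTERN, unformatted_constants, and SAMPLE are assumptions made for illustration. It assumes fences start at the beginning of a line and are properly closed.

```ruby
# Alternatives are tried left to right:
#   1. a whole fenced block (consumed, nothing captured)
#   2. an inline code span   (consumed, nothing captured)
#   3. a namespaced constant outside code (captured in group 1)
PATTERN = /^```.*?^```|`[^`\n]*`|(\b\w+(?:::\w+)+)/m

# Hypothetical helper: keep only the group-1 captures.
def unformatted_constants(markdown)
  markdown.scan(PATTERN).flatten.compact
end

# Sample built line by line so the fences do not clash with this listing.
FENCE = "`" * 3
SAMPLE = [
  "`foo::bar`",
  "hello `foo::bar` test",
  "#{FENCE}ruby",
  "CodeBlock::WithSyntax",
  FENCE,
  "# Some::Class",
  "## Another::Class Heading"
].join("\n")

p unformatted_constants(SAMPLE)
```

Because code spans are swallowed by the first two alternatives, only constants in ordinary text reach the capture group. An unclosed fence would not be handled by this sketch.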