Regex to search for include directives

Time:10-30

Is it possible to write a regular expression that finds every #include "path/to/file" in a string, extracting the paths and skipping comments (// ... and /* ... */), and what would it look like?

CodePudding user response:

This problem does actually require some memory, so it is not solvable with pure regular expressions. Modern regex engines can behave more like a CFG, though, so you should be able to do it if you can use lookaround and non-capturing groups. Having had the displeasure of writing many, many regular expressions like this, sometimes to parse entire programming languages, I assure you it won't be pretty, but since you're not worried about anything except comments it shouldn't be too bad. The regular expression would probably look something like this:

"(?<!/\*(?!(.|\n)*?\*/))" # check for open multiline comment
"(?<!//.*?)" # check for single line comment
"(?<=#include\s*?)\S*?$" # check for include directive and capture path

Note:

  • The actual regex is everything inside the double quotes; the line breaks and comments are only there for clarity.
  • I do not claim that this regular expression works. When parsing entire files like this, regex can be very finicky and there are many edge cases to consider. This is just to give you an idea of what it might look like.
  • This regex only captures a single path, not multiple.
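
One of those edge cases is engine support. As a small check, assuming Python's built-in re module (the question does not name an engine) and standard (?<!...) lookbehind syntax, the sketch below shows that a variable-width lookbehind like the single-line-comment check is rejected at compile time, while a fixed-width one is accepted. The helper name compiles is made up for this illustration.

```python
import re


def compiles(pattern: str) -> bool:
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Variable-width lookbehind, like the single-line-comment check above.
print(compiles(r'(?<!//.*?)#include'))

# Fixed-width lookbehind compiles, but it only sees two characters back,
# so it cannot tell whether a "//" appeared earlier on the line.
print(compiles(r'(?<!//)#include'))
```

Engines that allow variable-width lookbehind do exist, so whether the pattern above can be made to work depends on which one you are using.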

CodePudding user response:

I've found a solution without memory:

^[ \t]*#[ \t]*include[ \t]+("[^"]+"|<[^>]+>)|\/(\*(.|[\r\n])*?\*\/)|(\/\/.*)

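It matches either an include directive or a comment, so you only need to keep the matches where the first group is set. Any #include inside a comment is swallowed by the comment match and never reaches group 1.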
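
Here is a minimal sketch of how that filtering could look in Python (the engine is an assumption, since the question does not name one). It assumes + quantifiers after [ \t], [^"], and [^>], and it swaps (.|[\r\n]) for the equivalent [\s\S]. The names INCLUDE_OR_COMMENT and find_includes are made up for this example.

```python
import re

# Alternation of "include directive", "block comment", "line comment".
# Only the include alternative has a capturing group.
INCLUDE_OR_COMMENT = re.compile(
    r'^[ \t]*#[ \t]*include[ \t]+("[^"]+"|<[^>]+>)'  # group 1: include target
    r'|/\*[\s\S]*?\*/'                                # block comment
    r'|//.*',                                         # line comment
    re.MULTILINE,  # let ^ match at the start of every line
)


def find_includes(source: str) -> list[str]:
    """Return include targets that are not inside comments."""
    paths = []
    for match in INCLUDE_OR_COMMENT.finditer(source):
        target = match.group(1)
        if target is not None:          # comment matches leave group 1 unset
            paths.append(target[1:-1])  # strip the surrounding "" or <>
    return paths


sample = '''\
#include "a/b.h"
// #include "skipped_line.h"
/*
#include "skipped_block.h"
*/
  #  include <stdio.h>
int x; // trailing comment
'''

print(find_includes(sample))  # ['a/b.h', 'stdio.h']
```

Note that this does not know about string literals, so a /* or // inside a string elsewhere in the file would still be treated as the start of a comment.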
