I want to write a function that simply validates the syntax of an XPath expression, independent of any XML/HTML (so it would catch something like //p[text(}="Hello"] as having a typo), but I can't seem to find a full-fledged XPath specification.
CodePudding user response:
The W3C XPath Recommendations are the official specifications for XPath:
- XML Path Language (XPath) Version 1.0
- XML Path Language (XPath) 2.0 (Second Edition)
- XML Path Language (XPath) 3.1
If you must write an XPath parser from scratch, at least one of those will be necessary reading. If you do not really need to write an XPath parser from scratch, you might simply use an existing library and examine the return value or catch any parsing exceptions to determine the well-formedness of the passed XPath expression.
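As a rough sketch of the "use an existing library" route: the function below hands the expression to some XPath engine and treats a thrown exception as a syntax error. The names checkXPathSyntax and compile are made up for illustration. When no compile function is passed, it assumes a browser environment and falls back to the standard XPathEvaluator.createExpression, which only understands XPath 1.0, so 2.0/3.1 syntax would be reported as invalid there.

```javascript
// Sketch only: `checkXPathSyntax` and the `compile` parameter are hypothetical names.
// `compile` is any function that throws when given a malformed XPath expression.
function checkXPathSyntax(expression, compile) {
  if (typeof compile !== "function") {
    // Assumption: running in a browser, where XPathEvaluator exists (XPath 1.0 only).
    if (typeof XPathEvaluator === "undefined") {
      throw new Error("No XPath engine available; pass a compile function.");
    }
    const evaluator = new XPathEvaluator();
    // createExpression parses the expression without evaluating it against a document.
    compile = (expr) => evaluator.createExpression(expr);
  }
  try {
    compile(expression);
    return { valid: true, message: null };
  } catch (err) {
    // Any exception from the engine is reported as a syntax problem.
    const message = err instanceof Error ? err.message : String(err);
    return { valid: false, message };
  }
}
```

In a browser, checkXPathSyntax('//p[text(}="Hello"]') should come back with valid set to false; with another library you would pass its own parse or compile function as the second argument.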
CodePudding user response:
In addition to the official specifications, if your aim is to write a syntax checker, there are tools to assist you, such as grammars and parser generators for those grammars, for example REx (https://bottlecaps.de/rex/). Using the grammars available there, such as the one for XPath 3.1, you can generate syntax-checking code in many target languages. For instance, a JavaScript parser generated this way is at https://martin-honnen.github.io/xpath31fiddle/js/RExXPath31Fast.js, and your code can use that parser to check for syntax errors, e.g.:
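function parseExample(xpath31Expression) {
  // Declared outside the try block so the catch block can reach the parser instance.
  let xpath31Parser;
  try {
    xpath31Parser = new RExXPath31Fast(xpath31Expression);
    xpath31Parser.parse_XPath();
    console.log(`${xpath31Expression} is fine.`);
  }
  catch (pe) {
    if (xpath31Parser && pe instanceof xpath31Parser.ParseException) {
      console.log(`Error in ${xpath31Expression}:`);
      console.log(xpath31Parser.getErrorMessage(pe));
    }
    else {
      // Not a syntax error from the parser, so do not swallow it.
      throw pe;
    }
  }
}
// A well-formed expression, then the expression with the typo from the question.
parseExample(`//p[.="Hello"]`);
parseExample(`//p[text(}="Hello"]`);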
<script src="https://martin-honnen.github.io/xpath31fiddle/js/RExXPath31Fast.js"></script>