I created the following custom rule for PMD, but when I run it I get an error. If I replace the regex with a trivial one like "a", it works. I cannot understand what's wrong.
<?xml version="1.0"?>
<ruleset name="Custom Rules"
xmlns="http://pmd.sourceforge.net/ruleset/2.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://pmd.sourceforge.net/ruleset/2.0.0 https://pmd.sourceforge.io/ruleset_2_0_0.xsd">
<rule name="LongMethodName"
language="java"
message="Method name too long"
>
<description>
Method names should be composed of fewer than five words
</description>
<priority>4</priority>
<properties>
<property name="version" value="2.0" />
<property name="xpath">
<value>
<![CDATA[
//MethodDeclaration[count(tokenize(@Name, '(?<=[a-z])(?=[A-Z])')) + 1 > 5]
]]>
</value>
</property>
</properties>
</rule>
</ruleset>
The error I get is the following. I get one such error for each file in the project I'm analyzing.
Nov 04, 2022 10:45:42 PM net.sourceforge.pmd.RuleSet apply
WARNING: Exception applying rule LongMethodName on file /Users/francescobresciani/MSDE/1sem/software-design-modeling/sdem-ass2/fastjson-master/src/main/java/com/alibaba/fastjson/parser/SymbolTable.java, continuing with next rule
java.lang.RuntimeException: net.sf.saxon.trans.XPathException: Error at character 1 in regular expression "(?<=[a-z])(?=[A-Z])": expected ())
at net.sourceforge.pmd.lang.rule.xpath.SaxonXPathRuleQuery.initializeXPathExpression(SaxonXPathRuleQuery.java:272)
at net.sourceforge.pmd.lang.rule.xpath.SaxonXPathRuleQuery.evaluate(SaxonXPathRuleQuery.java:113)
at net.sourceforge.pmd.lang.rule.XPathRule.evaluate(XPathRule.java:176)
at net.sourceforge.pmd.lang.rule.XPathRule.apply(XPathRule.java:158)
at net.sourceforge.pmd.RuleSet.apply(RuleSet.java:670)
at net.sourceforge.pmd.RuleSets.apply(RuleSets.java:163)
at net.sourceforge.pmd.SourceCodeProcessor.processSource(SourceCodeProcessor.java:209)
at net.sourceforge.pmd.SourceCodeProcessor.processSourceCodeWithoutCache(SourceCodeProcessor.java:118)
at net.sourceforge.pmd.SourceCodeProcessor.processSourceCode(SourceCodeProcessor.java:100)
at net.sourceforge.pmd.SourceCodeProcessor.processSourceCode(SourceCodeProcessor.java:62)
at net.sourceforge.pmd.processor.PmdRunnable.call(PmdRunnable.java:89)
at net.sourceforge.pmd.processor.PmdRunnable.call(PmdRunnable.java:30)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: net.sf.saxon.trans.XPathException: Error at character 1 in regular expression "(?<=[a-z])(?=[A-Z])": expected ())
at net.sf.saxon.java.JRegularExpression.<init>(JRegularExpression.java:70)
at net.sf.saxon.java.JavaPlatform.compileRegularExpression(JavaPlatform.java:198)
at net.sf.saxon.functions.Matches.tryToCompile(Matches.java:218)
at net.sf.saxon.functions.Tokenize.maybePrecompile(Tokenize.java:45)
at net.sf.saxon.functions.Tokenize.simplify(Tokenize.java:36)
at net.sf.saxon.expr.ExpressionVisitor.simplify(ExpressionVisitor.java:159)
at net.sf.saxon.expr.FunctionCall.simplifyArguments(FunctionCall.java:100)
at net.sf.saxon.expr.FunctionCall.simplify(FunctionCall.java:88)
at net.sf.saxon.expr.ExpressionVisitor.simplify(ExpressionVisitor.java:159)
at net.sf.saxon.expr.BinaryExpression.simplify(BinaryExpression.java:45)
at net.sf.saxon.expr.ArithmeticExpression.simplify(ArithmeticExpression.java:42)
at net.sf.saxon.expr.ExpressionVisitor.simplify(ExpressionVisitor.java:159)
at net.sf.saxon.expr.BinaryExpression.simplify(BinaryExpression.java:45)
at net.sf.saxon.expr.ExpressionVisitor.simplify(ExpressionVisitor.java:159)
at net.sf.saxon.expr.FilterExpression.simplify(FilterExpression.java:130)
at net.sf.saxon.expr.ExpressionVisitor.simplify(ExpressionVisitor.java:159)
at net.sf.saxon.expr.SlashExpression.simplify(SlashExpression.java:122)
at net.sf.saxon.expr.ExpressionVisitor.simplify(ExpressionVisitor.java:159)
at net.sf.saxon.expr.ExpressionTool.make(ExpressionTool.java:74)
at net.sf.saxon.sxpath.XPathEvaluator.createExpression(XPathEvaluator.java:167)
at net.sourceforge.pmd.lang.rule.xpath.SaxonXPathRuleQuery.initializeXPathExpression(SaxonXPathRuleQuery.java:269)
I tested the regex on regex101 and it works.
I tested the XPath expression on xpather and it looks valid.
I tested the XPath expression on freeformatter and it looks NOT valid. It says: Unable to perform XPath operation. Syntax error at char 1 in regular expression: No expression before quantifier
The following is the snippet I checked the XPath rule against:
<root>
<MethodDeclaration Name="shortName"/>
<MethodDeclaration Name="thisMethodNameIsVeryVeryLong"/>
</root>
The following is the exact string I entered in xpather and freeformatter:
//MethodDeclaration[count(tokenize(@Name, '(?<=[a-z])(?=[A-Z])')) + 1 > 5]
CodePudding user response:
The leading character in your regular expression, (, marks the start of a group.
The next character, ?, is a "quantifier" (like * or +); it specifies how many times the preceding expression may occur (it means "either zero or one"). But there is no preceding expression.
Are you trying to match a literal ? character? If so, you should escape it like so: \?
Are you trying to match a literal ( character either zero or one time? If so, you should escape that character as \( so it's not interpreted as the start of a group.
CodePudding user response:
I guess that positive lookbehind ((?<=...)) and positive lookahead ((?=...)) are not supported by the regex syntax in XPath (at least XPath 2.0). The supported regex syntax is described at https://www.w3.org/TR/xmlschema-2/#regexs and the tokenize function in XPath 2.0 at https://www.w3.org/TR/xquery-operators/#func-tokenize (the latter also contains more information about regexes in XPath).
Searching for an alternative solution, I found the question "Regex to split camel case".
The general idea is: first replace every lowercase character immediately followed by an uppercase character with the same two characters separated by a space, then tokenize (split) the resulting string on that space.
This works with the following XPath:
//MethodDeclaration[count(tokenize(string(replace(@Name, '([a-z])([A-Z])', '$1 $2')), ' ')) + 1 > 5]
The conversion to a string (the string(...) function call) is only required by http://xpather.com/ but it doesn't hurt. The expression also works with PMD.
Here's the example: http://xpather.com/f9SD9WMX
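For completeness, here is a sketch of how the ruleset from the question might look with the replace-then-tokenize expression dropped in. Only the XPath changes. The "+ 1 > 5" comparison is carried over from the question as I read it, and the description wording is lightly tidied. The class attribute is an assumption: it is not visible in the question as posted, and the value is taken from the XPathRule class that appears in the stack trace. I have not checked this beyond the statement above that the expression works with PMD.

```xml
<?xml version="1.0"?>
<ruleset name="Custom Rules"
    xmlns="http://pmd.sourceforge.net/ruleset/2.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://pmd.sourceforge.net/ruleset/2.0.0 https://pmd.sourceforge.io/ruleset_2_0_0.xsd">
  <!-- class is assumed: the name comes from the stack trace, not from the posted ruleset -->
  <rule name="LongMethodName"
        language="java"
        message="Method name too long"
        class="net.sourceforge.pmd.lang.rule.XPathRule">
    <description>
      Method names should be composed of fewer than five words
    </description>
    <priority>4</priority>
    <properties>
      <property name="version" value="2.0" />
      <property name="xpath">
        <value>
<![CDATA[
//MethodDeclaration[count(tokenize(replace(@Name, '([a-z])([A-Z])', '$1 $2'), ' ')) + 1 > 5]
]]>
        </value>
      </property>
    </properties>
  </rule>
</ruleset>
```

The string(...) wrapper is left out here because, as noted above, only xpather needs it. The pattern '([a-z])([A-Z])' uses plain capture groups, which the XPath 2.0 regex dialect accepts, so no lookbehind or lookahead is required.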