This page describes XPath rule support in more detail

This page covers selected aspects of XPath rule support in more detail. See also the tutorial on how to write an XPath rule.

XPath version

PMD supports three XPath versions for now: 1.0, 2.0, and 1.0 compatibility mode. The version can be specified with the version property in the rule definition, like so:

<property name="version" value="2.0" /> <!-- or "1.0", or "1.0 compatibility" -->

The default has always been version 1.0.
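To show where the version property sits, here is a minimal sketch of a complete XPath rule definition. The rule name, message, description, priority, and XPath expression are hypothetical placeholders. The class attribute assumes a PMD 6.x ruleset, where XPath rules are declared with net.sourceforge.pmd.lang.rule.XPathRule.

```xml
<!-- Hypothetical rule: name, message, and expression are illustrative only -->
<rule name="AvoidStringArrayParameters"
      language="java"
      message="Avoid String[] parameters"
      class="net.sourceforge.pmd.lang.rule.XPathRule">
    <description>Flags formal parameters of type String[].</description>
    <priority>3</priority>
    <properties>
        <!-- Selects the XPath version; when omitted, the default (1.0) applies -->
        <property name="version" value="2.0"/>
        <property name="xpath">
            <value><![CDATA[
//FormalParameter[pmd-java:typeIs('java.lang.String[]')]
            ]]></value>
        </property>
    </properties>
</rule>
```

The version property is declared alongside the xpath property, so switching versions only requires changing its value and adapting the expression.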

As of PMD version 6.22.0, XPath version 1.0 and the 1.0 compatibility mode are deprecated. XPath 2.0 is superior in many ways, for example in its support for type checking, sequence values, and quantified expressions. For a detailed but approachable review of the features of XPath 2.0 and above, see the Saxon documentation.

We recommend migrating to 2.0 before 7.0.0, although we expect to be able to provide an automatic migration tool when 7.0.0 is released. See the migration guide below.

DOM representation of ASTs

XPath rules view the AST as an XML-like DOM, which is what the XPath language is defined on. Concretely, this means:

  • Every AST node is viewed as an XML element
    • The element's local name is the value of getXPathNodeName for the given node
  • Some Java getters are exposed as XML attributes on those elements
    • This means that documentation for attributes can be found in our Javadocs. For example, the attribute @SimpleName of the Java node EnumDeclaration is backed by the Java getter getSimpleName.
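As an illustration of this mapping, the sketch below shows how a small Java enum might appear to an XPath expression. It is heavily abridged: only the EnumDeclaration element and its @SimpleName attribute come from the description above, while the enclosing element, the elided nodes, and the sample source are assumptions made for illustration.

```xml
<!-- Java source being analyzed (hypothetical):  enum Color { RED, GREEN }

     Abridged, illustrative view of the tree as XPath sees it.
     Intermediate and child nodes are elided; other attributes are omitted. -->
<CompilationUnit>
    <!-- intermediate nodes elided -->
    <EnumDeclaration SimpleName="Color">
        <!-- child nodes elided -->
    </EnumDeclaration>
</CompilationUnit>

<!-- An expression such as //EnumDeclaration[@SimpleName = 'Color']
     would select the EnumDeclaration element above. -->
```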

Value conversion

To represent attributes, we must map Java values to XPath Data Model (XDM) values. The conversion depends on the XPath version used.

XPath 1.0

In XPath 1.0 mode, we map every Java value to an xs:string value using the object's toString method. Since XPath 1.0 allows many implicit conversions, this works, but it causes some incompatibilities with XPath 2.0 (see the section about migration further down).

XPath 2.0

XPath 2.0 is a strongly typed language, and so we use more precise type annotations. In the following table we refer to the type conversion function as conv, a function from Java types to XDM types.

| Java type T | XSD type conv(T) |
|-------------|------------------|
| int         | xs:integer |
| long        | xs:integer |
| double      | xs:decimal |
| float       | xs:decimal |
| boolean     | xs:boolean |
| String      | xs:string |
| Character   | xs:string |
| Enum<E>     | xs:string (uses Object::toString) |
| List<E>     | conv(E)* (a sequence type). ⚠️ List support is deprecated with 6.25.0. See below. |

The same conv function is used to translate rule property values to XDM values.
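To illustrate how the conversion applies to rule properties, the sketch below declares a hypothetical integer property and compares it against numeric attributes in an XPath 2.0 expression. The property name maxLines, its value, and the expression are assumptions. With the mapping above, an integer property would be exposed as xs:integer, so the comparison is numeric and involves no string coercion.

```xml
<properties>
    <property name="version" value="2.0"/>
    <!-- Hypothetical property; an integer value is exposed to XPath as xs:integer -->
    <property name="maxLines" type="Integer" value="50"
              description="Maximum number of lines a method may span"/>
    <property name="xpath">
        <value><![CDATA[
//MethodDeclaration[@EndLine - @BeginLine + 1 > $maxLines]
        ]]></value>
    </property>
</properties>
```

The property is referenced in the expression as a variable, $maxLines. The spaces around the minus sign matter, since a hyphen directly after a name would be parsed as part of that name.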

Migrating from 1.0 to 2.0

XPath 1.0 and 2.0 have some incompatibilities. The XPath 2.0 specification describes them precisely. These are, however, mostly corner cases, and XPath rules usually don’t feature any of them.

The incompatibilities that are most relevant to migrating your rules are not caused by the specification, but by the different engines we use to run XPath 1.0 and 2.0 queries. Here’s a list of known incompatibilities:

  • The namespace prefixes fn: and string: should not be mentioned explicitly. In XPath 2.0 mode, the engine will complain about an undeclared namespace, but the functions are in the default namespace. Removing the namespace prefixes fixes it.
    • fn:substring("Foo", 1) → substring("Foo", 1)
  • Conversely, calls to custom PMD functions like typeIs must be prefixed with the namespace of the declaring module (pmd-java).
    • typeIs("Foo") → pmd-java:typeIs("Foo")
  • Boolean attribute values on our 1.0 engine are represented as the string values "true" and "false". In 2.0 mode, boolean values are represented as actual booleans, which in XPath may only be obtained through the functions true() and false(). If your XPath 1.0 rule tests an attribute like @Private="true", then it just needs to be changed to @Private=true() when migrating. A type error will warn you that you must update the comparison. More is explained in issue #1244.
    • "true", 'true' → true()
    • "false", 'false' → false()
  • In XPath 1.0, comparing a number to a string coerces the string to a number. In XPath 2.0, a type error occurs. As with boolean values, numeric values are represented by our 1.0 implementation as strings, meaning that @BeginLine > "1" worked. That is no longer the case in 2.0 mode.
    • @ArgumentCount > '1' → @ArgumentCount > 1
  • In XPath 1.0, the expression /Foo matches the children of the root named Foo. In XPath 2.0, that expression matches the root, if it is named Foo. Consider the following tree:
    Foo
    ├─ Foo
    └─ Foo

    Then /Foo will match the root in XPath 2.0, and the other nodes (but not the root) in XPath 1.0. See, for example, an issue caused by this in Apex, with nested classes.
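Several of the points above can be seen together in a before-and-after sketch. The expression is hypothetical and only serves to combine a boolean comparison, a numeric comparison, and a call to a PMD function in a single rule.

```xml
<!-- Before: XPath 1.0 (hypothetical expression) -->
<properties>
    <property name="version" value="1.0"/>
    <property name="xpath">
        <value><![CDATA[
//MethodDeclaration[@Private = "true" and @BeginLine > "1"]
    //FormalParameter[typeIs("java.lang.String[]")]
        ]]></value>
    </property>
</properties>

<!-- After: XPath 2.0 -->
<properties>
    <property name="version" value="2.0"/>
    <property name="xpath">
        <value><![CDATA[
//MethodDeclaration[@Private = true() and @BeginLine > 1]
    //FormalParameter[pmd-java:typeIs("java.lang.String[]")]
        ]]></value>
    </property>
</properties>
```

Three things change: the string "true" becomes true(), the quoted number becomes a numeric literal, and typeIs gains the pmd-java: prefix.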

Rule properties

See Defining rule properties

PMD extension functions

PMD provides some language-specific XPath functions to access semantic information from the AST.

In XPath 2.0 mode, the namespace of custom PMD functions must be explicitly mentioned.

All languages

Functions available to all languages are in the namespace pmd.

Function name Description
fileName Returns the simple name of the current file

pmd:fileName() as xs:string

Returns the current simple file name, without path but including the extension. This can be used to write rules that check file naming conventions.
Since
PMD 6.38.0
Remarks
This function requires the context node to be an element.
Examples
//b[pmd:fileName() = 'Foo.xml']
Matches any <b> tag in files called Foo.xml.
startLine Returns the start line of the given node

pmd:startLine(xs:element) as xs:int

Returns the line where the node starts in the source file. Line numbers are 1-based.
Since
PMD 6.44.0
Remarks
The function is not context-dependent, but takes a node as its first parameter. The function is only available in XPath 2.0.
Parameters
element as xs:element
Any element node
Examples
//b[pmd:startLine(.) > 5]
Matches any <b> node which starts after the fifth line.
endLine Returns the end line of the given node

pmd:endLine(xs:element) as xs:int

Returns the line where the node ends in the source file. Line numbers are 1-based.
Since
PMD 6.44.0
Remarks
The function is not context-dependent, but takes a node as its first parameter. The function is only available in XPath 2.0.
Parameters
element as xs:element
Any element node
Examples
//b[pmd:endLine(.) = pmd:startLine(.)]
Matches any <b> node which doesn't span more than one line.
startColumn Returns the start column of the given node (inclusive)

pmd:startColumn(xs:element) as xs:int

Returns the column number where the node starts in the source file. Column numbers are 1-based. The start column is inclusive.
Since
PMD 6.44.0
Remarks
The function is not context-dependent, but takes a node as its first parameter. The function is only available in XPath 2.0.
Parameters
element as xs:element
Any element node
Examples
//b[pmd:startColumn(.) = 1]
Matches any <b> node which starts on the first column of a line
endColumn Returns the end column of the given node (exclusive)

pmd:endColumn(xs:element) as xs:int

Returns the column number where the node ends in the source file. Column numbers are 1-based. The end column is exclusive.
Since
PMD 6.44.0
Remarks
The function is not context-dependent, but takes a node as its first parameter. The function is only available in XPath 2.0.
Parameters
element as xs:element
Any element node
Examples
//b[pmd:startLine(.) = pmd:endLine(.) and pmd:endColumn(.) - pmd:startColumn(.) = 1]
Matches any <b> node which spans exactly one character
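As a sketch of how these location functions fit into a rule, the definition below flags <b> elements that span more than one line. The rule name, message, description, and priority are hypothetical, and the class attribute assumes a PMD 6.x ruleset.

```xml
<!-- Hypothetical rule for XML files; name and message are illustrative only -->
<rule name="SingleLineBoldOnly"
      language="xml"
      message="Keep &lt;b&gt; elements on a single line"
      class="net.sourceforge.pmd.lang.rule.XPathRule">
    <description>Flags b elements that span more than one line.</description>
    <priority>3</priority>
    <properties>
        <!-- The location functions are only available in XPath 2.0 -->
        <property name="version" value="2.0"/>
        <property name="xpath">
            <value><![CDATA[
//b[pmd:endLine(.) > pmd:startLine(.)]
            ]]></value>
        </property>
    </properties>
</rule>
```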

Java

Java functions are in the namespace pmd-java.

Function name Description
typeIs Tests a node's static type

pmd-java:typeIs(xs:string) as xs:boolean

Returns true if the context node's static Java type is a subtype of the given type. This tests the resolved type of the Java construct, not the type of the AST node. For example, the AST node for a literal (e.g. 5d) has type ASTLiteral, but this function compares the type of the literal (here, double) against the argument.
Remarks
The context node must be a TypeNode
Parameters
javaQualifiedName as xs:string
The qualified name of a Java class, possibly with pairs of brackets to indicate an array type. Can also be a primitive type name.
Examples
//FormalParameter[pmd-java:typeIs("java.lang.String[]")]
Matches formal parameters of type String[] (including vararg parameters)
//VariableDeclaratorId[pmd-java:typeIs("java.util.List")]
Matches variable declarators of type List or any of its subtypes (including e.g. ArrayList)
typeIsExactly Tests a node's static type, ignoring subtypes

pmd-java:typeIsExactly(xs:string) as xs:boolean

Returns true if the context node's static type is exactly the given type. In particular, returns false if the context node's type is a subtype of the given type.
Remarks
The context node must be a TypeNode
Parameters
javaQualifiedName as xs:string
The qualified name of a Java class, possibly with pairs of brackets to indicate an array type. Can also be a primitive type name.
Examples
//VariableDeclaratorId[pmd-java:typeIsExactly("java.util.List")]
Matches variable declarators of type List (but not e.g. ArrayList)
metric Computes and returns the value of a metric

pmd-java:metric(xs:string) as xs:decimal?

Returns the value of the metric as evaluated on the context node.
Remarks
The context node must be an ASTAnyTypeDeclaration or a MethodLikeNode
Parameters
metricKey as xs:string
The name of an enum constant in JavaOperationMetricKey or JavaClassMetricKey
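A minimal sketch of the metric function in use follows. It assumes that CYCLO is a constant of JavaOperationMetricKey and that MethodDeclaration is a valid context node for it; the threshold of 10 is arbitrary.

```xml
<properties>
    <property name="version" value="2.0"/>
    <property name="xpath">
        <value><![CDATA[
//MethodDeclaration[pmd-java:metric('CYCLO') > 10]
        ]]></value>
    </property>
</properties>
```

Because the declared return type is xs:decimal?, the function may return an empty sequence. In that case the comparison evaluates to false and the node is not matched.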