Introduction to writing PMD rules | PMD Source Code Analyzer

Writing your own PMD rules

PMD is a framework to perform code analysis. You can create your own rules to check for patterns specific to your codebase, or the coding practices of your team.

How rules work: the AST

Before running rules, PMD parses the source file into a data structure called an abstract syntax tree (AST). This tree represents the syntactic structure of the code and encodes the syntactic relations between source code elements. For instance, in Java, method declarations belong to a class: in the AST, the nodes representing method declarations are descendants of the node representing the declaration of their enclosing class. This representation is thus much richer than the original source code (which, to a program, is just a sequence of characters) or the token stream produced by a lexer (which is, for example, what Checkstyle works on). For example:

Sample code (Java):

class Foo extends Object {

}

AST:

└─ CompilationUnit
   └─ TypeDeclaration
      └─ ClassOrInterfaceDeclaration "Foo"
         ├─ ExtendsList
         │  └─ ClassOrInterfaceType "Object"
         └─ ClassOrInterfaceBody

Conceptually, PMD rules work by matching a “pattern” against the AST of a file. Rules explore the AST and find nodes that satisfy some conditions that are characteristic of the specific thing the rule is trying to flag. Rules then report a violation on these nodes.
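To make the idea of "matching a pattern against the AST" concrete, here is a minimal, self-contained sketch. It does not use PMD's API: the Node class, the sampleTree helper and the findViolations rule are all hypothetical stand-ins, invented for illustration. The sketch rebuilds the tree shown above for class Foo extends Object { } and flags class declarations whose extends list names Object.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these classes are NOT PMD's Node API. */
final class AstRuleSketch {

    /** A minimal tree node: a node type, an optional image (e.g. an identifier), and children. */
    static final class Node {
        final String type;
        final String image; // may be null
        final List<Node> children;

        Node(String type, String image, Node... children) {
            this.type = type;
            this.image = image;
            this.children = List.of(children);
        }
    }

    /** Builds the tree shown above for: class Foo extends Object { } */
    static Node sampleTree() {
        return new Node("CompilationUnit", null,
            new Node("TypeDeclaration", null,
                new Node("ClassOrInterfaceDeclaration", "Foo",
                    new Node("ExtendsList", null,
                        new Node("ClassOrInterfaceType", "Object")),
                    new Node("ClassOrInterfaceBody", null))));
    }

    /** Hypothetical rule: report class declarations that explicitly extend Object. */
    static List<String> findViolations(Node root) {
        List<String> violations = new ArrayList<>();
        visit(root, violations);
        return violations;
    }

    /** Explores the whole tree, checking the rule's condition on every node. */
    private static void visit(Node node, List<String> violations) {
        if (node.type.equals("ClassOrInterfaceDeclaration") && extendsObject(node)) {
            violations.add("Class " + node.image + " explicitly extends Object");
        }
        for (Node child : node.children) {
            visit(child, violations);
        }
    }

    /** The "pattern": an ExtendsList child containing a type whose image is Object. */
    private static boolean extendsObject(Node classDeclaration) {
        for (Node child : classDeclaration.children) {
            if (child.type.equals("ExtendsList")) {
                for (Node type : child.children) {
                    if ("Object".equals(type.image)) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Prints: Class Foo explicitly extends Object
        findViolations(sampleTree()).forEach(System.out::println);
    }
}
```

An XPath rule would express a comparable pattern as a query, roughly //ClassOrInterfaceDeclaration[ExtendsList/ClassOrInterfaceType[@Image = 'Object']], assuming the node names shown above and an Image attribute holding the type name; verify both in the Rule Designer for your PMD version.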

Discovering the AST

ASTs are represented by Java classes deriving from Node. Each PMD language has its own set of such classes, and its own rules about how these classes relate to one another, based on the grammar of the language. For example, all Java AST nodes extend JavaNode.

The structure of the AST can be discovered through:

the Rule Designer
the AST dump feature
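For intuition about what a tree dump conveys, the sketch below renders a toy tree in the indented layout used for the AST example earlier on this page. It is not PMD's dump implementation, and PMD's actual output format may differ; the Node class and the dump method are hypothetical names introduced only for this illustration.

```java
import java.util.List;

/** Toy model only: not PMD's Node API and not its actual dump format. */
final class AstDumpSketch {

    /** A minimal tree node: a node type, an optional image, and children. */
    static final class Node {
        final String type;
        final String image; // may be null
        final List<Node> children;

        Node(String type, String image, Node... children) {
            this.type = type;
            this.image = image;
            this.children = List.of(children);
        }
    }

    // Box-drawing connectors, written as Unicode escapes so the file
    // compiles regardless of the source encoding.
    private static final String LAST_BRANCH = "\u2514\u2500 "; // corner, for the last child
    private static final String MID_BRANCH = "\u251C\u2500 ";  // tee, for earlier children
    private static final String PIPE = "\u2502  ";             // continues a parent's branch
    private static final String GAP = "   ";                   // nothing left to continue

    /** Renders the tree rooted at the given node, one node per line. */
    static String dump(Node root) {
        StringBuilder out = new StringBuilder();
        render(root, "", true, out);
        return out.toString();
    }

    private static void render(Node node, String indent, boolean isLast, StringBuilder out) {
        out.append(indent).append(isLast ? LAST_BRANCH : MID_BRANCH).append(node.type);
        if (node.image != null) {
            out.append(" \"").append(node.image).append('"');
        }
        out.append('\n');
        // Children are indented one level deeper than their parent.
        String childIndent = indent + (isLast ? GAP : PIPE);
        for (int i = 0; i < node.children.size(); i++) {
            boolean lastChild = i == node.children.size() - 1;
            render(node.children.get(i), childIndent, lastChild, out);
        }
    }

    public static void main(String[] args) {
        Node tree = new Node("CompilationUnit", null,
            new Node("TypeDeclaration", null,
                new Node("ClassOrInterfaceDeclaration", "Foo",
                    new Node("ExtendsList", null,
                        new Node("ClassOrInterfaceType", "Object")),
                    new Node("ClassOrInterfaceBody", null))));
        System.out.print(dump(tree));
    }
}
```

Reading such a dump next to the source file is a quick way to learn which node types and nesting a rule has to look for.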

Writing new rules

PMD supports two ways to define rules: using an XPath query, or using a Java visitor. XPath rules are much easier to set up, since they’re defined directly in your ruleset XML, and are expressive enough for nearly any task.

On the other hand, some parts of PMD’s API are only accessible from Java, e.g. accessing the usages of a declaration. Java rules also let you do complex processing that an XPath rule could not scale to.

In the end, the choice between the two strategies depends on how complex your rule is. I’d advise sticking to XPath unless you have no other choice.

XML rule definition

New rules must be declared in a ruleset before they can be referenced. This applies to both XPath and Java rules. To declare a rule, use the rule element, but instead of a ref attribute, give it a class attribute naming the implementation class of your rule.

For Java rules: this is the class extending AbstractRule (transitively)
For XPath rules: this is net.sourceforge.pmd.lang.rule.XPathRule

Example:

<rule name="MyJavaRule"
      language="java"
      message="Violation!"
      class="com.me.MyJavaRule" >
    <description>
        Description
    </description>
    <priority>3</priority>
</rule>

Note: In PMD 7, the language attribute will be required on all rule elements that declare a new rule. Some base rule classes set the language implicitly in their constructor, and so this is not required in all cases for the rule to work. But this behavior will be discontinued in PMD 7, so missing language attributes are reported beginning with PMD 6.27.0 as a forward compatibility warning.

Resource index

To learn how to write a rule:

Your First Rule introduces the basic development process of a rule with a running example
Writing XPath Rules explains a bit more about XPath rules and our XPath API
Writing Java Rules describes how to write a rule in Java

To go further:

Defining Properties describes how to make your rules more configurable with rule properties
Testing your Rules introduces our testing framework and how you can use it to safeguard the quality of your rule
