1. Start with a new sub-module.
- See pmd-swift for examples.
2. Implement an AST parser for your language
- ANTLR will generate the parser for you based on the grammar file. The grammar file needs to be placed in
src/main/antlr4 in the appropriate sub package ast of the language. E.g. for Swift, the grammar file is Swift.g4 and is placed in the package
3. Create AST node classes
- The individual AST nodes are generated, but you need to define the common interface for them.
- You need to define the supertype interface for all nodes of the language. For that, we provide
SwiftNode as an example.
- Additionally, you need several base classes:
- a language specific inner node - these nodes represent the production rules from the grammar.
In ANTLR, they are called “ParserRuleContext”. We call them “InnerNode”. Use the
base class BaseAntlrInnerNode from pmd-core. An example is
- a language specific root node - this provides the root of the AST, and our parser will return
subtypes of this node. The root node itself is an “InnerNode”.
- a language specific terminal node.
- a language specific error node.
- In order for the generated code to match and use our custom classes, we have a common ant script that adjusts
the generated code. The ant script is antlr4-wrapper.xml and does not need to be modified, since it has plenty of parameters to set. The ant script is added in the language module’s pom.xml, where the parameters are set (e.g. the name of the root node class). Have a look at Swift’s example:
- You can add additional methods in your “InnerNode” (e.g. SwiftInnerNode) that are available on all nodes. But in most cases you won’t need to do anything.
4. Generate your parser
- Make sure, you have the property
- This is just a matter of building the language module. ANTLR is called via ant, and this step is added
to the phase generate-sources. So you can just call e.g.
./mvnw generate-sources -pl pmd-swift to have the parser generated.
- The generated code will be placed under target/generated-sources/antlr4 and will not be committed to source control.
- You should review the swift pom.
5. Create a TokenManager
- This is needed to support CPD (copy paste detection)
- We provide a default implementation using
- You must create your own “AntlrTokenizer” such as we do with
If you wish to filter specific tokens (e.g. comments to support CPD suppression via “CPD-OFF” and “CPD-ON”), you can create your own implementation of
AntlrTokenFilter. You’ll then need to override the protected method
getTokenFilter(AntlrTokenManager) and return your custom filter. See the tokenizer for C# as an example:
If you don’t need a custom token filter, you don’t need to override the method. It returns the default
AntlrTokenFilter, which doesn’t filter anything.
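To make the idea of CPD suppression concrete, here is a minimal standalone Java sketch of what such a filter does conceptually. It does not use PMD’s AntlrTokenFilter API: the class CpdSuppressionSketch, its Token type and the method keptTexts are hypothetical names chosen for illustration, and the sketch assumes that comments themselves are never passed on to duplicate detection.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only: this is NOT PMD's AntlrTokenFilter API.
final class CpdSuppressionSketch {

    /** Hypothetical minimal token: its text and whether it is a comment. */
    static final class Token {
        final String text;
        final boolean comment;

        Token(String text, boolean comment) {
            this.text = text;
            this.comment = comment;
        }
    }

    /** Returns the texts of all tokens that survive CPD-OFF / CPD-ON filtering. */
    static List<String> keptTexts(List<Token> tokens) {
        List<String> kept = new ArrayList<>();
        boolean suppressed = false;
        for (Token token : tokens) {
            if (token.comment) {
                // Comments only toggle the suppression state (assumption of this sketch).
                if (token.text.contains("CPD-OFF")) {
                    suppressed = true;
                } else if (token.text.contains("CPD-ON")) {
                    suppressed = false;
                }
                continue;
            }
            if (!suppressed) {
                kept.add(token.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
                new Token("let", false),
                new Token("// CPD-OFF", true),
                new Token("generated", false),
                new Token("// CPD-ON", true),
                new Token("x", false));
        System.out.println(keptTexts(tokens)); // prints [let, x]
    }
}
```

A real filter would be implemented against AntlrTokenFilter and returned from getTokenFilter(AntlrTokenManager), as described above; the sketch only shows the on/off state that such a filter has to track.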
6. Create a PMD parser “adapter”
- Create your own parser that adapts the ANTLR interface to PMD’s parser interface.
- We provide an AntlrBaseParser implementation that you need to extend to create your own adapter as we do with
7. Create a rule violation factory
- This is an optional step. Most likely, the default implementation will do what you need.
The default implementation is
- The purpose of a rule violation factory is to create a rule violation instance for your handler (spoiler). In case you want to provide additional data in your rule violation, you can create a custom one. However, adding additional data here is discouraged, as you would need a custom renderer to actually use this additional data. Such extensions are not language agnostic.
8. Create a version handler
- Now you need to create your version handler, as we did with
- This class is sort of a gateway between PMD and all parsing logic specific to your language. It has 2 purposes:
- The getRuleViolationFactory method returns an instance of your rule violation factory (see step #7). By default, this returns the default rule violation factory.
- getParser returns an instance of your parser adapter (see step #6). That’s the only method that needs to be implemented here.
9. Create a parser visitor adapter
- A parser visitor adapter is not needed anymore with PMD 7. The visitor interface now provides a default implementation.
- The visitor for ANTLR based AST is generated along the parser from the ANTLR grammar file. The
base interface for a visitor is
- The generated visitor class for Swift is called
- In order to help use this visitor later on, a base visitor class should be created. See SwiftVisitorBase as an example.
10. Create a rule chain visitor
- This step is not needed anymore. For using rule chain, there is no additional adjustment necessary anymore in the languages.
- This feature has been merged into AbstractRule via the overridable method
AbstractRule#buildTargetSelector. Individual rules can make use of this optimization by overriding this method and returning an appropriate RuleTargetSelector.
11. Make PMD recognize your language
- Create your own subclass of net.sourceforge.pmd.lang.BaseLanguageModule, see Swift as an example:
- Add your default version with addDefaultVersion in your language module’s constructor.
- Add for each additional version of your language a call to
- Create the service registration via the text file
src/main/resources/META-INF/services/net.sourceforge.pmd.lang.Language. Add your fully qualified class name as a single line into it.
12. Create an abstract rule class for the language
- You need to create your own AbstractRule in order to interface your language with PMD’s generic rule execution. See AbstractSwiftRule as an example.
- While the rule basically just extends AntlrBaseRule without adding anything, every language should have its own base class for rules. This helps to organize the code.
- All other rules for your language should extend this class. The purpose of this class is to provide a visitor
via the method buildVisitor() for analyzing the AST. The provided visitor only implements the visit methods for specific AST nodes. The other node types use the default behavior and you don’t need to care about them.
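The “override only what you need” behavior of such a visitor can be sketched in standalone Java as follows. None of the types below are PMD or ANTLR classes: VisitorDefaultsSketch, Node, Block, FunctionCall, Visitor and CallCollector are hypothetical names used only to illustrate the pattern, in which every visit method defaults to descending into the children and a rule overrides just the methods for the node types it cares about.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only: none of these types are PMD or ANTLR classes.
final class VisitorDefaultsSketch {

    /** Hypothetical AST node with children and double dispatch. */
    abstract static class Node {
        final List<Node> children = new ArrayList<>();

        Node add(Node child) {
            children.add(child);
            return this;
        }

        abstract void accept(Visitor visitor);
    }

    static final class Block extends Node {
        @Override
        void accept(Visitor visitor) {
            visitor.visitBlock(this);
        }
    }

    static final class FunctionCall extends Node {
        final String name;

        FunctionCall(String name) {
            this.name = name;
        }

        @Override
        void accept(Visitor visitor) {
            visitor.visitFunctionCall(this);
        }
    }

    /** Base visitor: every visit method defaults to visiting the children. */
    static class Visitor {
        void visitChildren(Node node) {
            for (Node child : node.children) {
                child.accept(this);
            }
        }

        void visitBlock(Block node) {
            visitChildren(node);
        }

        void visitFunctionCall(FunctionCall node) {
            visitChildren(node);
        }
    }

    /** A rule-like visitor that only overrides the method for function calls. */
    static final class CallCollector extends Visitor {
        final List<String> found = new ArrayList<>();

        @Override
        void visitFunctionCall(FunctionCall node) {
            found.add(node.name);
            visitChildren(node); // keep descending, like the default does
        }
    }

    static List<String> collectCalls(Node root) {
        CallCollector collector = new CallCollector();
        root.accept(collector);
        return collector.found;
    }

    public static void main(String[] args) {
        Node root = new Block()
                .add(new FunctionCall("print"))
                .add(new Block().add(new FunctionCall("exit")));
        System.out.println(collectCalls(root)); // prints [print, exit]
    }
}
```

The collector never mentions Block, yet nested calls are still found, because the inherited default for visitBlock keeps walking the tree.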
13. Create rules
- Creating rules is already pretty well documented in PMD - and it’s no different for a new language, except you may have different AST nodes.
- PMD supports 2 types of rules, through visitors or XPath.
- To add a visitor rule:
- You need to extend the abstract rule you created in the previous step. You can use the Swift rule UnavailableFunctionRule as an example. Note that all rule classes should be suffixed with
Rule and should be placed in a package that corresponds to their category.
- To add an XPath rule you can follow our guide Writing XPath Rules.
14. Test the rules
- See UnavailableFunctionRuleTest for example. Each rule has its own test class.
- You have to create the category rule set for your language (see pmd-swift/src/main/resources/bestpractices.xml for example)
- When executing the test class
- this triggers the unit test to read the corresponding XML file with the rule test data
- This test XML file contains sample pieces of code which should trigger a specified number of violations of this rule. The unit test will execute the rule on this piece of code, and verify that the number of violations matches.
To verify the validity of all the created rulesets, create a subclass of the ruleset factory test base class (see
RuleSetFactoryTest in pmd-swift for example). This will load all rulesets and verify that all required attributes are provided.
Note: You’ll need to add your ruleset to
categories.properties, so that it can be found.