First of all, thanks for the contribution!
Happily for you, adding CPD support for a new language is now easier than ever!
All you need to do is follow these few steps:
- Create a new module for your language; you can take Go as an example
- Create a Tokenizer
  - For Antlr grammars you can take the grammar from here and extend AntlrTokenizer, taking Go as an example:

        public class GoTokenizer extends AntlrTokenizer {

            @Override
            protected AntlrTokenManager getLexerForSource(SourceCode sourceCode) {
                CharStream charStream = AntlrTokenizer.getCharStreamFromSourceCode(sourceCode);
                return new AntlrTokenManager(new GolangLexer(charStream), sourceCode.getFileName());
            }
        }

  - For JavaCC grammars you should subclass JavaCCTokenizer, which has many examples you could follow. You should also take the Python implementation as a reference
  - For any other scenario you can use AnyTokenizer
- Create your Language class

      public class GoLanguage extends AbstractLanguage {

          public GoLanguage() {
              super("Go", "go", new GoTokenizer(), ".go");
          }
      }

  Pro Tip: Yes, keep looking at Go!

You are almost there!
- Update the list of supported languages
  - Write the fully-qualified name of your Language class to the file src/main/resources/META-INF/services/net.sourceforge.pmd.cpd.Language
  - Update the test that asserts the list of supported languages by updating the SUPPORTED_LANGUAGES constant in BinaryDistributionIT
- Please don’t forget to add some tests. Once again, you can look at the Go implementation ;)
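
The file path used in the "Update the list of supported languages" step follows the standard provider-configuration convention read by java.util.ServiceLoader. Assuming CPD discovers languages through that mechanism, the sketch below shows how one line in such a file turns into a discovered class. ServiceFileDemo, its nested Language interface and MyLanguage are hypothetical stand-ins for illustration, not PMD types.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class ServiceFileDemo {

    /** Hypothetical stand-in for a language interface; not PMD's own type. */
    public interface Language {
        String getName();
    }

    /** Hypothetical provider; ServiceLoader needs a public no-argument constructor. */
    public static class MyLanguage implements Language {
        public MyLanguage() { }

        @Override
        public String getName() {
            return "My";
        }
    }

    /** Writes a provider-configuration file to a temp directory and loads providers from it. */
    static List<String> discover() {
        try {
            Path root = Files.createTempDirectory("services-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // The file is named after the service interface and lists one provider class name per line.
            Files.write(services.resolve(Language.class.getName()),
                    MyLanguage.class.getName().getBytes(StandardCharsets.UTF_8));

            List<String> names = new ArrayList<>();
            URL[] classPath = { root.toUri().toURL() };
            try (URLClassLoader loader =
                         new URLClassLoader(classPath, ServiceFileDemo.class.getClassLoader())) {
                for (Language language : ServiceLoader.load(Language.class, loader)) {
                    names.add(language.getName());
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discover());
    }
}
```

In a real module the file is a static resource rather than something generated at runtime; the temp directory here only keeps the sketch self-contained. If the class name in the file is misspelled, ServiceLoader fails with a ServiceConfigurationError when iterating, which is one reason a test asserting the list of supported languages is useful.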
If you have read this far, I’m inclined to think you would also love to support some extra CPD configuration (ignoring imports or crazy things like that).
If that’s the case, you have come to the right place!
- You can add your custom properties using a Token filter
  - For Antlr grammars all you need to do is implement your own AntlrTokenFilter.
    And by now, I know where you are going to look…
    WRONG
    Why do you want Go to solve all your problems?
    You should take a look at the Kotlin token filter implementation
  - For non-Antlr grammars you can use BaseTokenFilter directly or take a peek at Java’s token filter
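
To make the idea of a token filter concrete, here is a minimal sketch of the "ignore imports" case on a plain list of token strings. It deliberately does not use AntlrTokenFilter or BaseTokenFilter, whose exact methods are not shown here; ImportSkippingFilterSketch and its filter method are hypothetical names, and the rule "skip from import up to the next semicolon" is an assumption that fits Java-like syntax only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ImportSkippingFilterSketch {

    /** Returns the tokens with every "import ... ;" statement dropped. */
    static List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        boolean insideImport = false;
        for (String token : tokens) {
            if (insideImport) {
                // Discard everything up to and including the terminating semicolon.
                if (";".equals(token)) {
                    insideImport = false;
                }
            } else if ("import".equals(token)) {
                insideImport = true;
            } else {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "import", "java", ".", "util", ".", "List", ";",
                "class", "A", "{", "}");
        System.out.println(filter(tokens)); // prints [class, A, {, }]
    }
}
```

A real filter would work on the lexer's token objects instead of strings and would be switched on by a CPD property, but the state machine (enter a skipped region, leave it, pass everything else through) is the same shape.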