ANTLR4, .NET Core 2.1, and C#: Getting Started

August 28, 2018 by Michael

For some reason (valid or not), you’ve decided you need to write a domain-specific language using ANTLR4, along with the interpreter to go with it.  As much as I would like to criticize your decision-making skills, I’ve already come to the same conclusion myself.  Rather than browbeat you for a questionable path to victory, I’ll just show you what I did to accomplish the same task.

This will be a series of posts that eventually end with a working .NET Core 2.1 project that will actually do something with your DSL. If you would prefer to learn from the code, you can find it on GitHub.


Getting Started with ANTLR4

While this is a tutorial about using ANTLR4 in a .NET world, we’re going to end up using a bit of Java to test our grammar files and to generate the C# files we need.  Don’t worry, you don’t actually have to write any Java; however, you will need the JDK.

Setting up the Java Development Kit

At the time of writing, you can find the current version of the Java SE downloads here (you want to download the Java Platform JDK):

When prompted, select the download package that matches your development platform (be sure to click Accept License Agreement).  I will be using the JDK for 64-bit Windows (Version 10.0.2).  It’s a large download:  perhaps it’s time to get some coffee or a soda…

Once your download has completed, go through the installation process using the default settings.  We aren’t doing anything fancy here with Java; we just need the tooling to generate the output we need from ANTLR.

After the installation has finished, you’ll need to add the JDK folder to your computer’s PATH system variable.  This will enable you to run javac from any folder.  You can find details about updating your system’s PATH variable here.  In my case I needed to add the following location to my system’s PATH variable:

C:\Program Files\Java\jdk-10.0.2\bin

With the JDK installed and the PATH system variable updated (running javac -version in a new Command Prompt window should print the version number), you’re now ready to start setting up the ANTLR4 tooling…


Setting up the ANTLR4 Tooling

The first thing you’ll need to do is head over to the ANTLR website and download the latest version of the ANTLR Java binaries.  In this tutorial we’ll be using version 4.7.1.  You can find a link to the JAR file here (the link to the Complete ANTLR 4.7.1 Java binaries JAR is about halfway down the page):

If Chrome (or any browser) tells you that the file can harm your computer and that it is dangerous, ignore the warning and keep the file anyway.

Once the download is complete, move the file to the folder where you keep all of your third party Java libraries.  If you’re primarily a .NET developer like me, that folder probably doesn’t exist yet.  I created a folder named JavaLib off of my root folder.  Move your newly downloaded JAR file to that folder.

Next you’ll need to modify your system’s CLASSPATH variable so that the TestRig can find the JAR file you just downloaded.  If this is your first time using a third party Java library you will most likely need to create the system variable.  You can read about setting up the CLASSPATH environment variable in Windows 10 here.

Note that the CLASSPATH value should end with a semicolon.  If it does not, you will most likely see an error about being unable to load the grammar class as a lexer or a parser.
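If you aren’t sure whether your CLASSPATH change actually took effect, a few lines of Java can tell you what the JVM sees.  The sketch below is not part of ANTLR; the class name ClasspathCheck and its helper methods are hypothetical, and spotting the JAR by a file name that starts with "antlr" is just an assumption that fits the download used in this post.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper (not part of ANTLR): lists the CLASSPATH entries
// and reports whether one of them looks like the ANTLR JAR.
public class ClasspathCheck {

    /** Splits a CLASSPATH-style value into its non-empty entries. */
    public static List<String> entries(String classpath, String separator) {
        List<String> result = new ArrayList<>();
        for (String part : classpath.split(Pattern.quote(separator))) {
            if (!part.trim().isEmpty()) {
                result.add(part.trim());
            }
        }
        return result;
    }

    /** Returns the first entry whose file name looks like an ANTLR JAR, if any. */
    public static Optional<String> findAntlrJar(List<String> entries) {
        for (String entry : entries) {
            // Accept both Windows and Unix style folder separators.
            int slash = Math.max(entry.lastIndexOf('\\'), entry.lastIndexOf('/'));
            String name = entry.substring(slash + 1).toLowerCase(Locale.ROOT);
            if (name.startsWith("antlr") && name.endsWith(".jar")) {
                return Optional.of(entry);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String classpath = System.getenv("CLASSPATH");
        if (classpath == null) {
            System.out.println("CLASSPATH is not set.");
            return;
        }
        List<String> entries = entries(classpath, File.pathSeparator);
        entries.forEach(e -> System.out.println("entry: " + e));

        Optional<String> jar = findAntlrJar(entries);
        if (jar.isPresent()) {
            boolean onDisk = new File(jar.get()).isFile();
            System.out.println("ANTLR jar: " + jar.get()
                    + (onDisk ? " (found on disk)" : " (file is missing)"));
        } else {
            System.out.println("No ANTLR jar found on CLASSPATH.");
        }
        System.out.println("Ends with separator: " + classpath.endsWith(File.pathSeparator));
    }
}
```

Save it as ClasspathCheck.java, compile it with javac ClasspathCheck.java, and run it with java ClasspathCheck from a freshly opened Command Prompt window so that it picks up the updated environment variables.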

The last thing you need to do is to create two batch files.  These files will save you some typing when you need to run the ANTLR tool or the TestRig. The contents of the two BAT files are as follows:

REM antlr4.bat
java org.antlr.v4.Tool %*

And…

REM grun.bat
java org.antlr.v4.gui.TestRig %*

Place both of these batch files in a folder and add that folder to your system’s PATH variable so that you can run them from any directory on your system.

Once the BAT files have been created and the folder added to your system’s PATH variable, open up a Command Prompt window and test the two BAT files you just created.  You should see output similar to what you see below:

C:\Users\mjay>antlr4
C:\Users\mjay>java org.antlr.v4.Tool
ANTLR Parser Generator  Version 4.7.1
 -o ___              specify output directory where all output is generated
 -lib ___            specify location of grammars, tokens files
 -atn                generate rule augmented transition network diagrams
 -encoding ___       specify grammar file encoding; e.g., euc-jp

And…

C:\Users\mjay>grun
C:\Users\mjay>java org.antlr.v4.gui.TestRig
java org.antlr.v4.gui.TestRig GrammarName startRuleName
  [-tokens] [-tree] [-gui] [-ps file.ps] [-encoding encodingname]
  [-trace] [-diagnostics] [-SLL]
  [input-filename(s)]
Use startRuleName='tokens' if GrammarName is a lexer grammar.
Omitting input-filename makes rig read from stdin.

Assuming you saw the expected results, you are ready to write your first grammar file…


Testing the Process…

In order to give everything a full test we’ll create a simple grammar file, generate the Java files, compile the Java source, and use the TestRig to verify the results.

In a folder of your choice (go ahead, be creative), create a new file named Calculator.g4 using whatever text editor you like (in a later post I’ll provide details about setting up various tools to edit grammar files).  Paste the following content into the Calculator.g4 file:

grammar Calculator;
expression: operand (OPERATOR operand)+;

operand: DIGIT | LPAREN operand (OPERATOR operand)+ RPAREN;

LPAREN: '(';
RPAREN: ')';

OPERATOR: ADD | SUBTRACT | MULTIPLY | DIVIDE;

ADD: '+';
SUBTRACT: '-';
MULTIPLY: '*';
DIVIDE: '/';

DIGIT: [0-9]+;
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines

The file name and the grammar name must be exactly the same, including casing.  In this example the file name must start with a capital C because the grammar name starts with a capital C.
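If you’d like the computer to catch a mismatch before ANTLR does, here is a rough sketch of that check.  It is not how ANTLR itself validates the name: GrammarNameCheck is a hypothetical helper, and the regular expression is a simplification that only looks for the first line that starts with a grammar declaration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of ANTLR): compares the name in the
// "grammar X;" declaration with the name of the .g4 file.
public class GrammarNameCheck {

    // Matches "grammar Name;", optionally prefixed with "lexer" or "parser".
    private static final Pattern DECLARATION = Pattern.compile(
            "^\\s*(?:lexer\\s+|parser\\s+)?grammar\\s+([A-Za-z_][A-Za-z0-9_]*)\\s*;",
            Pattern.MULTILINE);

    /** Returns the grammar name declared in the given grammar text, if any. */
    public static Optional<String> declaredName(String grammarText) {
        Matcher m = DECLARATION.matcher(grammarText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True when fileName is "<declared name>.g4", compared case-sensitively. */
    public static boolean matchesFileName(String fileName, String grammarText) {
        if (!fileName.endsWith(".g4")) {
            return false;
        }
        String base = fileName.substring(0, fileName.length() - ".g4".length());
        return declaredName(grammarText).map(base::equals).orElse(false);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "Calculator.g4");
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String fileName = file.getFileName().toString();
        System.out.println(matchesFileName(fileName, text)
                ? "OK: grammar name matches " + fileName
                : "Mismatch: " + fileName + " declares "
                        + declaredName(text).orElse("(no grammar declaration found)"));
    }
}
```

Run it from the folder that contains Calculator.g4, or pass the path to a different grammar file as the first argument.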

With your changes saved, open a Command Prompt window and navigate to the folder where you created the Calculator.g4 file.  Run the following commands:

> antlr4 Calculator.g4
> javac Calculator*.java
> grun Calculator expression -tree
(4+4)-((3-1)*7)
^Z

> grun Calculator expression -gui
(4+4)-((3-1)*7)
^Z

If you get an error message from any of the commands above, please review the text above and make sure you didn’t miss something along the way.

The first command antlr4 Calculator.g4 generates the Java files you’ll need to build the class file that the TestRig uses for parsing (we’ll get the C# files in the next installment of this series).  Note that the generated files will all begin with the name of the grammar you created (in this case, Calculator).

After the Java files are generated you need to compile them into the class files that the TestRig uses to actually parse the input you provide.  That’s what the javac Calculator*.java command does.

With the Java compiled into class files you are ready to run the TestRig via the grun.bat file we created earlier in this tutorial:  grun Calculator expression -tree.  The first parameter passed to the TestRig is the name of the grammar to load.  The second parameter is the name of the rule to be used to start parsing.  The -tree option is used to display the parse tree in LISP notation.  At this point the TestRig will let you type in a simple mathematical expression, followed by ENTER, CTRL+Z, and finally ENTER one more time.  The CTRL+Z tells the TestRig you are done providing input.

Assuming all went well, you should see the following output:

[Figure: ANTLR4 console output]
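To make the LISP-style tree a little less abstract, here is a small hand-written Java sketch that mirrors the two parser rules in Calculator.g4 and builds a similar nested string.  To be clear, this is not the code ANTLR generates and it doesn’t use the ANTLR runtime; CalculatorTreeSketch and its methods are hypothetical names, and the output format is only my approximation of what the -tree option prints.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration of the Calculator grammar (NOT ANTLR-generated).
public class CalculatorTreeSketch {
    private final List<String> tokens;
    private int pos = 0;

    private CalculatorTreeSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Splits input into DIGIT runs and single-character tokens; skips the WS characters. */
    public static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (" \t\r\n".indexOf(c) >= 0) {
                i++;                                   // WS -> skip
            } else if (c >= '0' && c <= '9') {
                int start = i;                         // DIGIT: [0-9]+
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                out.add(input.substring(start, i));
            } else if ("()+-*/".indexOf(c) >= 0) {
                out.add(String.valueOf(c));            // LPAREN, RPAREN, OPERATOR
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return out;
    }

    /** Parses the input starting at the expression rule and returns the tree as text. */
    public static String parse(String input) {
        return new CalculatorTreeSketch(tokenize(input)).expression();
    }

    // expression: operand (OPERATOR operand)+;
    private String expression() {
        return "(expression " + operandChain() + ")";
    }

    // The shared "operand (OPERATOR operand)+" part of both rules.
    private String operandChain() {
        StringBuilder sb = new StringBuilder(operand());
        int pairs = 0;
        while (pos < tokens.size() && "+-*/".contains(tokens.get(pos))) {
            sb.append(' ').append(next()).append(' ').append(operand());
            pairs++;
        }
        if (pairs == 0) {
            throw new IllegalArgumentException("Expected an operator at token " + pos);
        }
        return sb.toString();
    }

    // operand: DIGIT | LPAREN operand (OPERATOR operand)+ RPAREN;
    private String operand() {
        String token = next();
        if (Character.isDigit(token.charAt(0))) {
            return "(operand " + token + ")";
        }
        if (token.equals("(")) {
            String inner = operandChain();
            String close = next();
            if (!close.equals(")")) {
                throw new IllegalArgumentException("Expected ')' but found " + close);
            }
            return "(operand ( " + inner + " ))";
        }
        throw new IllegalArgumentException("Unexpected token: " + token);
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        return tokens.get(pos++);
    }

    public static void main(String[] args) {
        System.out.println(parse(args.length > 0 ? args[0] : "(4+4)-((3-1)*7)"));
    }
}
```

Each pair of parentheses in the result is one rule invocation, with the rule name first and its children after it, so every nested operand in the input shows up as a nested (operand …) group.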

Note that you can use the -tokens option to display a list of tokens found in the input string.  This can be helpful when attempting to debug your grammar.

The final command, grun Calculator expression -gui, follows the same process as the previous command; however, instead of displaying the parse tree in LISP notation, it displays the parse tree graphically.

At this point, assuming all has gone well, you should be pretty eager to actually do something in .NET Core.  That subject will be handled in the next post in this series.  Until then, if you have any questions or comments don’t hesitate to reach out.  Thanks for reading along.

Helpful Links

If you had trouble following along with the post, you’ll find some links below to additional resources that might help clear things up:

