Implementing a DSL with ANTLR and Go, step by step (Part 5): Error handling

Permalink to this article – https://ift.tt/tv1Vah4

Whether it is a client application or a cloud application, one thing must be done well before it can be used in production: error handling. In the previous articles in this series, we designed and validated the grammar, then built and validated the semantic model, but we paid no special attention to error handling. In this article, we fill that gap.

The design and implementation of a DSL goes through the following main stages, and the focus of error handling differs from stage to stage:

In grammar design and validation, we care mostly about the correctness of the grammar itself. A wrong grammar makes the parsing of the sample source fail, but this stage comes before the Parser code is generated. We debug the grammar with the tools that ANTLR provides, and there is no error handling code to write.

In parsing the source and building the parse tree, the grammar itself has already been fixed, so the generated Parser can parse any syntactically correct sample. Error handling at this stage is mainly about how to deal with syntax errors in the source being parsed.

In assembling and executing the semantic model, we focus on whether the values used to assemble the model make sense. Take windowsRange as an example: in the semantic model it has two elements, low and high, which together represent the range [low, high]. If low is greater than high in your source code, the source is still grammatically legal and parses without complaint, but it makes no sense semantically. During assembly and execution of the semantic model we need to detect such problems, report them and handle them.

In this article, we briefly explain the ideas and methods of error handling in the latter two stages.

1. Error handling of syntax parsing

Syntax parsing is like compiling a static language or interpreting a dynamic one: when a syntax error is found, the parser should report where in the source it occurred, along with any helpful auxiliary information. ANTLR’s Go runtime provides the ErrorListener interface and an empty implementation of it, DefaultErrorListener:

    // github.com/antlr/antlr4/runtime/Go/antlr/error_listener.go

    type ErrorListener interface {
        SyntaxError(recognizer Recognizer, offendingSymbol interface{}, line, column int, msg string, e RecognitionException)
        ReportAmbiguity(recognizer Parser, dfa *DFA, startIndex, stopIndex int, exact bool, ambigAlts *BitSet, configs ATNConfigSet)
        ReportAttemptingFullContext(recognizer Parser, dfa *DFA, startIndex, stopIndex int, conflictingAlts *BitSet, configs ATNConfigSet)
        ReportContextSensitivity(recognizer Parser, dfa *DFA, startIndex, stopIndex, prediction int, configs ATNConfigSet)
    }

The SyntaxError method of the ErrorListener interface is exactly what we need at this stage: it lets us catch the syntax errors found while parsing a source sample.

Parser comes with a built-in ErrorListener implementation, antlr.ConsoleErrorListener. However, this listener only writes a terse one-line message (line, column and message) to standard error, and it gives the program no way to find out whether parsing failed. We need a custom ErrorListener implementation that reports syntax errors in more detail.

Here is a Go version of VerboseErrorListener, which I implemented with reference to the Java example in The Definitive ANTLR 4 Reference:

    // tdat/error_listener.go

    type VerboseErrorListener struct {
        *antlr.DefaultErrorListener
        hasError bool
    }

    func NewVerboseErrorListener() *VerboseErrorListener {
        return new(VerboseErrorListener)
    }

    func (d *VerboseErrorListener) HasError() bool {
        return d.hasError
    }

    func (d *VerboseErrorListener) SyntaxError(recognizer antlr.Recognizer, offendingSymbol interface{}, line, column int, msg string, e antlr.RecognitionException) {
        p := recognizer.(antlr.Parser)
        stack := p.GetRuleInvocationStack(p.GetParserRuleContext())

        fmt.Printf("rule stack: %v ", stack[0])
        fmt.Printf("line %d: %d at %v : %s\n", line, column, offendingSymbol, msg)

        d.hasError = true
    }

While parsing the source code, the Parser calls back the SyntaxError method of VerboseErrorListener whenever it finds a syntax error. The arguments passed to SyntaxError carry the details of the error; we only need to assemble them into a suitable format and print them, as above.

In addition, VerboseErrorListener has a boolean hasError field that records whether a syntax error occurred while parsing the source file, so the program can choose its subsequent execution path based on this flag.

The following is the usage of VerboseErrorListener in the main function:

    func main() {
        ... ...
        lexer := parser.NewTdatLexer(input)
        stream := antlr.NewCommonTokenStream(lexer, 0)
        p := parser.NewTdatParser(stream)

        el := NewVerboseErrorListener()
        p.RemoveErrorListeners()
        p.AddErrorListener(el)

        tree := p.Prog()
        if el.HasError() {
            return
        }
        ... ...
    }

As the code above shows, after creating the TdatParser instance and before parsing the source code (p.Prog()), we remove the parser’s default built-in ErrorListener and add our own VerboseErrorListener instance. The main function then checks whether the VerboseErrorListener has seen a syntax error and, if so, terminates the program instead of continuing.
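A boolean flag only tells the caller that something went wrong. If the caller also needs the errors themselves, for example to return them instead of printing them, the listener could accumulate them. The following is a minimal sketch of that idea using only the standard library. SyntaxIssue and IssueCollector are hypothetical names that are neither part of the ANTLR runtime nor of tdat; Record is meant to be called from a SyntaxError callback with the line, column and msg it receives.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SyntaxIssue is a hypothetical record of one syntax error; it is not an ANTLR type.
type SyntaxIssue struct {
	Line, Column int
	Msg          string
}

func (s SyntaxIssue) Error() string {
	return fmt.Sprintf("line %d:%d %s", s.Line, s.Column, s.Msg)
}

// IssueCollector accumulates issues instead of printing them.
type IssueCollector struct {
	issues []SyntaxIssue
}

// Record would be called from the listener's SyntaxError callback.
func (c *IssueCollector) Record(line, column int, msg string) {
	c.issues = append(c.issues, SyntaxIssue{Line: line, Column: column, Msg: msg})
}

// Err returns nil when nothing was recorded, otherwise one error
// whose message lists every recorded issue on its own line.
func (c *IssueCollector) Err() error {
	if len(c.issues) == 0 {
		return nil
	}
	msgs := make([]string, len(c.issues))
	for i, issue := range c.issues {
		msgs[i] = issue.Error()
	}
	return errors.New(strings.Join(msgs, "\n"))
}

func main() {
	c := &IssueCollector{}
	c.Record(2, 7, "mismatched input 'Aach'")
	c.Record(2, 32, "extraneous input 'e'")
	if err := c.Err(); err != nil {
		fmt.Println(err)
	}
}
```

With such a collector, the check in main would become a test of Err() against nil rather than of HasError().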

We add a sample, sample5-invalid.t, that contains syntax errors:

    // tdat/samples/sample5-invalid.t
    r0006: Aach { |1,3| ($speed < 50e) and (($temperature + 1) < 4) or ((roundDown($salinity) <= 600.0) or (roundUp($ph) > 8.0)) } => ();

Running the tdat program on sample5-invalid.t gives the following result:

    $./tdat samples/sample5-invalid.t
    input file: samples/sample5-invalid.t
    rule: enumerableFunc line 2: 7 at [@2,8:11='Aach',<29>,2:7] : mismatched input 'Aach' expecting {'Each', 'None', 'Any'}
    rule: conditionExpr line 2: 32 at [@13,33:33='e',<29>,2:32] : extraneous input 'e' expecting ')'

We see that the program prints the details of the syntax problem and stops execution.

2. Error handling in assembly and execution of semantic model

Unlike the relatively uniform error handling of syntax parsing, errors at the semantic level take more diverse forms and are scattered more widely: a semantic problem may show up at any parse rule. The low > high problem of windowsRange mentioned earlier is one example; another is a field specified in the result that cannot be found in the incoming data.

Both assembling and executing the semantic model are tree traversals. The traversal functions are recursive and the nesting can be very deep, so the traditional approach of passing an error back as a return value through every level is cumbersome. A more practical approach is to combine panic and recover: when a semantic problem is found at some point, panic immediately, capture the panic with recover at an upper layer, and return the information carried by the panic as an ordinary error. Let’s take the windowsRange problem as an example and see how errors are handled during the assembly and execution of the semantic model.
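Before looking at the tdat code, the pattern can be illustrated in isolation. The sketch below uses a toy tree and hypothetical names (node, walk, checkRange, safeWalk) that are not part of ANTLR or tdat; it only shows a panic raised deep inside a recursive walk being converted into an error at a single place.

```go
package main

import "fmt"

// node is a toy tree node standing in for a parse tree; it is not an ANTLR type.
type node struct {
	name      string
	low, high int
	children  []*node
}

// walk visits the tree depth-first and calls check on the way back up,
// similar in spirit to the Exit callbacks of a listener.
func walk(n *node, check func(*node)) {
	for _, c := range n.children {
		walk(c, check)
	}
	check(n)
}

// checkRange panics as soon as it meets an invalid node,
// however deep the recursion currently is.
func checkRange(n *node) {
	if n.high < n.low {
		panic(fmt.Sprintf("%s: low(%d) > high(%d)", n.name, n.low, n.high))
	}
}

// safeWalk is the single place where the panic is turned back into an error.
func safeWalk(root *node) (err error) {
	defer func() {
		if x := recover(); x != nil {
			err = fmt.Errorf("walk failed: %v", x)
		}
	}()
	walk(root, checkRange)
	return nil
}

func main() {
	bad := &node{name: "root", low: 1, high: 2, children: []*node{
		{name: "a", low: 1, high: 5, children: []*node{
			{name: "b", low: 3, high: 1},
		}},
	}}
	fmt.Println(safeWalk(bad)) // walk failed: b: low(3) > high(1)
}
```

Neither walk nor checkRange needs an error return value, which keeps the recursive code short; the cost is that callers must always enter through safeWalk.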

First, let’s modify the ExitWindowsWithLowAndHighIndex method of ReversePolishExprListener so that it panics when it finds low > high after parsing:

    // tdat/reverse_polish_expr_listener.go

    func (l *ReversePolishExprListener) ExitWindowsWithLowAndHighIndex(c *parser.WindowsWithLowAndHighIndexContext) {
        s := c.GetText()
        s = s[1 : len(s)-1] // remove two '|'

        t := strings.Split(s, ",")

        if t[0] == "" {
            l.low = 1
        } else {
            l.low, _ = strconv.Atoi(t[0])
        }

        if t[1] == "" {
            l.high = windowsRangeMax
        } else {
            l.high, _ = strconv.Atoi(t[1])
        }

        if l.high < l.low {
            panic(fmt.Sprintf("windowsRange: low(%d) > high(%d)", l.low, l.high))
        }
    }
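Note that the method above discards the error returned by strconv.Atoi. The grammar presumably only lets digits through, but a bound too large for an int would still go unreported. One possible way to tighten this, shown here only as a sketch, is to move the parsing into a small helper that returns an error. parseWindowsRange and its max parameter are hypothetical and not part of the tdat code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseWindowsRange is a hypothetical helper: it parses text such as "|1,3|",
// "|,3|" or "|1,|" and reports every problem as an error value.
// max is the default used when the upper bound is omitted.
func parseWindowsRange(text string, max int) (low, high int, err error) {
	if len(text) < 2 || text[0] != '|' || text[len(text)-1] != '|' {
		return 0, 0, fmt.Errorf("windowsRange: malformed %q", text)
	}
	parts := strings.Split(text[1:len(text)-1], ",")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("windowsRange: malformed %q", text)
	}

	low, high = 1, max // defaults for omitted bounds
	if parts[0] != "" {
		if low, err = strconv.Atoi(parts[0]); err != nil {
			return 0, 0, fmt.Errorf("windowsRange: bad low bound: %w", err)
		}
	}
	if parts[1] != "" {
		if high, err = strconv.Atoi(parts[1]); err != nil {
			return 0, 0, fmt.Errorf("windowsRange: bad high bound: %w", err)
		}
	}
	if high < low {
		return 0, 0, fmt.Errorf("windowsRange: low(%d) > high(%d)", low, high)
	}
	return low, high, nil
}

func main() {
	fmt.Println(parseWindowsRange("|1,3|", 100))
	fmt.Println(parseWindowsRange("|3,1|", 100))
}
```

The listener method could then call the helper and panic with the returned error, which keeps the range logic testable without building a parse tree.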

To avoid recovering from the panic directly in main, we take the original statement that walks the tree:

    antlr.ParseTreeWalkerDefault.Walk(l, tree)

and move it into a new function, extractReversePolishExpr. This function recovers from the panic and returns it to main as an ordinary error:

    // tdat/main.go

    func extractReversePolishExpr(listener antlr.ParseTreeListener, t antlr.Tree) (err error) {
        defer func() {
            if x := recover(); x != nil {
                err = fmt.Errorf("semantic tree assembly error: %v", x)
            }
        }()

        antlr.ParseTreeWalkerDefault.Walk(listener, t)
        return nil
    }

In the main function, we use extractReversePolishExpr like this:

    // tdat/main.go

    func main() {
        ... ...
        l := NewReversePolishExprListener()
        err = extractReversePolishExpr(l, tree)
        if err != nil {
            fmt.Printf("%s\n", err)
            return
        }
        ... ...
    }

When extractReversePolishExpr returns an error, something went wrong while extracting the reverse Polish expression, and we terminate the program.
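One caveat of a blanket recover is that it also swallows genuine programming bugs, such as a nil map write or an out-of-range index, and reports them as semantic errors. A refinement often used in Go code is to panic with a dedicated error type and re-panic on anything else. The sketch below illustrates that idea under the assumption that such a type is introduced; semanticError, fail and guard are hypothetical names that do not appear in the tdat code.

```go
package main

import (
	"errors"
	"fmt"
)

// semanticError marks panics that were raised on purpose (hypothetical type).
type semanticError struct{ msg string }

func (e *semanticError) Error() string { return e.msg }

// fail aborts the current traversal with a semantic error.
func fail(format string, args ...interface{}) {
	panic(&semanticError{msg: fmt.Sprintf(format, args...)})
}

// guard runs fn and converts only *semanticError panics into errors;
// any other panic value is re-raised so that real bugs stay visible.
func guard(fn func()) (err error) {
	defer func() {
		x := recover()
		if x == nil {
			return
		}
		se, ok := x.(*semanticError)
		if !ok {
			panic(x)
		}
		err = fmt.Errorf("semantic tree assembly error: %w", se)
	}()
	fn()
	return nil
}

func main() {
	err := guard(func() { fail("windowsRange: low(%d) > high(%d)", 3, 1) })
	fmt.Println(err)

	// The original value is still reachable through the wrapped error.
	var se *semanticError
	fmt.Println(errors.As(err, &se))
}
```

In the tdat code, the call to antlr.ParseTreeWalkerDefault.Walk would take the place of fn, and the listener would call a function like fail instead of panicking with a plain string.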

Next, we construct a sample with a semantic error, samples/sample6-windowrange-invalid.t, to see how the program above catches it:

    // samples/sample6-windowrange-invalid.t
    r0006: Each { |3,1| ($speed < 50) and (($temperature + 1) < 4) or ((roundDown($salinity) <= 600.0) or (roundUp($ph) > 8.0)) } => ();

Let’s run our new program:

    $./tdat samples/sample6-windowrange-invalid.t
    input file: samples/sample6-windowrange-invalid.t
    semantic tree assembly error: windowsRange: low(3) > high(1)

As we can see, the program successfully catches the expected semantic error.

Later, when the semantic model is executed, the Evaluate function of the semantic package likewise uses defer + recover to capture any panic raised while evaluating the expression tree, and returns it to the caller as an error. Even a semantic problem that slips through the assembly stage will be caught once it causes an error during execution.

Since the principle is the same, the error handling of the semantic model execution process will not be described here.

3. Summary

In this article, we added the error handling that was missing from our DSL design and implementation, and gave error handling approaches for two stages: syntax parsing, and the assembly and execution of the semantic model.

In Domain-Specific Languages, Martin Fowler writes: “Parsing and generating output is the easy part of writing a compiler, the real hard part is giving better error messages”. Error handling plays a very important role in a DSL-based processing engine, and a well-designed error handling scheme greatly benefits later problem diagnosis as well as the evolution and maintenance of the engine.

The code covered in this article can be downloaded here – https://ift.tt/ZbQMf7K.




© 2022, bigwhite . All rights reserved.

This article is reprinted from https://tonybai.com/2022/05/30/an-example-of-implement-dsl-using-antlr-and-go-part5/
This site is for inclusion only, and the copyright belongs to the original author.
