Last week I read the article How to Write a Lexer in Go , and found that, with the approach it describes, designing a configuration file parser is not so difficult. I then tried writing a Fluent Bit configuration parser, and the result is this Fluent-Bit configuration parser for Golang .
In this article, I want to show how to parse a Fluent Bit .conf configuration file. The thinking behind it carries over to other file formats as well.
Fluent-bit configuration format and schema
    [FIRST_SECTION]
        Key1 some value
        Key2 another value

    [SECOND_SECTION]
        KeyN 3.14
This is a classic mode Fluent Bit configuration. It has two key parts:
- Section
- Key/value pair
First of all, we need to define a struct which represents the Fluent-bit configuration file.
    type FluentBitConf struct {
        Sections []Section
    }

    type Section struct {
        Name    string
        Entries []Entry
    }

    type Entry struct {
        Key   string
        Value interface{}
    }
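To make the target concrete, here is a minimal sketch of the value a parser could be expected to produce for the sample configuration above. The helper name sampleConf is hypothetical, and storing every value as a raw string (including 3.14) is an assumption of this sketch, not something the original parser is known to do.

```go
package main

import "fmt"

// Same shape as the structs defined in the article.
type FluentBitConf struct {
	Sections []Section
}

type Section struct {
	Name    string
	Entries []Entry
}

type Entry struct {
	Key   string
	Value interface{}
}

// sampleConf (hypothetical helper) builds by hand the structure that
// corresponds to the sample configuration. Values are kept as raw strings,
// which is an assumption of this sketch.
func sampleConf() *FluentBitConf {
	return &FluentBitConf{
		Sections: []Section{
			{
				Name: "FIRST_SECTION",
				Entries: []Entry{
					{Key: "Key1", Value: "some value"},
					{Key: "Key2", Value: "another value"},
				},
			},
			{
				Name:    "SECOND_SECTION",
				Entries: []Entry{{Key: "KeyN", Value: "3.14"}},
			},
		},
	}
}

func main() {
	// Print the structure back in a layout close to the original file.
	for _, s := range sampleConf().Sections {
		fmt.Printf("[%s]\n", s.Name)
		for _, e := range s.Entries {
			fmt.Printf("    %s %v\n", e.Key, e.Value)
		}
	}
}
```

Because Value is an interface{}, a later step could convert "3.14" to a float64 without changing the struct.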
Once we have the structs, the next step is to parse tokens from the file and save their values into the Go structs. We can borrow the logic of a lexer to develop our own Fluent Bit parser.
In a lexer, the target characters we want to pick out are called tokens; they are the markers the parser searches for. The parser reads the characters of a file one by one, and whenever it finds a token, it saves the value between tokens into the final structure and moves on.
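As a rough illustration of "read characters one by one and save what lies between tokens", the sketch below splits an input at the [ and ] token characters. The names Kind, Piece and scan are made up for this example and do not come from the article's repository.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// Kind labels what a piece of the input means to the parser.
// These names are illustrative only.
type Kind int

const (
	KindOpen  Kind = iota // '[' starts a section name
	KindClose             // ']' ends a section name
	KindText              // a run of characters between token characters
)

// Piece is one token character, or one run of text between tokens.
type Piece struct {
	Kind Kind
	Text string
}

// scan reads the input rune by rune and splits it at '[' and ']'.
func scan(input string) ([]Piece, error) {
	reader := bufio.NewReader(strings.NewReader(input))
	var pieces []Piece
	var text strings.Builder

	// flush stores the text collected since the last token, if any.
	flush := func() {
		if text.Len() > 0 {
			pieces = append(pieces, Piece{Kind: KindText, Text: text.String()})
			text.Reset()
		}
	}

	for {
		r, _, err := reader.ReadRune()
		if err == io.EOF {
			flush()
			return pieces, nil
		}
		if err != nil {
			return nil, err
		}
		switch r {
		case '[':
			flush()
			pieces = append(pieces, Piece{Kind: KindOpen, Text: "["})
		case ']':
			flush()
			pieces = append(pieces, Piece{Kind: KindClose, Text: "]"})
		default:
			text.WriteRune(r)
		}
	}
}

func main() {
	pieces, err := scan("[INPUT]\n    Name cpu\n")
	if err != nil {
		panic(err)
	}
	for _, p := range pieces {
		fmt.Printf("%d %q\n", p.Kind, p.Text)
	}
}
```

This only separates token characters from the text around them; deciding what each run of text means (section name, key, or value) is the job of the parser state described next.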
Parse a single token
To parse a Section, the parser reads characters one by one and stops at the [ character, which marks the beginning of a Section. The parser records its current state as t_section and keeps reading until it reaches the ] character. The word between [ and ] is the Section name that we need to store in the Go struct.
    import (
        "bufio"
        "io"
        "unicode"
    )

    // Tags that tell the parser which state it is in.
    const (
        t_section = iota
        t_entry_key // used below; the key/value states are handled in the repo
    )

    // FluentBitConfParser is not shown in the original article; this shape is
    // assumed from how its fields are used below.
    type FluentBitConfParser struct {
        reader *bufio.Reader
        token  int
        Conf   *FluentBitConf
    }

    func (parser *FluentBitConfParser) Parse() *FluentBitConf {
        var currSection *Section
        for {
            // Read characters one by one.
            r, _, err := parser.reader.ReadRune()
            if err != nil {
                // At the end of the file, save the section in progress.
                if err == io.EOF && currSection != nil {
                    parser.Conf.Sections = append(parser.Conf.Sections, *currSection)
                }
                return parser.Conf
            }
            switch r {
            case '\n':
                continue
            case '[':
                // Save the previous section before starting a new one.
                if currSection != nil {
                    parser.Conf.Sections = append(parser.Conf.Sections, *currSection)
                }
                currSection = &Section{Name: "", Entries: []Entry{}}
                parser.token = t_section
            default:
                if unicode.IsSpace(r) {
                    continue
                }
                // The important step: read the characters that follow the
                // token character and save them into the struct.
                strValue, err := parser.parseString()
                if err != nil {
                    return parser.Conf
                }
                switch parser.token {
                case t_section:
                    // Guard against text that appears before the first '['.
                    if currSection != nil {
                        currSection.Name = strValue
                    }
                    parser.token = t_entry_key
                }
            }
        }
    }
In the function parser.parseString(), we read until the end of a value (for a section, that is the ] character) and then return the value.
    func (parser *FluentBitConfParser) parseString() (string, error) {
        val := ""
        // Parse has already consumed the first character of the value,
        // so step back one rune to include it.
        if err := parser.reader.UnreadRune(); err != nil {
            return "", err
        }
        for {
            r, _, err := parser.reader.ReadRune()
            if err != nil {
                if err == io.EOF {
                    return val, nil
                }
                return "", err
            }
            // A section name ends at ']'.
            if parser.token == t_section && r == ']' {
                return val, nil
            }
            val += string(r)
        }
    }
That is all the logic for parsing a section. Parsing a key/value pair follows the same process: the parser just needs to know which state it is in, and it saves the values delimited by whitespace or \n. You can see the code at the github repo .
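For readers who do not want to open the repository, here is one way the key/value states could look. This is a self-contained sketch written for this article, not the repository's code: the state names, the parse function, and the rule that a key ends at the first whitespace while a value ends at the newline are all assumptions. For brevity it walks the input with range instead of ReadRune, and it keeps values as raw strings.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

type Entry struct {
	Key   string
	Value string // kept as raw text in this sketch
}

type Section struct {
	Name    string
	Entries []Entry
}

// Parser states (hypothetical names): what the current characters belong to.
const (
	stIdle    = iota // between items, skipping whitespace
	stSection        // inside [ ... ]
	stKey            // reading a key, which ends at whitespace
	stValue          // reading a value, which ends at a newline
)

// parse walks the input one rune at a time and switches state at each token.
func parse(input string) ([]Section, error) {
	var sections []Section
	var buf strings.Builder
	var key string
	state := stIdle

	// flush stores the pending entry; a key with no value gets an empty value.
	flush := func() {
		if state == stKey {
			key = buf.String()
			buf.Reset()
		}
		cur := &sections[len(sections)-1]
		value := strings.TrimSpace(buf.String())
		cur.Entries = append(cur.Entries, Entry{Key: key, Value: value})
		buf.Reset()
		state = stIdle
	}

	for _, r := range input {
		switch state {
		case stIdle:
			switch {
			case r == '[':
				state = stSection
			case unicode.IsSpace(r):
				// skip whitespace between items
			case len(sections) == 0:
				return nil, fmt.Errorf("entry before any section, at %q", r)
			default:
				buf.WriteRune(r)
				state = stKey
			}
		case stSection:
			if r == ']' {
				sections = append(sections, Section{Name: buf.String()})
				buf.Reset()
				state = stIdle
			} else {
				buf.WriteRune(r)
			}
		case stKey:
			switch {
			case r == '\n':
				flush()
			case unicode.IsSpace(r):
				key = buf.String()
				buf.Reset()
				state = stValue
			default:
				buf.WriteRune(r)
			}
		case stValue:
			if r == '\n' {
				flush()
			} else {
				buf.WriteRune(r)
			}
		}
	}
	// End of input: store an entry that was not closed by a newline.
	if state == stKey || state == stValue {
		flush()
	}
	return sections, nil
}

func main() {
	sections, err := parse("[INPUT]\n    Name cpu\n")
	fmt.Println(sections, err)
}
```

An explicit idle state keeps the zero value from meaning "inside a section", so text that appears before the first [ is reported as an error. A section that is never closed with ] is silently dropped here; a stricter parser would probably report it.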
Conclusion
To parse a configuration file, we have to:
- Define the tokens (the key characters)
- Read characters and look for tokens
- Save the current state to tell the parser which struct the following characters belong to
This article is reprinted from: https://sund.site/posts/2022-5-8_lexer_design/