Welcome to the Tsonnet series!
If you're just joining, you can check out how it all started in the first post of the series.
In the previous post, we fixed error handling during lexical analysis:
Tsonnet #8 - Graceful error handling for string parsing
Hercules Lemke Merscher ・ Feb 11
Now, let's continue building our interpreter by adding support for identifiers.
What is an identifier?
An identifier is a sequence of characters that serves as a name for various programming constructs, such as variables, functions, classes, or modules.
When we added object literals to Tsonnet, we used strings as attribute names. However, it's more common to write attribute names as identifiers. Let's modify our object literal sample to use identifiers for some attributes:
diff --git a/samples/literals/object.jsonnet b/samples/literals/object.jsonnet
index 043f131..cb4c52a 100644
--- a/samples/literals/object.jsonnet
+++ b/samples/literals/object.jsonnet
@@ -2,7 +2,7 @@
"int_attr": 1,
"float_attr": 4.2,
"string_attr": "Hello, world!",
- "null_attr": null,
- "array_attr": [1, false, {}],
- "obj_attr": { "a": true, "b": false, "c": { "d": [42] } }
+ null_attr: null,
+ array_attr: [1, false, {}],
+ obj_attr: { "a": true, "b": false, "c": { "d": [42] } }
}
Running this code currently breaks our parser:
$ dune exec -- tsonnet samples/literals/object.jsonnet
Fatal error: exception Tsonnet__Parser.MenhirBasics.Error
Let's fix this by implementing identifier support.
Adding identifiers
First, we need to add a new constructor, Ident, to the expr type in our AST:
diff --git a/lib/ast.ml b/lib/ast.ml
index 55ddd52..d34a152 100644
--- a/lib/ast.ml
+++ b/lib/ast.ml
@@ -13,6 +13,7 @@ type expr =
| Null
| Bool of bool
| String of string
+ | Ident of string
| Array of expr list
| Object of (string * expr) list
| BinOp of bin_op * expr * expr
Next, we need to define the lexical rules for identifiers. An identifier can start with an underscore or a letter, followed by any number of alphanumeric characters or underscores:
diff --git a/lib/lexer.mll b/lib/lexer.mll
index bbf3b66..bebf28a 100644
--- a/lib/lexer.mll
+++ b/lib/lexer.mll
@@ -14,6 +14,8 @@ let exp = ['e' 'E']['-' '+']? digit+
let float = '-'? digit* frac? exp?
let null = "null"
let bool = "true" | "false"
+let letter = ['a'-'z' 'A'-'Z']
+let id = (letter | '_') (letter | digit | '_')*
rule read =
parse
@@ -34,6 +36,7 @@ rule read =
| '-' { SUBTRACT }
| '*' { MULTIPLY }
| '/' { DIVIDE }
+ | id { ID (Lexing.lexeme lexbuf) }
| _ { raise (SyntaxError ("Unexpected char: " ^ Lexing.lexeme lexbuf)) }
| eof { EOF }
and read_string buf =
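The lexer matches the longest run of characters that fits the id pattern and wraps the matched text in an ID token. ocamllex picks the longest match and, on a tie, the rule listed first. So keywords such as null and true still win over id as long as their rules appear earlier in read, while a longer word such as null_attr lexes as a single ID. The parser needs a few more changes to handle these new tokens: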
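To make the id pattern concrete, here is a minimal sketch in plain OCaml that checks a whole string against the same shape, (letter | '_') (letter | digit | '_')*. The is_identifier helper and its companions are hypothetical and are not part of Tsonnet; the real matching is done by the code ocamllex generates.

```ocaml
(* Hypothetical helpers, not part of Tsonnet: they mirror the [letter],
   [digit] and [id] definitions from lexer.mll for illustration only. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* True when [s] is non-empty, starts with a letter or underscore, and
   continues with letters, digits or underscores only. *)
let is_identifier (s : string) : bool =
  let n = String.length s in
  let is_rest c = is_letter c || is_digit c || c = '_' in
  let rec rest_ok i = i >= n || (is_rest s.[i] && rest_ok (i + 1)) in
  n > 0 && (is_letter s.[0] || s.[0] = '_') && rest_ok 1
```

Under this sketch, names like null_attr and _x1 are accepted, while 1abc (leading digit) and a-b (hyphen) are rejected.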
diff --git a/lib/parser.mly b/lib/parser.mly
index 2b6db25..a224ea3 100644
--- a/lib/parser.mly
+++ b/lib/parser.mly
@@ -16,6 +16,7 @@
%token ADD SUBTRACT MULTIPLY DIVIDE
%left ADD SUBTRACT
%left MULTIPLY DIVIDE
+%token <string> ID
%token EOF
%start <Ast.expr> prog
@@ -32,6 +33,7 @@ expr:
| NULL { Null }
| b = BOOL { Bool b }
| s = STRING { String s }
+ | id = ID { Ident id }
| LEFT_SQR_BRACKET; values = list_fields; RIGHT_SQR_BRACKET { Array values }
| LEFT_CURLY_BRACKET; attrs = obj_fields; RIGHT_CURLY_BRACKET { Object attrs }
| e1 = expr; ADD; e2 = expr { BinOp (Add, e1, e2) }
@@ -44,7 +46,9 @@ list_fields:
vl = separated_list(COMMA, expr) { vl };
obj_field:
- k = STRING; COLON; v = expr { (k, v) };
+ | k = STRING; COLON; v = expr { (k, v) }
+ | k = ID; COLON; v = expr { (k, v) }
+ ;
obj_fields:
obj = separated_list(COMMA, obj_field) { obj };
We declare the ID token with a string payload, which holds the identifier's text. The new expr rule simply wraps that text in Ident. Finally, we update obj_field to accept both string and identifier keys; either way, the key ends up as a plain string in the resulting pair.
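As a rough illustration of what the updated obj_field rule accepts, the sketch below hand-writes the same decision over a list of tokens. The token and expr types are cut-down stand-ins, and parse_value and parse_obj_field are hypothetical names; in Tsonnet this work is done by the parser Menhir generates from parser.mly.

```ocaml
(* Cut-down stand-ins for the real token and AST types (assumption). *)
type token = STRING of string | ID of string | COLON | NULL
type expr = Null | String of string | Ident of string

(* Value position: a tiny subset of the real [expr] rule. *)
let parse_value = function
  | NULL :: rest -> Some (Null, rest)
  | STRING s :: rest -> Some (String s, rest)
  | ID id :: rest -> Some (Ident id, rest)
  | _ -> None

(* Key position: a STRING or an ID, then COLON, then a value.
   Both key forms produce the same (string * expr) pair. *)
let parse_obj_field = function
  | (STRING k | ID k) :: COLON :: rest ->
    (match parse_value rest with
     | Some (v, rest') -> Some ((k, v), rest')
     | None -> None)
  | _ -> None
```

With this sketch, the token lists for "null_attr": null and null_attr: null both yield the pair ("null_attr", Null), which is why the rest of the pipeline does not need to know how the key was written.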
The last step is to update our Tsonnet.interpret and Json.expr_to_yojson functions to handle the new Ident constructor of Ast.expr:
diff --git a/lib/json.ml b/lib/json.ml
index 9b3596f..26d930c 100644
--- a/lib/json.ml
+++ b/lib/json.ml
@@ -9,6 +9,7 @@ let rec expr_to_yojson : expr -> (Yojson.t, string) result = function
| Null -> ok `Null
| Bool b -> ok (`Bool b)
| String s -> ok (`String s)
+ | Ident id -> ok (`String id)
| Array values ->
let expr_to_list expr' = to_list (expr_to_yojson expr') in
let results = values |> List.map expr_to_list |> List.concat in
diff --git a/lib/tsonnet.ml b/lib/tsonnet.ml
index 0e525e2..ae6eb91 100644
--- a/lib/tsonnet.ml
+++ b/lib/tsonnet.ml
@@ -32,7 +32,7 @@ let interpret_bin_op (op: bin_op) (n1: number) (n2: number) : expr =
(** [interpret expr] interprets and reduce the intermediate AST [expr] into a result AST. *)
let rec interpret (e: expr) : (expr, string) result =
match e with
- | Null | Bool _ | String _ | Number _ | Array _ | Object _ -> ok e
+ | Null | Bool _ | String _ | Number _ | Array _ | Object _ | Ident _ -> ok e
| BinOp (Add, String a, String b) -> ok (String (a^b))
| BinOp (op, e1, e2) ->
let* e1' = interpret e1 in
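The effect of the Json change can be sketched without Yojson. The example below uses a reduced expr type and a local json type, both assumptions made so the snippet is self-contained; expr_to_json is a hypothetical stand-in for Json.expr_to_yojson and skips its result-based error handling.

```ocaml
(* Reduced copy of the AST, for illustration only. *)
type expr =
  | Null
  | Bool of bool
  | String of string
  | Ident of string
  | Object of (string * expr) list

(* Local stand-in for the JSON value type (assumption, not Yojson.t). *)
type json =
  [ `Null
  | `Bool of bool
  | `String of string
  | `Assoc of (string * json) list ]

let rec expr_to_json : expr -> json = function
  | Null -> `Null
  | Bool b -> `Bool b
  | String s -> `String s
  (* Same choice as the diff above: an identifier is emitted as its name. *)
  | Ident id -> `String id
  | Object attrs -> `Assoc (List.map (fun (k, v) -> (k, expr_to_json v)) attrs)
```

One consequence of this choice is that an identifier in value position is serialized as a string holding its name, since nothing resolves it to a value at this stage.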
Conclusion
With these changes, Tsonnet accepts identifiers both as object keys and as standalone expressions. Identifiers are the groundwork for constructs that rely on names, such as variables and functions, and upcoming posts will build on this foundation.
Stay tuned for the next post in the series!