Hercules Lemke Merscher

Posted on Feb 12 • Originally published at bitmaybewise.substack.com

Tsonnet #9 - ID please

#tsonnet #jsonnet #compiler

Welcome to the Tsonnet series!

If you're just joining, you can check out how it all started in the first post of the series.

In the previous post, we fixed error handling during lexical analysis:

Tsonnet #8 - Graceful error handling for string parsing


Now, let's continue building our interpreter by adding support for identifiers.

What is an identifier?

An identifier is a sequence of characters that serves as a name for various programming constructs, such as variables, functions, classes, or modules.

When we added object literals to Tsonnet, we used strings as object attributes. However, it's more common to use identifiers for attribute names. Let's modify our object literal sample to use identifiers for some attributes:

diff --git a/samples/literals/object.jsonnet b/samples/literals/object.jsonnet
index 043f131..cb4c52a 100644
--- a/samples/literals/object.jsonnet
+++ b/samples/literals/object.jsonnet
@@ -2,7 +2,7 @@
     "int_attr": 1,
     "float_attr": 4.2,
     "string_attr": "Hello, world!",
-    "null_attr": null,
-    "array_attr": [1, false, {}],
-    "obj_attr": { "a": true, "b": false, "c": { "d": [42] } }
+    null_attr: null,
+    array_attr: [1, false, {}],
+    obj_attr: { "a": true, "b": false, "c": { "d": [42] } }
 }

Running this code currently breaks our parser:

$ dune exec -- tsonnet samples/literals/object.jsonnet
Fatal error: exception Tsonnet__Parser.MenhirBasics.Error

Let's fix this by implementing identifier support.

Adding identifiers

First, we need to add a new variant, Ident, to the expr type in our AST:

diff --git a/lib/ast.ml b/lib/ast.ml
index 55ddd52..d34a152 100644
--- a/lib/ast.ml
+++ b/lib/ast.ml
@@ -13,6 +13,7 @@ type expr =
   | Null
   | Bool of bool
   | String of string
+  | Ident of string
   | Array of expr list
   | Object of (string * expr) list
   | BinOp of bin_op * expr * expr

Next, we need to define the lexical rules for identifiers. An identifier can start with an underscore or a letter, followed by any number of alphanumeric characters or underscores:

diff --git a/lib/lexer.mll b/lib/lexer.mll
index bbf3b66..bebf28a 100644
--- a/lib/lexer.mll
+++ b/lib/lexer.mll
@@ -14,6 +14,8 @@ let exp = ['e' 'E']['-' '+']? digit+
 let float = '-'? digit* frac? exp?
 let null = "null"
 let bool = "true" | "false"
+let letter = ['a'-'z' 'A'-'Z']
+let id = (letter | '_') (letter | digit | '_')*

 rule read =
   parse
@@ -34,6 +36,7 @@ rule read =
   | '-' { SUBTRACT }
   | '*' { MULTIPLY }
   | '/' { DIVIDE }
+  | id { ID (Lexing.lexeme lexbuf) }
   | _ { raise (SyntaxError ("Unexpected char: " ^ Lexing.lexeme lexbuf)) }
   | eof { EOF }
 and read_string buf =
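
To make the id pattern concrete, here is a minimal hand-written sketch of the same rule in plain OCaml. The is_identifier helper is hypothetical and only for illustration; Tsonnet itself relies on the ocamllex pattern above.

```ocaml
(* Hand-written equivalent of the lexer pattern
     let letter = ['a'-'z' 'A'-'Z']
     let id = (letter | '_') (letter | digit | '_')*
   [is_identifier] is a hypothetical helper, not part of Tsonnet. *)

let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* The first character must be a letter or an underscore. *)
let is_start c = is_letter c || c = '_'

(* Every following character may also be a digit. *)
let is_rest c = is_start c || is_digit c

(* True when the whole string [s] fits the id pattern. *)
let is_identifier (s : string) : bool =
  let n = String.length s in
  let rec rest_ok i = i >= n || (is_rest s.[i] && rest_ok (i + 1)) in
  n > 0 && is_start s.[0] && rest_ok 1

(* Print the verdict for a few candidate names. *)
let () =
  List.iter
    (fun s -> Printf.printf "%-12s %b\n" s (is_identifier s))
    [ "null_attr"; "_private"; "attr2"; "2attr"; "obj-attr"; "" ]
```

Keywords such as null fit this pattern too. When two ocamllex rules match the same longest lexeme, the rule listed first wins, so this sketch assumes the keyword rules appear before the id rule in lexer.mll.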

When the input matches the id pattern, the lexer emits an ID token carrying the matched text. The parser needs a few more changes to handle these new tokens:

diff --git a/lib/parser.mly b/lib/parser.mly
index 2b6db25..a224ea3 100644
--- a/lib/parser.mly
+++ b/lib/parser.mly
@@ -16,6 +16,7 @@
 %token ADD SUBTRACT MULTIPLY DIVIDE
 %left ADD SUBTRACT
 %left MULTIPLY DIVIDE
+%token <string> ID
 %token EOF

 %start <Ast.expr> prog
@@ -32,6 +33,7 @@ expr:
   | NULL { Null }
   | b = BOOL { Bool b }
   | s = STRING { String s }
+  | id = ID { Ident id }
   | LEFT_SQR_BRACKET; values = list_fields; RIGHT_SQR_BRACKET { Array values }
   | LEFT_CURLY_BRACKET; attrs = obj_fields; RIGHT_CURLY_BRACKET { Object attrs }
   | e1 = expr; ADD; e2 = expr { BinOp (Add, e1, e2) }
@@ -44,7 +46,9 @@ list_fields:
   vl = separated_list(COMMA, expr) { vl };

 obj_field:
-  k = STRING; COLON; v = expr { (k, v) };
+  | k = STRING; COLON; v = expr { (k, v) }
+  | k = ID; COLON; v = expr { (k, v) }
+  ;

 obj_fields:
     obj = separated_list(COMMA, obj_field) { obj };

We declare the ID token, which carries a string payload. The new expr rule turns an ID token into an Ident node. Finally, we update obj_field to accept both string and identifier keys; either way, the key ends up as a plain string in the Object node.
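
As a rough illustration of what the two obj_field alternatives produce, here is a small sketch in plain OCaml. The expr type is a cut-down stand-in for Ast.expr, and key_token and the obj_field function are hypothetical names that only model the grammar rule; the real work is done by the code Menhir generates.

```ocaml
(* Cut-down stand-in for Ast.expr; only what this sketch needs. *)
type expr =
  | Null
  | Bool of bool
  | Ident of string
  | Object of (string * expr) list

(* Hypothetical type modelling the two tokens accepted in key position. *)
type key_token =
  | STRING of string
  | ID of string

(* Mirrors the two obj_field alternatives: either token yields a plain
   string key paired with the value expression. *)
let obj_field (k : key_token) (v : expr) : string * expr =
  match k with
  | STRING s -> (s, v)
  | ID id -> (id, v)

(* { "null_attr": null } and { null_attr: null } build the same AST. *)
let quoted = Object [ obj_field (STRING "null_attr") Null ]
let bare = Object [ obj_field (ID "null_attr") Null ]
let () = assert (quoted = bare)

(* In { a: b, "c": true } the key a becomes a string, while the value b
   goes through the expr rule and becomes an Ident node. *)
let mixed =
  Object [ obj_field (ID "a") (Ident "b"); obj_field (STRING "c") (Bool true) ]
```

Since both alternatives return the same (string * expr) pair, the rest of the pipeline does not need to know whether a key was quoted in the source.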

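The last step is to update our Tsonnet.interpret and Json.expr_to_yojson functions to handle the new Ast.expr variant. For now, interpret returns an identifier unchanged, and expr_to_yojson renders it as a JSON string: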

diff --git a/lib/json.ml b/lib/json.ml
index 9b3596f..26d930c 100644
--- a/lib/json.ml
+++ b/lib/json.ml
@@ -9,6 +9,7 @@ let rec expr_to_yojson : expr -> (Yojson.t, string) result = function
   | Null -> ok `Null
   | Bool b -> ok (`Bool b)
   | String s -> ok (`String s)
+  | Ident id -> ok (`String id)
   | Array values ->
     let expr_to_list expr' = to_list (expr_to_yojson expr') in
     let results = values |> List.map expr_to_list |> List.concat in
diff --git a/lib/tsonnet.ml b/lib/tsonnet.ml
index 0e525e2..ae6eb91 100644
--- a/lib/tsonnet.ml
+++ b/lib/tsonnet.ml
@@ -32,7 +32,7 @@ let interpret_bin_op (op: bin_op) (n1: number) (n2: number) : expr =
 (** [interpret expr] interprets and reduce the intermediate AST [expr] into a result AST. *)
 let rec interpret (e: expr) : (expr, string) result =
   match e with
-  | Null | Bool _ | String _ | Number _ | Array _ | Object _ -> ok e
+  | Null | Bool _ | String _ | Number _ | Array _ | Object _ | Ident _ -> ok e
   | BinOp (Add, String a, String b) -> ok (String (a^b))
   | BinOp (op, e1, e2) ->
     let* e1' = interpret e1 in
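
To see the effect of these two changes end to end, here is a simplified sketch that does not depend on Yojson. The expr type is a cut-down stand-in for Ast.expr, and to_json and run are hypothetical helpers that build the JSON text directly instead of going through Json.expr_to_yojson.

```ocaml
(* Cut-down stand-in for Ast.expr. *)
type expr =
  | Null
  | Bool of bool
  | String of string
  | Ident of string

(* Like the updated Tsonnet.interpret: literals and identifiers are
   already values, so they are returned unchanged. *)
let interpret (e : expr) : (expr, string) result =
  match e with
  | Null | Bool _ | String _ | Ident _ -> Ok e

(* Wrap [s] in double quotes. Escaping is omitted for brevity, so this is
   only adequate for plain names such as identifiers. *)
let quote (s : string) : string = "\"" ^ s ^ "\""

(* Hypothetical stand-in for Json.expr_to_yojson: an identifier is
   rendered exactly like a string. *)
let to_json (e : expr) : string =
  match e with
  | Null -> "null"
  | Bool b -> string_of_bool b
  | String s | Ident s -> quote s

(* Interpret, then render the result. *)
let run (e : expr) : (string, string) result =
  Result.map to_json (interpret e)

let () =
  match run (Ident "foo") with
  | Ok json -> print_endline json
  | Error msg -> prerr_endline msg
```

Under these assumptions, a bare identifier and a string with the same text produce identical JSON.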

Conclusion

With these changes, Tsonnet accepts identifiers as object attribute names and parses them as expressions. Identifiers pave the way for more advanced language constructs, and in upcoming posts we'll build on this foundation to add more features.

Stay tuned for the next post in the series!

Thanks for reading Bit Maybe Wise! Subscribe and join me in building Tsonnet, one feature at a time. No compiler theory degree required -- just curiosity and a love for coding!
