For the C3 language I’m working on I wanted to improve on C’s integers.
Recent languages have gravitated towards removing many implicit casts: Rust, Swift, Go, Zig and Odin all fall into this camp.
Studying Go in particular is illuminating: its designers recognized that removing implicit casts creates usability issues, and so made numeric literals BigInts, implicitly convertible to any sufficiently large integer type.
Swift, Zig and Odin all pick up this idea, but in slightly different ways.
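To see what this idea buys you, here is a small illustrative Go example of my own showing an untyped constant adapting to its destination type:

package main

import "fmt"

func main() {
    // The literals are untyped constants with arbitrary precision,
    // so they implicitly convert to any integer type they fit in.
    var a int64 = 1 << 40 // fine: fits in int64
    var b int16 = 123     // fine: fits in int16
    fmt.Println(a, b)

    // var c int8 = 1000 // compile time error: constant 1000 overflows int8
}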
Zig uses what it calls peer type resolution to describe how the conversion from “compile time integer” (the BigInt representation) occurs in various circumstances, such as in binary expressions. This is basically picking a common type that all sub expressions can coerce into. As an example, adding a variable of type i32 and the constant “123” will convert the constant from BigInt to i32: the common type here is i32, and that is the type the constant will be cast to. When adding an i32 and an i64, the common type is instead i64, and so on.
This peer type resolution breaks down in expressions like:
var y : i32 = if (x > 2) 1 else 2;
In this example 1 and 2 are both compile time integers (BigInts), but the expression is a runtime one and needs a definite integer type.
In Go, this situation is resolved by falling back on a default type: int, which is a 32-bit or 64-bit value depending on the platform.
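For example (a quick Go snippet of my own), with no other type information an untyped constant expression falls back to int:

package main

import "fmt"

func main() {
    x := 1 + 2 // both operands are untyped constants, so the result defaults to int
    fmt.Printf("%T %v\n", x, x) // prints "int 3"
}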
Zig doesn't have that, so other strategies must be employed. As of 0.6.0, Zig will parse the above, but will not accept either of the following.
if (x > 2) 1 else 2;
var y : i32 = 1 + if (x > 2) 1 else 2;
Peer type resolution might also at times create odd results. Consider the following Zig code:
var foo : i16 = 0x7FFF;
var bar : i32 = foo + 1;
// The above is equivalent to:
var foo : i16 = 0x7FFF;
var temp : i16 = foo + 1;
var bar : i32 = temp;
Because of peer type resolution this would overflow. One could imagine a behaviour more similar to what C would often give you, by resolving the add foo + 1 after casting both operands to the type of bar.
For Odin, Swift and Go this example is more obvious, because there are no implicit widening conversions. In Go, for example, we would have to write this:
var foo int16 = 0x7FFF
var bar int32 = int32(foo + 1)
In this case it's clear that in foo + 1 neither foo nor 1 is int32, so the fact that bar becomes -32768 is expected.
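The two orderings can be spelled out with explicit casts in Go. This illustrative snippet of mine shows the “add first” result next to the “convert first” result:

package main

import "fmt"

func main() {
    var foo int16 = 0x7FFF

    // Add first, then widen: the int16 addition wraps in Go,
    // mirroring the order implied by peer type resolution.
    addFirst := int32(foo + 1)

    // Widen first, then add: the addition happens as int32.
    widenFirst := int32(foo) + 1

    fmt.Println(addFirst, widenFirst) // prints "-32768 32768"
}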
It's important to emphasize that these are conventions; there is no strict right or wrong here. Rather, it's a trade-off between convenience and how prone the rules are to causing bugs.
C3 – like Zig – allows safe widening conversions, and so it needs to decide whether to follow the Zig behaviour or the "convert first" approach. To me the Zig behaviour is a bit counter-intuitive, and I'd prefer the widening to happen before the addition. Consider int a = b + c where b and c are signed chars: is it intuitive that this add should overflow when b and c both equal 64? I think it's better to avoid overflows where possible, and doing the widening first helps with that.
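In Go terms, with int8 standing in for C's signed char (again my own illustration, not code from any of these languages' documentation), the difference looks like this:

package main

import "fmt"

func main() {
    var b, c int8 = 64, 64
    fmt.Println(int32(b + c))        // add as int8 first: 64 + 64 wraps to -128
    fmt.Println(int32(b) + int32(c)) // widen first: 128, no overflow
}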
To achieve this we use bi-directional type checking. This works by "pushing down" the expected type and optionally casting to the expected type. As we saw in the Zig example, it does that for assignment, but retains peer type resolution when we nest deeper into expressions.
To illustrate this, here is some C3 code. It should look fairly familiar – note that, like in Java, C# and D, the sizes of the types are well defined: an int is always 32 bits:
short foo = 0x7FFF;
int bar = foo + 1;
// The above is equivalent to:
short foo = 0x7FFF;
int bar = cast(foo, int) + cast(1, int);
What happens during semantic analysis in C3 is this:
- Found declaration of bar.
- Analyse the init expression of the declaration, passing down the type int.
- The init expression is a binary add; analyse the left side, passing down the type int to the analysis.
- The left side is smaller than int, promote it to int.
- Analyse the right side, passing down the type int.
- The right side is a compile time integer; try to convert it to int (this will be a compile time error if the value does not fit in an int).
- The binary sub expressions are resolved. Find a common type for both sides, which in this case is int.
- Implicitly cast the binary expression to int if necessary (not necessary in this case).
- Check that the type of the binary expression matches int.
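To make the procedure concrete, here is a toy model of these steps in Go. This is my own hypothetical sketch, not the real C3 compiler: types are reduced to bit widths, width 0 stands in for the compile time integer type, and names like analyse and convertible are invented for illustration. The same analyse function is reused in the examples further down.

package main

import "fmt"

// Type is just a bit width in this toy model; 0 is the
// "compile time integer" pseudo-type used for literals.
type Type int

func (t Type) String() string {
    switch t {
    case 0:
        return "compile time int"
    case 16:
        return "short"
    case 32:
        return "int"
    case 64:
        return "long"
    }
    return fmt.Sprintf("i%d", int(t))
}

// Expr is either a leaf with a known type, or a binary add.
type Expr struct {
    Type        Type
    Left, Right *Expr // both non-nil for a binary add
}

// convertible reports whether `from` may implicitly become `to`:
// literals always may (assuming the value fits), and widening is allowed.
func convertible(from, to Type) bool { return from == 0 || from <= to }

// analyse pushes the expected type down (target == 0 means "no hint"),
// then finds a common type for binary sub expressions.
func analyse(e *Expr, target Type) (Type, error) {
    if e.Left == nil { // a leaf: a variable or a literal
        if target != 0 && convertible(e.Type, target) {
            e.Type = target // widen the variable or materialize the literal
        }
        return e.Type, nil
    }
    l, err := analyse(e.Left, target) // pass the hint down to the left side
    if err != nil {
        return 0, err
    }
    r, err := analyse(e.Right, target) // and to the right side
    if err != nil {
        return 0, err
    }
    common := max(l, r) // peer resolution: the wider side wins (Go 1.21+ built-in max)
    if common == 0 {
        return 0, fmt.Errorf("expression is under typed")
    }
    e.Type = common
    if target != 0 && !convertible(common, target) {
        return 0, fmt.Errorf("%v cannot be implicitly converted to %v", common, target)
    }
    return common, nil
}

func main() {
    // short foo = 0x7FFF; int bar = foo + 1;
    add := &Expr{Left: &Expr{Type: 16}, Right: &Expr{Type: 0}}
    fmt.Println(analyse(add, 32)) // prints "int <nil>"
}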
So let's have a look at a case when conversion isn't possible and a compile time error occurs.
long foo = 1;
int bar = foo + 2;
Semantic analysis in this case proceeds as follows:
- Found declaration of bar.
- Analyse the init expression of the declaration, passing down the type int.
- The init expression is a binary add; analyse the left side, passing down the type int to the analysis.
- The left side is long; it cannot be implicitly cast to int.
- Analyse the right side, passing down the type int.
- The right side is a compile time integer; try to convert it to int (this will be a compile time error if the value does not fit in an int).
- The binary sub expressions are resolved. Find the common type, which is long in this case.
- Since the right hand side is int, cast it to long.
- The resulting binary expression now has type long.
- Check that the type of the binary expression matches int. It doesn't, and it cannot be implicitly converted to int either, so a compile time error "long cannot be implicitly converted to int" is displayed.
Given the above examples it might seem like one could simply do away with any "peer type resolution".
However, there are cases where top down resolution fails. Here is an example:
short x = ...
if (x + 1 < 100) { ... }
In this case we don't have a type hint. This is what happens when resolving the comparison.
- Analyse the left hand side, with the type passed down as NULL.
- Analyse x + 1 with the type passed down as NULL.
- The left hand side is short and the right hand side is a compile time integer. The common type is short.
- The right hand side of the addition is implicitly cast to short.
- The left hand side of the comparison now has the type short.
- Analyse the right hand side of the comparison; this is a compile time integer.
- Find a common type between short and compile time integer, which is short.
- Implicitly cast the right hand side of the comparison to short.
- Analysis is complete.
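The earlier toy sketch reproduces this too: with no pushed-down type, the literal leaf keeps its compile time integer type and peer resolution picks short (again inside the sketch's main):

// x + 1 with no type hint: pass 0 ("no hint") as the target
cmp := &Expr{Left: &Expr{Type: 16}, Right: &Expr{Type: 0}}
t, _ := analyse(cmp, 0)
fmt.Println(t) // prints "short"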
That is not to say that these two methods are sufficient(!). Here is an example from gingerBill (author of Odin): ((x > 1 ? 3000 : 2 + 1000) == 2). The problem here is that all the values are compile time integers, so no type hint can be found anywhere.
This can either be considered a compile time error, because the expression is "under typed", or it can default to some integer type: the register sized integer (Go's strategy) or something similar. C3 currently picks the first option (a compile time error), both because of the difficulty of picking a good default that doesn't accidentally cause odd behaviour and because of how uncommon this case is.
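In the toy model this is the case where every leaf stays a compile time integer, so no common type can ever be materialized (again a hypothetical continuation of the sketch, not the real compiler):

// 2 + 1000 with no type hint anywhere near it
under := &Expr{Left: &Expr{Type: 0}, Right: &Expr{Type: 0}}
_, err2 := analyse(under, 0)
fmt.Println(err2) // prints "expression is under typed"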