Chapter 6: Types in Detail
Where you will learn the truth behind types.
We've talked about Lux types already, but only in a very high-level way.
On this chapter, you'll see how types are constructed, and hopefully that will give you some insight to understand better the subjects of later chapters.
(type: #rec Type
(#HostT Text (List Type))
#VoidT
#UnitT
(#SumT Type Type)
(#ProdT Type Type)
(#LambdaT Type Type)
(#BoundT Nat)
(#VarT Nat)
(#ExT Nat)
(#UnivQ (List Type) Type)
(#ExQ (List Type) Type)
(#AppT Type Type)
(#NamedT Ident Type))
This is the type of types.
Crazy, right?
But as I've said before, Lux types are values like any other.
Type
is a variant type, which just means that there are multiple options for type values.
Also, you may have noticed that
#rec
tag in the definition. You need to add it whenever you're defining a recursive type that takes no parameters.So, the definition of
List
doesn't need it, but the definition ofType
does.
Let's go over each of them.
(#HostT Text (List Type))
This is what connects Lux's type-system with the host platform's.
These types represent classes (in the JVM), with their respective parameters, if they have them (as would be the case for ArrayList<User>
in the JVM).
#VoidT
#UnitT
You've already met unit, but what is that "void" thingie?
Well, remember when I told you that unit was the empty tuple? You can think of void as the empty variant. But more on that in a bit.
(#SumT Type Type)
(#ProdT Type Type)
You may have noticed that none of those options are called #TupleT
or #VariantT
.
The reason is that variants and tuples are just names for mathematical constructs called "sums" and "products".
Funny names, right?
Well, mathematicians see variants as a way of "adding" types and tuples as a way of "multiplying" types, Of course, it's a bit difficult to picture that if you're thinking of numbers.
But a way to see variants is as an "OR" operation for types: you get this option OR that option. Conversely, tuples are like an "AND" operation for types: you get this type AND that type.
But, you may be wondering: "why do #SumT
and #ProdT
only take 2 types, instead of a list like #HostT
does?"
Well, as it turns out, you don't need a list of types to implement variants and tuples, because you can actually chain #SumT
and #ProdT
with other instances of themselves to get the same effect.
What do I mean?
Well, let me show you. To the left, you'll see the type as it's written in normal Lux code, and to the right you'll see the type value it generates.
[] => #UnitT
[Bool] => Bool
[Bool Int] => (#ProdT Bool Int)
[Bool Int Real] => (#ProdT Bool (#ProdT Int Real))
[Bool Int Real Char] => (#ProdT Bool (#ProdT Int (#ProdT Real Char)))
(|) => #VoidT
(| Bool) => Bool
(| Bool Int) => (#SumT Bool Int)
(| Bool Int Real) => (#SumT Bool (#SumT Int Real))
(| Bool Int Real Char) => (#SumT Bool (#SumT Int (#SumT Real Char)))
You can see where this is going.
If I have a way to to pair up 2 types, and I can nest that, then I can chain things as much as I want to get the desired length.
What is a tuple/variant of 1 type? It's just the type itself; no pairing required.
And what happens when the tuple/variant has 0 types?
That's when #UnitT
and #VoidT
come into play, being the empty tuple and the empty variant respectively.
You might observe that you've been taught how to create unit values (with
[]
), but not how to create void values.The reason is that it's technically impossible to make instances of void.
Think of it this way: there is only 1 possible instance of unit, which is
[]
. That means every[]
in Lux is actually the same value (kind of like every 5 is the same 5 everywhere). For void, there are 0 possible instances (hence, it's name).It's odd that there exists a type without instances; but, just like unit, it comes in handy in certain circumstances.
This embedding means that [true 123 456.789 #"X"]
is the same as [true [123 456.789 #"X"]]
, and the same as [true [123 [456.789 #"X"]]]
.
It also means 5
is the same as [5]
, and [[5]]
, and [[[[[5]]]]]
.
As far as the compiler is concerned, there are no differences.
That might sound crazy, but there are some really cool benefits to all of this. If you're curious about that, you can check out Appendix E for more information on how Lux handles this sort of stuff.
(#LambdaT Type Type)
Now that we have discussed variant and tuple types, it shouldn't come as a surprise that a similar trick can be done with function types.
You see, if you can implement functions of 1 argument, you can implement functions of N arguments, where N >= 1.
All I need to do is to embed the rest of the function as the return value to the outer function.
It might sound like this whole business of embedding tuples, variants and functions inside one another must be super inefficient; but trust me: Lux has taken care of that.
The Lux compiler features many optimizations that compile things down in a way that gives you maximum efficiency. So, to a large extent, these embedded encodings are there for the semantics of the language, but not as something that you'll pay for at run-time.
One of the cool benefits of this approach to functions is Lux's capacity to have partially applied functions.
Yep, that's a direct consequence of this theoretical model.
(#BoundT Nat)
This type is there mostly for keeping track of type-parameters in universal and existential quantification.
We'll talk about those later.
But, suffice it to say that #BoundT
helps them do their magic.
(#VarT Nat)
These are type variables.
They are used during type-inference by the compiler, but they're also part of what makes universal quantification (with the All
macro) able to adjust itself to the types you use it with.
Type-variables start unbound (which means they're not associated with any type), but once they have been successfully matched with another type, they become bound to it, and every time you use them afterwards it's as if you're working with the original type.
Type-variables, however, can't be "re-bound" once they have been set, to avoid inconsistencies during type-checking.
(#ExT Nat)
An existential type is an interesting concept (which is related, but not the same as existential quantification).
You can see it as a type that exists, but is unknown to you. It's like receiving a type in a box you can't open.
What can you do with it, then? You can compare it to other types, and the comparison will only succeed if it is matched against itself.
It may sound like a useless thing, but it can power some advanced techniques.
(#UnivQ (List Type) Type)
This is what the All
macro generates.
This is universal quantification.
That (List Type)
you see there is meant to be the "context" of the universal quantification.
It's kind of like the environment of a function closure, only with types.
The other Type
there is the body of the universal quantification.
To understand better what's going on, let's transform the type of our iterate-list
function from Chapter 5 into its type value.
(All [a b] (-> (-> a b) (List a) (List b)))
=>
(#UnivQ #;Nil (#UnivQ #;Nil (-> (-> (#BoundT +3) (#BoundT +1)) (List (#BoundT +3)) (List (#BoundT +1))))
I didn't transform the type entirely to avoid unnecessary verbosity.
As you can see, I do the same embedding trick to have universal quantification with multiple parameters.
Also, a
and b
are just nice syntactic labels that get transformed into #BoundT
instances.
The reason the type-parameters have those IDs is due to a technique called De Bruijn Indices. You can read more about it here: https://en.wikipedia.org/wiki/De_Bruijn_index.
(#ExQ (List Type) Type)
Existential quantification works pretty much the same way as universal quantification.
It's associated macro is Ex
.
Whereas universal quantification works with type-variables, existential quantification works with existential types.
(#AppT Type Type)
This is the opposite of quantification.
#AppT
is what you use to parameterize your quantified types; to customize them as you need.
With #AppT
, (List Int)
transforms into (#AppT List Int)
.
For multi-parameter types, like Dict
(from lux/data/dict
), (Dict Text User)
would become (#AppT (#AppT Dict Text) User)
.
As you can see, the nesting is slightly different than how it is for tuples, variant and functions.
(#NamedT Ident Type)
#NamedT
is less of a necessity and more of a convenience.
The type-system would work just fine without it, but users of the language probably wouldn't appreciate it while reading documentation or error messages.
#NamedT
is what gives the name "List" to the List
type, so you can actually read about it everywhere without getting bogged down in implementation details.
You see, Lux's type system is structural in nature, rather than nominal (the dominating style in programming languages).
That means all that matters is how a type is built; not what you call it.
That implies 2 types with different names, but the exact same value, would actually type-check in your code.
That may sound odd (if you come from Java or other languages with nominal types), but it's actually very convenient and enables you to do some pretty nifty tricks.
For more information on that, head over to Appendix E.
#NamedT
gives Lux's type-system a bit of a nominal feel for the convenience of programmers.
Regarding Error Messages
When you get error messages from the type-checker during your coding sessions, types will show up in intuitive ways most of the time, with a few exceptions you might want to know.
Existential types show up in error messages like ⟨e:246⟩
(where 246 is the ID of the type).
Whereas type-variables show up like ⌈v:278⌋
.
Those types tend to show up when there are errors in the definition of some polymorphic function.
You may be tired of reading about types, considering that they are (to a large degree) an implementation detail of the language.
However, one of the key features of Lux is that types can be accessed and manipulated by programmers (often in macros) to implement various powerful features.
In the next chapter, you'll get acquainted with one such feature.
See you in the next chapter!