Why Type Systems Matter
I’ve written most of my code in dynamically typed languages such as Python or PHP. But ever since dabbling with Rust, I’ve developed a passion for static type systems.
It began to feel very natural to me, like a totally new way to express myself.
Types are here to help
With types, you communicate your guarantees and expectations, both to the machine and to other developers. Types express intent.
As a programmer, you’ve probably gained some intuition about types.
sentence = "hello world"
You might guess that sentence is a string. It’s in quotes, after all. It gets a little trickier if the type is inferred from some other location.
sentence = x
Is sentence still a string? Uhm… we don’t know. It depends on the type of x. Maybe x is a number, and so sentence is also a number? Maybe x used to be a string, but after a refactoring it is now a byte array? Fun times had by all. 🎉
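To make this concrete, here is a minimal Python sketch. The function name describe and the sample values are made up for illustration; the only point is that the runtime type of sentence follows whatever x happens to be.

```python
def describe(x):
    # `sentence` simply takes on whatever type `x` has at runtime.
    sentence = x
    return type(sentence).__name__


# The same assignment yields three different types,
# depending on what the caller passes in.
print(describe("hello"))   # str
print(describe(42))        # int
print(describe(b"hello"))  # bytes
```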
What about this one?
filesize = "5000"  # Size in bytes
Here, we express a file size as a string.
While this might work, it’s an unsettling idea.
Even simple calculations might lead to unexpected results:
file1 = "5000"
file2 = "3000"
total = file1 + file2
print(total)  # prints 50003000, not 8000
How can we fix that?
We can safely assume that a file size is always a number. To be more precise, it must be a non-negative whole number. There can be no negative file size, and our smallest block of memory is one byte (on all but the most obscure systems). And since we’re dealing with a discrete machine here, we know it can only be a file size the computer can handle. If only we could express all of this in a precise way…
This is where type systems enter the stage.
In Rust, you could define a File type with a field named size of type usize. A usize is guaranteed to be big enough to hold any pointer into memory (on 64-bit computers, usize is as wide as u64). Now there is no more ambiguity about the type of size. You can’t even create an invalid file object:
// Error: `size` can't be a string.
let weird_file = File { size: "5000" };
The type system will prevent invalid state. It will simply not allow you to break your own rules. It will hold you accountable for your design choices. Dare I say it: it becomes an extension of your brain. After some time you start to rely on the type checker. “If it compiles, it runs” is a powerful mantra.
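Python has no direct equivalent, because annotations are not enforced at runtime. As a rough approximation, the following sketch checks the same rules by hand when the object is created. The class layout, the helper total_size, and the error messages are assumptions for illustration, not a translation of any particular Rust code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    size: int  # size in bytes

    def __post_init__(self):
        # Annotations alone are not checked at runtime, so validate by hand.
        # bool is a subclass of int, so it is excluded explicitly.
        if not isinstance(self.size, int) or isinstance(self.size, bool):
            raise TypeError("size must be an integer number of bytes")
        if self.size < 0:
            raise ValueError("size cannot be negative")


def total_size(files):
    # With integer sizes, addition means addition, not concatenation.
    return sum(f.size for f in files)


print(total_size([File(5000), File(3000)]))  # 8000
```

Unlike the Rust version, these checks only fire when the line actually runs. A static checker such as mypy can flag File("5000") earlier, but only if it is part of the workflow.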
Types improve readability and provide context
Consider the following Python snippet:
status = get_status(file)  # get_status and file are placeholder names
return status == 0
What does 0 represent? We can’t say. We lack the context!
The story gets a little clearer once we define an enum type like this:
from enum import Enum

class FileStatus(Enum):
    OPEN = 0
    CLOSED = 1
Our example from above becomes
status = get_status(file)
return status == FileStatus.OPEN
In a larger codebase, FileStatus.OPEN is much easier to search for than 0.
Note: The native enum type was introduced very late in the history of Python (in version 3.4). It serves as a nice example of how enhancing the type system can help improve readability.
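Here is a small runnable sketch of what the enum buys us beyond searchability. The member CLOSED and the helper is_open are illustrative names chosen for this example.

```python
from enum import Enum


class FileStatus(Enum):
    OPEN = 0
    CLOSED = 1


def is_open(status):
    # Reads as a sentence; no need to remember what 0 stands for.
    return status is FileStatus.OPEN


# A raw value can be converted back into a named member...
print(FileStatus(0))       # FileStatus.OPEN
print(FileStatus(0).name)  # OPEN

# ...and values outside the enum are rejected instead of silently accepted.
try:
    FileStatus(2)
except ValueError:
    print("2 is not a known status")
```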
When you combine different types, magic happens.
All pieces suddenly fall into place when you choose your types wisely. Out of nowhere, the compiler will start checking your design decisions and whether all your types work well together. It will point out flaws in your mental model. This gives you a great amount of confidence during refactoring.
For example, let’s think about sorting things. When I think of sorting, I first think about a list of numbers:
sorted([1, 5, 4, 3, 2])  # [1, 2, 3, 4, 5]
That’s the happy path. How about this one?
sorted(1)
Ouch. This can’t work because 1 is a single number and not a collection! If we forget to check the type before we pass it to sorted, we get an error while the program runs.
What about a list that mixes types, such as sorted([1, 'fish'])? In Python 2, this would result in [1, 'fish']. (Edit: not because strings are compared by length, as I first wrote. Reddit user jcdyer3 pointed out that when incomparable types are compared, they sort by their type, so all ints will come before all strings. It’s a CPython implementation detail.)
Since Python 3, this throws an exception:
TypeError: '<' not supported between instances of 'str' and 'int'
Much better! One less source of error. The problem, though, is that this happens at runtime. That’s because of Python’s dynamic typing. We could have avoided that with a statically typed language.
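The runtime nature of these errors is easy to demonstrate in Python 3. In this sketch, try_sort is a hypothetical helper that reports the exception type instead of crashing:

```python
def try_sort(value):
    # Return the sorted list, or the name of the exception that was raised.
    try:
        return sorted(value)
    except TypeError as err:
        return type(err).__name__


print(try_sort([3, 1, 2]))    # [1, 2, 3]
print(try_sort(1))            # TypeError (an int is not iterable)
print(try_sort([1, "fish"]))  # TypeError (int and str cannot be ordered)
```

Nothing complains until the offending call actually executes, which may be long after the code was written.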
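In Rust, a sorting function could be declared along these lines:
fn sorted<T: PartialOrd>(collection: &mut [T])
Looks scary but it really isn’t.
We define a function named sorted which takes one input parameter named collection.
The type of collection consists of four parts: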
- The & means that we “borrow” the collection; we don’t own it. After the function returns, it will still exist. It won’t be cleaned up.
- The mut means that the collection is mutable. We are allowed to modify it.
- [T] indicates that we expect a list/slice/vector as input. Everything else will be rejected at compile time (before the program even runs).
- PartialOrd is the magic sauce. It is a trait, which is something like an interface. It means that all elements T in the collection must be partially ordered.
All of this information helps the compiler to prevent us from shooting ourselves in the foot. And we can understand the inputs and outputs of the function without looking elsewhere.
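For comparison, a similar contract can be sketched with Python type hints. This is an approximation under stated assumptions: the names SupportsLessThan and sort_in_place are invented here, the hints are only checked if a static checker such as mypy is run, and Python has no counterpart to Rust’s borrowing.

```python
from typing import Any, MutableSequence, Protocol, TypeVar


class SupportsLessThan(Protocol):
    # Roughly plays the role of PartialOrd: elements must support `<`.
    def __lt__(self, other: Any) -> bool: ...


T = TypeVar("T", bound=SupportsLessThan)


def sort_in_place(collection: MutableSequence[T]) -> None:
    # Simple insertion sort that mutates the sequence it was given.
    for i in range(1, len(collection)):
        item = collection[i]
        j = i - 1
        while j >= 0 and item < collection[j]:
            collection[j + 1] = collection[j]
            j -= 1
        collection[j + 1] = item


numbers = [1, 5, 4, 3, 2]
sort_in_place(numbers)
print(numbers)  # [1, 2, 3, 4, 5]
```

The signature alone tells a reader that the argument is modified in place and that its elements must be comparable, which is the same information the Rust signature carries.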
Takeaways
- Types force developers to do their homework and think about the guarantees and limitations of their code.
- Don’t think of types as constraints; think of them as a safety net which will protect you from your own flawed mental models.
- Always choose the type which most precisely expresses your intentions.
- If there is no perfect type in the standard library, create your own from simpler types.
Following these rules, I found that I was magically guided towards the most elegant representation of my ideas. My code became much more idiomatic.