By Artyom Kazak aka @availablegreen
December 25, 2020
𧱠This post is written in Brick β a platform for writing pretty public notes, which I co-founded. If you try it out, I'd love to hear feedback on Twitter or at artyom@brick.do.
Linear types are coming to GHC. For a brief moment in time, everyone will feel like they are Haskell beginners again.
This is a rare opportunity.
After reading this post, you will be able to answer some immediate questions about them, and explain some of the gotchas β thus becoming the go-to linear types person at your $dayjob (unless you work at Tweag).
Bringing linear types into your $dayjob codebase will make you popular. Everyone will ask for your help with the code you wrote. A nice confidence boost. I am being snarky but genuine here. Go for it.
After a while the novelty will fade off, and you will want to repeat the success. New GHC flags, new Cabal features, new language extensions. You will start reading GHC proposals in your spare time. Keeping up with all the latest gossip. "Good question, I'll ask @mboes about this", you will be saying nonchalantly.
In the long run you might even quit your job and open a Haskell consultancy, thus creating a direct competitor to my own Haskell consultancy. Which raises the question of why am I even writing this.
Yay!
Linear types, as implemented in yet-unreleased GHC 9.0, allow us to set constraints like "this function must use its argument exactly once", denoted by using %1 ->
instead of ->
.
id :: a %1 -> a
There are other things, but they are not implemented yet, so we'll skip them.
The 1
thing in the signature above is called a multiplicity. There might be other multiplicities later, and even user-defined multiplicities. Again, we'll skip them. Also, the %
thing has recently been generalized into something called "modifiers", except that %1
will not be a modifier, but other things will. This is exciting β they might become Java-like annotations at some point, and annotations/decorators are genuinely great.
If you have Nix installed, you can run the following command and enter a shell with the latest build of GHC HEAD. It should download about 200 MB, you won't have to build GHC from scratch, and it won't pollute your system.
nix run -f https://github.com/mpickering/ghc-artefact-nix/archive/master.tar.gz ghcHEAD
-XLinearTypes
?No.
The reason linear types are interesting is that it turns out they can be used to express two broad sets of invariants when designing an API. Specifically:
X
, you can choose to apply doSomething
or doSomethingElse
to it, but not both".do
β but one that you can, as the API designer, choose to enable only when it makes sense.So, by annotating certain functions in the API as linear, you make the API harder to misuse by enforcing a specific flow that the API user must adhere to.
Normal Haskell types place constraints on what functions are allowed to be composed. Linear types place constraints on what data flows you are allowed to build.
Well! For instance, we can encode a neat state machine for network sockets β preventing us from doing send
before the socket has been bound, or closing a socket twice.
Note: the same technique can be used for more complicated state machines. If one of the goals of Haskell is to make it possible to write tricky, error-prone code without thinking β and still get it right β then linear types definitely help this goal.
Let's introduce a socket type, tagged with its state on the type level:
data State = Unbound | Bound | Listening | Connected
data Socket (s :: State) = ...
data SocketAddress = ...
Wonderful. Now some socket operations β I am going to skip many of them and only do four:
import System.IO.Linear (IO)
create ::
IO (Socket Unbound)
connect ::
Socket Unbound %1 ->
SocketAddress ->
IO (Socket Connected)
send ::
Socket Connected %1 ->
ByteString ->
IO (Socket Connected, Ur Int) -- will explain 'Ur' later
close ::
forall s. Socket s ->
IO (Ur ())
As you can see, if you connect
to a socket, you
Unbound
socket,Connected
socket.So you can't initialize the same socket twice. You can't access the previous Unbound
socket anymore.
connect
uses its argument exactly once β this is true β but it doesn't imply that nobody else can use the same socket.Right.
That's why we are also going to introduce a different Monad
class that prohibits this. This won't replace the normal Monad
, and for a while won't even be in base
. Instead, we will have a package called linear-base
, defining something like this:
class Monad m where
(>>=) ::
m a %1 ->
(a %1 -> m b) %1 -> -- the callback must only use its argument once!
m b
With this definition, code like do s <- create; connect s addr; connect s addr
won't compile, because it desugars as follows:
create >>= \s ->
connect s addr >>= \_ -> -- 's' is used for the first time
connect s addr -- 's' is used for the second time!
The linear variant of IO
will not have a normal Monad
instance, thus ensuring that you have to use this alternative >>=
that gives you better guarantees.
That's what the Ur
is for. (It's short for Unrestricted
.)
If you get a value wrapped into Ur
, you can unwrap it and use the contained value as many times as you want.
do s <- create
connect s addr
(s, Ur k) <- send s bs
-- Here we can use 'k' (the number of sent bytes) as much as we like,
-- although we probably don't care one way or the other.
Ur
is defined as follows:
data Ur a where
Ur :: a -> Ur a
The reason it works is that the arrow in the constructor definition is not restricted, so the thing inside the Ur
is also not restricted and can be used several times.
There will also be a class Movable
β also in linear-base
β that lets us use primitive values like Bool
, Int
, etc, regardless of their linearity, going from a
to Ur a
. Like this:
class Dupable a => Movable a where
move :: a %1 -> Ur a
instance Movable Bool where
move True = Ur True
move False = Ur False
Ur
works. The explanation was semi-convincing, but I think it only works if you already know what it tries to say.Okay, let's try again.
The constructor Ur
has the type a -> Ur a
.
With linear types in GHC, pattern-matching will literally just look at the constructor type on the left-hand side, and use it to see how many times the constructor's fields can be used on the right-hand side.
->
arrow after it, the value can be used many times (or zero times). This is the case with Ur
.%1 ->
arrow after it, the value must be used exactly once.Now the funny thing is β by default, all constructors are actually going to get linear types, even without enabling -XLinearTypes
. This includes everything in base
.
data Maybe a = Nothing | Just a
-- In our brave new world:
-- Just :: a %1 -> Maybe a
So, in a pattern match like case m of Just x -> ...
, GHC will look at the type of Just
, and ensure that if m
was only allowed to be used once, then x
shall also only be allowed to be used once. If m
didn't have any linearity constraints, though, then nothing changes.
Now, in order to override this, you have to use the GADT syntax β where you can specify the arrows explicitly and decide which fields should be linear and which shouldn't. This is what we've done with Ur
.
It won't β the linear types proposal authors already checked on a large codebase β but only thanks to a huge hack.
Linear types don't have subtyping β you can't pass a %1 ->
function if a ->
function is expected. So, if a function takes a parameter of type (a -> Maybe a)
, you can no longer pass Just
to it.
To work around this, when a constructor is used as a term, GHC will change all linear arrows in its type into %p ->
, which stands for a multiplicity-polymorphic arrow β both ->
and %1 ->
are acceptable. And then everything will be fine. Other usages of the constructor will remain linear.
Thanks to this and other reasons, linear types will be a desperately needed fresh source of little-known facts and gotchas, thus ensuring that there will be demand for new Haskell consultancies for years to come.
Ur
, and then using it more than once?No. If you try, GHC will catch you.
f :: a %1 -> Ur a
f a = Ur a
-- error:
-- β’ Couldn't match type β'Manyβ with β'Oneβ
-- arising from multiplicity of βaβ
-- β’ In an equation for βfβ: f a = Ur a
Specifically, these are the conditions under which an argument x
has been used linearly:
- Returning
x
unmodified- Passing
x
to a linear function- Pattern-matching on
x
and using each argument exactly once in the same fashion.- Calling it as a function and using the result exactly once in the same fashion.
Since Ur
is specifically not a linear function, you can't do this.
Monad
β will GHC also patch do
on the fly to use the modified Monad
version? Or not? Or what will happen?Other than constructor types, GHC will not patch anything else on the fly. Nothing in base
will change, either. Even the types of functions like ($)
and (.)
won't change, so using linear types with base
will be painful. Hence linear-base
.
As I mentioned, there will be two different Monad
classes. You will have to specify that you want the linear Monad
version, which you can do with the new -XQualifiedDo
extension. (The extension is usable for many things, but linear types was one of the goals.)
import qualified Control.Monad.Linear as Linear
import qualified System.IO.Linear as Linear
main :: Linear.IO ()
main = Linear.do
...
There will also be two Functor
classes, for functors-as-containers (non-linear) and functors-as-control-structures (linear). See A Tale of Two Functors for an explanation.
Unfortunately, -XQualifiedDo
does not replace return
with Linear.return
, so you'll still have to remember about writing that. Or just use linear types everywhere in the module and enable -XRebindableSyntax
.
Look at the socket example again. If you are allowed to ignore the socket, you can easily forget to close it.
Relaxing the constraint to "at most once" gives you affine types. The difference is discussed in the Affine types instead of linear types section of the linear types proposal. The motivating example is described in Making Two Garbage Collectors Be Good Neighbours β the idea is that if you have a scarce resource that should really be freed as soon as it's not necessary anymore, something like resourcet
becomes not enough.
seq
. I will just seq
a socket and then not close it.If seq
had the type a %1 -> b %1 -> b
, you'd be able to write this:
forget :: a %1 -> ()
forget a = a `seq` ()
However, you actually can't, because seq
has a different type β see the definition in linear-base:
-- | @seq x y@ only forces @x@ to head normal form, therefore is not guaranteed
-- to consume @x@ when the resulting computation is consumed. Therefore, @seq@
-- cannot be linear in its first argument.
seq :: a -> b %1 -> b
seq x = Unsafe.toLinear (Prelude.seq x)
You will be able to pretend you've used a value β see Consumable
β but only for types where it would have already been possible with case
anyway.
First of all β the full form of the linear arrow guarantee is "the argument is consumed exactly once if the result is consumed exactly once". If you throw an exception, the result won't be consumed, so it's fine not to consume the argument.
Admittedly, this gets complicated in the presence of catch
:
(do s <- create
s <- connect addr s
error "haha"
close s -- "using" s
)
`catch` \_ -> putStrLn "fooled you"
Technically you have used s
, but in reality you haven't β and yet the code will execute successfully. The API guarantee has been broken.
My understanding is that linear types do not solve this problem. You'd have to use a different monad instead of Linear.IO
β let's call it RIO
β that makes sure all resource-allocating actions are paired with finalizers, and runs those finalizers during catch
.
See the section Non-termination, exceptions & catch of the linear types proposal for more details on this.
Note that linear types do prevent you from forgetting to close the socket β you still have to write close
no matter what. But you still need to do resource tracking if you have exceptions in sight.
Safe resource handling is already a nice enough problem to tackle β and it pops up in many other places, e.g. see Safe Memory Management In inline-java
Using Linear Types.
However, it turns out linear types can be used for more things.
One example is Compact Normal Forms + Linear Types = Efficient Network Communication. I don't understand it well, but it seems like a very tricky piece of code that represents data in a packed, pointer-less way, can be written with better compile-time guarantees if we are allowed to use linear types.
Another example is fusion. With linear types, we can construct arrays in pure code without allocating a new array during each step. You can do this with vector
at the moment β but it relies on GHC rewrite rules and you can't statically ensure that it doesn't do any unnecessary allocations. With linear types, we get better guarantees. Look at push and pull arrays in linear-base
to see how those are implemented and what the signatures are going to be.
Streaming libraries should become safer. I suppose this is related to Sharing, Space Leaks, and Conduit and friends, but I am not sure.
It also looks like some impure APIs can become pure with the help of linear types. From the linear types paper:
Basically, the pure interface takes an update function
u : Scene β Scene
which is tasked with returning the next state that the screen will display. The scene is first converted to a pure tree where each node keeps, along with the pure data, a pointer to its imperative counterpart when it applies, orNothing
for new nodes.data Node = Node { payload :: Int , ref :: Maybe (IORef ImperativeNode) , children :: [Node] }
On each frame, SpriteKit applies
u
to the current scene, and checks if a noden
was updated. If it was, it applies the update directly ontoref n
or creates a new imperative node.Things can go wrong though: if the update function duplicates any proxy node, one gets the situation where two nodes
n
andn'
can point to the same imperative sourceref n
=ref n'
, but have different payloads. In this situation theScene
has become inconsistent and the behaviour of SpriteKit is unpredictable. In the api of Chakravarty and Keller [2017], the burden of checking non-duplication is on the programmer. Using linear types, we can switch that burden to the compiler [...]
Whether this is a good idea is a different question, but since people are already writing unsafe pure APIs, at least those can become safe pure APIs now.
Linear types seemed very dubious when they first appeared, but now I feel like they might be a good thing after all.
The cost is not that high β I think it's less than 2000 changed lines in GHC overall! β and it will let people experiment with marrying more weird type-level features with production code, which is what I not-so-secretly think should be Haskell's only goal. All other goals can be relegated to Boring Haskell.
Merry Christmas!
linear-base
-XLinearTypes
section in GHC documentationAt the moment, does GHC compile linearly-typed code more efficiently/differently than nonlinear code?
I think not. See Does this proposal improve performance? from the linear types wiki.
I am interested in whether this might change eventually. Comment or tweet at me if you know more.