Generics are simple: write your own ToJSON


Abstract

Our goal is to replicate the magical machinery that lets us write instance ToJSON Foo and it Just Works.

After reading this post, generics will become your secret weapon.

A warning

Generics are not easy. They are simple to understand, but not easy to use.

Introduction

Truth is a cake (and the cake is a lie). The first layer of this cake is: you can't just look at a record and iterate through its fields. People come from Python or JS and ask “okay, how do I do this”. Nope. You can't.

The second layer of the truth cake: you actually can, with generics. Generics provide a uniform representation for data types. If you can work with that representation, you can work with any data type. Cf. reflection in Java.

Different generic libraries have different uniform representations. The GHC.Generics representation is the most common, but generics-eot is much simpler, so we'll use that one.

⚠️A nice thing about GHC.Generics is that it balances the representation: fields and constructors are nested as a balanced tree rather than a linear chain, so reaching an arbitrary field of a big record takes fewer steps. For many use cases it doesn't matter, though.
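
For comparison, here is a rough sketch of what "get the field names" looks like against the GHC.Generics representation directly, using only base. The class GSelNames, the helper selNames, and the field name orderId are made up for this illustration (orderId is used instead of id so the file does not fight with Prelude's id), and the sketch only covers single-constructor types. It is meant to be loaded in GHCi.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.Generics

-- Illustration-only copy of the record (field renamed to orderId).
data Order = Order { orderId :: String, merchant :: String }
  deriving (Show, Generic)

-- Hypothetical class: collect selector names from a GHC.Generics Rep.
class GSelNames (f :: Type -> Type) where
  gSelNames :: Proxy f -> [String]

-- Datatype wrapper: nothing to collect here, look inside.
instance GSelNames f => GSelNames (D1 c f) where
  gSelNames _ = gSelNames (Proxy :: Proxy f)

-- Constructor wrapper: same, look inside.
instance GSelNames f => GSelNames (C1 c f) where
  gSelNames _ = gSelNames (Proxy :: Proxy f)

-- Product of fields: collect names from both halves.
instance (GSelNames f, GSelNames g) => GSelNames (f :*: g) where
  gSelNames _ = gSelNames (Proxy :: Proxy f) ++ gSelNames (Proxy :: Proxy g)

-- A single field: ask its metadata for the selector name.
instance Selector s => GSelNames (S1 s f) where
  gSelNames _ = [selName (undefined :: S1 s f ())]

-- A constructor without fields.
instance GSelNames U1 where
  gSelNames _ = []

selNames :: forall a. GSelNames (Rep a) => Proxy a -> [String]
selNames _ = gSelNames (Proxy :: Proxy (Rep a))
```

Five instances just to list two names. With generics-eot the same information comes out of a plain value, which is the main reason to prefer it here.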

The generics-eot representation

Generics-eot represents everything with Eithers and nested tuples. Let's use a small two-field record as an example, data Order:

> :set -XDeriveGeneric
> import GHC.Generics

> data Order = Order { id :: String, merchant :: String } deriving (Show, Generic)

Eot is a type family that gives an Eithers-of-tuples (ha) representation for any type with a Generic instance. So, Eot Order gives us a representation of Order.

The patient, eitherized upon a table, looks thus:

> import Generics.Eot

> :kind! Eot Order
Eot Order :: *
= Either ([Char], ([Char], ())) Void

⚠️If you get Generics.Eot.Eot.EotG followed by a lot of gibberish, just repeat the command and you'll get the nice Either representation. Seems to be a bug in GHCi.

Note that every constructor becomes one branch of the Either chain, and every field becomes the first element of a nested tuple.

  • Nested tuples are terminated with (). Here we have two fields, and hence (String, (String, ())).
  • Nested Eithers are terminated with Void, a type with no values. The most important bit of intuition here is that Either a Void is equivalent to a. If we had two constructors, the representation would look like Either ... (Either ... Void).
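
To make the two-constructor case concrete, here is a small sketch. Payment is a hypothetical type invented for this example, and stripVoid / addVoid are illustration-only helpers that spell out why Either a Void is equivalent to a. The type signatures on cashRep and cardRep only typecheck if Eot Payment reduces to the shape written there, so loading the file in GHCi doubles as a check.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Void (Void, absurd)
import GHC.Generics (Generic)
import Generics.Eot (toEot)

-- Hypothetical two-constructor type, used only for illustration.
data Payment
  = Cash
  | Card String Int
  deriving (Show, Generic)

-- First constructor: the outermost Left, with an empty field tuple.
cashRep :: Either () (Either (String, (Int, ())) Void)
cashRep = toEot Cash

-- Second constructor: one Right to skip Cash, then a Left with the fields.
cardRep :: Either () (Either (String, (Int, ())) Void)
cardRep = toEot (Card "4242" 12)

-- Either a Void is equivalent to a: the Right branch has no values,
-- so it can be dismissed with absurd.
stripVoid :: Either a Void -> a
stripVoid (Left a) = a
stripVoid (Right v) = absurd v

-- The other direction of the equivalence.
addVoid :: a -> Either a Void
addVoid = Left
```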

To convert an Order into its representation, we will use toEot.

> toEot (Order "GD-837" "Apparel Etc")
Left ("GD-837",("Apparel Etc",()))

To go back, we will use fromEot. We also have to specify what type we want to get — you can't guess the type from the uniform representation alone (duh).

> fromEot (Left ("GD-837",("Apparel Etc",()))) :: Order
Order {id = "GD-837", merchant = "Apparel Etc"}
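
Since the representation is just Eithers and tuples, you can also edit it with ordinary pattern matching and convert back. A minimal sketch, assuming the same two-String record (with the field renamed to the hypothetical orderId so the file does not clash with Prelude's id); swapFields is a made-up helper, not part of generics-eot:

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Void (absurd)
import GHC.Generics (Generic)
import Generics.Eot (fromEot, toEot)

-- Illustration-only copy of the record (field renamed to orderId).
data Order = Order { orderId :: String, merchant :: String }
  deriving (Show, Eq, Generic)

-- Go to the representation, swap the two fields, come back.
-- The result type Order tells fromEot which type to rebuild.
swapFields :: Order -> Order
swapFields order = case toEot order of
  Left (a, (b, ())) -> fromEot (Left (b, (a, ())))
  Right v -> absurd v -- Void branch, cannot happen
```

This only typechecks because both fields are Strings. Swap a String with an Int and the representation no longer matches Eot Order.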

Finally, with datatype we can get the names of fields and constructors.

> :set -XTypeApplications

> datatype (Proxy @Order)
Datatype
  { datatypeName = "Order",
    constructors =
      [ Constructor
          { constructorName = "Order",
            fields = Selectors ["id", "merchant"]
          }
      ]
  }
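
The metadata is an ordinary value, so plain functions are enough to dig through it. Here is a sketch of a helper that lists the field names of every constructor; fieldNames and the orderId field are names invented for this example, while datatype, constructors, fields and the three Fields constructors are the ones from Generics.Eot shown above.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Proxy (Proxy (..))
import GHC.Generics (Generic)
import qualified Generics.Eot as Eot

-- Illustration-only copy of the record (field renamed to orderId).
data Order = Order { orderId :: String, merchant :: String }
  deriving (Show, Generic)

-- One list of names per constructor. Constructors with unnamed
-- fields (or no fields at all) contribute an empty list.
fieldNames :: Eot.HasEot a => Proxy a -> [[String]]
fieldNames proxy =
  map (namesOf . Eot.fields) (Eot.constructors (Eot.datatype proxy))
  where
    namesOf (Eot.Selectors names) = names
    namesOf (Eot.NoSelectors _) = []
    namesOf Eot.NoFields = []
```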

Now we have everything we need. The time has come to murder and create.

ToJSON and the type class menagerie

Representations of different types are themselves different types, so we must create a type class to handle them. Several type classes, even.

😕Type classes are heavy. It would be great if I did not have to create an ad-hoc type class every time I want to match on a type. Are there any relevant GHC proposals?

The outer layer is the GToJSON class. Its only method, gToJSON, takes type metadata and the Eot representation, and gives us a Value.

{-# LANGUAGE MonoLocalBinds #-}      -- oh god
{-# LANGUAGE FlexibleInstances #-}   -- please no
{-# LANGUAGE FlexibleContexts #-}    -- why so many
{-# LANGUAGE RankNTypes #-}          -- bring back -fglasgow-exts
{-# LANGUAGE TypeApplications #-}    -- huh
{-# LANGUAGE ScopedTypeVariables #-} -- Glasgow is a funny word, innit

import Data.Proxy
import Data.Void
import qualified Data.Text as Text
import qualified Data.Aeson as Aeson
import qualified Generics.Eot as Eot

-- This uses gToJSON
genericToJSON 
  :: forall a. (Eot.HasEot a, GToJSON (Eot.Eot a)) 
  => a -> Aeson.Value
genericToJSON val = gToJSON (Eot.datatype (Proxy @a)) (Eot.toEot val)

-- Okay here we go
class GToJSON eot where
  gToJSON :: Eot.Datatype -> eot -> Aeson.Value

I will do a simple version that assumes there is only one constructor. All types with one constructor are represented as Either ... Void, so that's what our instance has to match on.

-- The fabled "simple version"
instance GFieldsToJSON tup => GToJSON (Either tup Void) where
  gToJSON typeInfo (Left tup) =
    let -- Assuming one constructor, what are its fields?
        [Eot.Constructor _name fields] = Eot.constructors typeInfo
        
        -- 'gFieldsToJSON' extracts values from the nested tuple.
        -- I will show it in the next snippet of code.
        values :: [Aeson.Value]
        values = gFieldsToJSON tup

     in case fields of
          -- No fields at all - eh, just return a unit
          Eot.NoFields -> Aeson.toJSON ()
          
          -- Unnamed fields - alright, return an array
          Eot.NoSelectors _ -> Aeson.toJSON values
          
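          -- Named fields - create an object.
          -- (With aeson >= 2.0 object keys are 'Key', not 'Text';
          -- there, use Data.Aeson.Key.fromString instead of Text.pack.)
          Eot.Selectors names -> Aeson.object $
            zipWith (Aeson..=) (map Text.pack names) values

  -- The Right branch is Void, so it can never actually occur.
  gToJSON _ (Right v) = absurd v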

We need to introduce another type class to get fields from the tuple. So, gFieldsToJSON has to give us a Value for each element of the tuple.

class GFieldsToJSON tup where
  gFieldsToJSON :: tup -> [Aeson.Value]
  
-- Base case
instance GFieldsToJSON () where
  gFieldsToJSON () = []
  
-- Recursive case
instance (Aeson.ToJSON a, GFieldsToJSON b) => GFieldsToJSON (a, b) where
  gFieldsToJSON (a, b) = Aeson.toJSON a : gFieldsToJSON b

Okay. Does genericToJSON work? It does.

> Aeson.encode $ genericToJSON (Order "GD-837" "Apparel Etc")
"{\"merchant\":\"Apparel Etc\",\"id\":\"GD-837\"}"

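If you want to drop the one-constructor assumption, one possible approach (a sketch, not how Aeson does it) is to walk the Either chain and the list of constructors in lockstep. GSumToJSON, encodeConstructor, sumToJSON and the Payment type are all hypothetical names for this example, and the output format (an object keyed by constructor name) is an arbitrary choice. Objects are built through Data.Map so the code does not care whether aeson uses Text or Key for object keys. The field-walking class is repeated so the block stands alone.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import qualified Data.Aeson as Aeson
import qualified Data.Map as Map
import Data.Proxy (Proxy (..))
import Data.Void (Void, absurd)
import GHC.Generics (Generic)
import qualified Generics.Eot as Eot

-- Same field-walking class as in the post.
class GFieldsToJSON tup where
  gFieldsToJSON :: tup -> [Aeson.Value]

instance GFieldsToJSON () where
  gFieldsToJSON () = []

instance (Aeson.ToJSON a, GFieldsToJSON b) => GFieldsToJSON (a, b) where
  gFieldsToJSON (a, b) = Aeson.toJSON a : gFieldsToJSON b

-- Hypothetical class: walk the Either chain and the constructor
-- metadata together, one step per constructor.
class GSumToJSON eot where
  gSumToJSON :: [Eot.Constructor] -> eot -> Aeson.Value

-- End of the chain: no values live here.
instance GSumToJSON Void where
  gSumToJSON _ v = absurd v

instance (GFieldsToJSON tup, GSumToJSON rest) => GSumToJSON (Either tup rest) where
  -- Left: this is our constructor, its metadata is the list head.
  gSumToJSON (c : _) (Left tup) = encodeConstructor c (gFieldsToJSON tup)
  -- Right: skip one constructor in both the chain and the metadata.
  gSumToJSON (_ : cs) (Right rest) = gSumToJSON cs rest
  gSumToJSON [] _ = error "gSumToJSON: metadata ran out of constructors"

-- Arbitrary format choice: { "ConstructorName": payload }.
encodeConstructor :: Eot.Constructor -> [Aeson.Value] -> Aeson.Value
encodeConstructor (Eot.Constructor name fields) values =
  Aeson.toJSON (Map.singleton name payload)
  where
    payload = case fields of
      Eot.NoFields -> Aeson.toJSON ()
      Eot.NoSelectors _ -> Aeson.toJSON values
      Eot.Selectors names -> Aeson.toJSON (Map.fromList (zip names values))

sumToJSON
  :: forall a. (Eot.HasEot a, GSumToJSON (Eot.Eot a))
  => a -> Aeson.Value
sumToJSON val =
  gSumToJSON (Eot.constructors (Eot.datatype (Proxy @a))) (Eot.toEot val)

-- Hypothetical sum type to try it on.
data Payment = Cash | Card { number :: String, cvc :: Int }
  deriving (Show, Generic)
```

The Void instance plays the same role as () in GFieldsToJSON: it is the base case that ends the recursion, except that it never actually runs.
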
Default method implementations

“But what happens when you write an empty ToJSON instance?”

You have to look at the source. ToJSON has a default method implementation that requires Generic (see -XDefaultSignatures):

class ToJSON a where
  toJSON :: a -> Value
  default toJSON :: ...
  toJSON = genericToJSON defaultOptions

So genericToJSON (Aeson's version, not ours) is used automatically when the instance is empty.
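
The same trick works for your own classes. Below is a minimal sketch with generics-eot instead of GHC.Generics; the Describe class, genericDescribe, and the orderId field are hypothetical names made up for this example. The default signature adds a HasEot constraint that only applies when an instance leaves the method out.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Data.Proxy (Proxy (..))
import GHC.Generics (Generic)
import qualified Generics.Eot as Eot

-- Hypothetical class with a generic fallback.
class Describe a where
  describe :: a -> String
  -- Used only when an instance does not define 'describe' itself.
  default describe :: Eot.HasEot a => a -> String
  describe _ = genericDescribe (Proxy @a)

-- The generic implementation: type name plus constructor names.
genericDescribe :: Eot.HasEot a => Proxy a -> String
genericDescribe proxy =
  Eot.datatypeName info
    ++ " with constructors "
    ++ show (map Eot.constructorName (Eot.constructors info))
  where
    info = Eot.datatype proxy

-- Illustration-only copy of the record (field renamed to orderId).
data Order = Order { orderId :: String, merchant :: String }
  deriving (Show, Generic)

-- Empty instance: the default implementation kicks in.
instance Describe Order

-- Hand-written instance: the default is ignored.
instance Describe Bool where
  describe b = if b then "yes" else "no"
```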

FromJSON?

Oh god, writing ToJSON was boring enough. There will be no FromJSON.

The promise of creation, left unfulfilled. But at least we did some murder.

Alternatives to generics-eot

generics-sop is a library similar to generics-eot, but it provides combinators you can use instead of explicit recursion. I don't want to learn a new set of combinators, but if you do — with generics-sop your code will likely be shorter.

There is also simplistic-generics, which provides a nicer interface to GHC.Generics. Haven't tried it yet, but might be good.

Another option is one-liner.

Conclusion

You're now an intermediate-level Haskeller. This alone was enough. Huh.

Congrats!