Flexible Data with Aeson

At a certain point, our Haskell programs have to be compatible with other programs running on the web. This is especially useful given the growing usage of micro-services as an architecture. Regardless, it's very common to be transferring data between applications on different stacks. You’ll need some kind of format that allows you to transfer your data and read it on both ends.

There are many different ways of doing this. But much of the current eco-system of web programming depends on the JSON format. JSON stands for “JavaScript Object Notation”. It is a way of encoding data that is compatible with Javascript programs. But it is also a useful serialization system that any language can parse. In this article, we’ll explore how to use this format for our data in Haskell.

JSON 101

JSON encodes data in a few different types. There are four basic types: strings, numbers, booleans, and null values. These all functions in a predictable way. Each line here is a valid JSON value.

“Hello”

4.5

3

true

false

null

JSON then offers two different ways of combining objects. The first is an array. Arrays contain multiple values, and represent them with a bracket delimited list. Unlike Haskell lists, you can put multiple types of objects in a single array. You can even put arrays in your arrays.

[1, 2, 3]

[“Hello”, false, 5.5]

[[1,2], [3,4]]

Coming from Haskell, this kind of structure seems a little sketchy. After all, the type of whatever is in your array is not clear. But you can think of multi-type arrays as tuples, rather than lists, and it makes a bit more sense. The final and most important way to make an object in JSON is through the object format. Objects are delimited by braces. They are like arrays in that they can contain any number of values. However, each value is also assigned to a string name as a key-value pair.

{
  “Name” : “Name”,
  “Age” : 23,
  “Grades” : [“A”, “B”, “C”]
}

These objects are infinitely nest-able, so you can have arrays of objects of arrays and so on.

Encoding a Haskell Type

So that’s all nice, but how do we actually use this format to transfer our Haskell data? We’ll let’s start with a dummy example. We’ll make a Haskell data type and show a couple possible JSON interpretations of that data. First, our type and some sample values:

data Person = Person
  { name :: String
  , age :: Int
  , occupation :: Occupation 
  } deriving (Show)

data Occupation = Occupation
  { title :: String
  , tenure :: Int
  , salary :: Int 
  } deriving (Show)

person1 :: Person
person1 = Person
  { name = “John Doe”
  , age = 26
  , occupation = Occupation
    { title = “teacher”
    , tenure = 5
    , salary = 60000
    }
  }

person2 :: Person
person2 = Person
  { name = “Jane Doe”
  , age = 25
  , occupation = Occupation
    { title = “engineer”
    , tenure = 4
    , salary = 90000
    }
  }

Now there are many different ways we can choose to encode these values. The most basic way might look something like this:

{
  “name” : “John Doe”,
  “age” : 26,
  “occupation” : {
    “title” : “teacher”,
    “tenure” : 5,
    “salary” : 60000
    }
}

{
  “name” : “Jane Doe”,
  “age” : 25,
  “occupation” : {
    “title” : “engineer”,
    “tenure” : 4,
    “salary” : 90000
  }
}

Now, we might want to help our friends using dynamically typed languages. To do this, we could provide some more context around the type contained in the object. This format might look something more like this:

{
  “type” : “person”,
  “contents” : {
    “name” : “John Doe”,
    “age” : 26,
    “occupation” = {
        “type” : “Occupation”,
        “contents” : {
        “title” : “teacher”,
        “tenure” : 5,
        “salary” : 60000
        }
      }
  }
}

Either way, we ultimately have to decide on a format with whoever we’re trying to interoperate with. It’s likely though that you’ll be working with an external API that’s already set their expectations. You have to make sure you match them.

Data.Aeson

Now that we know what our target values are, we need a Haskell representation for them. This comes from the Data.Aeson library (named for the father of the mythological character "Jason"). This library contains the type Value. It encapsulates all the basic JSON types in its constructors. (I’ve substituted a couple type synonyms for clarity):

data Value =
  Object (HashMap Text Value) |
  Array (Vector Value) |
  String Text |
  Number Scientific |
  Bool Bool |
  Null

So we see all six elements represented. So when we want to represent our items, we’ll do so using these constructors. And like with many useful Haskell ideas, we’ll use typeclasses here to provide some structure. We'll use two typeclasses: ToJSON and FromJSON. We’ll first go over converting our Haskell types into JSON. We have to define the toJSON function on our type, which will turn it into a Value.

In general, we’ll want to stick our data type into an object, and this object will have a series of “Pairs”. A Pair is the Data.Aeson representation of a key-value pair, and it consists of a Text and another value. Then we’ll combine these pairs together into a JSON Object by using the object function. We’ll start with the Occupation type, since this is somewhat simpler.

{-# LANGUAGE OverloadedString #-}

import Data.Aeson (ToJSON(..), Value(..), object, (.=))

...

instance ToJSON Occupation where
  toJSON :: Occupation -> Value
  toJSON occupation = object
    [ “title” .= toJSON (title occupation)
    , “tenure” .= toJSON (tenure occupation)
    , “salary” .= toJSON (salary occupation)
    ]

The .= operator creates a Pair for us. All our fields are simple types, which already have their own ToJSON instances. This means we can use toJSON on them instead of the Value constructors. Note also we use the overloaded strings extension (as introduced here) since we use string literals in place of Text objects. Once we’ve defined our instance for the Occupation type, we can call toJSON on an occupation object. This makes it easy to define an instance for the Person type.

instance ToJSON Person where
  toJSON person = object
    [ “name” .= toJSON (name person)
    , “age” .= toJSON (age person)
    , “occupation” .= toJSON (occupation person)
    ]

And now we can create JSON values from our data type! In general, we’ll also want to be able to parse JSON values and turn those into our data types. We'll use monadic notation to encapsulate the possibility of failure. If the keys we are looking for don’t show up, we want to throw an error:

import Data.Aeson (ToJSON(..), Value(..), object, (.=), (.:), FromJSON(..), withObject)

...

instance FromJSON Occupation where
  parseJSON = withObject “Occupation” $ \o -> do
    title_ <- o .: “title”
    tenure_ <- o .: “tenure”
    salary_ <- o .: “salary”
    return $ Occupation title_ tenure_ salary_

instance FromJSON Person where
  parseJSON = withObject “Person” $ \o -> do
    name_ <- o .: “name”
    age_ <- o .: “age”
    occupation_ <- o .: “occupation”
    return $ Person name_ age_ occupation_

A few notes here. The parseJSON functions are defined through eta reduction. This is why there seems to be no parameter. The .: operator can grab any data type that conforms to FromJSON itself. Just as we could use toJSON with simple types like String and Int, we can also parse them right out of the box. Plus we described how to parse an Occupation, which is why we can use the operator on the occupation field. Also, the first parameter to the withObject function is an error message we’ll get if our parsing fails. One last note is that our FromJSON and ToJSON instances are inverses of each other. This is definitely a good property for you to enforce within your own API definitions.

Now that we have these instances, we can see the JSON bytestrings for our different objects:

>> Data.Aeson.encode person1
"{\"age\":26,\"name\":\"John Doe\",\"occupation\":{\"salary\":60000,\"tenure\":5,\"title\":\"teacher\"}}"
>> Data.Aeson.encode person2
"{\"age\":25,\"name\":\"Jane Doe\",\"occupation\":{\"salary\":90000,\"tenure\":4,\"title\":\"engineer\"}}"

Deriving Instances

You might look at the instances we’ve derived and think that the code looks boilerplate-y. In truth, these aren’t particularly interesting instances. Keep in mind though, an external API might have some weird requirements. So it’s good to know how to create these instances by hand. Regardless, you might be wondering if it’s possible to derive these instances in the same way we can derive Eq or Ord. We can, but it’s a little complicated. There are actually two ways to do it, and they both involve compiler extensions. The first way looks more familiar as a deriving route. We’ll end up putting deriving (ToJSON, FromJSON) in our types. Before we do that though, we have to get them to derive the Generic typeclass.

Generic is a class that allows GHC to represent your types at a level of generic constructors. To use it, you first need to turn on the compiler extension for DeriveGeneric. This lets you derive the generic typeclass for your data. You then need to turn on the DeriveAnyClass extension as well. Once you have done this, you can then derive the Generic, ToJSON and FromJSON instances for your types.

{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE DeriveAnyClass    #-}

…

data Person = Person
  { name :: String
  , age :: Int
  , occupation :: Occupation
  } deriving (Show, Generic, ToJSON, FromJSON)

data Occupation = Occupation
  { title :: String
  , tenure :: Int
  , salary :: Int
  } deriving (Show, Generic, ToJSON, FromJSON)

With these definitions in place, we’ll get the same encoding output we got with our manual instances:

>> Data.Aeson.encode person1
"{\"age\":26,\"name\":\"John Doe\",\"occupation\":{\"salary\":60000,\"tenure\":5,\"title\":\"teacher\"}}"
>> Data.Aeson.encode person2
"{\"age\":25,\"name\":\"Jane Doe\",\"occupation\":{\"salary\":90000,\"tenure\":4,\"title\":\"engineer\"}}"

As you can tell, this is a super cool idea and it has widespread uses all over the Haskell ecosystem. There are many useful typeclasses with a similar pattern to ToJSON and FromJSON. You need the instance to satisfy a library constraint. But the instance you'll write is very boilerplate-y. You can get a lot of these instances by using the Generic typeclass together with DeriveAnyClass.

The second route involves writing Template Haskell. Template Haskell is another compiler extension allowing GHC to generate code for you. There are many libraries that have specific Template Haskell functions. These allow you to avoid a fair amount of boilerplate code that would otherwise be very tedious. Data.Aeson is one of these libraries.

First you need to enable the Template Haskell extension. Then import Data.Aeson.TH and you can use the simple function deriveJSON on your type. This will give you some spiffy new ToJSON and FromJSON instances.

{-# LANGUAGE TemplateHaskell #-}

import Data.Aeson.TH (deriveJSON, defaultOptions)

data Person = Person
  { name :: String
  , age :: Int
  , occupation :: Occupation
  } deriving (Show)


data Occupation = Occupation
  { title :: String
  , tenure :: Int
  , salary :: Int
  } deriving (Show)

-- The two apostrophes before a type name is template haskell syntax
deriveJSON defaultOptions ''Occupation
deriveJSON defaultOptions ''Person

We’ll once again get similar output:

>> Data.Aeson.encode person1
"{\"name\":\"John Doe\",\"age\":26,\"occupation\":{\"title\":\"teacher\",\"tenure\":5,\"salary\":60000}}"
>> Data.Aeson.encode person2
"{\"name\":\"Jane Doe\",\"age\":25,\"occupation\":{\"title\":\"engineer\",\"tenure\":4,\"salary\":90000}}"

Unlike deriving these types wholecloth, you actually have options here. Notice we passed defaultOptions initially. We can change this and instead pass some modified options. For instance, we can prepend our field names with the type if we want:

deriveJSON (defaultOptions { fieldLabelModifier = ("occupation_" ++)}) ''Occupation
deriveJSON (defaultOptions { fieldLabelModifier = ("person_" ++)}) ''Person

Resulting in the output:

>> Data.Aeson.encode person1
"{\"person_name\":\"John Doe\",\"person_age\":26,\"person_occupation\":{\"occupation_title\":\"teacher\",\"occupation_tenure\":5,\"occupation_salary\":60000}}"

Template Haskell can be convenient. It reduces the amount of boilerplate code you have to write. But it will also make your code take longer to compile. There’s also a price in accessibility when you use template Haskell. Most Haskell newbies will recognize deriving syntax. But if you use deriveJSON, they might be scratching their heads wondering where exactly you've defined the JSON instances.

Encoding, Decoding and Sending over Network

Once we’ve defined our different instances, you might be wondering how we actually use them. The answer depends on the library you’re using. For instance, the Servant library makes it easy. You don’t need to do any serialization work beyond defining your instances! Servant endpoints define their return types as well as their response content types. Once you've defined these, the serialization happens behind the scenes. Servant is an awesome library if you need to make an API. You should check out the talk I gave at BayHac 2017 if you want a basic introduction to the library. Also take a look at the code samples from that talk for a particular example.

Now, other libraries will require you to deal with JSON bytestrings. Luckily, this is also pretty easy once you’ve defined the FromJSON and ToJSON instances. As I’ve been showing in examples through the article, Data.Aeson has the encode function. This will take your JSON enabled type and turn it into a ByteString that you can send over the network. Very simple.

>> Data.Aeson.encode person1
"{\"name\":\"John Doe\",\"age\":26,\"occupation\":{\"title\":\"teacher\",\"tenure\":5,\"salary\":60000}}"

As always, decoding is a little bit trickier. You have to account for the possibility that the data is not formatted correctly. You can call the simple decode function. This gives you a Maybe value, so that if parsing is unsuccessful, you’ll get Nothing. In the interpreter, you should also make sure to specify the result type you want from decode, or else you’ll get that it tries to parse it as (), which will fail.

>> let b = Data.Aeson.encode person1
>> Data.Aeson.decode b :: Maybe Person
Just (Person {name = "John Doe", age = 26, occupation = Occupation {title = "teacher", tenure = 5, salary = 60000}})

To better handle error cases though, you should prefer eitherDecode. This will give you a an error message if it fails.

>> Data.Aeson.eitherDecode b :: Either String Person
Right (Person {name = "John Doe", age = 26, occupation = Occupation {title = "teacher", tenure = 5, salary = 60000}})
>> let badString = "{\"name\":\"John Doe\",\"occupation\":{\"salary\":60000,\"tenure\":5,\"title\":\"teacher\"}}"
>> Data.Aeson.decode badString :: Maybe Person
Nothing
>> Data.Aeson.eitherDecode badString :: Either String Person
Left "Error in $: key \"age\" not present"

Summary

So now we know all the most important points about serializing our Haskell data into JSON. The first step is to define ToJSON and FromJSON instances for the types you want to serialize. These are straightforward to write out most of the time. But there are also a couple different mechanisms for deriving them. Once you’ve done that, certain libraries like Servant can use these instances right out of the box. But handling manual ByteString values is also easy. You can simply use the encode function and the various flavors of decode.

Perhaps you’ve always thought to yourself, “Haskell can’t possibly be useful for web programming.” I hope this has opened your eyes to some of the possibilities. You should try Haskell out! Download our Getting Started Checklist for some pointers to some useful tools!

Maybe you’ve done some Haskell already, but you're worried that you’re getting way ahead of yourself by trying to do web programming. You should download our Recursion Workbook. It’ll help you solidify your functional programming skills!

Previous
Previous

Taking a Close look at Lenses

Next
Next

Smart Data with Conduits