Representing JSON structures in Go

Tags Go

Following some period of time reading and answering Stack Overflow questions about Go, last year I wrote the Go JSON Cookbook post to summarize some of the common issues with Go and JSON. Since then, I noticed a particular kind of JSON-related question that keeps recurring, so I want to address it directly in a separate post that will be easier to link to than specific sections in a large "cookbook".

The question is commonly accompanied by a snippet of JSON data, asking how to map it onto Go. Typically, the JSON data is some sort of API response, but it could really be anything.

Here's a sample piece of JSON we'll be working with in this post. It's much simpler than what you'd encounter "in the wild", but it should generalize well to other structures:

{ "attrs": [ { "name": "color", "count": 9 }, { "name": "family", "count": 127 }], "fruits": [ { "name": "orange", "sweetness": 12.3, "attr": {"family": "citrus"} } ]

The first approach, which should be the default in most cases is the full-fidelity struct representation. This means a completely type-safe mapping from the JSON object to a Go struct. It's easy enough to write such struct types by hand, but tools exist that make this easier. For example, here's the struct generated by JSON-to-Go:

type AutoGenerated struct { Attrs []struct { Name string `json:"name"` Count int `json:"count"` } `json:"attrs"` Fruits []struct { Name string `json:"name"` Sweetness float64 `json:"sweetness"` Attr struct { Family string `json:"family"` } `json:"attr"` } `json:"fruits"`

Parsing our JSON into this struct and iterating over all fruits to find their sweetness is a simple matter of [1]:

var ag AutoGenerated
if err := json.Unmarshal(jsonText, &ag); err != nil { log.Fatal(err)
for _, fruit := range ag.Fruits { fmt.Printf("%s -> %f\n", fruit.Name, fruit.Sweetness)

This isn't much more verbose than you'd find in dynamic languages, and yet it's completely type safe. Moreover, the JSON is automatically validated to match out struct upon parsing.

The Go JSON package supports parsing into a generic representation using reflection. This is akin to a map-of-lists-of-maps or similar nomenclature in dynamic languages; generally, assuming the top-level JSON fields are strings (a safe bet), we can ask json.Unmarshal to parse the JSON into a map[string]interface{}. Under the hood, the parser will create concrete Go types according to what it encounters in JSON, but the Go compiler has no way of knowing what those types are at compile time, so we'll have to use runtime type assertions to unravel them.

Here's a complete example that prints out all the fruit names and their sweetness, without using a struct declaration:

var m map[string]interface{}
if err := json.Unmarshal(jsonText, &m); err != nil { log.Fatal(err)
} fruits, ok := m["fruits"]
if !ok { log.Fatal("'fruits' field not found")
fslice, ok := fruits.([]interface{})
if !ok { log.Fatal("'fruits' field not a slice")
} for _, f := range fslice { fmap, ok := f.(map[string]interface{}) if !ok { log.Fatal("'fruits' element not a map") } name, ok := fmap["name"] if !ok { log.Fatal("fruits element has no 'name' field") } sweetness, ok := fmap["sweetness"] if !ok { log.Fatal("fruits element has no 'sweetness' field") } fmt.Printf("%s -> %f\n", name, sweetness)

Note the huge amount of error checking require here - we are forced to carefully check for the existence of every field and the type of every value we need on the way to the actual field/value pairs we're interested in.

This seems like a lot of code, but this is very similar to what dynamic languages are doing behind the scenes when similar code is written. For example, we could write the same code as follows:

var m map[string]interface{}
if err := json.Unmarshal(jsonText, &m); err != nil { log.Fatal(err)
} fruits := m["fruits"].([]interface{})
for _, f := range fruits { fruit := f.(map[string]interface{}) fmt.Printf("%s -> %f\n", fruit["name"], fruit["sweetness"])

Which isn't very different from how it'd look in, say, Python (besides the type assertions). Notice what's missing? The error checking. Instead, Go will panic at runtime if something is not as expected. This is, once again, similar to dynamic languages which typically use exceptions. In Go you could achieve roughly the same effect by letting the code panic in case of errors and using recover to catch the panic at a higher level; this is appealing but has its own problems.

So the verbosity of the code is not due to its untyped nature; it's mostly due to the Go explicit error checking norms.

So far we've seen a fully type-safe representation of a JSON object and an "untyped" representation. An interesting compromise is using a hybrid representation, in which we proceed in an untyped manner until we hit the interesting piece of JSON, which we then want to represent with an actual struct.

For our fruits example, we can define the following struct:

type Fruit struct { Name string `json:"name"` Sweetness float64 `json:"sweetness"` Attr map[string]string `json:"attr"`

And write this parsing code:

var m map[string]json.RawMessage
if err := json.Unmarshal(jsonText, &m); err != nil { log.Fatal(err)
} fruitsRaw, ok := m["fruits"]
if !ok { log.Fatal("expected to find 'fruits'")
} var fruits []Fruit
if err := json.Unmarshal(fruitsRaw, &fruits); err != nil { log.Fatal(err)
for _, fruit := range fruits { fmt.Printf("%s -> %f\n", fruit.Name, fruit.Sweetness)

Note the usage of json.RawMessage. What we're telling json.Unmarshal in the first call is: parse the object into a map with string keys, but leave the values unparsed. It's important to do this because if we set the values to interface{}, json.Unmarshal will parse them into concrete types, or maps/slices thereof, as we've seen in the generic representation. Keeping the values as json.RawMessage instead lets us delay the parse to until we know a more concrete type to parse to - like Fruit in this case.

Such delayed parsing can have additional benefits like performance. We may be parsing a large JSON object but are only interested in a single key; using map[string]json.RawMessage tells the parser not to parse the values. We can later only parse the interesting values, not wasting resources on others.

The hybrid approach is often useful when we're interested in a small part of a complicated JSON structure which we don't care to fully validate. Moreover, some JSON structures actually don't have a matching static type (e.g. fields can have different types based on other fields). In these cases using a full-fidelity struct representation of the JSON may be either cumbersome or infeasible, and the hybrid approach is a good compromise to expose the pieces of data we actually care about more conveniently.