JSON Autotype
By: Michał J. Gajda
Introduction
Static typing promises safer code and easier scaling to large projects. Polymorphic types with type classes promise the high level of abstraction typical of scripting languages while keeping strong discipline.
However, not all of our data is typed to begin with. The most common data format used by web apps is JSON, which is dynamically typed. Web API descriptions are usually just free-form text with examples given as JSON documents.
How is one supposed to type that?
The JSON-Autotype project promises to solve this problem automatically. It takes a couple of JSON documents and produces a valid type declaration. It also provides a parser and a printer for this JSON format, and it guarantees that the sample inputs will always parse.
It is also much faster than inferring the structure of JSON documents by eye. Some of our users have run it on megabytes and even gigabytes of documents^[Please contact us if you are interested in funding work on scaling JSON parsing to large documents without compromising type safety and with minimal change of interface.].
Example
Take an example JSON document:
{
    "colorsArray":[{
            "colorName":"red",
            "hexValue":"#f00"
        },
        {
            "colorName":"green",
            "hexValue":"#0f0"
        },
        {
            "colorName":"blue",
            "hexValue":"#00f"
        }
    ]
}
It will be used to generate the following Haskell^[Or Elm. Please let us know if you are interested in funding work on other languages such as PureScript, Scala, or F#.] type declaration:
data ColorsArray = ColorsArray {
    colorsArrayHexValue  :: Text,
    colorsArrayColorName :: Text
  } deriving (Show,Eq)
data TopLevel = TopLevel {
    topLevelColorsArray :: [ColorsArray]
  } deriving (Show,Eq)
Consider a case of ambiguous types, like the following:
{
    "parameter":[{
            "parameterName" :"apiVersion",
            "parameterValue":1
        },
        {
            "parameterName" :"failOnWarnings",
            "parameterValue":false
        },
        {
            "parameterName" :"caller",
            "parameterValue":"site API"
        }]
}
Typing this document uses the :|: union type (similar to Either, but with no tag in the JSON representation):
data Parameter = Parameter {
    parameterParameterValue :: Bool :|: Int :|: Text,
    parameterParameterName  :: Text
  }
data TopLevel = TopLevel {
    topLevelParameter :: [Parameter]
  }
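To make the idea of a union without a JSON-level tag concrete, here is a minimal sketch using only base. It assumes a two-constructor representation of :|: (the constructor names AltLeft and AltRight, the fixity, the Scalar stand-in for JSON scalars, and the FromScalar class are all illustrative assumptions, not the library's confirmed API). Decoding tries the left alternative first and falls back to the right one:

```haskell
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}

-- Assumed fixity: Bool :|: Int :|: String reads as Bool :|: (Int :|: String).
infixr 5 :|:

-- Assumed representation of the union type; the constructors exist only
-- on the Haskell side, the JSON input carries no tag.
data a :|: b = AltLeft a | AltRight b
  deriving (Show, Eq)

-- Hypothetical stand-in for a JSON scalar; a real decoder would work on
-- a full JSON value type instead.
data Scalar = SBool Bool | SInt Int | SText String
  deriving (Show, Eq)

-- Hypothetical decoding class for this sketch.
class FromScalar a where
  fromScalar :: Scalar -> Maybe a

instance FromScalar Bool where
  fromScalar (SBool b) = Just b
  fromScalar _         = Nothing

instance FromScalar Int where
  fromScalar (SInt n) = Just n
  fromScalar _        = Nothing

instance FromScalar String where
  fromScalar (SText s) = Just s
  fromScalar _         = Nothing

-- Untagged union: try the left alternative, then fall back to the right.
instance (FromScalar a, FromScalar b) => FromScalar (a :|: b) where
  fromScalar v = case fromScalar v of
    Just a  -> Just (AltLeft a)
    Nothing -> AltRight <$> fromScalar v

-- The type of parameterValue from the example above (String in place of Text).
type ParameterValue = Bool :|: Int :|: String

decodeParameterValue :: Scalar -> Maybe ParameterValue
decodeParameterValue = fromScalar
```

With these definitions, decodeParameterValue (SInt 1) yields Just (AltRight (AltLeft 1)): the Bool alternative is rejected and the Int alternative is accepted, without the input saying which branch to take.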
Theory behind the machine
We use a union type system without polymorphic type variables to infer the types of JSON documents. Each JSON value is translated into a type:
data Type =
    TBool | TString | TInt | TDouble
  | TNull
  | TUnion (Set Type)
  | TObj Dict
  | TArray Type
Then it uses TUnion of a set of types to reconcile values that sometimes have different primitive JSON types. (TObj represents a dictionary object.)
Then it uses unification to reconcile types between different values.
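The two steps above, translating values into types and unifying the results, can be sketched in a few lines. This is a simplified illustration, not the project's actual implementation: the Value type is a hypothetical stand-in for a parsed JSON document, Dict is assumed to be a map from field names to types, and treating a missing object field as TNull is an assumption of this sketch.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Map.Strict (Map)
import Data.Set (Set)

-- Hypothetical stand-in for a parsed JSON value.
data Value
  = VNull | VBool Bool | VInt Integer | VDouble Double | VString String
  | VArray [Value]
  | VObject (Map String Value)
  deriving (Show, Eq)

-- Assumption: Dict maps each field name to the type of that field.
type Dict = Map String Type

data Type
  = TBool | TString | TInt | TDouble
  | TNull
  | TUnion (Set Type)
  | TObj Dict
  | TArray Type
  deriving (Show, Eq, Ord)

-- The empty union: the type of "no values seen yet".
emptyType :: Type
emptyType = TUnion Set.empty

-- Translate a single value into a type; array elements are unified.
inferType :: Value -> Type
inferType VNull       = TNull
inferType (VBool _)   = TBool
inferType (VInt _)    = TInt
inferType (VDouble _) = TDouble
inferType (VString _) = TString
inferType (VArray vs) = TArray (foldr (unify . inferType) emptyType vs)
inferType (VObject o) = TObj (Map.map inferType o)

-- Reconcile two types: merge their alternatives into one union.
unify :: Type -> Type -> Type
unify a b = simplify (foldr addAlt (alternatives a) (alternatives b))

alternatives :: Type -> [Type]
alternatives (TUnion s) = Set.toList s
alternatives t          = [t]

-- Add one alternative, merging it with an existing object or array type
-- so that a union holds at most one TObj and at most one TArray.
addAlt :: Type -> [Type] -> [Type]
addAlt t [] = [t]
addAlt (TObj d1) (TObj d2 : rest)   = TObj (unifyDicts d1 d2) : rest
addAlt (TArray x) (TArray y : rest) = TArray (unify x y) : rest
addAlt t (u : rest)
  | t == u    = u : rest
  | otherwise = u : addAlt t rest

-- A union with exactly one alternative is just that alternative.
simplify :: [Type] -> Type
simplify [t] = t
simplify ts  = TUnion (Set.fromList ts)

-- Unify objects field by field; a field missing on one side counts as TNull.
unifyDicts :: Dict -> Dict -> Dict
unifyDicts d1 d2 = Map.unionWith unify (withMissing d1 d2) (withMissing d2 d1)
  where
    withMissing d other =
      Map.union d (Map.map (const TNull) (Map.difference other d))

-- The "parameter" array from the ambiguous example above.
parameters :: Value
parameters = VArray
  [ param "apiVersion" (VInt 1)
  , param "failOnWarnings" (VBool False)
  , param "caller" (VString "site API")
  ]
  where
    param name value = VObject (Map.fromList
      [("parameterName", VString name), ("parameterValue", value)])
```

Loaded into GHCi, inferType parameters evaluates to an array of objects in which parameterName is TString and parameterValue is a TUnion of TBool, TInt, and TString, which mirrors the Bool :|: Int :|: Text field in the generated declaration.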
This is another successful use of mathematical theory to simplify a practical software problem.
References
The details are described in the presentation given at Engineers.SG and Monadic Warsaw, but you may prefer to look at the source code yourself!
Other articles about JSON Autotype:
- 24 days of Hackage 2015
- 24 dias de Hackage 2015
- Aelve guide to JSON lists JSON Autotype as an alternative instance generator.