Why JSON isn’t a Good Configuration Language

Many projects use JSON for configuration files. Perhaps the most obvious example is the package.json file used by npm and yarn, but there are many others, including CloudFormation (originally JSON only, but now supports YAML as well) and composer (PHP).

However, JSON is actually a pretty terrible configuration language for a number of reasons. Don’t get me wrong — I like JSON. It is a flexible format that is relatively easy for both machines and humans to read, and it’s a pretty good data interchange and storage format. But as a configuration language, it falls short.

Why is JSON popular as a config language?

There are several reasons why JSON is used for configuration files. The biggest reason is probably that it is easy to implement. Many languages have JSON support in the standard library, and those that don’t almost certainly have an easy-to-use JSON package readily available. Then there is the fact that developers and users are probably already familiar with JSON and don’t need to learn a new configuration format to use the product. And that’s not to mention all the existing tooling for JSON, including syntax highlighting, auto-formatting, validation tools, etc.

These are actually all pretty good reasons. It’s too bad that this ubiquitous format is so ill-suited for configuration.

The problems with JSON

Lack of comments

One feature that is absolutely vital for a configuration language is comments. Comments are necessary to annotate what different options are for, to explain why a particular value was chosen, and (perhaps most importantly) to temporarily comment out parts of the config while using a different configuration for testing and debugging. JSON has no comment syntax at all. If you think of JSON as a data interchange format, that makes sense, but for configuration it is a serious gap.

There are, of course, workarounds for adding comments to JSON. One common workaround is to use a special key in an object for a comment, such as “//” or “__comment”. However, this syntax isn’t very readable, and in order to include more than one comment in a single object, you need to use unique keys for each. Douglas Crockford (the inventor of JSON) suggests using a preprocessor to remove comments. If you are using an application that requires JSON configuration, I recommend that you do just that, especially if you already have any kind of build step before the configuration is used. Of course, that adds extra work to editing configuration, so if you are creating an application that parses a configuration file, don’t count on your users being able to run a preprocessor.
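
To make the preprocessor idea concrete, here is a minimal sketch in Python. It is not Crockford’s tool, just one way such a step could look. The function name strip_json_comments and the sample keys port and debug are made up for illustration. It removes // and /* */ comments that sit outside string literals and then hands the result to the standard json module.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i = 0
    in_string = False
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Copy the escaped character so an escaped quote does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            end = text.find("\n", i)
            i = len(text) if end == -1 else end
        elif text.startswith("/*", i):
            # Skip past the closing marker; an unterminated comment swallows the rest.
            end = text.find("*/", i + 2)
            i = len(text) if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical annotated config; the keys are illustrative only.
annotated = """
{
  // Port the service listens on
  "port": 8080,
  "url": "https://example.com/bugs", /* slashes inside strings are preserved */
  "debug": false
}
"""

config = json.loads(strip_json_comments(annotated))
print(config)
```

A real build step would more likely use an existing tool, but the sketch shows why the stripping has to be string-aware: a naive search for // would mangle URLs.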

Some JSON libraries do allow comments as input. For example, Ruby’s JSON module and the Java Jackson library with the JsonParser.Feature.ALLOW_COMMENTS feature enabled will handle JavaScript-style comments just fine in JSON input. However, this is non-standard, and many editors don’t properly handle comments in JSON files, which makes editing them a little harder.

Overly strict

The JSON specification is pretty restrictive. Its restrictiveness is part of what makes it easy to implement a JSON parser, but in my opinion, it also hurts the readability and, to a lesser extent, writability by humans.

Low Signal to Noise

Compared to many other configuration languages, JSON is pretty noisy. There is a lot of punctuation that doesn’t aid human readability, although it does make it easier to write implementations for machines. In particular, for configuration files, the keys in objects are almost always identifiers, so the quotation marks around the keys are redundant.

Also, a JSON configuration file whose top level is an object needs curly braces around the entire document. That is part of what makes JSON an (almost) subset of JavaScript, and it helps delimit different objects when multiple objects are sent over a stream. But for a configuration file, the outermost braces are just useless clutter. The commas between key-value pairs are also mostly unnecessary in config files. Generally, you will have a single key-value pair per line, so it would make sense to accept a newline as a delimiter.

Speaking of commas, JSON doesn’t accept trailing commas. If you need commas after each pair, it should at least accept trailing commas, since trailing commas make adding new entries to the end easier and lead to cleaner commit diffs.
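
As a quick illustration, this sketch feeds a document with a trailing comma to Python’s standard json module, which follows the JSON grammar here and rejects it. The dependency names are borrowed from the example in this post. The exact error message varies between Python versions, so the code only records whether parsing succeeded.

```python
import json

with_trailing_comma = """
{
  "dep1": "^1.0.0",
  "dep2": "3.40",
}
"""

try:
    json.loads(with_trailing_comma)
    trailing_comma_accepted = True
except json.JSONDecodeError as err:
    # The wording of err.msg differs between Python versions.
    trailing_comma_accepted = False
    print(f"rejected: {err.msg}")

print(trailing_comma_accepted)  # False
```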

Long Strings

Another problem with JSON as a configuration format is it doesn’t have any support for multi-line strings. If you want newlines in the string, you have to escape them with “\n”, and what’s worse, if you want a string that carries over onto another line of the file, you are just out of luck. If your configuration doesn’t have any strings that are too long to fit on a line, this isn’t a problem. However, if your configuration includes long strings, such as the description of a project or a GPG key, you probably don’t want to put it on a single line with “\n” escapes instead of actual newlines.
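
The difference is easy to see with Python’s standard json module (used here only as a convenient strict parser). A string literal that contains a raw line break is rejected, while the same text with an escaped “\n” parses and yields a two-line value.

```python
import json

# The "\n" in this Python literal is a real line break inside the JSON string.
raw_newline = '{"description": "first line\nsecond line"}'

# Here the JSON text itself contains the two characters backslash and n.
escaped_newline = '{"description": "first line\\nsecond line"}'

try:
    json.loads(raw_newline)
    raw_newline_accepted = True
except json.JSONDecodeError:
    raw_newline_accepted = False

description = json.loads(escaped_newline)["description"]

print(raw_newline_accepted)      # False
print(description.splitlines())  # ['first line', 'second line']
```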

Numbers

In addition, JSON’s definition of a number can be problematic in some scenarios. As defined in the JSON spec, numbers are arbitrary precision finite floating point numbers in decimal notation. For many applications, this is fine. But if you need to use hexadecimal notation or represent values like infinity or NaN, then TOML or YAML would be able to handle the input better.
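
Here is a small sketch of how this plays out in practice, again using Python’s json module. One caveat: by default that module accepts NaN and Infinity as a non-standard extension, so the sketch uses the parse_constant hook to reject them and stay closer to the spec. The helper names and the keys mask and limit are made up for illustration.

```python
import json


def reject_constant(name: str):
    # json calls this hook for the literals NaN, Infinity and -Infinity.
    raise ValueError(f"{name} is not a valid JSON number")


def parse_strict(text: str):
    """Parse JSON while refusing the non-standard NaN/Infinity literals."""
    return json.loads(text, parse_constant=reject_constant)


def is_accepted(text: str) -> bool:
    try:
        parse_strict(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(is_accepted('{"mask": 255}'))        # True
print(is_accepted('{"mask": 0xFF}'))       # False: no hexadecimal notation
print(is_accepted('{"limit": Infinity}'))  # False once the extension is disabled

# A common workaround: store the value as a string and convert it in the application.
config = parse_strict('{"mask": "0xFF"}')
mask = int(config["mask"], 16)
print(mask)  # 255
```

The workaround works, but it pushes parsing and validation of the number into every application that reads the file.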


{
  "name": "example",
  "description": "A really long description that needs multiple lines.\nThis is a sample project to illustrate why JSON is not a good configuration format. This description is pretty long, but it doesn't have any way to go onto multiple lines.",
  "version": "0.0.1",
  "main": "index.js",
  "//": "This is as close to a comment as you are going to get",
  "keywords": ["example", "config"],
  "scripts": {
    "test": "./test.sh",
    "do_stuff": "./do_stuff.sh"
  },
  "bugs": {
    "url": "https://example.com/bugs"
  },
  "contributors": [{
    "name": "John Doe",
    "email": "johndoe@example.com"
  }, {
    "name": "Ivy Lane",
    "url": "https://example.com/ivylane"
  }],
  "dependencies": {
    "dep1": "^1.0.0",
    "dep2": "3.40",
    "dep3": "6.7"
  }
}

What you should use instead

The configuration language you choose will depend on your application. Each language has different pros and cons, but here are some choices to consider. They are all languages that are designed for configuration first and would each be a better choice than a data language like JSON.

TOML

TOML is an increasingly popular configuration language. It is used by Cargo (Rust build tool), pip (Python package manager), and dep (golang dependency manager). TOML is somewhat similar to the INI format, but unlike INI, it has a standard specification and well-defined syntax for nested structures. It is substantially simpler than YAML, which is attractive if your configuration is fairly simple. But if your configuration has a significant amount of nested structure, TOML can be a little verbose, and another format, such as YAML or HOCON, may be a better choice.

name = "example"
description = """
A really long description that needs multiple lines.
This is a sample project to illustrate why JSON is not a \
good configuration format. This description is pretty long, \
but it doesn't have any way to go onto multiple lines."""

version = "0.0.1"
main = "index.js"
# This is a comment
keywords = ["example", "config"]

[bugs]
url = "https://example.com/bugs"

[scripts]

test = "./test.sh"
do_stuff = "./do_stuff.sh"

[[contributors]]
name = "John Doe"
email = "johndoe@example.com"

[[contributors]]
name = "Ivy Lane"
url = "https://example.com/ivylane"

[dependencies]

dep1 = "^1.0.0"
# Why we depend on dep2
dep2 = "3.40"
dep3 = "6.7"

HJSON

HJSON is a format based on JSON but with greater flexibility to make it more readable. It adds support for comments, multi-line strings, unquoted keys and strings, and optional commas. If you want the simple structure of JSON but something more friendly for configuration files, HJSON is probably the way to go. There is also a command line tool that can convert HJSON to JSON, so if you are using a tool that requires plain JSON, you can write your configuration in HJSON and convert it to JSON as a build step. JSON5 is another option that is pretty similar to HJSON.

{
  name: example
  description: '''
  A really long description that needs multiple lines.
  
  This is a sample project to illustrate why JSON is 
  not a good configuration format.  This description 
  is pretty long, but it doesn't have any way to go 
  onto multiple lines.
  '''
  version: 0.0.1
  main: index.js
  # This is a comment
  keywords: ["example", "config"]
  scripts: {
    test: ./test.sh
    do_stuff: ./do_stuff.sh
  }
  bugs: {
    url: https://example.com/bugs
  }
  contributors: [{
    name: John Doe
    email: johndoe@example.com
  } {
    name: Ivy Lane
    url: https://example.com/ivylane
  }]
  dependencies: {
    dep1: ^1.0.0
    # Why we have this dependency
    dep2: "3.40"
    dep3: "6.7"
  }
}

HOCON

HOCON is a configuration format designed for the Play framework, and it is fairly popular among Scala projects. It is a superset of JSON, so existing JSON files can be used. Besides the standard features of comments, optional commas, and multi-line strings, HOCON supports importing from other files, referencing the values of other keys to avoid duplication, and using dot-delimited keys to specify paths to a value, so users do not have to put all values directly in a curly-brace object.

name = example
description = """
A really long description that needs multiple lines.

This is a sample project to illustrate why JSON is 
not a good configuration format.  This description 
is pretty long, but it doesn't have any way to go 
onto multiple lines.
"""
version = 0.0.1
main = index.js
# This is a comment
keywords = ["example", "config"]
scripts {
  test = ./test.sh
  do_stuff = ./do_stuff.sh
}
bugs.url = "https://example.com/bugs"
contributors = [
  {
    name = John Doe
    email = "johndoe@example.com"
  }
  {
    name = Ivy Lane
    url = "https://example.com/ivylane"
  }
]
dependencies {
  dep1 = "^1.0.0"
  # Why we have this dependency
  dep2 = "3.40"
  dep3 = "6.7"
}

YAML

YAML (YAML Ain’t Markup Language) is a very flexible format that is almost a superset of JSON and is used in several conspicuous projects such as Travis CI, Circle CI, and AWS CloudFormation. Libraries for YAML are almost as ubiquitous as those for JSON. In addition to support for comments, newline delimiting, multi-line strings, bare strings, and a more flexible type system, YAML also allows you to reference earlier structures in the file to avoid code duplication.

The main downside to YAML is that the specification is pretty complicated, which results in inconsistencies between different implementations. It also treats indentation levels as syntactically significant (similar to Python), which some people like and others don’t. Significant indentation can also make copying and pasting tricky. See “YAML: probably not so great after all” for a more complete description of the downsides of using YAML.

name: example
description: >
  A really long description that needs multiple lines.
  
  This is a sample project to illustrate why JSON is not a good 
  configuration format. This description is pretty long, but it 
  doesn't have any way to go onto multiple lines.
version: 0.0.1
main: index.js
# this is a comment
keywords:
  - example
  - config
scripts: 
  test: ./test.sh
  do_stuff: ./do_stuff.sh
bugs: 
  url: "https://example.com/bugs"
contributors:
  - name: John Doe
    email: johndoe@example.com
  - name: Ivy Lane
    url: "https://example.com/ivylane"
dependencies:
  dep1: ^1.0.0
  # Why we depend on dep2
  dep2: "3.40"
  dep3: "6.7"

Scripting language

If your application is written in a scripting language such as Python or Ruby, and you know the configuration comes from a trusted source, the best option may be to simply use a file written in that language for your configuration. It’s also possible to embed a scripting language such as Lua in compiled languages if you need a truly flexible configuration option. Doing so gives you the full flexibility of the scripting language and can be simpler to implement than using a different configuration language. The downside to using a scripting language is it may be too powerful, and of course, if the source of the configuration is untrusted, it introduces serious security problems.
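
As a rough sketch of what this can look like in Python, the example below writes a small config script to a temporary file and loads it with runpy.run_path from the standard library. The file name app_config.py and the setting names are made up. Note that this executes whatever the file contains, which is exactly why it is only appropriate for trusted configuration.

```python
import os
import runpy
import tempfile

# Contents of a hypothetical config file; the setting names are illustrative.
config_source = """
base_url = "https://example.com"
bugs_url = f"{base_url}/bugs"   # values can be derived from other values
workers = max(2, 8 // 2)        # arbitrary expressions are allowed
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "app_config.py")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(config_source)
    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(path)

# Drop interpreter-provided names such as __name__ and __builtins__.
settings = {key: value for key, value in namespace.items() if not key.startswith("_")}
print(settings)
```

Comments, multi-line strings, and derived values all come for free, at the cost of the config file being able to do anything the application can.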

Write your own

If for some reason a key-value configuration format doesn’t meet your needs, and you can’t use a scripting language due to performance or size constraints, then it might be appropriate to write your own configuration format. But if you find yourself in this scenario, think long and hard before making a choice that will not only require you to write and maintain a parser but also require your users to become familiar with yet another configuration format.

Conclusion

With so many better options for configuration languages, there’s no good reason to use JSON. If you are creating a new application, framework, or library that requires configuration, choose something other than JSON.

31 Comments

  1. There is a format called EDN, created for Clojure and available in Java and other JVM languages, but shamefully not broadly known.

  2. Richard S. (July 17, 2018 at 1:21 am)

    There is also the Groovy configscript format which consumes Groovy scripts as configurations, so you have all the features of the Groovy language, plus the fact that it’s in the Groovy standard library and that you can configure the compiler that reads the script deserves a mention in my opinion: http://mrhaki.blogspot.com/2009/10/groovy-goodness-using-configslurper.html

  3. Jim Williams (July 17, 2018 at 4:03 am)

    XML is another choice for a configuration language. It shares many of the negative points of JSON like “signal to noise”. Also many of the positive points like available editors and parsers.

    Value is realized when XML is used with a data type document and a competent XML editor. In that case, it is a breeze to verify the file’s syntax. When a DTD is present many editors can prompt for elements and attributes, assisting the author of the config file.

  4. Jeff Groom (July 17, 2018 at 7:46 am)

    What format would you recommend to use in a lucidchart diagram? Currently, there is key/value for each drawing object. It would be nice if a yaml, hocon, or json could be associated to store a more complex set of data.

  5. Tyler Davis (July 17, 2018 at 10:00 am)

    Hey Jeff, I’m a developer on the data and automation team here at Lucidchart. I am interested in your use case and what type of data you are trying to visualize and attach to your shapes. Do you have time within the next week for a short phone call? You can grab some time on my calendar at https://calendly.com/lucidlaura/lucid-feature-dev. If you’re able to set aside some time, I’d love to say thank you by sending you a $25 Amazon gift card. Thanks for your help!

  7. Dave Cunningham (July 19, 2018 at 3:40 pm)

    Jsonnet (jsonnet.org) is another option. It’s designed for configuration but is a full-blown programming language with lots of constructs for generating and abstracting config to avoid duplication, etc. It generates JSON, so it can be used with existing tools that accept JSON or YAML, like CloudFormation.

  8. One minor edit for correctness – the creator of JSON is Douglas Crockford not David.
    https://en.wikipedia.org/wiki/JSON

  9. Thayne McCombs (July 27, 2018 at 9:57 am)

    Fixed

  10. Definitely agree that using a full scripting language is far better than using configuration languages. In the rare case the configuration files are coming from an external source, yeah, use something that isn’t executable, but for everything else a full programming language is better than some configuration language that constrains you and that will be unfamiliar to most people.

  13. You can just include comments in a json file like
    {
    "comment": "This is express",
    "text": "tesr"
    }

  14. Thayne McCombs (November 28, 2018 at 2:11 pm)

    That works sometimes. But it has a lot of limitations:

    • It only works inside of objects. You can’t use it inside an array.
    • The comment can only be one line
    • If you need multiple comments in the same object, you have to use unique keys for each
    • If the application requires all keys to be known keywords, it will reject json with this kind of comment
    • If the object is a mapping from arbitrary keys to values, then this kind of comment would end up in your comment getting used as a value (or causing an error), which could mean comments are valid in some places, but not others.
    • The comment looks just like normal content, which makes it not stand out when reading the code

  15. There is also http://thindf.org

  16. I agree that some of the popular formats can cause some pain but in the cloud where you use a configuration management tool like puppet you can script your configuration files regardless, making things easy. I would advise against using something obscure as you may regret it later. Always try to choose depending on your scaling potential.

  17. We got a shiny new format known as dhall that is worth checking out. It’s the most extensive and powerful config language I know of:

    https://dhall-lang.org/
    https://github.com/dhall-lang/dhall-lang

  18. I just wrote a new language called “generic config-oriented language”. Its syntax is like Perl’s, and it supports variables, variable overrides, and if, for, functions and types in JSON. It’s pretty simple but powerful. It always outputs JSON but doesn’t have I/O support, so no “too powerful” issues and pretty secure. I hope I can open source it someday (there could be legal issues as this was developed during working time, so basically it belongs to my company, I think).

  20. I’ve always thought INI was still the best option and good to see that TOML is an evolutionary advancement on that. json is horrible for config. xml is structured but noisy. Ini is simple and intuitive. It’s not standard but doesn’t really need to be anyway as long as your file can read it. Simple categorized key=value pairs and even arrays can be handled with INI. You can use any delimiter you like for multiple dimensions or just leave it all at single dimension.

  21. Jacek Krawczyk (July 24, 2019 at 2:47 am)

    @dss: I cannot agree more with you. You can create your own standard for the INI files like slash or backslash to delimit one part of the tree from another. I have written a framework where I cannot be on the wrong side while using date or number formats on different computers. The only problem is with the recurrent keys/sections. If I have an irregular structure, I am going for XML…

  22. XML was mentioned before, but it got a rather light touch.

    I’d like to mention that XML has an additional feature. It comes with validation of the existence and type of entries, in the form of XSD or other validation add-ons.

    This means you can print error messages when an element is missing, added in a place where it is not used, or set to an invalid value without writing your own validator.

    Many XML editors will even use this information to auto complete away some of the wordiness.

    It is also versioned, making an application which supports old and new configuration files relatively easy to build.

    XML isn’t perfect, but it has a lot of features.

  23. In my opinion, the problem with all these config languages is that they are not Turing complete. That means that they can only ever be static. So what at first seems like a statically defined variable… when you later need it to be dependent on some criteria… you are screwed.

    In my opinion, the best config language is whatever language you are programming in. If the language you are programming in isn’t a good config language, then it probably isn’t a good language at all. That’s why Groovy has gained some popularity as a config language, it’s Java that’s made a bit more config friendly.

  24. Thayne McCombs (September 30, 2019 at 5:17 pm)

    > In my opinion, the best config language is whatever language you are programming in. If the language you are programming in isn’t a good config language, then it probably isn’t a good language at all.

    I don’t think that is necessarily true. For example, I don’t think compiled languages make good config languages, because needing to compile your config is not a very good UX. Also, I don’t think it is always a wise decision to require users to know the language the application was written in in order to configure it.

  25. Good critique and collection of alternatives. However, the deeper questions are:
    * what should be configurable and why?
    * where should the application look for configuration? File? Environment variables? Network?
    * does the application need to restart to reread its configuration?
    * what is the difference between configuration data, mostly static application data and constants? Where should each be defined?

  26. […] WHY JSON ISN’T A GOOD CONFIGURATION LANGUAGE – good article that explains why yaml is better for configuration than json. When I originally encountered yaml I was like WTF another damn BS PITA I have to learn. Honestly I don’t learn this more than what I need to get things done. Your brain only remembers about 23% of what you learn 12 months later anyways. […]

  27. I freakin’ hate YAML with the power of 300 main-sequence stars. Sure, JSON is cumbersome for configurations… but how about good ol’ XML??? Even the bulk of Microsoft tools can consume and emit XML. Sure, XML is slightly “bloated” due to all the tags… but it is INFINITELY more readable than YAML. Also, libraries that consume XML have been around since the discovery of fire.

    Honestly, what can YAML do that XML can’t but with more clarity for a HUGE swath of EXISTING developers and system admins?

    Honestly it feels like the people who invent and shove these “configuration languages” down our throats are just trying to stir the pot for sake of stirring the pot.

  28. Aman Jiang (May 22, 2021 at 10:47 pm)

    You can take a look at the Eclog format, and maybe you will like it.

    https://www.eclog.org/

  29. Michael Heisler (October 1, 2021 at 5:08 am)

    On the contrary!
    I haven’t found a better config language for my needs by now than JSON.
    All the data that is configurable is stored in nested objects and as JSON is an Object Notation it is a perfect choice. Besides, I do not want users to write config files manually, so comments are not necessary. And I do not want a config file to be an executable it has to store data! And if a value depends on other values then the logic in my app should solve that. Could you please give an good example of a config language where you may mirror a data tree e.g. like the DOM in HTML which has less noise.

  30. Thayne McCombs (October 1, 2021 at 1:35 pm)

    > I do not want users to write config files manually

    This article is assuming that the configuration file is written, or at least edited by a human. If the file is only read and written by a machine, it’s a whole different beast. At that point you could even use a binary format like SQLite, bdb, levelDB/rocksDB, etc.

    > Could you please give an good example of a config language where you may mirror a data tree e.g. like the DOM in HTML which has less noise

    There are many: HJSON, JSON5 (and its successors), HOCON, dhall, TOML, just to name a few.

  31. Although I would appreciate comments in JSON files, too, a config file is not a manual. And not a notebook to write exhaustive texts.
    And JSON has the big advantage that there exists supersets, like YAML etc, so you can switch to a better suited format if necessary, without changing old config files.
    And JSON has jsonschema and jsonpath, so it is as powerful as XML but with a better SNR.
