Here's an argument in favor of NULL values, as previously discussed and rejected in #30.
I believe that in configuring a system, the most important things are:
Therefore, I think it's important to be able to define, and comment, keys for which you don't yet have a value. A TOML document should be able to act as a specification for the possible configuration. It may be preferable not to define a value in the TOML config, say, in order to set a reasonable default at runtime. But it is important to specify that such a value _can_ be set. This is typically done by commenting out the key, and that seems ugly.
Put another way, it's the difference between hash[key].nil? and hash.key?(key) in Ruby, or between hash[key] == null and hash[key] === undefined in JavaScript. I think the distinction matters because it aids downstream validation and use of the data provided by a TOML document.
Disclosure: my own take on this whole situation is levels, which defines a way to merge multiple inputs into a final configuration. When adding TOML support in rcarver/levels#3 I realized that we have a fundamental disagreement here. In all other ways, TOML is the ideal format for levels configuration.
As far as the syntax, I don't have a strong opinion. I think I'm leaning toward a lack of value because it doesn't introduce a new keyword, and it resembles what you'd do in bash.
this_is_null =
Note that this necessarily adds a new NULL type to the spec. (A type containing precisely one value, otherwise known as the unit type.) A new type by itself isn't a big deal, but it bullies its way into every other type in TOML. Namely, an integer is no longer just an integer. It's an integer or NULL.
I think the added type complicates things. It means that every valid TOML parser has to differentiate between non-existence and NULL. This complicates types in static languages.
@rcarver - Could you maybe elaborate on why it is important for a TOML file to have knowledge of the set of definable keys? (As opposed to this information being in the application, or perhaps defined in a TOML array somewhere.)
@BurntSushi the or case and static languages are good points. I'm still pondering the implications myself, thanks.
In practice, I find that _something_ needs to define the set of possible keys. Again, in practice, they tend to accumulate over time and it's difficult to track. I'm thinking about both traditional applications and also provisioning tools (Chef) that use lots and lots of configuration variables. I like that the config file can act as the one place that defines the possible keys. I like that the app can enforce that the key is defined in the config file. If it's defined as NULL, the app can provide a default value if appropriate.
To put this all into perspective, I think we should look at the use of TOML data. TOML parses to a hash, which I understand to return NULL when an undefined key is read (coming from Ruby). Here are some examples to consider how an application might want to treat various cases.
When NULL is not allowed
[user]
username = "rcarver"
# name = "example name"
Obviously, this works:
config["user"].key?("username") # => true
config["user"]["username"] # => "rcarver"
Generally, reading an undefined key returns NULL.
config["user"].key?("name") # => false
config["user"]["name"] # => nil
Alternatively, an application could choose to raise an error:
config["user"].key?("name") # => false
config["user"]["name"] # raises exception
When NULL is allowed
[user]
username = "rcarver"
name = # "example name"
We can safely read the key, and still decide between the options above for both undefined and null keys.
config["user"].key?("name") # => true
config["user"]["name"] # => nil
So, if we agree that a hash returns NULL for an undefined key, an application already has to deal with "value _or_ NULL" case. Adding NULL support to TOML lets an application differentiate between "no value" and "undefined" if it chooses to do so.
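To make that distinction concrete, here is a minimal sketch. It assumes a hypothetical parser that maps an empty `name =` to nil (TOML has no such syntax today), and the helper `read_setting` is purely illustrative rather than part of any library.

```ruby
# Hypothetical result of parsing the "NULL is allowed" example above,
# assuming the parser maps an empty value to nil.
config = { "user" => { "username" => "rcarver", "name" => nil } }

# Illustrative helper: a declared-but-empty key falls back to a default,
# while a key that was never declared is treated as an error.
def read_setting(table, key, default = nil)
  raise KeyError, "undeclared key: #{key}" unless table.key?(key)

  value = table[key]
  value.nil? ? default : value
end

read_setting(config["user"], "username")          # declared with a value
read_setting(config["user"], "name", "anonymous") # declared, no value: default is used
# read_setting(config["user"], "email")           # undeclared: raises KeyError
```

Whether an undeclared key should raise or silently default is exactly the choice the application gets to make once the two cases are distinguishable.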
All that said, I do agree that it complicates TOML. I'll think on this some more. Happy to hear more perspectives here.
@rcarver
I'm not sure how NULL gives an application the ability to enforce that a key is defined. Doesn't that ability exist anyway? If the key isn't defined, then a default value can be given.
I think I just have a fundamentally different opinion about where the Truth of which keys are available should be known. I don't believe it belongs in a configuration file (controlled by users). I'll leave this point to be debated by others.
With that said, I still want to make the typing implications of NULL values clear for anyone else that wants to weigh in.
So, if we agree that a hash returns NULL for an undefined key, an application already has to deal with "value or NULL" case.
Almost all implementations of a hash table provide a way to distinguish between keys that are defined and keys that map to a NULL value. (The lone exception that I know of is Lua.) Namely, the possibility of non-existence is handled by the type of the hash rather than the values stored in the hash. In this way, non-existence does not creep into the type of any value, as it is handled implicitly in the type of a hash.
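In Ruby, for instance, the distinction can be sketched like this. The helper name `lookup_state` is made up for illustration; `key?` and plain indexing are standard Hash methods.

```ruby
# Classify a lookup into the three cases a hash can already tell apart.
def lookup_state(hash, key)
  return :missing unless hash.key?(key) # the key does not exist at all

  hash[key].nil? ? :null : :value       # the key exists; is its value nil?
end

settings = { "username" => "rcarver", "name" => nil }

lookup_state(settings, "username") # => :value
lookup_state(settings, "name")     # => :null
lookup_state(settings, "email")    # => :missing
```

Note that plain `settings["email"]` and `settings["name"]` both return nil, so the difference only shows up when the hash itself is asked about the key.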
With NULL values, every parser has to distinguish between non-existence and NULL for _every_ value.
In dynamic languages, this isn't an unreasonable burden. Indeed, the distinction is even difficult to notice, mostly because dynamic languages allow _any_ type to contain NULL values (they've allowed it to be a big bully). In static languages, not all types can have NULL values.
Most static languages have facilities to handle such things, but it becomes a burden when they must be anticipated for all types.
@BurntSushi I completely agree with the typing implications of NULL. In fact, most of the time I would take your position. Two things continue to have me question that in this context:
NULL. I hope this thread at least shows what that means. I'm glad to have had this discussion. At this point I could go either way, whatever @mojombo thinks aligns with the overall goals of TOML.
My initial intuition tells me that NULL should be avoided for the general historical reasons most of us are familiar with. I'd be curious if there was evidence of widespread usage of a NULL value in existing application/database/system config files (not in the format specs themselves, but simply in-the-wild config file content), but to my knowledge it's pretty rare.
Null has some value as representative of the idea of an unknown, especially in RDBMS. However, I've rarely ever found it useful in actual application code, as most usages of it are better served by patterns such as Null Object.
Yeah I don't really see a strong enough argument to justify the complexity and burden of NULL. I remain in favor of keeping it out.
Definitely keep it out. As @BurntSushi correctly points out, if NULL is in the file format, you have to special-case it while using any other type. In the real world, this is already the case because a key might not be set at all. So while I think having NULL as a type is bad, the "explicitly unset a key" syntax looks useful:
[integration]
api_key=
You can just do this:
[toml]
null_integer = 0
null_string = ""
This is much easier to parse than null values.
I think an application should be in charge of knowing what the valid keys are and making sure sane defaults are set. It's too risky to leave that to a user editable config file. If you want to document all the available keys, but leave them "null" until they're set, then I think commenting those lines out is the best solution. Thanks for all the thoughts on this everyone!
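A rough sketch of that approach, with the application owning both the list of valid keys and their defaults. The key names, default values, and the `load_settings` helper are hypothetical, and the user config is shown as an already-parsed hash.

```ruby
# Hypothetical application-side schema: the app, not the config file,
# lists the valid keys and their defaults.
DEFAULTS = { "timeout" => 1000, "retries" => 3 }.freeze

# Merge user-supplied values over the defaults, rejecting unknown keys.
def load_settings(user_config)
  unknown = user_config.keys - DEFAULTS.keys
  raise ArgumentError, "unknown keys: #{unknown.join(', ')}" unless unknown.empty?

  DEFAULTS.merge(user_config)
end

load_settings({ "timeout" => 5 }) # timeout is 5, retries keeps its default of 3
# load_settings({ "bogus" => 1 }) # raises ArgumentError
```

With this shape, a key the user leaves out simply takes its default, so the config file never needs a NULL to say "not set".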
Here's a trick that wasn't immediately obvious to me for those of you who need some NULL-like value:
Use false
For example:
a.toml:
timeout = 1000
b.toml:
timeout = false
Now why not just leave out timeout completely if you want to disable the timeout? One reason is that you may prefer to be explicit. Another reason is that you might have some override system. Example:
# Global settings
timeout = 1000
# Override settings for admin users
[admin]
timeout = false
Of course, this trick won't work for boolean fields, but arguably you should never make these nullable anyway since it is too confusing ("What's the difference between false and null?")
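Here is one way an application might consume that override, as a sketch only. The hash stands in for the parsed TOML above, and `effective_timeout` is a made-up helper.

```ruby
# Hypothetical parsed form of the override example above.
config = { "timeout" => 1000, "admin" => { "timeout" => false } }

# Resolve the timeout for a section, falling back to the global value only
# when the section does not mention the key at all.
def effective_timeout(config, section)
  overrides = config.fetch(section, {})
  value = overrides.key?("timeout") ? overrides["timeout"] : config["timeout"]
  value == false ? nil : value # false means "timeout disabled"
end

effective_timeout(config, "admin") # => nil (disabled by the override)
effective_timeout(config, "guest") # => 1000 (no override, global applies)
```

The lookup uses `key?` rather than `overrides["timeout"] || config["timeout"]`, because `||` would treat the explicit false as absent and fall through to the global value.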