Julia: Interpolation of `nothing` should be an error

Created on 31 May 2018 · 20 Comments · Source: JuliaLang/julia

Now that we're firming up our notion of what nothing and missing actually mean, we should probably semantically enforce them a little bit more. Example:

Sys.which(cmd) will return nothing if the given cmd cannot be found on the PATH. This makes it easy to define a hierarchy of commands: coalesce(Sys.which("curl"), Sys.which("wget")), etc. However, if the user then interpolates the result into a Cmd, it makes sense that we should error out.
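A minimal sketch of the fallback pattern with an explicit check before anything is interpolated into a Cmd. The helper name find_downloader and the candidate list are hypothetical and only for illustration; the only Base call assumed is Sys.which, which returns a path or nothing. The command is built and printed but never run.

```julia
# Hypothetical helper, not part of Base: return the first candidate found on PATH.
function find_downloader(candidates = ("curl", "wget"))
    for name in candidates
        path = Sys.which(name)
        path === nothing || return path
    end
    return nothing
end

downloader = find_downloader()
if downloader === nothing
    # Handle the absence up front instead of letting it reach a Cmd.
    println("no downloader found on PATH")
else
    # Only construct the command once we know we hold a real path.
    println("would run: ", `$downloader --version`)
end
```

Checking for nothing at the call site keeps the failure next to its cause, which is the same goal the proposed interpolation error is after.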

Jeff suggested disallowing all kinds of print() or string() conversion of nothing, leaving only repr(). I tend to agree, even though this could be quite a breaking change.

I would also like to suggest that missing get a similar treatment, as it is a subtly different but largely identical concept.

Labels: breaking, missing data

staticfloat

👍3

Most helpful comment

I think it makes sense for missing to in general be a non-fatal absence with regards to construction of larger objects, whereas nothing should be the fatal absence. In that sense, I think the following should be true:

@test_throws ArgumentError "$(nothing)"
@test "$(missing)" == ""
@test join([1,2,missing,4], ",") == "1,2,,4"
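To make the "fatal" versus "non-fatal" distinction concrete, here is a rough sketch that emulates the proposed rules with a standalone helper. The names render and render_join are hypothetical, and nothing here changes how Base prints or interpolates values.

```julia
# Hypothetical helper that mimics the proposed rules; Base is left untouched.
render(x) = string(x)
render(::Missing) = ""   # non-fatal absence: contributes no text to the output
render(::Nothing) = throw(ArgumentError("`nothing` cannot be rendered as text; use repr"))

# Join the rendered pieces, so a missing element becomes an empty field.
render_join(xs, delim) = join((render(x) for x in xs), delim)

@assert render(missing) == ""
@assert render_join([1, 2, missing, 4], ",") == "1,2,,4"

# The fatal case: rendering nothing raises instead of producing text.
threw = try
    render(nothing)
    false
catch err
    err isa ArgumentError
end
@assert threw
```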

staticfloat on 15 Jun 2018

👍2

All 20 comments

I see the idea, but I can't help feeling that's not the right place to handle this. Forcing users to handle explicitly the possibility that a value is nothing is one of the two effects/goals of Some (the other being to distinguish nothing from Some(nothing)). Yet we have generally taken the stance that we don't wrap return values in Some, and instead expect users to check manually for nothing when appropriate. So I'm not sure why we should throw an error for some functions and not for others.

I guess the motivation in the present case is that Sys.which used to throw an error, so we are not used to it returning nothing, and we want to protect users from that. But isn't that just due to history? Also note that even if we don't throw an error on interpolation, the command will most likely fail trying to run nothing anyway, so it doesn't make a big difference.

nalimilan on 1 Jun 2018

What is the actual utility of allowing nothing to interpolate as a string into things? Put another way, where is this functionality being used that is not currently a bug?

StefanKarpinski on 1 Jun 2018

👍1

I agree interpolation sounds neither very useful nor legitimate. But string is defined as "Create a string from any values using the print function", so wouldn't it be weird if nothing were the only type not to support it? That's something I've wondered about for missing, where we could also imagine returning missing rather than "missing".

nalimilan on 1 Jun 2018

Argument for the specialness of nothing: there are a lot of situations now where you can do some operation and expect some kind of value like an integer or a string and the _only_ other value you may get is nothing. Therefore we should be extra careful with nothing as a return value since it's special in the sense of potentially appearing when you may not have been expecting it.

StefanKarpinski on 14 Jun 2018

Another argument is that printing of nothing is likely to occur upon writing to a CSV:

julia> using DelimitedFiles  # writedlm lives in this stdlib since Julia 0.7

julia> b = IOBuffer(); writedlm(b, [1 2 nothing 4], ',')
       print(String(take!(b)))
1,2,nothing,4

Now perhaps it could be argued that it's the responsibility of the CSV writer to handle this case, but it is somewhat compelling to me. It's also interesting that the REPL doesn't even attempt to print anything when the result is nothing.

mbauman on 14 Jun 2018

👍1

This is slightly off-topic, but writedlm also writes missing as "missing", which I found somewhat surprising earlier today when writing a CSV. Not sure what I expected, but the worrisome case is when the string "missing" is written into a String column that may also contain "missing" as a valid string value.

yurivish on 14 Jun 2018

Maybe CSV writers should print missing as an empty field instead?

JeffBezanson on 14 Jun 2018

👍2

@test_throws ArgumentError "$(nothing)"
@test "$(missing)" == ""
@test join([1,2,missing,4], ",") == "1,2,,4"

staticfloat on 15 Jun 2018

👍2

@test "$(missing)" == ""
@test join([1,2,missing,4], ",") == "1,2,,4"

This would imply that print(missing) wouldn't print anything, as opposed to show, which needs to print missing. That would mean that the empty string is the "canonical (un-decorated) text representation" of missing, while "missing" is its "informative text representation". Would that be OK? I'm never sure where print and show are used down the line.
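For reference, a small sketch of how the two representations already differ for an ordinary string, using only print, show, repr, and sprint from Base. It says nothing about what print(missing) should do; it only illustrates the "un-decorated" versus "informative" split the question refers to.

```julia
s = "hi"

# print gives the undecorated text; show and repr give the informative form.
plain    = sprint(print, s)   # just the characters, no quotes
detailed = sprint(show, s)    # includes the surrounding double quotes
@assert plain == "hi"
@assert detailed == "\"hi\""
@assert repr(s) == detailed

# repr is the form the thread agrees should keep working for both values.
@assert repr(nothing) == "nothing"
@assert repr(missing) == "missing"
```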

nalimilan on 15 Jun 2018

Yes, along with repr, which should also print "missing".

staticfloat on 15 Jun 2018

Seems reasonable to me and worth trying. I like the "fatal absence" versus "non-fatal absence" distinction.

StefanKarpinski on 15 Jun 2018

Consider a new user trying out something like

julia> f(x) = x > 0 ? nothing : -x
f (generic function with 1 method)

julia> a = f(1);

julia> print("I ran f and got $a")
I ran f and got nothing

You really want them to get an ArgumentError here? I imagine that would be very frustrating.

ggggggggg on 15 Jun 2018

That case can have a very helpful, specific error message, which makes it not so frustrating since it immediately tells you how to fix the problem: do "I ran f and got $(repr(a))" instead, which is good advice in general. The case of getting an unexpected nothing and splicing it into a command or whatever and not being able to figure out what went wrong, on the other hand, seems like it would be truly frustrating and not just for novices.
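As a sketch, the suggested fix applied to the example above looks like this. The variable names msg and described are just for illustration, and the second form (branching on nothing explicitly) is an alternative added here, not something proposed in the thread.

```julia
f(x) = x > 0 ? nothing : -x

a = f(1)

# repr always yields the informative representation, so this is unambiguous.
msg = "I ran f and got $(repr(a))"

# Alternatively, branch on the absent case explicitly (hypothetical wording).
described = a === nothing ? "no result" : string(a)
```

Either form makes it obvious at the call site that a may be nothing, rather than relying on whatever print happens to do with it.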

StefanKarpinski on 15 Jun 2018

👍1

Quite right --- getting an error (hopefully with a useful message) while trying things out is not a problem at all compared to silently getting nothing in your output stream, having it cause a problem much later, and then having to track down where it came from.

JeffBezanson on 15 Jun 2018

I guess with a good error message it wouldn't be so bad.

ggggggggg on 15 Jun 2018

See https://github.com/JuliaLang/julia/issues/27729, which proposes that pwd() return nothing when in a deleted directory or, as an alternative, return some kind of NotFound object. Sys.which could potentially return the same object to give a better error message upon usage. However, that object would not be amenable to the coalesce / something code, which is annoying.

StefanKarpinski on 22 Jun 2018

FWIW, I'm not opposed to print(nothing) throwing an error and print(missing) printing nothing. The latter definitely makes sense for CSV, even if it's generally more complex than that (e.g. whitespace-delimited files are usually parsed by skipping contiguous spaces, so that a special text value like NA is needed to represent a missing value). Anyway I'm not completely clear on where print is/should be used rather than show.

nalimilan on 27 Jun 2018

👍1

Triage is in favor, noting that if we change our minds later, we can still revert, but if we don't do this now, we can't change it in the 1.x timeline.

StefanKarpinski on 28 Jun 2018

Any resolution wrt. #27650? Can logging of nothing still result in no message, or should it error as well? Or could we use missing for that, as @nalimilan proposed above?