Now that we're firming up our notion of what nothing
and missing
actually mean, we should probably semantically enforce them a little bit more. Example:
Sys.which(cmd)
will return nothing
if the given cmd
cannot be found on the PATH
. This makes it easy to define a hierarchy of commands: coalesce(Sys.which("curl"), Sys.which("wget"))
, etc... However, if the user then interpolates the result into a Cmd
, it makes sense that we should error out.
Jeff suggested disallowing all kinds of print()
or string()
conversion of nothing
, leaving only repr()
. I tend to agree, despite that this has the potential to be a quite breaking change.
I would also like to suggest that missing
get a similar treatment, as it is a subtly different but largely identical concept.
I see the idea, but I can't help feeling that's not the right place to handle this. Forcing users to handle explicitly the possibility that a value is nothing
is one of the two effects/goals of Some
(the other being to distinguish nothing
from Some(nothing)
). Yet we have generally taken the stance that we don't wrap return values in Some
, and instead expect users to check manually for nothing
when appropriate. So I'm not sure why we should throw an error for some functions and not for others.
I guess the motivation in the present case is that Sys.which
used to throw an error, so we are not used to it returning nothing
, and we want to protect users from that. But isn't that just due to history? Also note that even if we don't throw an error on interpolation, the command will most likely fail trying to run nothing
anyway, so it doesn't make a big difference.
What is the actual utility of allowing nothing
to interpolate as a string into things? Put another way, where is this functionality being used that is not currently a bug?
I agree interpolation doesn't sound very useful nor legitimate. But string
is defined as "Create a string from any values using the print
function", so wouldn't it be weird that nothing
would be the only type not to support it? That's something I've wondered for the case of missing
, where we could also imagine returning missing
rather than "missing"
.
Argument for the specialness of nothing
: there are a lot of situations now where you can do some operation and expect some kind of value like an integer or a string and the _only_ other value you may get is nothing
. Therefore we should be extra careful with nothing
as a return value since it's special in the sense of potentially appearing when you may not have been expecting it.
Another argument is that printing of nothing
is likely to occur upon writing to a CSV:
julia> b = IOBuffer(); writedlm(b, [1 2 nothing 4], ',')
print(String(take!(b)))
1,2,nothing,4
Now perhaps this could be argued that it's the responsibility of the CSV writer to handle this case, but it is somewhat compelling to me. It's also interesting that the REPL doesn't even attempt to print anything when the result is nothing
.
This is slightly off-topic, but writedlm
also writes missing as "missing", which I found somewhat surprising earlier today when writing a CSV. Not sure what I expected, but the worrisome case is when the string "missing" is written into a String column that may also contain "missing" as a valid string value.
Maybe CSV writers should print missing
as an empty field instead?
I think it makes sense for missing
to in general be a non-fatal absence with regards to construction of larger objects, whereas nothing
should be the fatal absence. In that sense, I think the following should be true:
@test_throws ArgumentError "$(nothing)"
@test "$(missing)" == ""
@test join([1,2,missing,4], "") == "1,2,,4"
@test "$(missing)" == ""
@test join([1,2,missing,4], "") == "1,2,,4"
This would imply that print(missing)
wouldn't print anything, as opposed to show
, which needs to print missing
. That would mean that the empty string is the "canonical (un-decorated) text representation" of missing
, while "missing"
is its "informative text representation". Would that be OK? I'm never sure where print
and show
are used down the line.
Yes, along with repr, which should also print “missing”.
Seems reasonable to me and worth trying. I like the "fatal absence" versus "non-fatal absence" distinction.
Consider a new user trying out something like
julia> f(x) = x>0?nothing:-x
f (generic function with 1 method)
julia> a = f(1);
julia> print("I ran f and got $a")
I ran f and got nothing
You really want them to get an ArgumentError
here? I imagine that would be very frustrating.
That case can have a very helpful, specific error message, which makes it not so frustrating since it immediately tells you how to fix the problem: do "I ran f and got $(repr(a))"
instead, which is good advice in general. The case of getting an unexpected nothing
and splicing it into a command or whatever and not being able to figure out what went wrong, on the other hand, seems like it would be truly frustrating and not just for novices.
Quite right --- getting an error (hopefully with a useful message) while trying things out is not a problem at all compared to silently getting nothing
in your output stream, having it cause a problem much later, and then having to track down where it came from.
I guess with a good error message it wouldn't be so bad.
See https://github.com/JuliaLang/julia/issues/27729 which proposes pwd() == nothing
when in a deleted directory but as an alternative returning some kind of NotFound
object which could also potentially be returned by Sys.which
to give a better error message upon usage. However that object would not be amenable to the coalesce
/ something
code which is annoying.
FWIW, I'm not opposed to print(nothing)
throwing an error and print(missing)
printing nothing. The latter definitely makes sense for CSV, even if it's generally more complex than that (e.g. whitespace-delimited files are usually parsed by skipping contiguous spaces, so that a special text value like NA
is needed to represent a missing value). Anyway I'm not completely clear on where print
is/should be used rather than show
.
Triage is in favor, noting that if we change our minds, we can but if we don't do this now, we can't change it in the 1.x timeline.
Any resolution wrt. #27650? Can logging of nothing
still result in no message, or should it error as well? Or could we use missing
for that, as @nalimilan proposed above?
I don't think logging should use missing
for that. Having a special case for nothing
there would probably be fine.
Most helpful comment
I think it makes sense for
missing
to in general be a non-fatal absence with regards to construction of larger objects, whereasnothing
should be the fatal absence. In that sense, I think the following should be true: