Unable to distinguish the seq element types from echo outputs.
let seq1 = @[1, 2, 3]
echo "Integer elements: ", seq1
let seq2 = @['1', '2', '3']
echo "Char elements: ", seq2
let seq3 = @["1", "2", "3"]
echo "String elements: ", seq3
Integer elements: @[1, 2, 3]
Char elements: @[1, 2, 3]
String elements: @[1, 2, 3]
Just looking at the printed seq elements, it is not clear what the element types are.
Equivalent code in Python (sort of, because Python does not have char vs
string):
list1 = [1, 2, 3]
print('Integer elements: {}'.format(list1))
list2 = ['1', '2', '3']
print('String elements: {}'.format(list2))
Integer elements: [1, 2, 3]
String elements: ['1', '2', '3']
I am aware of repr, but it is too heavy/noisy for regular debugging:
let seq1 = @[1, 2, 3]
echo "Integer elements: ", seq1
echo "Integer elements (Repr): ", seq1.repr
let seq2 = @['1', '2', '3']
echo "Char elements: ", seq2
echo "Char elements (Repr): ", seq2.repr
let seq3 = @["1", "2", "3"]
echo "String elements: ", seq3
echo "String elements (Repr): ", seq3.repr
Integer elements: @[1, 2, 3]
Integer elements (Repr): 0x7f1670b5f048[1, 2, 3]
Char elements: @[1, 2, 3]
Char elements (Repr): 0x7f1670b63048['1', '2', '3']
String elements: @[1, 2, 3]
String elements (Repr): 0x7f1670b5f080[0x7f1670b631e8"1", 0x7f1670b63210"2", 0x7f1670b63238"3"]
The reason for this is that $ on strings does not add quote signs; and the reason for that is that $ is, intentionally, the identity. This is because $ is used in other places than printing, and in those cases you want $ to be a noop
I thought that the role of $ was to be like __str__ in Python, and the role of repr to be like __repr__ in Python.
But based on:
This is because $ is used in other places than printing, and in those cases you want $ to be a noop
maybe we do need a true __str__ equivalent in Nim to help echo provide unambiguous but readable output.
Or maybe the $ output should be wrapped with '' or "" as necessary only inside echo?
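For reference, a minimal Python sketch of the distinction being discussed: str on a bare string is the identity, while str on a list formats each element with repr, which is why nested strings keep their quotes. The variable names are only illustrative.

```python
# str() on a string is the identity; repr() adds quotes.
bare = "1"
assert str(bare) == "1"
assert repr(bare) == "'1'"

# list.__str__ formats each element with repr(), so element types stay visible.
ints = [1, 2, 3]
strs = ["1", "2", "3"]
print(str(ints))  # [1, 2, 3]
print(str(strs))  # ['1', '2', '3']
```

So in Python the quoting comes from the container's string conversion, not from str on the string itself.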
Yeah, I agree that this is an issue. IMO the following should hold true: $(@['1', '2', '3']) == "@['1', '2', '3']" (and the same for seq[string]).
It might be a good idea to look into the git history and dig out the PRs that introduced this behaviour to find out the reasoning behind it. I recall that there is a good reason for it being like this, but can't remember the reason :)
@dom96 like Andrea said: Seeing $ as generic "to string" operator, it is important that $"foo" remains "foo" and does not become "\"foo\"". It already is a string, so the generic conversion should be an identity operation.
I see echo as a tool to produce the most concise / human friendly representation. If I want a debug tool which makes the variable types more explicit, I would use something like this:
import macros
import sequtils
import strutils
import future
proc toDebugRepr[T](x: T): string =
  when T is seq:
    result = "@[" & x.map(el => toDebugRepr(el)).join(", ") & "]"
  elif T is array:
    result = "[" & x.map(el => toDebugRepr(el)).join(", ") & "]"
  elif T is string:
    result = "\"" & x & "\""
  elif T is char:
    result = "'" & x & "'"
  else:
    result = $x

macro debug*(args: varargs[typed]): untyped =
  result = newCall(bindSym("echo"))
  for arg in args:
    result.add(newCall(bindSym("toDebugRepr"), arg))

debug 1, 2, 3
debug "hello world"
debug "a", "b"
debug(@[1, 2, 3])
debug(@["1", "2", "3"])
debug(@['1', '2', '3'])
debug([1, 2, 3])
debug(["1", "2", "3"])
debug(['1', '2', '3'])
Output:
123
"hello world"
"a""b"
@[1, 2, 3]
@["1", "2", "3"]
@['1', '2', '3']
[1, 2, 3]
["1", "2", "3"]
['1', '2', '3']
Note that the Python way of handling this is not fully consistent, because it has a special treatment of top level strings vs nested strings (print("hello world") vs print(["hello world"])). If you want to replicate that you could add a level argument to toDebugRepr and omit the quoting on the top level.
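The level idea can be sketched roughly as follows. This is written in Python for brevity and is only an illustration under assumptions: to_debug_repr and its level parameter are hypothetical names, not part of the Nim code above, and the sketch does not escape quotes embedded in strings.

```python
def to_debug_repr(x, level=0):
    """Hypothetical helper: quote strings only when they are nested."""
    if isinstance(x, str):
        # Top-level (bare) strings stay unquoted; nested ones get quotes.
        return x if level == 0 else '"' + x + '"'
    if isinstance(x, list):
        # Elements are one level deeper than their container.
        inner = ", ".join(to_debug_repr(el, level + 1) for el in x)
        return "[" + inner + "]"
    # Everything else falls back to the plain string conversion.
    return str(x)


print(to_debug_repr("hello world"))    # hello world
print(to_debug_repr(["hello world"]))  # ["hello world"]
print(to_debug_repr(["1", "2,3"]))     # ["1", "2,3"]
```

With this shape, bare strings print as they do today, and quotes appear only inside containers where they remove ambiguity.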
@bluenote10 Wow! Thank you for coming up with this. My comments:
When printing numbers (int, float, etc.), it would be useful to always add a space after each one when they are passed as separate arguments, as above.
debug 1, 2, 3
debug 123
# Output
# 123
# 123
It would be less noisy to not double-quote bare strings, because almost all echo-based uses would be printing bare strings, either directly or via format and such.
debug "hello world"
# Output
# "hello world"
These are perfect! Thanks.
# @[1, 2, 3]
# @["1", "2", "3"]
# @['1', '2', '3']
# [1, 2, 3]
# ["1", "2", "3"]
# ['1', '2', '3']
Note that the Python way of handling this is not fully consistent, because it has a special treatment of top level strings vs nested strings (print("hello world") vs print(["hello world"])).
That inconsistency makes sense, because you wouldn't want to see double-quoted strings for all debug messages. Double-quoting should be introduced only where needed to remove ambiguity.
If you want to replicate that you could add a level argument to toDebugRepr and omit the quoting on the top level.
I was curious whether, instead of having a new proc like debug, echo could be updated so that the new echo is a middle ground between the current echo and the debug you proposed.
The actual example that inspired me to open this issue was this:
let temp_seq: seq[string] = "1,2,3".split(',', maxsplit=1) # actual example
echo temp_seq
debug temp_seq
# Output
# @[1, 2,3]
# @["1", "2,3"]
The output above already shows how your debug improves the clarity of the printed result.
In summary, we should not introduce ambiguity while trying to make the outputs more human readable using echo. It would be really awesome if consensus is reached to single/double quote only nested chars/strings and not bare chars/strings, directly using echo.
From what you said earlier in your post:
I see echo as a tool to produce the most concise / human friendly representation.
With the above proposal, echo will still produce a concise and human-friendly representation, but not at the cost of misrepresenting the data. Bare ints and strings will still print as they do now. Only seqs, lists, etc. will use quotes where needed to prevent outputs like # @[1, 2,3], where it is not immediately clear whether it is a seq of ints or a seq of one, two, three, or four string elements.
I would rather leave echo as it is (it has clear, simple semantics based on $, and moreover changing it now would break a lot of things) and add the debug macro to the stdlib.
I see echo as a tool to produce the most concise / human friendly representation.
The current echo fails at that task. Displaying a sequence filled with strings as @[1,2,3] (as if it's filled with ints) isn't friendly at all. It's incredibly confusing.
I will be changing this, unless somebody has a very good reason against it.
Marking as high priority because it's a breaking change.
It would definitely be nice to change that, but I'm not really sure how.
I assume that we must not change the idempotent behavior of $ on strings, i.e., it must stay a no-op, otherwise this would break a lot of code.
I also don't think special-casing strings/chars in the $ of openarrays is the right solution, because the same problem exists for the $ implementations of sets, tables, lists, deques, etc. and a special handling for openarrays only would even result in inconsistent behavior. For instance, currently we would also get:
echo {1: "1", 2: "2"}.toTable
echo {"1": 1, "2": 2}.toTable
# will both print:
# {1: 1, 2: 2}
So probably we would need a Python-like __repr__ operator, maybe called $$. The operator could have a default generic implementation falling back to $, and overloads for strings/chars/... (potentially other stringlike types) which return quoted representations. Then all collections would have to use this $$ on their elements.
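As a rough illustration of that design, here is a Python analogy. It is only a sketch under assumptions: debug_str stands in for the proposed $$, and seq_to_str and table_to_str stand in for the collections' $ implementations. All three names are hypothetical.

```python
from functools import singledispatch


@singledispatch
def debug_str(x):
    # Default generic implementation: fall back to the plain conversion.
    return str(x)


@debug_str.register(str)
def _(x):
    # Overload for string-like types: return a quoted representation.
    return '"' + x + '"'


def seq_to_str(items):
    # The collection's conversion applies debug_str to every element.
    return "@[" + ", ".join(debug_str(el) for el in items) + "]"


def table_to_str(table):
    # Same rule for keys and values, so all collections behave consistently.
    pairs = ", ".join(debug_str(k) + ": " + debug_str(v) for k, v in table.items())
    return "{" + pairs + "}"


print(seq_to_str(["1", "2", "3"]))     # @["1", "2", "3"]
print(table_to_str({1: "1", 2: "2"}))  # {1: "1", 2: "2"}
print(table_to_str({"1": 1, "2": 2}))  # {"1": 1, "2": 2}
```

The plain conversion of a bare string is untouched here; only the element formatting inside collections changes.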
Another alternative: Overloading $ for strings/chars only temporarily in the context of the echo call, so that the non-idempotent $ on strings only applies in the echo context. Probably a bit hacky.
What would you prefer?
Another alternative: Overloading $ for strings/chars only temporarily in the context of the echo call, so that the non-idempotent $ on strings only applies in the echo context. Probably a bit hacky.
This seems like the best option, although I'm not sure myself how it would work. We should look to Python for inspiration I think.
One IMO damning reason to make this change is the following:
var x: seq[string] = @[]
echo(x) # @[]
echo(@[""]) # @[]
And I'm not just manufacturing this. I just stumbled on this while fixing #4377.
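For comparison, Python keeps these two cases distinct, because its list conversion formats elements with repr. A minimal check (variable names are illustrative):

```python
empty = []
one_empty_string = [""]

print(empty)             # []
print(one_empty_string)  # ['']

# The two printed forms differ, so the empty list is not confused
# with a list holding a single empty string.
assert str(empty) != str(one_empty_string)
```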
It is a breaking change and needs to be justified. I would rather leave echo as it is. It can be used to write files via stdout, for example, where you don't want quotes you didn't ask for to appear. It is not always a debug tool to visualize what is going on.
Maybe a separate command instead.
@cooldome The upcoming version is supposed to be the "last" breaking release (cough) so we are improving as many things as reasonable. Note that when you don't use echo for debugging it's unlikely you are echoing full seqs or arrays or any kind of container. The breaking should be minimal, most of the time only test suites should be affected.
@cooldome Having different versions like echo + debug will be a clustermess. It will be difficult for users to understand which should be used when. It would be nice to learn from how Python print does the right thing with quotes.
The motto should be: echo outputs concise information without data misrepresentation.
Note that this actually hasn't much to do with echo, the question is rather how collection-to-string works.
It would be nice to learn from how Python print does the right thing with quotes.
I've explained this in my comment here
I just tried out https://github.com/nim-lang/Nim/pull/6825, and now the echo outputs are so much clearer! Thanks @bluenote10!
let seq1 = @[1, 2, 3]
echo "Integer elements: ", seq1
let seq2 = @['1', '2', '3']
echo "Char elements: ", seq2
let seq3 = @["1", "2", "3"]
echo "String elements: ", seq3
Output:
Integer elements: @[1, 2, 3]
Char elements: @['1', '2', '3']
String elements: @["1", "2", "3"]
@kaushalmodi can this be closed since #6825 was merged in devel?
Yes of course. Thank you!
Thanks!