Julia: Need better array printing - especially for complex arrays

Created on 2 Jun 2011 · 51 Comments · Source: JuliaLang/julia

Our array printing is currently quite embarrassing. It needs to be fixed.

display and printing help wanted

All 51 comments

Actually, while we are at it, perhaps we should think of what features we should have in printing. Do we want to support a simple markup (maybe just basic html tags)? The markup can be then interpreted differently in different environments such as the console, an IDE, in a browser etc. That way, we can avoid hardcoding newlines and spaces into the printed output, and let the presentation layer figure it out. For example, if you resize the window, things can be made to look nicer automatically etc.

That's a cool idea, but I'm not sure we should go there just yet. Maybe we can make a speculative feature issue for that one.

Agree that this is for a future date. This was the first realization I had when I tried repl-cloud.

-viral

On Jun 3, 2011, at 7:36 PM, StefanKarpinski wrote:

That's a cool idea, but I'm not sure we should go there just yet. Maybe we can make a speculative feature issue for that one.

Might be worthwhile to implement printf, and then implement pretty printing of arrays with printf. What's the way to do this? Implement printf in julia, or is it possible to just call the C printf?

Well, I guess printf can be done easily:

julia> ccall(:printf, Void, (Ptr{Uint8}, Int32, Int32), "%d, %d\n", 1, 2)
1, 2

This crashes though:

julia> ccall(:printf, Void, (Ptr{Uint8}, Float64, Int32), "%5.2lf, %d\n", 1/3, 2)
mach_port_mod_refs(reply_port) failed: (os/kern) invalid name
julia(49885,0x7fff70bafcc0) malloc: *** error for object 0x1010b7ba8: pointer being freed was not allocated

I'm guessing that it may have something to do with calling a varargs function. I'm not convinced this is the best way for us to do this, especially since printf is notoriously unsafe. That's acceptable in C because lots of things are unsafe, so printf is in equally dodgy company, but Julia code really ought to be much safer.

Works on linux. But yes, there's probably something strange about passing arguments of different types to a varargs function.

We should write a wrapper for printf that looks at the format string to check/convert arguments, then calls printf on each format specifier individually.
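
A rough sketch of that wrapper idea, written in present-day Julia rather than the 2011 syntax used above. It assumes the `Printf` standard library's `Printf.Format` and `Printf.format` (Julia 1.6 or later) instead of a direct `ccall`. The names `checked_printf` and `SPEC` are made up for illustration, and only a handful of conversions are recognized (no `%%`, no `*` widths).

```julia
using Printf

# Hypothetical pattern: flags, width, precision, then one conversion letter.
const SPEC = r"%[-+ #0]*\d*(?:\.\d+)?[dixfeEgGs]"

# Hypothetical wrapper: check each argument against its specifier, then
# format one specifier at a time instead of handing everything to C printf.
function checked_printf(io::IO, fmt::AbstractString, args...)
    specs = collect(eachmatch(SPEC, fmt))
    length(specs) == length(args) || throw(ArgumentError(
        "format has $(length(specs)) specifiers but $(length(args)) arguments were given"))
    pos = 1
    for (m, arg) in zip(specs, args)
        print(io, fmt[pos:prevind(fmt, m.offset)])   # literal text before the specifier
        conv = m.match[end]
        if conv in "dix"
            arg isa Integer || throw(ArgumentError("%$conv needs an Integer, got $(typeof(arg))"))
        elseif conv in "feEgG"
            arg isa Real || throw(ArgumentError("%$conv needs a Real, got $(typeof(arg))"))
            arg = Float64(arg)                       # convert before formatting
        end
        print(io, Printf.format(Printf.Format(String(m.match)), arg))
        pos = m.offset + ncodeunits(m.match)
    end
    print(io, fmt[pos:end])                          # trailing literal text
end
```

With this sketch, `checked_printf(stdout, "%5.2f, %d\n", 1/3, 2)` formats the float and the integer separately, and passing a `Float64` where `%d` is expected raises an `ArgumentError` instead of crashing.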

I can take that on. Parsing strings is something I've done a lot of in Julia at this point.

This is shockingly hard to implement.

This is like 90% done as of d1af957f8fbce1747936a8a11cf09689c356fc19. The only thing that remains is to implement new printing for tensors with N ≥ 3. Any thoughts on how that should work?

This is so great. Bravo.
Can't N-d arrays display a series of slices like they used to? No additional work is needed. I see my code to do that has been deleted ;)

A couple comments:

1) I don't like that 1-d arrays look the same as 1xN arrays:

julia> rand(3)
3-element Float64 Array:
 0.62891254833711341 0.14606580619349141 0.84381987352173238

julia> rand(1,3)
1x3 Float64 Array:
 0.79781283426537208 0.20546768541258831 0.25826778395199779

This is especially confusing in light of how our [] syntax works.

2) Complex numbers seem to be tricky:

julia> complex(rand(4,4),rand(4,4))

4x4 Complex128 Array:
 0.68162889320235931 + 0.67685489562472401im  :  0.60328471800565153 + 0.54200719032276257im
 0.57500572887828594 + 0.15107298345567055im     0.75639571382615522 + 0.78558217991623347im
 0.70901455376957778 + 0.13065340624964672im     0.88806184730527993 + 0.6202979241604607im 
 0.96557556248153475 + 0.593232999318827im        0.4792539900349897 + 0.73795275897182777im

There's a misaligned decimal in there.

3) Obviously what we need next is to show fewer decimal places. I can only see 3 columns in a default terminal, which is very little.

OK, another one: using this format for cell arrays makes certain data structures really ugly:

julia> ({1,2,3},{2,3})
(3-element Any Array:
 1 2 3,2-element Any Array:
 2 3)

I think the former special case for 1-d arrays was justified.

This is so great. Bravo.

Thanks. I still can't believe how hard this was to get right. It seems so trivial.

Can't N-d arrays display a series of slices like they used to? No additional work is needed. I see my code to do that has been deleted ;)

Yeah, I can resurrect that code. I just deleted all the old code to put this in its place. Ain't version control great?

A couple comments:

  1. I don't like that 1-d arrays look the same as 1xN arrays:
julia> rand(3)
3-element Float64 Array:
0.62891254833711341 0.14606580619349141 0.84381987352173238

julia> rand(1,3)
1x3 Float64 Array:
0.79781283426537208 0.20546768541258831 0.25826778395199779

This is especially confusing in light of how our [] syntax works.

What do you propose as a better format? The matrix printing function is pretty configurable, so we should probably be able to do it. Maybe [1, 2, ..., 11, 12].

  2. Complex numbers seem to be tricky:

julia> complex(rand(4,4),rand(4,4))

4x4 Complex128 Array:
0.68162889320235931 + 0.67685489562472401im  :  0.60328471800565153 + 0.54200719032276257im
0.57500572887828594 + 0.15107298345567055im     0.75639571382615522 + 0.78558217991623347im
0.70901455376957778 + 0.13065340624964672im     0.88806184730527993 + 0.6202979241604607im 
0.96557556248153475 + 0.593232999318827im        0.4792539900349897 + 0.73795275897182777im

There's a misaligned decimal in there.

On my phone, I can't tell what's going on here, but I only arranged for complex numbers to align on the + sign. So there's no attempt to do anything better than that. Optimal complex number alignment is really nasty because it requires rewriting the output based on what all the other complex numbers in the column look like. The current framework doesn't support that, but I could maybe try to figure out how.
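
To make the "align on the sign" idea concrete, here is a minimal sketch in present-day Julia. It is not the code in show.jl: `split_complex` and `aligned_column` are hypothetical names, and the approach (format the real and imaginary parts separately, then pad each side to the widest entry in the column) is only one way this could be done.

```julia
# Hypothetical helper: text up to and including the separating sign,
# and the text after it.
function split_complex(z::Complex)
    sgn = signbit(imag(z)) ? "-" : "+"
    return (string(real(z), " ", sgn), string(" ", abs(imag(z)), "im"))
end

# Pad the left pieces on the left and the right pieces on the right so that
# every separating sign lands in the same column.
function aligned_column(zs::AbstractVector{<:Complex})
    parts = map(split_complex, zs)
    lw = maximum(p -> textwidth(p[1]), parts; init = 0)
    rw = maximum(p -> textwidth(p[2]), parts; init = 0)
    return [lpad(p[1], lw) * rpad(p[2], rw) for p in parts]
end

foreach(println, aligned_column([1 + 10im, -25 - 3im]))
```

This only lines up the separating sign. Lining up the decimal points within each half as well would need a second pass over the column, which is the hard part described above.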

  3. Obviously what we need next is to show fewer decimal places. I can only see 3 columns in a default terminal, which is very little.

Yup, probably. We could also save space by printing floats with no trailing zeros when unnecessary.
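
For readers on a current Julia, a small sketch of what shortened float output looks like today. It assumes the `:compact` `IOContext` property, which did not exist when this comment was written, so treat it as an illustration of the idea rather than the change being discussed.

```julia
x = 0.62891254833711341

full    = sprint(show, x)                              # all digits needed to round-trip
compact = sprint(show, x; context = :compact => true)  # shortened form for dense output

println(full)
println(compact)
```

The compact form is the kind of output that lets more than three columns fit in a default terminal.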

OK, another one: using this format for cell arrays makes certain data structures really ugly:

julia> ({1,2,3},{2,3})
(3-element Any Array:
1 2 3,2-element Any Array:
2 3)

I think the former special case for 1-d arrays was justified.

Yeah, this is pretty ugly. What was the former special case?

Actually the code for it is still there, show.j:262. It just calls
show_comma_array for 1d arrays.

On Thu, Oct 20, 2011 at 12:10 AM, Stefan Karpinski wrote:

OK, another one: using this format for cell arrays makes certain data structures really ugly:

julia> ({1,2,3},{2,3})
(3-element Any Array:
1 2 3,2-element Any Array:
2 3)

I think the former special case for 1-d arrays was justified.

Yeah, this is pretty ugly. What was the former special case?

Good enough for now?

I was going to take another crack at the issues you brought up and then close it. I think I can use the core print_matrix function to print vectors and tuples, etc., the way you prefer to have them printed.

Printing complex numbers like tuples is confusing, and it also doesn't seem to solve the decimal alignment problem.

Yeah, I was just testing it out. The a + bim format was just too verbose. I have a tentative plan for the alignment issue, it's just hard to do. Will get to it at some point.

One thing we could consider is having a complex constructor like C(r,i) or Z(r,i). Might be nice to have something shorter than complex.

I think that's too terse and takes away names users want.

-viral

On 08-Dec-2011, at 7:46 AM, [email protected] wrote:

One thing we could consider is having a complex constructor like C(r,i) or Z(r,i). Might be nice to have something shorter than complex.


Yeah, I don't like either of C(r,i) or Z(r,i). I'll work on the array printing business. One question: is the array version with pairs more acceptable than the vector version? In other words is

julia> [ complex(i,j) | i=1:10, j=1:10 ]
10x10 ComplexPair{Int64} Array:
  (1,1)   (1,2)   (1,3)   (1,4)   (1,5)   (1,6)   (1,7)   (1,8)   (1,9)   (1,10)
  (2,1)   (2,2)   (2,3)   (2,4)   (2,5)   (2,6)   (2,7)   (2,8)   (2,9)   (2,10)
  (3,1)   (3,2)   (3,3)   (3,4)   (3,5)   (3,6)   (3,7)   (3,8)   (3,9)   (3,10)
  (4,1)   (4,2)   (4,3)   (4,4)   (4,5)   (4,6)   (4,7)   (4,8)   (4,9)   (4,10)
  (5,1)   (5,2)   (5,3)   (5,4)   (5,5)   (5,6)   (5,7)   (5,8)   (5,9)   (5,10)
  (6,1)   (6,2)   (6,3)   (6,4)   (6,5)   (6,6)   (6,7)   (6,8)   (6,9)   (6,10)
  (7,1)   (7,2)   (7,3)   (7,4)   (7,5)   (7,6)   (7,7)   (7,8)   (7,9)   (7,10)
  (8,1)   (8,2)   (8,3)   (8,4)   (8,5)   (8,6)   (8,7)   (8,8)   (8,9)   (8,10)
  (9,1)   (9,2)   (9,3)   (9,4)   (9,5)   (9,6)   (9,7)   (9,8)   (9,9)   (9,10)
 (10,1)  (10,2)  (10,3)  (10,4)  (10,5)  (10,6)  (10,7)  (10,8)  (10,9)  (10,10)

acceptable even if

julia> [ complex(2i-1,2i) | i=1:10 ]
[(1,2), (3,4), (5,6), (7,8), (9,10), (11,12), (13,14), (15,16), (17,18), (19,20)]

is no good, or are they both confusing? I could also see the array version being with just commas even:

julia> [ complex(i,j) | i=1:10, j=1:10 ]
10x10 ComplexPair{Int64} Array:
  1,1   1,2   1,3   1,4   1,5   1,6   1,7   1,8   1,9   1,10
  2,1   2,2   2,3   2,4   2,5   2,6   2,7   2,8   2,9   2,10
  3,1   3,2   3,3   3,4   3,5   3,6   3,7   3,8   3,9   3,10
  4,1   4,2   4,3   4,4   4,5   4,6   4,7   4,8   4,9   4,10
  5,1   5,2   5,3   5,4   5,5   5,6   5,7   5,8   5,9   5,10
  6,1   6,2   6,3   6,4   6,5   6,6   6,7   6,8   6,9   6,10
  7,1   7,2   7,3   7,4   7,5   7,6   7,7   7,8   7,9   7,10
  8,1   8,2   8,3   8,4   8,5   8,6   8,7   8,8   8,9   8,10
  9,1   9,2   9,3   9,4   9,5   9,6   9,7   9,8   9,9   9,10
 10,1  10,2  10,3  10,4  10,5  10,6  10,7  10,8  10,9  10,10

It's very compact and the fact that the array type is always printed above makes it a lot less confusing.

We could also use something like '\u26A' as an alternate name for the im constant:

1+2ɪ

That's pretty compact, and the fact that the ɪ is small and visibly distinct helps make it more readable, IMO.

I find I like this much better without the parens, since it looks less like a tuple and takes less space too. Technically it still is tuple syntax, but it's not how tuples are printed so it's OK.

But I fear we will just have to bite the bullet on this and use 1+2im format or people won't be happy.

The 1+2ɪ syntax has the advantage of also being allowable as valid input.

Another small point: for arrays of arrays we might want to print the elements using summary, or it gets messy.

Also, we might want showall to be the same as the normal array printer, except have it pretend there are infinite rows and columns available. Right now it doesn't do alignment.

That's just a function call at this point. The terminal row and column numbers are just defaults.
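
In current Julia the same "rows and columns are just defaults" idea is exposed through `IOContext` properties. The sketch below assumes the `:limit` and `:displaysize` properties of today's Base; `render` is a hypothetical helper name, not something from this thread.

```julia
# Hypothetical helper: render `A` as the REPL would, with an explicit
# row/column budget instead of the real terminal size.
function render(A; limit::Bool, dsize::Tuple{Int,Int} = (24, 80))
    buf = IOBuffer()
    io = IOContext(buf, :limit => limit, :displaysize => dsize)
    show(io, MIME("text/plain"), A)
    return String(take!(buf))
end

A = reshape(1:10_000, 100, 100)
cropped    = render(A; limit = true, dsize = (12, 60))  # elided to fit the given budget
everything = render(A; limit = false)                   # no elision, still aligned
```

Here `everything` plays the role of showall: the same aligned printer, just without a size limit.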

@StefanKarpinski As I was browsing through the issues list, this one seemed to be mostly resolved, can you close it (or recreate it as a new issue with the still relevant bits)?

There are actually some additional improvements I've wanted to make for a long time that I was keeping this around as a reminder for. Printing arrays of complex numbers, for example, really kind of sucks. I need to figure out a better way to make that look pretty. Also need to handle printing arrays of large things better. Currently I just don't print them, which is obviously a lousy way to handle that situation.

@StefanKarpinski Any plans to improve the printing of complex arrays? Otherwise, we are doing quite ok on this front.

julia> rand(10)+im*rand(10)
10-element Complex128 Array:
 0.254648+0.0659326im
  0.431824+0.799698im
  0.789352+0.953089im
 0.236006+0.0104477im
  0.325619+0.941836im
  0.393213+0.218764im
 0.060422+0.0951866im
   0.32252+0.363335im
  0.355442+0.563473im
 0.777203+0.0223823im

Perhaps the array printing code should be separated from show.jl and put in something like array_show.jl.

Yes, yes. This has been on my todo list for a rather long time. I have "A Plan"...

I really would like to have a good solution to this, but it is too late to do anything for 0.2. Also, we probably need to have a more general solution for printing columns that could be used by stuff like DataFrames.

Cc: @johnmyleswhite

It also needs to interact nicely with the whole new display framework and IJulia, etc. Definitely 0.3.

Once 0.2 ships let's focus on this. Definitely the ugly way that DataFrames print is the worst failing of that package these days.

@tanmaykm and I were just looking at this, and this can be fixed by specializing alignment and print_matrix_row for Array{Complex}.

We certainly need space around the + or - between the real and imaginary parts.

julia> 10000*randn(10,10) + 100*randn(10,10)*im
10x10 Array{Complex{Float64},2}:
  -1091.0-162.117im  -9289.62-121.309im  …   2867.89+84.8817im
  9608.85-95.743im    10657.7+176.436im     -1658.19+58.3654im
  4465.67+19.3115im   -4730.4+157.073im     -2939.09+34.1366im
 -15306.4-26.1914im  -4901.03-36.9312im      5825.56-91.1303im
  2134.12+38.0862im   10243.1+1.99506im     -3062.62-47.4671im
  2433.68+42.712im    20316.4-72.1663im  …   3738.36+61.6447im
  5354.18-174.893im   9501.97-51.3725im     -2553.53-139.401im
  10316.1+37.0481im  -16223.1-58.0976im      2053.98-50.0158im
  11775.2-125.242im   348.031-11.5898im     -1068.35+100.333im
 -12211.2-2.52959im  -7692.37-111.429im      5696.75+63.1306im

@tanmaykm You had suggested a possible way to fix this without making a huge change. Do you think we can make it in time for 0.3?

Are there any ideas out there for better printing of high-dimensional arrays?

For example, I have a 20x20x20x20x10 array. When I forget to suppress the REPL display, I get 20*20*10=4000 "pages" of 20x20 matrices that are indexed like [:, :, i3, i4, i5].

I realize that it's probably not a great idea to have a lot of 5-d arrays floating around in my code, but that is the most natural way to order the data and it would be nice to limit the output to a reasonable level.
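
One possible shape for such a limit, as a sketch only: print the first few 2-d pages and summarize the rest. `show_slices` and its `maxslices` keyword are hypothetical names invented here, not an existing or proposed Base API.

```julia
# Hypothetical helper: show at most `maxslices` 2-d pages of an N-d array
# (N >= 3) and summarize the rest in one line.
function show_slices(io::IO, A::AbstractArray; maxslices::Int = 3)
    ndims(A) >= 3 || throw(ArgumentError("expected an array with at least 3 dimensions"))
    println(io, summary(A), ":")
    tail = CartesianIndices(size(A)[3:end])      # one index per 2-d page
    for (k, I) in enumerate(tail)
        if k > maxslices
            println(io, "… ", length(tail) - maxslices, " more slices omitted")
            break
        end
        println(io, "[:, :, ", join(Tuple(I), ", "), "] =")
        show(IOContext(io, :limit => true), MIME("text/plain"), A[:, :, I])
        println(io)
    end
end

show_slices(stdout, zeros(20, 20, 20, 20, 10))
```

For the 20x20x20x20x10 case this prints three pages and then a single line saying that the remaining 3997 were omitted.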

+1 for better printing of high-dimensional arrays

Note that when the floating-point numbers have exponents, the alignment becomes weird.
Without exponents, the complex numbers are aligned on the plus or minus sign:

julia> rand(Complex128, 5)
5-element Array{Complex{Float64},1}:
  0.414626+0.949547im
  0.448123+0.498958im
 0.0685263+0.76635im 
  0.131332+0.835893im
  0.135788+0.445638im

When some entries have exponents, the alignment uses the last sign in the string, including the sign of an exponent, which should be ignored for the alignment:

julia> rand(Complex128, 5).^20
5-element Array{Complex{Float64},1}:
            -2.63525-3.56815im   
 1.22574e-5-1.22442e-5im         
 1.79063e-7-1.96895e-6im         
          0.00109833-0.00711597im
            -0.00427-0.019812im  

That is really confusing indeed.

diff --git a/base/show.jl b/base/show.jl
index 5cfa8ac..35252ad 100644
--- a/base/show.jl
+++ b/base/show.jl
@@ -1343,7 +1343,7 @@ function alignment(io::IO, x::Real)
 end
 "`alignment(1 + 10im)` yields (3,5) for `1 +` and `_10im` (plus sign on left, space on right)"
 function alignment(io::IO, x::Complex)
-    m = match(r"^(.*[\+\-])(.*)$", sprint(0, show, x, env=io))
+    m = match(r"^(.*[^e][\+\-])(.*)$", sprint(0, show, x, env=io))
     m === nothing ? (length(sprint(0, show, x, env=io)), 0) :
                    (length(m.captures[1]), length(m.captures[2]))
 end

might work. Will try locally.
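
The effect of the extra `[^e]` can be checked on plain strings without touching Base. In this sketch `split_widths` is a hypothetical stand-in that mirrors the shape of `alignment`, and the input strings are copied from the output shown earlier in the thread.

```julia
old_re = r"^(.*[\+\-])(.*)$"
new_re = r"^(.*[^e][\+\-])(.*)$"

# Hypothetical helper: (width up to and including the split sign, width after it).
function split_widths(re::Regex, s::AbstractString)
    m = match(re, s)
    m === nothing ? (length(s), 0) : (length(m.captures[1]), length(m.captures[2]))
end

s = "1.22574e-5-1.22442e-5im"
split_widths(old_re, s)  # (20, 3): splits at the exponent sign of the imaginary part
split_widths(new_re, s)  # (11, 12): splits at the sign between real and imaginary parts
```

The old pattern is greedy up to the last sign anywhere in the string, so an exponent such as `e-5` in the imaginary part captures the split. Requiring a non-`e` character before the sign skips those exponent signs.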

What is left for this issue to be closed? Use spaces between real and imaginary for Complex numbers even in compact mode?

Big numbers are still very ugly, try big.(rand(10, 10))

Yeah:

[image]

Maybe they should be printed compactly even though they are Big.

I would say the main reason is that big makes every entry so long that it spans multiple lines.
I checked the behavior for an array of long strings, and this is the result with [randstring(80) for i=1:3, j=1:3]:

[screenshot from 2017-06-05 16-54-17]

I chose 80 to match the previous example.

In the end, I think this is quite consistent behavior. Should we also crop strings that span more than a single line? I don't think so.

I will just go ahead and close this one. The last two reported problems are solved on master (big.(rand(10, 10)) and rand(ComplexF64, 5).^20), and I see nothing actionable otherwise. More specific issues should be opened if someone sees other problems.

We could also use something like '\u26A' as an alternate name for the im constant:

1+2ɪ

That's pretty compact, and the fact that the ɪ is small and visibly distinct helps make it more readable, IMO.

Seems this never came to fruition and this is now closed. The printing as of today is pretty good using show, but I was just wondering if there are any plans to make something like this happen at all anymore?
