Julia: "whos" is a bad name for the functionality

Created on 13 Jul 2015 · 51 Comments · Source: JuliaLang/julia

Of course it was named by analogy with the MATLAB command. But whos is one of the more obscure names in Julia. If anything, who should list users, not things. Second, the "s" in "whos" seems to be a contraction of "who + sizes", making whos an extended version of who in MATLAB. In Julia there is no who, of course, and whos does not report sizes. I am a bit ashamed of reporting this, but apparently I care.

breaking · good first issue · help wanted

All 51 comments

Agree - but perhaps a suggestion for a better name? We used to have sizes and they got removed at some point.

Maybe globals? or objects? In R, this is called ls.

Anyway, +1 to a better name.

Random thoughts: (i wasn't even aware that 'who' exists in matlab)

  • variables
  • listvar
  • varlist

as commands
i always liked this dir(object) in python. I guess the correct julia case would be dir(Main).

P.S. workspace() for cleaning up could also use a better name -> new_workspace()

Tangentially, I posted a pull request to report sizes in whos. It is still open, but I don't have the number handy.

#11461

dir is a nice option, also ls. It's only used by people working at the REPL so it shouldn't be a long name like variables.

Also, if whos is getting attention, it would be nice to reach a consensus about not displaying imported modules by default which can really clutter it up. See issue #9902, PR #10108.

I think that this function conflates two rather different things:

  • what names are bound in a given namespace (e.g. Main)
  • how much memory am I using and what is claiming it

Perhaps the APIs to discover these things should not be crammed into one.

I agree a separate function just to get a list of names should exist. But for the second function, returning the memory used by each object, you obviously need to also return the names to identify them.

what(AbstractString, Number, -Integer)
<string name>           ASCIIString
<string name>           UTF8String
<complex name>          Complex{Float64}

about(AbstractString, Number, -Integer)
<string name>           ASCIIString         <concise size>
<string name>           UTF8String          <concise size>
<complex name>          Complex{Float64}    <concise size>

where(<pkg|module name>,.. , what(AbstractString, Number, -Integer))
<string name>           ASCIIString         <pkg|module>
<string name>           UTF8String          <pkg|module>          
<complex name>          Complex{Float64}    <pkg|module>

where(<pkg|module name>,.. , about(AbstractString, Number, -Integer))
<string name>           ASCIIString         <concise size>      <pkg|module>
<string name>           UTF8String          <concise size>      <pkg|module>
<complex name>          Complex{Float64}    <concise size>      <pkg|module>

It would be nice to just get rid of whos, even if by just renaming it to something better - possibly for 0.4

I like scope for this. (No doubt that's controversial.)

ls() is free, short, has precedent in R, and is discoverable.

ls +1

You also get information about the size and the location using bash's ls.

P.S. Can whos/ls return a vector of Symbols?

:-1: on calling it ls, do we need to propagate short indecipherable names in today's world with completion? Also, hopefully Julia is on its way to being well-known as a general purpose language, not just a niche numeric/technical/scientific computing language (although, yes, data science is exploding lately), so precedent in R doesn't seem that important.
(although ls would be better than whos, which came from Matlab)

Programming languages and operating systems are replete with the overly
compressed and slightly bent choices -- most were introduced to advance
simple wants and adopted as harbingers of greater efficacy and clarity.

Julia is becoming ever more an engaging conversationalist -- the community
interacts with greater ease as talking with the language becomes a way that
goals are accomplished. Julia does not speak with "a computer accent" and
one good from that is we spend less time attending missed understandings.

There are many fine short words. +1 for those that are transparent in use.

On Sun, Jul 26, 2015 at 6:36 AM, Scott P. Jones wrote:

:-1: on calling it ls, do we need to propagate short
indecipherable names in today's world with completion? Also, hopefully
Julia is on its way to being well-known as a general purpose language, not
just a niche numeric/technical/scientific computing language (although,
yes, data science is exploding lately), so precedent in R doesn't seem
that important.

—
Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/12131#issuecomment-124967724.

+1 for ls. And we could even do _ls_ for names only and _la_ for names+memory.

Once there was COBOL...

    PERFORM UNTIL B = 0
        MOVE A TO TEMP
        MOVE B TO A
        DIVIDE TEMP BY B GIVING TEMP REMAINDER B
    END-PERFORM

I know ls is l(i)s(t), what is la?

On Mon, Jul 27, 2015 at 4:38 AM, Andreas Lobinger wrote:

+1 for ls. And we could even do ls for names only la for names+memory.

... on a lot of systems i visited the last years: an alias to ls -la

so
ls = list
la = list all

ls is an overly short unix-ism, and screams filesystem to me. names(Main) seems to do most of this, just without a summary description column. [typeof(Main.(i)) for i in names(Main)] gives at least the types, though I think that Main.(i) syntax might change to getfield(Main, i) at some point.
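A minimal sketch of the names-plus-types idea from the comment above, written with getfield instead of the older Main.(i) syntax. The helper name name_types is hypothetical, and the code assumes Julia 1.x.

```julia
# Hypothetical helper (not part of Base): pair every defined name that
# `names` reports for a module with the type of the value bound to it.
# getfield(m, n) stands in for the older Main.(n) syntax.
name_types(m::Module) =
    [n => typeof(getfield(m, n)) for n in names(m) if isdefined(m, n)]

# Print the first few entries for Base.
for (name, T) in first(name_types(Base), 5)
    println(rpad(string(name), 20), T)
end
```

By default names only lists exported names, so name_types(Main) in a fresh session is short; names(m; all = true) would widen it.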

yezindeedee (and goodnight)

On Mon, Jul 27, 2015 at 5:46 AM, Tony Kelman wrote:

ls is an overly short unix-ism, and screams filesystem to me. names(Main)
seems to do most of this, just without a summary description column. [typeof(Main.(i))
for i in names(Main)] gives at least the types, though I think that
Main.(i) syntax might change to getfield(Main, i) at some point.

How about names() with no args defaults to global names in the current module? We could use memory() or something like that to give a breakdown of memory reachable from global roots and what roots account for what portion of it.
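One way the memory() half of that split could look, as a rough sketch only. The function name memory_by_name and its keyword are assumptions made for illustration; Base.summarysize is the existing (unexported) Base function that estimates the bytes reachable from a value.

```julia
# Hypothetical sketch: report how many bytes are reachable from each global
# in a module, largest first. Names and sizes are returned, nothing is printed.
function memory_by_name(m::Module = Main; all::Bool = false)
    rows = Pair{Symbol,Int}[]
    for n in names(m; all = all)
        isdefined(m, n) || continue
        value = getfield(m, n)
        value isa Module && continue   # skip modules, they can be very large to walk
        push!(rows, n => Base.summarysize(value))
    end
    return sort!(rows; by = last, rev = true)
end
```

Because the result is an ordinary vector of pairs, the caller can filter or sum it, for example sum(last, memory_by_name(Main; all = true); init = 0). Values shared between two globals are counted once per global here, so the total can overstate real usage.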

+1 names()
-1 ls()

+1 names() and restructure documentation, so it shows up in Essentials.

+1

On Mon, Jul 27, 2015 at 9:39 AM, Andreas Lobinger wrote:

+1 names() and restructure documentation, so it shows up in Essentials.

This may be too much change, but... what if we change: whos() -->
workspace(), and workspace() --> clean_workspace()

I always thought that resetting your session should not happen with a
function that looks like it should just provide information. At a minimum,
maybe we could change: workspace() --> workspace!()

On Mon, Jul 27, 2015 at 9:38 AM, Andreas Lobinger wrote:

+1 names() and restructure documentation, so it shows up in Essentials.

+1 to names() if it has the same output format as names(someobj), it seems nicely dispatch-y and consistent with what names means.

Should the default be "global names in the current module" or "currently available unqualified names" (i.e. the set of names that you can use without prepending X.). Maybe that's not a well-defined enough concept to be useful.

Also memory or memusage seems really useful. +1 to that as well.
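To make the question about defaults concrete, here is a small sketch (assuming Julia 1.x) of what names already distinguishes. The module and the globals visible and hidden are made up for the example. As far as I know, neither call below lists names that are only reachable through using, so neither is exactly "currently available unqualified names".

```julia
# Build a throwaway module with one exported and one non-exported global.
demo = Module(:Demo)
Core.eval(demo, :(export visible))
Core.eval(demo, :(visible = 1))
Core.eval(demo, :(hidden = 2))

exported   = names(demo)               # exported names, plus the module's own name
everything = names(demo; all = true)   # also includes non-exported globals

println(exported)
println(setdiff(everything, exported))
```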

@tbreloff, I agree, workspace() seems like a dangerous name, since it doesn't follow the ! convention and isn't named clear_workspace() or something of that ilk.
Maybe workspace() could return the current workspace, and workspace!([workspace]) could set it (returning the old one).
I like names() and memory() or memusage() also.

Currently visible global names is a good default since it is hard to figure out whereas the names of a particular module is easy to ask for by doing names(M) where M is the module object.

On Jul 27, 2015, at 10:14 AM, Spencer Russell wrote:

+1 to names() if it has the same output format as names(someobj), it seems nicely dispatch-y and consistent with what names means.

Should the default be "global names in the current module" or "currently available unqualified names" (i.e. the set of names that you can use without prepending X.). Maybe that's not a well-defined enough concept to be useful.

Also memory or memusage seems really useful. +1 to that as well.

Whatever it is called, let me make a plea for whos() to actually return a data structure of some sort that is displayed, rather than acting via side effects. Except for functions that actually perform I/O, I think relying on side effects is a bad idea.

For one thing, returning a value allows the caller to perform further processing on the data if desired. For another thing, returning a data structure allows for more flexible display, e.g. a writemime(io, "text/html", ...) could output an HTML table and display nicely in IJulia. (See also JuliaLang/IJulia.jl#342.)
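A rough sketch of that pattern, not a proposal for the actual API. The type VarSummary and the function summarize_vars are hypothetical names. The code assumes Julia 0.5 or later, where writemime(io, mime, x) is spelled show(io, mime, x).

```julia
# Hypothetical container: one entry per variable. Not part of Base.
struct VarSummary
    names::Vector{Symbol}
    kinds::Vector{String}
end

# Collect the data without printing anything.
function summarize_vars(m::Module)
    ns = [n for n in names(m) if isdefined(m, n)]
    return VarSummary(ns, [summary(getfield(m, n)) for n in ns])
end

# Plain-text display, used by the REPL.
function Base.show(io::IO, ::MIME"text/plain", t::VarSummary)
    for (n, k) in zip(t.names, t.kinds)
        println(io, rpad(string(n), 24), k)
    end
end

# HTML display, which a front end such as IJulia can pick up.
# Note: this sketch does not escape HTML special characters.
function Base.show(io::IO, ::MIME"text/html", t::VarSummary)
    print(io, "<table>")
    for (n, k) in zip(t.names, t.kinds)
        print(io, "<tr><td>", n, "</td><td>", k, "</td></tr>")
    end
    print(io, "</table>")
end
```

The caller gets a value it can process further (for example summarize_vars(Main).names), and each front end chooses the richest display it supports.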

Very very good suggestion, which would be nice to have documented as part of best practices for Julia code.
:+1: :100:

Proposal: call it varinfo(::Module) and have it return an object as @stevengj suggests.

Python calls this globals() but it only gives a list of their names, not the variable info.

Python calls Python Python

globals makes a lot more sense to me than varinfo.

Something like this?

# Targets the Julia 0.6-era API (ismatch, unshift!, Base.showarray).
struct FieldDetail
    name::Symbol
    bytes::Int
    summary::String
end

struct ObjectTable{O} <: AbstractArray{O, 1}
    objects::Vector{O}
end

get_field_detail(anobject, asymbol) = begin
    value = getfield(anobject, asymbol)
    # summarysize is not exported, so it has to be qualified with Base.
    FieldDetail(asymbol, Base.summarysize(value), summary(value))
end

global_details(m::Module, pattern = r"") = ObjectTable([
    get_field_detail(m, v)
    for v in sort!(names(m))
    if isdefined(m, v) && ismatch(pattern, string(v))
])

# Extend the Base array interface instead of defining new local functions.
Base.IndexStyle(::Type{<:ObjectTable}) = IndexLinear()
Base.size(t::ObjectTable) = size(t.objects)
Base.getindex(t::ObjectTable, i::Int) = getindex(t.objects, i)
Base.setindex!(t::ObjectTable, v, i::Int) = setindex!(t.objects, v, i)

# Render the table as Markdown: a header row of field names, then one row per object.
Base.showarray(io::IO, t::ObjectTable, repr::Bool) = begin
    fnames = fieldnames(eltype(t))
    Base.show(io, 
        Markdown.MD([Markdown.Table(
            unshift!(
                map(t.objects) do object
                    string.(getfield.(object, fnames))
                end, 
                string.(fnames)), 
            repeat([:l], inner = length(fnames)))]))
end

```
julia> global_details(Base, r"^sin")
| name | bytes | summary |
|:----- |:----- |:---------------- |
| sin | 0 | Base.#sin |
| sinc | 0 | Base.Math.#sinc |
| sind | 0 | Base.Math.#sind |
| sinh | 0 | Base.#sinh |
| sinpi | 0 | Base.Math.#sinpi |
```

Adding an entire new kind of array just for this case seems like overkill to me.

Eh, maybe. I needed a new type to avoid the default show methods for vector. Here's another more explicit implementation:

struct FieldDetail
    name::Symbol
    bytes::Int
    summary::String
end

get_field_detail(anobject, asymbol) = begin
    value = getfield(anobject, asymbol)
    FieldDetail(asymbol, Base.summarysize(value), summary(value))
end

global_details(m::Module, pattern = r"") = [
    get_field_detail(m, v)
    for v in sort!(names(m))
    if isdefined(m, v) && ismatch(pattern, string(v))
]

object_table(v::Vector{T}) where T = begin
    fnames = fieldnames(T)
    Markdown.MD([Markdown.Table(
        unshift!(
            map(v) do object
                string.(getfield.(object, fnames))
            end, 
            string.(fnames)), 
        repeat([:l], inner = length(fnames)))])
end

object_table(global_details(Base, r"^sin"))

ObjectTables (or an object table function) seem more broadly useful to me. For example, they are a nice way to print a vector of NamedTuples.

Yes, I think this is a great use case for a lightweight built-in dataframe-like type.

This kind of lightweight table-like structure is needed in many applications, so it would indeed be great to have a standard way of representing them. That would prevent people from misusing DataFrames for this as is too often the case (probably inspired by what R does).

Question: fieldnames are statically known to the compiler, right? I don't see this reflected in the output of @code_warntype fieldnames(FieldDetail). I think this is the source of type instability in the code above.

Much simpler solution: whenever NamedTuples land in Base, have a Vector of NamedTuples print as a markdown table. Then whos could simply return a vector of NamedTuples.
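A sketch of what that could look like as an explicit helper rather than a built-in show method. It assumes Julia 1.x, where NamedTuples are in Base and Markdown is a standard library. The function name as_table is hypothetical, and the sketch assumes every row has the same fields as the first one.

```julia
using Markdown

# Hypothetical helper: turn a vector of NamedTuples into a Markdown table,
# using the field names of the first row as the header.
function as_table(rows::AbstractVector{<:NamedTuple})
    isempty(rows) && return Markdown.MD(Any[])
    cols = keys(first(rows))
    header = Any[string(c) for c in cols]
    body = [Any[string(r[c]) for c in cols] for r in rows]
    table = Markdown.Table(pushfirst!(body, header), fill(:l, length(cols)))
    return Markdown.MD(Any[table])
end

details = [(name = :sin, bytes = 0), (name = :sinpi, bytes = 0)]
display(as_table(details))
```

The data stays a plain vector that callers can filter or sort, and the table is only built at display time.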

I like varinfo over globals, since there's not much reason this is specific to global variables. It could in theory give information about any variables.

@varinfo in local scope? Or do you mean calling on values?

[@]varinfo(x::T) where T<:Union{Function, Module}; varinfo() == varinfo(Main)

if that, maybe [@]fieldinfo(x::T) where T isa struct

Maybe someone should be assigned to this?

The only thing to do here is rename whos to varinfo. In the future a @varinfo macro could potentially give info about local variables as well.

@StefanKarpinski, doesn't it also have to be changed to return a data structure instead of printing via side effects?

We're getting down to crunch time for 1.0 feature freeze and that seems pretty featurey to me. Since this is strictly interactive functionality, as long as the name is correct for 1.0, we can change what it returns and how info gets printed in 1.x. Changing from returning nothing and printing info to returning a value with special display is the safer direction anyway. Given that, I'm leaving this issue to address only the name change with a 1.0 milestone. You can open a 1.x feature that proposes changing the function to return a value with special printing.

It might be good to just have it return a Markdown.Table.

Regardless, that can be done at any point in the future.
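For readers on a current release: a short usage sketch, assuming Julia 1.x, where varinfo lives in the InteractiveUtils standard library and returns a Markdown object rather than printing as a side effect. The regex is only an example.

```julia
using InteractiveUtils  # loaded automatically in the REPL, needed explicitly in scripts
using Markdown

# Ask for the names in Base that match a pattern. The result is a value,
# so it can be stored, inspected, or displayed later.
table = varinfo(Base, r"^sinpi$")
println(typeof(table))
display(table)
```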
