At the moment, you can pass only Y values to boxplot, but the default width looks strange.
It would also be great to support a list of column names when a wide dataframe is used. I found it difficult to plot several series of data (Y), since they all use the same X value (1):


Thanks!
There are a couple issues here:
In thinking about this just now, I had the thought that the current method of hoping that a Vector{Any} is good enough to allow dispatch on "processed" data is flawed and ripe for subtle bugs... I should replace the internal logic with a wrapper type:
immutable InputData{T}
    data::T
end
so that it's explicit that an input has been processed and wrapped, and dispatch will never get confused. I'll create a separate issue for this, and the arrays of symbols issue should be resolved as part of that change.
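A minimal sketch of the wrapper idea, written with the `struct` keyword that replaced `immutable` in Julia 0.6 and later. The helper names `wrap_input` and `is_processed` are hypothetical, chosen only for illustration, and are not part of Plots:

```julia
# Wrapper marking an input as already processed (hypothetical sketch).
struct InputData{T}
    data::T
end

# Raw inputs get wrapped exactly once; wrapped inputs pass through unchanged.
wrap_input(x) = InputData(x)
wrap_input(x::InputData) = x

# Dispatch can now tell processed and raw inputs apart unambiguously,
# even when the raw input happens to be a Vector{Any}.
is_processed(::InputData) = true
is_processed(::Any) = false

raw = Any[1, 2, 3]
wrapped = wrap_input(raw)
```

Because the distinction lives in the type rather than in the element type of a vector, a raw `Vector{Any}` supplied by the user can no longer be mistaken for processed data.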
Right now I implement the boxplot recipe by explicitly applying the grouping and forcing the xticks to 1:length(shapes)... this will need to be made more flexible to allow overlaying multiple boxplots.
As a stop-gap solution, you could build the arrays as expected by the current recipe:

Boxplot looks broken right now:

The weird boxplot drawing issue is fixed.
I think the solution for the x-axis will be to have some sort of DiscreteAxis type that can map strings, etc. to an x/y coordinate. I want to be able to overlay a scatter or violin plot over a boxplot but still allow new series to extend the axis. This can share implementation with the 'setStringVector...' stuff.
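One way such a type could look, as a rough sketch only. The fields and the `coordinate!` function are assumptions made for illustration; nothing here is the actual Plots implementation:

```julia
# Hypothetical sketch: maps discrete labels (strings, symbols, ...) to integer
# positions, extending the axis when a new series brings unseen labels.
struct DiscreteAxis{T}
    positions::Dict{T,Int}   # label => coordinate
    labels::Vector{T}        # labels in order of first appearance
end

DiscreteAxis{T}() where {T} = DiscreteAxis{T}(Dict{T,Int}(), T[])

# Return the coordinate of `label`, registering it if it is new.
function coordinate!(axis::DiscreteAxis{T}, label::T) where {T}
    get!(axis.positions, label) do
        push!(axis.labels, label)
        length(axis.labels)
    end
end

axis = DiscreteAxis{String}()
# First series (e.g. a boxplot) defines the initial labels.
box_x = [coordinate!(axis, s) for s in ["a", "b", "a", "c"]]
# An overlaid series reuses known labels and extends the axis with new ones.
scatter_x = [coordinate!(axis, s) for s in ["c", "d"]]
```

Since every series asks the same axis object for coordinates, an overlaid scatter or violin lands on the same positions as the boxplot, and `axis.labels` can later supply the tick labels.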
In the current master, boxplots work fine with a categorical variable in x, but they can't be used with group.
Yeah I see the bug.. investigating
I think I got it. I'll push the fix soon.

Awesome :D I found a little bug with the whisker length. I will fix it soon.
@tbreloff The group bug was solved for a call like that, where x and group are the same, but it still gives a strange output in the following example:
ToothGrowth = dataset("datasets","ToothGrowth")
boxplot(ToothGrowth, :Dose, :Len, group=:Supp, notch=true)

I disagree that this is strange. At least... it's what I expect/want. The group arg creates 2 series. Each of those series is a boxplot, and each of those series is then re-grouped over the same x-domain. (Unless I'm missing something?)
If you want them in different subplots because you don't like the overlap, you can add 'layout=2' (you'd probably want to 'link=:all' as well), or maybe make them easier to see by setting 'alpha=0.5'?
I didn't know that layout=2, link=:all does the trick (maybe layout=:Supp could be more intuitive and/or similar to ggplot's facet grid). The first time, I was expecting something like the ggplot2 output:

maybe layout=:Supp could be more intuitive
I can't for the life of me figure out what :Supp is supposed to be. So I wouldn't vote for that being more intuitive! ;)
But these don't sound like very general ideas. What if the x data isn't nicely spaced? What if there are lots of groups? Just seems like its usefulness would be limited, but what do I know? I don't even know what "Supp" is!
Sorry... I was saying that
boxplot(ToothGrowth, :Dose, :Len, layout=:Supp)
would be more intuitive than
boxplot(ToothGrowth, :Dose, :Len, group=:Supp, layout=2, link=:all)
I also imagine layout taking a DataFrames Formula, like ggplot2's facet_grid.
Ha.. oh it's a field not a setting. I can't decide if that makes me look better or worse :open_mouth:
I'm not sure I fully understand what that would mean (in the general sense). This might only work well with dataframe column labels? Even then there's lots of weirdness?
OK, I understand... In my opinion, no one wants superimposed boxplots (since you compare them side by side). So having group give superimposed boxplots, instead of a result similar to ggplot2, is not intuitive. But maybe that is because I used to make a lot of ggplot2 plots. I believe the current behavior of group is good for other series, but maybe not so good for boxplot.
In general, I used to find facet grid very useful: it takes an R formula that indicates the variables (data.frame columns) of categorical data. So to me, giving a categorical variable to layout means something like: I want a grid with as many plots as factor levels, with each data subset plotted according to those levels. But maybe I'm the only one who expects something like that XD
I think you're not the only one, but... would you agree that this discussion only really makes sense if your inputs are DataFrames and the Symbols for the columns?
Would it make more sense to have a "facet" recipe (similar to how I did marginal hists) which can handle all this stuff? Then it prepares everything for a "generic" boxplot (or whatever else) series recipe... offsetting x-values as needed, creating the layout, etc.
So you would call facet(iris, :Species, <blah blah>, layout = xxx ~ yyy) or something like that, and the facet recipe would replace layout with a real layout based on the formula.
Your facet idea is a lot better than my degeneration of the layout keyword argument ;) But I don't see why it should be restricted to DataFrames...
x = rand(10)
y = rand(10)
z = [0,0,0,0,1,1,1,1,1,1]
w = [1,0,1,0,1,0,1,0,1,0]
facet(x, y, <bla bla>, layout = z ~ w) # Can something like this work?
That may be a lot trickier to implement, as you'd get the Symbols z/w inside the recipe, with no way to access the variables z/w. I'm sure there's a way, it's just not as straightforward as the DataFrame case.
I imagine that maybe we can use a Facet type, which stores the variables z and w and makes the needed checks in its constructor. So, it can use dispatch:
plot(x, y, Facet(z,w))
Another idea would be to use a Julia Pair instead of a DataFrames Formula. Formula syntax being supported only for DataFrames seems fine to me.
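A rough sketch of what such a container might look like, assuming the only check needed is that both variables have the same length. The field names `rows` and `cols` and the `Pair` constructor are illustrative assumptions, and this type is unrelated to the one that `@userplot Facet` would generate:

```julia
# Hypothetical container for two faceting variables; not an existing Plots type.
struct Facet{R,C}
    rows::Vector{R}   # variable that selects the subplot row
    cols::Vector{C}   # variable that selects the subplot column
    function Facet(rows::Vector{R}, cols::Vector{C}) where {R,C}
        length(rows) == length(cols) ||
            throw(ArgumentError("facet variables must have the same length"))
        new{R,C}(rows, cols)
    end
end

# A Pair can carry the same information without needing a formula type.
Facet(p::Pair) = Facet(p.first, p.second)

z = [0,0,0,0,1,1,1,1,1,1]
w = [1,0,1,0,1,0,1,0,1,0]
f = Facet(z => w)
```

Because the variables themselves are stored, a method that dispatches on `Facet` never has to look up `z` or `w` by name, which is the difficulty mentioned for the formula case.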
The way I envision it:
@userplot Facet
@recipe function f(facet::Facet; facet_groups = nothing)
    # inputs are the tuple: facet.args
    # TODO: process args with facet_groups to build a layout and assign series to subplots
end
# usage:
facet(args...; facet_groups = ???)
The Facet user plot looks fine. One thing that R solves by using dots in its formula (i.e. . ~ var) is indicating whether the categorical variable will generate vertical or horizontal subplots.
What do you think about diverging from the formula syntax and using something like:
facet(args...; x_group=varx, y_group=vary)
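To make the proposal concrete, here is a sketch of how two grouping vectors could be turned into a grid of index subsets, one per subplot. The function name `facet_cells` is hypothetical, and the convention that `x_group` selects the column while `y_group` selects the row is an assumption:

```julia
# Hypothetical helper: split observation indices into a grid of facets.
# Rows follow the levels of `y_group`, columns the levels of `x_group`.
function facet_cells(x_group::AbstractVector, y_group::AbstractVector)
    length(x_group) == length(y_group) ||
        throw(ArgumentError("grouping vectors must have the same length"))
    col_levels = sort(unique(x_group))
    row_levels = sort(unique(y_group))
    # One empty index list per (row, column) cell of the grid.
    cells = [Int[] for _r in row_levels, _c in col_levels]
    for i in eachindex(x_group, y_group)
        r = findfirst(isequal(y_group[i]), row_levels)
        c = findfirst(isequal(x_group[i]), col_levels)
        push!(cells[r, c], i)
    end
    return row_levels, col_levels, cells
end

z = [0,0,0,0,1,1,1,1,1,1]
w = [1,0,1,0,1,0,1,0,1,0]
row_levels, col_levels, cells = facet_cells(z, w)
```

Each entry of `cells` holds the indices of the observations that belong in that subplot, so a recipe could index `x` and `y` with it and assign the result to the matching position in the layout. Passing a constant vector for one of the groups gives a single row or a single column, which covers the `. ~ var` case.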
@tbreloff Is there a better/elegant way to do this?

using RDatasets
iris = dataset("datasets","iris")
using Plots
pyplot(size=(300,300))
iris[:dummy] = 1 # To plot the boxplot
boxplot(iris, :dummy, [:SepalLength :SepalWidth :PetalLength :PetalWidth], layout=grid(1,4), link=:y)
I was expecting to do something like: boxplot(iris, [:SepalLength :SepalWidth :PetalLength :PetalWidth])
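Until a list of columns is accepted directly, one possible workaround is to stack the wide columns into a single pair of x/y vectors, with the column name as the categorical x value. This sketch uses plain vectors of `name => values` pairs in place of a DataFrame; `stack_columns` is a hypothetical helper, and the numbers are placeholder values for illustration only:

```julia
# Hypothetical helper: stack several columns into the (x, y) vectors a
# boxplot with a categorical x needs, labelling each value by its column.
function stack_columns(columns::AbstractVector{<:Pair})
    x = String[]
    y = Float64[]
    for (name, vals) in columns
        append!(x, fill(string(name), length(vals)))
        append!(y, vals)
    end
    return x, y
end

# Placeholder data standing in for two DataFrame columns.
wide = [:SepalLength => [5.1, 4.9, 4.7], :SepalWidth => [3.5, 3.0, 3.2]]
x, y = stack_columns(wide)
```

The resulting `x` and `y` have one entry per observation, so the dummy column is no longer needed to give every series a position.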
Ugh... I need to recode DataFrames support. I hate how I'm doing it now.
@tbreloff Another thing about my last example... The boxplot linecolor is equal to the fillcolor, so the median line isn't visible.
You finally motivated me to fix the horribly inflexible DataFrames code. Now you can do cool stuff:

These changes aren't pushed up yet.