Currently, Vega-Lite supports small multiples only along one axis (with row/column, similar to Tableau).
However, it's useful to create wrapped small multiples (similar to lattice in R / ggplot)
http://codealamode.blogspot.com/2012/02/trellis-graphs-in-ggplot2.html
or this example

Should this be called facet_wrap or facet_matrix?
ggplot uses the former, but essentially it's for creating a trellis matrix.
+1
We almost support this already. It's really easy with the new Vega layouts that we are already using.
We still need to design the right syntax for this to avoid confusion with traditional row/column facet channels.
Thanks. I'm currently using vega-lite through altair, so presumably there will be a bit of a delay before it makes it there, but it's good to know it is on the way.
Besides supporting wrapped facet, a similar thing to consider when designing syntax would be wrapped concat and wrapped repeat.
Although wrapped concat and wrapped repeat are probably less important, we should consider designing syntaxes for all of them before we start implementing wrapped facet.
Just so I'm clear: are you imagining this being a new channel, something along the lines of "encoding": {"facet_grid": {"field": "foo", "type": "ordinal"}}?
are you imagining this being a new channel
I don't think we will support this in channels. Instead, you have to explicitly use the facet operator. Wrapping would either be in the layout specification or we add a wrap in addition to row and column in the facet definition.
Ah, thanks. That makes sense.
I think if we have wrap channel in facet operator, we could consider having a similar shorthand wrap channel in encoding like we have for row and column too?
But there are also some wrapping parameters (like the number of columns) that we have to consider when deciding whether adding this shortcut makes sense.
Just to add another voice that this would be fantastic. It is one of the features that I (and the folks around me) seem to use non-stop in ggplot, so adding that would allow us to move a lot more stuff over.
Yes, we will add this. It would be a tremendous help if you could help us by making suggestions for what the interface looks like. The implementation of this feature shouldn't be too hard as we use Vega's layout already for facets: https://vega.github.io/vega/docs/layout/
The only piece of input data that is missing is the maximum number of columns or rows, right?
In that case, I would add two optional values to FacetFieldDef.
The spec would error if both of those values are provided in one FacetFieldDef or if both row and column are defined and at least one of them has max number of columns/rows defined
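The error rule above could be checked roughly as follows. This is a minimal Python sketch, not Vega-Lite's implementation: the property names `maxColumns` and `maxRows` and the function `validate_facet` are hypothetical, since the thread never names the two optional values.

```python
# Hypothetical property names; the proposal only says "two optional values".
LIMIT_KEYS = ("maxColumns", "maxRows")


def validate_facet(facet: dict) -> None:
    """Raise ValueError if a facet definition breaks the proposed rule."""
    # Collect the field definitions that are actually present.
    defs = [facet[channel] for channel in ("row", "column") if channel in facet]

    # Case 1: one field definition sets both limits.
    for field_def in defs:
        if all(key in field_def for key in LIMIT_KEYS):
            raise ValueError("a field definition may set only one of maxColumns/maxRows")

    # Case 2: row and column are both defined and at least one sets a limit.
    if len(defs) == 2 and any(key in d for d in defs for key in LIMIT_KEYS):
        raise ValueError("limits are not allowed when both row and column are defined")


# A single channel with a single limit is accepted.
validate_facet({"row": {"field": "category", "type": "nominal", "maxColumns": 3}})
```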
Since a wrapped facet should only have one field and it's both row and column, I think we should instead introduce a different channel for this (wrap?).
Maybe just facet? Or perhaps multiple in reference to "small multiples"?
For context, the syntax would probably look like this in a faceted spec:
data: ...
facet: {
  wrap: {
    field: 'category', type: 'nominal',
    columns: 5 // number of columns before it wraps. (I guess it's the most natural to put the number here?)
  }
},
spec: ...
I think facet / multiple probably won't fit in the place of wrap, but I'm open to hearing better terms if there are any.
That looks good.
My 2c after using ggplot2 for many years: I have set both the number of columns and the number of rows. One may trump the other if they are both defined, but I have found it useful to have both.
@eibanez Currently the underlying Vega layout operator only supports columns. I think this should be sufficient most of the time. But if you think rows is important too, please file a separate feature request in the Vega repo. Otherwise, we can't support it here anyway.
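To make the columns-only behaviour concrete, here is a small Python sketch of where each facet lands when only a column count is given. It assumes row-major (left-to-right, then top-to-bottom) filling; the function name `wrapped_cell` and the sample categories are made up for illustration.

```python
from typing import Tuple


def wrapped_cell(index: int, columns: int) -> Tuple[int, int]:
    """Return (row, column) of the facet at `index`, filling left to right."""
    if index < 0 or columns < 1:
        raise ValueError("index must be >= 0 and columns must be >= 1")
    # divmod gives the full rows above the cell and the offset within its row.
    return divmod(index, columns)


# Seven hypothetical categories wrapped after 5 columns.
categories = ["A", "B", "C", "D", "E", "F", "G"]
cells = {name: wrapped_cell(i, 5) for i, name in enumerate(categories)}
```

With 5 columns, the first five categories fill the first row and the remaining two start the second row.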
This looks great! Would this then also work?
{
  ..
  "encoding": {
    "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}
Is that what @jakevdp had in mind above?
In the full version it would be
{
  ..
  "facet": {
    "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}
I find it a bit odd to have the number of columns in the field def but I guess it makes sense in this case.
"facet": {
  "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
}
would be the full version, but I expect that a macro channel like row/column would work too
{
  ..
  "encoding": {
    "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}
I'm worried that "wrap" as an encoding name will be confusing to users, and from the Altair side I anticipate this is where it will most often be used. "facet" or "facet_wrap" would be clearer as an encoding name.
We already use the column and row channels and so wrap would be more consistent than facet_wrap.
It's consistent if you know it refers to a facet, but nothing in the name implies that. Row and column are clearer in that regard.
Thank you all for your proposals. We will think through them and hopefully add this soon.
As a side note, this feature might interact with #2446.
Another alternative would be having one less level of nesting for wrapped facet (since users can use only one channel anyway):
{
  // if facet is fieldDef right away, then it's a wrapped facet
  "facet": {"field": "Origin", "type": "nominal", "columns": 4},
  "spec": ...
}
then we can have
{
  ..
  "encoding": {
    "facet": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}
without the confusing "wrap" channel that @jakevdp is concerned about.
However, this approach also has a few cons:
1) It's less consistent with the row-based, column-based, or row and column-based faceting.
2) It's still not very clear that facet channel = wrapped facet whereas row and column channels are row-based / column-based variant of facet.
(Despite these cons, currently I actually kinda like this one better than the other one above though.)
I guess one argument for saying facet = wrapped facet is that it will likely be used more often.
(In a way, once we have wrapped facet, I suspect that people will use row/column mostly when they want to use both row and column?)
@kanitw I like that idea.
It has the added benefit that it removes the ambiguity of what it means if someone specifies both column and wrap, for example. This proposal lets you specify either a row/column facet or a wrapped facet, but not both.
I think (not 100% sure yet, though) I would prefer a different name for the encoding channel and a top level operator, i.e. something different from @kanitw's latest code example.
I'm (finally) landing on what I think is a really nice Julia syntax for vega-lite, but the latest design iteration I have kind of relies on not having overlaps between the names of encoding channels and top level spec elements.
@davidanthoff If you can give a brief example of what your API looks like, that would be great.
We might take that into consideration. That said, we will have to design what's best for the JSON format here so if the proposal to have facet = wrapped facet seems the best for VL, we may have to do it.
columns at the top-level?
By the way, for the number of columns, I think it might be better to put it at the top-level, at least for the full syntax, since (1) it's a property of the facet operator, not the field, and (2) we will likely want to apply the same idea to a wrapped repeat, so the number of columns should be in a consistent place.
{
  // if facet is fieldDef right away, then it's a wrapped facet
  "facet": {"field": "Origin", "type": "nominal"},
  "spec": ...,
  "columns": 4
}
{
  // if repeat is an array right away, then it's a wrapped repeat
  "repeat": ["a", "b", "c"], // can't put column inside here
  "spec": ...,
  "columns": 4
}
I guess the facet shorthand has two options: (a) have columns in the field def or (b) have no columns at all (as it's not really a field's property), but let users customize that in a facet config.
Finally, thinking about wrapped concat -- that's basically a flow/grid layout. Not sure what's the best term, but columns can be consistently put at the top-level too.
// this will produce
// spec1 spec2
// spec3 spec4
{
  "concat/grid/flow/wrapped_concat": [spec1, spec2, spec3, spec4],
  "columns": 2
}
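The flow/grid behaviour in the comment above amounts to chunking the list of specs by the column count. A minimal Python sketch, using placeholder strings in place of real specs; `wrap_into_rows` is a hypothetical helper, not part of any Vega-Lite API.

```python
from typing import List, Sequence


def wrap_into_rows(items: Sequence[str], columns: int) -> List[List[str]]:
    """Split `items` into consecutive rows of at most `columns` entries."""
    if columns < 1:
        raise ValueError("columns must be >= 1")
    return [list(items[i:i + columns]) for i in range(0, len(items), columns)]


# Placeholder names standing in for the four concatenated specs.
grid = wrap_into_rows(["spec1", "spec2", "spec3", "spec4"], columns=2)
for row in grid:
    print(" ".join(row))
```

The last row is simply shorter when the number of specs is not a multiple of `columns`.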
If you can give a brief example of what your API looks like, that would be great.
Sure. Nothing is final, at this point, but this is where I'm going right now.
In general you pipe data into a spec. And the most verbose version of doing a spec resembles the JSON closely:
julia
data |> @vlplot(
    mark=:bar,
    encoding={
        x={field=:field_a},
        y={field=:field_b}
    }
)
And then I provide a number of shortcuts that make common cases easier. I have Altair's shorthand, the ability to use enc instead of encoding, and then for the mark type I allow specifying the type positionally without the mark= keyword. So that gets this down to:
julia
data |> @vlplot(:bar,enc={x="field_a:q",y="field_b:q"})
I'd really like to also get rid of the nesting caused by enc=. In general I feel the hierarchical stuff is really a nice fit for the JSON format, but a real pain at the REPL for interactive work. Just keeping track of how many parentheses you need to close etc. is not a good experience at the REPL if I want to try something quickly. So my latest version also allows one to just skip the whole enc={} step. So then it looks like this:
julia
data |> @vlplot(:bar, x="field_a:q", y="field_b:q")
I think that is pretty much ideal for the simple and common cases at the REPL where I might just want to quickly explore a dataset.
But this last step (optionally leaving out the enc={}) really only works if there are no name overlaps between top level elements and things that one can have inside enc={}... So I'm a bit nervous about that option because it would mean that if you guys introduce a name conflict in a new version of vega-lite, my design would kind of break :)
You can look at many more examples here, here, here and here. All the examples are copied from either the vega-examples or the altair examples, and I've always pasted the original above the julia syntax so that one can easily compare. These notebooks don't yet use all of the simplifications I described in this comment here.
@davidanthoff I see. I think you should distinguish between a REPL method for a single view spec (@vlplot) and methods for view composition like concat / facet, as they are different kinds of methods anyway. For example, if you have (@vlfacet) for a faceted spec, then facet would be the facet property of the Facet spec in @vlfacet. In @vlplot, facet would then be the shortcut facet encoding.
Let me think about that. Right now the view composition is done mostly via various julia operator overloads (I don't have examples for all of those options up yet), which ends up being quite intuitive for the most part.
If collisions are a concern, another option is to add a prefix to the encodings: enc_color, enc_x, ... etc. However, I agree with @kanitw that composition should be an operator and not an argument.
@jheer proposes another alternative which is using row and column to indicate the orientation for the wrapped facet. Basically, we can get something like
data: ...
facet: {
  column: {field: 'category', type: 'nominal'}
},
columns: 5, // there will be 5 columns per row
spec: ...
data: ...
facet: {
  row: {field: 'category', type: 'nominal'}
},
rows: 5, // there will be 5 rows per column
spec: ...
If both facet.row and facet.column are specified, columns or rows will be ignored and there will be no wrapping.
This syntax is nice in the sense that we preserve the existing syntax and only need to extend the facet spec with rows/columns.
(Note that we can support rows by doing some math.)
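One possible reading of "doing some math", assuming the underlying layout only accepts a column count: derive that count from the number of facets and the requested rows, and fill cells top to bottom. This is a Python sketch with hypothetical function names; the thread does not spell out the actual calculation.

```python
import math
from typing import Tuple


def columns_for_rows(n_facets: int, rows: int) -> int:
    """Column count needed to show `n_facets` with at most `rows` per column."""
    if n_facets < 0 or rows < 1:
        raise ValueError("n_facets must be >= 0 and rows must be >= 1")
    # At least one column, even for an empty facet domain.
    return max(1, math.ceil(n_facets / rows))


def vertical_flow_cell(index: int, rows: int) -> Tuple[int, int]:
    """Return (row, column) of the facet at `index`, filling top to bottom."""
    if index < 0 or rows < 1:
        raise ValueError("index must be >= 0 and rows must be >= 1")
    return index % rows, index // rows


# Twelve facets with rows: 5 need three columns; the last column holds two.
needed_columns = columns_for_rows(12, 5)
```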
That said, writing this spec down, I start to feel that I still like the one above that just simply has facet: {field: ..., ...} more as the row/column channels feel a bit redundant here. It also seems to be more clear if we have a flow direction parameter (to say whether it is horizontal or vertical).
I also don't think that the MVP of this needs to support wrapped facet with vertical flow.
I like that. But how would you define this when you are giving column or row in encoding?
An alternative would be allowing one of the row/columns to be defined as it is and the other as a value. That would be read as: encode this data value as rows and use 5 columns.
An alternative would be allowing one of the row/columns to be defined as it is and the other as a value. That would be read as: encode this data value as rows and use 5 columns.
Not sure what you mean here. Can you elaborate?
Something like this:
data: ...
facet: {
  column: {field: 'category', type: 'nominal'},
  row: 5 // there will be 5 rows
},
spec: ...
Or {value: 5}.
This would allow you to write wrapped facets in this "long" or the "short" form (with encoding).
@eibanez We should not mix the number of columns/rows with the field that the row or column channels encode. You can actually encode things along the columns and also limit the layout to 5 columns per row. Mixing them up introduces an unnecessary limitation.
I think you should distinguish between a REPL method for a single view spec (@vlplot) and methods for view composition like concat / facet, as they are different kinds of methods anyway. For example, if you have (@vlfacet) for a faceted spec, then facet would be the facet property of the Facet spec in @vlfacet. In @vlplot, facet would then be the shortcut facet encoding.
I thought about it, and I think I kind of like the idea of additional top level things like @vlfacet, but mostly as an additional option. One of the core design principles I followed (at least so far) was that one could take any valid vega-lite JSON spec and essentially translate it one-to-one to a call to @vlplot, by simply doing some very minimal syntax adjustments. I do think that is a valuable design principle because it makes it possible to essentially take pure vega-lite examples and documentation and apply it more or less directly in julia. I don't think I'd like to give up on that design principle for this one case, which suggests that the naming conflict would remain.
If collisions are a concern, another option is to add a prefix to the encodings: enc_color, enc_x
I think I'd also like to avoid that, it makes every case more verbose, which is something I'm trying to avoid.
So, from my point of view ideally you guys would use something like facet_wrap for the encoding channel. But, I can completely understand if other considerations swamp that. I think in that case I would just introduce facet_wrap (or wrap or something like it) as the encoding channel shorthand in the julia wrapper. That would be pretty simple. So while I'm trying to avoid too many of these deviations from the vega-lite spec, it would be a second best option that would certainly work.
using row and column to indicate the orientation for the wrapped facet. Basically, we can get something like...
I have to admit that I'm not a fan of having additional things like columns at the top level, I think that pulls things that in my mind belong together too far apart. I would much prefer that for the operator version, everything related to faceting is inside the facet element, i.e. I like the earlier proposal better.
One of the core design principles I followed (at least so far) was that one could take any valid vega-lite JSON spec and essentially translate it one-to-one to a call to @vlplot, by simply doing some very minimal syntax adjustments.
How does this work with arbitrarily nested specs? For example, vconcat(hconcat(spec0, layer(spec5, spec6)), hconcat(layer(spec1, spec2, spec3)), spec4).
How does this work with arbitrarily nested specs?
Just write it down as JSON, remove the quotes around key names, replace : with = in key-value pairs, and it should all work. For example:
julia
data |> @vlplot(
    vconcat=[
        {
            mark=:bar,
            encoding={
                x=:foo
            }
        },
        {
            mark=:line,
            encoding={
                x=:bar
            }
        }
    ]
)
There is a more convenient syntax to concat specs, but the raw thing just works as well.
Another complicated issue is that for row/column facet -- we use headers to display labels for each subplot of the facet.
However, for wrapped facet, we need to use a different mechanism (e.g., title of each subplot).
For this reason, the schema of facet wrap's field definition _may_ be different from facet row/column's field definition as it wouldn't have "headers". Or we will have to re-design how we specify facet title and labels.
If we do the former, we shouldn't re-use facet row/column syntax for wrapped facet.
@sirahd Do you want to look into this?
I've summarized the proposed syntax in https://github.com/vega/vega-lite/issues/393. So let's close this and continue our discussion there.
The actual issue is https://github.com/vega/vega-lite/issues/4457