Vega-lite: Wrapped facet & wrapped concat / repeat

Created on 21 Mar 2015 · 48 Comments · Source: vega/vega-lite

Currently, Vega-Lite supports small multiples only along one axis (with row/col, similar to Tableau).

However, it's useful to create wrapped small multiples (similar to lattice in R / ggplot):
http://codealamode.blogspot.com/2012/02/trellis-graphs-in-ggplot2.html

or this example

Area - View Composition

kanitw

👍9 ❤1

All 48 comments

Should this be called facet_wrap or facet_matrix?

ggplot uses the former, but essentially it's for creating a trellis matrix.

kanitw on 19 Jan 2016

eocarragain on 22 May 2017

We almost support this already. It's really easy with the new Vega layouts that we are already using.

domoritz on 22 May 2017

We still need to design the right syntax for this to avoid confusion with traditional row/column facet channels.

kanitw on 22 May 2017

Thanks. I'm currently using vega-lite through altair, so presumably there will be a bit of a delay before it makes it there, but good to know it is on the way.

eocarragain on 22 May 2017

Besides supporting wrapped facet, a similar thing to consider when designing syntax would be wrapped concat and wrapped repeat.

Although wrapped concat and wrapped repeat are probably less important, we should consider designing syntaxes for all of them before we start implementing wrapped facet.

kanitw on 4 Mar 2018

Just so I'm clear: are you imagining this being a new channel, something along the lines of "encoding": {"facet_grid": {"field": "foo", "type": "ordinal"}}?

jakevdp on 30 Mar 2018

> are you imagining this being a new channel

I don't think we will support this in channels. Instead, you have to explicitly use the facet operator. Wrapping would either be in the layout specification or we add a wrap in addition to row and column in the facet definition.

domoritz on 30 Mar 2018

Ah, thanks. That makes sense.

jakevdp on 31 Mar 2018

I think if we have a wrap channel in the facet operator, we could consider having a similar shorthand wrap channel in encoding, like we have for row and column too?

But there are also some wrapping parameters (like the number of columns) that we have to consider when deciding whether adding this shortcut makes sense.

kanitw on 31 Mar 2018

Just to add another voice that this would be fantastic. It is one of the features that I (and the folks around me) seem to use non-stop in ggplot, so adding that would allow us to move a lot more stuff over.

davidanthoff on 18 May 2018

Yes, we will add this. It would be a tremendous help if you could help us by making suggestions for what the interface looks like. The implementation of this feature shouldn't be too hard as we use Vega's layout already for facets: https://vega.github.io/vega/docs/layout/

domoritz on 18 May 2018

The only piece of input data that is missing is the maximum number of columns or rows, right?

In that case, I would add two optional values to FacetFieldDef.

The spec would error if both of those values are provided in one FacetFieldDef or if both row and column are defined and at least one of them has max number of columns/rows defined

eibanez on 18 May 2018
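
A minimal sketch of the two error rules described in the comment above, written against plain Python dictionaries. The property names `maxColumns` and `maxRows` are hypothetical; the comment only says "two optional values" and does not name them, and nothing here reflects an actual Vega-Lite schema.

```python
from typing import Any, Dict

# Hypothetical property names; the proposal does not name the two values.
MAX_COLUMNS = "maxColumns"
MAX_ROWS = "maxRows"


def validate_facet(facet: Dict[str, Dict[str, Any]]) -> None:
    """Raise ValueError if a facet definition breaks either proposed rule."""
    limited = []  # channels whose field definition carries a limit
    for channel in ("row", "column"):
        field_def = facet.get(channel)
        if field_def is None:
            continue
        has_columns = MAX_COLUMNS in field_def
        has_rows = MAX_ROWS in field_def
        # Rule 1: a single field definition may carry at most one limit.
        if has_columns and has_rows:
            raise ValueError(
                f"{channel}: set at most one of {MAX_COLUMNS} and {MAX_ROWS}"
            )
        if has_columns or has_rows:
            limited.append(channel)
    # Rule 2: a limit is not allowed once both row and column are defined.
    if "row" in facet and "column" in facet and limited:
        raise ValueError("a limit cannot be combined with both row and column")
```

Under these assumptions, a single `column` field definition with `maxColumns` passes, while a definition that sets both limits, or one that sets a limit alongside both `row` and `column`, is rejected.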

Since a wrapped facet should only have one field and it's both row and column, I think we should instead introduce a different channel for this (wrap?).

kanitw on 19 May 2018

👍2

Maybe just facet? Or perhaps multiple in reference to "small multiples"?

jakevdp on 19 May 2018

For context, the syntax would probably look like this in a faceted spec:

data: ...
facet: {
   wrap: {
     field: 'category', type: 'nominal', 
     columns: 5 // number of columns before it wraps. (I guess it's the most natural to put the number here?)
   }
},
spec: ...

I think facet / multiple probably won't fit in the place of wrap, but I'm open to hearing better terms if there are any.

kanitw on 19 May 2018

👍1

That looks good.

My 2c after using ggplot2 for many years: I have set both the number of columns and the number of rows. One may trump the other if they are both defined, but I have found it useful to have both.

eibanez on 19 May 2018

@eibanez Currently the underlying Vega layout operator only supports columns. I think this should be sufficient most of the time. But if you think rows is important too, please file a separate feature request in the Vega repo. Otherwise, we can't support it here anyway.

kanitw on 19 May 2018

This looks great! Would this then also work?
{ .. "encoding": { "wrap": {"field": "Origin", "type": "nominal", "columns": 4} } }
Is that what @jakevdp had in mind above?

davidanthoff on 19 May 2018

👍1

In the full version it would be

{
  ..
  "facet": {
    "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}

I find it a bit odd to have the number of columns in the field def but I guess it makes sense in this case.

domoritz on 19 May 2018

"facet": {
    "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
 }

would be the full version, but I expect that a macro channel like row/column would work too

{
  ..
  "encoding": {
    "wrap": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}

kanitw on 19 May 2018

I'm worried that "wrap" as an encoding name will be confusing to users, and from the Altair side I anticipate this is where it will most often be used. "facet" or "facet_wrap" would be clearer as an encoding name.

jakevdp on 19 May 2018

We already use the column and row channels and so wrap would be more consistent than facet_wrap.

domoritz on 19 May 2018

It's consistent if you know it refers to a facet, but nothing in the name implies that. Row and column are clearer in that regard.

jakevdp on 19 May 2018

Thank you all for your proposals. We will think through them and hopefully add this soon.

As a side note, this feature might interact with #2446.

domoritz on 19 May 2018

Another alternative would be having one less level of nesting for wrapped facet (since users can use only one channel anyway):

{
  // if facet is fieldDef right away, then it's a wrapped facet
  "facet": {"field": "Origin", "type": "nominal", "columns": 4},
  "spec": ...
}

then we can have

{
  ..
  "encoding": {
    "facet": {"field": "Origin", "type": "nominal", "columns": 4}
  }
}

without the confusing "wrap" channel that @jakevdp is concerned about.

However, this approach also has a few cons:

1) It's less consistent with the row-based, column-based, or row and column-based faceting.
2) It's still not very clear that facet channel = wrapped facet whereas row and column channels are row-based / column-based variant of facet.

(Despite these cons, currently I actually kinda like this one better than the other one above though.)

kanitw on 19 May 2018

👍1

I guess one argument for saying facet = wrapped facet is that it will likely be used more often.

(In a way, once we have wrapped facet, I suspect that people will use row/column mostly when they want to use both row and column?)

kanitw on 19 May 2018

👍1

@kanitw I like that idea.

It has the added benefit that it removes the ambiguity of what it means if someone specifies both column and wrap, for example. This proposal lets you specify either:

row and/or column, which has a well-defined output, or
a wrapped facet, which has a well-defined output

but not both.

jakevdp on 19 May 2018

👍1
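
A minimal sketch of the either/or rule discussed above: if `facet` is a field definition right away it is a wrapped facet, otherwise it is a row/column mapping. The function name `facet_kind` and its return labels are hypothetical and only illustrate the proposal; they are not part of any Vega-Lite API.

```python
from typing import Any, Dict


def facet_kind(facet: Dict[str, Any]) -> str:
    """Classify a facet definition as "wrapped" or "grid" under the proposal."""
    is_field_def = "field" in facet
    has_channels = "row" in facet or "column" in facet
    # The proposal allows one form or the other, never a mix.
    if is_field_def and has_channels:
        raise ValueError(
            "facet is either a field definition or a row/column mapping, not both"
        )
    if is_field_def:
        return "wrapped"
    if has_channels:
        return "grid"
    raise ValueError("facet needs a field, a row, or a column")
```

Because the two shapes are distinguished by structure alone, a spec cannot ask for both a wrapped facet and a row or column facet at once, which is the ambiguity this proposal removes.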

I think (not 100% sure yet, though) I would prefer a different name for the encoding channel and a top level operator, i.e. something different from @kanitw's latest code example.

I'm (finally) landing on what I think is a really nice julia syntax for vega-lite, but the latest design iteration I have kind of relies on not having overlaps between the names of encoding channels and top level spec elements.

davidanthoff on 20 May 2018

@davidanthoff If you can give a brief example of what your API looks like, that would be great.

We might take that into consideration. That said, we will have to design what's best for the JSON format here so if the proposal to have facet = wrapped facet seems the best for VL, we may have to do it.

Put `columns` at the top-level?

By the way, for number of columns, I think it might be better to put it at the top-level, at least for the full syntax since (1) it's a property of the facet operator, not the field (2) we will likely want to apply the same idea to a wrapped repeat so the number of columns should be in a consistent place.

{
  // if facet is fieldDef right away, then it's a wrapped facet
  "facet": {"field": "Origin", "type": "nominal"},
  "spec": ...,
  "columns": 4
}

{
  // if repeat is an array right away, then it's a wrapped repeat
  "repeat": ["a", "b", "c"], // can't put columns inside here
  "spec": ...,
  "columns": 4
}

I guess the facet shorthand has two options: (a) have columns in the field def, or (b) have no columns at all (as it's not really a property of the field), in which case users can customize it in a facet config.

Wrapped concat = Grid ?

Finally, thinking about wrapped concat -- that's basically a flow/grid layout. Not sure what's the best term, but columns can be consistently put at the top-level too.

// this will produce  
// spec1  spec2
// spec3  spec4

{
  "concat/grid/flow/wrapped_concat": [spec1, spec2, spec3, spec4],
  "columns": 2  
}

kanitw on 22 May 2018
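
A minimal sketch of what a top-level `columns` value means for a wrapped concat: the list of views is split into rows of at most `columns` entries, in order. The helper name `wrap_into_rows` is hypothetical, and the strings stand in for real specs.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def wrap_into_rows(views: Sequence[T], columns: int) -> List[List[T]]:
    """Split views into consecutive rows holding at most `columns` views each."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return [list(views[i:i + columns]) for i in range(0, len(views), columns)]


# Placeholder names standing in for the four specs in the example above.
grid = wrap_into_rows(["spec1", "spec2", "spec3", "spec4"], columns=2)
for row in grid:
    print("  ".join(row))
# prints:
# spec1  spec2
# spec3  spec4
```

The same chunking would apply to a wrapped facet or a wrapped repeat, which is the argument above for keeping `columns` in one consistent place.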

> If you can give a brief example of what your API looks like, that would be great.

Sure. Nothing is final, at this point, but this is where I'm going right now.

In general you pipe data into a spec. And the most verbose version of doing a spec resembles the JSON closely:

data |> @vlplot(
    mark=:bar,
    encoding={
        x={field=:field_a},
        y={field=:field_b}
    }
)

And then I provide a number of shortcuts that make common cases easier. I have altair's shorthand, the ability to use enc instead of encoding, and then for the mark type allow specifying the type positionally without the mark= keyword. So that gets this down to:

data |> @vlplot(:bar,enc={x="field_a:q",y="field_b:q"})

I'd really like to also get rid of the nesting caused by enc=. In general I feel the hierarchical stuff is really a nice fit for the JSON format, but a real pain at the REPL for interactive work. Just keeping track of how many parentheses you need to close etc. is not a good experience at the REPL if I want to try something quickly. So my latest version also allows one to just skip the whole enc={} step. So then it looks like this:

data |> @vlplot(:bar, x="field_a:q", y="field_b:q")

I think that is pretty much ideal for the simple and common cases at the REPL where I might just want to quickly explore a dataset.

But this last step (optionally leaving out the enc={}) really only works if there are no name overlaps between top level elements and things that one can have inside enc={}... So I'm a bit nervous about that option because it would mean that if you guys introduce a name conflict in a new version of vega-lite, my design would kind of break :)

You can look at many more examples here, here, here and here. All the examples are copied from either the vega-examples or the altair examples, and I've always pasted the original above the julia syntax so that one can easily compare. These notebooks don't yet use all of the simplifications I described in this comment here.

davidanthoff on 23 May 2018
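
A minimal sketch of the name-collision concern raised above: once encoding channels can be written directly at the top level, any name that is both a top-level property and a channel becomes ambiguous. The two name sets are partial lists taken from this thread, not the full Vega-Lite schema, and `ambiguous_names` is a hypothetical helper.

```python
from typing import Iterable, List

# Partial, illustrative lists built only from names mentioned in this thread.
TOP_LEVEL = {"data", "mark", "encoding", "facet", "repeat", "spec", "columns"}
CHANNELS = {"x", "y", "row", "column"}


def ambiguous_names(top_level: Iterable[str], channels: Iterable[str]) -> List[str]:
    """Return names that could mean either a top-level property or a channel."""
    return sorted(set(top_level) & set(channels))


# With the names above there is no overlap.
print(ambiguous_names(TOP_LEVEL, CHANNELS))  # prints []

# Adding a "facet" encoding channel collides with the top-level "facet".
print(ambiguous_names(TOP_LEVEL, CHANNELS | {"facet"}))  # prints ['facet']
```

A differently named channel, such as the `wrap` or `facet_wrap` suggestions in this thread, would keep the two sets disjoint.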

@davidanthoff I see. I think you should distinguish between a REPL method for a single view spec (@vlplot) and methods for view composition like concat / facet, as they are different kinds of methods anyway. For example, if you have (@vlfacet) for a faceted spec, then facet would be the facet property of the facet spec in @vlfacet. In @vlplot, facet would then be the shortcut facet encoding.

kanitw on 23 May 2018

Let me think about that. Right now the view composition is done mostly via various julia operator overloads (I don't have examples for all of those options up yet), which ends up being quite intuitive for the most part.

davidanthoff on 23 May 2018

If collisions are a concern, another option is to add a prefix to the encodings: enc_color, enc_x, ... etc. However, I agree with @kanitw that composition should be an operator and not an argument.

domoritz on 23 May 2018

@jheer proposes another alternative which is using row and column to indicate the orientation for the wrapped facet. Basically, we can get something like

data: ...
facet: {
   column: {field: 'category', type: 'nominal'}
},
columns: 5, // there will be 5 columns per row
spec: ...

data: ...
facet: {
   row: {field: 'category', type: 'nominal'}
},
rows: 5, // there will be 5 rows per column
spec: ...

If both facet.row and facet.column are specified, columns or rows will be ignored and there will be no wrapping.

This syntax is nice in the sense that we preserve the existing syntax and only need to extend the facet spec with rows/columns.

(Note that we can support rows by doing some math.)

That said, writing this spec down, I start to feel that I still like the one above that just simply has facet: {field: ..., ...} more as the row/column channels feel a bit redundant here. It also seems to be more clear if we have a flow direction parameter (to say whether it is horizontal or vertical).

I also don't think that the MVP of this needs to support wrapped facet with vertical flow.

kanitw on 26 May 2018

👍1
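
A minimal sketch of the "some math" mentioned above, assuming the underlying layout only accepts a number of columns: a requested maximum number of rows can be turned into a column count once the number of facet values is known. The helper name `columns_for_rows` is hypothetical, and the sketch only covers the column count, not the vertical flow order of the cells.

```python
import math


def columns_for_rows(n_facets: int, rows: int) -> int:
    """Smallest column count that fits n_facets cells into at most `rows` rows."""
    if n_facets < 0:
        raise ValueError("n_facets must not be negative")
    if rows < 1:
        raise ValueError("rows must be at least 1")
    # At least one column, even when there is nothing to lay out.
    return max(1, math.ceil(n_facets / rows))
```

For example, 12 facet values with at most 5 rows need 3 columns, since 2 columns would require 6 rows.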

I like that. But how would you define this when you are giving column or row in encoding?

An alternative would be allowing one of the row/columns to be defined as it is and the other as a value. That would be read as: encode this data value as rows and use 5 columns.

eibanez on 26 May 2018

> An alternative would be allowing one of the row/columns to be defined as it is and the other as a value. That would be read as: encode this data value as rows and use 5 columns.

Not sure what you mean here. Can you elaborate?

kanitw on 26 May 2018

Something like this:

data: ...
facet: {
   column: {field: 'category', type: 'nominal'},
   row: 5 // there will be 5 rows
},
spec: ...

eibanez on 26 May 2018

Or {value: 5}.

This would allow you to write wrapped facets in this "long" or the "short" form (with encoding).

eibanez on 26 May 2018

@eibanez We should not mix the number of columns/rows with the field that the row or column channels encode. You can actually encode things along the columns and limit the number of columns to 5 per row too. Mixing them up introduces an unnecessary limitation.

kanitw on 26 May 2018

> I think you should distinguish between a REPL method for a single view spec (@vlplot) and methods for view composition like concat / facet, as they are different kinds of methods anyway. For example, if you have (@vlfacet) for a faceted spec, then facet would be the facet property of the facet spec in @vlfacet. In @vlplot, facet would then be the shortcut facet encoding.

I thought about it, and I think I kind of like the idea of additional top level things like @vlfacet, but mostly as an additional option. One of the core design principles I followed (at least so far) was that one could take any valid vega-lite JSON spec and essentially translate it one-to-one to a call to @vlplot, by simply doing some very minimal syntax adjustments. I do think that is a valuable design principle because it makes it possible to essentially take pure vega-lite examples and documentation and apply it more or less directly in julia. I don't think I'd like to give up on that design principle for this one case, which suggests that the naming conflict would remain.

> If collisions are a concern, another option is to add a prefix to the encodings: enc_color, enc_x

I think I'd also like to avoid that, it makes every case more verbose, which is something I'm trying to avoid.

So, from my point of view ideally you guys would use something like facet_wrap for the encoding channel. But, I can completely understand if other considerations swamp that. I think in that case I would just introduce facet_wrap (or wrap or something like it) as the encoding channel shorthand in the julia wrapper. That would be pretty simple. So while I'm trying to avoid too many of these deviations from the vega-lite spec, it would be a second best option that would certainly work.

davidanthoff on 29 May 2018

> using row and column to indicate the orientation for the wrapped facet. Basically, we can get something like...

I have to admit that I'm not a fan of having additional things like columns at the top level, I think that pulls things that in my mind belong together too far apart. I would much prefer that for the operator version, everything related to faceting is inside the facet element, i.e. I like the earlier proposal better.

davidanthoff on 29 May 2018

> One of the core design principles I followed (at least so far) was that one could take any valid vega-lite JSON spec and essentially translate it one-to-one to a call to @vlplot, by simply doing some very minimal syntax adjustments.

How does this work with arbitrarily nested specs? For example vconcat(hconcat(spec0, layer(spec5, spec6)), hconcat(layer(spec1, spec2, spec3)), spec4).

domoritz on 29 May 2018

> How does this work with arbitrarily nested specs?

Just write it down as JSON, remove the quotes around key names, replace : with = in key-value pairs, and it should all work. For example:

data |> @vlplot(
    vconcat=[
        {
            mark=:bar,
            encoding={ x=:foo }
        },
        {
            mark=:line,
            encoding={ x=:bar }
        }
    ]
)

There is a more convenient syntax to concat specs, but the raw thing just works as well.

davidanthoff on 29 May 2018

👍1

Another complicated issue is that for row/column facet -- we use headers to display labels for each subplot of the facet.

However, for wrapped facet, we need to use a different mechanism (e.g., title of each subplot).

For this reason, the schema of the wrapped facet's field definition _may_ be different from the row/column facet's field definition, as it wouldn't have "headers". Or we will have to re-design how we specify facet title and labels.

If we do the former, we shouldn't re-use facet row/column syntax for wrapped facet.

kanitw on 6 Jun 2018

@sirahd Do you want to look into this?

domoritz on 3 Oct 2018

I've summarized the proposed syntax in https://github.com/vega/vega-lite/issues/393. So let's close this and continue our discussion there.

kanitw on 15 Jan 2019

The actual issue is https://github.com/vega/vega-lite/issues/4457

domoritz on 23 Jan 2019

👍2 😄1
