When you specify format in a Vega-Lite legend for an ordinal field, the generated Vega spec does not have any format property, as illustrated by the following spec:
{
  "data": {
    "values": [
      {"x": 0}, {"x": 1}, {"x": 2}, {"x": 3}, {"x": 4}, {"x": 5}, {"x": 6}
    ]
  },
  "transform": [{"calculate": "sin(datum.x)", "as": "y"}],
  "mark": "point",
  "encoding": {
    "color": {
      "type": "ordinal",
      "field": "y",
      "legend": {"format": ".2f"},
      "scale": {"scheme": "reds"}
    },
    "x": {"type": "quantitative", "field": "x"},
    "y": {"type": "quantitative", "field": "y"}
  }
}
Format works for quantitative encodings. I'm not sure whether this is really a bug since it's not obvious whether the format should be a number or a date format. The same problem occurs with axes, btw. This is not unique to legends.
Having said this, I think we could apply format as a numberFormat and add a new property timeFormat.
That makes sense. I've noticed, though, that if I manually add the format argument to the compiled Vega spec, it produces the result I would have expected from Vega-Lite.

Is there a reason not to assume the user knows what they want to do, and propagate this argument to the compiled spec?
Vega spec for the above chart:
{
  "$schema": "https://vega.github.io/schema/vega/v4.json",
  "autosize": "pad",
  "padding": 5,
  "width": 200,
  "height": 200,
  "style": "cell",
  "data": [
    {
      "name": "source_0",
      "values": [
        {"x": 0},
        {"x": 1},
        {"x": 2},
        {"x": 3},
        {"x": 4},
        {"x": 5},
        {"x": 6}
      ]
    },
    {
      "name": "data_0",
      "source": "source_0",
      "transform": [
        {"type": "formula", "expr": "toNumber(datum[\"x\"])", "as": "x"},
        {"type": "formula", "expr": "sin(datum.x)", "as": "y"}
      ]
    }
  ],
  "marks": [
    {
      "name": "marks",
      "type": "symbol",
      "style": ["point"],
      "from": {"data": "data_0"},
      "encode": {
        "update": {
          "opacity": {"value": 0.7},
          "fill": [
            {
              "test": "datum[\"x\"] === null || isNaN(datum[\"x\"]) || datum[\"y\"] === null || isNaN(datum[\"y\"])",
              "value": null
            },
            {"value": "transparent"}
          ],
          "stroke": [
            {
              "test": "datum[\"x\"] === null || isNaN(datum[\"x\"]) || datum[\"y\"] === null || isNaN(datum[\"y\"])",
              "value": null
            },
            {"scale": "color", "field": "y"}
          ],
          "tooltip": {
            "signal": "{\"x\": format(datum[\"x\"], \"\"), \"y\": format(datum[\"y\"], \"\")}"
          },
          "x": {"scale": "x", "field": "x"},
          "y": {"scale": "y", "field": "y"}
        }
      }
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": {"data": "data_0", "field": "x"},
      "range": [0, {"signal": "width"}],
      "nice": true,
      "zero": true
    },
    {
      "name": "y",
      "type": "linear",
      "domain": {"data": "data_0", "field": "y"},
      "range": [{"signal": "height"}, 0],
      "nice": true,
      "zero": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "data_0", "field": "y", "sort": true},
      "range": {"scheme": "reds"}
    }
  ],
  "axes": [
    {
      "scale": "x",
      "orient": "bottom",
      "grid": false,
      "title": "x",
      "labelFlush": true,
      "labelOverlap": true,
      "tickCount": {"signal": "ceil(width/40)"},
      "zindex": 1
    },
    {
      "scale": "x",
      "orient": "bottom",
      "gridScale": "y",
      "grid": true,
      "tickCount": {"signal": "ceil(width/40)"},
      "domain": false,
      "labels": false,
      "maxExtent": 0,
      "minExtent": 0,
      "ticks": false,
      "zindex": 0
    },
    {
      "scale": "y",
      "orient": "left",
      "grid": false,
      "title": "y",
      "labelOverlap": true,
      "tickCount": {"signal": "ceil(height/40)"},
      "zindex": 1
    },
    {
      "scale": "y",
      "orient": "left",
      "gridScale": "x",
      "grid": true,
      "tickCount": {"signal": "ceil(height/40)"},
      "domain": false,
      "labels": false,
      "maxExtent": 0,
      "minExtent": 0,
      "ticks": false,
      "zindex": 0
    }
  ],
  "legends": [
    {
      "stroke": "color",
      "title": "y",
      "format": ".2f",
      "encode": {
        "symbols": {
          "update": {
            "fill": {"value": "transparent"},
            "opacity": {"value": 0.7}
          }
        }
      }
    }
  ],
  "config": {"axisY": {"minExtent": 30}}
}
Yeah, I think we should propagate format to be the number format automatically.
I agree that we should fix this bug, though I disagree about the following part:
"Having said this, I think we could apply format as a numberFormat and add a new property timeFormat."
We should not introduce a new property here. Raw time data without a time unit generally does not make sense as ordinal, so in most cases we could apply format as timeFormat when there is a timeUnit for the ordinal field.
To support the rare case, we could add formatType: 'number' | 'time' to allow explicit specification. This is better than adding timeFormat, as format would normally work for an ordinal time field with timeUnit.
Okay, let's do the following: add formatType (it wouldn't be a breaking change), and use format as the number format for ordinal and nominal fields without timeUnit (and as the time format with timeUnit).
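The rule agreed on above can be sketched roughly as follows. This is only an illustration, not Vega-Lite's actual implementation: the names FieldDefSketch and resolveFormatType are hypothetical, and treating temporal fields as time-formatted is an assumption that the thread does not discuss.

```typescript
// Hypothetical sketch of the agreed rule; all names here are illustrative.
type FieldType = 'quantitative' | 'temporal' | 'ordinal' | 'nominal';
type FormatType = 'number' | 'time';

interface FieldDefSketch {
  type: FieldType;
  timeUnit?: string;       // e.g. 'year' or 'month'
  format?: string;         // e.g. '.2f' or '%Y'
  formatType?: FormatType; // proposed explicit override
}

// Decide whether `format` should be read as a number format or a time format.
function resolveFormatType(fieldDef: FieldDefSketch): FormatType {
  // An explicit formatType always wins (covers the rare raw-time ordinal case).
  if (fieldDef.formatType !== undefined) {
    return fieldDef.formatType;
  }
  // A field with a timeUnit is formatted as time, even if ordinal or nominal.
  if (fieldDef.timeUnit !== undefined) {
    return 'time';
  }
  // Assumption (not stated in the thread): temporal fields use a time format.
  if (fieldDef.type === 'temporal') {
    return 'time';
  }
  // Everything else, including ordinal and nominal without timeUnit, is a number.
  return 'number';
}

// The color encoding from the spec at the top of this issue.
const colorField: FieldDefSketch = {type: 'ordinal', format: '.2f'};
console.log(resolveFormatType(colorField));
```

Under this sketch, the ordinal color field from the original spec resolves to a number format, so ".2f" would be passed through to the Vega legend unchanged.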