I want to use heatmaps and cluster visualizations to display historic data grouped in different time intervals. I already have my data stored in a series of 1-day groups. At runtime, I want to be able to choose what timeframe to visualize: any single day, or any span of seven or thirty days. Because of my visual design choices, the source data needs to be grouped together for display on a single layer.
Store all possible data in a single source and filter the data in an expression. This always downloads the full dataset, regardless of the timeframe being shown, and puts a potentially large computing cost on the client device.
Manage source aggregation earlier in the pipeline and change the layer source to the desired aggregation at runtime. I opened a separate ticket for enabling this kind of aggregation when creating tilesets.
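For comparison, the first alternative above (one source, filtered in an expression) might look roughly like the sketch below. It assumes each feature carries an ISO-8601 `date` string property; that property name and the `dateRangeFilter` helper are hypothetical and only serve as illustration.

```typescript
// A style-spec filter expression, typed loosely for this sketch.
type FilterExpression = unknown[];

// Build a filter that keeps features whose (hypothetical) "date" property
// falls inside [startDate, endDate]. ISO dates (YYYY-MM-DD) sort
// lexicographically, so plain string comparison is enough.
function dateRangeFilter(startDate: string, endDate: string): FilterExpression {
  return [
    "all",
    [">=", ["get", "date"], startDate],
    ["<=", ["get", "date"], endDate],
  ];
}

// Seven days starting 2020-02-16.
const filter = dateRangeFilter("2020-02-16", "2020-02-22");
```

The result could be applied with `map.setFilter(layerId, filter)`, but the client has still downloaded every feature in the tile, which is the cost described above.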
Ideally, source aggregation would be supported at multiple points in the application lifecycle to allow for different use cases. Currently application developers must develop their own aggregation scheme and provide a single source when they interface with Mapbox tools.
In the layer style specification, add a sources key that accepts an array of sources. If an array of sources is provided, a singular source and source-layer should not be provided.
Library developers will support a new variant of the Layer type that contains multiple sources.
interface Source {
  id: string; // Name of a source description to be used in this layer
  layer?: string; // Layer to use from a vector tile source
}
interface LayerBase {
  // …
  // existing style spec for layers omitting source, source-layer, and sources
}
interface LayerWithSingleSource extends LayerBase {
  source: string;
  "source-layer"?: string;
}
interface LayerWithMultipleSources extends LayerBase {
  sources: Source[];
}
type Layer = LayerWithSingleSource | LayerWithMultipleSources;
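One way a library could consume both variants is to normalize them to a list of sources up front, so the rest of the code only deals with one shape. The sketch below repeats the proposed types so it is self-contained; the `id` and `type` fields on `LayerBase` stand in for the elided layer properties, and `layerSources` is a hypothetical helper name.

```typescript
interface Source {
  id: string;
  layer?: string;
}
interface LayerBase {
  id: string;   // stand-in for the rest of the existing layer spec
  type: string;
}
interface LayerWithSingleSource extends LayerBase {
  source: string;
  "source-layer"?: string;
}
interface LayerWithMultipleSources extends LayerBase {
  sources: Source[];
}
type Layer = LayerWithSingleSource | LayerWithMultipleSources;

// Return the sources a layer draws from, whichever variant it uses.
function layerSources(layer: Layer): Source[] {
  if ("sources" in layer) {
    return layer.sources;
  }
  const sourceLayer = layer["source-layer"];
  return [
    sourceLayer === undefined
      ? { id: layer.source }
      : { id: layer.source, layer: sourceLayer },
  ];
}
```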
Application developers can add an array of sources instead of a single source+source-layer combo.
{
  type: "heatmap",
  sources: [
    { id: "vehicles-2020-02-16", layer: "vehicles" },
    { id: "vehicles-2020-02-17", layer: "vehicles" },
    { id: "vehicles-2020-02-18", layer: "vehicles" }
  ]
}
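For the one-, seven-, and thirty-day spans described at the top of this issue, an application would probably generate the array rather than write it by hand. A minimal sketch, assuming one source per day named `<prefix>-<YYYY-MM-DD>` as in the example above (the `dailySources` helper and its defaults are hypothetical):

```typescript
interface Source {
  id: string;
  layer?: string;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Build one source reference per day, starting at startDate (YYYY-MM-DD).
// Dates are handled in UTC so daylight-saving shifts cannot skip or repeat a day.
function dailySources(
  startDate: string,
  days: number,
  prefix: string = "vehicles",
  layer: string = "vehicles"
): Source[] {
  const start = new Date(`${startDate}T00:00:00Z`).getTime();
  const sources: Source[] = [];
  for (let i = 0; i < days; i++) {
    const iso = new Date(start + i * MS_PER_DAY).toISOString().slice(0, 10);
    sources.push({ id: `${prefix}-${iso}`, layer });
  }
  return sources;
}

// The three-day layer from the example above.
const layer = { type: "heatmap", sources: dailySources("2020-02-16", 3) };
```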
New concept: you can have multiple sources for a single visual layer.
The sources should have compatible feature types for the style layer using them (e.g. all lines or polygons for a line layer).
In the simplest implementation, an AggregateSource acts as a facade to a collection of underlying sources. When data is requested from the source, it passes the request to all sources in the collection, aggregates their responses into a single response object, and provides that aggregate response to the code requesting data. Styles and other consumers of sources should not need to know anything about the source or how it works.
Pseudocode ignoring asynchrony and other library details:
class AggregateSource {
  sources: Source[];
  loadTile(tile: Tile): Data[] {
    return this.sources.reduce((aggregate, source) => {
      return aggregate.concat(source.loadTile(tile));
    }, []);
  }
}
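If asynchrony is taken into account, the same facade could wait on all underlying sources in parallel. The following is only a sketch: `Tile`, `Data`, and `TileSource` are placeholder types invented for illustration, not the library's internal interfaces.

```typescript
// Placeholder types; the real library types would differ.
type Tile = { z: number; x: number; y: number };
type Data = unknown; // stands in for decoded tile features

interface TileSource {
  loadTile(tile: Tile): Promise<Data[]>;
}

class AsyncAggregateSource implements TileSource {
  private readonly sources: TileSource[];

  constructor(sources: TileSource[]) {
    this.sources = sources;
  }

  // Request the tile from every underlying source in parallel and
  // concatenate the results in source order.
  async loadTile(tile: Tile): Promise<Data[]> {
    const results = await Promise.all(
      this.sources.map((source) => source.loadTile(tile))
    );
    return ([] as Data[]).concat(...results);
  }
}
```

Note that `Promise.all` rejects as soon as any one source fails. Whether an aggregate tile should fail as a whole or render with partial data is a design decision this sketch leaves open.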
Based on your proposal here, the client would still need to download all sources at once. You mentioned sources being split into time periods, but wouldn't you then need to pass a time filter, which gets more complicated?
A possible alternative could be Map#setLayerSource proposed in #6895 so you can update a Layer source without removing and re-adding it.
Previous discussion about this can be found in https://github.com/mapbox/mapbox-gl-js/issues/4362. The heatmap aggregation use-case is new since the previous issue and it's one that this proposal would serve very well, but all layer types would benefit from multiple sources per layer.
@andrewharvey I'm not sure why you think downloading all sources at once is required. If you can combine sources and you want to grab a 7-day period out of your 365 days of data, you only need to download the 7 days of data. If the year was stored in a single tileset layer and then filtered, you would need to download 52 times as much data as you wanted to see. Does that clarify what I mean by avoiding downloading all the data?
Closing this here and moving the conversation and suggestion to #4362 to focus the discussion there.
> @andrewharvey I'm not sure why you think downloading all sources at once is required. If you can combine sources and you want to grab a 7-day period out of your 365 days of data, you only need to download the 7 days of data. If the year was stored in a single tileset layer and then filtered, you would need to download 52 times as much data as you wanted to see. Does that clarify what I mean by avoiding downloading all the data?
But in your proposal here, the source doesn't know which dates it covers, and the layer has no way of telling the source "only download sources which cover this date range".
In the past I've taken advantage of Mapbox's tileset compositing capabilities: upload, say, 3 days' worth of data to each tileset, then to request 6 days use a single source that references tilesetA,tilesetB.
That way the definition of the date range and tracking tileset IDs etc is completely controlled by your application.
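A rough sketch of that compositing approach: the application picks the tileset IDs for the requested range and joins them into one composited `mapbox://` URL. The tileset IDs and the `compositeSource` helper below are hypothetical.

```typescript
interface VectorSourceSpec {
  type: "vector";
  url: string;
}

// Build a single vector source that composites several tilesets by
// joining their IDs with commas in the mapbox:// URL.
function compositeSource(tilesetIds: string[]): VectorSourceSpec {
  if (tilesetIds.length === 0) {
    throw new Error("at least one tileset ID is required");
  }
  return { type: "vector", url: `mapbox://${tilesetIds.join(",")}` };
}

// Hypothetical tileset IDs, each holding three days of data.
const sixDays = compositeSource([
  "example.vehicles-2020-02-16",
  "example.vehicles-2020-02-19",
]);
```

The resulting object could be passed to `map.addSource`. Mapping a date range to tileset IDs stays entirely in application code.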