There was some earlier discussion about allowing for authoring the chapter content using markdown. Would this still be something worth investigating as a spike task?
The benefits are that it seems easier to write in markdown than html, and that gitlocalize seems to have better support for markdown.
The challenges are that we will likely want some rich data visualisation in the chapters, which may not be simple to achieve using markdown.
A suggestion I'd like to table is a hybrid approach. Would it be feasible to use custom elements inside the markdown, and then put the dataviz content inside the custom element using slots?
Example:
# Chapter Eleventeen
Some meaningful textual content.
<pie-chart data="data.json">
<data value="25">Part of the pie chart</data>
<data value="75">The rest of the pie chart</data>
</pie-chart>
That way we have content that can be indexed and crawled, can fall back to a div if custom-elements aren't supported, but lets us write some more interactive visualisations using javascript as needed.
It's just an idea, not bound to it. :)
Thanks for kicking this off @mikegeyser!
> lets us write some more interactive visualisations using javascript as needed
Does this mean that all visualizations are effectively client-side rendered? I think it'd be optimal to generate the SVG charts statically and embed that in the HTML. AFAICT embedding raw HTML in the markdown is what your proposal already is, so is that all we need to do?
> is that all we need to do?
Absolutely! That's what I'm suggesting. If something is fine as a static SVG element, then we should just put it in the markdown directly.
I'm not suggesting that we client side render everything. The custom element would just be if (and only if) there was a compelling UX requirement to have the visualisation be interactive.
In that case, in my opinion, the ideal would be to have the slotted item contain the SVG and then progressively enhance it with javascript. If there's no requirement for interactivity, then the whole custom-element argument is irrelevant.
<drill-down-pie-chart data="data.json">
<svg>
<!-- default pie chart -->
</svg>
</drill-down-pie-chart>
(Apologies for the contrived example.)
Also, if we're worried about embedding large SVGs in the markdown file, we could possibly encourage people to reference them using <object data=""></object>?
> <drill-down-pie-chart data="data.json">
I'm confused by this. So the pie chart _would_ be client-side rendered? After the static SVG loads?
No, so my thinking is that the page would simply render the SVG chart on initial page load.
If the user chooses to drill down, the custom element (which should be lazily initialised, and until then would just be a div) would have enough context to be able to redraw the SVG chart from the datasource (again, by fetching the data).
(Please feel free to stop me if it's a terrible idea. As I said, I'm not bound to it. I'm just thinking of ways that we can cater for more interactive visualisations as a hypothetical.)
Gotcha, thanks for clarifying.
So it seems like markdown itself isn't a limitation here; we can put arbitrary HTML in it and it should render normally. Is that right? (if so let's rename this issue to something more specific to data viz)
To the issue of how we implement the data viz, I think we're on the same page about optimizing for speed and rendering server-side.
For interactivity, it seems like we've got different ideas.
> enough context to be able to redraw the SVG chart from the datasource (again, by fetching the data)
This is the part I'd like to explicitly avoid. So far, having the raw data lying around is not part of my mental model. My thought is that we would run the queries and generate the corresponding SVG. That graphic would be the only artifact of the response data. Besides that, I'm wary of the maintenance of having two different processes for generating the charts: static and dynamic.
My alternate proposal would be to make the entire thing as static as possible. We generate SVG for each chart and embed it into the markdown at "compile" time so that the end result is flat HTML with all the content/viz baked in. We can layer in interactivity with JS event listeners and CSS classes/animations. Do you see any downsides to this approach?
The "compile" time thing could literally be a build step where we inject prebuilt SVG into placeholders within the markdown. Or we can manually paste it into the markdown. I kind of like the first option because authors can use the placeholders as they write, but it's a bit more eng work. While we're preprocessing the markdown we might as well also render it to HTML statically at "compile" time rather than run time like I had originally planned. WDYT?
> This is the part I'd like to explicitly avoid.
Gotcha, 100%.
> inject prebuilt SVG into placeholders within the markdown
Would generating them as separate files and then referencing them using <object data=""></object> within markdown be feasible? That way the markdown and svg are loosely coupled, and the SVG resources will be independently versioned and cacheable? It would mean multiple requests for assets, though, instead of one larger initial payload.
Update: Wait, if we use object then I don't think we can style them via CSS. Which may or may not be a big deal, but is a definite reason to favour embedding.
If that's true, for that reason alone we should avoid <object> because I do think we want to style via CSS.
The compile time thing I mentioned would encourage the loose coupling you described. We would have something like a directory of SVG files and in the markdown we'd have placeholder references like {{13.6.svg}} where 13.6 is the metric ID (6th metric of the 13th chapter):
ID | Part | Chapter | Section | Metric description
-- | -- | -- | -- | --
13.6 | III. Content Publishing | 13. Ecommerce | Stats for sites that appear to be e-commerce sites | Images: quantity, format, byte size, pixel dimensions, etc.
We'd need to write a script that scans through the markdown, discovers the placeholders, and replaces them with the corresponding SVG file contents. We may also need additional affordances for things like layout preferences (inline, left, full width, etc) to give developers/designers a bit more control.
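A minimal sketch of that replacement step, assuming placeholders always look like {{13.6.svg}} and that each SVG sits in one directory under the name <metric_id>.svg. The function name inject_svgs and the svg_dir argument are hypothetical, not existing project code:

```python
import re
from pathlib import Path

# Matches placeholders such as {{13.6.svg}}; group 1 is the metric ID ("13.6").
PLACEHOLDER = re.compile(r"\{\{(\d+\.\d+)\.svg\}\}")


def inject_svgs(markdown: str, svg_dir: Path) -> str:
    """Replace each {{<metric_id>.svg}} placeholder with that SVG file's contents."""

    def replace(match: re.Match) -> str:
        svg_path = svg_dir / f"{match.group(1)}.svg"
        if not svg_path.exists():
            # Leave unresolved placeholders untouched so they are easy to spot.
            return match.group(0)
        return svg_path.read_text(encoding="utf-8").strip()

    return PLACEHOLDER.sub(replace, markdown)
```

Layout preferences (inline, left, full width) are not handled here; they would need an extended placeholder syntax that we haven't agreed on yet.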
Sorry I've been quiet on this (work has been heavy), going to start working on it now.
There's some overlap with #70, so I'm going to try to keep the implementation in line with the proposed config.
@mikegeyser @HTTPArchive/developers WDYT about this open question from earlier:
> While we're preprocessing the markdown we might as well also render it to HTML statically at "compile" time rather than run time
So chapter contents are written and maintained in markdown. When we build the website locally we run a script that renders markdown into HTML (including SVG graphics and metadata). The auto-generated HTML for each chapter can be saved under src/templates/en/2019/chapters/. The script could make use of the chapter config from #70 to generate the metadata for the HTML.
Here's how the process might work:
1. English chapter content: src/content/en/<year>/<chapter>.md
2. Translated chapter content: src/content/<lang>/<year>/<chapter>.md
3. SVG graphics: src/static/images/<year>/<chapter>/<metric_id>.svg
4. Run $ python src/generate_markup.py <lang> <year>
   - Take lang and year as input (or iterate over all chapters in src/content/<lang>/<year>)
   - Load the chapter config from src/config/<year>/chapters.json
   - For each chapter in src/content/<year>/<chapter>.md, load its markdown file (or check git status to see which chapter contents are new/changed)
   - Embed the SVG graphics from src/static/images/<year>/<chapter>/<metric_id>.svg
   - Write the rendered HTML to src/templates/<lang>/<year>/chapters/<chapter>.html

I think that addresses a few different concerns about memory management of markdown files, translation of chapter metadata, and rendering SVG.
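The process above could be sketched roughly as follows. This is an illustration under assumptions, not the actual generate_markup.py: the function signature is hypothetical, the markdown renderer is passed in as a callable so no particular library is assumed, and the chapter config step is left out because its format depends on #70.

```python
import re
from pathlib import Path
from typing import Callable, List

# Placeholders such as {{13.6.svg}}; group 1 is the metric ID.
PLACEHOLDER = re.compile(r"\{\{(\d+\.\d+)\.svg\}\}")


def generate_markup(src: Path, lang: str, year: str,
                    render: Callable[[str], str]) -> List[Path]:
    """Render every chapter's markdown to HTML and return the files written."""
    content_dir = src / "content" / lang / year
    out_dir = src / "templates" / lang / year / "chapters"
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for md_path in sorted(content_dir.glob("*.md")):
        chapter = md_path.stem
        svg_dir = src / "static" / "images" / year / chapter

        def inline_svg(match, svg_dir=svg_dir):
            svg_path = svg_dir / f"{match.group(1)}.svg"
            # Keep the placeholder when the graphic is missing.
            if not svg_path.exists():
                return match.group(0)
            return svg_path.read_text(encoding="utf-8").strip()

        # Embed SVG first, then render the combined markdown to HTML.
        markdown = PLACEHOLDER.sub(inline_svg, md_path.read_text(encoding="utf-8"))
        out_path = out_dir / f"{chapter}.html"
        out_path.write_text(render(markdown), encoding="utf-8")
        written.append(out_path)
    return written
```

The render argument could be any markdown-to-HTML function, for example markdown.markdown from the Python-Markdown package if we pick that library.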
I think that's a much better idea, and resolves a bunch of my concerns. I'll get on that.
My only question (and I'm sorry if this was discussed in another issue) is why the translation is being done against the HTML instead of the Markdown? If we change the generated template, and rerender the HTML files - wouldn't that overwrite the translation? It's much of a muchness, but couldn't we then translate the markdown files as a part of the workflow, and then generate all the templates?
You're totally right. Per https://github.com/HTTPArchive/almanac.httparchive.org/issues/35#issuecomment-504005117 I suggested we use gitlocalize to translate the markdown. I think that would be preferred for translators. I'll update the process in my previous comment accordingly.
Edit: After updating it occurred to me that the chapter metadata would be added after translation, so it might need to be manually translated in the templates.