Almanac.httparchive.org: Extra paragraph tags in figures

Created on 10 Nov 2019 · 3 Comments · Source: HTTPArchive/almanac.httparchive.org

(I know @mikegeyser said he would look today but raising so we don't forget).

As discussed in https://github.com/HTTPArchive/almanac.httparchive.org/pull/394#discussion_r344455485, generating the chapters with npm run generate adds extra <p></p> lines to both <img> and <table> figures:

table:

            </table>
          </div>
        </div>
        <p></p>
        <figcaption>Figure 4. HTTP version usage for home pages.</figcaption>
        <p></p>
      </figure>

img:

        <figcaption>Figure 9. TCP connections per page. (Source: <a href="https://httparchive.org/reports/state-of-the-web#tcp">HTTP Archive</a>)</figcaption>
        <p></p>
      </figure>

The resulting HTML fails validation.

At least some of them appear to come from the wrap_tables call: commenting it out makes the issue go away.

A simple fix is to add a regex replace in generate_chapters.js to remove these spurious tags:

  body = generate_figure_ids(body);
  body = wrap_tables(body);
  // Interim fix: strip the empty paragraphs left behind by the steps above.
  body = body.replace(/<p><\/p>/g, "");
  const toc = generate_table_of_contents(body);
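
To make the effect of that replace easier to see in isolation, here is a minimal sketch. The helper name stripEmptyParagraphs and the sample markup are hypothetical and not part of generate_chapters.js. The pattern is also slightly broader than the one above, on the assumption that whitespace-only paragraphs should go too.

```javascript
// Hypothetical helper for illustration only; not part of generate_chapters.js.
// Removes empty paragraph tags, including ones that contain only whitespace.
// Assumption: paragraphs with attributes (e.g. <p class="x"></p>) are left alone.
function stripEmptyParagraphs(html) {
  return html.replace(/<p>\s*<\/p>/g, "");
}

// Sample markup modelled on the figure output shown earlier in this issue.
const sample = [
  "<figure>",
  "<p></p>",
  "<figcaption>Figure 4. HTTP version usage for home pages.</figcaption>",
  "<p></p>",
  "</figure>",
].join("\n");

console.log(stripEmptyParagraphs(sample));
```

This only removes the tags themselves, so the blank lines they sat on remain in the output. Paragraphs with real content are untouched.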

I'll see if @mikegeyser has a better fix that prevents them from appearing in the first place before we go this route.

development enhancement

bazzadp

👍2

All 3 comments

As @bazzadp pointed out, I think it's the wrap_tables functionality. That uses JSDOM rather than regex, so it relies on serializing the jsdom document back to a string. We had some uncontrollable behaviour in generate_figure_ids while using that approach, and eventually abandoned it in favour of regex. I think this is probably a similar situation, which is why the problem disappears when you comment out wrap_tables.

I'll carry on looking into it, though, and see if there's an expedient fix.

mikegeyser on 10 Nov 2019

Actually, I think we should put in the fix that @bazzadp recommended while I keep working on a proper solution for the next release.

mikegeyser on 10 Nov 2019

👍2

#415 was an interim fix, but I think we can close this since there's no other immediate action.

rviscomi on 10 Nov 2019

👍1
