Almanac.httparchive.org: Extra paragraph tags in figures

Created on 10 Nov 2019 · 3 Comments · Source: HTTPArchive/almanac.httparchive.org

(I know @mikegeyser said he would look at this today, but I'm raising it so we don't forget.)

As discussed in https://github.com/HTTPArchive/almanac.httparchive.org/pull/394#discussion_r344455485, generating the chapters with npm run generate leaves extra <p></p> lines in both <img> and <table> figures:

table:

            </table>
          </div>
        </div>
        <p></p>
        <figcaption>Figure 4. HTTP version usage for home pages.</figcaption>
        <p></p>
      </figure>

img:

        <figcaption>Figure 9. TCP connections per page. (Source: <a href="https://httparchive.org/reports/state-of-the-web#tcp">HTTP Archive</a>)</figcaption>
        <p></p>
      </figure>

The generated HTML fails validation because of these empty paragraphs.

At least some of them appear to come from the wrap_tables call: commenting it out makes the issue go away.

A simple fix is to add a regex replace in generate_chapters.js to remove these spurious tags:

  body = generate_figure_ids(body);
  body = wrap_tables(body);
  body = body.replace(/<p><\/p>/g,"");
  const toc = generate_table_of_contents(body);
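
To make the idea concrete, here is a minimal standalone sketch of that replace. The helper name removeEmptyParagraphs and the sample markup are assumptions for illustration only and are not part of generate_chapters.js. It also matches paragraphs that contain only whitespace, which is slightly broader than the one-liner above.

```javascript
// Hypothetical helper (the name is an assumption, not the project's code).
// Strips <p></p> pairs, including ones that hold only whitespace.
const removeEmptyParagraphs = (html) => html.replace(/<p>\s*<\/p>/g, "");

// Sample markup modelled on the img figure shown in this issue.
const sample = [
  "<figure>",
  "  <figcaption>Figure 9. TCP connections per page.</figcaption>",
  "  <p></p>",
  "</figure>",
].join("\n");

// Prints the figure with the empty paragraph tags removed.
console.log(removeEmptyParagraphs(sample));
```

Paragraphs with real content are left alone, but an empty paragraph that carries attributes (for example a class) would not be matched by this pattern.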

I'll see if @mikegeyser has a better fix that prevents them from appearing in the first place before we go this route.

development enhancement

All 3 comments

As @bazzadp pointed out, I think it's the wrap_tables functionality. It uses JSDOM rather than regex, which means serializing the jsdom document back to a string. We saw some uncontrollable behaviour in the generate_figure_ids step while using that approach, and eventually abandoned it in favour of regex. I think this is probably a similar situation, which is why the problem disappears when you comment out wrap_tables.

I'll carry on looking into it, though, and see if there's an expedient fix.
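
For comparison, a regex-only version of the table wrapping might look roughly like the sketch below. This is not the project's wrap_tables implementation: the function name wrapTablesWithRegex and both class names are assumptions. It only illustrates the string-based approach, which never parses and re-serializes the document.

```javascript
// Hypothetical regex-based alternative to a JSDOM wrap_tables step.
// The function name and class names are assumptions, not the real markup.
const wrapTablesWithRegex = (html) =>
  html.replace(
    // Lazily match from an opening <table ...> to the next </table>.
    /<table\b[\s\S]*?<\/table>/g,
    (table) =>
      `<div class="table-wrap"><div class="table-wrap-container">${table}</div></div>`
  );

const input =
  "<figure><table><tr><td>HTTP/2</td></tr></table>" +
  "<figcaption>Figure 4.</figcaption></figure>";

// Prints the figure with the table wrapped in two divs.
console.log(wrapTablesWithRegex(input));
```

Because the match is lazy and purely textual, nested tables would be wrapped incorrectly, so this only suits content where tables are never nested.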

Actually, I think we should put in the fix that @bazzadp recommended while I keep working on a proper solution for the next release.

#415 was an interim fix, but I think we can close this since there's no other immediate action.
