Almanac.httparchive.org: Broken external links

Created on 8 Nov 2019  ·  9 Comments  ·  Source: HTTPArchive/almanac.httparchive.org

The following external links are broken.

In the Mobile Web chapter:
https://almanac.httparchive.org/en/2019/%5D(https://gs.statcounter.com/)

In the Ecommerce chapter:
https://almanac.httparchive.org/en/2019/amazon.com
https://almanac.httparchive.org/en/2019/ebay.com

In the Third Parties and HTTP chapters:

https://almanac.httparchive.org/en/2019/security [This is OK since the chapters are not yet published, but it would be better to return a 404 rather than an HTTP 500.]

In the HTTP/2 chapter:

https://almanac.httparchive.org/en/2019/cdn [This is OK since the chapters are not yet published, but it would be better to return a 404 rather than an HTTP 500.]

In the JavaScript chapter:

https://almanac.httparchive.org/methodology
https://almanac.httparchive.org/compression
[This is OK since the chapters are not yet published, but it would be better to return a 404 rather than an HTTP 500.]

bug development good first issue

All 9 comments

Great catches. All external links must be prefixed with http[s]:// otherwise the browser assumes they're internal. We should add this prefix to all external links.
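To illustrate why the missing prefix produces the broken URLs listed above, here is a minimal sketch that uses Python's `urllib.parse.urljoin` to mimic how a browser resolves an `href` against the current page. The page URL is taken from the report; using `urljoin` as a stand-in for the browser's resolution is my assumption, not something the site does itself.

```python
from urllib.parse import urljoin

# Page on which the links appear (the Ecommerce chapter from the report).
page = "https://almanac.httparchive.org/en/2019/ecommerce"

# Without a scheme, the href is treated as a path relative to the page.
broken = urljoin(page, "amazon.com")

# With the scheme, the href is an absolute URL and is left alone.
fixed = urljoin(page, "https://amazon.com")

print(broken)  # https://almanac.httparchive.org/en/2019/amazon.com
print(fixed)   # https://amazon.com
```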

Internal links to other chapters should be prefixed with ./ (current subdirectory) instead of / (root directory). This allows language/year preferences to be preserved without having to use template code to generate the URL (url_for).
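As a rough sketch of the difference between the two prefixes, the same `urljoin` approach shows where each form of link ends up. The `ja` page URL is a hypothetical example of another language; only the `en` URLs appear in this issue.

```python
from urllib.parse import urljoin

en_page = "https://almanac.httparchive.org/en/2019/javascript"
ja_page = "https://almanac.httparchive.org/ja/2019/javascript"  # hypothetical language

# A leading "/" resolves from the site root and drops language and year.
root_relative = urljoin(en_page, "/methodology")

# A leading "./" resolves from the current directory and keeps them.
en_relative = urljoin(en_page, "./methodology")
ja_relative = urljoin(ja_page, "./methodology")

print(root_relative)  # https://almanac.httparchive.org/methodology
print(en_relative)    # https://almanac.httparchive.org/en/2019/methodology
print(ja_relative)    # https://almanac.httparchive.org/ja/2019/methodology
```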

The Mobile Web chapter needs the http[s]:// prefix.

The Third Parties broken links are actually correct, but the chapters just don't exist yet. So no changes needed.

Which chapter links to CDN? It can't be the CDN chapter itself because that one doesn't exist.

The JS chapter needs to be edited to use ./.

Should we raise a separate issue to stop responding with a 500 when a chapter doesn't exist?

I'm thinking especially of the case where not all chapters make the launch, and of translations that aren't available yet (or will we only launch each language when it's available in full?).

I think we can catch this special case on the server when we try to render a chapter that is marked TODO and return a custom error message like "This chapter isn't available yet, come back tomorrow" or something.

OK, but you'll also get a 500 for non-existent chapters and typos, e.g. /markupp. Those should be 404s if possible.

Yes, we've had a long-outstanding TODO to validate the chapter. We can use this issue to track that:

https://github.com/HTTPArchive/almanac.httparchive.org/blob/363c818ced353683f977e5efaaf9cc6a4f106699/src/main.py#L95-L99

And on that note if there are any external links going to misspelled chapters (seen via analytics), let's permanently redirect those to the appropriate chapter rather than show an error page every time. If the link is internal we should fix it.
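A minimal, framework-free sketch of the behaviour discussed so far might look like the following. The chapter sets, the redirect map, and the `resolve_chapter` helper are all hypothetical names for illustration; this is not the code in `src/main.py`.

```python
# Hypothetical data for illustration; the real chapter list lives elsewhere.
PUBLISHED = {"javascript", "markup", "http2"}
TODO = {"cdn", "security"}
REDIRECTS = {"markupp": "markup"}  # misspellings seen via analytics


def resolve_chapter(chapter):
    """Return (status, detail) for a requested chapter slug."""
    slug = chapter.lower()
    if slug in PUBLISHED:
        return 200, slug
    if slug in REDIRECTS:
        # Permanent redirect to the correctly spelled chapter.
        return 301, REDIRECTS[slug]
    if slug in TODO:
        # Known but unpublished chapter: 404 with a friendlier message.
        return 404, "This chapter isn't available yet."
    # Anything else is an unknown chapter: 404 instead of a 500.
    return 404, "Chapter not found."


print(resolve_chapter("markupp"))  # (301, 'markup')
print(resolve_chapter("cdn"))      # (404, "This chapter isn't available yet.")
```

In a real route handler the tuple would be turned into the framework's own redirect or error response.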

> Which chapter links to CDN? It can't be the CDN chapter itself because that one doesn't exist.

It's linked in https://almanac.httparchive.org/en/2019/http2 @rviscomi

Ok thanks. Edited your comment to clarify.

> Yes, we've had a long-outstanding TODO to validate the chapter. We can use this issue to track that:
>
> https://github.com/HTTPArchive/almanac.httparchive.org/blob/363c818ced353683f977e5efaaf9cc6a4f106699/src/main.py#L95-L99
>
> And on that note if there are any external links going to misspelled chapters (seen via analytics), let's permanently redirect those to the appropriate chapter rather than show an error page every time. If the link is internal we should fix it.

I've made a start on that here: https://github.com/HTTPArchive/almanac.httparchive.org/tree/chapter_url_handling

Adds the following:

  • validates chapters and returns a 404, rather than a 500, when an invalid chapter is requested
  • adds a redirect list that lets us remap URLs

To Do:

  • The list of chapters is currently hardcoded. It should be loaded from JSON and listed by year and language. I couldn't get this working despite trying for ages (any help appreciated!).
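One possible shape for the JSON-driven lookup, offered as a sketch only: the config schema, the inline `CONFIG_JSON` string (standing in for a file on disk), and the `load_chapters` and `is_valid_chapter` helpers are all assumptions, since the real config file isn't shown in this thread.

```python
import json

# Hypothetical config: year -> language -> list of chapter slugs.
CONFIG_JSON = """
{
  "2019": {
    "en": ["javascript", "markup", "http2"],
    "ja": ["javascript"]
  }
}
"""


def load_chapters(raw):
    """Build a {(year, lang): set_of_chapters} lookup from the JSON text."""
    config = json.loads(raw)
    return {
        (year, lang): set(chapters)
        for year, langs in config.items()
        for lang, chapters in langs.items()
    }


CHAPTERS = load_chapters(CONFIG_JSON)


def is_valid_chapter(lang, year, chapter):
    """True if the chapter exists for that year and language."""
    # JSON object keys are strings, so normalise the year before the lookup.
    return chapter in CHAPTERS.get((str(year), lang), set())


print(is_valid_chapter("en", 2019, "markup"))  # True
print(is_valid_chapter("ja", 2019, "markup"))  # False
```

Loading once at startup and keying on `(year, lang)` keeps the per-request check to a single dictionary lookup.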