Respec: Lint for charset not being UTF-8

Created on 15 Jul 2016  ·  15 Comments  ·  Source: w3c/respec

We should add a rule to the linter that checks if a document's charset is UTF-8.

Feature request good first issue

All 15 comments

We can probably do better, and always convert to UTF-8 on save via Encoding API.

and always convert to UTF-8 on save via Encoding API.

@marcoscaceres The exported document is already checked for UTF-8 in core/save-html:

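// Look for an existing charset declaration in the cloned document.
let metaCharset = cloneDoc.querySelector(
  "meta[charset], meta[content*='charset=']"
);

// If there is none, create one that declares UTF-8.
if (!metaCharset) {
  metaCharset = cloneDoc.createElement("meta");
  metaCharset.setAttribute("charset", "utf-8");
}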

Please correct if I am wrong.

So it’s not exactly checked: it just blindly adds a meta charset saying it’s UTF-8. But the document might be in a different encoding.
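To illustrate the difference between declaring UTF-8 and actually being UTF-8, here is a minimal sketch that validates raw bytes with a fatal TextDecoder. It assumes the raw bytes of the document are available, which this thread does not establish, and isValidUtf8 is a hypothetical helper name, not part of ReSpec.

```javascript
// Hypothetical helper (not part of ReSpec): returns true if `bytes`
// is a well-formed UTF-8 byte sequence.
function isValidUtf8(bytes) {
  try {
    // With { fatal: true }, decode() throws a TypeError on malformed
    // input instead of substituting U+FFFD replacement characters.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
const utf8Bytes = new Uint8Array([0xc3, 0xa9]);
// "é" encoded as windows-1252 is the single byte 0xE9,
// which is an incomplete sequence when read as UTF-8.
const legacyBytes = new Uint8Array([0xe9]);

const utf8IsValid = isValidUtf8(utf8Bytes);
const legacyIsValid = isValidUtf8(legacyBytes);
```

A meta charset element would not change either result; it only tells the parser which decoder to use.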

TextEncoder of the API is what we need here?

TextEncoder of the API is what we need here?

I think so, then decode it out to UTF-8.
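As a rough sketch of that idea with the Encoding API: TextDecoder turns bytes in a known source encoding into a string, and TextEncoder (which only ever emits UTF-8) turns the string back into bytes. The function name transcodeToUtf8 and the UTF-16LE sample input are assumptions for illustration, not ReSpec code.

```javascript
// Hypothetical helper (not part of ReSpec): re-encode bytes from a
// known source encoding into UTF-8.
function transcodeToUtf8(bytes, sourceEncoding) {
  // Decode the bytes into a JavaScript string using the source encoding.
  const text = new TextDecoder(sourceEncoding).decode(bytes);
  // TextEncoder always produces UTF-8; it takes no encoding argument.
  return new TextEncoder().encode(text);
}

// "é" in UTF-16LE is the two bytes 0xE9 0x00.
const utf16Input = new Uint8Array([0xe9, 0x00]);
const utf8Output = transcodeToUtf8(utf16Input, "utf-16le");
// utf8Output now holds the UTF-8 bytes for "é": 0xC3 0xA9.
```

This only works if the source encoding is known up front; decoding with the wrong label silently yields the wrong characters.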

@marcoscaceres Is this still relevant? I think the new core/exporter always generates UTF-8.

I think the idea was to check if the input is not UTF-8 or if it contains a contradictory <meta charset> prior to exporting.

if the input is not UTF-8

I think the only way to check it is to check whether meta[charset] is absent or points to a different encoding.

if it contains a contradictory <meta charset> prior to exporting

👍


@saschanaz Is there any work that still needs to be done here? I can try solving this issue if it's still relevant.

Hello @CodHeK!

We want to add a file in src/core/linter-rules that checks whether the document properly declares <meta charset="utf-8" />. A linter warning should be shown if the declaration is absent or if there is a non-UTF-8 charset declaration instead.

Checking out existing linters, e.g. local-refs-exist, may help you understand how the linters work.
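The check itself could look roughly like the sketch below. It is not ReSpec's actual linter-rule interface: checkMetaCharset and the shape of its return value are hypothetical, and it only inspects meta[charset] elements, not http-equiv declarations.

```javascript
// Hypothetical helper (not ReSpec's real linter API): inspects a
// document for a proper <meta charset="utf-8"> declaration.
function checkMetaCharset(doc) {
  const metas = Array.from(doc.querySelectorAll("meta[charset]"));

  // Case 1: no charset declaration at all.
  if (metas.length === 0) {
    return { ok: false, reason: "missing", offending: [] };
  }

  // Case 2: at least one declaration names something other than UTF-8.
  // The comparison is case-insensitive and ignores surrounding whitespace.
  const offending = metas.filter(meta => {
    const value = (meta.getAttribute("charset") || "").trim().toLowerCase();
    return value !== "utf-8";
  });
  if (offending.length > 0) {
    return { ok: false, reason: "not-utf-8", offending };
  }

  // Every declaration present says UTF-8.
  return { ok: true, reason: null, offending: [] };
}
```

In a browser this would be called as checkMetaCharset(document); a real rule would then turn a non-ok result into a warning in whatever form the linter expects.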

@saschanaz okay, great! I'll do a bit of checking and get back if I have any doubts!

@saschanaz Looking at the existing linters gives me a fair idea of how to proceed, but how do I test the linter as and when I write the code for it?

@CodHeK You can test it somehow like this: https://github.com/w3c/respec/blob/bd65955f036e1912f932ac4188cdff5c4a65780a/tests/spec/core/linter-rules/local-refs-exist-spec.js

Note that you have to register your new linter in src/w3c/defaults.js first before any test.

@saschanaz Should the linter add the meta tag to the doc on its own if it is not present? Or change it to utf-8 if it's something else?

@saschanaz please review the PR for the linter https://github.com/w3c/respec/pull/2075
