We should add a rule to the linter that checks if a document's charset is UTF-8.
We can probably do better, and always convert to UTF-8 on save via Encoding API.
@marcoscaceres The exported document is already checked for utf-8 in core/save-html
let metaCharset = cloneDoc.querySelector(
  "meta[charset], meta[content*='charset=']"
);
if (!metaCharset) {
  metaCharset = cloneDoc.createElement("meta");
  metaCharset.setAttribute("charset", "utf-8");
}
Please correct me if I am wrong.
So it’s not exactly checked: it just blindly adds a meta charset saying it’s UTF-8. But the document might be in a different encoding.
Is TextEncoder from the Encoding API what we need here?
I think so, then decode it out to UTF-8.
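For reference, a minimal sketch of what converting to UTF-8 with the Encoding API could look like, assuming the raw bytes and their source encoding are known. `transcodeToUtf8` is a hypothetical name, not something in ReSpec. Note the direction: `TextDecoder` reads bytes in the source encoding, and `TextEncoder` always writes UTF-8.

```javascript
// Hypothetical helper (not part of ReSpec): re-encode bytes as UTF-8.
// Assumes the caller already knows the source encoding label.
function transcodeToUtf8(bytes, sourceEncoding) {
  // TextDecoder turns bytes in `sourceEncoding` into a JavaScript string.
  const text = new TextDecoder(sourceEncoding).decode(bytes);
  // TextEncoder takes no encoding argument: it always produces UTF-8.
  return new TextEncoder().encode(text);
}

// "é" (U+00E9) is the two bytes 0xE9 0x00 in UTF-16LE.
const utf16Bytes = new Uint8Array([0xe9, 0x00]);
const utf8Bytes = transcodeToUtf8(utf16Bytes, "utf-16le");
console.log(Array.from(utf8Bytes)); // [195, 169], i.e. 0xC3 0xA9
```

Which encoding labels `TextDecoder` accepts depends on the runtime, so the label used here is only an example.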
@marcoscaceres Is this still relevant? I think the new core/exporter always generates UTF-8.
I think the idea was to check if the input is not UTF-8 or if it contains a contradictory <meta charset> prior to exporting.
if the input is not UTF-8
I think the only way to check it is to check whether meta[charset] is absent or points to a different encoding.
if it contains a contradictory <meta charset> prior to exporting
👍
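As a side note on checking the input itself: if the raw bytes of the source were available (an assumption, since a linter running in the page normally only sees the parsed DOM), a strict `TextDecoder` could tell whether they are well-formed UTF-8. `isValidUtf8` below is a hypothetical helper, sketched only to illustrate the idea.

```javascript
// Hypothetical helper: report whether `bytes` are well-formed UTF-8.
// With `fatal: true`, decode() throws a TypeError on malformed input
// instead of silently substituting U+FFFD.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidUtf8(new Uint8Array([0xc3, 0xa9]))); // true: "é" in UTF-8
console.log(isValidUtf8(new Uint8Array([0xe9]))); // false: "é" in Latin-1
```

This only proves the bytes are decodable as UTF-8; pure ASCII content passes regardless of what the author intended. Inside the page, `document.characterSet` reports the encoding the browser actually used, which may be the more practical signal.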
From: Marcos Cáceres notifications@github.com
Sent: Saturday, January 19, 2019 6:09:46 AM
To: w3c/respec
Cc: Kagami Sascha Rosylight; Comment
Subject: Re: [w3c/respec] Lint for charset not being UTF-8 (#873)
@saschanaz Is there any work that still needs to be done here? I can try solving this issue if it's still relevant.
Hello @CodHeK!
We want to add a file in src/core/linter-rules that checks whether the document properly declares <meta charset="utf-8" />. A linter warning message should be shown if the declaration is absent or if there is a non-UTF-8 charset declaration instead.
Checking out existing linters, e.g. local-refs-exist, may help you understand how the linters work.
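The check described above could be sketched roughly as follows. This is only an illustration of the logic: `checkMetaCharset` and its return shape are hypothetical and do not follow ReSpec's actual linter rule interface, which is best copied from an existing rule such as local-refs-exist.

```javascript
// Hypothetical sketch, not ReSpec's linter API.
// `doc` is anything with a DOM-style querySelectorAll, e.g. `document`.
function checkMetaCharset(doc) {
  const metas = Array.from(doc.querySelectorAll("meta[charset]"));
  if (metas.length === 0) {
    // No <meta charset> at all: the linter should warn.
    return { ok: false, reason: "missing", declared: [] };
  }
  // Normalize each declared value before comparing it with "utf-8".
  const declared = metas.map(meta =>
    (meta.getAttribute("charset") || "").trim().toLowerCase()
  );
  const nonUtf8 = declared.filter(value => value !== "utf-8");
  if (nonUtf8.length > 0) {
    // At least one declaration names a different encoding.
    return { ok: false, reason: "non-utf-8", declared: nonUtf8 };
  }
  return { ok: true, reason: null, declared };
}

// In a browser: checkMetaCharset(document)
```

The sketch deliberately ignores the legacy `<meta http-equiv="Content-Type" content="...charset=...">` form; whether the rule should also cover it is a design choice for the PR.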
@saschanaz okay, great! I'll do a bit of checking and get back if I have any doubts!
@saschanaz looking at the existing linters gives me a fair idea of how to proceed, but how do I test the linter as and when I write the code for it?
@CodHeK You can test it somehow like this: https://github.com/w3c/respec/blob/bd65955f036e1912f932ac4188cdff5c4a65780a/tests/spec/core/linter-rules/local-refs-exist-spec.js
Note that you have to register your new linter in src/w3c/defaults.js first before any test.
@saschanaz should the linter add the meta tag on its own to the doc if it's not present? Or change it to utf-8 if it's something else?
@saschanaz please review the PR for the linter https://github.com/w3c/respec/pull/2075