Input:
console.log(
  '\u{1D306}',
  `\u{1D306}`,
  /\u{1D306}/u
);
Current output: https://babeljs.io/repl/#?babili=false&evaluate=true&lineWrap=false&presets=es2015&experimental=false&loose=false&spec=false&code=console.log(%0A%20%20'%5Cu%7B1D306%7D'%2C%0A%20%20%60%5Cu%7B1D306%7D%60%2C%0A%20%20%2F%5Cu%7B1D306%7D%2Fu%0A)%3B
'use strict';
console.log('𝌆', '𝌆', /(?:\uD834\uDF06)/);
Note that the output contains non-ASCII symbols, which can lead to encoding issues (e.g. if the server is not configured to serve the file as UTF-8) or just confusion. Transpiling to a pair of Unicode escape sequences (i.e. \uD834\uDF06 in this case) avoids this issue.
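To illustrate the idea, here is a minimal sketch of an escaping step for string literal contents. `escapeNonAscii` is a hypothetical helper made up for this example, not Babel's actual implementation. It relies on JavaScript strings being sequences of UTF-16 code units, so an astral symbol such as U+1D306 is already stored as a surrogate pair and each half can be escaped separately.

```javascript
// Hypothetical helper for illustration only; not part of Babel.
// Replaces every UTF-16 code unit above 0x7E with a \uXXXX escape,
// leaving ASCII characters up to "~" untouched.
function escapeNonAscii(source) {
  let result = '';
  for (let i = 0; i < source.length; i++) {
    const codeUnit = source.charCodeAt(i);
    if (codeUnit > 0x7E) {
      // Pad to four uppercase hex digits, e.g. 0xE9 -> "00E9".
      result += '\\u' + codeUnit.toString(16).toUpperCase().padStart(4, '0');
    } else {
      result += source[i];
    }
  }
  return result;
}

console.log(escapeNonAscii('\u{1D306}')); // prints \uD834\uDF06
```

This sketch does not handle quotes, backslashes, or control characters; a real code generator would need to escape those as well.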
cc @jayphelps
TL;DR we should ensure any generated code for the following literals is ASCII-safe:
Agreed. Seems like we should add an encoding argument to the code generator and have it default to ASCII.
> Seems like we should add an encoding argument to the code generator and have it default to ASCII.
I don’t think we need an option for it — this should just be the default behavior with no way to disable it. There’s no good reason to disable it, really.
Patch that ensures the generated code for string literals is ASCII-safe: #4478.
I can do the same thing for template literals, but that warrants more discussion. E.g., should this:
var x = `
foo
bar
baz
`;
…turn into this:
var x = `\nfoo\nbar\nbaz\n`;
…or not? I think it should, but a lot of tests expect otherwise, so I figured I’d ask.
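One thing that may be worth checking when weighing the two forms: for an untagged template literal both spellings evaluate to the same string, but a tag function sees different raw strings. The snippet below is only a sketch of that difference using the standard `String.raw` tag; the variable names are made up for the example.

```javascript
// Untagged: the cooked values are identical.
const multiline = `
foo
bar
baz
`;
const escaped = `\nfoo\nbar\nbaz\n`;
const sameValue = multiline === escaped;

// Tagged with String.raw: the raw strings differ, because the escaped
// form keeps the backslash and the letter "n" as two separate characters.
const rawMultiline = String.raw`
foo
bar
baz
`;
const rawEscaped = String.raw`\nfoo\nbar\nbaz\n`;
const sameRaw = rawMultiline === rawEscaped;

console.log(sameValue, sameRaw); // prints true false
```

So rewriting newlines as `\n` looks safe for untagged templates, while tagged templates would probably need to be left alone or handled separately.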