Mithril's code attempts to decode the location.hash value, but that should not be required: the hash should never be received in encoded form when querying location.hash. Worse, you can end up in situations where Mithril crashes your page by trying to decode an invalid URL fragment; the error is not handled within Mithril and aborts page loading.
Mithril (with the router enabled) crashes when loading a page with specially crafted hash values.
Do not attempt to decode location.hash via normalize1(), as it is already decoded.
Alternatively, that function should not throw if it cannot perform the replacement it wants to perform.
Load a page like https://bing.com/#abc%abc then run the code found in normalize1:
location.hash.replace(/(?:%[a-f89][a-f0-9])+/gim, decodeURIComponent)
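The same failure can be reproduced without a browser. This is a minimal sketch, assuming the fragment reaches the code undecoded as "#abc%abc"; the helper name normalizeLikeMithril is hypothetical and only wraps the expression quoted above.

```javascript
// Hypothetical helper wrapping the replace() call quoted above.
function normalizeLikeMithril(hash) {
    return hash.replace(/(?:%[a-f89][a-f0-9])+/gim, decodeURIComponent)
}

// "%ab" matches the pattern but is not a valid UTF-8 escape sequence,
// so decodeURIComponent throws a URIError that nothing catches.
let caught = null
try {
    normalizeLikeMithril("#abc%abc")
} catch (e) {
    caught = e
}
console.log(caught instanceof URIError) // true
```

Note that the pattern only targets escapes whose first hex digit is 8 or higher (non-ASCII bytes), so a fragment such as "#abc%20def" passes through unchanged, while "#abc%abc" hits the throwing path.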
You can also use JavaScript to show that browsers auto-decode location hashes:
location.hash = "#abc%20def"; console.log(location.hash); // prints "#abc def" in all browsers
Workaround: if you need routes in your app that can contain arbitrary symbols, both URL-encoding them and not URL-encoding them will cause issues in at least one browser, so here is what I ended up doing in my application:
function encodeHash(text) {
    return encodeURIComponent(text.replace(/\u200B/g, "\u200B\u200B").replace(/#/g, "\u200Bꖛ").replace(/%/g, "\u200B℅"));
}
function decodeHash(text) {
    return text.replace(/(?:%[a-f0-9]+)+/gim, function(t) { try { return decodeURIComponent(t) } catch (ex) { return t }}).replace(/\u200B℅/g,"%").replace(/\u200Bꖛ/g,"#").replace(/\u200B\u200B/g, "\u200B");
}
The trick is that the final hash, before URL-encoding, does not contain any percent sign, because percent signs are pre-encoded beforehand. Since the hash contains no percent sign, it will not accidentally trigger URL decoding in browsers that do not return an encoded hash (most of them).
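A round trip through the two functions above might look like the sketch below. It restates them so it runs on its own, and it writes the two marker characters as the escapes \uA59B and \u2105, which is an assumption about the symbols intended in the workaround. The two "browser" cases are simulated with plain strings rather than a real location object.

```javascript
// Same logic as the workaround; marker characters written as escapes
// (assumed to be U+A59B and U+2105).
function encodeHash(text) {
    return encodeURIComponent(
        text
            .replace(/\u200B/g, "\u200B\u200B")
            .replace(/#/g, "\u200B\uA59B")
            .replace(/%/g, "\u200B\u2105")
    )
}

function decodeHash(text) {
    return text
        .replace(/(?:%[a-f0-9]+)+/gim, function (t) {
            try { return decodeURIComponent(t) } catch (ex) { return t }
        })
        .replace(/\u200B\u2105/g, "%")
        .replace(/\u200B\uA59B/g, "#")
        .replace(/\u200B\u200B/g, "\u200B")
}

const route = "50% off #sale"
const encoded = encodeHash(route)

// Simulated browser that returns the fragment still percent-encoded.
const fromRawBrowser = decodeHash(encoded)
// Simulated browser that returns the fragment already percent-decoded.
const fromDecodingBrowser = decodeHash(decodeURIComponent(encoded))

console.log(fromRawBrowser === route, fromDecodingBrowser === route) // true true
```

Both paths recover the original route because the pre-encoded text never contains a literal "%", so an extra decoding pass by the browser has nothing to misinterpret.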
@FremyCompany What browser are you using? FYI, Firefox 58 prints "#abc%20def" when testing the JS you had.
As for the other issue (invalid escapes), I'm working on a fix right now.
@FremyCompany Mithril normalizes this because there are inconsistencies with how Unicode characters are represented when accessed through the window.location getters. The situation may have improved, but at the very least that code used to have a purpose.
See https://github.com/MithrilJS/mithril.js/pull/881 for a table with the problematic cases.
Edit: Firefox and Safari still return the hash with percent encoding.
Edit2: this would probably work:
var data = $window.location[fragment]
try {data = data.replace(/(?:%[a-f89][a-f0-9])+/gim, decodeURIComponent)} catch(e){}
Edit3: I should have read the whole thread before replying... digesting your third post...
Edit4: Something along these lines should work and be future-proof.
var aElement = $document.createElement("a")
aElement.href = "/ö?ö#ö"
function getURLPart(target, mode) {
    // In this context, `decodeURI` is the right function to call
    // (not `decodeURIComponent`), since URI component separators
    // are not encoded in the raw routes.
    return /ö/.test(aElement[mode]) ? target[mode] : decodeURI(target[mode])
}
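To illustrate the comment about decodeURI versus decodeURIComponent, here is a small sketch using a made-up route string. It also shows that decodeURI still throws on malformed escapes, so a guard may still be needed for fragments like "#abc%abc"; the wrapper name safeDecodeURI is hypothetical and not part of Mithril.

```javascript
// A made-up route whose path contains an encoded "/" and an encoded "ö".
const raw = "/files/a%2Fb/%C3%B6"

// decodeURI leaves escapes for reserved separators ("/", "?", "#", ...) alone.
console.log(decodeURI(raw))          // "/files/a%2Fb/ö"
// decodeURIComponent also decodes %2F, which changes the shape of the route.
console.log(decodeURIComponent(raw)) // "/files/a/b/ö"

// Hypothetical guard: fall back to the raw value when decoding fails.
function safeDecodeURI(value) {
    try { return decodeURI(value) } catch (e) { return value }
}
console.log(safeDecodeURI("#abc%abc")) // "#abc%abc"
```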
@pygy Thanks for the table in 881; it is very useful for understanding the reasoning behind the unescape attempt. It's annoying that browsers don't all do the same thing here.