Wp-calypso: Domains: Able to reach checkout with Punycode version of IDN

Created on 21 Nov 2015  ·  15 Comments  ·  Source: Automattic/wp-calypso

When attempting to register a domain, siobhánbamber.com, via https://wordpress.com/domains/add, I'm blocked by an error notice informing me that the domain isn't valid. This is expected as WordPress.com doesn't support the registration or mapping of IDNs (Internationalized Domain Names).

screen shot 2015-11-21 at 12 55 17

However, when I attempted to register xn--siobhn-ttabamber.com (the IDNA-encoded, or Punycode, version of siobhánbamber.com), I was able to reach checkout.

screen shot 2015-11-21 at 13 01 05

The registration technically failed, as xn--siobhn-ttabamber.com isn't supported at WordPress.com; however, no notice of the failure was shown at any point during the registration process.
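
For context, the mapping between the two spellings of a domain can be reproduced with a minimal Python sketch. It relies on the standard library's `idna` codec (which follows the older IDNA 2003 rules) and is only an illustration, not how WordPress.com performs the conversion. The helper names `to_ascii` and `to_unicode` are hypothetical.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain to its Punycode (ASCII) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert a Punycode (ASCII) domain back to its Unicode form."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    name = "siobh\u00e1nbamber.com"  # siobhánbamber.com
    encoded = to_ascii(name)
    print(encoded)              # ASCII name whose first label starts with "xn--"
    print(to_unicode(encoded))  # round-trips to the original spelling
```

Both spellings refer to the same domain, so a validity check that rejects one form should also reject the other.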

Domains [Status] Stale [Type] Bug

All 15 comments

/cc: @aidvu since this had better be done in the backend

@SiobhyB Turns out there's already a check for this type for user accounts.

Just reopening this ticket because I am able to reproduce this error now from Tortuga:

1) Go to https://wordpress.com/domains/

2) Search for any punycode domain like xn--8zc3ctb0d.com

screen shot 2017-05-13 at 6 31 06 pm

3) Select it and proceed to checkout:

screen shot 2017-05-13 at 6 31 22 pm

screen shot 2017-05-13 at 6 41 12 pm

4) Checkout, and observe how it throws this error:

screen shot 2017-05-13 at 6 34 19 pm

Ideally, we should prevent the user from ever being able to add the punycode domain to their cart, like we do from /domains/add/.

Please let me know if I should file this on the Tortuga GH repo instead. Thank you!

@mahangu can you confirm with a regular user account?

@umurkontaci yes, I am able to reproduce it on a regular account, unproxied.

screen shot 2017-05-16 at 11 24 10 am

Seems like there are two aspects of this - if you search for a direct match, like xn--8zc3ctb.com, you'll be able to add it to cart. But the actual domain should be මහඟ.com, which you can't add to cart, because of the availability check that runs after you try to add it to cart.
We should also prevent Domainsbot from suggesting IDN domains, because right now it does :/
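
One way to treat both spellings the same is to flag a search query as an IDN whether it arrives in Unicode or in Punycode. The sketch below is a hedged illustration in Python, not the actual Calypso or backend check; `looks_like_idn` and `ACE_PREFIX` are hypothetical names. It relies only on the fact that Punycode-encoded labels start with the `xn--` prefix.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def looks_like_idn(domain: str) -> bool:
    """Return True if any label is non-ASCII or Punycode-encoded."""
    labels = domain.strip().rstrip(".").lower().split(".")
    return any(
        not label.isascii() or label.startswith(ACE_PREFIX)
        for label in labels
    )


if __name__ == "__main__":
    for query in ("xn--8zc3ctb0d.com", "මහඟ.com", "example.com"):
        print(query, looks_like_idn(query))
```

Running a check of this kind before the add-to-cart step, and over suggestion results, would cover both the direct-match case and the suggested-domain case described above.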

This is not yet fixed - the patch helped hide this issue for Tortuga users, but if someone searches for a domain in punycode format, they will still be allowed to proceed with registering it.

This issue has been marked as stale and will be closed in seven days. This happened because:

  • It has been inactive in the past 9 months.
  • It isn't a project or a milestone, and hasn’t been labeled `[Pri] Blocker`, `[Pri] High`, `[Status] Keep Open`, or `OSS Citizen`.

You can keep the issue open by adding a comment. If you do, please provide additional context and explain why you’d like it to remain open. You can also close the issue yourself — if you do, please add a brief explanation.

cc @Automattic/i18n - I was wondering what your take is on internationalized domain names (aka IDNs)? My guess would be that they are popular in some countries (with a non-Latin alphabet perhaps), but I would love to hear your expertise, so that we can better prioritize this issue.

Japanese domains are not super common but not too rare either.

A Japanese page on Wikipedia says 120K out of 900K .JP domains are Japanese IDNs, but I don't feel I see them very often. Maybe people register them but use the alphabet version as the primary one.

FYI - according to https://jprs.jp/faq/use/ , ドメイン名例.JP and XN--ECKWD4C7CU47R2WF.JP (Punycode version) can be used as test domains, similar to example.com.

I don't think we've done any real investigation.

Tucows says they support IDNs in some languages for some TLDs, which sounds "pretty fun", but we're in a good place to handle that sort of thing now (I believe the "some languages" restriction is enforcing IANA-specified character subsets).

The big question mark is the places where we've made broken assumptions. For example, I saw a thread where a Publicize link was broken in Facebook because of the encoding, and VideoPress has its own is_valid_domain() that would reject IDNs whether or not they're encoded.

We don't even support mapping IDNs right now, but it looks like there are some tricks kicking around to make it work sometimes, so it's not all bad :)

There are also some security questions around using visually similar characters to nefarious ends.
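
As a rough illustration of the lookalike concern, the Python sketch below flags labels that mix Latin letters with Cyrillic or Greek ones. It is a simplified heuristic under stated assumptions: scripts are approximated from the first word of each character's Unicode name, the set of lookalike scripts is illustrative rather than exhaustive, and the function names are hypothetical. It does not catch spoofs written entirely in one non-Latin script.

```python
import unicodedata
from typing import Set

# Scripts with letters easily mistaken for Latin ones (illustrative, not exhaustive).
LATIN_LOOKALIKE_SCRIPTS = {"CYRILLIC", "GREEK"}


def scripts_in(label: str) -> Set[str]:
    """Approximate the scripts used in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def mixes_latin_with_lookalikes(label: str) -> bool:
    """Return True if the label combines Latin letters with Cyrillic or Greek ones."""
    found = scripts_in(label)
    return "LATIN" in found and bool(found & LATIN_LOOKALIKE_SCRIPTS)


if __name__ == "__main__":
    print(mixes_latin_with_lookalikes("p\u0430ypal"))  # second letter is Cyrillic
    print(mixes_latin_with_lookalikes("paypal"))
```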

We need some idea of the value here to compare the effort to. How valuable are IDNs to users?

Thanks for chiming in, @naokomc and @deBhal 🙇 Maybe let's start then with just supporting mapping such domains (it _is_ the prerequisite for supporting registration of such domains anyway) and see how popular those are. Meaning, if someone searches for an IDN, let's direct them to map the domain (if it's taken) or show them the usual green nudge to buy it if a non-IDN version is available (if it's possible to easily convert to such - easy for stuff like żłóć, but the Japanese alphabet not so much ;)). Sounds good? Any volunteers? 😁

We would have to store them in the non-IDN representation in the DB, because of collation/charset constraints, so some encoding/decoding mumbo-jumbo would be required every time.

Would be nice to have it, yeah, but needs a bit more research on P2. :)

Right - probably in the Punycode format. And that seems the biggest technical hurdle here - it will introduce a different "raw" domain and a different one for display purposes.
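
The "raw" versus "display" split could look something like the following Python sketch. This is an assumption-laden illustration rather than a proposal for the actual schema: the `DomainName` class and its members are hypothetical, and the conversion uses Python's built-in `idna` codec (IDNA 2003 rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainName:
    raw: str  # ASCII (Punycode) form; the only value that would be persisted

    @classmethod
    def from_user_input(cls, text: str) -> "DomainName":
        # The idna codec leaves ASCII labels untouched and encodes the rest,
        # so Unicode and Punycode input normalize to the same raw value.
        return cls(raw=text.strip().lower().encode("idna").decode("ascii"))

    @property
    def display(self) -> str:
        # Decode on the way out so the UI can show the Unicode spelling.
        return self.raw.encode("ascii").decode("idna")


if __name__ == "__main__":
    domain = DomainName.from_user_input("siobh\u00e1nbamber.com")
    print(domain.raw)      # ASCII form suitable for an ASCII-only column
    print(domain.display)  # Unicode form for display
```

Keeping a single stored form and deriving the display form on demand avoids having two values that can drift apart, at the cost of a decode on every read.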
