Next.js: Pages with utf-8 name don't work properly under SSR

Created on 14 Jan 2020 · 7 Comments · Source: vercel/next.js

Bug report

Pages with utf-8 non-ASCII characters in their name don't work properly under SSR

Describe the bug

Pages with UTF-8 non-ASCII characters in their name work fine with client-side navigation,
but when rendered on the server side they return "404 This page could not be found."

To Reproduce

Steps to reproduce the behavior, please provide code snippets or a repository:

  1. Create page 'pages/тест.js'
  2. Navigate to http://localhost:3000/тест
  3. See error "404 This page could not be found."

Expected behavior

I expect to see the page 'pages/тест.js' rendered.

System information

  • OS: Windows
  • Version of Next.js: 9.1.7

Additional context

Minimal repository to reproduce bug: https://github.com/frei-0xff/nextjs-utf8-pagename

needs investigation

All 7 comments

What's the purpose of using non-ASCII chars if your page name should be displayed as a valid URL?

http://localhost:3000/тест is converted to http://localhost:3000/%D1%82%D0%B5%D1%81%D1%82 and can't be found.
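The conversion described above can be reproduced in Node.js. This is a minimal sketch: `percentEncodeUtf8` is a hypothetical helper written for illustration (it is not part of Next.js), and it escapes every UTF-8 byte, so it matches `encodeURIComponent` only for strings that contain no unreserved ASCII characters.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string.
function percentEncodeUtf8(text) {
  return Array.from(Buffer.from(text, "utf8"))
    .map((byte) => "%" + byte.toString(16).toUpperCase().padStart(2, "0"))
    .join("");
}

const pageName = "тест";
const encoded = percentEncodeUtf8(pageName);

// Each Cyrillic letter is two bytes in UTF-8, so four letters become eight escapes.
console.log(encoded);
// For this all-Cyrillic input the built-in encoder produces the same string.
console.log(encoded === encodeURIComponent(pageName));
// Decoding restores the original page name.
console.log(decodeURIComponent(encoded) === pageName);
```

A server that compares the raw request path against the page file name without decoding it first would therefore never find a match.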

@StarpTech you might want to have a link like http://яндекс.рф/тест. These display as Cyrillic URLs in modern browser tabs. In my experience, тест turns into %D1%82%D0%B5%D1%81%D1%82 only when you copy the URL to the clipboard.

@StarpTech non-ASCII URLs are displayed properly in all modern browsers and are used by popular sites, for example wikipedia.org.

Thanks for the examples. I have never used it.

As a workaround, you can use a dynamic page [page] and switch on the UTF-8 page name inside it.
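One way to read this suggestion is a lookup keyed by the decoded page name, called from the dynamic page. The sketch below is only the lookup part; `PAGES`, `resolvePage`, and the value `"test-page"` are hypothetical names, and whether the route parameter arrives already decoded is an assumption that may vary by Next.js version, so the function decodes defensively.

```javascript
// Hypothetical table: decoded page name -> key of the content to render.
const PAGES = new Map([["тест", "test-page"]]);

// Resolve a dynamic route parameter to a content key, or null if unknown.
function resolvePage(param) {
  let name;
  try {
    // Harmless when the value is already decoded and contains no "%".
    name = decodeURIComponent(param);
  } catch (err) {
    // Malformed escape sequence: fall back to the raw value.
    name = param;
  }
  return PAGES.get(name) || null;
}

console.log(resolvePage("%D1%82%D0%B5%D1%81%D1%82"));
console.log(resolvePage("missing"));
```

The dynamic page would then render whatever component is registered under the returned key and show a 404 when the result is null.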

In version 9.2, client-side routing for pages with non-ASCII characters worked fine. The issue was only with server-side routing, which could be worked around with a custom server.js that calls decodeURI(parsedUrl.pathname).
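The comment does not include the server code, so the following is a sketch of what such a custom server.js might look like, not the commenter's actual file. `decodePathname` and `createDecodingServer` are hypothetical names; `app` is assumed to be the object returned by `next({ dev })`, and passing the modified `parsedUrl` to the request handler is an assumption about how the workaround was wired.

```javascript
const { createServer } = require("http");
const { parse } = require("url");

// Decode a percent-encoded pathname. decodeURI throws URIError on malformed
// escape sequences, so fall back to the raw value in that case.
function decodePathname(pathname) {
  try {
    return decodeURI(pathname);
  } catch (err) {
    return pathname;
  }
}

// Build an HTTP server that decodes the pathname before handing the request
// to Next.js. `app` is assumed to come from next({ dev }).
function createDecodingServer(app) {
  const handle = app.getRequestHandler();
  return createServer((req, res) => {
    const parsedUrl = parse(req.url, true);
    parsedUrl.pathname = decodePathname(parsedUrl.pathname || "/");
    handle(req, res, parsedUrl);
  });
}

// Usage in server.js (assumes the `next` package is installed):
//   const next = require("next");
//   const app = next({ dev: process.env.NODE_ENV !== "production" });
//   app.prepare().then(() => createDecodingServer(app).listen(3000));
```

Keeping the decoding in a separate function makes the malformed-input fallback easy to check without starting Next.js.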

After updating to version 9.5.1, client-side routing for pages with non-ASCII characters stopped working entirely. In development mode, clicking a link to such a page causes no navigation and no error message. After the routeChangeStart event, neither routeChangeComplete nor routeChangeError fires; only after clicking another link does routeChangeError fire with "Error: Route Cancelled".

Edit:
It seems that https://github.com/vercel/next.js/pull/14827 was the breaking change: URLs returned by the WHATWG URL API are URL-encoded, which is inconsistent with other parts of the code.
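The encoding behaviour of the WHATWG URL API can be checked directly in Node.js. This is an illustrative sketch, not the code from the linked pull request; `matchesPage` is a hypothetical helper that shows the kind of decoding step a comparison against an unencoded page name would need.

```javascript
// The WHATWG URL API percent-encodes non-ASCII characters in the path.
const url = new URL("http://localhost:3000/тест");
const pagePath = "/тест";

console.log(url.pathname);
// A direct comparison with the unencoded page path fails.
console.log(url.pathname === pagePath);

// Hypothetical helper: compare after decoding, tolerating malformed input.
function matchesPage(pathname, page) {
  try {
    return decodeURI(pathname) === page;
  } catch (err) {
    return pathname === page;
  }
}

console.log(matchesPage(url.pathname, pagePath));
```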

Tested v9.5.0, v9.5.5, and v10.0.1, and none of them supports statically generated pages with non-ASCII names like /тест and /hæ. It worked as expected in the few versions I tested between v9.0.0 and v9.4.4. I would classify this as a bug or an undocumented breaking change in v9.5.
