Warehouse: pypi.org showing in different languages

Created on 14 Oct 2019 · 11 Comments · Source: pypa/warehouse

Describe the bug

Visiting https://pypi.org/ shows in an unexpected language.

On another computer yesterday (macOS/Chrome), it showed Brazilian Portuguese. I clicked English to change the language.

On this computer today (macOS/Chrome), it first showed English. I reloaded the front page, and it showed the text in French, then re-rendered in English. I reloaded again, and it showed French but replaced the top menu bar text (Help, Donate, ...) with English. I typed up most of this and reloaded again, and it's in English again. Reloading again and again, it's now in English.

I'm not signed in, and haven't clicked the language selector.

On my phone (Android/Chrome), I just started typing in pypi.org and the history showed something like "L'Index Package..." in French. Visiting the page showed it in English. Not signed in, haven't clicked language selector.

Expected behavior

It shows English, or some language based on my browser/OS language setting or geolocation. My browser and OS language settings are for English, and I'm in Finland.

To Reproduce

Sorry, I don't have a reliable method to reproduce, but have seen this on three different devices.

My Platform

Yesterday: macOS Mojave, Chrome (probably latest)
Now: macOS High Sierra, Chrome 77.0.3865.90
Now: Samsung S8/Android P/Chrome 77.0.3865.116

bug

All 11 comments

@hugovk can you confirm if this is still the case? After investigation, it seems that we hadn't properly configured our CDN for the case of an anonymous/cookieless user, causing pages to be fetched from cache in whatever language they were first retrieved.

https://github.com/python/pypi-infra/pull/49 should have resolved the issue by defaulting the necessary header to English for the first fetch of a page without a locale cookie set.

I can no longer reproduce, thanks for fixing it!

Just happened again on mobile: homepage in English, search results page in English:


...

But the project page is partly in English and mostly in Spanish:


...


And Portuguese on desktop:

image
image

Reloading, it went into English. Reloading several times, still English. But after searching for a project (results page in English), the project page was in Spanish:

image

sigh, thanks for re-reporting. will look a bit closer.

I haven't verified this, but I suspect the issue is effectively that our caching headers were not updated to account for the translation work that was done.

Given a URL like /project/pip/, the content on that page now changes based on whether a specific cookie is present, but our Vary headers don't reflect this. This means that browsers and downstream caches are free to serve, say, a Spanish version of /project/pip/, because we've told them that the content of the page doesn't vary with cookies, when it obviously does.
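To make the failure mode concrete, here is a minimal sketch of a shared cache that keys entries on the URL plus whatever request headers the origin names in Vary. It is a toy model, not Fastly's or any browser's actual logic; the `VaryCache` class and `make_origin` helper are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Tuple

Headers = Dict[str, str]


class VaryCache:
    """Toy cache: the key is the URL plus the request headers listed in Vary."""

    def __init__(self, origin: Callable[[str, Headers], Tuple[Headers, str]]) -> None:
        self.origin = origin
        self.vary_names: Dict[str, Tuple[str, ...]] = {}
        self.entries: Dict[tuple, str] = {}

    def _key(self, url: str, headers: Headers) -> tuple:
        names = self.vary_names.get(url, ())
        return (url, tuple(headers.get(name, "") for name in names))

    def fetch(self, url: str, request_headers: Headers) -> str:
        # Header names are case-insensitive, so normalize them first.
        headers = {k.lower(): v for k, v in request_headers.items()}
        key = self._key(url, headers)
        if key in self.entries:
            return self.entries[key]
        response_headers, body = self.origin(url, headers)
        vary = response_headers.get("Vary", "")
        self.vary_names[url] = tuple(
            sorted(n.strip().lower() for n in vary.split(",") if n.strip())
        )
        self.entries[self._key(url, headers)] = body
        return body


def make_origin(vary: str) -> Callable[[str, Headers], Tuple[Headers, str]]:
    """Hypothetical origin that translates based on the _LOCALE_ cookie."""

    def origin(url: str, headers: Headers) -> Tuple[Headers, str]:
        locale = "es" if "_LOCALE_=es" in headers.get("cookie", "") else "en"
        return {"Vary": vary}, f"{url} rendered in {locale}"

    return origin


# Without Vary: Cookie, the cookieless visitor gets the cached Spanish page.
broken = VaryCache(make_origin(""))
broken.fetch("/project/pip/", {"Cookie": "_LOCALE_=es"})
print(broken.fetch("/project/pip/", {}))

# With Vary: Cookie, the two visitors get separate cache entries.
fixed = VaryCache(make_origin("Cookie"))
fixed.fetch("/project/pip/", {"Cookie": "_LOCALE_=es"})
print(fixed.fetch("/project/pip/", {}))
```

The trade-off of varying on the whole Cookie header is that any cookie difference splits the cache, which is why that approach is only attractive for downstream caches.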

This same thing is true inside Fastly itself, but from what I see we're lifting the value out of that cookie and putting it into a header, and Varying on that header. That should solve the issue for Fastly, but not for any downstream caches (at the cost of making it one cache entry per URL per language).
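As a rough illustration of "lifting the value out of the cookie and putting it into a header", the normalization step could look something like the sketch below. The real logic lives in VCL; this is Python for readability only. The `KNOWN_LOCALES` allow-list and the function name are assumptions, and the English default follows the behavior described for pypi-infra#49.

```python
from http.cookies import SimpleCookie
from typing import Dict

# Assumption: an illustrative allow-list, not PyPI's actual set of locales.
KNOWN_LOCALES = {"en", "es", "fr", "pt_BR", "uk"}
DEFAULT_LOCALE = "en"


def synthesize_locale_header(request_headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the headers with PyPI-Locale derived from the cookie."""
    cookie = SimpleCookie()
    cookie.load(request_headers.get("Cookie", ""))
    morsel = cookie.get("_LOCALE_")
    if morsel is not None and morsel.value in KNOWN_LOCALES:
        locale = morsel.value
    else:
        # No cookie, or an unrecognized value: fall back to English.
        locale = DEFAULT_LOCALE
    normalized = dict(request_headers)
    normalized["PyPI-Locale"] = locale
    return normalized
```

Because the header only ever takes a handful of values, varying on it gives one cache entry per URL per language instead of one per distinct Cookie string.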

There are a few options to solve it, but likely the best one is to update our VCL so that in the deliver method, we look for Vary: PyPI-Locale and add a Vary: Cookie. Doing this in deliver will mean it won't affect Fastly caching [1], but it will affect downstream caches. We've theoretically already solved the issue for the Fastly cache, so that's perfectly fine.

We probably want to remove PyPI-Locale from Vary when we actually serve the page to end users, because as written we're not going to actually vary the content for them based on that value (we always set its value to something, either from the cookie or from a default), so we're just possibly inflating their caches for no reason (though it's unlikely to actually have an effect).
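The two deliver-time changes described above (add Vary: Cookie when Vary: PyPI-Locale is present, and drop PyPI-Locale before the response leaves the CDN) amount to a small rewrite of the Vary value. A sketch in Python rather than VCL, with a hypothetical function name:

```python
def rewrite_vary_for_delivery(vary: str) -> str:
    """Swap the internal PyPI-Locale token for Cookie in an outgoing Vary value."""
    names = [n.strip() for n in vary.split(",") if n.strip()]
    if "pypi-locale" not in (n.lower() for n in names):
        # Untranslated response: leave Vary as it was.
        return ", ".join(names)
    # Downstream caches never see PyPI-Locale on requests, so drop it...
    kept = [n for n in names if n.lower() != "pypi-locale"]
    # ...and tell them the body depends on cookies instead.
    if "cookie" not in (n.lower() for n in kept):
        kept.append("Cookie")
    return ", ".join(kept)


print(rewrite_vary_for_delivery("Accept-Encoding, PyPI-Locale"))
```

Since this runs only when the response is delivered, the CDN's own cache would still be keyed on PyPI-Locale, as described above.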

The code that adds Vary: PyPI-Locale also appears to be overbroad: it seems to add the header to every response, which means that we're inflating our cache for pages that do not and will never have translations. Likely we want to do something like what was done with sessions, and make it impossible to access the translation machinery unless a view has been decorated with something like @has_translations, and move the Vary: PyPI-Locale code to that.
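A decorator along those lines might look roughly like this. It is a framework-free sketch, not Warehouse's actual code: the `Response` class is a stand-in, and the sketch only covers the Vary part, not the part that blocks access to the translation machinery from undecorated views.

```python
import functools
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Response:
    """Stand-in for a framework response object (assumption for illustration)."""

    body: str
    headers: Dict[str, str] = field(default_factory=dict)


def has_translations(view: Callable[[dict], Response]) -> Callable[[dict], Response]:
    """Mark a view as translated by adding PyPI-Locale to its Vary header."""

    @functools.wraps(view)
    def wrapped(request: dict) -> Response:
        response = view(request)
        existing = [
            v.strip() for v in response.headers.get("Vary", "").split(",") if v.strip()
        ]
        if "pypi-locale" not in (v.lower() for v in existing):
            existing.append("PyPI-Locale")
        response.headers["Vary"] = ", ".join(existing)
        return response

    return wrapped


@has_translations
def project_page(request: dict) -> Response:
    return Response("translated project page")


def health_check(request: dict) -> Response:
    # Never translated, so it keeps a single cache entry.
    return Response("ok")
```

Only decorated views end up with one cache entry per language; everything else stays at one entry per URL.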

Not sure when/if I'll have time to get to this, but I wrote a quick PoC of the changes that need to land in PyPI itself at https://github.com/pypa/warehouse/pull/6857.

The first phase of this landed with https://github.com/python/pypi-infra/pull/50. I'm not 100% certain, but this should resolve any incorrect translations that were occurring due to shielding.

I'm also getting this. The page for the package python-certifi-win32 is showing in Ukrainian for me, whereas all other pages are showing in English, which is what I'd expect (Win10, Chrome, not signed in). Just happened for me in the last 5 minutes.

It was odd that it was only happening for one package, and at first I suspected a hack. Good to know it's "simply" a localisation issue :-)

I think this is finally solved. It stems from the fact that warehouse currently tries to respect the Accept-Language header if no _LOCALE_ cookie is set.

https://github.com/pypa/warehouse/blob/92b28e69d2cb9af3456d2be047610ec955bc3bc5/warehouse/i18n/__init__.py#L75-L80

This leads to us identifying a request as en in the synthesized PyPI-Locale header, even though what PyPI responds with may be translated.

In the interest of expediency, I've configured our CDN to strip the Accept-Language header completely from requests for now. If we work to improve this functionality as requested in #6864, we'll need to handle it in the VCL.
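The mismatch can be modeled in a few lines. This is a simplified sketch: the function names are hypothetical, and the Accept-Language handling is deliberately crude (first tag only, no q-values), unlike real locale negotiation.

```python
from typing import Optional

DEFAULT_LOCALE = "en"


def cdn_cache_locale(locale_cookie: Optional[str]) -> str:
    """Locale the CDN records in PyPI-Locale: the cookie value or the default."""
    return locale_cookie or DEFAULT_LOCALE


def origin_render_locale(
    locale_cookie: Optional[str], accept_language: Optional[str]
) -> str:
    """Locale the origin renders: cookie first, then Accept-Language, then default."""
    if locale_cookie:
        return locale_cookie
    if accept_language:
        # Crude parse: take the first language tag and ignore q-values.
        return accept_language.split(",")[0].split(";")[0].strip()
    return DEFAULT_LOCALE


def cache_can_be_mislabeled(
    locale_cookie: Optional[str], accept_language: Optional[str]
) -> bool:
    """True when the cached entry's label and its actual language disagree."""
    return cdn_cache_locale(locale_cookie) != origin_render_locale(
        locale_cookie, accept_language
    )


# Cookieless French browser: cached under "en" but rendered in French.
print(cache_can_be_mislabeled(None, "fr-FR,fr;q=0.9"))
# Same request once the CDN strips Accept-Language.
print(cache_can_be_mislabeled(None, None))
```

Stripping the header at the CDN removes the only input the origin used that the cache key did not account for.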

😓 I think this is finally resolved.

Looks good to me now, thanks!
