Amphtml: Bing.com AMP cache sets incorrect __amp_source_origin value on CORS requests

Created on 3 Apr 2019  路  20Comments  路  Source: ampproject/amphtml

What's the issue?

CORS requests made from AMP documents hosted by Bing.com's AMP cache are failing due to incorrect domain being set in the __amp_source_origin query param. The value of this param is being set to the AMP cache domain rather than the publisher's domain. This causes CORS validation to fail on the source origin's server.

How do we reproduce the issue?

Yahoo Sports document hosted by Bing's AMP cache and served via bing.com:

https://www.bing.com/amp/s/sports.yahoo.com/amphtml/fantasy-basketball-rookies-who-will-provide-the-most-future-value-211618398.html

This results in a CORS validation failure, as seen in the console:

Access to fetch at 'https://www.yahoo.com/caas/sidekick/sidekick?appid=amp&lang=en-US&region=US&site=sports&__amp_source_origin=https%3A%2F%2Fsports-yahoo-com.bing-amp.com' from origin 'https://sports-yahoo-com.bing-amp.com' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

__amp_source_origin is being set to https://sports-yahoo-com.bing-amp.com instead of https://sports.yahoo.com

There's some suspicion that this is related the configuration of the cache, specifically the cdnProxyRegex property, which defaults to the Google AMP CDN domain if not set. See this bit of code, where cdnProxyRegex is configured, otherwise it falls back to Google's CDN:

https://github.com/ampproject/amphtml/blob/2755c10b95d54b98a587b7b091db7d5697684903/src/config.js#L41-L42

Worth asking - if this value is required to be set by a cache to a regex for its own domain in order for AMP documents to function properly, why offer a fallback value at all?

cc @choumx @ssantosms

What browsers are affected?

All

Which AMP version is affected?

Noticed on 1901081935550, suspect it's always been an issue however

Bug caching

All 20 comments

We can change this to be a required setting in the config. /to @rsimha

This doesn't look like a tooling issue. Moving to runtime to triage.

It's part of the build process where you specify the URL config changes. Runtime just takes what the config object build creates, and uses it to generate URLs. Definitely a build tooling issue. Ping me to discuss.

Assigning to @erwinmombay who should be familiar with this.

In our build system, we are replacing the default config to bing. However, I am not sure why it's not working properly.

@erwinmombay Can you check the below is correct?

From Bing v0.js:
var Wc=self.AMP_CONFIG||{},Xc={thirdParty:Wc.thirdPartyUrl||"https://3p.bing-amp.net",thirdPartyFrameHost:Wc.thirdPartyFrameHost||"bing-amp.net",thirdPartyFrameRegex:("string"==typeof Wc.thirdPartyFrameRegex?new RegExp(Wc.thirdPartyFrameRegex):Wc.thirdPartyFrameRegex)||/^d-\d+.bing-amp.net$/,cdn:Wc.cdnUrl||"https://www.bing-amp.com",cdnProxyRegex:("string"==typeof Wc.cdnProxyRegex?new RegExp(Wc.cdnProxyRegex):Wc.cdnProxyRegex)||/^https:\/\/([a-zA-Z0-9_-]+.)?.bing-amp.com$/,localhostRegex:/^https?:\/\/localhost(:\d+)?$/,

Everything appears correct in the config. I'm not able to load the actual https://sports-yahoo-com.bing-amp.com page (I'm assuming it's https://sports-yahoo-com.bing-amp.com/c/s/sports.yahoo.com/amphtml/fantasy-basketball-rookies-who-will-provide-the-most-future-value-211618398.html?) to test.

I believe they鈥檝e taken down Yahoo AMP pages while this issue is open. @ssantosms I鈥檇 be okay bringing the pages back while debugging if needed, the broken content is secondary in nature

@jridgewell @ssantosms It looks to me like there is an extra . in the regex that shouldn't be there, before 'bing-amp.com':

/^https://([a-zA-Z0-9_-]+.)?.bing-amp.com$/

There's another dot inside the capture group, so it's expecting two dots in the string it's matching. Also, I assume these should be literal periods, and thus escaped?

@jridgewell do you mean we would require cdnProxyRegex to be explicitly set through AMP_CONFIG?

For me the main question, and this hasn't been answered is, Is AMP_CONFIG actually a required Object for v0.js (or any other main js binary) to function correctly?

originally we had treated it as optional and that v0.js should work without it.

Oh, I thought we treated it as a requirement. Does it makes sense to make it required in this case?

Today, we don't write the AMP_CONFIG to v0.js unless...

  1. --fortesting is passed to gulp dist
  2. gulp prepend-global is explicitly run

https://github.com/ampproject/amphtml/blob/2b8b439203444545acd5c8780ae5ea79ac972b0a/gulpfile.js#L860-L876

I'm not against making the config a requirement. However, which one should we default to? (Tests use prod-config.json as the default.)

yeah, at this point i think it should be a requirement. The runtime won't really run correctly on cache without it.

prod makes more sense to me as default as it is the majority case.

@jridgewell @ssantosms It looks to me like there is an extra . in the regex that shouldn't be there, before 'bing-amp.com':

/^https://([a-zA-Z0-9_-]+.)?.bing-amp.com$/

There's another dot inside the capture group, so it's expecting two dots in the string it's matching. Also, I assume these should be literal periods, and thus escaped?

That's a good point. I will validate and fix if needed.

@src-code we brought it back to you can take a look. The CORS error is still there, but the regex has been fixed.

Also, I see the same error in the console when I open your page from Google CDN

Access to fetch at 'https://www.yahoo.com/caas/sidekick/sidekick?appid=amp&lang=en-US&region=US&site=sports&__amp_source_origin=https%3A%2F%2Fsports.yahoo.com' from origin 'https://sports-yahoo-com.cdn.ampproject.org' has been blocked by CORS policy: The 'Access-Control-Allow-Origin' header has a value 'https://sports.yahoo.com' that is not equal to the supplied origin. Have the server send the header with a valid value, or, if an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

Thanks @ssantosms, the CORS request seems to be working properly now with your fix!

@ssantosms Unrelated, but I see that the ads are failing to render on the above example, seems that the host bing-amp.net can't be found. www.bing-amp.net does resolve however.

https://bing-amp.net/1904021746450/f.js
vs.
https://www.bing-amp.net/1904021746450/f.js

@src-code I noticed that yesterday. Team is looking into it with priority.
I will close this bug

Actually @rsimha I can't close this bug. Would you mind closing it? It has been fixed on our side.

Was this page helpful?
0 / 5 - 0 ratings