Envoy: gRPC-Web requests on Safari fail when using Envoy 1.13

Created on 25 Mar 2020  ·  27 Comments  ·  Source: envoyproxy/envoy

When attempting to use gRPC-Web on Safari using Envoy 1.13, the request succeeds, but in the inspector in Safari, this is the error I see:

Failed to load resource: WebKit encountered an internal error

Here's the web inspector output from Safari:

Summary

URL: https://myurl.io/svc.Authentication/Login
Status: —
Source: —
Initiator: 2.c85f5bc6.chunk.js:2:583533

Request

Content-Type: application/grpc-web-text
Accept: application/grpc-web-text
Origin: https://my-url.com
Referer: https://my-url.com/login
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.5 Safari/605.1.15
X-User-Agent: grpc-web-javascript/0.1
X-Grpc-Web: 1
grpc-timeout: 4999m

Response

No response headers

Request Data

MIME Type: application/grpc-web-text
Request Data: 

This issue was initially raised under the https://github.com/grpc/grpc-web repo: https://github.com/grpc/grpc-web/issues/759

The request should return a successful response.

To give you more context, I'm running Envoy via https://github.com/projectcontour/contour, though I don't think the problem is specific to Contour, since @phlippieb seems to be experiencing the same issue without it.

I'm running envoy on GKE 1.15.9 as an ingress via a LoadBalancer (hence why I'm using contour).

Labels: area/grpc, bug, no stalebot

All 27 comments

@yoitsro sorry, does this consistently happen? Could you provide some screenshots? Thanks!

I tried running the grpc-web example (echotest), but I couldn't reproduce this in either Safari 12 or Safari 13.

image

image

Note that the warning shown there comes from the Grammarly extension I have installed.

This does consistently happen. Here's what I see with the URLs obfuscated. If you'd like more details, let me know and we could potentially do a screen share.

image

It's also worth noting that envoy is being used to terminate TLS too - I've noticed your setup is just over HTTP to localhost.

Same issue with Envoy (via Gloo), where Gloo is managing TLS as well.

Safari 13.0.5 fails; Chrome and Firefox perform as expected.

Are there any updates on this? We are facing the same issue after updating to Istio 1.5.1, which runs Envoy 1.13. Tested with Safari 13.0.5 and 13.1.

I have exactly the same issue, also with Gloo (1.3.14) running on GKE. It seems to be a crash on the WebKit side, in its network process on the queue com.apple.CFNetwork.HTTP2.HTTP2Stream, triggered by some new behaviour of Envoy. So while the problem is not in Envoy itself, maybe there is a way to work around it on the Envoy side.

I couldn't find any issue on WebKit bugzilla so I created one: https://bugs.webkit.org/show_bug.cgi?id=210108

The crash looks like this:

Process:               com.apple.WebKit.Networking [55694]
Path:                  /System/Library/StagedFrameworks/Safari/WebKit.framework/Versions/A/XPCServices/com.apple.WebKit.Networking.xpc/Contents/MacOS/com.apple.WebKit.Networking
Identifier:            com.apple.WebKit.Networking
Version:               14608 (14608.5.12)
Build Info:            WebKit2-7608005012000000~4
Code Type:             X86-64 (Native)
Parent Process:        ??? [1]
Responsible:           Safari [55672]
User ID:               502

Date/Time:             2020-04-06 17:30:11.502 +0200
OS Version:            Mac OS X 10.14.6 (18G3020)
Report Version:        12
Bridge OS Version:     4.2 (17P3050)
Anonymous UUID:        874D8B32-6898-4020-64F8-21AA02638A09

Sleep/Wake UUID:       09C3F802-395E-44B9-88C3-9D71B8E92310

Time Awake Since Boot: 260000 seconds
Time Since Wake:       16000 seconds

System Integrity Protection: enabled

Crashed Thread:        4  Dispatch queue: com.apple.CFNetwork.HTTP2.HTTP2Stream

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [55694]

VM Regions Near 0:
--> 
    __TEXT                 000000010faea000-000000010faeb000 [    4K] r-x/rwx SM=COW  /System/Library/StagedFrameworks/Safari/WebKit.framework/Versions/A/XPCServices/com.apple.WebKit.Networking.xpc/Contents/MacOS/com.apple.WebKit.Networking

Thread 0:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x00007fff5899421a mach_msg_trap + 10
1   libsystem_kernel.dylib          0x00007fff58994768 mach_msg + 60
2   com.apple.CoreFoundation        0x00007fff2c8d499e __CFRunLoopServiceMachPort + 328
3   com.apple.CoreFoundation        0x00007fff2c8d3f0c __CFRunLoopRun + 1612
4   com.apple.CoreFoundation        0x00007fff2c8d366e CFRunLoopRunSpecific + 455
5   com.apple.Foundation            0x00007fff2eb392ff -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 280
6   com.apple.Foundation            0x00007fff2eb391d4 -[NSRunLoop(NSRunLoop) run] + 76
7   libxpc.dylib                    0x00007fff58a9805b _xpc_objc_main + 552
8   libxpc.dylib                    0x00007fff58a97b5d xpc_main + 433
9   com.apple.WebKit                0x000000010fccfeb1 WebKit::XPCServiceMain(int, char const**) + 547
10  libdyld.dylib                   0x00007fff5885f3d5 start + 1

Thread 1:: JavaScriptCore bmalloc scavenger
0   libsystem_kernel.dylib          0x00007fff58997866 __psynch_cvwait + 10
1   libsystem_pthread.dylib         0x00007fff58a5656e _pthread_cond_wait + 722
2   libc++.1.dylib                  0x00007fff55a90b31 std::__1::condition_variable::__do_timed_wait(std::__1::unique_lock<std::__1::mutex>&, std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > >) + 93
3   com.apple.JavaScriptCore        0x0000000115110ec6 bmalloc::Scavenger::threadRunLoop() + 774
4   com.apple.JavaScriptCore        0x0000000115110889 bmalloc::Scavenger::threadEntryPoint(bmalloc::Scavenger*) + 9
5   com.apple.JavaScriptCore        0x0000000115113097 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(bmalloc::Scavenger*), bmalloc::Scavenger*> >(void*) + 39
6   libsystem_pthread.dylib         0x00007fff58a532eb _pthread_body + 126
7   libsystem_pthread.dylib         0x00007fff58a56249 _pthread_start + 66
8   libsystem_pthread.dylib         0x00007fff58a5240d thread_start + 13

Thread 2:: com.apple.NSURLConnectionLoader
0   libsystem_kernel.dylib          0x00007fff5899421a mach_msg_trap + 10
1   libsystem_kernel.dylib          0x00007fff58994768 mach_msg + 60
2   com.apple.CoreFoundation        0x00007fff2c8d499e __CFRunLoopServiceMachPort + 328
3   com.apple.CoreFoundation        0x00007fff2c8d3f0c __CFRunLoopRun + 1612
4   com.apple.CoreFoundation        0x00007fff2c8d366e CFRunLoopRunSpecific + 455
5   com.apple.CFNetwork             0x00007fff2b7b8078 -[__CoreSchedulingSetRunnable runForever] + 210
6   com.apple.Foundation            0x00007fff2eb2f0e2 __NSThread__start__ + 1194
7   libsystem_pthread.dylib         0x00007fff58a532eb _pthread_body + 126
8   libsystem_pthread.dylib         0x00007fff58a56249 _pthread_start + 66
9   libsystem_pthread.dylib         0x00007fff58a5240d thread_start + 13

Thread 3:
0   libsystem_pthread.dylib         0x00007fff58a523f0 start_wqthread + 0

Thread 4 Crashed:: Dispatch queue: com.apple.CFNetwork.HTTP2.HTTP2Stream
0   com.apple.CoreFoundation        0x00007fff2c8b1b6e CFArrayGetCount + 10
1   com.apple.CFNetwork             0x00007fff2b7e5269 HTTP2Stream::_onqueue_processRawHeaders() + 45
2   com.apple.CFNetwork             0x00007fff2b91bc4a HTTP2Stream::_onqueue_endTrailers() + 18
3   libdispatch.dylib               0x00007fff588115f8 _dispatch_call_block_and_release + 12
4   libdispatch.dylib               0x00007fff5881263d _dispatch_client_callout + 8
5   libdispatch.dylib               0x00007fff588188e0 _dispatch_lane_serial_drain + 602
6   libdispatch.dylib               0x00007fff58819396 _dispatch_lane_invoke + 385
7   libdispatch.dylib               0x00007fff588216ed _dispatch_workloop_worker_thread + 598
8   libsystem_pthread.dylib         0x00007fff58a52611 _pthread_wqthread + 421
9   libsystem_pthread.dylib         0x00007fff58a523fd start_wqthread + 13

Thread 5:: WebCore: AsyncFileStream
0   libsystem_kernel.dylib          0x00007fff58997866 __psynch_cvwait + 10
1   libsystem_pthread.dylib         0x00007fff58a5656e _pthread_cond_wait + 722
2   com.apple.JavaScriptCore        0x000000011509f217 WTF::ParkingLot::parkConditionallyImpl(void const*, WTF::ScopedLambda<bool ()> const&, WTF::ScopedLambda<void ()> const&, WTF::TimeWithDynamicClockType const&) + 3927
3   com.apple.WebCore               0x0000000110e275b5 bool WTF::Condition::waitUntil<WTF::Lock>(WTF::Lock&, WTF::TimeWithDynamicClockType const&) + 165
4   com.apple.WebCore               0x00000001111f8f83 std::__1::unique_ptr<WTF::Function<void ()>, std::__1::default_delete<WTF::Function<void ()> > > WTF::MessageQueue<WTF::Function<void ()> >::waitForMessageFilteredWithTimeout<WTF::MessageQueue<WTF::Function<void ()> >::waitForMessage()::'lambda'(WTF::Function<void ()> const&)>(WTF::MessageQueueWaitResult&, WTF::MessageQueue<WTF::Function<void ()> >::waitForMessage()::'lambda'(WTF::Function<void ()> const&)&&, WTF::Seconds) + 211
5   com.apple.WebCore               0x0000000111ea546d WebCore::callOnFileThread(WTF::Function<void ()>&&)::$_9::operator()() const::'lambda'()::operator()() const + 93
6   com.apple.WebCore               0x0000000111ea5409 WTF::Detail::CallableWrapper<WebCore::callOnFileThread(WTF::Function<void ()>&&)::$_9::operator()() const::'lambda'(), void>::call() + 9
7   com.apple.JavaScriptCore        0x00000001150b87c3 WTF::Thread::entryPoint(WTF::Thread::NewThreadContext*) + 403
8   com.apple.JavaScriptCore        0x00000001150bb379 WTF::wtfThreadEntryPoint(void*) (.llvm.12532219493473948909) + 9
9   libsystem_pthread.dylib         0x00007fff58a532eb _pthread_body + 126
10  libsystem_pthread.dylib         0x00007fff58a56249 _pthread_start + 66
11  libsystem_pthread.dylib         0x00007fff58a5240d thread_start + 13

Thread 6:: IndexedDatabase Server
0   libsystem_kernel.dylib          0x00007fff58997866 __psynch_cvwait + 10
1   libsystem_pthread.dylib         0x00007fff58a5656e _pthread_cond_wait + 722
2   com.apple.JavaScriptCore        0x000000011509f217 WTF::ParkingLot::parkConditionallyImpl(void const*, WTF::ScopedLambda<bool ()> const&, WTF::ScopedLambda<void ()> const&, WTF::TimeWithDynamicClockType const&) + 3927
3   com.apple.JavaScriptCore        0x0000000114a8b545 bool WTF::Condition::waitUntil<WTF::Lock>(WTF::Lock&, WTF::TimeWithDynamicClockType const&) + 165
4   com.apple.JavaScriptCore        0x000000011507cb3e WTF::Detail::CallableWrapper<WTF::CrossThreadTaskHandler::CrossThreadTaskHandler(char const*, WTF::CrossThreadTaskHandler::AutodrainedPoolForRunLoop)::$_0, void>::call() + 302
5   com.apple.JavaScriptCore        0x00000001150b87c3 WTF::Thread::entryPoint(WTF::Thread::NewThreadContext*) + 403
6   com.apple.JavaScriptCore        0x00000001150bb379 WTF::wtfThreadEntryPoint(void*) (.llvm.12532219493473948909) + 9
7   libsystem_pthread.dylib         0x00007fff58a532eb _pthread_body + 126
8   libsystem_pthread.dylib         0x00007fff58a56249 _pthread_start + 66
9   libsystem_pthread.dylib         0x00007fff58a5240d thread_start + 13

Thread 7:
0   libsystem_pthread.dylib         0x00007fff58a523f0 start_wqthread + 0

Thread 8:: Dispatch queue: com.apple.CFNetwork.Connection
0   libobjc.A.dylib                 0x00007fff57086bbe -[NSObject autorelease] + 20
1   com.apple.LaunchServices        0x00007fff2ded3769 -[_UTTypeQuery resolve] + 76
2   com.apple.LaunchServices        0x00007fff2ded5649 UTTypeCopyAllTagsWithClass + 116
3   com.apple.LaunchServices        0x00007fff2ded5591 UTTypeCopyPreferredTagWithClass + 12
4   com.apple.CFNetwork             0x00007fff2b7e2d8e copyMIMETypeForExtension + 65
5   com.apple.CFNetwork             0x00007fff2b7e2cfe +[NSURLSessionTaskDependencyTree mimeTypeForURLString:] + 55
6   com.apple.CFNetwork             0x00007fff2b7e6968 -[__NSCFURLSessionTaskActiveStreamDependencyInfo removeStreamWithStreamID:requestURLString:] + 242
7   com.apple.CFNetwork             0x00007fff2b7e6799 __NSURLSessionTaskDependency_RemoveRequest + 135
8   com.apple.CFNetwork             0x00007fff2b7e66a8 HTTP2Stream::cleanUpInUserDataResetCallback(nghttp2_session*, int, unsigned int, HTTP2Connection*) + 90
9   com.apple.CFNetwork             0x00007fff2b7e6616 cf_nghttp2_on_stream_close_callback(nghttp2_session*, int, unsigned int, void*) + 111
10  libapple_nghttp2.dylib          0x00007fff558ddacd nghttp2_session_close_stream + 147
11  libapple_nghttp2.dylib          0x00007fff558db5e4 nghttp2_session_mem_recv + 8010
12  libapple_nghttp2.dylib          0x00007fff558d9621 nghttp2_session_recv + 98
13  com.apple.CFNetwork             0x00007fff2b7e3f1b HTTP2Connection::_onqueue_performRead() + 21
14  com.apple.CFNetwork             0x00007fff2b7e3167 HTTP2Connection::_onqueue_scheduleIO() + 153
15  libdispatch.dylib               0x00007fff588115f8 _dispatch_call_block_and_release + 12
16  libdispatch.dylib               0x00007fff5881263d _dispatch_client_callout + 8
17  libdispatch.dylib               0x00007fff588188e0 _dispatch_lane_serial_drain + 602
18  libdispatch.dylib               0x00007fff588193c6 _dispatch_lane_invoke + 433
19  libdispatch.dylib               0x00007fff5881a667 _dispatch_workloop_invoke + 2100
20  libdispatch.dylib               0x00007fff588216ed _dispatch_workloop_worker_thread + 598
21  libsystem_pthread.dylib         0x00007fff58a52611 _pthread_wqthread + 421
22  libsystem_pthread.dylib         0x00007fff58a523fd start_wqthread + 13

Thread 4 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000001  rbx: 0x00007fd3df564dc0  rcx: 0x00000000000001b1  rdx: 0x00007fff8efc1388
  rdi: 0x0000000000000000  rsi: 0x00007fff588115ec  rbp: 0x000070000c6948a0  rsp: 0x000070000c6948a0
   r8: 0x00000000840f4000   r9: 0x00007fd3df5270e0  r10: 0x0000000000000004  r11: 0x0000000000000004
  r12: 0x0000000000000000  r13: 0x0000000000000000  r14: 0x00007fd3e9022cf0  r15: 0x0000000118a02880
  rip: 0x00007fff2c8b1b6e  rfl: 0x0000000000010246  cr2: 0x0000000000000000

Logical CPU:     10
Error Code:      0x00000004
Trap Number:     14

cc @lizan @qiwzhang @fengli79

Someone responded on the WebKit Bugzilla that the problem is not inside WebKit but in a lower-level library. It is now tracked on Apple's side in Radar as rdar://problem/61383605, if anyone wants to reference it with them.

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

Version v1.12.0 does not have this problem. I am noting it here in case any of you would like a quick solution until the bug is fixed.

The same problem occurs on 1.14.1.
image

I set the following parameters:

  alpn_protocols:
  - "http/1.1"
  - "h2"

And everything worked.

We are experiencing the same issue, but note that it also seems to affect certain versions of Windows 10 and Internet Explorer/Edge (pre-Chromium versions) as well as Safari.

@engineerdev's fix to enable ALPN for HTTP/1.1 does indeed work, but requests are now forced to use HTTP/1.1 even when HTTP/2 is available, so this workaround is not a long-term solution.

You may want to downgrade to Envoy 1.12 until a full (HTTP/2) solution is available.
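
A rough model of why this ordering pins clients to HTTP/1.1 is sketched below. It assumes the TLS listener picks the first entry of its own `alpn_protocols` list that the client also offers (server-preference selection). That is an assumption about the negotiation, not something confirmed in this thread, and `selectAlpn` is a hypothetical helper, not an Envoy function.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not Envoy's API): return the first protocol in the
// server's preference list that the client also offers, or "" if none match.
std::string selectAlpn(const std::vector<std::string>& server_preference,
                       const std::vector<std::string>& client_offer) {
  for (const std::string& proto : server_preference) {
    // ALPN protocol identifiers are non-empty byte strings.
    assert(!proto.empty());
    if (std::find(client_offer.begin(), client_offer.end(), proto) !=
        client_offer.end()) {
      return proto;
    }
  }
  return "";
}
```

Under that assumption, a browser offering both `h2` and `http/1.1` is handed `http/1.1` whenever the listener lists it first, which matches the observation that HTTP/2 is no longer used with this workaround.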

Yes, the issue in Safari is resolved in 1.14 with @engineerdev's alpn_protocols option, but IE/Edge still has issues with the same setup.

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

Hey @eselkin @rogchap - did this issue also surface with you in Envoy 1.13 or was it potentially happening on 1.12 too?

Just on 1.13. People reverted to 1.12 and said the issue was resolved. But I could not revert.

Yes, the issue was there for 1.13 too, but not 1.12. One theory was that it is related to the upgrade of nghttp2, since HTTP/2 seems fine with Chrome and Firefox.

We're currently experiencing this issue. Is this fixed with any of the later versions of envoy without having to enable ALPN for HTTP/1.1?

@LukeHickling Sadly not; still hoping that someone on the envoy core team can look into this.
We've had to roll out with using HTTP/1.1 again 😞.

Same problem with Envoy 1.15.0 :(

I'm going to start digging into this soon. I have no experience with envoy's source, but I'll start documenting progress here.

Given all of three minutes, I've naïvely headed straight to the grpc_web filter code (https://github.com/envoyproxy/envoy/commits/master/source/extensions/filters/http/grpc_web), and I can see there are a couple of commits made between the v1.12.2 release and the v1.13.0 release.

There's this commit, which very much seems like a big ol' find and replace, so nothing so specific to us: https://github.com/envoyproxy/envoy/commit/5248a4fb7d4c2a3d1fa151f944d3a63f6b7a06cf#diff-27a67954e6368bb1ca4f2bb23d271a50

And then there's this, which seems likely unrelated: https://github.com/envoyproxy/envoy/commit/2d5a4e94720cc195324f79ca68f0e7a7dc83ee9e#diff-181baa129c5643cbe0aba4b4357d2905

BUT then there's this: https://github.com/envoyproxy/envoy/commit/cbf565fed3ecf04df7be9b90c3f1384396c54012#diff-181baa129c5643cbe0aba4b4357d2905

This seems somewhat related. From our experience, the gRPC request is successfully called on the server application and the application returns the response. This suggests the error is then in the _encoding_ of a gRPC response into a gRPC-web response, which is what this commit deals with. Can anybody else confirm that their application receives the gRPC request successfully?

@mattklein123 and @Chuongv, is there anything in that commit which stands out to you which might cause the behaviour we're seeing?

EDIT: I _know_ this is inherently a bug in Safari, as opposed to a direct bug in envoy, but I guess we can only control what we can control :) thanks in advance for any help from anyone 🙏

@yoitsro The only thing that might be the issue is that the trailers for the grpc web response might not be encoded which might put Safari in a bad state? But then that does not explain why it works on the other web browsers 🤷 ?

You could try turning on enable_trailers on the listener to check and see if that would fix the issue?

Seems like skipping trailers.clear() for HTTP/2 could help this. @yoitsro could you try my image: dio123/envoy:grpc-web-debug (note that this one is ubuntu-based, not the alpine one)? From my local testing here, it seems to be working (I could reproduce this issue using the latest release).

Basically, the following:

  if (encoder_callbacks_->streamInfo().protocol().value() != Http::Protocol::Http2) {
    // Clear out the trailers so they don't get added since it is now in the body
    trailers.clear();
  }
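
For readers unfamiliar with the filter, the change above can be modelled roughly as follows. This is a minimal standalone sketch, not Envoy's code: `Protocol`, `Trailers`, `encodeTrailerFrame` and `moveTrailersIntoBody` are hypothetical names, and the frame layout assumes the gRPC-Web convention of a flag byte with the high bit set, a 4-byte big-endian length, and `key:value` lines terminated by CRLF.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for Envoy types; these are not Envoy's real API.
enum class Protocol { Http10, Http11, Http2 };
using Trailers = std::map<std::string, std::string>;

// Serialize trailers as a gRPC-Web trailer frame:
// flag byte 0x80, 4-byte big-endian payload length, then "key:value\r\n" lines.
std::string encodeTrailerFrame(const Trailers& trailers) {
  std::string payload;
  for (const auto& kv : trailers) {
    payload += kv.first + ":" + kv.second + "\r\n";
  }
  // The length field is only 4 bytes wide.
  assert(payload.size() <= 0xFFFFFFFFu);
  const std::uint32_t size = static_cast<std::uint32_t>(payload.size());

  std::string frame;
  frame.push_back(static_cast<char>(0x80));
  for (int shift = 24; shift >= 0; shift -= 8) {
    frame.push_back(static_cast<char>((size >> shift) & 0xFF));
  }
  return frame + payload;
}

// Append the trailer frame to the body, then drop the original trailers
// only when the downstream connection is not HTTP/2 (the proposed change).
void moveTrailersIntoBody(Protocol downstream, Trailers& trailers,
                          std::string& body) {
  body += encodeTrailerFrame(trailers);
  if (downstream != Protocol::Http2) {
    trailers.clear();
  }
}
```

In this model the body carries the trailer frame either way. The only difference is that an HTTP/2 downstream keeps its non-empty trailer map, while other protocols end up with an empty one, as before the change.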

I can confirm that this is working :tears_of_joy:

Is there anything I/we can do to help get this released?

@yoitsro sorry for the late reply, but I have submitted a PR: https://github.com/envoyproxy/envoy/pull/12178.
