This occurs randomly, but it seems to happen after calling a contract's function.
The extension shows no information about the issue, but the console contains:
inpage-provider.js:14
MetamaskInpageProvider - lost connection to MetaMask
(anonymous function) @ inpage-provider.js:14
(anonymous function) @ index.js:72
f @ once.js:25
(anonymous function) @ index.js:44
call @ index.js:49
(anonymous function) @ index.js:69
f @ once.js:25
(anonymous function) @ index.js:28
f @ once.js:17
EventEmitter.emit @ events.js:96
PostMessageStream._onMessage @ index.js:34
index.js:54 Uncaught Error: StreamProvider - Unknown response id(…)
StreamProvider._onResponse @ index.js:54
StreamProvider._write @ index.js:77
doWrite @ _stream_writable.js:319
writeOrBuffer @ _stream_writable.js:308
Writable.write @ _stream_writable.js:246
ondata @ _stream_readable.js:531
EventEmitter.emit @ events.js:81
readableAddChunk @ _stream_readable.js:198
Readable.push @ _stream_readable.js:157
Transform.push @ _stream_transform.js:123
(anonymous function) @ obj-multiplex.js:15
Transform._read @ _stream_transform.js:159
Transform._write @ _stream_transform.js:147
doWrite @ _stream_writable.js:313
writeOrBuffer @ _stream_writable.js:302
Writable.write @ _stream_writable.js:241
ondata @ _stream_readable.js:542
EventEmitter.emit @ events.js:81
readableAddChunk @ _stream_readable.js:213
Readable.push @ _stream_readable.js:172
PostMessageStream._onMessage @ index.js:32
This is critical for a DAPP because there is no feedback to the DAPP and neither callback is ever invoked.
The piece of code I use:
...
return contract.myFunction(param1, param2, ..., {from: account, value: price})
  .then(_tx => {
    // NEVER GET THERE
  })
  .catch(e => {
    // NEVER GET THERE
  })
... game over!
This issue is a very critical one and I don't see much progress, so I would like to know if there is a way to help.
I have been doing some testing today, still using 2.13.7 (the same version I had when reporting the issue in the first place), and I saw the issue far less often when I tested a few hours ago.
Since the extension version did not change, I assume the code I run is still the same, so something external to the extension must have changed.
Could it be related to the node you are hosting? Do you have load stats for it?
This won't be a solution on its own, but understanding what causes the issue may allow building an environment where it occurs more frequently, which I am certain would help identify and fix the bug.
Sorry progress has been slow. We were at fractional capacity the last couple of weeks, as we discussed with you in private Slack.
We also consider this very critical, but we actually have other critical issues to deal with as well! They basically never stop, so please be patient.
still using 2.13.7 (same version that I had when reporting the issue in the first place)
I don't think that's true; it was probably 2.13.6. 2.13.7 was only released 8 days ago, while you opened this issue 10 days ago.
You can see the release here:
https://github.com/MetaMask/metamask-plugin/releases/tag/2.13.7
That patch was mostly related to out-of-gas errors. If you noticed less frequent disconnects, and the disconnects were resulting from initiating transactions, it's possible the disconnect crashes were coming from a sort of gas estimation error.
This would still mean we need to do a better job of handling errors, since it looks like some error may be crashing the background in a way that prevents a reconnection, but it's good to hear it might have improved a bit?
If there's one thing you could do to help us move this along, it would be to give us a nice reproduction environment. Ideally this would be a dapp that causes the disconnect, that we could visit and debug from.
Hi @flyswatter, no worries. I am just hoping I can help.
I am sure about the version: I did test with 2.13.7 (yes, a few days after reporting the issue, as I hoped that some other fixes might have helped), but the frequency was rather high (not 100% of the time, though). You can see the version number in the video below (I took it 6 days ago).
I can certainly provide a DAPP to test, as well as a recording showing the issue.
https://youtu.be/w4HqG8cZ9UI?t=5m1s
You can use the DAPP on testnet from http://ethincorp.com/beta
There is an issue on my side, so if you get a message saying that MetaMask is missing (I am checking for the presence of web3 too early and don't give it time to load...), reload (or hard reload) the page and you should see the CREATE button in the menu. That's where you can test.
To test, you do not need to input anything. The default address will be yours.
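For reference, the fix on my side would be something like the following sketch: defer the web3 check to the window 'load' event instead of checking at script evaluation time (startDapp is just a placeholder name, not my actual entry point):

// Wait for the page to finish loading before looking for the injected web3
window.addEventListener('load', function () {
  if (typeof window.web3 !== 'undefined') {
    startDapp(window.web3) // placeholder for the DAPP's real entry point
  } else {
    alert('MetaMask is missing')
  }
})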
There are times, like this morning, when things look better, meaning the tx hash comes back in a reasonable time of around 30s and the disconnection message shows up less than once per contract call.
There are also times when it takes forever to get a tx hash back, if it comes back at all, and usually MetaMask loses the connection before returning anything.
Just my 2 cents: I just saw a case where I could use the DAPP mentioned above. It was slow but worked. I left the page open and did something else, and after a while I saw the 'lost connection' message.
While it is open, the DAPP polls both the current network and the current account every second.
I ran a test with polling every 100ms, and it seems to trigger the issue more often.
I will make you a special version that polls even more often, hoping it will make the issue easier to reproduce.
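To be concrete, the polling is roughly shaped like this sketch (callback-style web3 0.x API; handleNetwork and handleAccount are placeholders for my real handlers):

var POLL_INTERVAL_MS = 1000 // 1000 on /beta, lowered in the test versions

setInterval(function () {
  // Poll the current network
  web3.version.getNetwork(function (err, networkId) {
    if (!err) handleNetwork(networkId)
  })
  // Poll the selected account
  web3.eth.getAccounts(function (err, accounts) {
    if (!err) handleAccount(accounts[0])
  })
}, POLL_INTERVAL_MS)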
You can find a special version at https://ethincorp.com/metamask/
This version polls every 10ms, and I set https://ethincorp.com/beta back to polling every 5s.
In 'your' version (https://ethincorp.com/metamask/), the issue now occurs very quickly, and after reloading the page a few times you should see it within 5 seconds.
Very nice, I was able to reproduce very quickly! Thank you, this will help!
This crash is caused by receiving a response to a request that our StreamProvider does not remember making.
The web3-stream-provider keeps a hash of all issued requests by ID, and deletes them as requests with a matching ID are returned.
It seems that when we issue requests, we assign each one an ID based on web3-provider-engine/util/random-id.js.
This "random" id generation code uses the current time + 3 additional digits.
When a site is making many requests at once, we're basically playing a lottery, waiting for two requests to end up with the same ID.
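To illustrate the failure mode, here is a simplified sketch of the bookkeeping (not the actual web3-stream-provider code):

var pendingRequests = {} // id -> callback for every in-flight request

function sendRequest (payload, callback) {
  // A colliding id silently overwrites the earlier entry here
  pendingRequests[payload.id] = callback
  // ...payload is then written to the stream...
}

function onResponse (response) {
  var callback = pendingRequests[response.id]
  if (!callback) {
    // The second response for a collided id lands here
    throw new Error('StreamProvider - Unknown response id')
  }
  delete pendingRequests[response.id]
  callback(null, response)
}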
I propose we extend the number of random digits to something dramatically less likely to collide, like the current unix millisecond time + 10 random digits. It still has collision potential, but far, far less.
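Roughly something like this (a sketch of the proposal, shown as a string ID because the combined value would exceed JavaScript's safe integer range as a plain number):

function createRandomId () {
  var timePart = Date.now().toString() // 13-digit unix millisecond time
  var randomPart = Math.floor(Math.random() * 1e10).toString()
  while (randomPart.length < 10) randomPart = '0' + randomPart // pad to 10 digits
  return timePart + randomPart
}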
I don't think polling every millisecond makes sense, but wouldn't it be better to prevent collisions by using a hash of your current or extended ID?
Sure, hashing is no guarantee either, but it generates very different IDs for consecutive inputs.
Moreover, re-injecting the previous hash into the string used to generate the new one would solve the problem even if we generated IDs every nanosecond.
We initialise with oldHash = ''
newHash = hash(timeBasedID + oldHash)
oldHash = newHash
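A rough sketch of what I mean (Node-style crypto shown only for illustration; any hash function would do in the browser):

var crypto = require('crypto')

var oldHash = '' // initialise with an empty previous hash

function nextId () {
  var timeBasedId = Date.now().toString()
  var newHash = crypto.createHash('sha256')
    .update(timeBasedId + oldHash)
    .digest('hex')
  oldHash = newHash // chain the hashes so consecutive IDs can never repeat
  return newHash
}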
It's very funny that you've found a way to use a tiny blockchain to solve this problem.
Looks like it would work.
😄 I just don't think we can find a time base that works.
If you pick an ID with a resolution of 1s, there will always be someone polling every 1ms.
And if you use 1ms, someone will use less... etc...
So with chained hashes, what counts is mainly the first value, and a resolution of around a millisecond would do it. I doubt a user can reload MetaMask in less than a few milliseconds... at least!
Happy you pinned this one down, @flyswatter.
Thanks for looking into it.
One tradeoff is that hashing is kind of computationally expensive, so a hash-based ID scheme would slow down requests a bit. Maybe we could do something simpler, like appending the modulus of the most recent previous request, or even just keeping track of the request IDs and incrementing them.
Actually yes, simply keeping track and incrementing does it; you are totally right.
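Something as trivial as this sketch would do:

var lastRequestId = 0 // per-provider counter, unique for the lifetime of the page

function createRequestId () {
  lastRequestId += 1
  return lastRequestId
}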