As a result of #450, in trufflesuite/ganache-core#41, @0xNPE gave us some much needed tests which reproduce the underlying problems behind #417. This bounty is for a fix for the underlying problem which those tests identify.
Bounty acceptance criteria:
ganache-core or any other impacted projects.Important note: This bounty will not be paid out until all required PRs are accepted and merged.
Tips:
ganache-core you can use npm link to symlink your local copy of the dependency into the node_modules directory of the ganache-core project.standard tool to verify that you have done so.__This issue now has a funding of 0.5 ETH (374.81 USD) attached to it.__
happy new year! im going to spend some cycles this week sourcing someone to take this on..
FYI, https://github.com/ethereumjs/merkle-patricia-tree/pull/28 looks like a promising fix for TypeError: Cannot read property 'get' of undefined
I dug a little deeper into TypeError: Cannot read property 'pop' of undefined. It happens during ethereumjs-vm runTx when it flushes the cache to its trie after a couple of checkpoints, commit, and revert.
Hi, I am the author of https://github.com/ethereumjs/merkle-patricia-tree/pull/28.
I also looked at the pop error and I think it's related to reading a non-existent trie node (maybe due to some confusion about the currently active DB).
IIRC, the problematic call stack goes like this:
// baseTrie.js
put
findPath
_walkTrie
_lookupNode
_walkTrie.processNode
findPath.processNode
_updateNode
Where _updateNode is then called with an undefined value for stack.
If you look at the _walkTrie function, you will find that the child processNode function calls cb without arguments if it does not receive a node (!node).
cb then does onDone.apply(null, returnValues) with an empty returnValues array (because walkController.return was never called and thus returnValues never reassigned in this case).
Thus, stack in _updateNode ends up undefined.
Also interesting to look at are the conditions under which _lookupNode (called from _walkTrie) returns a null value.
@federicobond your analysis of the call stack is similar to what I observed.
but I don't think the root cause can be fixed in merkle-patricia-tree.
I think the root cause pertains to ethereumjs-vm working on its cache of statemanager while in checkpoint mode, then when it tries to persist its cache after certain revert actions, there is a reference to a node that never existed in the db, hence your call stack occurs.
That makes sense. I'll see if I find anything else.
__The funding of 0.5 ETH (427.59 USD) attached has been claimed by @kammerdiener.__
@kammerdiener, please leave a comment to let the funder (@owocki) and the other parties involved your implementation plan. If you don't leave a comment, the funder may expire your claim at their discretion.
So a quick update on this. Based on what I have found I think that both issues are currently originating from ethereumjs/merkle-patricia-tree. I have fixed the get being undefined. Currently working on figuring out what is going on with pop coming back undefined.
Believe I have fixed it all within ethereumjs/merkle-patricia-tree. The Pull Request is here: PR #30. I tried to take into account all of the information talked about from above and was sure to check @federicobond PR to help fix the _getDBs portion. All of the tests pass now with no errors.
I went into the node_modules folder and replaced the items and am attaching a screenshot of it passing the race conditions test. Once the PR for Merkle Patricia Trees is merged then. Should I go ahead and submit a PR for this with it including the race conditions test (removing the skip command)?

@kammerdiener there are actually a couple of different competing submissions at this point. I'll be taking some time early this week to evaluate them. I realize now that I said that the goal was to make the tests pass, but the real goal is to fix #417. The best thing that you can do at this point is to manually test ganache-cli with your fix in place (you can use npm link to do this) and determine whether you observe either of the crashes identified in #417. This is how I'll be evaluating the proposed fixes as well, so it'll take a bit of time, I'm afraid.
@roderik by any chance would you like to help with evaluating these fixes, as I believe you've been impacted by this bug quite regularly?
@kammerdiener, as I feared, when testing your fix I observe a new issue. I believe this is due to what @0xNPE @federicobond were discussing above. That said, your changes do make ganache much more stable, so until a proper fix is found we will likely include this fix in future releases (provided that it is accepted by the ethereumjs/merkle-patricia-tree maintainers).
The newly observed issue:
Error: Error: LevelUpArrayAdapter named 'blockLogs' index out of range: index 190; length: 187
at ganache-cli/build/lib.node.js:87740:23
at ganache-cli/build/lib.node.js:87714:5
at ganache-cli/build/lib.node.js:88160:12
at ganache-cli/build/lib.node.js:87925:19
at ganache-cli/build/lib.node.js:92548:7
at _combinedTickCallback (internal/process/next_tick.js:67:7)
at process._tickCallback (internal/process/next_tick.js:98:9)
at Object.InvalidResponse (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:41484:16)
at node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:329530:36
at node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176186:11
at node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:325200:9
at XMLHttpRequest.request.onreadystatechange (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:328229:7)
at XMLHttpRequestEventTarget.dispatchEvent (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176415:18)
at XMLHttpRequest._setReadyState (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176705:12)
at XMLHttpRequest._onHttpResponseEnd (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176860:12)
at IncomingMessage.<anonymous> (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176820:24)
at IncomingMessage.emit (events.js:164:20)
Alright thanks for the information. I will start working trying to figure this out as well. Do you have any more information on how you caused this error?
@kammerdiener it's fairly easy to do this with trufflesuite/ganache, as it'll crash somewhat readily when running truffle test on nearly any truffle project while the accounts page is visible. You can grab the metacoin truffle box if you don't have a project handy.
I recommend using the develop branch Ganache as well, as that will let you see your logs on crash a bit more easily by hitting the "report issue to github" button (just do me a favor and don't submit the issue that it opens up).
Also I'd expected ganache-cli to have good sourcemap support by now. I'm looking into this and will commit a fix to the relevant places once that's done. Will comment back here as well, hopefully with a mapped stack trace rather than the useless one I just posted.
@benjamincburns I will be watching for it. Thanks for the tip. I will see what I can do to fix it and make it more stable!
Also be advised that the error I reported will be visible as an error response in your tests. Ganache doesn't crash in this case, but it's also useless for future tests (rerunning tests w/o restarting Ganache just spits out the same error)
@kammerdiener I managed to fix sourcemaps. I'll commit that to the develop branch of ganache-cli within the next few minutes.
In the mean time, here's the mapped stacktrace:
Error: Error: LevelUpArrayAdapter named 'blockLogs' index out of range: index 28; length: 28
at ganache-cli/build/webpack:/ganache-core/lib/database/leveluparrayadapter.js:51:1
at ganache-cli/build/webpack:/ganache-core/lib/database/leveluparrayadapter.js:25:1
at ganache-cli/build/webpack:/ganache-core/node_modules/level-sublevel/shell.js:102:1
at ganache-cli/build/webpack:/ganache-core/node_modules/level-sublevel/nut.js:122:1
at ganache-cli/build/webpack:/ganache-core/node_modules/cachedown/index.js:53:1
at _combinedTickCallback (internal/process/next_tick.js:67:7)
at process._tickCallback (internal/process/next_tick.js:98:9)
at Object.InvalidResponse (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:41484:16)
at node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:329530:36
at node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176186:11
at node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:325200:9
at XMLHttpRequest.request.onreadystatechange (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:328229:7)
at XMLHttpRequestEventTarget.dispatchEvent (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176415:18)
at XMLHttpRequest._setReadyState (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176705:12)
at XMLHttpRequest._onHttpResponseEnd (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176860:12)
at IncomingMessage.<anonymous> (node/v9.3.0/lib/node_modules/truffle/build/cli.bundled.js:176820:24)
at IncomingMessage.emit (events.js:164:20)
Actually now that I read that a bit more carefully, it seems that it's actually broken the truffle side of things. I'll dig in a bit further and see if I can't figure out what's gone amiss.
@kammerdiener I'm afraid I also just observed the original issue with your fix in place:
TypeError: Cannot read property 'pop' of undefined
at CheckpointTrie.Trie._updateNode (ganache-cli/build/webpack:/merkle-patricia-tree/baseTrie.js:359:1)
at ganache-cli/build/webpack:/merkle-patricia-tree/baseTrie.js:106:1
at ganache-cli/build/webpack:/merkle-patricia-tree/baseTrie.js:460:1
at processNode (ganache-cli/build/webpack:/merkle-patricia-tree/baseTrie.js:465:1)
at ganache-cli/build/webpack:/merkle-patricia-tree/baseTrie.js:503:1
at ganache-cli/build/webpack:/merkle-patricia-tree/baseTrie.js:179:1
at ganache-cli/build/webpack:/merkle-patricia-tree/util.js:75:1
at ganache-cli/build/webpack:/merkle-patricia-tree/node_modules/async/lib/async.js:52:1
at ganache-cli/build/webpack:/merkle-patricia-tree/node_modules/async/lib/async.js:269:1
at ganache-cli/build/webpack:/merkle-patricia-tree/node_modules/async/lib/async.js:44:1
@benjamincburns gladly!
@kammerdiener did you still want to claim the work here, or should i release it to someone else?
@owocki I'll work on it some more. I think I have an idea for the fix.
@owocki I have exhausted all of my potential fixes. If you would go ahead and release it in the interest of getting this fixed sooner. I will look for other issues to try and tackle.
@owocki are claims mutually exclusive, or can multiple people register a claim to the bounty at once? If only one claim can be registered at a time, I think it's best if we reject @kammerdiener's claim until he's resubmitted. I really appreciate your work, @kammerdiener, but I worry that people may see a claim for the bounty here and be put off from trying to solve the issue, when the truth is that the field is still wide open.
@benjamincburns totally understand and totally agree. I am out of ideas as to what to check (although I will be watching for the solution just because of my own curiosity).
Hi @benjamincburns -- only one person can claim work for an issue at a time. We do this so that the network is not an enabler of spec work.
I'm rejecting @kammerdiener 's claim now.. Thanks for the work and the solid communication.
__This claim for the funding of 0.5 ETH (516.33 USD) attached to this issue has been rejected and can now be claimed by someone else.__
__The funding of 0.5 ETH (491.36 USD) attached has been claimed .__
If you are the claimee, please leave a comment to let the funder (@owocki) and the other parties involved your implementation plan. If you don't leave a comment, the funder may expire your claim at their discretion.
I think the best (only?) way that this can be solved such that the band-aid fix of processing each request individually can be removed is if instances of merkle-patricia-tree are short-lived and not shared between requests. Rather, each instance should share the same backing db.
I had a local branch open a while ago that did this, and which also created instances of the EVM as needed, but I never managed to get it working with the forking feature.
The tests expose two different issues both in the https://github.com/ethereumjs/merkle-patricia-tree. There may be additional problems between the ganache doubles
I have both working, but need to cleanup code and will put up a patch
hey @clehene are you the claimee above?
@owocki hey, yes
That's exciting news, @clehene! I'm eager to see what you've come up with!
@clehene Just a tip - if you're going to test using Ganache like I described here, you'll want to be sure to disable the request queuing in ganache-core, found in lib/provider.js.
@benjamincburns hey first sorry for the delay.
Second, I'm afraid I'll need a bit more info on how to do that test or the one mentioned in #417 - i.e. concrete instructions
I tried a few things, but I'm otherwise doing many things for the first time, so likely missing some context.
lib/provider.js in ganache-coretruffle unbox metacointruffle test in metacoin it works by defaultnpm start in ganache and then truffle test in metacoin it fails with a complaint its' missing ConvertLibI can't tell what would be request queueing (requestfunnel subprovider?)
Request queuing is here
Easiest way to disable it is to pass an options object with asyncRequestProcessing = false. You can hard code this in ganache in src/chain.js to make life a little easier.
if I do a truffle test in metacoin it works by default
Just a heads up that if you don't pass asyncRequestProcessing = false then this will likely be the case, as the queuing mechanism I linked to above sidesteps the issue.
if I do npm start in ganache and then truffle test in metacoin it fails with a complaint its' missing ConvertLib
If I remember correctly, ConvertLib is a contract which comes as part of the metacoin box. I'd recommend reinstalling the box.
Also @clehene if you're thinking it's working well in ganache-cli, I'm happy to help test it in the Ganache UI if you're having difficulties - just need you to point me at the branch(es) which contain(s) your fix(es).
@clehene actually, I can do one better than that. I've just pushed a branch to ganache-core called fix-forking-debug. In this branch I add test/forked_debug.js, which is essentially the same test as test/debug.js except I'm debugging a transaction from a forked chain. The changes I've made in that branch should have been enough to make things work, but unfortunately I run into the missing stack in _updateNode (aka Cannot read property 'pop' of undefined) crash.
I have a feeling that if your changes make this test pass, plus 0xNPE's tests, we'll be in really good shape.
later edit: missed the last few comments after my previous
@benjamincburns I spent some time over the weekend: I had a bit of trouble figuring out how to link modules to run the cli with my version of ganache-core.
Guessing the two options you mentioned were _queueRequest vs `_sendAsnyc.
I had my initial work based on just the unit tests provided, so had to re-introduced some scaffolding in the otherwise cleaned up code.
The two unit tests address 2 distinct problems. Both are fixed.
In short there's at least an additional one related to async ops under semaphores which will reproduce with truffle test and sendAsync but only under certain conditions, bot not captured by the unit tests.
In other words, I can probably spend some time later this evening to put up a patch to start the review and some validation, before a full assessment of the remaining stuff.
but unfortunately I run into the missing stack in _updateNode (aka Cannot read property 'pop' of undefined) crash.
@benjamincburns the current fix (which will need a lot of scaffolding cleanup so may take a bit of time this evening) fixes the missing stack and also a good degree of the get issues scenarios.

@clehene : I don't know how to say it, but thanks for the coming patch ! I've never been F5'ing an issue so much :p
This bug has been so annoying ! So thanks again a thousand times ! Also Thanks to the ganache-cli team for the tools you provide :3
I'll take a rare pleasure to read the patch code and try to improve my understanding of this awesome tool :D
P.S : however no rush, this was to thanks, not to rush anything, i'll wait the needed time to enjoy a clean fix :)
I fixed that last get undefined issue (and added a few additional locks in places where it was clear it could have caused issues). However this last issue was trickier and is partially a break of encapsulation problem: getRaw should probably be private but is called directly from Account.
To recap, when doing a checkpoint we end up with two dbs and all reads (getRaw->db.get) happen on both asynchronously. When things are committed or reverted the second db goes away and the in-flight get will try the second instance (think db[0] db[1]) that's not there anymore and fail with undefined .
This means that access needs to be synchronized around anything that would race for db instances.
Many of the operations were already using semaphores for this however some didn't (e.g. checkpoint, get). The problem with the last race was due to direct access to getRaw from Account.prototype.getCode. Again IMO this is mostly an encapsulation problem as get should probably be used instead (and perhaps that's the right fix, although I'd say it would also require making getRaw "private".
The alternative (which is what I'll include in the patch) for a merkle-patricia-tree only fix is to mark the under-lock state with a flag (param), based on which's absence (if flag is off) the semaphore can be acquired to avoid the race.
The pop issue, like I mentioned, was a logical one and happened due to propagation (or lack thereof) not found errors.
While most of this conversation has happened on ganache-cli all the problems are actually in merkle-patricia-tree.
Lastly, the merkle-patricia-tree code is more complex than it perhaps should given the functionality (deep callback stack, mutual recursion, heavy async, etc.). In addition the stuff around it, there may be other issues lying around,
As a caveat - now with the get / pop issues out of the way I can sometimes see
https://github.com/trufflesuite/truffle/issues/558
https://github.com/trufflesuite/ganache-cli/issues/433
Initially I was concerned that they were somehow introduced by these fixes and tried to see how but that wasn't the case (not sure how they occur though) however I've confirmed them without the fixes as well.
Great work @clehene! I've added a comment to the PR, however bear in mind that I don't maintain that repo, so I can't accept that PR. I'll flag down some of the ethereumjs folks tomorrow and see if we can get someone to review it.
Quick update.
summarizing the PR mention here as I closed it for now:
As I updated multiple methods to use the semaphore, there were two without callbacks and I missed fully fixing them.
As checkpoint takes a semaphore and changed from sync to async, there are implications to call sites and wanted to revalidate after changing the callsites.
I also realized the second one was unnecessary after inspecting all possible ways to get to it and realizing it should be locked already.
As I'm blowing in yogurt ;) I'm double checking the two assumed unrelated issues to be sure.
I don't think the changes would be to big / hard (I've made them in the libraries involved with truffle) and seem to be ok.
Hi @clehene, how are things going? I can't help but feel like you were sooooo close with the changes you submitted last time. No pressure, but let me know if there's anything I can do to be of assistance!
@benjamincburns hey, I've integrated the changes in VM in order to use the async checkpoint. However I've started seeing the ou of gas errors after (The contract code couldn't be stored, please check your gas amount.). I don't think my fix is causing it necessarily, but it may make it occur more frequently now (https://github.com/trufflesuite/truffle/issues/558).
Hence I've instrumented all libs to track things and went down the rathole tryin to fix that too. Found one or two more potential issues... Meanwhile had a hand surgery and a bit slower :). I can repost the patch with the latest changes and take it from there.
Meanwhile had a hand surgery and a bit slower :)
"... with one hand tied behind his back!" :joy:
No pressure - definitely take it easy if you're still in recovery.
I appreciate you trying to sort out trufflesuite/truffle#558, but I think if you're reasonably confident that the change you've made is necessary to fix #417, then you've just exposed some other existing flaw in the process, not created a new one. I'm overjoyed if you want to take that on, too - but don't feel that it's necessary to capture this bounty.
@benjamincburns , @owocki the bounty shows up as "open" with "expired funds". It looks that someone is working on this, pleas clarify the status.
@lazaridiscom I claimed it and started working on it but it's indeed not showing that. Not sure why
gitcoin.co has a few hickups! A personal note: please remember that we have only 2 hands, and usually there are no spare parts!
Good luck with your hunt.
the bounty shows up as "open" with "expired funds". It looks that someone is working on this, pleas clarify the status.
first; sorry for the slow response.. been traveling and doing ETHDenver.com stuff..
bounty is still open, please proceed knowing that the funds are still available. ive got an action item to update the issue in the system later this week
@clehene Please treat this one as if it were yours. We're working to get the status changed on Gitcoin and it should be updated shortly here. cc @owocki
just re-added the work status here.. thanks for your patience @clehene
@owocki @vs77bb thanks, no problem :) I've rebased things in the weekend but had to retest things a bit. I was unable to use npm link for most parts and haven't spent time trying to make that work so have to do multiple diffs every time I rebase. Currently changes are in merkle-patricia-tree, ethereumjs-vm and ganache-core. I'm pushing them and re-creating the PRs
I'll need to rebase one more time. These are the updates
https://github.com/clehene/merkle-patricia-tree/commit/d4ca2388b5f32c0bf09ab1901ed4a1b09be1aa8d
https://github.com/clehene/merkle-patricia-tree/commit/7f070d2ec9cd14ea6fea98dc0ef696f715e9916e
https://github.com/clehene/ganache-core/commit/5b6a8df30b571b9d40db044c3d464db39310c1d7
https://github.com/clehene/ethereumjs-vm/commit/b51229c77d78cbb9f08cf68da7f063263daa2dc0
Did another rebase with latest merkle-patricia-tree changes, but can't reproduce pop error (test hangs) will look into it
hi all. i just killed the bounty for this issue for some gitcoin-internal migration reasons but wanted to let you know that, regardless of the issue description on gitcoin.co.. im good to pay out this bounty when the time is right. just @ me back then!
hi @clehene ! Sorry if i'm being annoying but is there any update or info about this bugfix ?
Thanks, and much datalove !
@Asone hey, yes. I the last few updates above were up to date and ready for PR-ing. I think I've made one fix after I had those. I'd have to rebase things locally to get latest changes and can push again. IMO those can be made PRs, but finalizing would require a conversation around some of the original intentions in the code.
The two problems were that concurrent tests were still problematic after these fixes due to the issue described above so you won't be able to consistently run tests in parallel.
The other was mostly a discussion about the fixes that I was hoping to have with someone.
Sorry, but I had to do other stuff and had to pause this. If you're willing to give them a spin, I'd love input. I think I can get them updated and pushed later today.
@Asone it took me a while to get back the context of where I left this.
I pushed the latest rebased versions including some wip related to the put problem which adds the new err param. I've had some trouble figuring out which callback is getting called on error here (https://github.com/clehene/merkle-patricia-tree/blob/master/baseTrie.js#L495) as sometimes it's unexpected (or maybe my debugger is wrong..) I need to spend a bit more time to finish that wip:
For context the background of this issue is that there's a conflicting expectations wrt what happens when root is not found in the trie (more on that below) between get and put (which is why put blows in the original issue description). This said, introducing the err was tedious as we need to differentiate between actual errors and the "not found" (which btw we'd have to change to something better than the current placeholder (https://github.com/clehene/merkle-patricia-tree/blob/master/util.js#L78).
Of course the root of the problem is caused by trying to walk a trie that has a broken root. In the case of the test for this issue, this is artificially created by this https://github.com/trufflesuite/ganache-core/blob/develop/test/stability.js#L179 (note that changing the root of the trie is sufficient to reproduce the problem, so arguably the rest of the test is unnecessary).
Finally, the main problem is why the root would get broken in the first place under normal operations (while outside of the direct scope of this issue, that may be the root cause and @0xNPE suggestion may be valid, yet the issues indicated here are still valid.
Just another person rooting for a fix :).
@clehene are you still intending to complete this fix? I'd love to award this bounty if possible.
Either way, @seesemichaelj is going to take over responsibility from me for tracking this. I'll still be around, however if you guys have any questions for me.
@benjamincburns @seesemichaelj this got more or less blocked with the final bits as I saw two different ways which depend on input from EVM / merkle-tree. I.e. we'd need to understand the author's intentions there as the logic is a bit convoluted and undocumented or otherwise discuss what's the best thing based on context.
I wouldn't mind taking another stab. TBH the bounty was just the cherry on top, I mostly wanted to get my feet wet and have it done, but I need to discuss with someone.
I like to finish things when I start them, but getting the whole context back and rebasing it will take some effort. I need to go over notes, etc. as I don't fully remember all bits - so I may need to spend some time and would only do that if I'm sure I can get proper support and not sure whom to ask for that :)
Hey @clehene; sorry I'm just getting around to this now. Please mind me as I'm getting up to speed what the issue is and possible theories/solutions.
As afar as getting input from EVM/merkle-tree, I think one of the best people to start a discussion will be @holgerd77. However, just looking at the history of baseTrie.js it may be difficult to get "original author" intentions.
Looking back on the stuff you said March 30, it may be best to go ahead and submit PRs and probe questions. It'd help if you have somewhat direct questions as well as a detailed description of your proposed changes.
Are the commits https://github.com/clehene/merkle-patricia-tree/commit/4927c76914c5811cea4495d69761537ab4c1c119 and https://github.com/clehene/merkle-patricia-tree/commit/25541d479c57fe57ec9664a5bebc6b1ee93033fb potential solutions to the problem @clehene?
@holgerd77 please see @clehene's last comment. Are you the right person to help us navigate getting this change accepted? If you're not familiar w/ this issue, I'd be happy to set up a call with you to discuss. Would love it if @clehene joined as well.
Either way, we haven't heard from @clehene in a while and the price of ETH is drastically different from when we first opened this bounty. As a result we're going to adjust the funding and clear out @clenehe's currently active claim. That said, if @clehene is still interested I'd be thrilled if he picked it back up - he's definitely carried it most of the way.
Hey @seesemichaelj didn't intend to go AWOL.
@benjamincburns I was under the impression that this is no longer under bounty / my claim - i.e. that I'm not locking anyone from fixing it.
See https://github.com/trufflesuite/ganache-cli/issues/453#issuecomment-372532842
hi all. i just killed the bounty for this issue
As much as I'd love to finish this, I'm not sure it'd be realistic just based on how much time I estimate it would take. I was not able to reproduce all faulty conditions last time I rebased things (that was after June 13th). Before that I remember going rather deep in the VM last time I had a clue of what could be the root cause and expose the effect we were hunting for. For that I was debugging and stepping through VM and instrumented the VM along with libraries to get detailed tracing information. That makes rebasing costly as it's essentially carrying tracing parameters across the entire call stack. So long story short it's a full time job for 1-2 weeks to get up to speed :).
I would gladly get on a call. If anyone's fixing this in SF I could spare a few hours in an office to help out too.
Anyways, if there's indeed an active claim, I'm sorry for that and hope it didn't deter anyone from taking a stab at it :).
A final thought. I think it may be worth considering rewriting the merkle tree implementation from scratch by someone that has a clear idea of how it needs to behave, etc. (I remember going through the yellow paper that there were some peculiarities there too). I honestly think that would be the best and cheapest long term maintenance solution.
Issue Status: 1. Open 2. Started 3. Submitted 4. Done
__This issue now has a funding of 400.0 DAI (400.0 USD @ $1.0/DAI) attached to it.__
@benjamincburns Sorry, I have to prioritize other time-critical tasks for now. Can help to get this settled mid-term hopefully (8-12 weeks first estimation).
Update:
I've investigated more and here is my hypothesis on the root cause:
The way we use blockchain.vm.stateManager.[checkpoint|commit|revert]() in some of the blockchain functions, such as processCall() assume there is no other function performing transaction on the trie.
For example, as demonstrated in the test case, the following events can be problematic:
.checkpoint() is called by processCall().checkpoint() is called by processNextBlock().revert() is called by processCall() <- this reverts the checkpoint created by processNextBlock()..commit() is called by processNextBlock() <- this commits the checkpoint created by processCall()Proposal:
We should introduce semaphore lock that makes sure multiple functions try to perform transactions on the trie to avoid these race condition.
Issue Status: 1. Open 2. Started 3. Submitted 4. Done
__Work for 400.0 DAI (400.0 USD @ $1.0/DAI) has been submitted by__:
@ceresstation please take a look at the submitted work:
I commented over on the PR, but I figure I should comment here. @kn I really think @clehene identified the root cause already, and the changes he listed on 1 March likely addressed the true underlying issues. The most important problem he identified was that the MPT's checkpoint was written to assume it was operating synchronously, when this wasn't generally the case. This requires a breaking change to allow for a continuation callback to be passed into the MPT's checkpoint function.
Hey @PaulRBerg, @raboid are either of you interested in giving this a shot? Please also see the reference issue at the bottom. @kn really appreciate your work, going to pay out a tip separate from the bounty for your effort!
Thanks! Appreciate it.
Here is what I left off with and changes I made are:
https://github.com/ethereumjs/merkle-patricia-tree/compare/master...kn:i417
https://github.com/ethereumjs/ethereumjs-vm/compare/master...kn:i417
https://github.com/trufflesuite/ganache-core/compare/develop...kn:i417_2
The next steps were
hey @ceresstation how is this bounty proceeding? Looks like @benjamincburns closed a PR that kn started due to some extra complexity, but is it safe to say the bounty is still open?
@frankchen07 Yes, this bounty is still open
Closing this one, as it was fixed by trufflesuite/ganache-core#209
We didn't fund this bounty ourselves, so I'm not sure how to go about compensating those who helped out. I think @kn and @clehene should split the reward, as their solutions were both viable and they ultimately lead me to the solution in the above PR.
I'll check in w/ the GitCoin folks next week and see what we can do.
@benjamincburns - gitcoin.co/tip is a way to compensate those who helped close out an issue (but not necessarily the ones who submitted the work).
@benjamincburns nice! was it fixed entirely through trufflesuite/ganache-core#209 or did it require changes in other parts?
Fully agree @benjamincburns, I'll split the reward between @kn and @clehene :)
⚡️ A tip worth 200.00000 DAI (200.0 USD @ $1.0/DAI) has been granted to @kn for this issue from @ceresstation. ⚡️
Nice work @kn! Your tip has automatically been deposited in the ETH address we have on file.
Great work - glad to see it fixed! Thanks for the generous tip.
This bounty should no longer be open.
On Wed, Feb 6, 2019 at 1:24 AM Lazaridis notifications@github.com wrote:
@ceresstation https://github.com/ceresstation , the bounty is still
listed as open on gitcoin.—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/trufflesuite/ganache-cli/issues/453#issuecomment-460937919,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAxAyMO4Q6XOmK1PpFBiZzTiUSOv4Z46ks5vKpEkgaJpZM4RMaid
.
@ceresstation mind closing the bounty? I just removed the 'open' override status, and that made it appear as 'submitted'. If you cancel the bounty that should close it out.
Issue Status: 1. Open 2. Started 3. Submitted 4. Done
__This Bounty has been completed.__
Additional Tips for this Bounty:
⚡️ A tip worth 200.00000 DAI (200.0 USD @ $1.0/DAI) has been granted to @clehene for this issue from @ceresstation. ⚡️
Nice work @clehene! Your tip has automatically been deposited in the ETH address we have on file.
Most helpful comment
Closing this one, as it was fixed by trufflesuite/ganache-core#209
We didn't fund this bounty ourselves, so I'm not sure how to go about compensating those who helped out. I think @kn and @clehene should split the reward, as their solutions were both viable and they ultimately lead me to the solution in the above PR.
I'll check in w/ the GitCoin folks next week and see what we can do.