There is a patch fix here:
https://github.com/cosmos/cosmos-sdk/pull/2825/files#r235796732
There is an edge case which doesn't really make sense to me yet, it likely has to do with precision rounding... the problem is effectively goes something like this:
(1000.00000 * 100.00000) / * 100.00000 == 1001.00000
CC: @ValarDragon
Wowza...I wish we had more of an inclination to leverage other OSS projects (e.g. this for decimals). Is there a reason we tend to implement most things from scratch?
I do agree with in general leveraging more open source software. However just because something exists, doesn't mean its good. In this case, I actually don't find this library great for our use case, especially relative to what we have. We'd have to fork it anyway to add MarshalAmino functions :/ I do think it would have been a good idea to use if we didn't already build the more tailored decimal impl.
It has one global variable for precision: https://github.com/shopspring/decimal/blob/master/decimal.go#L45. Theres no benchmarks on any of the meaningful functions. (w.r.t. to what we'd use) It is fundamentally designed for decimals with variable precision, our use case is many decimals with the same precision. I am fairly confident ours will be more efficient for that reason. In the future, I think we should strive for less re-allocation (e.g. big int style) as we re-allocate all over the place, a decision this library chose to not do for ease of usage. (I'd ideally like to aim for a big-int pool / decimal pool) I also think utilizing SIMD would be cool for the operations. (Our 256 bit decimals fit perfectly into AVX2 SIMD registers, the underlying golang big int takes a perf hit by not getting to know max word size before the computation.)
Back to this library, I fail to see what it does that our implementation doesn't already do. We already have something thats been tailored to our use case, I don't see a reason to switch to the more general case. IMO, that libraries core benefit would be implementations of functions we don't use atm. (Different rounds, math funcs, etc.) but that seems like bloat for the sdk type impl.
Thats an interesting edge case rigel! I'll take a look tomorrow.
I was just listing an example @ValarDragon...I can't recall us actually looking at other libraries.
I can't reproduce this. This should block launch for sure if it is a problem, but that would be surprising.
package main
import "fmt"
import "github.com/cosmos/cosmos-sdk/types"
import dtypes "github.com/cosmos/cosmos-sdk/x/distribution/types"
func main() {
fmt.Println("vim-go")
a := types.NewDec(1000)
b := types.NewDec(100)
c := types.NewDec(100)
d := a.Mul(b).Quo(c)
fmt.Println(a, b, c, d)
w := dtypes.DecCoins([]dtypes.DecCoin{dtypes.NewDecCoin("foo", 1000)})
x := w.MulDec(types.NewDec(100)).QuoDec(types.NewDec(100))
fmt.Println(w, x)
}
This is how to reproduce this bug...
What's happening is after this type of an operation, there is slightly more accum tokens produced than is allowed (IN = 0.0000007470, OUT = 0.0000007471)... definitely has to do with precision rounding in this operation... maybe this isn't really even a bug and that code and the patch I added in is appropriate... not totally sure. Anyways this is how to reproduce.
checkout this branch rigel/reference2888-dec-bug
OR undo this change:
https://github.com/cosmos/cosmos-sdk/pull/2825/files#r235796732
and run this command:
go test ./cmd/gaia/app -run TestGaiaImportExport -SimulationEnabled=true -SimulationNumBlocks=50 -SimulationBlockSize=200 -SimulationCommit=true -SimulationSeed=22 -v -timeout 24h
You'll see this panic (close to the beginning of the simulation output)
Simulating... block 6/50, operation 200/507. Panic with err
negative remDelPool: [{STAKE 0.00000000-1}]
vi.DelPool [{STAKE 0.0000007470}]
accum 0.4222968745
vi.DelAccum.Accum 0.4222968745
withdrawalTokens [{STAKE 0.0000007471}]
goroutine 37 [running]:
runtime/debug.Stack(0xc001748998, 0x2, 0x2)
/usr/local/Cellar/go/1.11/libexec/src/runtime/debug/stack.go:24 +0xa7
github.com/cosmos/cosmos-sdk/x/mock/simulation.SimulateFromSeed.func2(0xc0008ec020, 0xc0000b1680, 0xc0001e6e90)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/mock/simulation/simulate.go:130 +0xb6
panic(0x1982200, 0xc0001e6390)
/usr/local/Cellar/go/1.11/libexec/src/runtime/panic.go:513 +0x1b9
github.com/cosmos/cosmos-sdk/x/distribution/types.DelegationDistInfo.WithdrawRewards(0xc001280880, 0x14, 0x14, 0xc0012808a0, 0x14, 0x14, 0x6, 0x6, 0xc00199b980, 0x0, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/types/delegator_info.go:89 +0x882
github.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.withdrawDelegationReward(0x1d3c7a0, 0xc0001e6a00, 0xc000120000, 0xc000120000, 0x1d3c7a0, 0xc0001e6a50, 0x1d3c7e0, 0xc0001e6a60, 0xc0001b5880, 0x5, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/keeper/delegation.go:83 +0x58b
github.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.withdrawDelegationRewardsAll.func1(0x6, 0x1d43b20, 0xc001aab040, 0xc001aab040)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/keeper/delegation.go:175 +0x19b
github.com/cosmos/cosmos-sdk/x/stake/keeper.Keeper.IterateDelegations(0x1d3c7a0, 0xc0001e69c0, 0x1d3c7e0, 0xc0001e69d0, 0xc000120000, 0x1d499e0, 0xc0000c9200, 0x1d4af40, 0xc00018c6c0, 0xc000120000, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/stake/keeper/sdk_types.go:140 +0x330
github.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.withdrawDelegationRewardsAll(0x1d3c7a0, 0xc0001e6a00, 0xc000120000, 0xc000120000, 0x1d3c7a0, 0xc0001e6a50, 0x1d3c7e0, 0xc0001e6a60, 0xc0001b5880, 0x5, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/keeper/delegation.go:182 +0x192
github.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.WithdrawDelegationRewardsAll(0x1d3c7a0, 0xc0001e6a00, 0xc000120000, 0xc000120000, 0x1d3c7a0, 0xc0001e6a50, 0x1d3c7e0, 0xc0001e6a60, 0xc0001b5880, 0x5, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/keeper/delegation.go:161 +0xc5
github.com/cosmos/cosmos-sdk/x/distribution.handleMsgWithdrawDelegatorRewardsAll(0x1d44260, 0xc001629b60, 0xc000a44ec0, 0xc, 0xc0001ac900, 0x14, 0x20, 0x1d3c7a0, 0xc0001e6a00, 0xc000120000, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/handler.go:48 +0xe3
github.com/cosmos/cosmos-sdk/x/distribution.NewHandler.func1(0x1d44260, 0xc001629b60, 0xc000a44ec0, 0xc, 0x1d46a20, 0xc0019702c0, 0x0, 0x0, 0x0, 0x0, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/handler.go:17 +0x211
github.com/cosmos/cosmos-sdk/x/distribution/simulation.SimulateMsgWithdrawDelegatorRewardsAll.func1(0xc000afd6e0, 0xc00010a380, 0x1d44260, 0xc000a1aae0, 0xc000a44ec0, 0xb, 0xc0001ce400, 0x9, 0x9, 0xc000938000, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/distribution/simulation/msgs.go:57 +0x23c
github.com/cosmos/cosmos-sdk/x/mock/simulation.createBlockSimulator.func1(0xc000afd6e0, 0xc00010a380, 0x1d44260, 0xc000a1aae0, 0xc000a44ec0, 0xb, 0xc0001ce400, 0x9, 0x9, 0x0, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/mock/simulation/simulate.go:255 +0x37a
github.com/cosmos/cosmos-sdk/x/mock/simulation.SimulateFromSeed(0x1d4ec60, 0xc000199300, 0xc00010a380, 0x1c5e330, 0x16, 0xc00010a460, 0xe, 0xe, 0xc0004798d0, 0x0, ...)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/x/mock/simulation/simulate.go:177 +0x12d6
github.com/cosmos/cosmos-sdk/cmd/gaia/app.TestGaiaImportExport(0xc000199300)
/Users/rigelrozanski/go/src/github.com/cosmos/cosmos-sdk/cmd/gaia/app/sim_test.go:298 +0x33e
testing.tRunner(0xc000199300, 0x1c5e310)
/usr/local/Cellar/go/1.11/libexec/src/testing/testing.go:827 +0xbf
created by testing.(*T).Run
/usr/local/Cellar/go/1.11/libexec/src/testing/testing.go:878 +0x353
Oh, yes, we're losing precision because accum is less than 1. The cited example in the original post is misleading... the error is in the least significant decimal digit, not the integer digit.
```package main
import "fmt"
import "github.com/cosmos/cosmos-sdk/types"
func main() {
a := types.NewDecWithPrec(7470, 10)
b := types.NewDecWithPrec(4222968745, 10)
c := types.NewDecWithPrec(4222968745, 10)
d1 := a.Mul(b)
d := a.Mul(b).Quo(c)
fmt.Println(a, b, c, d1, d)
}
0.0000007470 0.4222968745 0.4222968745 0.0000003155 0.0000007471
```
In retrospect this isn't surprising... the error happens upon multiplication since we're clipping decimal digits, and then when we divide by a number ~1 or < 1, the error doesn't get "buried", and if << 1 it'll get amplified.
@alexanderbez @ValarDragon using another library would not have helped!
Conclusion of convo with Jae - this is expected behaviour and we should be using a function at the higher level to correct this behaviour.
aka, update from:
var withdrawalTokens DecCoins
if accum.Equal(vi.DelAccum.Accum) {
// required due to rounding faults
withdrawalTokens = vi.DelPool
} else {
withdrawalTokens = vi.DelPool.MulDec(accum).QuoDec(vi.DelAccum.Accum)
}
To
withdrawalTokens := vi.DelPool.MulDec(accum).QuoDec(vi.DelAccum.Accum)
withdrawalTokens = min(vi.DelPool, withdrawalTokens)
I imagine we should probably build this into a function within the decimal library and this may come up again.
Makes sense but scary!
I imagine we should probably build this into a function within the decimal library and this may come up again.
Seconded, this is not obvious at all just reviewing code, we should encapsulate this kind of protection directly in the library rather than case-by-case in callers.
Awesome! Looks like we have some actionable items for this issue.
The only actionable item would be to add this type of a min function into the decimal library - this has been already fixed 2825.
Should we close since #2825 is merged?
I think we should move the relevant code to the library instead of requiring that callers handle it.
Agree, let's leave this issue open for that ^
Will this still be an issue in F1? Also is this issue still relevant?
this is a minor few line fix can be added to a "final minor things to double check before launching tracking issue" if that exists yet - (@jackzampolin want to make that?)
@rigelrozanski That list is the #prelaunch tag, so its in the right place.
Just went through and reviewed the code, this situation does not exist anymore due to F1 (no accum mechanism) - additionally it can be noted that staking is not subject to the same issue.
Most helpful comment
@alexanderbez @ValarDragon using another library would not have helped!