Nim: Add bigints to standard library

Created on 17 Jun 2020 · 23 comments · Source: nim-lang/Nim

Summary

Add arbitrary-precision integers to the standard library

Solution

Add the bigints library (available via Nimble) to the standard library:

https://github.com/def-/nim-bigints

Stdlib

All 23 comments

Yeah, maybe we should really do that. The compiler itself would benefit.

Before sanctioning a particular implementation of bigints for the stdlib, some performance improvements are in order. There is also https://github.com/FedeOmoto/bignum, which is a wrapper around GMP, and if I recall correctly it is significantly faster (maybe things have improved lately). See also the discussion in https://forum.nim-lang.org/t/3677. The ideal would be to have a fast, pure Nim implementation of bigints.

  • https://github.com/nim-lang/RFCs/issues/228 would allow the nicer syntax 123'bigint instead of initBigInt"123", but that can be added later as a syntax upgrade, so it is not a blocker (a small sketch of both syntaxes follows this list).
  • https://github.com/FedeOmoto/bignum hasn't been updated in 5 years and basic usage seems to fail (https://github.com/FedeOmoto/bignum/issues/2) /cc @FedeOmoto, but it could be revived; that said, since it's a wrapper, it may not work at compile time (CT) unless CT FFI is used
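
For illustration, a minimal sketch of the two syntaxes, assuming pkg/bigints and a Nim version that implements RFC 228 style custom numeric literals; the suffix name 'bigint and its definition are purely illustrative (the feature passes the literal's digits to the suffix routine as a string):

import pkg/bigints

# hypothetical suffix routine enabled by the RFC; 123'bigint rewrites to `'bigint`("123")
proc `'bigint`(digits: string): BigInt = initBigInt(digits)

let a = initBigInt("123")   # today's syntax
let b = 123'bigint          # literal syntax the RFC would allow
assert a == b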

The ideal would be to have a fast, pure Nim implementation of bigints

For something like bigints, I'd say performance matters more than a pure Nim implementation if I had to choose between the two, unless the performance difference is small enough. But it also has to work at CT, which requires one of: CT FFI, a pure Nim implementation as a fallback for the VM, or integration via vmops (which is not ideal, as it ties the compiler to a particular bignum library).
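
To make the "pure Nim fallback for the VM" option concrete, here is a minimal sketch using Nim's when nimvm construct; addPure, addFast and bigAdd are hypothetical stand-ins, not a real bigint API:

proc addPure(a, b: int): int = a + b   # stand-in for a pure Nim bigint operation
proc addFast(a, b: int): int = a + b   # stand-in for an FFI- or asm-backed operation

proc bigAdd(a, b: int): int =
  # compile-time evaluation takes the pure Nim branch inside the compiler VM,
  # while compiled code dispatches to the optimized implementation
  when nimvm:
    result = addPure(a, b)
  else:
    result = addFast(a, b)

const atCompileTime = bigAdd(2, 3)      # evaluated in the VM
echo atCompileTime, " ", bigAdd(2, 3)   # 5 5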

For the Nim compiler I don't care much about the speed, it's much more important that it has no dependencies.

def-/nim-bigints is maintained by contributors. It is not a good candidate.

Why not? Sometimes software is finished.

Why not? Sometimes software is finished.

If you look at the TODO, it's clear that this isn't finished yet. It's a good start though.

I think it's important to have a pure Nim BigInt library, rather than calling out to some potentially faster C-based library. IMO, Nim is a systems programming language, and if we can't have a Nim (maybe Nim + asm) library that's roughly as performant as a C + asm library, then Nim needs to be enhanced.

Personally I'd prefer stint due to its maturity, but it appears that the library depends on parts of nim-stew...

Also, shouldn't something like this be added to fusion instead of the stdlib?

I think varints could be replaced by a new bigints module in the stdlib; then the stdlib stays roughly constant in size.

I think varints could be replaced by a new bigints module in the stdlib; then the stdlib stays roughly constant in size.

Er uh, I think that's not how it works. ;-)

I was thinking varints was supposed to be a bigint kind of thing that was never completely implemented, sorry.

GMP is a big no-no due to the license and not working at compile-time

I don't mind contributing a bigint from scratch, it's not like I wrote those 3 times already:

Note that the Constantine backend and the current Stint refactor range from being 2x faster than GMP on modular arithmetic to sometimes slower, for bigints below 1024 bits (I didn't test much beyond that, and didn't test non-modular arithmetic).

The main reasons behind stew dependencies are threefold:

In terms of BigInt usage, I would need to know more about the use cases (a rough sketch of the two API styles follows this list):

  • What kind of API: out-of-place and integer-like, i.e. proc +(a, b: BigInt): BigInt, or in-place like GMP?
  • Do we target embedded? BigInts require dynamic memory management, so the API would have to be GMP-like instead of "integer-like", i.e. in-place operations only
  • Do we want it thread-safe?
  • Do we assume that people might do BigInt operations in a loop? In that case, with an out-of-place API we might want a caching allocator, as memory allocation will be a bottleneck
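
A rough sketch of the two API styles, with purely illustrative types and names (the seq[uint32] limb representation and addInto are not a proposed design):

type BigInt = object
  limbs: seq[uint32]   # illustrative little-endian limb representation

proc addInto(dst: var BigInt; a, b: BigInt) =
  ## in-place, "GMP-like": the caller owns dst and can reuse its storage
  let n = max(a.limbs.len, b.limbs.len)
  dst.limbs.setLen(n + 1)              # reuses capacity when dst is recycled
  var carry = 0'u64
  for i in 0 ..< n:
    var s = carry
    if i < a.limbs.len: s += uint64(a.limbs[i])
    if i < b.limbs.len: s += uint64(b.limbs[i])
    dst.limbs[i] = uint32(s and 0xFFFF_FFFF'u64)
    carry = s shr 32
  dst.limbs[n] = uint32(carry)

proc `+`(a, b: BigInt): BigInt =
  ## out-of-place, "integer-like": allocates a fresh result on every call
  addInto(result, a, b)

var acc: BigInt                          # reused destination buffer
addInto(acc, BigInt(limbs: @[high(uint32)]), BigInt(limbs: @[1'u32]))
echo acc.limbs                           # @[0, 1], i.e. 2^32
let c = BigInt(limbs: @[2'u32]) + BigInt(limbs: @[3'u32])
echo c.limbs                             # @[5, 0]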

If the compiler needs big integers, shouldn't it finally start to use Nimble packages? Stint, as it exists right now, is a mature implementation of fixed-size integers. Duplicating it wouldn't be beneficial for anyone. The notion that the standard library is what the compiler needs to use is a damaging policy that prevents the Nim team from benefiting from collaboration with any other team. It wouldn't be that hard to add a bunch of git submodules and -p:<path> directives in the compiler nim.cfg file and solve the problem in an afternoon.
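
For illustration only, a hypothetical sketch of that setup, assuming the packages are vendored as git submodules under dist/; the package choices, paths, and the $nim config variable are illustrative, not the actual compiler configuration:

# after e.g.:  git submodule add https://github.com/status-im/nim-stint dist/stint
#              git submodule add https://github.com/status-im/nim-stew  dist/stew
# the compiler's nim.cfg could add the search paths:
--path:"$nim/dist/stint"
--path:"$nim/dist/stew"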

@zah Agreed. However, I'm in no rush in adding big integer support for Nim and busy with other work considered more important.

If the compiler needs big integers, shouldn't it finally start to use Nimble packages? Stint, as it exists right now, is a mature implementation of fixed-size integers. Duplicating it wouldn't be beneficial for anyone. The notion that the standard library is what the compiler needs to use is a damaging policy that prevents the Nim team from benefiting from collaboration with any other team. It wouldn't be that hard to add a bunch of git submodules and -p:<path> directives in the compiler nim.cfg file and solve the problem in an afternoon.

But in competitive programming, bigints are important, and you can't use external libraries

What is competitive programming? Sites like HackerRank? IOI competitions? No libraries allowed there? Can't bring your own hard drive?

If it helps competitive programming, that's a side effect, but I don't think Nim should fall into competitive-programming-driven development, unless the people doing competitive programming have demonstrated that they stick around in the Nim community so that we can rely on them to maintain the code in the standard library (which is the main blocker for any module there).

Maybe D's stdlib BigInt can provide some inspiration: https://dlang.org/phobos/std_bigint.html

I think competitive programming is an example use case, but maybe others benefit too, like Scientific Nim projects.

If the compiler needs big integers, shouldn't it finally start to use Nimble packages?

I agree; what this would require is:

  • locking down all recursive dependencies
    to avoid a situation in which some unrelated change breaks bootstrap
  • introducing fork mirrors regularly updated for each recursive dependency
    to avoid a situation in which a bad force push breaks bootstrap

@timotheecour, tracking the dependencies as git submodules solves both problems. You only have to do a little bit of manual work in adding the transitive dependencies by yourself.

@zah since you brought this up, can you please create either an issue or an RFC so it can be discussed exclusively and separately in a dedicated location instead of here (it's more general than bigints)? I'll follow up there.

proposal

mini-gmp (subset of https://gmplib.org/) could be a good candidate to port/wrap:

  • single header, single source, no dependency
  • API-compatible with GMP, so users could link against -lgmp for GMP performance
  • could be used at CT using vmops
  • most importantly, performance is better than other alternatives apart from gmp itself:

The performance target for mini-gmp is to be at most 10 times slower than the real GMP library, for numbers of size up to a few hundred bits.

The only potential concern is licensing; as I understand it, dynamic linking to GMP shouldn't cause issues under the LGPL. If that's a problem (e.g. if static linking is needed), an alternative (slower but binary-compatible) implementation could be bundled with Nim, and users could opt into mini-gmp or GMP for performance as needed.
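
To gauge the wrapping effort, here is a minimal FFI sketch (not a full wrapper, just one possible approach); it assumes mini-gmp.c and mini-gmp.h are copied next to the Nim file, and binds __mpz_struct directly to sidestep the mpz_t array typedef:

{.compile: "mini-gmp.c".}

type
  MpInt {.importc: "__mpz_struct", header: "mini-gmp.h".} = object

proc mpz_init(x: var MpInt) {.importc, header: "mini-gmp.h".}
proc mpz_clear(x: var MpInt) {.importc, header: "mini-gmp.h".}
proc mpz_set_ui(x: var MpInt, v: culong) {.importc, header: "mini-gmp.h".}
proc mpz_mul(r: var MpInt; a, b: var MpInt) {.importc, header: "mini-gmp.h".}
proc mpz_get_str(buf: cstring, base: cint, x: var MpInt): cstring {.importc, header: "mini-gmp.h".}

var a, b, r: MpInt
mpz_init(a); mpz_init(b); mpz_init(r)
mpz_set_ui(a, 12345)
mpz_set_ui(b, 67890)
mpz_mul(r, a, b)
echo mpz_get_str(nil, 10, r)   # 838102050
mpz_clear(a); mpz_clear(b); mpz_clear(r)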

GMP is a big no-no due to the license and not working at compile-time

vmops should solve the problem of working at CT (that's how the new hashes work in the VM, for example); and there's also --experimental:compiletimeFFI

benchmark: computing factorial(100) 1 million times

obviously a very partial benchmark but it's a start, feel free to suggest improvements/corrections

(all tests with clang -O2 or similar; on a recent mac)

# times in seconds
gmp: 1.48
EDIT: kaushalmodi/bignum (nimble install bignum): 1.50
mini-gmp: 2.24
python: 11
std.bigint (D): 12
nim using pkg/bigints: 14
nim using pkg/stint: 41
Big-Integer-C: 45
tiny-bignum-c: 2260

D:

For D's std.bigint (https://dlang.org/phobos/std_bigint.html), compiled with:

ldmd2 -of/tmp/z05 -O -inline -release $timn_D/tests/bigints/z04.d
LDC version 1.23.0 (DMD v2.093.1, LLVM 9.0.1)
import std.bigint;
import std.stdio;
void test1(bool isEcho){
  BigInt n = "100";
  BigInt ret = BigInt("1");
  while(n!=0){
    ret = ret * n;
    n = n-1;
  }
  if(isEcho) writeln(ret);
}

void main(){
  for(int i=0;i<1000000;i++){
    test1(false);
  }
  test1(true);
}

Big-Integer-C:

see test here: https://github.com/ilia3101/Big-Integer-C/pull/3

python

def test1():
  n = 100
  ret = 1
  while(n!=0):
    ret = ret * n
    n = n-1
  # print(ret)

def main():
  for i in range(0,1000000): test1()
main()

nim + pkg/bigints

import pkg/bigints

proc test1(isEcho: bool) =
  let n = 100
  var n2 = n.initBigInt
  var ret = 1.initBigInt
  while n2 != 0:
    # ret = ret * n2
    ret *= n2
    n2.dec
  if isEcho:
    echo ret

proc main=
  for i in 0..<1000000:
    test1(isEcho=false)
  test1(isEcho=true)

main()

nim + pkg/stint

https://github.com/status-im/nim-stint

import pkg/stint

proc test1(isEcho: bool)=
  const d = 1024
  var n = 100.stint(d)
  var ret = 1.stint(d)
  while n != 0.stint(d):
    ret = ret * n
    n = n-1.stint(d)
  if isEcho:
    echo ret

proc main=
  for i in 0..<1_000_000:
    test1(isEcho=false)
  test1(isEcho=true)
main()

EDIT: nim + pkg/bignum

this option (https://github.com/kaushalmodi/bignum) hasn't been considered in the thread above; it's a fork of https://github.com/FedeOmoto/bignum that got a few updates to make it work (unfortunately under the same Nimble name, see https://github.com/FedeOmoto/bignum/issues/3).
It (unsurprisingly) gives the same performance as GMP, since it's a high-level wrapper around Nim's gmp wrapper; the test file used in the above benchmark:

when true:
  import pkg/bignum
  proc test1(n,ret: var Int, isEcho: bool)=
    discard n.set 100
    discard ret.set 1
    while n != 0:
      ret *= n
      n.dec(1)
    if isEcho:
      echo ret

  proc main=
    var n = 100.newInt
    var ret = 1.newInt
    for i in 0..<1_000_000:
      test1(n, ret, isEcho=false)
    test1(n, ret, isEcho=true)
  main()

links

  • https://github.com/kokke/tiny-bignum-c
  • GMP bignums vs. Python bignums: performance and code examples
  • The Fastest BigInt In The West – Wilfred Hughes::Blog
  • https://www.xspdf.com/resolution/53301490.html

alternative not evaluated: openssl bn

see https://www.openssl.org/docs/man1.0.2/man3/bn.html; its license should not cause concern

Hard no for GMP in the compiler. The license would be a nightmare.

Coincidentally, I've started to extract Constantine's bigint internals into a self-contained variable-size bigint library: https://github.com/SciNim/megalo

Regarding Stint performance, the blocker is https://github.com/status-im/nim-stint/issues/87, which leads to quadratic behavior. A complete rewrite of the backend is in progress: https://github.com/status-im/nim-stint/pull/104
