Nano-node: Use a unique PoW algorithm

Created on 18 Jan 2018  Â·  34Comments  Â·  Source: nanocurrency/nano-node

There are blake2b ASICs coming out with very high hash rates. These are being developed to mine coins such as siacoin. In order to avoid these being used to spam the RaiBlocks network, we should use a unique PoW algorithm. With a unique PoW algorithm, there would be much less of an incentive to develop ASICs for this unique algorithm, as you cannot make money by consistently spamming RaiBlocks.

I propose (pseudocode): blake2b(blake2b(input) ^ b"RaiBlocks"). As this is approximately twice as difficult to compute (compared to just blake2b, the current algorithm), I'd recommend also halving the difficulty.

enhancement major

Most helpful comment

I think Argon2d might be a good choice as a PoW algorithm. @FSMaxB Argon2i is optimized to be resistant to side-channel attacks, while Argon2d is resistant to cracking with specialized hardware. Proof of Work needs the later, while the former is quite irrelevant. Argon2 would also give us the opportunity to adjust the memory requirements for computing the PoW in addition to the time requirement (difficulty) to keep up with hardware improvements.

Simply changing Blake2b for Argon2d should make the network much more spam resistant. Who decides whether this gets implemented?

The change to a "unique" combination of algorithms like @PlasmaPower proposed could still be implemented if it is found worthwhile.

All 34 comments

How do you check PoW for this (What is the inverse operation or check operation?) Likely a very obvious answer - I'm just curious

I was lucky enough to score a A3 when it first became available (sold out in 15 minutes). I should have mine within about 9 days in California - but those in China likely already have them. To provide a link to the product page:

https://shop.bitmain.com/productDetail.htm?pid=00020180116164357365a2ljX8gx06D3

Sia is rumored to be releasing one that will be far more powerful but very specific to SiaCoin (16nm) with 7nm ASICs on the way this year, we definitely want to make sure we are covered. Even if the A3 couldn't be purposed for an attack, it will still be a concern that is raised often when people see Blake2b.

Would be nice to change the algo regardless so it isn't an issue.

@PlasmaPower posted a calculation of the A3 as:

my calculations: (2^64 - 0xffffffc000000000)/(2^64) * 815e9
which turns out to be 12k tx/s

for a single A3 unit.

@george-viaud What I posted there is the algorithm for checking PoW. Calculating PoW is just trying a bunch of different possible work values, until you find one that works.

Ahh that's how it works. Interesting. Thanks!

I do not know much about ASIC development, but how expensive would it be to tweak the design for a regular blake2b ASIC to compute blake2b(blake2b(input) ^ b"RaiBlocks")? Shouldn't the changes needed be relatively small, because most of the circuitry remains the same (the blake2b function) and the changes are rather trivial (xor with a constant and two evaluations of the blake2b instead of one)? Why not use a memory-hard hash function like scrypt (scrypt(scrypt(input) ^ b"RaiBlocks"))? This should make the use of specialised hardware to spam the network even harder.

You make a good point about modifying blake2b ASICs. There are scrypt ASICs though. Perhaps we should use a multi-algo function, not because it's inherently ASIC resistant, but because it could be very unique.

Looking at the pseudocode for blake2b on Wikipedia, I believe we could modify the Mix function by swapping the arguments x and y (as named in Wikipedia's pseudocode). Alternatively, we could modify one of the rotate amounts in that function.

I'm no expert here, but I believe these changes would significantly alter the function, while still preserving its cryptographic properties.

I do not like the idea of reimplementing existing and tested cryptographic functions for the sake of being unique. Regarding your specific change suggestion: I do not think that changing x and y will have an significant effect on the cost to spam the network. It will only result in swapping every other byte of the plaintext message with it successor. Therefore an attacker could just do this swap in the block bytes he wants to send, give the resulting message to an usual blake2b ASIC and reverse the swap (including the nonce found by the ASIC) and he will get a valid proof of work for the altered blake2b function. This only requires very small changes to the spamming software and none to the ASIC!

I'm not an expert on cryptography but it seems to me the logical long term solution would be to use a combination of two very different algorithms for PoW to mitigate the use of a single machine designed for one algorithm

If this is even possible...

I agree with @Nevsor, re-implementing an existing crypto algorithm seems risky.

If you used a multi algo, would it be performed in parallel (multiple results) or serial? And in either case why couldn't an attacker use multiple ASICs?

If anything, a memory hard algorithm like Argon2i should be used. This would eliminate most of the concerns that ASICS could be used to spam the network.

Memory hard algorithms are a step in the right direction, but are not a perfect solution.

I just found out that @Nevsor has written exactly the same.

I think Argon2d might be a good choice as a PoW algorithm. @FSMaxB Argon2i is optimized to be resistant to side-channel attacks, while Argon2d is resistant to cracking with specialized hardware. Proof of Work needs the later, while the former is quite irrelevant. Argon2 would also give us the opportunity to adjust the memory requirements for computing the PoW in addition to the time requirement (difficulty) to keep up with hardware improvements.

Simply changing Blake2b for Argon2d should make the network much more spam resistant. Who decides whether this gets implemented?

The change to a "unique" combination of algorithms like @PlasmaPower proposed could still be implemented if it is found worthwhile.

@Nevsor I modified blake2b with my suggestion of changing the order of x and y in the mix function, and as I thought, it significantly changed the result. Changing the byte order did not affect it. However, I do think that changing the rotation amounts would be a better change.

Here are the results of swapping x and y in the mix function, using an 8 byte output as the PoW calculations do.

Blake2b hash of an empty string: E4A6A0577479B2B4
Blake2b hash of ab in ASCII: 0E52B5F187DE1088
Blake2b hash of ba in ASCII: EE3F6B40991A6C7A

Next I tested it with the modified mix function.

Mix modified blake2b hash of an empty string: E17D1F1ECEDE1923
Mix modified blake2b hash of ab in ASCII: FAFD51206B6688D0
Mix modified blake2b hash of ba in ASCII: 5ADE3D6D0ED11A47

After that I reverted the mix changes and changed the rotation amounts from 32, 24, 16, 63 to the rotation amounts used in the original Blake: 32, 25, 16, 11. As specified in the Blake2 paper section 2.2 "Rotations optimized for speed", the rotation amounts were only changed in Blake2 for speed.

Rotation modified blake2b hash of an empty string: 291CD5FF2FADEB27
Rotation modified blake2b hash of ab in ASCII: 18E7542989D2E725
Rotation modified blake2b hash of ba in ASCII: 86E2200F4CAE66FE

Note: this section has been edited. Previously I just changed one rotation.

I prefer the change of the rotation amounts, since the Blake2 paper indicates that it preserves security.

Personally I'd choose either something like what monero uses, or the algo that vertcoin uses. Both have been proven cryptographically secure. The later lyra2 contains a lot of parameters that can be adjusted.

@develerltd Argon2 was the winner of the password hashing contest, so it has undergone quite some scrutiny by cryptographers.

Note though that no practical cryptographical algorithm can be proven secure. (one time pads are proven secure for example, but they are not practical).

The issue here is that lyra2 has many parameters that can easily be
adjusted which would make ASICs uneconomical to develop as its trivial to
change the algorithm

On Sat, Jan 20, 2018 at 1:38 PM, Max Bruckner notifications@github.com
wrote:

@develerltd https://github.com/develerltd Argon2 was the winner of the
password hashing contest, so it has undergone quite some scrutiny by
cryptographers.

Note though that no practical cryptographical algorithm can be proven
secure. (one time pads are proven secure for example, but they are not
practical).

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/clemahieu/raiblocks/issues/506#issuecomment-359172144,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AQRY9WF11omhEWRfsLDFgDzUOoglWj2Iks5tMexGgaJpZM4RiPm2
.

--

Darren Cresswell
Contract Developer | Develer Limited
E-mail: [email protected]
Phone:
Website: http://www.develer.co.uk

Please consider the environment before printing this email
WARNING: Computer viruses can be transmitted via email. The recipient
should check this email and any attachments for the presence of viruses.
Develer Limited accepts no liability for any damage caused by any virus
transmitted by this email. E-mail transmission cannot be guaranteed to be
secure or error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The sender
therefore does not accept liability for any errors or omissions in the
contents of this message, which arise as a result of e-mail transmission.

WARNING: Although Develer Limited has taken reasonable precautions to
ensure no viruses are present in this email, the company cannot accept
responsibility for any loss or damage arising from the use of this email or
attachments.

Develer Limited is a limited company registered in England and Wales. |
Company Registration No. 09817616 | Registered Offices: SUITE 1 SECOND
FLOOR EVERDENE HOUSE, DEANSLEIGH ROAD, BOURNEMOUTH, UNITED KINGDOM, BH7 7DU

Wait where did we end up with cuckoo? Seems like a really great option and from my discussion with Colin he is also a fan. May be interesting to look at some combination using Cuckoo in the mix. Would surely be unique, no?

having multiple algorithms doesn't actually stop ASICs from being made. see X10 and X11 ASICs. even if you mix 2-3 "unique" algos in a chain, it's just a matter of time for somebody to come with an ASIC for it.
the algo(s) can't rely on being only computationally expensive, argon2 does it all, because it has diminishing returns https://password-hashing.net/argon2-specs.pdf
so my vote would be on argon2 as well

What if transactions (at least below a particular threshold) required some sort of collateral that would be automatically released / returned to the sending account under conditions of non-quorum after a period of time after being first seen? The idea to increase risk for bad actors, especially in the event of double-spend attempts (forcing quorum) w/ small transactions (or even large?) Obviously not a fully-formed idea - just thought it may have some value with further thought.

The application of hashes and nonces in the Sia algorithm is sufficiently different than how ours is used to be completely incompatible. ASICs are "Application Specific", which means they are optimized as much as possible for that algorithm such that minimal data is passed to the chip to be processed. Thus, using the chips as generic blake2b hashing machines would be inefficient and tax the data bus to each chip.

Rather, the design would have been optimized on FPGA to take the minimal input data and do as much of the algorithm as quickly as possible, either returning nothing or a solution. Thus the data to start with would be sent out to every ASIC chip (many on the board in parallel) and all work on the solution at the same time.

While we use the same hashing algorithm the pow algorithm is vastly different.

To bring up cuckoo cycle throttling, has anyone investigated this or know of any obvious issues? From the description it seems to resist ASICs. https://github.com/tromp/cuckoo

Two concerns I'd like to look at:
1) Is the proof size variable length?
2) Is the proof size large?

See also issue #1064

We are going to be pushing moving to a different work calculation mechanism down the road. For now we will prioritize blocks to vote on based on their work (#1298), use a dynamic floor (#196), and in the future come up with a different work mechanism (this issue).

@clemahieu Cuckoo Cycle is no longer considered ASIC resistant (article by the author: John Tromp)
"I now claim Cuckoo Cycle to be ASIC friendly. Its main quality is that it is the simplest possible PoW, and that it functions as a proof of SRAM."
https://forum.aeternity.com/t/cuckoo-cycle-no-longer-considered-asic-resistant/814

Grin initially opted to use the new Cuckoo Cycle PoW algorithm, also purported to be ASIC resistant due to being memory latency bound. This means that the algorithm is bound by memory bandwidth rather than raw processor speed with the hope that it will make mining possible on commodity hardware. In August 2018 the Grin team made an announcement that they have become aware that it was likely that an ASIC would be available for the Cuckoo cycle algorithm at launch of their mainnet. While they acknowledge that ASIC mining is inevitable they are concerned that the current ASIC market is very centralized (i.e. Bitmain).
https://forum.hypernum.com/threads/grin-vs-beam-a-comparison.208/

Initially, Grin intended to use 2 PoW algorithms (which sounds like a novel approach): Each block mined with the primary Proof-of-Work tends to increase the secondary scaling factor, meaning that every time a “primary” block is found it gets easier and easier to mine with the secondary proof-of-work algorithm. In contrast, when a “secondary” block is found the secondary scaling factor tend to reduce. Which means that it gets harder to mine with the secondary PoW (since you need a higher solution difficulty to reach to the target difficulty) and easier to mine with the primary proof-of-work.
https://blog.blockcypher.com/an-introduction-to-grin-proof-of-work-103aaa9f66ce

My vote, same as @Nevsor and @FSMaxB would goto Argon2d which is specifically designed for cryptocurrencies.
https://github.com/P-H-C/phc-winner-argon2/blob/master/argon2-specs.pdf

I considered argon early on but the issue I ran in to was the verification wasn’t trivial. It took the same large amount of memory to verify as it does to generate a single round. We use argon2 for the key derivation function which benefits from this slow verification. Has argon added a fast verification node?

@clemahieu Fast hash verification aka The Merkle-tree-based PoW based on Argon2 (called MTP) has recently been implemented by Zcoin.

The MTP-Argon2 requires 2 GB of RAM to make a proof, and can be initialized and start producing proofs in less than 1 second – performance hardly beaten by competitors like Equihash for the same amount of RAM.

Development of MTP originally began in 2017. After researchers had found issues in the original paper, Zcoin launched several bounties to address both issues at the theoretical level as well as in the implementation. The original researchers successively addressed the first version’s issues including further enhancements with their revised paper partially funded by Zcoin and published in January 2018. Zcoin released the first public version of MTP on their testnet in May. Since then it has been tested and refined and went on their mainnet on 10 Dec. 2018.

Link below contains links to the revised academic paper and their bounty program results
https://zcoin.io/mtp-zcoins-new-proof-of-work-algorithm

Link to source code
https://github.com/zcoinofficial/zcoin/tree/master/src/crypto/MerkleTreeProof

After briefly reading over the paper this seems to address the issues I had with using the argon2 kdf for PoW generation.

I'm going to do another pass on the paper and some of the source code in the next few days. Do they have an implementation outside the zcoin repository? It looks like their repository is covered under a general MIT license though it'd be nice if they had plans to separate it out.

@clemahieu I wasnt able to find any other implementation, the researchers seem to have collaborated with zcoin for their revised paper. Maybe there are opportunities to build further on what zcoin has coded or do a collaboration so that synergies in terms of sharing experiences can be exchanged? (depending on code quality and zcoin's willingness to collaborate), on the other hand maybe its better to code a new clean implementation and perhaps collaborate with the researchers.

The proofs look rather large:

    uint8_t hashRootMTP[16]; // 16 is 128 bit of blake2b
    uint64_t nBlockMTP[mtp::MTP_L*2][128]; // 128 is ARGON2_QWORDS_IN_BLOCK
    std::deque<std::vector<uint8_t>> nProofMTP[mtp::MTP_L*3];
    constexpr int8_t MTP_L = 64;

chanced upon Itsuku which claims to have 1/16th the proof size. couldn't find any implementation except for this python library though

@nano-ark _"Although the benchmarks of Itsuku are not available, we could
guesstimate that the 32-fold increase in the number of memory accesses would result in 10-15x decrease in performance. Therefore, to fit in 1 second Itsuku would have to be run with 100-200 MiB at most."_ (section 4.6, page 9 https://arxiv.org/pdf/1606.03588.pdf)
MTP authors actually refer to Itsuku in their paper (hinting that it does not align with the memory-hard function concept), Proof sizes are indeed smaller: MTP-Argon2 186 KiB vs Itsuku 11 KiB

This might sound silly, but what would be the advantages and disadvantages of using AES as a basis for proof of work scheme?

AES has specialized instructions to accelerate it on both modern PCs and mobile devices. I'm not sure if GPUs or ASICs would provide tremendous benefit (but I might be wrong! tell me about it).

In any case, the difference might be smaller than with other hash functions (including with sha256 that also has acceleration on some architechtures) because encryption is extremely improtant for modern computing (networking in particular) and hashing isn't.

There are many constructions that enable AES to be used for proof of work.

Here's a simple one (my own):

Generate two keys from a seed K1 = HASH(seed||1), K2 = HASH(seed||2).

Encrypt (0, 1, 2, 3, ...) using both keys until you've found a pair of encrypted blocks are sufficiently similar (have prefix of n common bits). Output the index used.

The index can limited to the range [0, m] to prevent various attacks that exploit the reversibility of encryption (I don't currently remember what these were when I came up with this construction).

Edit:

Of course, you can remove the need for the verifier to initialize the keys by doing something like this:

Let K1 and K2 be two constants (nothing up my sleeve values):

For both keys encrypt the sequence (seed ^ 0, seed ^ 1, seed ^ 2, seed ^ 3, ...)

Now the verifier only has to compute E_k1(seed ^ solution) and E_k2(seed ^ solution)

Edit:

Of course as a "basis" for a PoW I meant it could be combined with some memory-hard approach.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

paulmelis picture paulmelis  Â·  6Comments

bbedward picture bbedward  Â·  3Comments

bartclaeys picture bartclaeys  Â·  3Comments

AugustoResende picture AugustoResende  Â·  3Comments

frankh picture frankh  Â·  3Comments