Lisk-sdk: Move maintenance of unconfirmed state from database to main memory

Created on 9 Jul 2018  路  8Comments  路  Source: LiskHQ/lisk-sdk

Background

Currently, we maintain the unconfirmed state for transactions in the unconfirmed queue by using unconfirmed columns in the database. Which means that for every transaction added to the unconfirmed queue in the transaction pool, we perform database writes on that transaction. Since transaction moves from queued to unconfirmed queue back and forth multiple times before (possibly) becoming part of the blockchain, node takes a big performance hit because of it.

Summary of proposed solution

The node should maintain the unconfirmed state of accounts relevant to transactions present in the transaction pool in memory so that it can verify an incoming transaction against the confirmed state of the blockchain and also against the unconfirmed state of the transaction pool.
The node can implement this behavior by maintaining a queue of transactions, and a hashmap of accounts in the transaction pool.

Implementation summary

Whenever there is an incoming transaction, we check if the account is present in memory. If it is not, we fetch that account from the database and maintain in it pool鈥檚 hashmap and verify the transaction against it. If it is present in the pool, we use the existing account from the pool. And so after verification, we will update the unconfirmed properties of the account.

Caveat:

The transactions are not saved in the block based on the order in which they were received, we expect to have these transactions change order based on the current implementation of the order of transactions to be put in the block (Also, with dynamic fees in the pipeline, the order of the transaction will change much more frequently). This leads to a problem, how to check whether the transaction is valid in the order they are expected to go into the blockchain. A node should check to verify transaction based on the order because of the possibility of conflicting transactions.

Conflicting transactions are those transactions which are dependent on each other in such a way that they are valid separately, but when processed together can lead to the invalidation of one or more transaction. For example, say one user has 10 LSK in his account if he makes two transactions sending 10 LSK and sends it to a node. Then only one of the transaction can be processed and become part of the blockchain.

Solution

The simplest solution is to reorder all transactions once a new transaction is received, and if the order is changed then reverify and reapply all of the transactions. This will mean that for each new transaction added to the pool, the performance hit will be: BigO(N). This is not a very scalable solution. So node should define certain rules to handle for all scenarios which can lead to conflicting transactions.

In Lisk, there are different scenarios which can lead to conflicting transactions based on each transaction type. A node should handle for all the scenarios based on certain rules such that the behavior is consistent and it does not need to reverify and reapply all transactions everytime a new transaction is added.

Assumption: A block is an smallest discrete step in a blockchain, this means that when transactions are put in a block, they should behave as if they are all applied at the same time. This assumption will imply that all transactions in the current block of the blockchain should on their own be valid against the state of the blockchain present as of the last block. This assumption also implies that it鈥檚 not possible to have a transaction which is invalid based on the state of the blockchain as of the last block of the blockchain but is only valid because of some transaction which came before it in the same block. For example, if A sends 10 LSK to a new account B where B did not initially have any LSK then B cannot spend that money in the same block irrespective of whether the transaction from B comes after A in the block transactions.

Listing all transaction types and their properties and effects on the blockchain:

For transfer transaction:

If there are two or more transfer transactions from the same account, then it means that it鈥檚 possible that those transactions are conflicting.

For second signature transaction:

If there are two or more second signature transactions from the same account, then it means that those transactions are definitely conflicting. And all transactions that come after second signature transaction are affected by second signature transaction.

For delegate registration transaction:

Delegate username is unique, so all the delegate registration transactions are possibly conflicting transactions.

For vote transaction:

All vote transactions from the same account are possibly conflicting because it鈥檚 possible that someone tried to vote same delegate in two different transactions.

For multi transaction:

All multisignature registration transactions from the same account are conflicting. And all transactions that come after multi registration transaction are affected by multi registration transaction.

For dapp registration transaction:

All dapp registration transactions are possibly conflicting with other dapp registration transactions.

Edge case (In case of dynamic fee and also due to order of transactions in the same block):

In case of block creation and deletion, generator of the block needs to have his account reloaded/updated if it exists in pool.
In case of block creation and deletion, all accounts included in transactions needs to be reloaded/updated if they are part of the pool.

So, to summarise, there are 6 scenarios which can lead to a conflicting transaction:

  1. Transaction type is unique for a particular account.
  2. Transactions from the same account are conflicting with each other.
  3. Transaction type effects the state of the account for all transactions after that one.
  4. Transaction type contain some data which needs to be unique across blockchain.
  5. Transaction is from a delegate who forged the last block.
  6. There is a new block forged/received which has transactions affecting accounts in the pool.

Rules:

  1. Transaction should be verified against confirmed state of the blockchain, and confirmed and unconfirmed state of the account.
  2. If a transaction type contains data which should be unique, node should maintain a hashmap of all the unconfirmed transactions of the same type. And verify that the uniqueness of the incoming transaction holds true.
  3. If there is a block change, node should reload accounts which were updated in that block and re verify and apply (unconfirmed) those transactions. This includes accounts which were involved in transactions included in the block and the generator of the block.

These three rules will enable us to handle all 6 scenarios
For scenario 1, 2 and 3, rule 1 will ensure that we have consistent data in the memory.
For scenario 4, we have rule 2, which will ensure that the uniqueness of the transaction is ensured.
For scenario 5 and 6, we have rule 3 which will ensure that pool maintains a valid state of the accounts even after new blocks are forged.

Which version(s) does this affect? (Environment, OS, etc...)

1.0.0

Most helpful comment

On theoretical level its looks an executable plan. I believe we will face some hiccups to completely move unconfirmed state in memory, but those can be sorted out when faced.

If we can do it, believe me we can also do in-memory snapshotting, by skipping unconfirmed sate and performing confirmed state computation in-memory.

All 8 comments

Only thing I would add here would be

Transaction should be verified against confirmed state of the blockchain, and confirmed and unconfirmed state of the account.

I think state management for the account in the memory is unnecessary here because it will increase complexity for the first attempt.
If we hit the bottleneck, we can introduce the cacheing mechanism for the account.
So the rule should be,

Transaction should be verified against confirmed state of the blockchain/account
and verified against transactions in the transaction pool.

Regarding rule 2:
Perhaps I understood it wrongly but, is it not the uniqueness of the data what should be preserved and checked in the transaction pool?
For example, if there are two accounts registering the same name for a delegate at the same time (respect to the blockchain), the transactions are going to be unique in the transaction pool (different ID, public key etc) but one of them needs to be necessarily invalid because of the name uniqueness.

I agree with @SargeKhan and @shuse2, moreover:

  • Remove the unconfirmed state from mem_accounts tables,
  • Perform DB reads only for checks against of the confirmed state (1 per every recevied transaction) and DB writes only to apply confirmed transaction's state while applying a block,
  • Check the unconfirmed state of upcoming transactions in memory by verifying them against all of the already queued transactions in transaction pool.

_Why not to remove transactions unconfirmed state completely?_

To prevent potential malicious attacks of spamming the network with transactions, which are all valid from the confirmed state perspective, but invalidating each other within one block slot. Let's say the sender's account balance is 1.1 LSK, the sender sends 1000 transactions, all spending 1 LSK. All 1000 are valid against the confirmed state. 999 transactions would be invalidated only when a new block containing one of them is being applied - after 10 seconds. Within that time transaction pool is overfilled and keeps on rejecting all other, possibly valid transactions.

_How to perform unconfirmed transactions checks in memory?_

  • Received transaction is being validated against the confirmed state. Node performs database read for all affected by transaction accounts. Only one DB read per transaction will be needed within 10 sec block slot.
  • Transaction gets checked against of all already applied transactions in transaction pool.

    • Needs to be optimised to check only data important from the perspective of a specific tranasction type.

    • Additional Data Structures - Maps are maintained in memory to optimise checks during validation of delegate/dapp/vote transactions.

  • When new block is being received:

    • remove duplicated transactions from transaction pool,

    • if not all of the transaction applied block contains were found in transaction pool:



      • revalidate all transactions in transaction pool against every new transaction found in block, which was not found in transaction pool.



On theoretical level its looks an executable plan. I believe we will face some hiccups to completely move unconfirmed state in memory, but those can be sorted out when faced.

If we can do it, believe me we can also do in-memory snapshotting, by skipping unconfirmed sate and performing confirmed state computation in-memory.

@shuse2, I agree that it will be complicated to manage account state in memory. And I really like what you suggested, and I agree with it.
@IkerAlus, you are right. That is what I meant, but after rereading the rule 2, I see why it might have caused confusion.
@MaciejBaj, thanks for describing the problem. I think we all agree on the proposed solution. So, I will make child issues based on what you described.
@nazarhussain, that's great! Can you share what problems you think might occur? So, we make a more informed decision?

e.g.

  • HTTP API need to send unconfirmed balances of accounts.
  • So if the unconfirmed state of account is managed under transaction pool then both will be inconsistent.
  • If we say that we will manage the unconfirmed states of accounts in memory across application then its difficult to implement and manage in memory
  • In prior case more better option is to maintain account cache for unconfirmed states in a central key/value store and invalidate on block processing

That's just one use case popped up, but we may face similar issues specially sharing the states of accounts in different modules.

But as I said earlier we may face hiccups but we should sort out those and move with this change.

Thanks! I will keep these in mind when I am creating child-issues. So, the work for each issue is self-contained.

The issue is part of our Roadmap (https://lisk.io/roadmap) - "Architecture and Design: Improve transaction processing efficiency" and will be re-opened as a milestone. Until now, the work is in progress in Lisk Elements repository - https://github.com/LiskHQ/lisk-elements/milestone/12.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ScrewchMcMuffin picture ScrewchMcMuffin  路  3Comments

willclarktech picture willclarktech  路  4Comments

MaciejBaj picture MaciejBaj  路  3Comments

Nazgolze picture Nazgolze  路  3Comments

karek314 picture karek314  路  3Comments