Parity-ethereum: DB atomic swap failed: renameat2 failed with code: 22 at the end of a warp sync

Created on 12 Jul 2018  ·  30Comments  ·  Source: openethereum/parity-ethereum

_Before filing a new issue, please provide the following information._

I'm running:

  • Which Parity version?: 2.0
  • Which operating system?: Linux
  • How installed?: binaries
  • Are you fully synchronized?: no
  • Which network are you connected to?: ethereum

At the end of a warp-sync and before having the normal sync, I get the error: DB atomic swap failed: renameat2 failed with code: 22

This does not prevent the sync to continue.

2018-07-12 12:51:39  Syncing snapshot 1723/1726     #6901   25/25 peers   5 MiB chain 31 MiB db 0 bytes queue 10 KiB sync  RPC:  0 conn,    0 req/s,  102 µs
2018-07-12 12:51:46  Syncing snapshot 1724/1726     #6901   25/25 peers   5 MiB chain 31 MiB db 0 bytes queue 10 KiB sync  RPC:  0 conn,    0 req/s,  102 µs
2018-07-12 12:51:50  DB atomic swap failed: renameat2 failed with code: 22
2018-07-12 12:51:56  Syncing #5945000 0xbaf7…07d7     0.00 blk/s    0.0 tx/s    0 Mgas/s     28+   62 Qed  #5945095   25/25 peers   103 KiB chain 34 KiB db 6 MiB queue 264 KiB sync  RPC:  0 conn,    0 req/s,  102 µs
2018-07-12 12:52:01  Syncing #5945000 0xbaf7…07d7     0.00 blk/s    0.0 tx/s    0 Mgas/s     44+  301 Qed  #5945350   25/25 peers   520 KiB chain 240 KiB db 32 MiB queue 1 MiB sync  RPC:  0 conn,    0 req/s,  102 µs
2018-07-12 12:52:06  Syncing #5945001 0xbdc6…bd91     0.20 blk/s   31.2 tx/s    1 Mgas/s      0+  505 Qed  #5945508   25/25 peers   836 KiB chain 2 MiB db 51 MiB queue 297 KiB sync  RPC:  0 conn,    0 req/s,  102 µs
B9-blocker 🚧 F2-bug 🐞 M4-core ⛓ P2-asap 🌊

Most helpful comment

Another instance of #9176 - mine keeps failing and re-starting the entire snapshot sync:

2018-07-27 07:58:19  Syncing snapshot 1810/1812        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:24  Syncing snapshot 1810/1812        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:29  Syncing snapshot 1811/1812        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:34  DB atomic swap failed: MoveFileTransactedW failed with code: 5
2018-07-27 07:58:39  Syncing snapshot 0/1816        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:44  Syncing snapshot 0/1816        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:49  Syncing snapshot 0/1816        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:36  Syncing snapshot 1809/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:41  Syncing snapshot 1810/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:46  Syncing snapshot 1811/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:50  DB atomic swap failed: MoveFileTransactedW failed with code: 5
2018-07-27 20:39:51  Syncing snapshot 1812/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:40:01  Syncing snapshot 0/1816        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:40:11  Syncing snapshot 0/1816        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs

Using 2.0.0, Windows 10, db killed, default options (no command line args or anything in config)

All 30 comments

Is that something for you @ngotchac?

@Tbaut
The right person for this is @debris (as many other things :crab:)

But this is just a warning that atomic swap of the directories (including it files) failed and parity fallbacks to non-atomic swap instead!

Maybe, just change the warn! message a bit something like DB atomic swap of directories failed during restore falling back to non-atomic swap

Yes, it's not a critical warning, but just a warning. I will change the message into something less negative. :)

More importantly, why is it failing? It this something that happened only once? The error number seems to point to EINVAL which for renameat2 means:

"The new pathname contained a path prefix of the old, or, more generally, an attempt was made to make a directory a subdirectory of itself."

@Tbaut what are the cli options that parity was started with?

It was --warp-barrier and --no-ancient-blocks@debris

I also had this just now with --warp-barrier

For https://github.com/paritytech/parity-ethereum/issues/9176 it's even restarting the whole warp-sync, making it impossible to sync.

Another instance of #9176 - mine keeps failing and re-starting the entire snapshot sync:

2018-07-27 07:58:19  Syncing snapshot 1810/1812        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:24  Syncing snapshot 1810/1812        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:29  Syncing snapshot 1811/1812        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:34  DB atomic swap failed: MoveFileTransactedW failed with code: 5
2018-07-27 07:58:39  Syncing snapshot 0/1816        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:44  Syncing snapshot 0/1816        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 07:58:49  Syncing snapshot 0/1816        #0   24/25 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:36  Syncing snapshot 1809/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:41  Syncing snapshot 1810/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:46  Syncing snapshot 1811/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:39:50  DB atomic swap failed: MoveFileTransactedW failed with code: 5
2018-07-27 20:39:51  Syncing snapshot 1812/1812        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:40:01  Syncing snapshot 0/1816        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-07-27 20:40:11  Syncing snapshot 0/1816        #0   27/50 peers      9 KiB chain    3 MiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs

Using 2.0.0, Windows 10, db killed, default options (no command line args or anything in config)

Just hit this also. I'll only add that patience prevailed and something was clearly happening behind the scenes that would have indicated everything was fine, but the logs "appeared" as if the entire process had stalled.

2018-08-16 16:04:20 UTC Syncing snapshot 1903/1903  #6115000    6/50 peers    102 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:04:15 UTC Syncing snapshot 1903/1903  #6115000    6/50 peers    102 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:04:10 UTC Syncing snapshot 1903/1903  #6115000    6/50 peers    102 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:04:01 UTC Syncing snapshot 1903/1903  #6115000    6/50 peers    102 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:04:01 UTC Syncing snapshot 1903/1903  #6115000    6/50 peers     32 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:03:53 UTC DB atomic swap failed: renameat2 failed with code: 22
2018-08-16 16:03:50 UTC Syncing snapshot 1903/1903  #6115000    6/50 peers     88 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:03:45 UTC Syncing snapshot 1902/1903  #6115000    6/50 peers     88 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:03:40 UTC Syncing snapshot 1900/1903  #6115000    6/50 peers     88 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-16 16:03:35 UTC Syncing snapshot 1900/1903  #6115000    6/50 peers     88 KiB chain    1 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs

It stayed like that for a few minutes and then eventually was chugging along happily at the tip of the chain.

Using the following (same call as in fs-swap):

#include <string.h>
#include <errno.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/fs.h>

int main(int argc, char *argv[]) {
    syscall(SYS_renameat2, AT_FDCWD, argv[1], AT_FDCWD, argv[2], RENAME_EXCHANGE);
    printf("%s\n", strerror(errno));
    return 0;
}
gcc renameat.c -o renameatx



md5-b76ccf8a32aebea39ceb75379c761e23



mkdir a b && ./renameatx a b

Fails on my laptop (Invalid Argument == err code 22) but succeeds on my VPS:

image

Note slightly different kernel version

@ascjones this is strange, I've tested this c program on ext4 and btrfs and it worked fine

EINVAL An invalid flag was specified in flags.
EINVAL The filesystem does not support one of the flags in flags.
renameat2() was added to Linux in kernel 3.15.

I posted yesterday about one node that corrected itself. That node did not have a wallet unlocked on it.

I have a second node I've been trying to launch that does have an unlocked wallet attached. And now the behavior is like drgrog's comment here. The configurations between the nodes (besides the wallet) are otherwise identical.

Both nodes are using the parity docker image latest tag

Note that there are two different failure cases reported here, on linux with renameat2 failed with code: 22 and on Windows with MoveFileTransactedW failed with code: 5. Code 5 on windows is ERROR_ACCESS_DENIED https://docs.microsoft.com/en-us/windows/desktop/debug/system-error-codes--0-499-

I am still getting this on Windows....every single time without a fail...any ideas what to do?

@alphasignal I was also getting the MoveFileTransactedW failed with code: 5 every time I tried to sync with Parity 2.0 on Windows. I ended up installing Parity 1.11.8 instead and that ended up syncing without any issues.

@Nanolucas this is just a warning and it does not affect correctness of the application at all

that said, I'll fix it

@debris I understand that it's a warning, but the actual issue is that once the warning is received, the next step is that the sync restarts from 0. So it spends the time to put something like 30GB on disk and then once it gets to 100% it wipes all progress and starts again, and hence it never finishes.

@debris I understand that it's a warning, but the actual issue is that once the warning is received, the next step is that the sync restarts from 0. So it spends the time to put something like 30GB on disk and then once it gets to 100% it wipes all progress and starts again, and hence it never finishes.

It should never happen. It should fallback to the behavior from 1.11.8

To whom this issue concerns, could you run parity-ethereum with -l snapshot=trace flag and attach the logs here?

I am on Windows 10, running Parity-Ethereum v2.0.1-beta-e7dc0bed1-20180726. Here is what I get with -l snapshot=trace (trying to sync Kovan, since it's much faster there):

2018-08-23 12:25:19  IO Worker #3 INFO import  Syncing snapshot 410/410        #0    5/25 peers      9 KiB chain    7 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-23 12:25:23  IO Worker #2 TRACE snapshot  finalizing restoration
2018-08-23 12:25:24  IO Worker #2 TRACE snapshot  Replacing client database with "C:\\Users\\mg\\AppData\\Local\\Parity\\Ethereum\\chains\\kovan\\db\\9bf388941c25ea98\\snapshot\\restoration\\db"
2018-08-23 12:25:24  IO Worker #2 WARN kvdb_rocksdb  DB atomic swap failed: MoveFileTransactedW failed with code: 5
2018-08-23 12:25:24  IO Worker #2 TRACE snapshot  removing old snapshot dir at C:\Users\mg\AppData\Local\Parity\Ethereum\chains\kovan\db\9bf388941c25ea98\snapshot\current
2018-08-23 12:25:24  IO Worker #2 TRACE snapshot  copying restored snapshot files over
2018-08-23 12:25:24  IO Worker #3 INFO import  Syncing       #0 0xa3c5…2cb9     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+    0 Qed        #0    5/25 peers      9 KiB chain    7 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-08-23 12:25:26  IO Worker #1 TRACE snapshot  Snapshot is now initialized with 0 completed chunks.
2018-08-23 12:25:26  IO Worker #0 TRACE snapshot  verifying transition to epoch at block 0
2018-08-23 12:25:26  IO Worker #0 TRACE snapshot  Verified epoch transition for epoch at block 0
2018-08-23 12:25:34  IO Worker #3 INFO import  Syncing snapshot 23/410        #0    5/25 peers      9 KiB chain    7 KiB db  0 bytes queue   10 KiB sync  RPC:  0 conn,    0 req/s,    0 µs

And this continues forever, re-syncing from scratch every time the swap fails.

Here is a log with a warp-sync from scratch and -l snapshot=trace (removed the irrelevant part when warp-syncing): https://gist.github.com/Tbaut/e06f298c80637871686869754435644b#file-kovan-renameat2-snapshot-trace-log

I also have the same problem with Parity-Ethereum/v2.0.1-beta-e7dc0bed1-20180726 windows client, trying to sync to Tobalaba. -l snapshot=trace log:
https://gist.github.com/ngyam/1a6daecab6725c8ed0dff3c1fe03dec1#file-tobalaba_windows_client_log-txt

I'm using Parity-Ethereum/v2.0.1-beta.
Just started my node and got the error DB Atomic Swap Failed: MoveFileTransactionW failed with code: 5

Did a DB kill and started syncing, it gives the same error once it reaches the end of the sync. Any current fix without falling back on Parity 1.11.8?

Bumping the prio

Is there any resolution to this? I'm among many people who experience MoveFileTransactedW failed with code: 5 on Windows 8 and my machine just never syncs. 1.11.8 started to have issues that my UI just wont start, both the browser and client. I tried to revert back to 1.11.8 and even earlier versions and it works the first few times and then the UI problems start again. I basically can't use parity at all now. That being said, the syncing time makes it a very costly test to run. Any solutions?

There is no solution yet (apart from running it on linux/docker). We're working on it.

Has this had any progress at all? Receiving:

DB atomic swap failed: MoveFileTransactedW failed with code: 5

When warp sync reaches 100%. Just loops. Windows 10. Exe Install, sync'ing mainnet, parity version 1.11.10 (UI 0.3.4)

(Ropsten seems to sync fine)

Update: I've discovered It doesnt just start from scratch once sync reaches 100%, if the sync is interrupted or parity is closed down (eg to turn off computer) then the sync is starting from 0% again.

Been doing some more testing, it doesnt restart the sync if the computer is put to sleep rather than turned off. Resumes as if nothing happened at all (no activity in command prompt indicating sleep occured)

Was this page helpful?
0 / 5 - 0 ratings