Describe the bug
My node on UTC time successfully created a block in epoch 41 like scheduled:
The block was lost to a competitive fork, the node was thus NOT in a fork at the time of block creation.
Winning blockhash:
According to the explorer this block is on the same chainlength 125640 but on a different slot 8643 and a completely different time almost 40 minutes later:
January 24, 2020 01:01:43
Mandatory Information
jcli --full-version jcli 0.8.7 (8.7-364cd84e, release, linux [x86_64]) - [rustc 1.40.0 (73528e339 2019-12-16)]jormungandr --full-version jormungandr 0.8.7 (8.7-364cd84e, release, linux [x86_64]) - [rustc 1.40.0 (73528e339 2019-12-16)]To Reproduce
Steps to reproduce the behavior:
Expected behavior
Slot of winning blockhash should match own losing leaders_log
Timestamp of winning blockhash should be closely aligned to own losing leaders_log (maybe a few ms quicker, which is why the competitive fork won)
It should be possible to look at blockhash of winning pool to better understand why the competitive fork was lost, the current provided information on slot and timestamp does not make any sense.
Why do you think it was lost to a competitive fork?
Your node probably made a block on fork which different chain length and slot numbers means.
Explorer shows your LOCAL time, while output from your node leaders logs is in UTC. No time distortions.
P.S. Pooltool was fed by fake data, so you shouldn't rely on it now.
P.S.S 41.8603 - missing from the current chain completely, so it was not lost to a competitive fork. Your node just failed to produce it. Check what parent hash your node took to create that slot.
Thank you for looking at this adascanBovich, following up on your questions:
1. Because I run two nodes in failover mode which means my pool is at the top of the blockchain pretty much all the time and because pooltool.io did display my lost block as a competitive fork, I do assume pooltool.io bases this decision on the parent hash: https://pooltool.s3-us-west-2.amazonaws.com/8e4d2a3/pools/3a192bc7285d7ddfd77c0b854b4008bbf4c5fe53d1ad42770f6ffdf9b3d4c7d6/byepoch/41/winloss.json
The error might of course also be with pooltool, I will direct papa carp here to take a look.
2. As I wrote the chainlength is exactly the same at 125640 for both my block and the winning blockhash. Timestamp and slot differ. This is what is confusing.
3. The latest blockhash on epoch 43 at the time of writing this is
https://explorer.incentivized-testnet.iohkdev.io/block/fc974de9fd7700d00b5a34caa0c7ce77808de85223a6d38c55f452343dcd8486
with the following timestamp: January 26, 2020 09:50:31
Currrent time UTC is January 26, 2020 09:12:16 UTC
My local time January 26, 2020 10:12:16 UTC
So I don't know what time the Explorer shows. If the Explorer does not use UTC as the JCLI does this is a bug in itself and needs fixing. The timestamps should at least name the timezone they are to be interpreted in.
Even assuming for a moment you are right which means my block was created at
UTC finished_at_time: "2020-01-24T00:00:23.015275516+00:00"
The winning block at January 24, 2020 01:01:43
My local time at block creation was 01:23 am, so the timing does not make any sense either way.
No matter where you are in the world the minute of the hour is always the same, the hour or day may differ but not the minute, so something is very wrong here.
P.S.: Papa Carp, any comment? In this specific case you got data from my pool which I vouch for and from the final blockchain, so even if you got fake data in the past (like max blockheight) this should not matter for this specific case.
P.P.S.: Very good idea to check my parent hash at block creation time, this could verify this was indeed a competitive fork also from my end. Unfortunately my journalctl log already is too far ahead, can't look back at the time of the competitive fork anymore, I just enabled persistent logging of journalctl for the next occurance.
Papa Carp, you do base competetive fork decision on identical parent hash surely, can you comment please?
As I wrote the chainlength is exactly the same at 125640 for both my block and the winning blockhash. Timestamp and slot differ. This is what is confusing.
It just means that your node didn't took hash of 125639, but some older one, and network rejected it. Next valid block on chain was 41.8643 that is why it have 125640 height.
Even assuming for a moment you are right which means my block was created at
UTC finished_at_time: "2020-01-24T00:00:23.015275516+00:00"
The winning block at January 24, 2020 01:01:43
My local time at block creation was 01:23 am, so the timing does not make any sense either way.
Once again you can't call it winning block. Winning blocks have same value for slots. Like you produced 41.8603 but on explorer you seeing different pool id that created the same slot. Your node just failed to produce valid block, then next produced valid block was at 41.8643 that is it.
FYI value 41.8643 - just representation of the timestamp. You can calculate it like:
genesis_timestamp + epoch * slots_per_epoch * slot_duration + slot * slot_duration
And you can get all the values from api/v0/settings api call.
Your slot have timestamp of:
1576264417 + 41 * 43200 * 2 + 8603 * 2 = 1579824023
datetime.utcfromtimestamp(1579824023).ctime()
>> Fri Jan 24 00:00:23 2020
Slot 41.8643:
1576264417 + 41 * 43200 * 2 + 8643 * 2 = 1579824103
datetime.utcfromtimestamp(1579824103).ctime()
>> Fri Jan 24 00:01:43 2020
So I just can't understand what is the problem here. There are two completely different slots. Why are you calling 8643 competitive fork to 8603? There is no connection between them.
Thanks for elaborating on this adascanBovich, there might be multiple issues here. One issue seems to be at least with pooltool.io competitive fork detection.
I was calling 8643 a competitive fork to 8603 because pooltool.io did: https://pooltool.s3-us-west-2.amazonaws.com/8e4d2a3/pools/3a192bc7285d7ddfd77c0b854b4008bbf4c5fe53d1ad42770f6ffdf9b3d4c7d6/byepoch/41/winloss.json
I just wanted to better understand this to learn how I can avoid this moving forward.
My original leader_log said:
After it was finished it said:
I now looked at my first "competitive fork" according to pooltool.io in epoch 39:
{
"allcompetitions": {
"121269": {
"height": 121269,
"players": [
{
"pool": "b3146f94b9dbf51f2838056538ebec0cc9d84aece3be4c885de954dd7b392b74",
"hash": "86814cac7ed2bf0742df3c8da2ff40d3dbad675a307ddfe7c9daff96a2382c06"
},
{
"pool": "3a192bc7285d7ddfd77c0b854b4008bbf4c5fe53d1ad42770f6ffdf9b3d4c7d6",
"hash": "b617da12c6da397a6c43fbe59ceb5ae683e166bc84c1c0a37817d7438933e6f7"
}
],
"winner": "b3146f94b9dbf51f2838056538ebec0cc9d84aece3be4c885de954dd7b392b74"
}
},
"w": 0,
"l": 1
}
My leader_log said prior to creation:
After creation is said:
The winning block
https://explorer.incentivized-testnet.iohkdev.io/block/86814cac7ed2bf0742df3c8da2ff40d3dbad675a307ddfe7c9daff96a2382c06
has the same chain_length: 121269
again a different slot: 31254
and again a different time (of course also if calculated by formula): January 22, 2020 13:35:25
So why did my node twice create a block apparently on schedule in the second case 3 slots later in the first case 40 slots earlier but ended up with the same chain_length of an actual block on the final chain with a different slot? This is still not plausible to me.
If my node was not in sync at the top of the blockchain the resulting chain_length from the viewpoint from my node should have been shortened by the amount my node was lagging behind or the node should have failed creating a block or it should have been rejected right away. I do not see any rejection message in my leader_log.
Anyway https://pooltool.io/competitive is now down due to manipulation. I just wanted to follow-up on a number of random other competitive forks, to see if I find a sample where slot and chain_length actually match, which is what I expected looking at my own competitive fork cases but found otherwise.
Now with all the info you provided there is clearly an issue with pooltool competitive forks detection. You should trust only your node logs for 100% accurate info.
For the future - make sure you will save all the logs at least for 1 epoch long. And check them in time.
If your node says block was scheduled at 39.31257 it is not possible to compete with 31254 slot. So pooltool is wrong here.
Unfortunately I didn't keep my logs long enough to verify this for sure, but from the looks of it I would bet I have also seen the issue described here #1513 twice so far
Closing this issue, as I didn't keep my log around to debug this further to really understand the root cause. The main bug is with pooltool, Papa Carp is on this.
@Straightpool thank you for the follow up. Much appreciated.