Minecraftforge: Chunk Loading while server is under stress causes NBT data for Tile Entities to be destroyed

Created on 21 Sep 2017 · 17 Comments · Source: MinecraftForge/MinecraftForge

Steps to reproduce:

  1. Create a server with forge-1.12.1-14.22.1.2485-universal and appliedenergistics2-rv5-alpha-4.jar, using default configs.
  2. Copy the world file (world.zip) to the server directory.
  3. Launch the server with these args (the only important thing is the low memory; the rest is just copy-pasted from what I was using): java -Xms256M -Xmx1G -XX:+UseParNewGC -XX:+CMSIncrementalPacing -XX:+CMSClassUnloadingEnabled -XX:ParallelGCThreads=4 -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -jar forge-1.12.1-14.22.1.2485-universal.jar nogui
  4. Restrict the client to 1G of RAM and start it with the same Forge and AE versions.
  5. Move the character to X=-215 Y=110 Z=-28 and look at the ME controller with all those nice entities in view.
  6. Completely close the client.
  7. Completely close the server.
  8. Restart the server and the client.
  9. Log in to the server from the client. Parts of the ME network will be missing or disconnected. You may need to restart the client 1-3 times, closing it completely each time. (Example: https://i.imgur.com/TRFrklM.png)
  10. Forced crash log: https://pastebin.com/yE1L5MMd

Bonus Points: use forge version forge-1.12.1-14.22.0.2474-universal for even more missing blocks and issues (it's worse in that version)

Reference these 2 threads for information:
https://github.com/AppliedEnergistics/Applied-Energistics-2/issues/3013
https://github.com/thiakil/Applied-Energistics-2/issues/76

I have reproduced NBT data wipes with just Thermal Expansion installed, just Applied Llamagistics installed, and now just Applied Energistics installed, with both Forge 14.22.0.2474 and 14.22.1.2485 (all on MC 1.12.1).

My guess is that it has nothing to do with the entities or AE network size, but rather some timing issue with loading chunks. Slower servers/computers or those with restricted amounts of memory might hit it more often? The entities and larger networks just slow it down enough to trigger the bug.

Oh, and I just got it to wipe TE machine NBT data without Applied Llamagistics installed. Crash report: https://pastebin.com/9rKSe7PP

All 17 comments

Can you get this to happen for vanilla TEs? Chests are probably an easy one to test.

I'm at work at the moment; I will try when I get home. I'm assuming that I can just stuff 4 chests facing all 4 different directions with an assortment of colored wool and repeat the stress above.

It would be REALLY helpful if I could find a way to stress the server without spawning a few hundred entities. Any ideas?

Hello. I'm not part of the ATM (All the Mods) team, but I am helping them troubleshoot an issue, and ATM has a similar problem with a very broad swath of tile entities in SMP.

We simply walk away about 8 chunks and come back, and we lose NBT data and sometimes tile entities. It's extremely reliable and we see "failure writing chunks" errors.

Stack traces are available here, with some annotations of me walking away and returning for a few attempts. I am sure the ATM team can provide a copy of the map and modset upon request as well: https://gist.github.com/OrdinatorStouff/9ce7148065e75f84549c0d6f6df8e7aa

@williewillus I cannot. I've tried for about an hour using both Forge versions I tried previously. I've tried hoppers, chests, and furnaces with a variety of server and client memory sizes, and aside from the occasional out of heap space exception everything ran as expected.

I will continue trying to reproduce with JUST forge installed. Perhaps I can run it in a VM to starve it of CPU time.

That's weird that it doesn't happen to vanilla. Does it happen to non-AE TEs? If not, then maybe this is an AE bug.

Yes. I've reproduced it with Thermal Expansion Machines. They lose their upgrades, input/output assignments, and orientation.

I believe, if needed, I can get it to happen with IC2 wires and machines as well. I've had the same issue with those as I have with the other 2 mods in normal playthroughs.

Edit: relevant bug https://github.com/CoFH/Feedback/issues/518

I'm re-doing my test for TE using my mature world to make sure I speak the truth.

To add to @KirinDave's comments, the ATM team has been getting reports of this "NBT loss" for a couple of weeks now; I'm not sure when we first started hearing about it. Thermal Expansion's TEs are among the most commonly affected, and they were the initial subject of this issue report for ATM3. @KingLemming has also chimed in on the CoFH issue tracker that it's not a Thermal issue, but something affecting NBT.

We've seen Mekanism pipes lose connection info, AE2 skystone chests lose their contents, Industrial Foregoing TEs lose contents and configuration, Botania mana spreaders lose their facing direction, mana pools lose their contained mana, and at least a few others, but these are the big ones off the top of my head. Interestingly, we too have neither seen firsthand nor heard reports of vanilla TEs being affected.

Initial troubleshooting seemed to link the corruption to an intermittent crash that involved JourneyMap and ForgeEvents. After removing JourneyMap from the server there were no more fatal crashes, but we could still repeatedly "reset" machine NBT by flying a few chunks away and flying right back. During this time the latest.log (the same one that @KirinDave linked earlier) shows a CME (ConcurrentModificationException) somewhere in ChunkMap each time the TEs lost NBT after he went away and came back.

We had not ruled out the possibility that some other mod (as opposed to Forge) could be causing this. @maruohon has had some theories along those lines, for example a mod writing to TEs in the world from within TileEntity#writeToNBT(), in another thread, before/after the chunk save method is called.
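The theory above can be illustrated without any Minecraft classes. This is only a minimal sketch, not Forge or mod code: `FakeTile`, `saveAll`, `saveAllFromSnapshot`, and `newWorld` are hypothetical names, and a plain `HashMap` stands in for a chunk's tile entity map. It shows how a serializer that modifies the collection it is being saved from triggers a `ConcurrentModificationException`, and how iterating over a snapshot avoids the exception.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SaveDuringWriteSketch {
    /** Hypothetical stand-in for a tile entity; not a Forge class. */
    public static class FakeTile {
        final String id;
        final boolean misbehaves;

        FakeTile(String id, boolean misbehaves) {
            this.id = id;
            this.misbehaves = misbehaves;
        }

        /** Serializes this tile; a misbehaving tile also adds a neighbour to the world map. */
        String write(Map<String, FakeTile> world) {
            if (misbehaves) {
                world.put(id + "-neighbour", new FakeTile(id + "-neighbour", false));
            }
            return "{id:" + id + "}";
        }
    }

    /** Iterates the live map, so a write() that mutates the map breaks the iteration. */
    public static List<String> saveAll(Map<String, FakeTile> world) {
        List<String> out = new ArrayList<>();
        for (FakeTile tile : world.values()) {
            out.add(tile.write(world));
        }
        return out;
    }

    /** Iterates a copy taken up front, so later mutations cannot break the loop. */
    public static List<String> saveAllFromSnapshot(Map<String, FakeTile> world) {
        List<String> out = new ArrayList<>();
        for (FakeTile tile : new ArrayList<>(world.values())) {
            out.add(tile.write(world));
        }
        return out;
    }

    /** Builds a small world of three tiles that all mutate the map while being written. */
    public static Map<String, FakeTile> newWorld() {
        Map<String, FakeTile> world = new HashMap<>();
        for (String id : new String[] {"a", "b", "c"}) {
            world.put(id, new FakeTile(id, true));
        }
        return world;
    }

    public static void main(String[] args) {
        try {
            saveAll(newWorld());
            System.out.println("live iteration: no exception");
        } catch (ConcurrentModificationException e) {
            System.out.println("live iteration: ConcurrentModificationException");
        }
        int saved = saveAllFromSnapshot(newWorld()).size();
        System.out.println("snapshot iteration saved " + saved + " tiles");
    }
}
```

Note that the snapshot only avoids the exception. Anything added to the map while the save is running is still missing from that save, so a mod that touches the world during serialization could lose data either way.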

This testing is being done on the ATC server that is currently still running Forge 2464 BUT the ATM3 pack/server of ours is using 2485 and with every mod at their latest version as of a day or two ago. We don't want to change the ATC server contents yet because as-is this issue is very easily reproducible at this time.

I re-did my TE test and can confirm that I am able to get corruption with just Thermal Expansion mods + forge (https://pastebin.com/ufaYbiwt). However I cannot get this into a 100% reproducible state (it took me about an hour to reproduce this failure). It's really hard to get the Thermal Expansion TE's to fail compared to the AE ones.

For reference, I have received a number of unexplainable bug reports that could be caused by this in both Colossal Chests and Integrated Dynamics.

The size of the NBT tag does not seem to have an influence on TE data loss in these cases.

Also note that these cases are not limited to 1.12.x. There are also reports going back to 1.10 for AE2 and I heard of some cases in 1.9.x or maybe earlier with other mods.

We even had a few reports going back to 1.7.10. These might not be related, but could be. But currently I cannot find them.

So there is certainly a chance that the async chunkloading introduced in 1.7.10 could be related to it. It certainly would explain something like a race condition under load.
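As a generic illustration of how asynchronous loading can lose data under load, here is a sketch of a check-then-act race. It is an assumption-laden toy, not the Forge chunk loader and not the code path discussed in this issue: `FakeChunk`, `loadUnsafe`, `loadSafe`, and the `mana` field are hypothetical, and the latches exist only to force the bad interleaving deterministically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class AsyncLoadRaceSketch {
    /** Hypothetical stand-in for a loaded chunk that holds one tile entity value. */
    public static class FakeChunk {
        public volatile int mana;
        FakeChunk(int mana) { this.mana = mana; }
    }

    private static final int MANA_ON_DISK = 0; // assumed value stored in the save file
    private final Map<String, FakeChunk> loaded = new ConcurrentHashMap<>();

    /** Unsafe check-then-act: the gap between get() and put() is the race window. */
    FakeChunk loadUnsafe(String key, CountDownLatch sawMissing, CountDownLatch mayPublish) {
        FakeChunk chunk = loaded.get(key);
        if (chunk == null) {
            chunk = new FakeChunk(MANA_ON_DISK); // pretend to read the chunk from disk
            sawMissing.countDown();              // report that the chunk looked unloaded
            await(mayPublish);                   // demo hook: hold the race window open
            loaded.put(key, chunk);              // overwrites whatever was published meanwhile
        }
        return chunk;
    }

    /** Safe variant: computeIfAbsent performs the check and the publish atomically. */
    FakeChunk loadSafe(String key) {
        return loaded.computeIfAbsent(key, k -> new FakeChunk(MANA_ON_DISK));
    }

    /** Forces the bad interleaving and returns the mana visible in the map afterwards. */
    public static int raceAndReadMana() {
        AsyncLoadRaceSketch world = new AsyncLoadRaceSketch();
        CountDownLatch sawMissing = new CountDownLatch(1);
        CountDownLatch mayPublish = new CountDownLatch(1);
        Thread asyncLoader = new Thread(() -> world.loadUnsafe("0,0", sawMissing, mayPublish));
        asyncLoader.start();
        await(sawMissing);                 // the async loader is now inside the race window
        world.loadSafe("0,0").mana = 100;  // main thread loads the chunk and gameplay fills the pool
        mayPublish.countDown();            // the async loader now publishes its stale copy
        try {
            asyncLoader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return world.loaded.get("0,0").mana;
    }

    /** Same update using only the atomic load; a second load returns the same instance. */
    public static int safeAndReadMana() {
        AsyncLoadRaceSketch world = new AsyncLoadRaceSketch();
        world.loadSafe("0,0").mana = 100;
        world.loadSafe("0,0");
        return world.loaded.get("0,0").mana;
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("mana after forced race: " + raceAndReadMana());
        System.out.println("mana with atomic load:  " + safeAndReadMana());
    }
}
```

In the forced race the map ends up holding the stale copy, so the pool reads as empty even though it was filled; with the atomic load the filled value survives. On a real server the window would only open occasionally, which would fit a bug that shows up mainly on slow or memory-starved machines.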

I've been able to reproduce something like this in 1.10.2 with just Botania and Baubles. I created a superflat world and created a giant line of filled mana pools going north. I would then run up and down the line with a high level speed potion effect. After enough running, I would notice chunks where all the mana pools were now empty.

On my server running the Sprout modpack, I was frequently having this problem in certain chunks. Chunkloading the chunks has prevented the issue from reoccurring.

The interesting thing I found is that most machines (Ender IO and Extra Utilities) in the affected chunks would not lose their data; instead, they would no longer tick. However, Botania mana pools would lose their contents and mana spreaders would lose their direction.

Edit: In the chunks affected by this on my server, there were also chests in the chunk (unaffected) and hoppers (stopped ticking)

We have also seen hoppers stop ticking; it was just reported by a player on ATC (but there was no mention of vanilla chests).

Just for the sake of continuing discussion and exploring possible common links: ATC and ATM3 both have B:alwaysSetupTerrainOffThread=true enabled in forge.cfg by default. We also include FoamFix in both packs/servers, which has a number of chunk loading/unloading related tweaks, but that wouldn't be a common factor for these tests, where you can reproduce with just Forge and one or two mods.

Some ATM3 users noticed that if you set I:dormantChunkCacheSize=0 on servers (the default for our test server was 50) the problem goes away for the server. I'm not sure how that interacts with the SSP test map originally proposed, but it makes a substantial impact on SMP servers that I've helped report on so far.

This PR is a potential fix for this issue: https://github.com/MinecraftForge/MinecraftForge/pull/4162

Just FYI, all my testing has been w/ I:dormantChunkCacheSize=0
That being said:

Bonus Points: use forge version forge-1.12.1-14.22.0.2474-universal for even more missing blocks and issues (it's worse in that version)

I'm finding that the issues prevalent in 2474 have been mostly fixed in 2485 for me. That's not saying I haven't been able to reproduce oddities in 2485 (as shown in the world provided), but my mature world (which couldn't survive a server restart without doing a magic chunk load dance) has survived several days without losing data with no interaction on my part.

I noticed that having Chunk Animator installed shows a lot of loading of chunks/subchunks (when nerdpoling) that I didn't expect.

I am closing this because #4162 was merged. If you see this exact issue again please reply with details.
