Trinitycore: StartNetwork failed to bind socket acceptor

Created on 16 Jul 2017 · 41 Comments · Source: TrinityCore/TrinityCore

Description: Because https://github.com/TrinityCore/TrinityCore/issues/19999 was closed (logically xD), I decided to open a new issue, as @nawuko already suggested there. Tell me what to do and I can report everything you need without posting millions of comments in this issue. Some weeks ago you sent me some commands on IRC to check the state of a port (an ntp command or something like that, IIRC), but I have forgotten them; if you want, just send me the command and I will post the results.
As pointed out in that issue, the problem is related to https://github.com/TrinityCore/TrinityCore/commit/7874bee7bfb70e0e039f91173cff212e9572de09; I have already reverted that commit, and with it reverted everything worked correctly.
I don't have SOAP enabled (default worldserver.conf with only the logs, data folder paths and DB info changed).
Note that I'm on a test server, so I'm the only player and nobody else is connecting at the same time as me.

Current behaviour: Same as in the closed issue. In my case the wait is much shorter, maybe because I have a better dedicated server. If worldserver is restarted in-game with .server restart and is then started again by an autorestarter or manually (with ./worldserver) before the socket becomes available (in my case ~30 seconds), the following error shows up and you have to try again.

StartNetwork failed to bind socket acceptor
Failed to initialize network
Couldn't bind to 0.0.0.0:8085

Just a curious note that might be useful: in my case, if I close the worldserver with Ctrl+C, the port becomes available instantly, with no wait time.

Expected behaviour: The port should be available right after closing the worldserver.

Steps to reproduce the problem:

Start worldserver and wait until it finishes loading.
Restart the server in-game with .server restart.
Try to start the worldserver again as fast as you can.

Branch(es): 3.3.5

TC rev. hash/commit: https://github.com/TrinityCore/TrinityCore/commit/348b02155bcadb1cda78d1bcca222c37e170ab5a

TDB version: TDB 3.3.5 63

Operating system: Ubuntu 17.04 (also happened on Debian 7)

Branch-3.3.5a Branch-master Comp-Core Sub-Miscellaneous

Raydor

👍2

All 41 comments

$TC - SUP

I also have this problem, and it gets much worse when I turn on SOAP.

Start worldserver.
In worldserver.conf, enable SOAP on port 7878.
Restart or stop worldserver to reproduce the problem:
.server restart
It does not start again.

Error:

StartNetwork failed to bind socket acceptor
Failed to initialize network
Couldn't bind to 0.0.0.0:7878
terminate called without an active exception
Segmentation fault (core dumped)

The worldserver freezes.

The only way to solve the problem is to restart the Linux system!

The commit that causes the problem in TrinityCore is:
7874bee

Maybe some kind person can help get this problem solved.

ghost on 16 Jul 2017

it's not closed, it's locked to users. don't open a new ticket.

Aokromes on 17 Jul 2017

😕1

Actually, I want this issue to be open (better description) and not locked to users (but @igrc please stop posting)

Shauren on 17 Jul 2017

❤1

No need to restart Linux. What helped me was restarting any network daemon, such as ssh.

mazdafil on 17 Jul 2017

I don't need to restart the network, but I have to try a few times before the server starts.

IlPicasso on 17 Jul 2017

Not the network itself, but a network daemon, for example the ssh server.

mazdafil on 18 Jul 2017

I confirm that this problem appears when the server crashes or when you Ctrl+C your worldserver while there are players connected. You have to start the server a second time for it to boot properly.

Pixel-Pirate on 21 Jul 2017

Confirmed.

When the server is restarted or crashes, startup then fails X times, depending (I think) on how many players the server has.

2017-08-31_10:42:28 World initialized in 0 minutes 29 seconds
2017-08-31_10:42:28 StartNetwork failed to bind socket acceptor
2017-08-31_10:42:28 Failed to initialize network

This occurs after this commit: https://github.com/TrinityCore/TrinityCore/commit/7874bee7bfb70e0e039f91173cff212e9572de09

And if you have Metrics enabled, the server crashes when it starts up after a crash or restart (with Metrics disabled it does not crash, but the failure described above still occurs).
(I think the crash only happens if Grafana is installed on another machine, but I'm not sure.)
Crashlog: https://gist.github.com/Jildor/a0c1466109addd4311e1e3639f3bdda6

@Shauren Can you take a look here?

Jildor on 31 Aug 2017

Try setting your Wired MTU to something really high (like >8192) - I know there was a bug for some Atheros cards (especially AR8161). Works for me :)

Szone on 2 Sep 2017

@Szone Thats a hell high lvl language you used there, can you translate that to normal human being for idiots like me, please?

Raydor on 2 Sep 2017

@Raydor , he's talking about Maximum Transmission Unit. If your device supports large packets you can change your MTU value to 9000 bytes (by default it's 1500 bytes).

Eliminationzx on 2 Sep 2017

👍1

I can confirm this bug/problem on Ubuntu Server 16.04 LTS.
When it hangs, it doesn't come up again.
To fix it I need to take down my restarter and the worldserver, then wait some time.
Then I start the restarter and it works.
A temporary solution seems to be a function in the restarter that waits longer before it restarts the worldserver.
I haven't looked into making that happen.
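
The "wait before restarting" idea could be sketched roughly as follows. This is only an illustration in Python, not part of TrinityCore or of any existing restarter script; the helper names port_is_free and wait_for_port are hypothetical. It probes the port by trying to bind to it and polls until the bind succeeds or a timeout expires.

```python
import socket
import time


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something still holds the port.
            return False
    return True


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Poll until host:port can be bound; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if port_is_free(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A restarter could call wait_for_port("0.0.0.0", 8085) and only launch ./worldserver when it returns True. There is still a small race between the probe and the real bind, so this reduces failed starts rather than ruling them out.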

dikkedeur on 7 Sep 2017

Valgrind reports lots of these when the server stops (Ctrl+C on the console or .server restart in-game). They MIGHT be related to threads being started incorrectly, which would cause problems when closing them in the ways I mentioned. I don't know if this is even related to this issue; I'm just posting it so that someone with knowledge of this can confirm:

==3610== 4 bytes in 1 blocks are still reachable in loss record 1 of 54
==3610==    at 0x4DAAB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3610==    by 0x4FEE2E8: xmalloc (in /lib/x86_64-linux-gnu/libreadline.so.7.0)
==3610==    by 0x4FCAB83: rl_set_prompt (in /lib/x86_64-linux-gnu/libreadline.so.7.0)
==3610==    by 0x4FCBF81: readline (in /lib/x86_64-linux-gnu/libreadline.so.7.0)
==3610==    by 0x1BE8477: CliThread() (CliRunnable.cpp:153)
==3610==    by 0x1BE435A: void std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (functional:1391)
==3610==    by 0x1BE35DB: std::_Bind_simple<void (*())()>::operator()() (functional:1380)
==3610==    by 0x1BE2373: std::thread::_State_impl<std::_Bind_simple<void (*())()> >::_M_run() (thread:197)
==3610==    by 0x6F3383E: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22)
==3610==    by 0x64356D9: start_thread (pthread_create.c:456)
==3610==    by 0x7828D7E: clone (clone.S:105)

I don't know much about valgrind, just googled something on how to use it and just ran it with some flags (valgrind --log-file="valgrindlog" --leak-check=full --show-leak-kinds=all -v --num-callers=20 --tool=memcheck ./worldserver) so if what I posted is useless or means nothing at all, at least I tried :P.

Off topic: Valgrind also reports things in instance_sunwell_plateau, boss_gothik, TCSoapThread and SpellEffectInfo::CalcValue, but again, since I don't know whether those should be there or not, I'll wait before reporting them.

Raydor on 11 Sep 2017

Try setting your Wired MTU to something really high (like >8192) - I know there was a bug for some Atheros cards (especially AR8161). Works for me :)

Apparently it also works for me; no errors for the moment.
PS: I have an Intel network card.

DoctorKraft on 11 Sep 2017

Hitting dem driver bugs, heh.

Shauren on 11 Sep 2017

Looks like the same issue is hit here. The best solution proposed there is to set the SO_REUSEADDR socket option. I'm not quite sure where it has to be put (except SOAP socket handling), perhaps in the Socket(tcp::socket&& socket) ctor.
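
For readers unfamiliar with the option, here is a minimal sketch of what setting SO_REUSEADDR looks like. It uses Python's standard socket module, not TrinityCore's Boost.Asio code (where the corresponding option is boost::asio::socket_base::reuse_address), and make_listener is a hypothetical helper. The option has to be set before bind(), and its semantics differ between operating systems: on Linux it mainly lets a new listener bind while old connections on that port linger in TIME_WAIT, whereas on Windows it allows a second socket to bind a port that another socket is actively using.

```python
import socket


def make_listener(host: str, port: int, reuse_addr: bool) -> socket.socket:
    """Create a listening TCP socket, optionally with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # Must happen before bind() to influence whether the bind succeeds.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(16)
    except OSError:
        # Do not leak the descriptor if bind() or listen() fails.
        sock.close()
        raise
    return sock
```

Whether this option is appropriate for worldserver, or has any effect on the failure reported in this thread, is not something the sketch can show.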

Olion17 on 24 Sep 2017

@Olion17 SO_REUSEADDR is a bad choice here: that's exactly the flag that makes worldserver silently fail to accept incoming connections when the port is already in use by another application.

Shauren on 24 Sep 2017

That flag doesn't fix the problem anyway; I tried it (at least for me).

lachtanek on 24 Sep 2017

Today this happened to me twice, too.

2017-09-27_15:04:04 INFO  [server.worldserver] World initialized in 0 minutes 32 seconds
2017-09-27_15:04:04 ERROR [network] StartNetwork failed to bind socket acceptor
2017-09-27_15:04:04 ERROR [server.worldserver] Failed to initialize network

Now I have to start and restart every time.

n4ndo on 28 Sep 2017

Any ideas? I tried MTU values up to 9000 and even made the restarter wait 1 minute, but it doesn't work. I also run fuser -k PORT/tcp before the restart.

For the moment I have reverted commit 7874bee7bfb70e0e039f91173cff212e9572de09.

n4ndo on 28 Sep 2017

Confirmed

Undergarun on 17 Oct 2017

confirm

Laintime on 27 Oct 2017

any news?

avengerweb on 2 Nov 2017

Or is there any way to kill worldserver on "Failed to initialize network"?

Undergarun on 7 Nov 2017

In case of that error worldserver should shut down by itself @Undergarun

Shauren on 8 Nov 2017

It should, with return 1;. But that doesn't happen, because the daemon keeps waiting for some thread that has not been closed. So I added World::StopNow(ERROR_EXIT_CODE); before the return to fix that case. @Shauren
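
The effect described here can be illustrated with a toy example. This is a Python analogy only, not TrinityCore code: fail_startup and its parameters are hypothetical, and the stop event merely stands in for the stop request that World::StopNow(ERROR_EXIT_CODE) is said to make. A worker thread blocks until a stop is requested, so an error path that returns without requesting the stop leaves the shutdown waiting on that thread.

```python
import threading


def fail_startup(request_stop: bool, join_timeout: float = 0.5) -> bool:
    """Simulate a failed network init; return True if the worker exited in time."""
    stop = threading.Event()
    # The worker blocks until a stop is requested, like a thread polling a stop flag.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()

    if request_stop:
        # Stand-in for requesting a stop before returning from the error path.
        stop.set()

    worker.join(join_timeout)
    exited = not worker.is_alive()

    # Always release the worker so this demo itself never hangs.
    stop.set()
    worker.join()
    return exited
```

Calling fail_startup(False) shows the worker still alive after the timeout, while fail_startup(True) shows it exiting promptly.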

Undergarun on 8 Nov 2017

There is some kind of problem with sockets.

"Prevented sending of [WHAT_EVER_OPCODE] to non existent socket 1 to [Player: Foo GUID Full: XXX Type: Player Entry: 0 Low: X, Account: X]"

This happens with sockets 1 and 2 and with any opcode. The reason is unknown to me, since I'm not good at networking. Sockets are not properly closed when a player disconnects.

Undergarun on 8 Nov 2017

Confirmed with Ubuntu 17.10.
Increasing the MTU does not work.