Yugabyte-db: tserver does not start with 'Too many open files' error

Created on 12 Jul 2018 · 6 Comments · Source: yugabyte/yugabyte-db

First, let me say Yugabyte looks awesome.
On to my problem:
I'm on a Mac (macOS 10.13.2) and following https://docs.yugabyte.com/latest/quick-start/create-local-cluster/

I set up the ifconfig aliases, then ran
./bin/yb-ctl create
then
./bin/yb-ctl status
but it displays

2018-07-12 16:54:29,813 INFO: Server is running: type=master, node_id=1, PID=947, admin service=http://127.0.0.1:7000
2018-07-12 16:54:29,834 INFO: Server is running: type=master, node_id=2, PID=950, admin service=http://127.0.0.2:7000
2018-07-12 16:54:29,857 INFO: Server is running: type=master, node_id=3, PID=953, admin service=http://127.0.0.3:7000
2018-07-12 16:54:29,880 INFO: Server tserver-1 is not running
2018-07-12 16:54:29,898 INFO: Server tserver-2 is not running
2018-07-12 16:54:29,921 INFO: Server tserver-3 is not running

I looked at /private/tmp/yugabyte-local-cluster/node-1/disk-1/tserver.err


Could not create logging file: Too many open files
COULD NOT CREATE A LOGGINGFILE 20180712-165429.956!F0712 16:54:29.066678 44068864 reactor.cc:103] LibEV fatal error: (libev) error creating signal/async pipe: Too many open files [24]
Fatal failure details written to /tmp/yugabyte-local-cluster/node-1/disk-1/yb-data/tserver/logs/yb-tserver.FATAL.details.2018-07-12T16_54_29.pid956.txt
F20180712 16:54:29 ../../../../../src/yb/rpc/reactor.cc:103] LibEV fatal error: (libev) error creating signal/async pipe: Too many open files [24]
    @        0x10e873c8b  google::LogDestination::LogToSinks()
    @        0x10e872daf  google::LogMessage::SendToLog()
    @        0x10e873775  google::LogMessage::Flush()
    @        0x10e8735f3  google::LogMessage::~LogMessage()
    @        0x10e8744ee  google::ErrnoLogMessage::~ErrnoLogMessage()
    @        0x10dfbd1be  yb::rpc::(anonymous namespace)::LibevSysErr()
    @        0x10dc580a8  evpipe_init
    @        0x10dc59203  ev_async_start
    @        0x10dfb73c0  yb::rpc::Reactor::Init()
    @        0x10df991eb  yb::rpc::MessengerBuilder::Build()
    @        0x10ce51d43  yb::client::YBClientBuilder::Build()
    @        0x10ce3e4f6  yb::client::AsyncClientInitialiser::InitClient()
    @        0x10ce4047e  _ZNSt3__114__thread_proxyINS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEENS_6__bindIMN2yb6client22AsyncClientInitialiserEFvvEJPSA_EEEEEEEEPvSG_
    @     0x7fff78df26c1  _pthread_body
    @     0x7fff78df256d  _pthread_start
    @     0x7fff78df1c5d  thread_start

*** Check failure stack trace: ***
    @        0x10e87400a  google::LogMessage::Fail()
    @        0x10e873058  google::LogMessage::SendToLog()
    @        0x10e873775  google::LogMessage::Flush()
    @        0x10e8735f3  google::LogMessage::~LogMessage()
    @        0x10e8744ee  google::ErrnoLogMessage::~ErrnoLogMessage()
    @        0x10dfbd1be  yb::rpc::(anonymous namespace)::LibevSysErr()
    @        0x10dc580a8  evpipe_init
    @        0x10dc59203  ev_async_start
    @        0x10dfb73c0  yb::rpc::Reactor::Init()
    @        0x10df991eb  yb::rpc::MessengerBuilder::Build()
    @        0x10ce51d43  yb::client::YBClientBuilder::Build()
    @        0x10ce3e4f6  yb::client::AsyncClientInitialiser::InitClient()
    @        0x10ce4047e  _ZNSt3__114__thread_proxyINS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEENS_6__bindIMN2yb6client22AsyncClientInitialiserEFvvEJPSA_EEEEEEEEPvSG_
    @     0x7fff78df26c1  _pthread_body
    @     0x7fff78df256d  _pthread_start
    @     0x7fff78df1c5d  thread_start

Then I ran

sysctl kern.maxfiles
kern.maxfiles: 524288
sysctl kern.maxfilesperproc
kern.maxfilesperproc: 65535

So I'm stuck here.
Any help will be appreciated :)

All 6 comments

Hey @vincentvictoria, can you also post your ulimit -a output?

Also, my local settings for the same sysctl's:

sysctl -a | grep maxfiles
27:kern.maxfiles: 1048576
39:kern.maxfilesperproc: 1048576

We should enhance our prereq docs section to add a note on this!

We are seeing about 1800 files in lsof for the 6 processes for Yugabyte.
Can you check the following command before you start?
lsof |wc

I changed the sysctl settings to match @bmatican's.

sysctl -a | grep maxfiles

kern.maxfiles: 1048576
kern.maxfilesperproc: 1048576

ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1418
virtual memory          (kbytes, -v) unlimited

And now I'm getting a slightly different, but essentially the same, error:

libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: random_device failed to open /dev/urandom: Too many open files
*** Aborted at 1531441747 (unix time) try "date -d @1531441747" if you are using GNU date ***
PC: @     0x7fff7338de3e __pthread_kill
*** SIGABRT (@0x7fff7338de3e) received by PID 2720 (TID 0x70000fad4000) stack trace: ***
    @     0x7fff734bff5a _sigtramp
    @            0x20008 (unknown)
    @     0x7fff732ea312 abort
    @     0x7fff712c5f8f abort_message
    @     0x7fff712c6113 default_terminate_handler()
    @     0x7fff72650eab _objc_terminate()
    @     0x7fff712e17c9 std::__terminate()
    @     0x7fff712e126d __cxa_throw
    @     0x7fff712b47af std::__1::__throw_system_error()
    @     0x7fff712a749d std::__1::random_device::random_device()
    @        0x105f32792 yb::Seed<>()
    @        0x105f32711 yb::ThreadLocalRandom()
    @        0x1058ea8ea yb::rpc::RpcRetrier::DelayedRetry()
    @        0x104c74c26 yb::master::GetLeaderMasterRpc::Finished()
    @        0x104c74abd yb::master::GetLeaderMasterRpc::GetMasterRegistrationRpcCbForNode()
    @        0x104c771ae _ZNSt3__110__function6__funcINS_6__bindIMN2yb6master18GetLeaderMasterRpcEFviRKNS3_6StatusERKNS_10shared_ptrINS3_3rpc10RpcCommandEEEN5boost9container22stable_vector_iteratorIPSC_Lb0EEEEJPS5_RiRKNS_12placeholders4__phILi1EEERSC_RSJ_EEENS_9allocatorISV_EEFvS8_EEclES8_
    @        0x104c77708 yb::master::(anonymous namespace)::GetMasterRegistrationRpc::Finished()
    @        0x1058cdb1e yb::rpc::OutboundCall::CallCallback()
    @        0x1058e1dc6 yb::rpc::Reactor::AssignOutboundCall()
    @        0x1058df08e yb::rpc::Reactor::ProcessOutboundQueue()
    @        0x1058e17eb yb::rpc::Reactor::AsyncHandler()
    @        0x105a28d59 ev_invoke_pending
    @        0x105a299ea ev_run
    @        0x1058df5ac yb::rpc::Reactor::RunThread()
    @        0x105f4a86a yb::Thread::SuperviseThread()
    @     0x7fff734c96c1 _pthread_body
    @     0x7fff734c956d _pthread_start
    @     0x7fff734c8c5d thread_start

Before yb-ctl create:

lsof |wc
9595 91101 1398418

After yb-ctl create (with the error):
9992 94581 1454919
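
The two lsof | wc readings above can be compared with a little arithmetic. A minimal sketch, assuming the first field of the wc output is the line count; the helper name lsof_line_count is made up for illustration. Note that lsof lines are not exactly file descriptors (they also include entries such as working directories and mapped files), so this is only a rough indicator.

```python
def lsof_line_count(wc_output: str) -> int:
    """Return the line count (first field) from `lsof | wc` output."""
    return int(wc_output.split()[0])


# Readings copied from this thread: before and after `yb-ctl create`.
before = lsof_line_count("9595 91101 1398418")
after = lsof_line_count("9992 94581 1454919")

# Both counts include the lsof header line, so it cancels out in the difference.
delta = after - before
print(f"lsof entries added by yb-ctl create: {delta}")  # prints 397
```

For comparison, the earlier comment mentions about 1800 lsof entries for the 6 Yugabyte processes on a working cluster.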

Oh, as I suspected, seems like ulimit is taking precedence over sysctl, as your max open files in ulimit is 256...

@vincentvictoria Can you try sudo ulimit -n 1048576?

For reference, my settings:

ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2500
virtual memory          (kbytes, -v) unlimited
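
The per-process limit that ulimit -n reports (256 in the failing setup, 1048576 in the working one) can also be read from inside a process. A minimal sketch using Python's standard resource module; the function names and the required threshold of 1024 are assumptions for illustration, not values Yugabyte documents.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_is_enough(required: int) -> bool:
    """Check whether the soft open-file limit allows `required` descriptors."""
    soft, _hard = nofile_limits()
    # RLIM_INFINITY means "no limit"; its numeric value differs per platform.
    return soft == resource.RLIM_INFINITY or soft >= required


soft, hard = nofile_limits()
print(f"soft={soft} hard={hard}")
# 1024 is a hypothetical threshold chosen only for this example.
print("enough for 1024 descriptors:", soft_limit_is_enough(1024))
```

The soft value is the one that triggers "Too many open files", which is why raising only the kernel-wide sysctl values did not help here.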

sudo ulimit -n 1048576 didn't work, so I did a search and found a solution: https://superuser.com/a/1171026

It now works for me.

In summary, I did the following to make it work on High Sierra.

Created /etc/sysctl.conf:

kern.maxfiles=1048576
kern.maxfilesperproc=1048576

Then I did what's described in the link above, restarted, and it works!

Thanks for the help!
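
After a restart, the values in the sysctl.conf file can be compared with what sysctl reports. A minimal sketch, assuming integer values and the two line formats seen in this thread ("key=value" in the file, "key: value" in sysctl output); parse_sysctl_lines is a hypothetical helper and the strings below are copied from the posts above rather than read from a live system.

```python
def parse_sysctl_lines(text: str) -> dict:
    """Parse 'key=value' (sysctl.conf) or 'key: value' (sysctl output) lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        for sep in ("=", ":"):
            if sep in line:
                key, _, value = line.partition(sep)
                # Assumes integer values, which holds for the maxfiles keys.
                values[key.strip()] = int(value.strip())
                break
    return values


conf = parse_sysctl_lines("kern.maxfiles=1048576\nkern.maxfilesperproc=1048576\n")
live = parse_sysctl_lines("kern.maxfiles: 1048576\nkern.maxfilesperproc: 1048576\n")
print(conf == live)  # True when the running kernel matches the file
```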

Hi @vincentvictoria - thanks for sharing the tip for High Sierra (Mac).

@rven1: Could you please take an action item to document the recommended ulimit settings in the "Prerequisites" section of the docs for macOS here: https://docs.yugabyte.com/latest/quick-start/install/#macos? Thanks.
