The Galaxy uWSGI process occasionally segfaults on Mac OS X; this has happened on multiple Mac laptops. The problem may go away after restarting OS X or logging out and back in. It may also go away after many retries of starting Galaxy.
I also see this message quite often!
!!! uWSGI process 99897 got Segmentation Fault !!!
*** backtrace of 99897 ***
0 uwsgi.so 0x00000001024f21f0 uwsgi_backtrace + 48
1 uwsgi.so 0x00000001024f2733 uwsgi_segfault + 51
2 libsystem_platform.dylib 0x00007fff68c54f5a _sigtramp + 26
3 libobjc.A.dylib 0x00007fff67dd75b1 _objc_fetch_pthread_data + 34
4 uwsgi.so 0x00000001024a5bf0 parse_sys_envs + 80
5 uwsgi.so 0x00000001024f3f80 uwsgi_setup + 5776
6 uwsgi.so 0x00000001024fde9b pyuwsgi_setup + 795
7 uwsgi.so 0x00000001024fdef3 pyuwsgi_run + 19
8 Python 0x00007fff4b69df89 PyEval_EvalFrameEx + 2917
9 Python 0x00007fff4b69d232 PyEval_EvalCodeEx + 1551
10 Python 0x00007fff4b69cc1d PyEval_EvalCode + 32
11 Python 0x00007fff4b6bbad1 PyParser_ASTFromFile + 287
12 Python 0x00007fff4b6bbb78 PyRun_FileExFlags + 130
13 Python 0x00007fff4b6bb6fa PyRun_SimpleFileExFlags + 706
14 Python 0x00007fff4b6cc96c Py_Main + 3064
15 libdyld.dylib 0x00007fff689d3115 start + 1
16 ??? 0x0000000000000017 0x0 + 23
*** end of backtrace ***
This looks like the same issue that should be fixed by unbit/uwsgi#1680 (which is not merged upstream but is part of the wheel). Are you both using the wheel from wheels.galaxyproject.org?
I am using that wheels source.
per conversation with @VJalili
I uninstalled (pip uninstall uWSGI) and reinstalled (pip install --extra-index-url http://wheels.galaxyproject.org/ uWSGI), but I'm still experiencing the very same issue.
I reinstalled (./.venv/bin/pip install --index-url https://wheels.galaxyproject.org/simple/ --no-cache-dir uwsgi==2.0.15). The issue is still there.
@VJalili @qiagu Can you confirm that in both cases you have the same traceback? Specifically, in the issue I linked, frame 3 is:
3 libxpc.dylib 0x00007fffc1cd2169 xpc_array_apply + 64
However, in @VJalili's traceback it's
3 libobjc.A.dylib 0x00007fff67dd75b1 _objc_fetch_pthread_data + 34
This may or may not be the same problem as in the PR I linked.
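For anyone comparing tracebacks, frame 3 can be pulled out of a saved log with a small filter. This is only a sketch: `backtrace_frame` is a made-up helper name, and it assumes the backtrace has the same column layout as the ones pasted in this thread (frame number, library, address, symbol).

```shell
# Hypothetical helper (not part of uWSGI or Galaxy): read a uWSGI backtrace
# on stdin and print "<library> <symbol>" for the frame number given as $1.
# Assumes the layout seen in this thread: frame, library, address, symbol.
backtrace_frame() {
    awk -v n="$1" '$1 == n && $3 ~ /^0x/ { print $2, $4; exit }'
}
```

With the log saved to a file (the file name is up to you), `backtrace_frame 3 < saved.log` prints the library and symbol that natefoo is asking about.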
Other questions:
Is $__CF_USER_TEXT_ENCODING set?
__CF_USER_TEXT_ENCODING=$(printf "0x%x:0:0" $(id -u)) (I am not sure that this value is correct or even relevant; afaik it only needs to be set to something to work around the issue fixed in the PR.)
.venv/lib/python2.7/ should reveal this)?
!!! uWSGI process 45877 got Segmentation Fault !!!
*** backtrace of 45877 ***
0 uwsgi.so 0x000000010e1101f0 uwsgi_backtrace + 48
1 uwsgi.so 0x000000010e110733 uwsgi_segfault + 51
2 libsystem_platform.dylib 0x00007fffdb67bb3a _sigtramp + 26
3 libxpc.dylib 0x00007fffdb6bf138 xpc_array_apply + 21
4 uwsgi.so 0x000000010e0c3bf0 parse_sys_envs + 80
5 uwsgi.so 0x000000010e111f80 uwsgi_setup + 5776
6 uwsgi.so 0x000000010e11be9b pyuwsgi_setup + 795
7 uwsgi.so 0x000000010e11bef3 pyuwsgi_run + 19
8 Python 0x000000010de434d4 PyEval_EvalFrameEx + 14624
9 Python 0x000000010de3f9be PyEval_EvalCodeEx + 1617
10 Python 0x000000010de3f367 PyEval_EvalCode + 48
11 Python 0x000000010de5f5dd PyParser_ASTFromFile + 297
12 Python 0x000000010de5f680 PyRun_FileExFlags + 133
13 Python 0x000000010de5f1d1 PyRun_SimpleFileExFlags + 702
14 Python 0x000000010de70b6a Py_Main + 3094
15 libdyld.dylib 0x00007fffdb46c235 start + 1
16 ??? 0x0000000000000019 0x0 + 25
*** end of backtrace ***
Answer to questions:
./run.sh)
echo $__CF_USER_TEXT_ENCODING
0x31C78F3E:0x0:0x0
uname -a
Darwin RJHB339 16.7.0 Darwin Kernel Version 16.7.0: Thu Jan 11 22:59:40 PST 2018; root:xnu-3789.73.8~1/RELEASE_X86_64 x86_64
sw_vers
ProductName: Mac OS X
ProductVersion: 10.12.6
BuildVersion: 16G1212
5. UserDict.py@ -> /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/UserDict.py
. .venv/bin/activate
python -V
Python 2.7.10
Thanks for looking into this @natefoo.
The following seems to be _the_ solution to my problem; i.e., I can restart Galaxy now.
__CF_USER_TEXT_ENCODING=$(printf "0x%x:0:0" $(id -u))
Would it be possible to have this set automatically?
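One way this could be automated, as a sketch only: a small shell function that sets the variable when it is missing and is sourced or pasted before starting Galaxy. The function name `ensure_cf_user_text_encoding` is made up for illustration; nothing like it exists in Galaxy's run.sh as far as this thread shows, and the value is the one suggested above, which may or may not be the "correct" one.

```shell
# Hypothetical helper, not part of Galaxy: set __CF_USER_TEXT_ENCODING only
# when it is unset or empty, using the value suggested earlier in this thread.
ensure_cf_user_text_encoding() {
    if [ -z "${__CF_USER_TEXT_ENCODING:-}" ]; then
        # 0x<uid in hex>:0:0 ; the exact value may not matter
        __CF_USER_TEXT_ENCODING=$(printf '0x%x:0:0' "$(id -u)")
        export __CF_USER_TEXT_ENCODING
    fi
}

ensure_cf_user_text_encoding
# ...then start Galaxy as usual, e.g. with ./run.sh
```

An existing value is left untouched, so this should be harmless in sessions where OS X has already set the variable. Note that it did not help everyone in this thread.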
@VJalili
That line of code didn't work miracles for me.
I also got a different traceback, without libxpc.dylib in frame 3. Right after that, Galaxy started successfully.
!!! uWSGI process 54169 got Segmentation Fault !!!
*** backtrace of 54169 ***
0 uwsgi.so 0x0000000106d0e1f0 uwsgi_backtrace + 48
1 uwsgi.so 0x0000000106d0e733 uwsgi_segfault + 51
2 libsystem_platform.dylib 0x00007fffdb67bb3a _sigtramp + 26
3 ??? 0x00007f90eb40690c 0x0 + 140260398885132
4 uwsgi.so 0x0000000106cc1bf0 parse_sys_envs + 80
5 uwsgi.so 0x0000000106d0ff80 uwsgi_setup + 5776
6 uwsgi.so 0x0000000106d19e9b pyuwsgi_setup + 795
7 uwsgi.so 0x0000000106d19ef3 pyuwsgi_run + 19
8 Python 0x0000000106a414d4 PyEval_EvalFrameEx + 14624
9 Python 0x0000000106a3d9be PyEval_EvalCodeEx + 1617
10 Python 0x0000000106a3d367 PyEval_EvalCode + 48
11 Python 0x0000000106a5d5dd PyParser_ASTFromFile + 297
12 Python 0x0000000106a5d680 PyRun_FileExFlags + 133
13 Python 0x0000000106a5d1d1 PyRun_SimpleFileExFlags + 702
14 Python 0x0000000106a6eb6a Py_Main + 3094
15 libdyld.dylib 0x00007fffdb46c235 start + 1
*** end of backtrace ***
Interesting, it appears to be the same issue but my fix hasn't worked for you. I'll do some testing and see if I can reproduce this. @VJalili are you on 10.12?
10.13.3 to be exact.
Upgraded to 10.13.3 today. Haven't met a single occurrence so far. I'll report if the problem happens again. Thanks to all of you guys!
Hrm, ok. @VJalili, are you logged in directly or via SSH? Is $__CF_USER_TEXT_ENCODING not already set? It should be set if you're logged in directly. It should only be unset if you log in via SSH.
Either way, the wheel is supposed to work around the issue. Can you make sure you're getting the fixed wheel by reinstalling it with:
$ ./.venv/bin/pip uninstall -y uwsgi
$ ./.venv/bin/pip install --index-url https://wheels.galaxyproject.org/simple/ --no-cache-dir --only-binary :all: uwsgi==2.0.15
Otherwise it could be pulling from your local cache.
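To answer the two questions above in one go (direct login or SSH, variable set or not), something like the following could be run in the same terminal that starts Galaxy. It is a sketch: `session_report` is a made-up name, and it assumes an OpenSSH server, which sets `SSH_CONNECTION` for remote sessions.

```shell
# Hypothetical diagnostic: report whether the shell looks like an SSH session
# and what __CF_USER_TEXT_ENCODING currently holds.
# Assumption: sshd (OpenSSH) exports SSH_CONNECTION for remote logins.
session_report() {
    if [ -n "${SSH_CONNECTION:-}" ]; then
        session=ssh
    else
        session=local
    fi
    # An empty value is reported the same way as an unset one.
    echo "session=$session __CF_USER_TEXT_ENCODING=${__CF_USER_TEXT_ENCODING:-<unset>}"
}
```

Calling `session_report` prints a single line that can be pasted into the issue alongside the traceback.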
@natefoo I am logged in directly, and the variable $__CF_USER_TEXT_ENCODING was not set, so I had to set it manually.
Before setting this variable, I tried using that wheel, but it did not fix the issue. The issue on my side was fixed only after I manually set the variable $__CF_USER_TEXT_ENCODING.
Ok, strange. Thanks. I'll see if I can reproduce this.
Seeing the same issue as @VJalili, traceback is:
executing: .venv/bin/uwsgi --yaml config/galaxy.yml --virtualenv /Users/mvandenb/src/galaxy/.venv --pythonpath lib --static-map /static/style=/Users/mvandenb/src/galaxy/static/style/blue --static-map /static=/Users/mvandenb/src/galaxy/static --die-on-term --hook-master-start 'unix_signal:2 gracefully_kill_them_all' --hook-master-start 'unix_signal:15 gracefully_kill_them_all' --enable-threads --py-call-osafterfork
!!! uWSGI process 84309 got Segmentation Fault !!!
*** backtrace of 84309 ***
0 uwsgi.so 0x000000010219d1f0 uwsgi_backtrace + 48
1 uwsgi.so 0x000000010219d733 uwsgi_segfault + 51
2 libsystem_platform.dylib 0x00007fff7033ff5a _sigtramp + 26
3 libobjc.A.dylib 0x00007fff6f4c25b1 _objc_fetch_pthread_data + 34
4 uwsgi.so 0x0000000102150bf0 parse_sys_envs + 80
5 uwsgi.so 0x000000010219ef80 uwsgi_setup + 5776
6 uwsgi.so 0x00000001021a8e9b pyuwsgi_setup + 795
7 uwsgi.so 0x00000001021a8ef3 pyuwsgi_run + 19
8 Python 0x0000000101eca61a PyEval_EvalFrameEx + 26934
9 Python 0x0000000101ec3aee PyEval_EvalCodeEx + 1617
10 Python 0x0000000101ec3497 PyEval_EvalCode + 48
11 Python 0x0000000101ee6e18 run_mod + 53
12 Python 0x0000000101ee6ebb PyRun_FileExFlags + 133
13 Python 0x0000000101ee6a0c PyRun_SimpleFileExFlags + 702
14 Python 0x0000000101ef83c2 Py_Main + 3094
15 libdyld.dylib 0x00007fff700be115 start + 1
*** end of backtrace ***
I'm not logged in remotely either (iTerm and Terminal behave the same). Doing unset __CF_USER_TEXT_ENCODING works around the issue; reinstalling via the pip command you mentioned has no effect.
@mvdbeek Also 10.12 or 10.13?
10.13.2 ... and the virtualenv's python is installed via homebrew fwiw.
Same problem with system python, so that shouldn't be it.
Good news, everyone! I recreated this on my own laptop. So I can stop pestering you with questions.
That is the first step toward fixing the issue @natefoo :) Good luck
Unfortunately, this issue popped up again today: I was not able to restart my Galaxy instance ;-(. Here is the traceback:
!!! uWSGI process 14612 got Segmentation Fault !!!
*** backtrace of 14612 ***
0 uwsgi.so 0x0000000108fc51f0 uwsgi_backtrace + 48
1 uwsgi.so 0x0000000108fc5733 uwsgi_segfault + 51
2 libsystem_platform.dylib 0x00007fff6a458f5a _sigtramp + 26
3 libobjc.A.dylib 0x00007fff695db5b1 _objc_fetch_pthread_data + 34
4 uwsgi.so 0x0000000108f78bf0 parse_sys_envs + 80
5 uwsgi.so 0x0000000108fc6f80 uwsgi_setup + 5776
6 uwsgi.so 0x0000000108fd0e9b pyuwsgi_setup + 795
7 uwsgi.so 0x0000000108fd0ef3 pyuwsgi_run + 19
8 Python 0x00007fff4cea1f89 PyEval_EvalFrameEx + 2917
9 Python 0x00007fff4cea1232 PyEval_EvalCodeEx + 1551
10 Python 0x00007fff4cea0c1d PyEval_EvalCode + 32
11 Python 0x00007fff4cebfad1 PyParser_ASTFromFile + 287
12 Python 0x00007fff4cebfb78 PyRun_FileExFlags + 130
13 Python 0x00007fff4cebf6fa PyRun_SimpleFileExFlags + 706
14 Python 0x00007fff4ced096c Py_Main + 3064
15 libdyld.dylib 0x00007fff6a1d7115 start + 1
*** end of backtrace ***
and none of the previously discussed solutions help this time.
@natefoo hope this could be helpful:
When my Galaxy instance fails to start, I see the aforementioned message; however, when it starts, I see the following message:
!!! no internal routing support, rebuild with pcre support !!!
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 1418
your memory page size is 4096 bytes
detected max file descriptor number: 256
building mime-types dictionary from file /etc/apache2/mime.types...1002 entry found
lock engine: OSX spinlocks
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on localhost:8080 fd 4
spawned uWSGI http 1 (pid: 17661)
uwsgi socket 0 bound to TCP address 127.0.0.1:54688 (port auto-assigned) fd 3
Python version: 2.7.10 (default, Jul 15 2017, 17:16:57) [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)]
--- Python VM already initialized ---
Python main interpreter initialized at 0x7fda82e002a0
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 103616 bytes (101 KB) for 4 cores
*** Operational MODE: threaded ***
I've figured out the cause and have a potential solution but am hoping to get some feedback from the uWSGI developers.
For anyone encountering problems, you should be able to use the following temporary workaround:
$ ./.venv/bin/pip uninstall -y uwsgi && ./.venv/bin/pip install --no-cache-dir uwsgi==2.0.15
This will compile from the source tarball on PyPI.
Thanks to @natefoo's wonderful work on solving this issue. I haven't had any problem with ./run.sh since upgrading to 10.13.3. However, it seems that the issue doesn't go away when running ./run_reports.sh.
!!! uWSGI process 61133 got Segmentation Fault !!!
*** backtrace of 61133 ***
0 uwsgi.so 0x000000010dbe71f0 uwsgi_backtrace + 48
1 uwsgi.so 0x000000010dbe7733 uwsgi_segfault + 51
2 libsystem_platform.dylib 0x00007fff50324f5a _sigtramp + 26
3 ??? 0x00007fd7e9d00644 0x0 + 140565317420612
4 uwsgi.so 0x000000010db9abf0 parse_sys_envs + 80
5 uwsgi.so 0x000000010dbe8f80 uwsgi_setup + 5776
6 uwsgi.so 0x000000010dbf2e9b pyuwsgi_setup + 795
7 uwsgi.so 0x000000010dbf2ef3 pyuwsgi_run + 19
8 Python 0x00007fff32d6df89 PyEval_EvalFrameEx + 2917
9 Python 0x00007fff32d6d232 PyEval_EvalCodeEx + 1551
10 Python 0x00007fff32d6cc1d PyEval_EvalCode + 32
11 Python 0x00007fff32d8bad1 PyParser_ASTFromFile + 287
12 Python 0x00007fff32d8bb78 PyRun_FileExFlags + 130
13 Python 0x00007fff32d8b6fa PyRun_SimpleFileExFlags + 706
14 Python 0x00007fff32d9c96c Py_Main + 3064
15 libdyld.dylib 0x00007fff500a3115 start + 1
16 ??? 0x0000000000000016 0x0 + 22
*** end of backtrace ***
I'm planning to replace these wheels shortly - see galaxyproject/starforge#199 - which should fix these issues permanently.
@natefoo Great! Thanks.
@natefoo for the following command to work:
./.venv/bin/pip uninstall -y uwsgi && ./.venv/bin/pip install --no-cache-dir uwsgi==2.0.15
I first need to change the pinned version of uWSGI to 2.0.15, which is defined as follows:
Otherwise Galaxy will first uninstall v2.0.15 and install v2.0.17 before it starts.
@VJalili just change the version in the command to 2.0.17, it should match the requirement
./.venv/bin/pip uninstall -y uwsgi && ./.venv/bin/pip install --no-cache-dir uwsgi==2.0.17
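To double-check that the installed version matches the pinned one before starting Galaxy, the output of `pip show` can be compared against the expected version. A sketch, with `pip_show_version_is` being a made-up helper name; it only checks the version string and cannot tell a wheel from a source build.

```shell
# Hypothetical helper: read `pip show` output on stdin and succeed only if
# its "Version:" line equals the expected version given as $1.
pip_show_version_is() {
    awk -F': ' -v want="$1" '
        # Append "" to both sides to force a string comparison.
        $1 == "Version" { found = (($2 "") == (want "")) }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, `./.venv/bin/pip show uwsgi | pip_show_version_is 2.0.17 && echo "matches the pin"`.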
@martenson That works, thanks.
Thanks to @natefoo and others. This issue has already been fixed. I'll close it soon.
@qiagu I am not sure it was, see the uwsgi issue linked above :/
@martenson I haven't had any issue since upgrading to 2.0.17 about 3 months ago. The upgrade seems to work for @VJalili as well.
@qiagu that is because you compiled your own uwsgi with the workaround command @natefoo provided I assume
I see. It's just a workaround. @martenson, you think I should keep this issue open?
@qiagu yes, please
@martenson sounds good.
Wouldn't it be possible to build a wheel that has the "correct" uWSGI? (A temporary solution until @natefoo's work on uWSGI is merged and included in their next release.)
@VJalili That is the cause of this issue.
My mistake; I thought it was because a _buggy_ wheel is used by default at first start-up.
This should be fixed (albeit not upstream) with the wheel in galaxyproject/starforge#213, which I've placed on wheels.galaxyproject.org. If you have an affected venv, uninstall uWSGI (./.venv/bin/pip uninstall -y uwsgi), remove your pip cache (rm -rf ~/.cache/pip), and start Galaxy as normal.