I wrote a neural network from scratch in NumPy to classify text. I am trying to use Ray to parallelize inference with this network.
My code first generates a computation graph for each of the things I need to classify, and then calls ray.get() on the objects.
Something like:
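    sub_results = []
    for index, row in data.iterrows():
        # Submit one remote classification task per row
        sub_results.append(
            classify.remote(row["Subject"])
        )
    # Block until every task has finished and collect the outputs
    results = [ray.get(i) for i in sub_results]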
Everything works well on my local machine (a 2017 MacBook Pro running macOS High Sierra 10.13.6), with or without local_mode=False, but when I run the same code on an EC2 instance, it fails with OpenBLAS errors (full log below).
I searched for the OpenBLAS keyword and found a comment here that suggested setting OPENBLAS_NUM_THREADS=1, but the problem persists.
I understand this could be more of an issue with OpenBLAS than with Ray, but would appreciate any inputs. Thank you 🙇.
(venv) [ec2-user@ip-172-31-1-151 ray-demo]$ OPENBLAS_NUM_THREADS=1
(venv) [ec2-user@ip-172-31-1-151 ray-demo]$ python infer.py > log_singhai.txt
Process STDOUT and STDERR is being redirected to /tmp/raylogs/.
Waiting for redis server at 127.0.0.1:39056 to respond...
Waiting for redis server at 127.0.0.1:63831 to respond...
Warning: Reducing object store memory because /dev/shm has only 13237538816 bytes available. You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you may need to pass an argument with the flag '--shm-size' to 'docker run'.
Starting local scheduler with the following resources: {'GPU': 0, 'CPU': 36}.
======================================================================
View the web UI at http://localhost:8894/notebooks/ray_ui92558.ipynb?token=3932d5bc240c8063108aa57ad86f2481ab88204fbd7538fe
======================================================================
Precomputed 2000 Ray objects.
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 281243 max
Traceback (most recent call last):
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/ray/workers/default_worker.py", line 9, in <module>
import ray
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/ray/__init__.py", line 28, in <module>
import pyarrow # noqa: F401
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/ray/pyarrow_files/pyarrow/__init__.py", line 45, in <module>
import pyarrow.compat as compat
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/ray/pyarrow_files/pyarrow/compat.py", line 23, in <module>
import numpy as np
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/numpy/__init__.py", line 142, in <module>
from . import add_newdocs
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/numpy/lib/__init__.py", line 8, in <module>
from .type_check import *
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/numpy/lib/type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "/home/ec2-user/singhai/venv/lib/python2.7/site-packages/numpy/core/__init__.py", line 16, in <module>
from . import multiarray
KeyboardInterrupt
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
The worker with ID 4cc83b46f701fd7b11b87b66e78fe35ab12894f2 died or was killed while executing the task with ID 2a1649c4d3fbe8eae429c76813d37a4d18e163ac
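For readers trying to interpret the log above: the `RLIMIT_NPROC 4096 current, 281243 max` lines report the soft and hard per-user limits on processes, and on Linux threads count toward that limit. Below is a minimal sketch for inspecting the same values from Python with the standard-library `resource` module (Unix only). The helper names `nproc_limits` and `describe` are made up for illustration and are not part of Ray or OpenBLAS.

```python
import resource


def nproc_limits():
    """Return the (soft, hard) RLIMIT_NPROC values for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return soft, hard


def describe(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print("RLIMIT_NPROC soft:", describe(soft), "hard:", describe(hard))
```

If the soft value is low relative to the number of worker processes multiplied by the threads each one starts, `pthread_create` can fail in the way shown in the log.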
What happens when you try running export OMP_NUM_THREADS=1; OPENBLAS_NUM_THREADS=1; also?
That makes it work.
Thanks @richardliaw 🙇.
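A possible reason the first attempt in the log had no effect: `OPENBLAS_NUM_THREADS=1` typed on its own line sets a shell variable without exporting it, so a child process such as `python infer.py` would not see it, whereas a variable set with `export` is passed on. The sketch below illustrates the difference. It uses a throwaway variable name (`DEMO_NUM_THREADS`, made up for this example), assumes a POSIX `sh` is on the PATH, and needs Python 3.5 or later for `subprocess.run`.

```python
import os
import subprocess


def child_sees(shell_snippet):
    """Run a snippet in sh, then report what a child process sees for DEMO_NUM_THREADS."""
    env = dict(os.environ)
    env.pop("DEMO_NUM_THREADS", None)  # make sure the variable is not inherited already
    # The inner `sh -c` plays the role of `python infer.py`: a child of the shell.
    script = shell_snippet + "; sh -c 'echo ${DEMO_NUM_THREADS:-unset}'"
    out = subprocess.run(
        ["sh", "-c", script],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return out.stdout.strip()


print(child_sees("DEMO_NUM_THREADS=1"))         # plain assignment, not exported
print(child_sees("export DEMO_NUM_THREADS=1"))  # exported to child processes
```

The first call reports `unset` and the second reports `1`. Note that in the suggested command only `OMP_NUM_THREADS` is preceded by `export`.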
"What happens when you try running export OMP_NUM_THREADS=1; OPENBLAS_NUM_THREADS=1; also?"
IMO, if Ray Tune is intended to be used in general-purpose data science environments (as opposed to dedicated server-like containers), it should not require throttling threads for all other packages installed in a container. The settings listed above by @richard4912 are used by multiple Python packages to control their threading limits, including numpy and its reverse dependencies (e.g. seaborn).
Suggestion for Ray Tune docs: add a single-threaded environment (setting all *_NUM_THREADS env vars to "1") as a stability requirement, together with the obligatory installation of ray[tune] (not just ray). Both of these were required to avoid Python kernel crashes (sometimes with memory dumps) for all worker processes spawned by Ray Tune.
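One way to apply the single-threaded setting described above from inside a script, rather than from the shell, is to set the variables before NumPy (or anything that imports it) is loaded, since BLAS libraries typically read them when they are loaded. This is a minimal sketch under assumptions: the list of variable names is a guess at commonly used backends rather than an exhaustive set, and `limit_blas_threads` is a hypothetical helper, not part of Ray or NumPy.

```python
import os

# Assumed set of commonly honoured thread-count variables; extend as needed.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(n=1, environ=os.environ):
    """Set each variable in THREAD_ENV_VARS to n and return the values applied."""
    applied = {}
    for name in THREAD_ENV_VARS:
        environ[name] = str(n)
        applied[name] = environ[name]
    return applied


limit_blas_threads(1)
# Import numpy / ray only after this point so they pick up the settings.
```

Child processes started afterwards inherit the process environment by default, so worker processes launched from the same script would normally see these values too, although that is worth verifying for a given Ray setup.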
@mirekphd can you actually post a separate issue about the kernel crashes and your working environment?
I'd like to know more and see what exactly we should write in the documentation for other users facing this issue.
Also, I didn't quite parse what you said - your comment:
"Tune should not require throttling of threads for all other packages"
seemed to contradict your later comment,
"in the docs: add single-threaded environment (setting of all *_NUM_THREADS env vars to "1") as a stability requirement".
Could you help clarify that? Thanks!