Hi there,
I have two GPUs and I could train in parallel using the "gputouse" option in train_network. I'd like to also analyze videos in parallel (as I have many) so would it be possible to include the same "gputouse" option in the analyze_videos function? If I start two different ones in parallel now I get a tensorflow error:
InternalError: Failed to create session.
Cheers,
Michael
Ps: I guess it's just including this line:
os.environ['CUDA_VISIBLE_DEVICES'] = str(gputouse)
good idea (edited*: which Alex evidently already had implemented - hehe); typically I just run separate docker containers and they are individually linked to a specific GPU, but for non-Docker users, this is indeed useful.
Yes, and you can set it e.g. by:
deeplabcut.train_network(config,shuffle=1,trainingsetindex=0,gputouse=3)
then just run it for your other GPU in another terminal...
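If you would rather start both jobs from one script instead of two terminals, here is a minimal sketch of the same idea. It is not part of DeepLabCut: gpu_env and launch_on_gpu are hypothetical helper names, and the script names in the usage note are placeholders. It only relies on the fact that a child process sees whatever CUDA_VISIBLE_DEVICES it was started with.

```python
import os
import subprocess
import sys


def gpu_env(gpu_index):
    """Return a copy of the current environment restricted to one GPU.

    The parent's os.environ is left untouched, so each child process
    can be pinned to a different device.
    """
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_on_gpu(command, gpu_index, **popen_kwargs):
    """Start `command` (a list of arguments) with only `gpu_index` visible.

    Hypothetical helper for illustration; returns the Popen handle so the
    caller can wait() on it later.
    """
    return subprocess.Popen(command, env=gpu_env(gpu_index), **popen_kwargs)
```

Usage would look like launch_on_gpu([sys.executable, "train_a.py"], 0) and launch_on_gpu([sys.executable, "train_b.py"], 1), followed by wait() on both handles. Note that inside each child the single visible GPU is renumbered by CUDA as device 0.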
See:
https://github.com/AlexEMG/DeepLabCut/blob/efa95129061b1ba1535f7361fe76e9267568a156/deeplabcut/pose_estimation_tensorflow/training.py#L12
Yes, for training. The same option would be good to have for the "analyze_videos" function.
Wait, doesn't that exist as well: https://github.com/AlexEMG/DeepLabCut/blob/master/deeplabcut/pose_estimation_tensorflow/predict_videos.py#L34
Mhm, indeed. Then all is good and I did something silly :)
I'm having a similar issue trying to train a network and analyze videos on 2 separate GPUs. The analyze_videos function seems to call something that uses both GPUs. Any ideas? Thanks!
This is the nvidia-smi output with only the analyze_videos function running:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.43       Driver Version: 418.43       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:08:00.0 Off |                  N/A |
| 48%   65C    P2   258W / 260W |  10856MiB / 10989MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:42:00.0  On |                  N/A |
| 41%   41C    P2    64W / 260W |    438MiB / 10981MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      4108      C   /home/kenzie/anaconda3/envs/DLC/bin/python 10845MiB |
|    1      1709      G   /usr/lib/xorg/Xorg                            18MiB |
|    1      1816      G   /usr/bin/gnome-shell                          58MiB |
|    1      2077      G   /usr/lib/xorg/Xorg                           108MiB |
|    1      2206      G   /usr/bin/gnome-shell                          85MiB |
|    1      4108      C   /home/kenzie/anaconda3/envs/DLC/bin/python   155MiB |
+-----------------------------------------------------------------------------+
Just pass which GPU you want to use, e.g.
deeplabcut.train_network(config,gputouse=1)
or
deeplabcut.analyze_videos(config,videos,videotype='avi',shuffle=1,trainingsetindex=0,gputouse=0)
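For the "many videos, two GPUs" case, one way to use this is to split the video list into one batch per GPU and run each batch in its own Python process. A rough sketch, assuming the keyword arguments shown above; split_round_robin and analyze_batch are hypothetical helper names, not DeepLabCut functions:

```python
def split_round_robin(videos, n_gpus):
    """Distribute videos over n_gpus batches, one batch per GPU index."""
    batches = [[] for _ in range(n_gpus)]
    for i, video in enumerate(videos):
        batches[i % n_gpus].append(video)
    return batches


def analyze_batch(config, videos, gpu):
    """Analyze one batch on one GPU; call this from a separate process per GPU."""
    # Imported here so the splitting helper above works without DeepLabCut installed.
    import deeplabcut

    deeplabcut.analyze_videos(config, videos, videotype='avi', gputouse=gpu)
```

Each process would then call analyze_batch(config, batches[gpu], gpu) with its own gpu index, for example one terminal per GPU.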
We tried that too. The problem is only with analyze_videos. I can train 2 networks simultaneously, but when I have one network training and try to run analyze_videos on the second GPU, I get a "device CUDA:0 not supported by XLA service" error.
When only one GPU is installed, analyze_videos uses only that GPU. But when the second GPU is installed, analyze_videos tries to use both, even when I specify which GPU with gputouse, and then I get the error.
Thanks for all the details. I looked into the code again and noticed that, in the predict code, the environment variable was set after the TF session was initialized. I just swapped the order and updated the GitHub repo (not PyPI so far). Could you please download 2.0.6.3 and check if it works for you now?
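For anyone reading this later, here is a toy illustration of why the order matters. It is not the actual DeepLabCut or TensorFlow code: FakeSession is a stand-in that, like a real session, looks at CUDA_VISIBLE_DEVICES only once, at creation time.

```python
import os


class FakeSession:
    """Stand-in for a TF session: records the visible devices when created."""

    def __init__(self):
        self.visible = os.environ.get("CUDA_VISIBLE_DEVICES", "all")


def start_session_wrong_order(gputouse):
    """Old behaviour: the session exists before the variable is set."""
    session = FakeSession()
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gputouse)  # too late to take effect
    return session


def start_session_fixed(gputouse=None):
    """Fixed behaviour: restrict the devices first, then create the session."""
    if gputouse is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(gputouse)
    return FakeSession()
```

In the wrong order the session still sees every GPU, which matches the nvidia-smi output above where one Python process holds memory on both cards.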
that worked! thank you!
Version released: https://pypi.org/project/deeplabcut/2.0.6.3/