Taichi fails to load on Ubuntu 20.04 inside a VM since 0.5.15. I tried both VMware Player and VirtualBox. Versions 0.5.14 and earlier work fine.


Thanks for your patience! This issue is caused by the OpenGL backend recently added in v0.5.15.
VMs are said to have limited OpenGL support. Could you run glxinfo | grep OpenGL to verify that? Also, do you see the same problem on a physical machine (not in a VM)?
Sure, VMs are not famous for their OpenGL compatibility. I got 3.3 on VMware but only 2.1 in VirtualBox. Sadly, I don't have a proper environment on my physical machine. Could you please not assume a GL version and make it work with CPU only, just like before? The glxinfo output from both VMs is attached.


I got 3.3 on VMware but only 2.1 in VirtualBox.
Taichi, however, requires 4.3 to work.
We should detect the version before calling into glfwCreateWindow and return false in that case:
https://github.com/taichi-dev/taichi/blob/e0ef399b2c2f053918e1013fea9850022c96bced/taichi/backends/opengl/opengl_api.cpp#L517-L519
But here is the problem:
We need an OpenGL context to call glGetString(GL_VERSION).
We need glfwCreateWindow to get an OpenGL context.
We need glGetString(GL_VERSION) to determine whether to call glfwCreateWindow.
https://stackoverflow.com/questions/46510889/how-can-i-know-which-opengl-version-is-supported-by-my-system
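For diagnosis at least, one way to sidestep this circular dependency is to ask an external tool rather than create a context in-process. The sketch below parses the output of glxinfo (the command mentioned earlier in this thread) and compares the reported version against the 4.3 that Taichi requires. It is only an illustration, not what Taichi does: the helper names parse_gl_version, meets_requirement and query_glxinfo are made up here, and the code assumes the usual "OpenGL version string: X.Y ..." line format printed by glxinfo.

```python
import re
import shutil
import subprocess

REQUIRED = (4, 3)  # minimum OpenGL version Taichi needs, per this thread

# Matches "OpenGL version string: 3.3 ..." and
# "OpenGL core profile version string: 4.5 ..." (assumed glxinfo format).
_VERSION_RE = re.compile(
    r"^OpenGL (?:core profile )?version string:\s*(\d+)\.(\d+)",
    re.MULTILINE,
)


def parse_gl_version(glxinfo_output):
    """Return the highest (major, minor) found in glxinfo output, or None."""
    versions = [(int(major), int(minor))
                for major, minor in _VERSION_RE.findall(glxinfo_output)]
    return max(versions) if versions else None


def meets_requirement(glxinfo_output, required=REQUIRED):
    """True if the reported OpenGL version is at least `required`."""
    version = parse_gl_version(glxinfo_output)
    return version is not None and version >= required


def query_glxinfo():
    """Run glxinfo if it is installed; return its stdout, else an empty string."""
    if shutil.which("glxinfo") is None:
        return ""
    result = subprocess.run(["glxinfo"], capture_output=True, text=True)
    return result.stdout
```

Calling meets_requirement(query_glxinfo()) then gives a yes/no answer without touching GLFW. This only helps on systems where glxinfo is installed, so it is a diagnostic aid rather than a general fix.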
Could you please not assume a GL version and make it work with CPU only, just like before?
Possible temporary solution: remove L455-L456 from ~/.local/lib/python3.8/site-packages/taichi/core/util.py:
    if ti_core.with_opengl():
        supported_archs.append('opengl')
A related issue: https://github.com/glfw/glfw/issues/766
Thanks for the reply. I know it probably requires compute shaders for OpenGL to really shine. However, to the best of my knowledge, OpenGL 4.3 is almost impossible to get in most VMs right now, so I'd like to fall back on x64 for now.
Removing L455-456 in util.py fixes the line 'import taichi as ti', but when I call 'ti.init(arch=ti.x64)', I still get pretty much the same error. In the call stack I can still see OpenGL being initialized. I'm not quite sure what's going on behind the scenes, but it seems that ti_core.with_opengl() is still true even if arch=ti.x64 is passed in.

Thanks for the information. I found another with_opengl call at L271-L272 of ~/.local/lib/python3.8/site-packages/taichi/lang/__init__.py:
    if ti_core.with_opengl():
        archs.append(opengl)
but it seems that ti_core.with_opengl() is still true even if arch=ti.x64 is passed in.
Note that with_opengl is not expected to return false when arch=ti.x64 is specified. It detects whether the OpenGL driver is available and returns false only when the driver is unavailable; it does not look at the manually specified arch.
However, with_opengl crashes with a segmentation fault while detecting OpenGL availability...
It would be straightforward if we could catch that SIGSEGV and return false in that case.
Python pseudo-code:
    def with_opengl():
        try:
            return initialize_opengl()
        except SegmentFault:
            return False
https://github.com/taichi-dev/taichi/blob/e0ef399b2c2f053918e1013fea9850022c96bced/taichi/core/logging.cpp#L157-L162
https://docs.python.org/3/library/faulthandler.html#module-faulthandler
This also applies to the CUDA backend, which is commonly reported to crash on startup (@yuanming-hu). What do you think?
BTW, you can run TI_LOG_LEVEL=trace python test.py to print more details about the internal process.
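A note on the pseudo-code above: a literal Python try/except cannot catch SIGSEGV, because the signal kills the whole interpreter. One common workaround is to run the risky probe in a child process and inspect its exit status. The sketch below shows only that pattern. probe_in_subprocess is a hypothetical helper, and the crashing snippet is a stand-in, since this thread does not show how Taichi's own availability check is invoked.

```python
import subprocess
import sys


def probe_in_subprocess(snippet, timeout=30.0):
    """Run `snippet` in a fresh interpreter; return True only on a clean exit.

    A crash such as SIGSEGV kills the child, not the caller. On POSIX the
    child's return code is then negative (minus the signal number).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A probe that hangs is treated the same as one that crashes.
        return False
    return result.returncode == 0


# Stand-in for a probe that segfaults inside a native library (POSIX only):
# the child sends SIGSEGV to itself.
CRASHING_PROBE = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
```

The cost is one extra interpreter start per probe, which is why the result would normally be cached. The same pattern could in principle wrap a CUDA availability check.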
BTW, is it possible to detect OpenGL version inside with_opengl(), something like this? Report true only if version >= 4.3?
Thanks for the suggestion. I hope so, but we can't call glGetIntegerv before glfwCreateWindow. I think we will stick to the catch-segmentation-fault approach, which is also helpful for CUDA.
We need an OpenGL context to call glGetString(GL_VERSION).
We need glfwCreateWindow to get an OpenGL context.
We need glGetString(GL_VERSION) to determine whether to call glfwCreateWindow (which causes the segfault).
A probably easier solution: can we simply have a .taichiconfig to disable OpenGL manually in certain environments?
Yes, we can. Do you mean that users without a suitable environment manually add TI_WITH_OPENGL=0 to their .bashrc?
Oh, making use of environment variables does sound like a good solution for *nix users! Let's use something like TI_ENABLE_OPENGL?
We should also consider how to make Taichi work out of the box without setting anything like an environment variable. On the other hand, we don't want to set TI_ENABLE_OPENGL=0 by default. Do you have an idea on how to achieve both?
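One way to keep the default unchanged is to treat the variable purely as an opt-out: the backend stays enabled unless the user explicitly turns it off. A minimal sketch of that lookup follows. opengl_allowed is a hypothetical helper, and the set of accepted "off" spellings is an assumption; how Taichi actually parses TI_ENABLE_OPENGL is not shown in this thread.

```python
import os


def opengl_allowed(environ=None):
    """Return False only when TI_ENABLE_OPENGL is explicitly set to an "off" value.

    An unset variable means "enabled", so nothing changes for users who do
    not configure anything.
    """
    if environ is None:
        environ = os.environ
    value = environ.get("TI_ENABLE_OPENGL", "1").strip().lower()
    # Assumed spellings for "off"; only "0" is mentioned in this thread.
    return value not in ("0", "false", "off", "no")
```

This does not by itself make VMs work out of the box. It only gives affected users a switch, so a safe availability probe would still be needed for the zero-configuration case.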
Thanks for all the timely replies. You guys are amazing!
You're welcome, and thank you for pointing out the bug and for the valuable information!
@yuanming-hu Can we release #962 with v0.6.4 tonight? So that @TroyZhai could try out TI_ENABLE_OPENGL=0 and see if it works.
Also note that this is a temporary solution, given that the root cause is hard to pin down. We must find a proper fix for this issue at some point.
Sure - I have meetings in the morning but I'll release v0.6.4 in a couple of hours.
@TroyZhai We just now released v0.6.4. When you get a chance, could you upgrade and run with TI_ENABLE_OPENGL=0? Please let us know if that works.
No rush on this at all. Thank you!
Great news! I can confirm that it works as expected on my VMs when I set "export TI_ENABLE_OPENGL=0". Thanks all!
Awesome!
I'm closing this thanks to the hard work by @archibate.
Cool! But how about adding this usage to the docs? Potentially a chapter called Troubleshooting that covers TI_ENABLE_OPENGL, TI_USE_UNIFIED_MEMORY, etc., so that it solves more people's problems.
Sounds good! Should we move the following items from the README file there as well?
- On Ubuntu 19.04+, please sudo apt install libtinfo5.
- On Windows, please install Microsoft Visual C++ Redistributable if you haven't.
A chapter named Installation sounds good. We can address all compatibility issues there. Maybe we can put it before Hello world?
This text in Hello world should also be moved there:
First of all, let's install Taichi via pip:
    # Python 3.6+ needed
    python3 -m pip install taichi