Taichi: [Backend] New OpenGL Compute Shader Backend

Created on 17 Feb 2020 · 7 comments · Source: taichi-dev/taichi

Concisely describe the proposed feature
I would like to add an OpenGL backend to the compiler so that my pre-Pascal GPU can be used.

Describe the solution you'd like
If LLVM IR can be converted into GLSL, we would create codegen_llvm_opengl.cpp.
Otherwise, following the Metal backend's approach, we would write codegen_opengl.cpp.

Additional comments
This would also help with WebGL, which means we could run Taichi code in the browser! See #394
References:
https://github.com/apache/incubator-tvm/pull/672

feature request stale

All 7 comments

Sounds good! Note that Taichi would require OpenGL compute shaders (available since OpenGL 4.3, a fairly recent version) to fully function.

I don't think there's a way to compile LLVM IR into OpenGL... So we have to use Metal's source-to-source approach: https://github.com/taichi-dev/taichi/issues/396 I think there is a lot to borrow from the lessons learned during the Metal backend development by @k-ye. We should also consider making a common base class for C-like language codegen (C, Metal, OpenGL, OpenCL, etc.) to avoid duplicated code.
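To make the shared-base-class idea concrete, here is a minimal sketch in Python (the real codegen is C++). Everything in it is an assumption for illustration: the class names `CLikeCodegen`, `GLSLCodegen` and `MetalCodegen`, the method names, and the emitted fragments, which are not complete shaders. It only shows how the common loop-emitting logic could live in one place while each backend supplies its own kernel header and thread-index expression.

```python
class CLikeCodegen:
    """Hypothetical shared emitter for C-like target languages."""

    def __init__(self):
        self.lines = []
        self.indent = 0

    def emit(self, text):
        self.lines.append("  " * self.indent + text)

    # Backend-specific pieces, overridden by subclasses.
    def kernel_header(self, name):
        raise NotImplementedError

    def thread_index(self):
        raise NotImplementedError

    def gen_range_for(self, name, end, body):
        """Shared logic: one thread per iteration, guarded by a bounds check."""
        self.emit(self.kernel_header(name) + " {")
        self.indent += 1
        self.emit(f"int i = int({self.thread_index()});")
        self.emit(f"if (i >= {end}) return;")
        self.emit(body)
        self.indent -= 1
        self.emit("}")
        return "\n".join(self.lines)


class GLSLCodegen(CLikeCodegen):
    def kernel_header(self, name):
        return "void main()"  # a GLSL compute shader entry point

    def thread_index(self):
        return "gl_GlobalInvocationID.x"


class MetalCodegen(CLikeCodegen):
    def kernel_header(self, name):
        return (f"kernel void {name}(device float* data [[buffer(0)]], "
                "uint tid [[thread_position_in_grid]])")

    def thread_index(self):
        return "tid"


glsl_src = GLSLCodegen().gen_range_for("fill", 16, "data[i] = 1.0;")
metal_src = MetalCodegen().gen_range_for("fill", 16, "data[i] = 1.0;")
print(glsl_src)
print(metal_src)
```

Only the two small overrides differ between the backends; the statement-by-statement translation stays in the base class.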

Also helps WebGL, which means we can run taichi code on browsers!

We still need to investigate whether general Taichi programs can be compiled to WebGL, since, as far as I understand, compute shader support in WebGL is still immature.

Compute shader? Maybe I can try emulating one in a fragment shader. We could store a tensor in a texture with height=1, according to https://github.com/apache/incubator-tvm/pull/672.

I found that GLES is really good at dealing with linear algebra: https://blog.csdn.net/ylbs110/article/details/52074826
We can make good use of that. Combined with fragment and vertex shaders, there is hope of compiling Taichi code this way!

Compute shader? Maybe I can try emulating one in a fragment shader. We could store a tensor in a texture with height=1, according to apache/incubator-tvm#672.

Not every Taichi program can be compiled into a normal fragment shader. For example, you may need atomic_add, or multiple outputs. We do need compute shaders for bigger compute capabilities.
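A small Python sketch of why atomic_add matters here, using a histogram as a hypothetical example (the function names are made up for illustration). In the scatter form, each input element adds to an arbitrary output slot, which on a GPU needs an atomic add. A plain fragment shader writes only to its own output location, so the same computation would have to be rewritten as a gather, where each output slot scans all inputs.

```python
def histogram_scatter(values, num_bins):
    """One (conceptual) GPU thread per input; writes collide on shared bins."""
    bins = [0] * num_bins
    for v in values:
        bins[v] += 1  # on a GPU this update would need atomic_add
    return bins


def histogram_gather(values, num_bins):
    """One (conceptual) fragment per output bin; each writes only its own slot."""
    return [sum(1 for v in values if v == b) for b in range(num_bins)]


data = [0, 2, 2, 1, 2]
print(histogram_scatter(data, 3))
print(histogram_gather(data, 3))
```

The gather form does `num_bins * len(values)` work instead of `len(values)`, and not every scatter pattern can be rewritten this way, which is one way to see the limitation described above.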

I can share some insights from #396

We started by supporting only the dense SNode, which is literally just an array of primitives. Metal handles this by passing in a chunk of raw bytes (char). We then do pointer arithmetic to find the memory location mapped by the index, and use reinterpret_cast to cast that location to the corresponding primitive type. You can get a basic sense of it in MetalStructCompiler:

https://github.com/taichi-dev/taichi/blob/ab1bfed008518c2dbe206e883d2dda81bfb8b587/taichi/backends/struct_metal.h#L18
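The raw-bytes idea can be mimicked in Python with the standard `struct` module. This is only an analogue of the pointer arithmetic plus reinterpret_cast described above, not the Metal backend's code: the float32 element type, the 8-element field, and the helper names `element_offset`, `store` and `load` are all assumptions.

```python
import struct

# Assumed layout: a dense 1-D field of float32 values stored back to back.
ELEM_FORMAT = "<f"                        # little-endian 32-bit float
ELEM_SIZE = struct.calcsize(ELEM_FORMAT)  # 4 bytes


def element_offset(index):
    """The 'pointer arithmetic': byte offset of element `index`."""
    return index * ELEM_SIZE


def store(buffer, index, value):
    """Write a float into the raw byte buffer at the computed offset."""
    struct.pack_into(ELEM_FORMAT, buffer, element_offset(index), value)


def load(buffer, index):
    """The 'reinterpret_cast': read 4 raw bytes back as a float."""
    return struct.unpack_from(ELEM_FORMAT, buffer, element_offset(index))[0]


root = bytearray(8 * ELEM_SIZE)  # the chunk of raw bytes handed to the kernel
store(root, 3, 1.5)
print(element_offset(3), load(root, 3))
```

The buffer itself carries no type information; the layout lives entirely in the offset computation and the format used to read each element back.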

I'm not super familiar with OpenGL or its compute shaders, so it's unclear to me whether this works for OpenGL.

You may also need to have a basic understanding of 1) OffloadedStmt

https://github.com/taichi-dev/taichi/blob/ab1bfed008518c2dbe206e883d2dda81bfb8b587/taichi/statements.h#L172

and 2) how the offload IR pass works, since that's what translates a for ... in x loop in Python into GPU kernels and defines the boundary conditions.
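As a rough illustration of what "GPU kernel plus boundary conditions" means for a range-for loop, the sketch below simulates a grid launch in Python. It is not Taichi's offload pass: `launch_range_for` and the group size are hypothetical, and the sequential loops stand in for threads that would run in parallel.

```python
def launch_range_for(begin, end, group_size, body):
    """Simulate running `for i in range(begin, end)` as a grid of threads."""
    n = end - begin
    num_groups = (n + group_size - 1) // group_size  # round up
    for group in range(num_groups):
        for local in range(group_size):
            i = begin + group * group_size + local   # global thread index
            if i < end:      # boundary check: the last group may overshoot
                body(i)
    return num_groups


visited = []
groups = launch_range_for(2, 12, 4, visited.append)
print(groups, visited)
```

With 10 iterations and groups of 4, three groups are launched and the last two threads of the final group are masked out by the bounds check.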

I think the major issue is memory layout. Once that's figured out, the actual codegen is not that hard to write: much of the time you are literally just translating the Taichi IR into a higher-level language.

Very excited to see a new (and much more widely-used) backend!

PS: Halide is also a great reference. Maybe you can borrow some of their implementation as well

Warning: The issue has been out-of-update for 50 days, marking stale.

@github-actions not stale at all! The OpenGL backend now is quite well-done and the issue should be closed now :)
