Is it correct that
Why do we have such different precisions? Are they correct?
Thanks!
This is for GPU compatibility. Most common GPUs do math with 32-bit (or lower) precision. When you want to run code on the GPU using theano, you have to set theano.config.floatX to float32. So PyMC3 checks whether floatX has been set, and then reduces the precision of integers as well.
Yes, but why do we sometimes set int16 and other times int32, etc.? Shouldn't we have a standard like
everywhere?
I don't remember where I saw it but it's
floatX == float32 => int16
floatX == float64 => int64
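To make the recalled mapping concrete, here is a minimal sketch in plain Python. The table and the helper name `int_dtype_for` are hypothetical and only encode the two cases quoted above from memory; they are not taken from the PyMC3 source.

```python
# Hypothetical lookup table encoding the mapping recalled above.
# Illustration only: this is not copied from PyMC3.
FLOATX_TO_INT = {
    "float32": "int16",
    "float64": "int64",
}


def int_dtype_for(floatx):
    """Return the integer dtype name paired with a floatX setting."""
    try:
        return FLOATX_TO_INT[floatx]
    except KeyError:
        raise ValueError("unsupported floatX: %r" % (floatx,))


print(int_dtype_for("float32"))  # int16
print(int_dtype_for("float64"))  # int64
```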
Ok but only for storage, right?
Because GPU operations are only allowed on float32 (https://stackoverflow.com/questions/32229882/are-int-operations-possible-on-gpu-in-theano)
And from here (http://deeplearning.net/software/theano/tutorial/using_gpu.html) "The backend supports all regular theano data types (float32, float64, int, ...), however GPU support varies and some units can't deal with double (float64) or small (less than 32 bits like int16) data types. You will get an error at compile time or runtime if this is the case."
and (https://stackoverflow.com/questions/34520471/how-to-force-theano-to-parallelize-an-operation-on-gpu-test-case-numpy-bincoun) "Currently (as of Jan. 4th, 2016) Theano and CUDA do not support any operations on any data type rather than float32"
Sorry for being pedantic, but I would like to understand and help (if needed)
I'd be surprised if the motivation for those casts is storage... I assumed GPU support. Maybe @kyleabeauchamp can weigh in? He has done a lot of work on this.
Yeah, I believe the int casting is to prevent various exceptions when running models (tests and simple models) on GPUs.
I don't want to weigh in too strongly on what the "best" approach is. I think the work we've done so far mainly ensures that things "can" run. I'm not expert enough to say what "should" be done.
I would imagine the Theano GPU has some support for integers, but not full 64 bit long integers. Thus, we have to try to detect an "appropriate integer precision model" given the float precision model being used.
If it helps I can try to summarize the arithmetic conversion rules we are dealing with here a bit:
The arithmetic conversion rules tell us what type the result of an arithmetic operation should have, for example, what type the sum of an int32 and a float32 should be.
Python: This is I guess the easiest, as there are only two built-in numeric types, int and float. Int grows to arbitrary length and doesn't overflow, float is stored as a float64 internally. The sum of two ints is an int again, and the sum of int and float or float and float is float.
C/C++ (and I think also cuda): Focusing on cases where floats are involved (the other stuff is a mess, especially if you also look at undefined behaviour): If one of the types is a long double, you get a long double. If not, but if one is a double, you get a double. If that isn't the case either but one is a float, you get a float. So we have
int64 * double -> double
int64 * float -> float
int32 * float -> float
Numpy: for two arrays, the result dtype depends only on the input dtypes:
int16 * float32 -> float32
int32 * float32 -> float64
If we add a scalar and a vector (or anything with ndim>0), the dtype of the result depends on the value of the scalar:
>>> np.array(1, dtype='int64') + np.array([0], dtype='int8')
array([1], dtype=int8)
>>> np.array(1, dtype='float32') + np.array([0], dtype='int8')
array([1], dtype=float32)
I'm not entirely sure about the exact rules, but see np.promote_types and np.result_type for details.
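The dtype-only part of these rules can be checked with np.promote_types, which looks at the two dtypes and never at the values. A small sketch, assuming NumPy is installed; the value-dependent scalar case is left out on purpose because its behaviour depends on the NumPy version.

```python
import numpy as np

# promote_types considers only the dtypes, never the stored values.
pairs = [
    ("int16", "float32"),
    ("int32", "float32"),
    ("int64", "float64"),
]

results = {}
for a, b in pairs:
    results[(a, b)] = np.promote_types(a, b)
    print(a, "*", b, "->", results[(a, b)])
```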
If we convert a python int to a theano variable using tt.as_tensor_variable, it will look for the smallest dtype that can fit the value:
>>> tt.as_tensor_variable(1).dtype
'int8'
>>> tt.as_tensor_variable(128).dtype
'int16'
(I'd call this madness, by the way: (tt.as_tensor_variable(127) + tt.as_tensor_variable(1)).eval() -> -128)
Lists of integers seem to always be converted to int64.
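The "smallest dtype that fits" behaviour and the resulting wraparound can be imitated without theano. This is a sketch using only the standard library; `smallest_int_dtype` is a hypothetical helper written for illustration, not theano's implementation.

```python
import ctypes

# Signed integer dtypes from narrowest to widest, with their bit widths.
INT_DTYPES = [("int8", 8), ("int16", 16), ("int32", 32), ("int64", 64)]


def smallest_int_dtype(value):
    """Name of the narrowest signed dtype whose range contains value."""
    for name, bits in INT_DTYPES:
        if -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
            return name
    raise OverflowError("value does not fit in int64")


print(smallest_int_dtype(1))    # int8
print(smallest_int_dtype(127))  # int8
print(smallest_int_dtype(128))  # int16

# Both operands were stored as int8, so the sum is also kept in 8 bits
# and wraps around instead of growing to int16.
wrapped = ctypes.c_int8(127 + 1).value
print(wrapped)  # -128
```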
The same holds for floats:
>>> tt.as_tensor_variable(0.).dtype
float32
>>> tt.as_tensor_variable(0.5).dtype
float32
>>> tt.as_tensor_variable(0.1).dtype # 0.1 can't be represented in base 2...
float64
>>> tt.as_tensor_variable([0.]).dtype
float64
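The pattern in the scalar float results above matches a simple rule: keep float32 when the value survives a round trip through 32-bit storage unchanged, otherwise fall back to float64. The sketch below reproduces that rule with the standard library only. It is an assumption about the kind of check involved, not theano's actual code, and `fits_float32` is a hypothetical name.

```python
import struct


def fits_float32(value):
    """True if value is unchanged after being stored as a 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0] == value


for v in (0.0, 0.5, 0.1):
    print(v, "float32" if fits_float32(v) else "float64")
```

Here 0.0 and 0.5 are exact in 32 bits, while 0.1 is rounded differently in 32 and 64 bits, so only the first two pass the check.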
Numpy arrays are always converted to a theano var of the same type.
For its arithmetic, theano follows the numpy convention, but ignores the rule about scalars and arrays (this would be hard for non-constant scalars, as then the dtype of the results couldn't be determined at compile time).
As GPUs don't always have good support for double (thanks to nvidia, apparently they disable double support on their cheaper cards so that they have an easier time selling the expensive cards), we have a strong incentive to use int16 instead of int32, because int32 * float32 -> float64 in both theano and numpy.
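One way to see why int16 and int32 are treated differently when combined with float32: a 32-bit float has a 24-bit significand, so it can hold every int16 value exactly but not every int32 value. That is presumably what the promotion rule protects against. A sketch using only the standard library, where `through_float32` is a hypothetical helper:

```python
import struct


def through_float32(n):
    """Store an integer as a 32-bit float and read it back as an int."""
    return int(struct.unpack("f", struct.pack("f", float(n)))[0])


# Every int16 value survives the round trip unchanged.
all_int16_exact = all(
    through_float32(n) == n for n in range(-2 ** 15, 2 ** 15)
)
print(all_int16_exact)  # True

# 2**24 + 1 is well inside the int32 range but needs 25 significant bits.
odd_int32 = 2 ** 24 + 1
print(through_float32(odd_int32) == odd_int32)  # False
```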
This is a great explanation! Maybe you should put it somewhere in the docs so that it doesn't get lost @aseyboldt !
I think it could be part of the theano quick start?
@junpenglao Maybe in the advanced section? It seems a bit too detailed for the introduction.
I've also been thinking a bit about how we should handle dtypes in general. How about something like this, which shouldn't differ that much from what we do currently:
Convert parameters with tt.as_tensor_variable, but prefer the dtype of the RV for python variables instead of using the smallest possible.
For data that already has a dtype, use as_tensor_variable (keeping the same dtype).
Check with np.can_cast if we can convert it to the variable's dtype, or the corresponding float dtype if it is a continuous param for a discrete variable. Throw an error if this is not possible.
I think this should prevent users from shooting themselves in the foot with integer overflows when they don't explicitly cast their data (in which case I think we should assume that people know what they are doing), and give reasonable error messages if someone tries to use only float32.
But it would require (small?) changes to all distributions, and would also prevent some possible valid model specifications, like using a larger dtype for one of the parameters than the variable itself. But I don't think this should be much trouble in practice.
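A rough sketch of what such a check could look like for an integer-valued RV, assuming NumPy is installed. The function name `to_rv_dtype` is hypothetical and nothing here is existing PyMC3 code; it only covers integer RV dtypes.

```python
import numpy as np


def to_rv_dtype(param, rv_dtype):
    """Hypothetical helper: convert param to the RV's integer dtype or fail."""
    rv_dtype = np.dtype(rv_dtype)
    if isinstance(param, int):
        # Python ints carry no dtype: prefer the RV's dtype if the value fits.
        info = np.iinfo(rv_dtype)
        if not info.min <= param <= info.max:
            raise OverflowError("%d does not fit in %s" % (param, rv_dtype))
        return np.array(param, dtype=rv_dtype)
    arr = np.asarray(param)
    # Typed data: accept only safe casts (the default mode of np.can_cast).
    if not np.can_cast(arr.dtype, rv_dtype):
        raise TypeError("cannot safely cast %s to %s" % (arr.dtype, rv_dtype))
    return arr.astype(rv_dtype)


print(to_rv_dtype(5, "int16").dtype)                            # int16
print(to_rv_dtype(np.array([1, 2], dtype="int8"), "int16").dtype)  # int16
```

With this sketch, `to_rv_dtype(40000, "int16")` raises an OverflowError and an int64 array passed for an int16 variable raises a TypeError, which is the "larger dtype for a parameter" case mentioned above.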
Sounds reasonable to me.
The madness seems to come mostly from tt.as_tensor_variable. Would it be ok if we force the dtype to int8 in every use of as_tensor_variable where an int is involved?
Not sure what you mean about int8?
Sorry I meant int16, as
we have a strong incentive to use int16 instead of int32
If by "force the dtype to int16" you mean "raise an exception when we get an int32 or int64 in a RV that has dtype int16", then that is what I mean :-)
Should this be closed by #3300?
Yes, I think so.