Pymc3: Discrete variables: int16 vs int32 vs int64

Created on 29 Jun 2017 · 16 Comments · Source: pymc-devs/pymc3

Is it correct that

  • here (https://github.com/pymc-devs/pymc3/blob/master/pymc3/distributions/distribution.py#L124) we have float32 => int16, else int64
  • here (https://github.com/pymc-devs/pymc3/blob/master/pymc3/distributions/discrete.py#L286) we have the mode cast to int32

Why do we have such different precisions? Are they correct?

Thanks!

gpu

Most helpful comment

If it helps, I can try to summarize the arithmetic conversion rules we are dealing with here:
The arithmetic conversion rules tell us what type the result of an arithmetic operation should have, for example what type the sum of an int32 and a float32 should be.

  • Python: This is, I guess, the easiest, as there are only two built-in numeric types that matter here, int and float. An int grows to arbitrary length and never overflows; a float is stored as a float64 internally. The sum of two ints is an int again, and the sum of an int and a float, or of two floats, is a float.

  • C/C++ (and I think also CUDA): Focusing on cases where floats are involved (the other stuff is a mess, especially if you also look at undefined behaviour): If one of the types is a long double, you get a long double. If not, but one is a double, you get a double. If that isn't the case either but one is a float, you get a float. So we have

int64 * double -> double
int64 * float -> float
int32 * float -> float
  • numpy: If you add two scalars or two arrays, numpy looks for the smallest dtype that both operands can be cast to without any loss of precision and uses that as the return type. The exception is int64, which can't actually be represented exactly as a float64 but is still treated as if it could. As int32 can't be cast safely to float32, we get
int16 * float32 -> float32
int32 * float32 -> float64

If we add a scalar and a vector (or anything with ndim>0), the dtype of the result depends on the value of the scalar:

>>> np.array(1, dtype='int64') + np.array([0], dtype='int8')
array([1], dtype=int8)
>>> np.array(1, dtype='float32') + np.array([0], dtype='int8')
array([1], dtype=float32)

I'm not entirely sure about the exact rules, but see np.promote_types and np.result_type for details.

  • Theano: I couldn't find a formal definition of the rules, but I think I have mostly worked them out by now:
    If we convert a Python int to a Theano variable using tt.as_tensor_variable, it will look for the smallest dtype that can fit the value:
>>> tt.as_tensor_variable(1).dtype
'int8'
>>> tt.as_tensor_variable(128).dtype
'int16'

(I'd call this madness, by the way: (tt.as_tensor_variable(127) + tt.as_tensor_variable(1)).eval() -> -128)
Lists of integers seem to always be converted to int64.
The analogous rule holds for floats:

>>> tt.as_tensor_variable(0.).dtype
float32
>>> tt.as_tensor_variable(0.5).dtype
float32
>>> tt.as_tensor_variable(0.1).dtype  # 0.1 can't be represented in base 2...
float64
>>> tt.as_tensor_variable([0.]).dtype
float64

Numpy arrays are always converted to a Theano variable of the same dtype.
For its arithmetic, Theano follows the numpy convention, but ignores the rule about scalars and arrays (this would be hard for non-constant scalars, as the dtype of the result then couldn't be determined at compile time).
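The int8 wraparound mentioned above can be reproduced without Theano. The following is a minimal sketch using NumPy arrays as a stand-in (an assumption: it shows the fixed-width integer arithmetic involved, not Theano's own constant handling); the variable names are illustrative only.

```python
import numpy as np

# Two int8 arrays: 127 is the largest value int8 can hold.
a = np.array([127], dtype="int8")
b = np.array([1], dtype="int8")

# The result dtype stays int8, so the sum silently wraps around.
wrapped = a + b
print(wrapped, wrapped.dtype)

# Widening one operand first gives the promoted dtype enough room.
widened = a.astype("int16") + b
print(widened, widened.dtype)
```

Picking the smallest dtype that fits each individual constant is what makes this surprising: both inputs fit into int8, but their sum does not.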

As GPUs don't always have good support for double (thanks to NVIDIA; apparently they disable double support on their cheaper cards so that they have an easier time selling the expensive ones), we have a strong incentive to use int16 instead of int32, because int32 * float32 -> float64 in both Theano and numpy.
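The numpy side of this argument can be checked directly. This is a small sketch using np.can_cast and np.promote_types on dtypes only (no values involved); it says nothing about Theano itself, which is only described above as following the same convention.

```python
import numpy as np

# Safe-cast checks between dtypes.
print(np.can_cast(np.int16, np.float32))  # True
print(np.can_cast(np.int32, np.float32))  # False

# The promoted dtype is the smallest one both operands cast to safely.
print(np.promote_types(np.int16, np.float32))  # float32
print(np.promote_types(np.int32, np.float32))  # float64
print(np.promote_types(np.int64, np.float64))  # float64 (the int64 exception)

# Why int32 -> float32 is unsafe: float32 has a 24-bit significand,
# so 2**24 + 1 cannot be represented exactly.
big = np.array([2**24 + 1], dtype=np.int32)
roundtrip = big.astype(np.float32).astype(np.int32)
print(big[0], roundtrip[0])  # 16777217 16777216
```

So with floatX set to float32, an int16 operand keeps the whole expression in float32, while an int32 operand pulls it up to float64.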

All 16 comments

This is for GPU compatibility. Most common GPUs do math with 32bit (or less) precision. When you want to run code on the GPU using theano, you have to set theano.config.floatX to float32. So PyMC3 checks if floatX has been set, and then reduces the precision on integers also.

Yes, but why do we sometimes set int16 and other times int32? Shouldn't we have a standard like

  • floatX == float32 => int32
  • floatX == float64 => int64

everywhere?

I don't remember where I saw it but it's
floatX == float32 => int16
floatX == float64 => int64
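A minimal sketch of that mapping, assuming a hypothetical helper named int_dtype_for (this is not PyMC3's actual function, only an illustration of the rule quoted above). It also checks with NumPy that mixing the chosen integer dtype with floatX does not promote beyond floatX.

```python
import numpy as np


def int_dtype_for(floatX):
    """Hypothetical helper: integer dtype to pair with a given floatX."""
    return "int16" if floatX == "float32" else "int64"


for floatX in ("float32", "float64"):
    int_dtype = int_dtype_for(floatX)
    # Mixed int/float arithmetic should stay in floatX.
    promoted = np.promote_types(int_dtype, floatX)
    print(floatX, int_dtype, promoted)
```

With int32 in place of int16, the float32 row would promote to float64, which is the case the casts are trying to avoid.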

Ok but only for storage, right?
Because GPU operations are only allowed on float32 (https://stackoverflow.com/questions/32229882/are-int-operations-possible-on-gpu-in-theano)

And from here (http://deeplearning.net/software/theano/tutorial/using_gpu.html) "The backend supports all regular theano data types (float32, float64, int, ...), however GPU support varies and some units can't deal with double (float64) or small (less than 32 bits like int16) data types. You will get an error at compile time or runtime if this is the case."
and (https://stackoverflow.com/questions/34520471/how-to-force-theano-to-parallelize-an-operation-on-gpu-test-case-numpy-bincoun) "Currently (as of Jan. 4th, 2016) Theano and CUDA do not support any operations on any data type rather than float32"

Sorry for being pedantic, but I would like to understand and help (if needed)

I'd be surprised if the motivation for those casts is storage... I assumed GPU support. Maybe @kyleabeauchamp can weigh in? he has done a lot of work on this.

Yeah, I believe the int casting is to prevent various exceptions when running models (tests and simple models) on GPUs.

I don't want to weigh in too strongly on what the "best" approach is. I think the work we've done so far mainly ensures that things "can" run. I'm not expert enough to say what "should" be done.

I would imagine the Theano GPU has some support for integers, but not full 64 bit long integers. Thus, we have to try to detect an "appropriate integer precision model" given the float precision model being used.

This is a great explanation! Maybe you should put it somewhere in the docs so that it doesn't get lost, @aseyboldt!

I think it could be part of the theano quick start?

@junpenglao Maybe in the advanced section? It seems a bit too detailed for the introduction.

I've also been thinking a bit about how we should handle dtypes in general. How about something like this, which shouldn't differ that much from what we do currently:

  • Each RV has a dtype, which can be set by the user. Throw errors if an int is used for a continuous dtype, and vice versa. If nothing is explicitly set, use float64/int64 or float32/int16 depending on floatX.
  • For each parameter, convert it to a theano var in a similar way as tt.as_tensor_variable, but prefer the dtype of the RV for python variables instead of using the smallest possible:

    • if it is a theano var, don't change it

    • If it is numpy, use as_tensor_variable (keeping the same dtype)

    • If it is pure Python, check with np.can_cast if we can convert it to the variable's dtype, or the corresponding float dtype if it is a continuous param for a discrete variable. Throw an error if this is not possible.

  • For each parameter, and also for observed values, check if the dtype is valid. By valid I mean: it is fine if the dtype can be cast safely to the variable's dtype (or the corresponding float dtype for float parameters in discrete RVs); otherwise throw an error.

I think this should prevent users from shooting themselves in the foot with integer overflows when they don't explicitly cast their data (if they do, I think we should assume that they know what they are doing), and give reasonable error messages if someone tries to use only float32.

But it would require (small?) changes to all distributions, and would also prevent some possible valid model specifications, like using a larger dtype for one of the parameters than the variable itself. But I don't think this should be much trouble in practice.
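The validation part of this proposal could look roughly like the sketch below. Everything here is an assumption for illustration: check_param_dtype is a hypothetical name, the Theano-variable branch is left out, and Python ints are range-checked with np.iinfo instead of being passed to np.can_cast, since I believe newer NumPy releases no longer accept Python scalars there.

```python
import numpy as np


def check_param_dtype(value, rv_dtype):
    """Hypothetical helper: validate `value` against an RV's dtype.

    Returns a NumPy array, or raises TypeError if the value cannot be
    represented safely in `rv_dtype`.
    """
    rv_dtype = np.dtype(rv_dtype)

    if isinstance(value, np.ndarray):
        # NumPy input keeps its own dtype, but must be safely castable.
        if not np.can_cast(value.dtype, rv_dtype, casting="safe"):
            raise TypeError(f"cannot safely cast {value.dtype} to {rv_dtype}")
        return value

    if isinstance(value, int) and rv_dtype.kind == "i":
        # Pure Python int: prefer the RV's dtype, but refuse overflow.
        info = np.iinfo(rv_dtype)
        if not info.min <= value <= info.max:
            raise TypeError(f"{value} does not fit into {rv_dtype}")
        return np.asarray(value, dtype=rv_dtype)

    if isinstance(value, (int, float)) and rv_dtype.kind == "f":
        # Pure Python number for a float parameter: use the RV's float dtype.
        return np.asarray(value, dtype=rv_dtype)

    raise TypeError(f"unsupported value {value!r} for dtype {rv_dtype}")


print(check_param_dtype(5, "int16").dtype)
print(check_param_dtype(np.array([1], dtype="int8"), "int16").dtype)
```

Passing 40000 for an int16 RV, or an int32 array for an int16 RV, raises a TypeError here instead of overflowing or silently promoting.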

Sounds reasonable to me.
The madness seems to come mostly from tt.as_tensor_variable. Would it be OK if we force the dtype to int8 when using as_tensor_variable whenever an int is involved?

Not sure what you mean about int8?

Sorry I meant int16, as

we have a strong incentive to use int16 instead of int32

If by "force the dtype to int16" you mean "raise an exception when we get an int32 or int64 in an RV that has dtype int16", then that is what I mean :-)

Should this be closed by #3300?

Yes, I think so.
