Deeplearning4j: SameDiff generation of unsigned integers produces unexpected results

Created on 27 Dec 2019 · 4 Comments · Source: eclipse/deeplearning4j

Issue Description

Hello, new user here. I can use the following Kotlin code to successfully generate uniform random DataType.INT values. (If there is a more efficient way to build the needed shape SDVariable, or a direct way to generate random UINT64 values using the standard ND4J random functions instead of SameDiff, please let me know.)

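import org.nd4j.autodiff.samediff.SDVariable
import org.nd4j.autodiff.samediff.SameDiff
import org.nd4j.linalg.api.buffer.DataType
import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4j.linalg.factory.Nd4j

fun main() {
    val sameDiff = SameDiff.create()
    // The shape is passed to uniform() as an integer-typed constant SDVariable.
    val floatShape: FloatArray = floatArrayOf(3.0F, 3.0F)
    val indaShape: INDArray = Nd4j.create(floatShape).castTo(DataType.INT)
    val sdShape: SDVariable = sameDiff.constant(indaShape)
    val rng: SDVariable = sameDiff.random().uniform("data", 0.0, 10.0, sdShape, DataType.INT)
    val outputs: Map<String, INDArray> = sameDiff.output(emptyMap<String, INDArray>(), "data")
    val data: INDArray = outputs["data"]!!
    println(data)
    print(data.dataType())
}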

This code produces the expected output: a 3x3 matrix of integer values in the range [0, 9] with a dataType() of INT.

However, if I provide UINT16, UINT32, or UINT64 as the dataType argument to the SDRandom.uniform overload that I am using, it does not work as expected. If I provide UINT32 or UINT64, the dataType() matches the unsigned type that I requested, but the printed values are spread all over the range of a signed integer of the corresponding size. They aren't even byte-equivalent to the specified unsigned integer range: if I specify a very narrow range, more distinct random values are produced than exist in that range. For UINT16, I instead get the following exception when trying to print the result:

Exception in thread "main" java.lang.ClassCastException: class org.bytedeco.javacpp.indexer.ShortRawIndexer cannot be cast to class org.bytedeco.javacpp.indexer.UShortIndexer (org.bytedeco.javacpp.indexer.ShortRawIndexer and org.bytedeco.javacpp.indexer.UShortIndexer are in unnamed module of loader 'app')
at org.nd4j.linalg.api.buffer.BaseDataBuffer.getLong(BaseDataBuffer.java:1483)
at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.getLong(BaseCudaDataBuffer.java:1496)
at org.nd4j.linalg.api.ndarray.BaseNDArray.getLong(BaseNDArray.java:1749)
at org.nd4j.linalg.string.NDArrayStrings.vectorToString(NDArrayStrings.java:300)
at org.nd4j.linalg.string.NDArrayStrings.format(NDArrayStrings.java:223)
at org.nd4j.linalg.string.NDArrayStrings.format(NDArrayStrings.java:257)
at org.nd4j.linalg.string.NDArrayStrings.format(NDArrayStrings.java:191)
at org.nd4j.linalg.string.NDArrayStrings.format(NDArrayStrings.java:168)
at org.nd4j.linalg.api.ndarray.BaseNDArray.toString(BaseNDArray.java:4932)
at org.nd4j.linalg.api.ndarray.BaseNDArray.toString(BaseNDArray.java:4923)
at org.nd4j.linalg.jcublas.JCublasNDArray.toString(JCublasNDArray.java:517)
at java.base/java.lang.String.valueOf(String.java:3352)
at java.base/java.io.PrintStream.println(PrintStream.java:977)

Is this operation intended to be able to produce unsigned values? (If not, consider this a feature request to that end.)
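
To make the "byte-equivalent" remark above concrete, here is a minimal sketch using only the Kotlin standard library's unsigned types (no ND4J involved). The helper names asSigned and distinctSignedReadings are hypothetical and exist only for this illustration. It shows that reading unsigned bits as a signed integer can turn large values negative, but the mapping is one-to-one, so a narrow range can never yield more distinct values than it contains. That is why the observed output cannot be explained by a signed/unsigned display mismatch alone.

```kotlin
// Hypothetical helper: reads the 32 bits of an unsigned value as a signed Int.
fun asSigned(u: UInt): Int = u.toInt()

// Hypothetical helper: counts the distinct signed readings of every
// unsigned value in the inclusive range [lo, hi].
fun distinctSignedReadings(lo: UInt, hi: UInt): Int =
    (lo..hi).map { asSigned(it) }.toSet().size

fun main() {
    // A value above Int.MAX_VALUE appears negative when read as signed...
    println(asSigned(3_000_000_000u))           // -1294967296
    // ...but converting back recovers the original bits.
    println(asSigned(3_000_000_000u).toUInt())  // 3000000000
    // Reinterpretation preserves the number of distinct values in a range.
    println(distinctSignedReadings(0u, 9u))     // 10
}
```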

Version Information

  • Deeplearning4j version - 1.0.0-beta6
  • Platform information (OS, etc) - Windows 10
  • CUDA version, if used - 10.0
  • NVIDIA driver version, if in use - 436.30

Labels: Bug, LIBND4J, SameDiff

All 4 comments

I'm afraid it was never tested against unsigned types, and wasn't even supposed to work with them. So we should either fix the validation or tweak the operation. Tweaking the operation would probably be the better choice here.

cc @AlexDBlack

Hm... that stack trace doesn't look specific to random, it's about INDArray.toString.

As for uniform random - separate issue, but if sameDiff.random().uniform doesn't work for unsigned integers, I see no harm in adding that, with appropriate validation of course (non-negative lower bound, mainly)
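
The validation mentioned above could look roughly like the following sketch. It is plain Kotlin, not ND4J or SameDiff code: the function name requireValidUnsignedUniformBounds is hypothetical, and the second check (lower bound strictly below upper bound) is an added assumption, since the comment only names the non-negative lower bound.

```kotlin
// Hypothetical validation helper; not part of ND4J or SameDiff.
fun requireValidUnsignedUniformBounds(min: Double, max: Double) {
    // From the comment above: unsigned types need a non-negative lower bound.
    require(min >= 0.0) { "lower bound must be non-negative for unsigned types, got $min" }
    // Assumption for this sketch: the range must be non-empty.
    require(min < max) { "lower bound $min must be less than upper bound $max" }
}

fun main() {
    // Accepted: the U(0, 10) case from this issue.
    requireValidUnsignedUniformBounds(0.0, 10.0)

    // Rejected: a negative lower bound throws IllegalArgumentException.
    val rejected = runCatching { requireValidUnsignedUniformBounds(-1.0, 10.0) }
    println(rejected.isFailure)  // true
}
```

Failing fast with require keeps the error at the call site instead of surfacing later as out-of-range values.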

So, the exception above was due to an issue in one constructor in BaseCudaDataBuffer, easy fix (fixed in my branch, will be merged soon).

As for actual values, the following datatypes look off for U(0,10):

  • BYTE (0s out)
  • SHORT (0s out)
  • UBYTE (0 or 1 only)
  • UINT16 (0 or 1 only)
  • UINT32 (garbage values, out of range)
  • UINT64 (garbage values, out of range)

Right, it was never introduced there, since the goal was float types + int32. I will add support for these types sometime this week.
