Cudf: [FEA] Introduce `DecimalDtype` to cuDF

Created on 3 Nov 2020 · 3 Comments · Source: rapidsai/cudf

As a first step towards supporting decimal fixed-point arithmetic, we need to introduce a new dtype, `DecimalDtype`.

Requirements:

- Similar to other cuDF dtypes, it should inherit from `ExtensionDtype`. This requires us to define the abstract methods `name` and `type`. We could probably use `'decimal'` and `decimal.Decimal` respectively.
- Similar to PyArrow's `Decimal128Type`, it should be constructed from a `precision` and a `scale` attribute:
  ```
  >>> import cudf
  >>> dt = cudf.DecimalDtype(precision=4, scale=2)
  ```
- It should expose `.from_arrow()` and `.to_arrow()` methods that convert from and to PyArrow `Decimal128Type`.

cuDF (Python) feature request

shwina

🚀1

Should it have the explicit storage size in the name like PyArrow? libcudf has 32-bit and 64-bit now, with plans to add 128-bit in the future.

harrism on 4 Nov 2020

@kkraus14 should this aim for 0.18?

harrism on 4 Nov 2020

> Should it have the explicit storage size in the name like PyArrow? libcudf has 32-bit and 64-bit now, with plans to add 128-bit in the future.

In the call we had from the Python side I think we decided to only have a single DecimalDtype for now hooked up to DECIMAL64.

Then as a follow up we'll explore automatically handling type promotion / demotion based on the calculated precision value from operations.

If that exploration doesn't pan out we'll split the types into Decimal64Dtype and Decimal32Dtype.

> @kkraus14 should this aim for 0.18?

No I think we can safely do this in 0.17.

kkraus14 on 4 Nov 2020

👍3 🚀1
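
The type promotion / demotion idea mentioned in the comment above can be sketched roughly as follows. The precision and scale rules used here are common fixed-point conventions chosen for illustration, not rules confirmed anywhere in this thread, and `DecimalSpec`, `add_result`, `mul_result` and `storage_for` are hypothetical helper names. The only facts relied on are that a signed 32-bit integer can hold any 9-digit value and a signed 64-bit integer any 18-digit value.

```python
from typing import NamedTuple

MAX_PRECISION_32 = 9   # 10**9 - 1 fits in a signed 32-bit integer
MAX_PRECISION_64 = 18  # 10**18 - 1 fits in a signed 64-bit integer


class DecimalSpec(NamedTuple):
    precision: int  # total number of decimal digits
    scale: int      # digits after the decimal point


def add_result(a, b):
    # Assumed rule: keep the larger scale, keep the larger integer part,
    # and allow one extra digit for a carry.
    scale = max(a.scale, b.scale)
    integer_digits = max(a.precision - a.scale, b.precision - b.scale)
    return DecimalSpec(integer_digits + scale + 1, scale)


def mul_result(a, b):
    # Assumed rule: digit counts and scales both add under multiplication.
    return DecimalSpec(a.precision + b.precision, a.scale + b.scale)


def storage_for(spec):
    # Pick the narrowest storage width that can hold the precision.
    if spec.precision <= MAX_PRECISION_32:
        return "DECIMAL32"
    if spec.precision <= MAX_PRECISION_64:
        return "DECIMAL64"
    raise OverflowError(f"precision {spec.precision} needs more than 64 bits")


a = DecimalSpec(precision=4, scale=2)
total = add_result(a, a)          # e.g. 99.99 + 99.99 = 199.98
product = mul_result(a, a)        # e.g. 99.99 * 99.99 = 9998.0001
wider = mul_result(product, a)
```

Under these assumed rules, adding two `(4, 2)` decimals gives `(5, 2)`, multiplying them gives `(8, 4)`, which still fits in 32-bit storage, and multiplying once more gives `(12, 6)`, which would need the 64-bit type.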
