Cudf: [QST] Dask does not support librmm_cffi

Created on 24 Jun 2019 · 2 Comments · Source: rapidsai/cudf

What is your question?
I use librmm.device_array to write nvcategory values into a device array, and I use Dask to map the function onto the workers. However, I get the error below.

distributed.protocol.pickle - INFO - Failed to serialize . Exception: can't pickle CompiledFFI objects

It seems that a function that uses librmm.device_array cannot be serialized.
I would really appreciate any suggestions about this problem.

? - Needs Triage question

All 2 comments

@MikeChenfu, you can import librmm inside the function where you do the categorization.

I ran into a very similar issue (https://github.com/rapidsai/dask-cudf/issues/269), and that solved the problem.

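# Plain modules can stay at module level.
import cudf
import numpy as np
import nvcategory
import nvstrings

# Shift the librmm import to inside the function so the function no longer
# references a module-level object that cannot be pickled.
def categorize(df_part, uniques):
    from librmm_cffi import librmm
    for col in uniques:
        # Re-key the column's categories to the shared unique values.
        keys = nvstrings.to_device(uniques[col])
        cat = nvcategory.from_strings(df_part[col].data).set_keys(keys)
        # Allocate an int32 device buffer and let nvcategory write the codes into it.
        device_array = librmm.device_array(df_part[col].data.size(), dtype=np.int32)
        cat.values(devptr=device_array.device_ctypes_pointer.value)
        df_part[col] = cudf.Series(device_array)
    return df_part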
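
For context, here is a minimal sketch of why moving the import helps, as far as I understand it. It uses only the standard library: threading.Lock is a stand-in for the CompiledFFI-backed librmm object (both refuse to be pickled), plain pickle stands in for Dask's serializer, and the names can_pickle, eager_payload, lazy_payload and run_on_worker are made up for the illustration. The idea is that anything shipped along with the task must be picklable, so the unpicklable object should be obtained where the work runs instead of travelling with it.

```python
import pickle
import threading


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Stand-in for the librmm object: a lock, like a CompiledFFI object,
# cannot be pickled.
unpicklable_handle = threading.Lock()

# Eager: the handle travels with the task payload, so serialization fails.
eager_payload = {"values": [1, 2, 3], "handle": unpicklable_handle}

# Lazy: only plain data travels with the task.
lazy_payload = {"values": [1, 2, 3]}


def run_on_worker(payload):
    # The handle is created where the work runs and is never serialized,
    # which mirrors moving the import inside the function.
    handle = threading.Lock()
    with handle:
        return [v * 2 for v in payload["values"]]


print(can_pickle(eager_payload))  # False
print(can_pickle(lazy_payload))   # True
print(run_on_worker(pickle.loads(pickle.dumps(lazy_payload))))  # [2, 4, 6]
```

The same reasoning applies to categorize: with the import inside the function body, librmm is looked up on the worker when the function runs, so it does not need to be serialized together with the function.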

Thanks @VibhuJawa. It resolves my issue.
