Incubator-superset: pyarrow does not know how to serialize objects of type

Created on 15 Oct 2019 · 7 Comments · Source: apache/incubator-superset

SQL Lab and visualizations don't work because pyarrow can't serialize some objects.

Expected results

return query result

Actual results

pyarrow does not know how to serialize objects of type <class 'pandas.core.arrays.integer.IntegerArray'>.

How to reproduce the bug

  1. Go to sql_lab
  2. Run a query like SELECT *
  3. See error

Environment

  • superset version: branch master on 2019-10-12 18:38
  • python version: 3.7
  • node.js version: 10.16.3
  • npm version: 6.9.0

Checklist

Make sure these boxes are checked before submitting your issue - thank you!

  • [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.
#bug inactive

All 7 comments

Issue-Label Bot is automatically applying the label #bug to this issue, with a confidence of 0.84. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

Please share database type and column types of the table causing the error.

It doesn't matter! This error happens in this code:

data = (
    pa.default_serialization_context()
    .serialize(cdf.raw_df)
    .to_buffer()
    .to_pybytes()
)

Here is the reference URL in sql_lab.py:
https://github.com/apache/incubator-superset/blob/master/superset/sql_lab.py#L275

You can also reproduce this issue with the following code:

import pandas as pd
import pyarrow as pa

# Nullable integer array (a pandas extension type); serializing it raises the error
int_array = pd.array([1, None, 3], dtype=pd.Int32Dtype())
pa.default_serialization_context().serialize(int_array).to_buffer().to_pybytes()

Here is the Arrow issue tracker URL where I reported this bug:
https://issues.apache.org/jira/browse/ARROW-6900?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=16952842#comment-16952842
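The reproduction above points at pandas nullable-integer (extension) arrays as the trigger. As a diagnostic aid, the sketch below lists which columns of a DataFrame use an extension dtype and casts them to plain object dtype. The helper names `extension_columns` and `cast_extension_columns_to_object` are hypothetical, and this is not the fix applied in Superset. Whether the converted frame then serializes depends on the pyarrow version in use, so treat it only as a way to narrow down the offending columns.

```python
import pandas as pd
from pandas.api.extensions import ExtensionDtype


def extension_columns(df):
    """Return names of columns backed by a pandas extension dtype (e.g. Int32)."""
    # Hypothetical helper; assumes column names are unique.
    return [col for col in df.columns if isinstance(df[col].dtype, ExtensionDtype)]


def cast_extension_columns_to_object(df):
    """Return a copy of df with extension-dtype columns cast to object dtype."""
    out = df.copy()
    for col in extension_columns(out):
        out[col] = out[col].astype(object)
    return out


# Example frame: "a" is a nullable Int32 column, "b" is an ordinary float column.
df = pd.DataFrame(
    {
        "a": pd.array([1, None, 3], dtype=pd.Int32Dtype()),
        "b": [1.0, 2.0, 3.0],
    }
)
flagged = extension_columns(df)
converted = cast_extension_columns_to_object(df)
```

Here `flagged` names the columns worth inspecting, and `converted` is a copy in which those columns no longer use an extension dtype.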

Thanks for reporting this issue. This may be related to https://github.com/apache/incubator-superset/issues/8225. Are you using Presto by chance?

You can disable msgpack and pyarrow serialization by setting RESULTS_BACKEND_USE_MSGPACK = False in superset_config.py until this issue is resolved.
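The workaround above amounts to a single line of configuration. A minimal sketch, assuming your superset_config.py is already picked up by Superset and nothing else overrides the setting:

```python
# superset_config.py
# Workaround from the comment above: turn off msgpack/pyarrow serialization
# of query results until the underlying issue is resolved.
RESULTS_BACKEND_USE_MSGPACK = False
```

Restart Superset after changing the file so the new value is loaded.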

Yes. I'm using Presto. But I tried the sqlite too.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@blcksrx This should now be resolved via https://github.com/apache/incubator-superset/pull/8733
