Machinelearningnotebooks: Cannot read TabularDataset - DatasetExecutionError

Created on 7 Feb 2020  路  2Comments  路  Source: Azure/MachineLearningNotebooks

When I copy paste this sample code to read a tabular dataset in ML it doesn't work and it is really frustrating. I checked two hours on the documentation, I tried a lot of things and all the documentation seem deprecated. ... I checked as well the version of azureml-core (1.076)

# azureml-core of version 1.0.72 or higher is required
from azureml.core import Workspace, Dataset

subscription_id = '<subscription_id>'
resource_group = '<resource_group>'
workspace_name = '<workspace_name>

workspace = Workspace(subscription_id, resource_group, workspace_name)

dataset = Dataset.get_by_name(workspace, name='csv_file')
dataset.to_pandas_dataframe()

And I got this error message:

2020-02-07 10:05:10.961893 | ActivityCompleted: Activity=to_pandas_dataframe, HowEnded=Failure, Duration=703.32 [ms], Info = {'activity_id': '99688bb1-8e33-4f48-85b5-15185a3c6563', 'activity_name': 'to_pandas_dataframe', 'activity_type': 'PublicApi', 'app_name': 'TabularDataset', 'source': 'azureml.dataset', 'version': '1.0.76', 'completionStatus': 'Success', 'durationMs': 0.05}, Exception=DatasetExecutionError; 'MultiIndex' object has no attribute 'labels'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/data/dataset_error_handling.py in _try_execute(action, **kwargs)
     82         else:
---> 83             return action()
     84     except Exception as e:

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/dataprep/api/_loggerfactory.py in wrapper(*args, **kwargs)
    130                 try:
--> 131                     return func(*args, **kwargs)
    132                 except Exception as e:

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/dataprep/api/dataflow.py in to_pandas_dataframe(self, extended_types, nulls_as_nan)
    679             try:
--> 680                 return get_dataframe_reader().complete_incoming_dataframe(random_id)
    681             except _InconsistentSchemaError as e:

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/dataprep/api/_dataframereader.py in complete_incoming_dataframe(self, dataframe_id)
    245         from pyarrow import feather
--> 246         df = pyarrow.feather.concat_tables(partitions_dfs).to_pandas(use_threads=True)
    247         return df

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/pyarrow/table.pxi in pyarrow.lib.Table.to_pandas()

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, memory_pool, categories)
    620     # ARROW-1751: flatten a single level column MultiIndex for pandas 0.21.0
--> 621     columns = _flatten_single_level_multiindex(columns)
    622 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/pyarrow/pandas_compat.py in _flatten_single_level_multiindex(index)
    751         levels, = index.levels
--> 752         labels, = index.labels
    753 

AttributeError: 'MultiIndex' object has no attribute 'labels'

During handling of the above exception, another exception occurred:

DatasetExecutionError                     Traceback (most recent call last)
<ipython-input-17-7c5efbb67be1> in <module>
----> 1 dataset.take(4).to_pandas_dataframe()

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/data/_loggerfactory.py in wrapper(*args, **kwargs)
     76             with _LoggerFactory.track_activity(logger, func.__name__, activity_type, custom_dimensions) as al:
     77                 try:
---> 78                     return func(*args, **kwargs)
     79                 except Exception as e:
     80                     if hasattr(al, 'activity_info') and hasattr(e, 'error_code'):

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/data/tabular_dataset.py in to_pandas_dataframe(self)
    138         """
    139         dataflow = get_dataflow_for_execution(self._dataflow, 'to_pandas_dataframe', 'TabularDataset')
--> 140         df = _try_execute(dataflow.to_pandas_dataframe)
    141         return df
    142 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/data/dataset_error_handling.py in _try_execute(action, **kwargs)
     83             return action()
     84     except Exception as e:
---> 85         raise DatasetExecutionError(str(e))

DatasetExecutionError: 'MultiIndex' object has no attribute 'labels'
MLOps doc-bug

Most helpful comment

Hi, Anis

Looks like you have incompatible versions of pandas and pyarrow libraries.
Pandas library deprecated labels in version 0.24 and pyarrow stopped using it too from version 0.12.0.

Could you run pip list and share versions of pandas and pyarrow, please so we can investigate more?

Also, could you try to update those packages by running pip install --upgrade pyarrow pandas and re-run the sample code?

All 2 comments

Hi, Anis

Looks like you have incompatible versions of pandas and pyarrow libraries.
Pandas library deprecated labels in version 0.24 and pyarrow stopped using it too from version 0.12.0.

Could you run pip list and share versions of pandas and pyarrow, please so we can investigate more?

Also, could you try to update those packages by running pip install --upgrade pyarrow pandas and re-run the sample code?

@myshylin

Hi, Anis

Looks like you have incompatible versions of pandas and pyarrow libraries.
Pandas library deprecated labels in version 0.24 and pyarrow stopped using it too from version 0.12.0.

Could you run pip list and share versions of pandas and pyarrow, please so we can investigate more?

Also, could you try to update those packages by running pip install --upgrade pyarrow pandas and re-run the sample code?

Awesome the pip upgrade command worked. I close the issue. Thank you !

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ahyerman picture ahyerman  路  3Comments

jarandaf picture jarandaf  路  4Comments

AakanchJoshi picture AakanchJoshi  路  4Comments

dumbledad picture dumbledad  路  3Comments

tylercmsft picture tylercmsft  路  4Comments