Pandas: read_excel throws ValueError: cannot specify usecols when specifying a multi-index header

Created on 26 Feb 2019  Â·  20Comments  Â·  Source: pandas-dev/pandas

df=pd.read_excel(xl_file, sheet_name=xl_sheet,header=[18,19], usecols='B,D:J,L')

Problem description

df=pd.read_excel(xl_file, sheet_name=xl_sheet,header=[18,19], usecols='B,D:J,L')

throws the following error after conda update:

ValueError: cannot specify usecols when specifying a multi-index header

The script run perfectly fine before.

Expected Output

Dataframe with multi-index columns, as before.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.1
pytest: 3.3.2
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.27.3
numpy: 1.15.4
scipy: None
pyarrow: None
xarray: None
IPython: 7.3.0
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 2.1.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml.etree: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Bug IO Excel

Most helpful comment

I also got the same issue

All 20 comments

Which version was this working on previously? Also can you share the relevant data from the file you are reading?

Hi William.

Thanks for your quick feedback.

From conda list --revisions my previous version was 0.23.1 (conda-forge)

Attached you find one example of the files I read.
They are several of them, all equal in structure.
It works with none of them since the update.

This worked as expected before the update.
I received a dataframe with multi-index columns as expected.

Matthias

Von: William Ayd [mailto:[email protected]]
Gesendet: Dienstag, 26. Februar 2019 17:16
An: pandas-dev/pandas pandas@noreply.github.com
Cc: Hufnagel, Matthias (TSHAM) matthias.hufnagel@kantar.com; Author author@noreply.github.com
Betreff: Re: [pandas-dev/pandas] read_excel throws ValueError: cannot specify usecols when specifying a multi-index header (#25449)

Which version was this working on previously? Also can you share the relevant data from the file you are reading?

—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pandas-2Ddev_pandas_issues_25449-23issuecomment-2D467503309&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=RZUnM6_s4ZHW66-t95Ga16y9iX9591ZVsvQhnJfceFE&s=-Jipmzfk8h5yeoPoyqacrxkZAQnYHcqgq91GtQyH88Y&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AMDlZGRWsflorXB07PihBxo-2Dbnu-2DjLwfks5vRV2-2DgaJpZM4bSbs2&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=RZUnM6_s4ZHW66-t95Ga16y9iX9591ZVsvQhnJfceFE&s=n2UslOArWjA8sk2Qvaks4VqHWLalOeo_jHY0KTTCPdI&e=.

Kantar Disclaimerhttp://www.kantar.com/disclaimer.html

Hi William.

As a follow up to your question about my installed version, I learned how to get back to a previous version of my conda environment.
Thanks for that in the first place.

Just to let you know:

I went back to my environment that included pandas 0.23.1 by running

C:\Users\hm113045\AppData\Local\Continuum\Anaconda3\Scripts

and read_excel works perfectly with multi-index header and usecols again.

Matthias

Von: William Ayd [mailto:[email protected]]
Gesendet: Dienstag, 26. Februar 2019 17:16
An: pandas-dev/pandas pandas@noreply.github.com
Cc: Hufnagel, Matthias (TSHAM) matthias.hufnagel@kantar.com; Author author@noreply.github.com
Betreff: Re: [pandas-dev/pandas] read_excel throws ValueError: cannot specify usecols when specifying a multi-index header (#25449)

Which version was this working on previously? Also can you share the relevant data from the file you are reading?

—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pandas-2Ddev_pandas_issues_25449-23issuecomment-2D467503309&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=RZUnM6_s4ZHW66-t95Ga16y9iX9591ZVsvQhnJfceFE&s=-Jipmzfk8h5yeoPoyqacrxkZAQnYHcqgq91GtQyH88Y&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AMDlZGRWsflorXB07PihBxo-2Dbnu-2DjLwfks5vRV2-2DgaJpZM4bSbs2&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=RZUnM6_s4ZHW66-t95Ga16y9iX9591ZVsvQhnJfceFE&s=n2UslOArWjA8sk2Qvaks4VqHWLalOeo_jHY0KTTCPdI&e=.

Kantar Disclaimerhttp://www.kantar.com/disclaimer.html

Just saw that I had something wrong in my clipboard, Sorry

I was running

conda install --revision N

Matthias

Von: Hufnagel, Matthias (TSHAM)
Gesendet: Mittwoch, 27. Februar 2019 10:08
An: 'pandas-dev/pandas' reply@reply.github.com
Betreff: AW: [pandas-dev/pandas] read_excel throws ValueError: cannot specify usecols when specifying a multi-index header (#25449)

Hi William.

As a follow up to your question about my installed version, I learned how to get back to a previous version of my conda environment.
Thanks for that in the first place.

Just to let you know:

I went back to my environment that included pandas 0.23.1 by running

C:\Users\hm113045\AppData\Local\Continuum\Anaconda3\Scripts

and read_excel works perfectly with multi-index header and usecols again.

Matthias

Von: William Ayd [mailto:[email protected]]
Gesendet: Dienstag, 26. Februar 2019 17:16
An: pandas-dev/pandas <[email protected]pandas@noreply.github.com>
Cc: Hufnagel, Matthias (TSHAM) [email protected]matthias.hufnagel@kantar.com>; Author <[email protected]author@noreply.github.com>
Betreff: Re: [pandas-dev/pandas] read_excel throws ValueError: cannot specify usecols when specifying a multi-index header (#25449)

Which version was this working on previously? Also can you share the relevant data from the file you are reading?

—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pandas-2Ddev_pandas_issues_25449-23issuecomment-2D467503309&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=RZUnM6_s4ZHW66-t95Ga16y9iX9591ZVsvQhnJfceFE&s=-Jipmzfk8h5yeoPoyqacrxkZAQnYHcqgq91GtQyH88Y&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AMDlZGRWsflorXB07PihBxo-2Dbnu-2DjLwfks5vRV2-2DgaJpZM4bSbs2&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=RZUnM6_s4ZHW66-t95Ga16y9iX9591ZVsvQhnJfceFE&s=n2UslOArWjA8sk2Qvaks4VqHWLalOeo_jHY0KTTCPdI&e=.

Kantar Disclaimerhttp://www.kantar.com/disclaimer.html

I don't think your attachment came through. Can you add to a comment directly instead?

I did close the issue by mistake, sorry.

From the excel files, I do not read the first worksheet, only the worksheets with min_/max_ prefix that are equal in structure .

Can you add code that is completely copy / pastable to reproduce?

Sorry, William.

Here is the code that should produce the error message if the Excel file I uploaded earlier resides in the same folder.
I could not validate if this really throws the reported error at the moment, as I downgraded my environment back to pandas 0.23.1 where I had (and still have) no issues.

import pandas as pd
df=pd.read_excel('BugReportSample.xlsx', sheet_name='max_brand',header=[18,19], usecols='B,D:J,L')

Strange - the code in the traceback hasn't been touched in 6 years. I also created a 0.23.1 environment locally and got the same error.

Can you post the output of pd.show_versions()?

I just got this same error when I went from 23.4 to 24.0, reverting fixed it.

here is my code & error message

def load_data(i_file_name=file_1):
    """reads i_file excel spreadsheet into df"""
    data = pd.read_excel((DATA_FOLDER + i_file_name), 
                     sheet_name="historical data by country", 
                     header=[15, 16], 
                     skip_rows=17,
                     usecols="G:CO",  #ValueError: cannot specify usecols when specifying a multi-index header
                     index_col = None, #line15
                    )
    return data

>>>load_data()

File "cell_impact_DF.py", line 121, in <module>
    data = load_data()
  File "cell_impact_DF.py", line 15, in load_data 
    index_col = None,
  File "...lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "...lib\site-packages\pandas\util\_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "...lib\site-packages\pandas\io\excel.py", line 375, in read_excel
    **kwds)
  File "...lib\site-packages\pandas\io\excel.py", line 718, in parse
    **kwds)
  File "...lib\site-packages\pandas\io\excel.py", line 601, in parse
    **kwds)
  File "...lib\site-packages\pandas\io\parsers.py", line 2156, in TextParser
    return TextFileReader(*args, **kwds)
  File "...lib\site-packages\pandas\io\parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "...lib\site-packages\pandas\io\parsers.py", line 1132, in _make_engine
    self._engine = klass(self.f, **self.options)
  File "...lib\site-packages\pandas\io\parsers.py", line 2171, in __init__
    ParserBase.__init__(self, kwds)
  File "...lib\site-packages\pandas\io\parsers.py", line 1391, in __init__
    raise ValueError("cannot specify usecols when "
ValueError: cannot specify usecols when specifying a multi-index header

Hi William.

Thanks for taking care of this.

I posted the show_versions() output in the first place, I attach it to the mail below.

Just to clarify (as you wrote that you have the same error using pandas 0.23.1)

I see the reported error only when I use the new 0.24.1 pandas version

After I went back to 0.23.1 my code runs fine.

Matthias

Output of show_versions (taken from my posting in github as my current installation is different now, as I explained)
INSTALLED VERSIONS
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.1
pytest: 3.3.2
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.27.3
numpy: 1.15.4
scipy: None
pyarrow: None
xarray: None
IPython: 7.3.0
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 2.1.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml.etree: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Von: William Ayd [mailto:[email protected]]
Gesendet: Freitag, 1. März 2019 18:39
An: pandas-dev/pandas pandas@noreply.github.com
Cc: Hufnagel, Matthias (TSHAM) matthias.hufnagel@kantar.com; State change state_change@noreply.github.com
Betreff: Re: [pandas-dev/pandas] read_excel throws ValueError: cannot specify usecols when specifying a multi-index header (#25449)

Strange - the code in the traceback hasn't been touched in 6 years. I also created a 0.23.1 environment locally and got the same error.

Can you post the output of pd.show_versions()?

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pandas-2Ddev_pandas_issues_25449-23issuecomment-2D468747669&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=iX21_SrtmkrwWrb5pjWeDC0AL1m3X_IzVMJ6mDMDCDo&s=3gYCW5CPIdziZqzyrWCWntaYbLs7_UD4RLnSH4H5v_k&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AMDlZM-5FJydRX70vNHmbHXTDy7iS3lR5-5Fks5vSWXIgaJpZM4bSbs2&d=DwMCaQ&c=zdK58V2JKULZdB8nuBRpog&r=viu6wFhKUAAQyfCRfccCyIRRc8exHnCfucKtObjpDexOlfGP1yLhpyPweI2b5iGT&m=iX21_SrtmkrwWrb5pjWeDC0AL1m3X_IzVMJ6mDMDCDo&s=9psaZU_NrC1suKUyUWQ0JZ1ZsxMRdfCz4R5PfGmxA7I&e=.

Kantar Disclaimerhttp://www.kantar.com/disclaimer.html

Hi William.

Is there any update on this? I was really happy and thankful that you did take care of this. This strange behaviour is keeping me from updateing my environment now. Would be great to know if there is a chance that this might get fixed in a future release.
Any feedback highly appreciated.
Again, thanks for taking care in the first place.

Kind regards

Matthias

@matthiastns we have almost 3000 issues and are all a small group of volunteers - in order to have things fixed a PR would advance the cause

Thanks a lot Jeff. Highly appreciated. I'm not a native speaker and I hope I was able to express how much I appreciate that all of you take care of this. At least I tried to do that in my last post. Sorry if it nevertheless came over as too much demanding or even offending.

Unfortunately I'm not quite sure what a PR is. I would initiate(?) one, if necessary. Something like an official request? Not sure either, if this is appropriate in this case or if this will be received like calling for emergency for a minor issue.
It's hard to guess from outside.

@matthiastns no, it would be a patch for your issue, see here: http://pandas.pydata.org/pandas-docs/stable/development/contributing.html

this is something you or others in the community would need to do.

@jreback
Thanks for your immidiate feedback.
So, from going through the linked documentation, I guess "PR" stands for "pull request".

This is a real challange for me. I have some experience with git but I'm not an software developper in the first place.
So I have no experience with common development techniques, methods and ways of working.

But I will think about it. Could be also rewarding to dive into that.
My main concern is, that finding the cause of my particular issue might be really tricky. Why? As William stated earlier, the code of the function that causes the error has not been touched for 6 years. If this is true, we might have a side effect from somewhere else. My feeling is that diving into this will be a rough task to solve for a code development newbie with no experience in the pandas code base at all.

Has there been any update on this issue? @matthiastns have you managed to resolve this?

Any help will be greatly appreciated!

I also got the same issue

Was this page helpful?
0 / 5 - 0 ratings