Pandas: items returned by itertuples do not act like namedtuples.

Created on 12 Apr 2016  ·  4 Comments  ·  Source: pandas-dev/pandas

I don't know if this is a doc bug, a code bug, or an expectation error on my part.
The documentation of itertuples says this function will "Iterate over DataFrame rows as namedtuples", but the resulting items are not namedtuples and cannot be indexed by name like a namedtuple.

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
                     index=['a', 'b'])
for row in df.itertuples():
    print(row)
    print(row['col1'])

$ python pandasbug.py
Pandas(Index='a', col1=1, col2=0.10000000000000001)
Traceback (most recent call last):
  File "pandasbug.py", line 8, in <module>
    print(row['col1'])
TypeError: tuple indices must be integers, not str

Expected Output

$ python pandasbug.py 
Pandas(Index='a', col1=1, col2=0.10000000000000001)
1
Pandas(Index='b', col1=2, col2=0.20000000000000001)
2

output of pd.show_versions()

$ python -c 'import pandas as pd; pd.show_versions()'

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: None
pip: 8.1.1
setuptools: 2.2
Cython: None
numpy: 1.11.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.2
pytz: 2016.3
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext)
jinja2: None
boto: None

Labels: Reshaping, Usage Question

All 4 comments

I don't think namedtuples can be indexed by name?


In [38]: namedtuple('y', list('abc'))(1,2,3)['a']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-b6fb7ee08b27> in <module>()
----> 1 namedtuple('y', list('abc'))(1,2,3)['a']

TypeError: tuple indices must be integers or slices, not str

Namedtuples restrict their field names (e.g. no spaces), and lookup by name works only through attribute access. getitem access accepts integer positions, not names.
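To make the distinction concrete, here is a minimal sketch using only the standard library. The `Point` type and its field names are made up for illustration and are not part of pandas.

```python
from collections import namedtuple

# Hypothetical namedtuple type, used only for illustration.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

by_attribute = p.x            # lookup by name goes through attribute access
by_position = p[0]            # getitem works, but only with integers or slices
by_getattr = getattr(p, "y")  # attribute access with the name held in a string
as_dict = p._asdict()["y"]    # _asdict() returns a mapping keyed by field name

# Indexing with a string raises TypeError, as in the traceback above.
try:
    p["x"]
    string_index_failed = False
except TypeError:
    string_index_failed = True
```

If dictionary-style access is what you want, `_asdict()` is probably the closest match, at the cost of building a new mapping for each tuple.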

Ah, so an expectation bug. I should have been using print(row.col1), which works fine.

You can simply use this instead to get around the lack of getitem access with namedtuples:
getattr(row, "col1")
I think that's the equivalent of row.col1, but it lets you pass the field name in a variable, just as you would with row["col1"].
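Applied to the DataFrame from the original report, that suggestion might look like the sketch below. The variable name `column` is just an illustration, and the `_asdict()` variant is an alternative I'm adding, not something proposed in the thread.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [0.1, 0.2]},
                  index=["a", "b"])

column = "col1"  # column name held in a variable

# getattr looks the field up by name on each row tuple.
values = [getattr(row, column) for row in df.itertuples()]

# Alternative: convert each row to a mapping and index it by name.
dict_values = [row._asdict()[column] for row in df.itertuples()]

# The index label is exposed as the "Index" field.
labels = [row.Index for row in df.itertuples()]
```

One caveat: this only works when the column name survives as a field name. According to the pandas documentation, column names that are not valid Python identifiers, are repeated, or start with an underscore are renamed to positional names.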
