Pandas: Wrong slicing behavior for DataFrame loc

Created on 17 Dec 2016 · 5 Comments · Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

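>>> import pandas as pd
>>> # construction assumed; the report showed only the printed frame
>>> df = pd.DataFrame([[1, 21, 51, 61], [2, 22, 52, 62], [3, 23, 53, 63]])
>>> df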
   0   1   2   3
0  1  21  51  61
1  2  22  52  62
2  3  23  53  63
>>> df[0:0]
Empty DataFrame
Columns: [0, 1, 2, 3]
Index: []
>>> df.loc[0:0]
   0   1   2   3
0  1  21  51  61

For loc, slicing is incorrect

The slicing behavior of loc is inconsistent with Python list slicing.
For [0:0] it should have returned an empty DataFrame, but it returns a row.

Expected Output

Like the Python list code shown below, pd.DataFrame.loc slicing should produce an empty result.

E.g. the following code yields an empty slice for [0:0]:

>>> l = [0,1,2]
>>> l[0:0]
[]

Output of pd.show_versions()

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.1.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.1
nose: None
pip: 9.0.1
setuptools: 25.2.0
Cython: None
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.2
pytz: 2016.3
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: 2.2.0-b1
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

Indexing Usage Question

All 5 comments

http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label

label slicing always includes the endpoint

positional (iloc) slicing matches python semantics
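To make the two rules concrete, here is a minimal sketch that rebuilds the frame from the report and slices it both ways. The construction of `df` is an assumption (the report shows only its printed form), and the variable names are illustrative.

```python
import pandas as pd

# Assumed construction: reproduces the 3x4 frame printed in the report.
df = pd.DataFrame([[1, 21, 51, 61], [2, 22, 52, 62], [3, 23, 53, 63]])

# .loc slices by label, and both endpoints are included,
# so 0:0 selects the row whose label is 0.
by_label = df.loc[0:0]

# .iloc slices by position, and the stop is excluded,
# so 0:0 selects nothing, as with a Python list.
by_position = df.iloc[0:0]

print(len(by_label))     # one row
print(len(by_position))  # no rows
```

The labels here happen to equal the positions, which is why the two slices look like they should agree.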

Isn't the example below producing an empty slice?
Shouldn't loc be consistent with this behavior?

>>> l = [0,1,2]
>>> l[0:0]
[]

read the docs

What would we lose by being consistent?
Just because we put something in the documentation doesn't make it right.

Performance over consistent behavior is not even a contest.

you are missing the point

.iloc is positional indexing
.loc is label indexing

these are two separate and distinct ways of selecting data
numpy and python only have positional concepts and thus have only 1 way of indexing

pandas can also index by labels; since these can be for example strings (or datetimes or integers)
you now have different semantics
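A small sketch of what label slicing looks like with string labels, where "one before the stop" has no obvious meaning in general. The Series and its labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data: three values labelled with strings.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Label slice: everything from "a" through "b", endpoint included.
a_to_b = s.loc["a":"b"]

# A slice whose start and stop are the same label still selects that label.
only_a = s.loc["a":"a"]

# Positional slice over the same data: the stop position is excluded.
first_pos = s.iloc[0:1]

print(a_to_b.tolist())
print(only_a.tolist())
print(first_pos.tolist())
```

Under this reading, `df.loc[0:0]` in the report is the integer-label counterpart of `s.loc["a":"a"]`.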

again the docs are very complete on these concepts

and if you choose not to use label indexing, then just use .iloc and it will feel like python/numpy
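As a rough illustration of that last point, the sketch below uses an index whose labels differ from the positions, so the two kinds of slice cannot be confused. The column name `x` and the labels 10, 20, 30 are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

values = np.array([1, 2, 3])

# Hypothetical frame: integer labels that are not the positions 0, 1, 2.
frame = pd.DataFrame({"x": values}, index=[10, 20, 30])

# Positional slicing agrees between numpy and .iloc: the stop is excluded.
arr_slice = values[0:2]
pos_slice = frame.iloc[0:2]
empty_pos = frame.iloc[0:0]

# Label slicing looks up the labels 10 and 20 and includes both.
label_slice = frame.loc[10:20]

print(arr_slice.tolist())
print(pos_slice["x"].tolist())
print(len(empty_pos))
print(label_slice.index.tolist())
```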
