Pandas: nlargest on a boolean Series returns False first

Created on 19 Apr 2019 · 6 Comments · Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd
pd.Series([True, False]).nlargest(1)

1 False
dtype: bool

Problem description

If you cast the Series to any other dtype, you get the opposite, expected order.
More generally, in Python True > False, and that ordering holds everywhere else in pandas.

df = pd.DataFrame({'True': True, 'False': [False] * 2})
df['True'] > df['False']

The result is misleading and can easily lead to a long debugging session.
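The cast mentioned above can be sketched as follows. This is only a possible workaround, not a fix; the variable names are illustrative, and the behaviour of the boolean case on pandas 0.24.2 is the one reported in this issue.

```python
import pandas as pd

s = pd.Series([True, False])

# Plain Python orders booleans as expected.
assert True > False

# Casting to an integer dtype before calling nlargest avoids the
# boolean code path: True becomes 1, False becomes 0.
as_int = s.astype(int).nlargest(1)

# Cast back to bool if a boolean result is needed.
as_bool = as_int.astype(bool)
print(as_bool)
```

Here the row at index 0 (the original True) is selected, which is the order the report expects from the boolean Series itself.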

  • [X] Check for duplicate
  • [X] Pandas 0.24.2

Expected Output

0 True
dtype: bool

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-17763-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: None
pip: 9.0.1
setuptools: 40.6.3
Cython: 0.26.1
numpy: 1.16.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: 0.5.0
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.8
xlrd: None
xlwt: None
xlsxwriter: 1.1.2
lxml.etree: 4.2.5
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Algos Bug

All 6 comments

Makes sense. Investigation and PRs are always welcome.

This is interesting. Let me take a look.

Looks similar to closed issue #21426. Had a quick look today but didn't have time to fix it.

@bpieper26 Thanks for pointing that out. The problem is that a boolean array is converted to a uint array.
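To illustrate why an unsigned conversion could flip the order, here is a small NumPy sketch. It assumes that the "n largest" selection is done by negating the values and then picking the smallest ones; that is an assumption about the internals, not something confirmed in this thread, and the uint8 dtype is chosen only for readability.

```python
import numpy as np

flags = np.array([True, False])

# Assumed step: the boolean array is converted to an unsigned integer array.
as_uint = flags.astype(np.uint8)            # [1, 0]

# Negating unsigned integers wraps around instead of going below zero.
negated_uint = np.negative(as_uint)         # [255, 0]

# Picking the smallest negated value now selects index 1 (False).
picked_unsigned = int(np.argsort(negated_uint)[0])

# With a signed dtype the same trick behaves as intended.
negated_int = np.negative(flags.astype(np.int8))   # [-1, 0]
picked_signed = int(np.argsort(negated_int)[0])

print(picked_unsigned, picked_signed)
```

Under that assumption, the unsigned path picks the position of False while the signed path picks the position of True, which matches the symptom reported above.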

Thanks for your help :)

My pleasure!
