[x] I have checked that this issue has not already been reported.
[x] I have confirmed this bug exists on the latest version of pandas.
[ ] (optional) I have confirmed this bug exists on the master branch of pandas.
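import numpy as np
import pandas as pd

# Treat +/-inf as missing values globally; the failure only occurs with this option set.
pd.set_option('use_inf_as_na', True)

# Write a small frame containing a NaN to CSV, then read it back.
pd.DataFrame({'test_data': [1, 3, 4, np.nan]}).to_csv('test_data.csv', na_rep='NaN')
pd.read_csv('test_data.csv', sep=',', na_values='NaN')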
Causes ValueError:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/io/parsers.py", line 458, in _read
data = parser.read(nrows)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/io/parsers.py", line 1201, in read
df = DataFrame(col_dict, columns=columns, index=index)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/frame.py", line 467, in __init__
mgr = init_dict(data, index, columns, dtype=dtype)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 250, in init_dict
missing = arrays.isna()
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/series.py", line 4795, in isna
return super().isna()
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/generic.py", line 7109, in isna
return isna(self).__finalize__(self, method="isna")
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/dtypes/missing.py", line 124, in isna
return _isna(obj)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/dtypes/missing.py", line 157, in _isna
return _isna_ndarraylike(obj, inf_as_na=inf_as_na)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/dtypes/missing.py", line 218, in _isna_ndarraylike
result = _isna_string_dtype(values, dtype, inf_as_na=inf_as_na)
File "/home/lena/anaconda3/envs/pandas_pip2/lib/python3.8/site-packages/pandas/core/dtypes/missing.py", line 246, in _isna_string_dtype
vec = libmissing.isnaobj_old(values.ravel())
File "pandas/_libs/missing.pyx", line 160, in pandas._libs.missing.isnaobj_old
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
NaNs are no longer recognized as expected when the pandas option use_inf_as_na is set to True. This first occurred after upgrading to pandas 1.1.0.
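Until the regression is fixed, one possible workaround is to leave use_inf_as_na at its default (False) and convert infinities to NaN explicitly after reading. The following is only a sketch under that assumption; the helper name read_csv_inf_as_nan is hypothetical and not part of pandas, and it only affects the frame it returns, not other isna() calls.

```python
from io import StringIO

import numpy as np
import pandas as pd


def read_csv_inf_as_nan(source, **kwargs):
    """Read a CSV with the default options, then map +/-inf to NaN.

    Hypothetical helper: it approximates the effect of use_inf_as_na
    for a single frame without touching the global option.
    """
    df = pd.read_csv(source, **kwargs)
    return df.replace([np.inf, -np.inf], np.nan)


# Round-trip a frame that contains both an infinity and a NaN.
csv_text = pd.DataFrame({'test_data': [1, np.inf, 4, np.nan]}).to_csv(na_rep='NaN')
df = read_csv_inf_as_nan(StringIO(csv_text), index_col=0, na_values='NaN')
print(df['test_data'].isna())
```

The explicit replace keeps the conversion local to the data being loaded, which sidesteps the code path shown in the traceback.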
Output of pd.show_versions():
commit : d9fff2792bf16178d4e450fe7384244e50635733
python : 3.8.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-42-generic
Version : #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.1.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2
setuptools : 49.2.0.post20200712
Cython : None
pytest : 6.0.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
Thanks @mhaselsteiner for the report.
Regarding "Occurred first after upgrading to pandas 1.1.0": I can confirm this works in 1.0.5, so marking as a regression.
>>> pd.__version__
'1.0.5'
>>>
>>> pd.set_option('use_inf_as_na', True)
>>> df = pd.DataFrame({'test_data':[1,3,4,np.nan]})
>>> data = df.to_csv(na_rep='NaN')
>>> print(data)
,test_data
0,1.0
1,3.0
2,4.0
3,NaN
>>>
>>> from io import StringIO
>>> pd.read_csv(StringIO(data),sep=',' ,na_values='NaN')
   Unnamed: 0  test_data
0           0        1.0
1           1        3.0
2           2        4.0
3           3        NaN
>>>
This issue starts to occur with #33656. cc @dsaxton
678a9ac7c198513367f6f1180c5fd2bf6bc6949b is the first bad commit
commit 678a9ac7c198513367f6f1180c5fd2bf6bc6949b
Author: Daniel Saxton <[email protected]>
Date: Sun May 10 12:12:45 2020 -0500
BUG: Fix StringArray use_inf_as_na bug (#33656)
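The traceback ends in isnaobj_old with NumPy's "truth value of an array ... is ambiguous" error, and init_dict calls isna() on a container of whole column arrays. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the inf check compares each element with inf without first checking that the element is a float, so an element that is itself an ndarray yields an array inside a boolean expression. The pure-Python sketch below illustrates that failure mode; isna_naive and isna_guarded are hypothetical stand-ins, not the pandas implementation.

```python
import numpy as np


def isna_naive(values):
    """Hypothetical check: flags None, NaN and +/-inf with no type guard."""
    result = np.empty(len(values), dtype=bool)
    for i, val in enumerate(values):
        # If val is an ndarray, `val == np.inf` is an array and `or`
        # asks for its truth value, which raises ValueError.
        result[i] = val is None or val == np.inf or val == -np.inf or val != val
    return result


def isna_guarded(values):
    """Hypothetical check: only float scalars are tested for NaN or inf."""
    result = np.empty(len(values), dtype=bool)
    for i, val in enumerate(values):
        if val is None:
            result[i] = True
        elif isinstance(val, (float, np.floating)):
            result[i] = np.isnan(val) or np.isinf(val)
        else:
            result[i] = False
    return result


# An object array whose elements are whole columns, similar in shape
# to what init_dict passes to isna() in the traceback.
columns = np.empty(2, dtype=object)
columns[0] = np.array([0, 1, 2, 3])
columns[1] = np.array([1.0, 3.0, 4.0, np.nan])

try:
    isna_naive(columns)
except ValueError as err:
    print("naive check failed:", err)

print(isna_guarded(columns))
```

If this reading is right, a type guard of this kind in the inf-as-NA path would restore the 1.0.5 behavior for array-valued elements while still flagging float infinities.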