Pandas: DOC: DataFrame.update() fails to change values in calling dataframe if new value is NaN

Created on 1 Jul 2017 · 5 Comments · Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# Define dataframe
tempDF1 = pd.DataFrame({'c1':[1,2,3,4,5],
                       'c2':[True,True,False,False,True]})

# Select part of original dataframe
tempDF2 = tempDF1.loc[tempDF1['c2']==True,:].copy()

print('Original dataframe')
print(tempDF1)
print('\nSelection of dataframe')
print(tempDF2)

# Make some changes to selection
tempDF2.loc[:,'c1'] = tempDF2.loc[:,'c1'] + 100
tempDF2.iloc[2,0] = np.nan

# Update original dataframe with new values
tempDF1.update(tempDF2,overwrite=True)

print('\nSelection of dataframe - with changes')
print(tempDF2)
print('\nUpdated original dataframe')
print(tempDF1)

Problem description

DataFrame.update() does not overwrite existing values with NaN values from the passed dataframe. Non-NaN values are copied into the original dataframe without issue, although the dtype of the updated column changes in the process (int64 becomes float64).

Expected Output

Expected output would be

      c1     c2
0  101   True
1  102   True
2    3  False
3    4  False
4  NaN   True

However, actual output is:

      c1     c2
0  101.0   True
1  102.0   True
2    3.0  False
3    4.0  False
4    5.0   True

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.4.6.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.2
pytest: None
pip: 9.0.1
setuptools: 34.3.3
Cython: None
numpy: 1.13.0
scipy: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.4.7
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Labels: Docs, Missing-data, good first issue

All 5 comments

I think that's what it's supposed to do, though. From the documentation:

Modify DataFrame in place using non-NA values from passed DataFrame.
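
A minimal sketch of what that sentence means in practice. The names original and changes are made up for illustration, and the column is created as float so that no dtype conversion gets in the way:

```python
import numpy as np
import pandas as pd

# Hypothetical frames for illustration only
original = pd.DataFrame({"c1": [1.0, 2.0, 3.0]})
changes = pd.DataFrame({"c1": [10.0, np.nan]}, index=[0, 2])

# update() aligns on the index and copies only the non-NA values
original.update(changes)

# Row 0 takes the new value 10.0; row 2 keeps 3.0 because the
# incoming value there is NaN; row 1 is absent from changes
result = original["c1"].tolist()
print(result)  # [10.0, 2.0, 3.0]
```

So the NaN in changes is treated as "no new value here", not as a value to write.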

Some examples in the docstring would be great for this. @lvphj, want to do a PR?

You're right. Sorry, somehow I missed that in the docs. Yes, I'll add some examples to the docs and send a PR.

Is there a parameter to track NaN values and change old DF values to NaN in the update function?

Yes, what if I want to force NaN values in the update function?
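
Nothing in this thread points to an update() parameter for this. One possible workaround, sketched here under assumptions, is to assign through .loc instead, since label-based assignment copies NaN like any other value. The helper name update_including_nan is hypothetical and not part of pandas, and c1 is created as float up front to avoid the int64 to float64 conversion mentioned in the report:

```python
import numpy as np
import pandas as pd


def update_including_nan(target, other):
    """Hypothetical helper (not a pandas API): copy every value of
    `other`, NaN included, into `target` where labels overlap."""
    rows = other.index.intersection(target.index)
    cols = other.columns.intersection(target.columns)
    for col in cols:
        # .loc assignment with a Series aligns on the index and,
        # unlike update(), does not skip NaN
        target.loc[rows, col] = other.loc[rows, col]


df1 = pd.DataFrame({"c1": [1.0, 2.0, 3.0, 4.0, 5.0],
                    "c2": [True, True, False, False, True]})

# Same steps as the report: select rows, shift c1, set one NaN
df2 = df1.loc[df1["c2"]].copy()
df2["c1"] = df2["c1"] + 100
df2.iloc[2, 0] = np.nan

update_including_nan(df1, df2)
print(df1)
```

With these inputs, row 4 of c1 ends up as NaN while rows 2 and 3 stay untouched, which matches the expected output in the original report apart from the float dtype.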
