Pandas: Melting with not present column does not produce error

Created on 8 Nov 2018 · 4 Comments · Source: pandas-dev/pandas

Code Sample

import pandas as pd
import numpy as np
# Generate data
people = ['Susie', 'Alejandro']
day = ['Monday', 'Tuesday', 'Wednesday']
data = [[person, d, *np.random.randint(0, 5, 2)]  for person in people for d in day]
df = pd.DataFrame(data, columns=['Name', 'day', 'burgers', 'fries'])
df.head()

Name | day | burgers | fries
-- | -- | -- | --
Susie | Monday | 4 | 0
Susie | Tuesday | 0 | 1
Susie | Wednesday | 0 | 1
Alejandro | Monday | 4 | 2
Alejandro | Tuesday | 2 | 0

# Melt on column that's not present in `df`
df.melt(['Name', 'day'], ['Burgers', 'fries'])

Outputs warning:

/home/ubuntu/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py:1472: FutureWarning: 
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
https://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
  return self._getitem_tuple(key)

Problem description

This call should raise an error. Warnings aren't always taken seriously, and if this melt were part of a larger chain of operations, the cause of the warning would be unclear, since the warning does not address the actual problem. There is no scenario in which melting on columns that are not present would be useful, and there are clear reasons to alert users who have made this mistake.

Expected Output

df.melt(['Name', 'day'], ['Burgers', 'fries']) should be treated like df.melt(['Name', 'Day']) (where Day is not present in df) and raise an exception with a traceback.

Labels: Error Reporting, Reshaping, good first issue

All 4 comments

That warning is coming indirectly via a .loc. I'm not sure when it's slated to be enforced.

Regardless, we could add a check to melt that all the names are valid. Interested in submitting a PR?
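
The kind of check suggested above could look roughly like the sketch below. It is a user-level wrapper, not the change that went into pandas: the name melt_checked and the error message are hypothetical, and it assumes a DataFrame with flat (non-MultiIndex) columns.

```python
import pandas as pd


def melt_checked(df, id_vars=None, value_vars=None, **kwargs):
    """Hypothetical wrapper: validate column names, then delegate to DataFrame.melt."""

    def as_list(names):
        # Normalize None / scalar / list-like input to a plain list.
        if names is None:
            return []
        if isinstance(names, (list, tuple)):
            return list(names)
        return [names]

    requested = as_list(id_vars) + as_list(value_vars)
    # Assumes flat column labels; MultiIndex columns are not handled here.
    missing = [name for name in requested if name not in df.columns]
    if missing:
        raise KeyError(f"Columns not present in the DataFrame: {missing}")
    return df.melt(id_vars=id_vars, value_vars=value_vars, **kwargs)


df = pd.DataFrame(
    [["Susie", "Monday", 4, 0], ["Alejandro", "Tuesday", 2, 0]],
    columns=["Name", "day", "burgers", "fries"],
)

# 'Burgers' (capital B) is not a column, so the wrapper raises instead of warning.
try:
    melt_checked(df, ["Name", "day"], ["Burgers", "fries"])
except KeyError as err:
    print("KeyError:", err)

# With the correct names the call behaves like a plain df.melt.
melted = melt_checked(df, ["Name", "day"], ["burgers", "fries"])
```

Raising before delegating means the failure names the offending column directly, rather than surfacing as an indexing warning from deeper in the call stack.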

I just made the PR, but I'm having some trouble running pandas from the source directory. I get this error:

ImportError: C extension: No module named 'pandas._libs.tslibs.conversion' not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.

But the setup failed somewhere along the way. Any suggestions for testing?
