Pandas: Pycharm linter flags output of concat as "type", not DataFrame

Created on 26 Jul 2017 · 10 Comments · Source: pandas-dev/pandas

Minimum display example

import pandas as pd

df1 = pd.DataFrame({'A': ['A0']}, index=[0])
df2 = pd.DataFrame({'A': ['A4']}, index=[4])
result = pd.concat([df1, df2])

print(df1.loc[0])  # Pycharm doesn't flag (label 0 is used because df1 has no label 4)
print(result.loc[4])  # Pycharm flags as "Unresolved attribute reference '.loc' for class 'type'"
                      # But runs fine anyway

Problem description

PyCharm's linter infers that pd.concat returns class "type", when it normally returns a DataFrame.

I suspect that this is due to core.reshape.concat._Concatenator.get_result() using core.dtypes.concat._get_frame_result_type, which sometimes returns SparseDataFrame (a type) and sometimes an object (normally a DataFrame?).

Extract from pandas.core.dtypes.concat

def _get_frame_result_type(result, objs):
    """
    return appropriate class of DataFrame-like concat
    if any block is SparseBlock, return SparseDataFrame
    otherwise, return 1st obj
    """
    if any(b.is_sparse for b in result.blocks):
        from pandas.core.sparse.api import SparseDataFrame
        return SparseDataFrame
    else:
        return objs[0]
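
The suspicion above, that one branch returns a class while the other returns an instance, can be illustrated with stand-in classes. This is only a sketch: `Frame`, `SparseFrame`, and `pick_result_type` are hypothetical names rather than pandas code, and it makes no claim about how PyCharm actually infers types.

```python
class Frame:
    """Hypothetical stand-in for a dense DataFrame-like class."""


class SparseFrame(Frame):
    """Hypothetical stand-in for a sparse DataFrame-like class."""


def pick_result_type(has_sparse_block, objs):
    # Same shape as the extract above: one branch returns a class,
    # the other returns the first input object (an instance).
    if has_sparse_block:
        return SparseFrame
    return objs[0]


sparse_branch = pick_result_type(True, [Frame()])
dense_branch = pick_result_type(False, [Frame()])

print(isinstance(sparse_branch, type))  # is the result a class object?
print(isinstance(dense_branch, type))   # or an instance?
```

A function with mixed returns like this is harder for any static tool to summarize with a single return type, which is why it looked like a plausible culprit.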

Expected Output

The MDE should not throw an error in Pycharm.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.4.8
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None
None

I am using Pycharm Community Edition 2017.1.5.

Docs Dtypes

jebob

👍2

All 10 comments

You should address this to PyCharm.

jreback on 26 Jul 2017

Link to Pycharm issue: https://youtrack.jetbrains.com/issue/PY-25326

jebob on 26 Jul 2017

@jebob : The Pycharm linter is prone to false positives like this because this is not Java (their first language under the IntelliJ suite) 😄 . In fact, some of the warnings that I've seen similar to the one that you've reported are complete nonsense.

This is not to say that the linter doesn't get it right sometimes, but generally, flake8 is your friend here, and so is running the code manually.

gfyoung on 26 Jul 2017

@gfyoung I've been suppressing the warnings in my code using type hints instead.
foo = pd.concat([bar1, bar2]) # type: pd.DataFrame

jebob on 26 Jul 2017

👍6
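
The type-comment workaround above can be sketched end to end as follows. The frames reuse the values from the original report. The variable-annotation form is an assumed alternative that requires Python 3.6 or later, and whether a particular PyCharm version honors either hint is not something this sketch verifies.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0']}, index=[0])
df2 = pd.DataFrame({'A': ['A4']}, index=[4])

# Type comment (PEP 484 style), as used in the comment above.
result = pd.concat([df1, df2])  # type: pd.DataFrame

# Variable annotation (PEP 526), an assumed alternative for Python 3.6+.
annotated: pd.DataFrame = pd.concat([df1, df2])

# Neither hint changes runtime behavior: both names are plain DataFrames.
print(type(result) is pd.DataFrame)
print(annotated.loc[4, 'A'])
```

Both forms only inform static tools; the value returned by `pd.concat` is the same either way.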

That works too, just hopefully you don't have to make too many hints 😄

gfyoung on 26 Jul 2017

PyCharm reads this piece of the docstring of pd.concat():

    -------
    concatenated : type of objects

and concludes that the return value is of type `type`. This could easily be resolved on the pandas side by using

    -------
    concatenated : object, type of objs

which is technically correct, equally informative to the naked eye, solves the issue AND fixes a typo (objects vs objs).
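
To illustrate how a docstring-driven tool could arrive at `type`, here is a deliberately naive sketch. It is not PyCharm's actual parser: `guess_return_type` is a hypothetical helper that assumes the tool simply takes the first word of the type field in a numpydoc-style `name : type` line.

```python
def guess_return_type(returns_line):
    """Naively take the first word after ' : ' as the declared type name."""
    _name, _sep, type_field = returns_line.partition(' : ')
    # Keep only the text before the first comma, then its first word.
    first_word = type_field.split(',')[0].split()[0]
    return first_word


old_line = "concatenated : type of objects"
new_line = "concatenated : object, type of objs"

print(guess_return_type(old_line))
print(guess_return_type(new_line))
```

Under that assumption, the old wording is read as the builtin `type`, while the proposed wording is read as `object`, which no longer triggers an unresolved-attribute warning on `.loc`.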

jebob on 26 Jul 2017

👍2

a doc change like that would be fine

jreback on 26 Jul 2017

I assume this was fixed with more recent versions of pandas via the doc change?

At work we're using 0.20.3 for now, and it still happens for me on PyCharm 2018. Nothing major or alarming, just wondering :)

Sadin on 17 May 2018

The issue is unrelated to your PyCharm version.
It should be addressed in pandas 0.23.0.

gfyoung on 17 May 2018

👍3

Thought so, just confirming, thank you!

Sadin on 17 May 2018
