Pandas: RLS: 0.20.0

Created on 18 Apr 2017 · 50 comments · Source: pandas-dev/pandas

We are nearing the 0.20.0 release, so opening an issue to track this.

Remaining issues/PRs with the 0.20 tag: https://github.com/pandas-dev/pandas/milestones/0.20.0
If there are other remaining things, now is the time to speak up

cc @pandas-dev/pandas-core

Release

All 50 comments

Current target for the release candidate is this Friday I think, @TomAugspurger ?

Yep, that's the goal.

this is just an RC this Friday. We prob need 2 or more weeks after that for the final release.

yep, that is what I meant, sorry for the confusion

Doing the RC tonight or early tomorrow. Of the remaining issues, #16088 is a blocker and #16086 would be nice. Any I am missing?

@jreback do you have a preference on tagging v0.21.dev immediately after tagging the RC, versus waiting to tag v0.21.dev until the full 0.20 release? If we tag 0.21.dev now then we can merge whatever we want into master, and I'll just backport whatever needs to be backported between now and the 0.20 release.

no those are fine. I'll have a look in a few.

hmm it feels weird to tag 0.21.dev before 0.20.0.

> hmm it feels weird to tag 0.21.dev before 0.20.0.

I don't think there is anything strange about that. It is just a practical choice to make about when we branch the 0.20.x branch (the tagging is just a consequence of that): now with the rc, or only after the final release.

For me both options are OK. Branching now makes it a bit easier for progressing / merging PRs (we have to be less concerned about which PR we merge, or which has to wait until 0.20 is out), but is more work for Tom to backport.

@TomAugspurger there are still some doc issues open for 0.20, but you don't have to be too concerned about them, as for RCs we typically point to the dev docs, which are further updated (I mean, tagging the RC does not freeze the docs, as happens with the final release)

ok open to whatever tagging makes sense.

yeah doc issues certainly can be done after (as well as a small number of additional fixes, already marked for 0.20.0). mainly the sorting one.

@TomAugspurger ok all of the things on my list are merged.

👍 Building now and testing locally. I'll probably tag tonight.

Conda-forge hasn't settled on pre-release stuff yet, so I'll probably wait till tomorrow to make announcements (have to finish setting up this windows VM).

ok lmk if u need anything
though won't be around most of tomorrow

Ok, tagged and made the release on github. Doing all the conda stuff and mailing list announcements tomorrow.

Just sent out the announce email. All the wheels are on PyPI. Our conda channel has packages for OSX and 64-bit Windows. I haven't set up a 32-bit Windows environment yet. Linux packages should be up shortly (built 0.19.2 by accident).

thanks @TomAugspurger !

Since we didn't tag 0.21.0.dev (I think we should do this after the tag of 0.20.0 final), I'm going to continue merging small things to master.

Could #16111 be fixed before the 0.20 final release? It's a bug that causes Pandas to crash the whole Python interpreter, leaving no traceback or clue as to what went wrong.

I do not have the capabilities to fix this myself, though, so consider this just a plea from a happy user. Also, I've made my own (non-optimal) workaround, but this could bring a negative surprise to others when it happens...

EDIT: This has been taken care of. Thanks!

The 0.20.0 release candidate breaks a large amount of my code, and AFAICS the breakage is in previously non-deprecated modules, so 3rd-party libraries haven't had time to adapt to the new API.

xref: #16137, #16138

I think there should be serious consideration given to providing a compatibility layer for at least one minor release. It's not hard to do and doesn't need to stick around for very long, but it will give 3rd-party libraries a chance to adapt. The required work is in PR #16140

Which is why we want people to test with the rc, thanks a lot for your feedback @dhirschfeld !
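The kind of compatibility layer proposed above can be sketched as a deprecated alias module that forwards attribute access to the new location while emitting a `FutureWarning`. This is only an illustration with made-up names (`install_shim`, `legacy_json`); the actual pandas shims live in PR #16140:

```python
import sys
import types
import warnings

def install_shim(old_name, new_module):
    """Register a deprecated module alias: attribute access on the old
    module name forwards to new_module after emitting a FutureWarning."""
    shim = types.ModuleType(old_name)

    def __getattr__(attr):  # PEP 562 module-level __getattr__ (Python 3.7+)
        warnings.warn(
            "%s is deprecated; use %s instead" % (old_name, new_module.__name__),
            FutureWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    shim.__getattr__ = __getattr__
    sys.modules[old_name] = shim

# Demo with a stand-in module; in pandas this would map e.g. the old
# "pandas.formats" path to "pandas.io.formats".
import json
install_shim("legacy_json", json)

import legacy_json  # resolved from sys.modules, so this finds the shim
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_json.dumps({"a": 1})  # warns, then delegates to json.dumps

print(result)
print(any(issubclass(w.category, FutureWarning) for w in caught))
```

Keeping the shim for one minor release gives downstream code a deprecation cycle instead of an immediate `ModuleNotFoundError`.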

Any remaining issues / fixes people want in for 0.20? https://github.com/pandas-dev/pandas/pull/16171 is the only blocker I see.

I'll probably do the release Tuesday or Wednesday if that works for everyone.

after #16223 and the remaining doc fixes I think we're good to go. We could do an rc2 (maybe just conda packages) for a quick check? up to you @TomAugspurger

Yeah, I could go either way on an RC2... May as well do one just for the conda packages, and test against dask, statsmodels, etc.

sure that works.

Should I even push an RC2 to PyPI? Or just tag, push to github, and then build the conda pkgs?

no, I would just make some conda packages.

Going to do the RC2 now.

We have the two doc issues outstanding, but those can go in between RC2 and the release.

I restarted on master; this should work. I released a new version of gbq because it was using PandasError (which we removed)

these gbq tests sometimes fail if multiple things are running against it (e.g. the 0.20rc2 tag & master builds launched at the same time).

Dask, statsmodels, and seaborn all looked OK on rc2. I did a windows build and test locally, and that passed as well.

great @TomAugspurger

yeah master is green now (and rc2 I just restarted those builds).

I get three test failures on win-amd64-py3.6.1 with numpy-1.11.3+mkl. The test_shim failure is fixed by #16239

running: pytest --skip-slow --skip-network X:\Python36\lib\site-packages\pandas
<snip>
================================== FAILURES ===================================
__________________________________ test_shim __________________________________

    def test_shim():
        # https://github.com/pandas-dev/pandas/pull/16059
        # Remove in 0.21
        with tm.assert_produces_warning(FutureWarning,
                                        check_stacklevel=False):
>           from pandas.formats.style import Styler as _styler  # noqa
E           ModuleNotFoundError: No module named 'pandas.formats'

X:\Python36\lib\site-packages\pandas\tests\io\formats\test_style.py:866: ModuleNotFoundError
______________________ TestSeriesAnalytics.test_overflow ______________________

self = <pandas.tests.series.test_analytics.TestSeriesAnalytics object at 0x0000019B01E52400>

    def test_overflow(self):
        # GH 6915
        # overflowing on the smaller int dtypes
        for dtype in ['int32', 'int64']:
            v = np.arange(5000000, dtype=dtype)
            s = Series(v)

            # no bottleneck
            result = s.sum(skipna=False)
            assert int(result) == v.sum(dtype='int64')
            result = s.min(skipna=False)
            assert int(result) == 0
            result = s.max(skipna=False)
            assert int(result) == v[-1]

            # use bottleneck if available
            result = s.sum()
>           assert int(result) == v.sum(dtype='int64')
E           AssertionError: assert 1642668640 == 12499997500000
E            +  where 1642668640 = int(1642668640)
E            +  and   12499997500000 = <built-in method sum of numpy.ndarray object at 0x0000019B061C0CB0>(dtype='int64')
E            +    where <built-in method sum of numpy.ndarray object at 0x0000019B061C0CB0> = array([      0,       1,       2, ..., 4999997, 4999998, 4999999], dtype=int64).sum

X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:69: AssertionError
________________________ TestSeriesAnalytics.test_sum _________________________

self = <pandas.tests.series.test_analytics.TestSeriesAnalytics object at 0x0000019B69C32EB8>

    def test_sum(self):
>       self._check_stat_op('sum', np.sum, check_allna=True)

X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:96:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:556: in _check_stat_op
    testit()
X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:535: in testit
    assert_almost_equal(float(f(s)), float(alternate(s.values)))
X:\Python36\lib\site-packages\pandas\util\testing.py:177: in assert_almost_equal
    **kwargs)
pandas\_libs\testing.pyx:59: in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:4156)
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   AssertionError: expected 1099511628275500.00000 but got 499500.00000, with decimal 5

pandas\_libs\testing.pyx:209: AssertionError
=========================== pytest-warning summary ============================
WC1 X:\Python36\lib\site-packages\pandas\tests\test_config.py cannot collect test class 'TestConfig' because it has a __init__ constructor
 3 failed, 9606 passed, 542 skipped, 21 xfailed, 1 xpassed, 1 pytest-warnings in 631.87 seconds

@cgohlke thanks. Is that on 32-bit windows?

It is 64-bit Python 3.6.1 from python.org on Windows 10.

test_overflow and test_sum pass after uninstalling bottleneck-1.2.0.

> test_overflow and test_sum pass after uninstalling bottleneck-1.2.0.

Thanks, I'm able to reproduce on my windows VM with bottleneck installed (I didn't have it installed earlier). Taking a look at what's going on now.
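As a side note, the wrapped value in the failing assertion above is exactly what a 32-bit accumulator produces for this sum. The snippet below reproduces the arithmetic with plain NumPy (it does not use bottleneck itself, so it is only an illustration of the overflow, not of bottleneck's internals):

```python
import numpy as np

# Sum of 0..4,999,999; the true value needs more than 32 bits.
v = np.arange(5_000_000, dtype="int32")

exact = int(v.sum(dtype="int64"))  # widen the accumulator explicitly
wrapped = exact % 2**32            # what a 32-bit accumulator keeps

print(exact)    # 12499997500000
print(wrapped)  # 1642668640 -- the value in the failing assertion above
```

This is why the tests pass without bottleneck: pandas' own fallback path sums int32 input with a 64-bit accumulator.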

we might not have bottleneck installed for windows test builds

just skip them for now

we are testing on 2.7 w/bottleneck on windows, but not 3.6....will add and skip/fix those.

> we are testing on 2.7 w/bottleneck on windows, but not 3.6....will add and skip/fix those.

Yep, it is definitely in bottleneck. Are you doing a PR for skipping those, or should I?

testing now

xfailed these tests: https://ci.appveyor.com/project/jreback/pandas/build/1.0.3686
merging on pass (I think we have an issue about this overflow in any event, here: https://github.com/pandas-dev/pandas/issues/15453)
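The xfail marking mentioned above can be sketched as below; the test name, reason string, and body are illustrative, not the actual pandas change:

```python
import pytest

@pytest.mark.xfail(reason="bottleneck 1.2.0 overflows int32 reductions")
def test_overflow_wraps():
    # The wrapped 32-bit result from the report above; with the mark,
    # pytest reports this as "xfailed" instead of failing the build.
    assert 1642668640 == 12499997500000

# The decorator attaches the mark to the function, where the
# pytest collector reads it at collection time.
print(test_overflow_wraps.pytestmark[0].name)
```

Unlike skipping, an xfail keeps the test running, so it will start reporting "xpassed" once the underlying bottleneck bug is fixed.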

so when I fix the bottleneck things in 0.21.0, that will fix this.

@jreback thanks bottleneck issue is here: https://github.com/kwgoodman/bottleneck/issues/163

@TomAugspurger ok everything lgtm. assuming master finishes and passes, cut when you are ready. (obviously if you see any doc things while you are doing it, pls push fixes).

@jreback the whatsnew mentions a new hash_tuples. Do we want to expose that in pandas.util along with hash_pandas_object?

that's not public. it could be, i guess

I only ask since it was in a release note item. Though, dask isn't using it, so maybe keep it private for now?

yeah, it is meant to be private

OT - date of whatsnew note needs changing

Got it in https://github.com/pandas-dev/pandas/pull/16245

I'll push one more commit removing that reference to hash_tuples, then tag it. Going to wait for that doc build to finish first.

Has 0.20 been released? pandas.pydata.org says 0.19.2 is the current release.

just refresh and you shall see.
