Pandas: RLS: 0.20.0

Created on 18 Apr 2017 · 50 comments · Source: pandas-dev/pandas

We are nearing the 0.20.0 release, so opening an issue to track this.

Remaining issues/PRs with the 0.20 tag: https://github.com/pandas-dev/pandas/milestones/0.20.0
If there are other remaining things, now is the time to speak up

cc @pandas-dev/pandas-core

Release

All 50 comments

Current target for the release candidate is this Friday I think, @TomAugspurger ?

Yep, that's the goal.

this is just an RC this Friday. We prob need 2 or more weeks after that for the final release.

yep, that is what I meant, sorry for the confusion

Doing the RC tonight or early tomorrow. Of the remaining issues, #16088 is a blocker and #16086 would be nice. Any I am missing?

@jreback do you have a preference on tagging v0.21.dev immediately after tagging the RC, versus waiting to tag v0.21.dev until the full 0.20 release? If we tag 0.21.dev now then we can merge whatever we want into master, and I'll just backport whatever needs to be backported between now and the 0.20 release.

no those are fine. I'll have a look in a few.

hmm it feels weird to tag 0.21.dev before 0.20.0.

> hmm it feels weird to tag 0.21.dev before 0.20.0.

I don't think there is anything strange about that. It is just a practical choice to make about when we branch the 0.20.x branch (the tagging is just a consequence of that): now with the rc, or only after the final release.

For me both options are OK. Branching now makes it a bit easier for progressing / merging PRs (we have to be less concerned about which PR we merge, or which has to wait until 0.20 is out), but is more work for Tom to backport.

@TomAugspurger there are still some doc issues open for 0.20, but you don't have to be too concerned about them, as for RCs we typically point to the dev docs, which are further updated (I mean, tagging the RC does not freeze the docs, as happens with the final release)

ok open to whatever tagging makes sense.

yeah doc issues certainly can be done after (as well as a small number of additional fixes, already marked for 0.20.0). mainly the sorting one.

@TomAugspurger ok all of the things on my list are merged.

👍 Building now and testing locally. I'll probably tag tonight.

Conda-forge hasn't settled on pre-release stuff yet, so I'll probably wait till tomorrow to make announcements (have to finish setting up this windows VM).

ok lmk if u need anything
though won't be around most of tomorrow

Ok, tagged and made the release on github. Doing all the conda stuff and mailing list announcements tomorrow.

Just sent out the announce email. All the wheels are on PyPI. Our conda channel has packages for OSX and 64-bit Windows. I haven't set up a 32-bit Windows environment yet. Linux packages should be up shortly (built 0.19.2 by accident).

thanks @TomAugspurger !

Since we didn't tag 0.21.0.dev (I think we should do this after the tag of 0.20.0 final), I'm going to continue merging small things to master.

Could #16111 be fixed before the 0.20 final release? It's a bug that causes Pandas to crash the whole Python interpreter, leaving no traceback or clue as to what went wrong.

I do not have the capabilities to fix this myself, though, so consider this just a plea from a happy user. Also, I've made my own (non-optimal) workaround, but this could bring a negative surprise to others when it happens...

EDIT: This has been taken care of. Thanks!

The 0.20.0 release candidate breaks a large amount of my code, and AFAICS the breakage is in previously non-deprecated modules, so 3rd-party libraries haven't had time to adapt to the new API.

xref: #16137, #16138

I think there should be serious consideration given to providing a compatibility layer for at least one minor release. It's not hard to do and doesn't need to stick around for very long, but it will give 3rd-party libraries a chance to adapt. The required work is in PR #16140

Which is why we want people to test with the rc, thanks a lot for your feedback @dhirschfeld !
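The kind of compatibility layer proposed above can be sketched as a deprecated alias module that forwards attribute access to the new location while emitting a `FutureWarning`. This is only an illustration with made-up names (`install_shim`, `legacy_json`); the actual pandas shims live in PR #16140:

```python
import sys
import types
import warnings

def install_shim(old_name, new_module):
    """Register a deprecated module alias: attribute access on the old
    module name forwards to new_module after emitting a FutureWarning."""
    shim = types.ModuleType(old_name)

    def __getattr__(attr):  # PEP 562 module-level __getattr__ (Python 3.7+)
        warnings.warn(
            "%s is deprecated; use %s instead" % (old_name, new_module.__name__),
            FutureWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    shim.__getattr__ = __getattr__
    sys.modules[old_name] = shim

# Demo with a stand-in module; in pandas this would map e.g. the old
# "pandas.formats" path to "pandas.io.formats".
import json
install_shim("legacy_json", json)

import legacy_json  # resolved from sys.modules, so this finds the shim
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_json.dumps({"a": 1})  # warns, then delegates to json.dumps

print(result)
print(any(issubclass(w.category, FutureWarning) for w in caught))
```

Keeping the shim for one minor release gives downstream code a deprecation cycle instead of an immediate `ModuleNotFoundError`.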

Any remaining issues / fixes people want in for 0.20? https://github.com/pandas-dev/pandas/pull/16171 is the only blocker I see.

I'll probably do the release Tuesday or Wednesday if that works for everyone.

after #16223 and the remaining doc fixes I think we're good to go. We could do an rc2 (maybe just conda packages) for a quick check? up to you @TomAugspurger

Yeah, I could go either way on an RC2... May as well do one just for the conda packages, and test against dask, statsmodels, etc.

sure that works.

Should I even push an RC2 to PyPI? Or just tag, push to github, and then build the conda pkgs?

no, I would just make some conda packages.

Going to do the RC2 now.

We have the two doc issues outstanding, but those can go in between RC2 and the release.

I restarted on master; this should work. I released a new version of gbq because it was using PandasError (which we removed)

these gbq tests sometimes fail if multiple things are running against it (e.g. the 0.20rc2 tag & master builds launched at the same time).

Dask, statsmodels, and seaborn all looked OK on rc2. I did a windows build and test locally, and that passed as well.

great @TomAugspurger

yeah master is green now (and rc2 I just restarted those builds).

I get three test failures on win-amd64-py3.6.1 with numpy-1.11.3+mkl. The test_shim failure is fixed by #16239

running: pytest --skip-slow --skip-network X:\Python36\lib\site-packages\pandas
<snip>
================================== FAILURES ===================================
__________________________________ test_shim __________________________________

    def test_shim():
        # https://github.com/pandas-dev/pandas/pull/16059
        # Remove in 0.21
        with tm.assert_produces_warning(FutureWarning,
                                        check_stacklevel=False):
>           from pandas.formats.style import Styler as _styler  # noqa
E           ModuleNotFoundError: No module named 'pandas.formats'

X:\Python36\lib\site-packages\pandas\tests\io\formats\test_style.py:866: ModuleNotFoundError
______________________ TestSeriesAnalytics.test_overflow ______________________

self = <pandas.tests.series.test_analytics.TestSeriesAnalytics object at 0x0000019B01E52400>

    def test_overflow(self):
        # GH 6915
        # overflowing on the smaller int dtypes
        for dtype in ['int32', 'int64']:
            v = np.arange(5000000, dtype=dtype)
            s = Series(v)

            # no bottleneck
            result = s.sum(skipna=False)
            assert int(result) == v.sum(dtype='int64')
            result = s.min(skipna=False)
            assert int(result) == 0
            result = s.max(skipna=False)
            assert int(result) == v[-1]

            # use bottleneck if available
            result = s.sum()
>           assert int(result) == v.sum(dtype='int64')
E           AssertionError: assert 1642668640 == 12499997500000
E            +  where 1642668640 = int(1642668640)
E            +  and   12499997500000 = <built-in method sum of numpy.ndarray object at 0x0000019B061C0CB0>(dtype='int64')
E            +    where <built-in method sum of numpy.ndarray object at 0x0000019B061C0CB0> = array([      0,       1,       2, ..., 4999997, 4999998, 4999999], dtype=int64).sum

X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:69: AssertionError
________________________ TestSeriesAnalytics.test_sum _________________________

self = <pandas.tests.series.test_analytics.TestSeriesAnalytics object at 0x0000019B69C32EB8>

    def test_sum(self):
>       self._check_stat_op('sum', np.sum, check_allna=True)

X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:96:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:556: in _check_stat_op
    testit()
X:\Python36\lib\site-packages\pandas\tests\series\test_analytics.py:535: in testit
    assert_almost_equal(float(f(s)), float(alternate(s.values)))
X:\Python36\lib\site-packages\pandas\util\testing.py:177: in assert_almost_equal
    **kwargs)
pandas\_libs\testing.pyx:59: in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:4156)
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   AssertionError: expected 1099511628275500.00000 but got 499500.00000, with decimal 5

pandas\_libs\testing.pyx:209: AssertionError
=========================== pytest-warning summary ============================
WC1 X:\Python36\lib\site-packages\pandas\tests\test_config.py cannot collect test class 'TestConfig' because it has a __init__ constructor
 3 failed, 9606 passed, 542 skipped, 21 xfailed, 1 xpassed, 1 pytest-warnings in 631.87 seconds

@cgohlke thanks. Is that on 32-bit windows?

It is 64-bit Python 3.6.1 from python.org on Windows 10.

test_overflow and test_sum pass after uninstalling bottleneck-1.2.0.

> test_overflow and test_sum pass after uninstalling bottleneck-1.2.0.

Thanks, I'm able to reproduce on my windows VM with bottleneck installed (I didn't have it installed earlier). Taking a look at what's going on now.
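As a side note, the wrapped value in the failing assertion above is exactly what a 32-bit accumulator produces for this sum. The snippet below reproduces the arithmetic with plain NumPy (it does not use bottleneck itself, so it is only an illustration of the overflow, not of bottleneck's internals):

```python
import numpy as np

# Sum of 0..4,999,999; the true value needs more than 32 bits.
v = np.arange(5_000_000, dtype="int32")

exact = int(v.sum(dtype="int64"))  # widen the accumulator explicitly
wrapped = exact % 2**32            # what a 32-bit accumulator keeps

print(exact)    # 12499997500000
print(wrapped)  # 1642668640 -- the value in the failing assertion above
```

This is why the tests pass without bottleneck: pandas' own fallback path sums int32 input with a 64-bit accumulator.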

we might not have bottleneck installed for windows test builds

just skip them for now

we are testing on 2.7 w/bottleneck on windows, but not 3.6....will add and skip/fix those.

> we are testing on 2.7 w/bottleneck on windows, but not 3.6....will add and skip/fix those.

Yep, it is definitely in bottleneck. Are you doing a PR for skipping those, or should I?

testing now

xfailed these tests: https://ci.appveyor.com/project/jreback/pandas/build/1.0.3686
merging on pass (I think we have an issue about this overflow in any event, here: https://github.com/pandas-dev/pandas/issues/15453)
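The xfail marking mentioned above can be sketched as below; the test name, reason string, and body are illustrative, not the actual pandas change:

```python
import pytest

@pytest.mark.xfail(reason="bottleneck 1.2.0 overflows int32 reductions")
def test_overflow_wraps():
    # The wrapped 32-bit result from the report above; with the mark,
    # pytest reports this as "xfailed" instead of failing the build.
    assert 1642668640 == 12499997500000

# The decorator attaches the mark to the function, where the
# pytest collector reads it at collection time.
print(test_overflow_wraps.pytestmark[0].name)
```

Unlike skipping, an xfail keeps the test running, so it will start reporting "xpassed" once the underlying bottleneck bug is fixed.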

so when I fix the bottleneck things in 0.21.0, that will fix this.

@jreback thanks bottleneck issue is here: https://github.com/kwgoodman/bottleneck/issues/163

@TomAugspurger ok everything lgtm. assuming master finishes and passes, cut when you are ready. (obviously if you see any doc things while you are doing it, pls push fixes).

@jreback the whatsnew mentions a new hash_tuples. Do we want to expose that in pandas.util along with hash_pandas_object?

that's not public. it could be, i guess

I only ask since it was in a release note item. Though, dask isn't using it, so maybe keep it private for now?

yeah, it is meant to be private

OT - date of whatsnew note needs changing

Got it in https://github.com/pandas-dev/pandas/pull/16245

I'll push one more commit removing that reference to hash_tuples, then tag it. Going to wait for that doc build to finish first.

Has 0.20 been released? pandas.pydata.org says 0.19.2 is the current release.

just refresh and you shall see.
