Warehouse: Fully implement PEP 527

Created on 8 Oct 2019 · 13 Comments · Source: pypa/warehouse

All 13 comments

IMO we should also push these projects to drop usage of the legacy formats or, failing that, at least understand whether the issue is "the current toolchain doesn't satisfy workflows the way the legacy formats do", "oh, we don't need it", or something else entirely.

How long is the list?

If it's not too long, let's create issues on these projects; it's likely they're not aware of the deprecation. I'd expect many are already using wheels and can ditch the legacy formats.

For example with Pillow, I didn't know they were deprecated until it came up at https://discuss.python.org/t/deprecate-bdist-wininst/1929/12?u=hugovk, and we already distribute wheels so it was "oh, we don't need it".

And how about adding a deprecation warning to Twine when uploading them?
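A rough sketch of what such a warning could look like, in Python. This is not Twine's actual code or API: `LEGACY_PACKAGETYPES` and `warn_if_legacy` are hypothetical names, and the sketch assumes the uploader already knows the `packagetype` string for each file.

```python
import warnings

# The five file types PEP 527 deprecates, as named in this thread.
LEGACY_PACKAGETYPES = frozenset(
    {"bdist_dmg", "bdist_dumb", "bdist_msi", "bdist_rpm", "bdist_wininst"}
)


def warn_if_legacy(filename, packagetype):
    """Emit a DeprecationWarning for legacy file types; return True if warned.

    Hypothetical helper for illustration only, not part of Twine.
    """
    if packagetype not in LEGACY_PACKAGETYPES:
        return False
    warnings.warn(
        f"{filename}: {packagetype} is deprecated by PEP 527 and PyPI "
        "will stop accepting it; consider uploading a wheel instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return True
```

Returning a boolean lets a caller decide whether to merely warn or to abort the upload.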

> How long is the list?

IIRC the "list" was any project that had previously uploaded one of these filetypes; essentially, we only blocked new projects.

A better list would be every project that has this ability that has actually published one of these filetypes in the last N months.
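As a minimal sketch of that "last N months" filter, assuming the upload records are already available as `(project_name, packagetype, upload_time)` tuples: the function name, the sample rows, and the 30-day approximation of a month are all assumptions made for illustration, not anything Warehouse provides.

```python
from datetime import datetime, timedelta

LEGACY_PACKAGETYPES = {
    "bdist_dmg", "bdist_dumb", "bdist_msi", "bdist_rpm", "bdist_wininst"
}


def recent_legacy_projects(uploads, now, months):
    """Return sorted names of projects with a legacy upload in the last `months` months.

    `uploads` is an iterable of (project_name, packagetype, upload_time) tuples.
    A month is approximated as 30 days (an assumption of this sketch).
    """
    cutoff = now - timedelta(days=30 * months)
    return sorted(
        {
            name
            for name, packagetype, upload_time in uploads
            if packagetype in LEGACY_PACKAGETYPES and upload_time > cutoff
        }
    )


# Made-up sample rows; the project names are illustrative only.
uploads = [
    ("example-old", "bdist_dmg", datetime(2015, 6, 5)),
    ("example-new", "bdist_wininst", datetime(2019, 10, 17)),
    ("example-wheel", "bdist_wheel", datetime(2019, 10, 20)),
]

result = recent_legacy_projects(uploads, now=datetime(2019, 10, 31), months=12)
print(result)  # ['example-new']
```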

I'm guessing that this list is short enough that adding a deprecation notice to Twine would be unnecessary, but it's hard to say until we actually do an audit.

There are currently 4,678 projects that have allow_legacy_files set:

warehouse=> select count(*) from projects where allow_legacy_files;
 count
-------
  4678
(1 row)

Recent uploads for individual deprecated filetypes:

bdist_dmg:

warehouse=> select filename, upload_time from release_files where packagetype='bdist_dmg' order by upload_time desc limit 10;
                    filename                    |        upload_time
------------------------------------------------+----------------------------
 python_igraph-0.7.1.post6-py2.7-macosx10.9.dmg | 2015-06-05 20:58:15.702734
 python_igraph-0.7.1.post6-py2.6-macosx10.9.dmg | 2015-06-05 20:56:47.202244
 python_igraph-0.7.1_4-py2.7-macosx10.9.dmg     | 2015-03-05 21:11:20.376493
 python_igraph-0.7.1_4-py2.6-macosx10.9.dmg     | 2015-03-05 21:10:22.376507
 python_igraph-0.7.1_3-py2.7-macosx10.9.dmg     | 2015-03-05 20:30:28.362479
 python_igraph-0.7.1_3-py2.6-macosx10.9.dmg     | 2015-03-05 20:28:18.772667
 python_igraph-0.7.1_2-py2.7-macosx10.9.dmg     | 2015-02-10 20:29:57.860577
 python_igraph-0.7.1_2-py2.6-macosx10.9.dmg     | 2015-02-10 20:12:25.660451
 python_igraph-0.7.1_1-py2.7-macosx10.9.dmg     | 2015-02-10 07:56:59.793387
 python_igraph-0.7.1_1-py2.6-macosx10.9.dmg     | 2015-02-09 20:48:42.300196
(10 rows)

bdist_dumb:

warehouse=> select filename, upload_time from release_files where packagetype='bdist_dumb' order by upload_time desc limit 10;
                    filename                     |        upload_time
-------------------------------------------------+----------------------------
 airspeed-0.5.13.macosx-10.14-x86_64.tar.gz      | 2019-10-22 00:49:01.646779
 py_nifty_cloud-0.9.5.macosx-10.14-x86_64.tar.gz | 2019-09-28 14:32:32.022959
 algorithmia-1.2.0.linux-x86_64.tar.gz           | 2019-08-02 19:12:17.972642
 htrc-0.1.51.macosx-10.7-x86_64.tar.gz           | 2019-07-30 14:19:25.724787
 htrc-0.1.51b1.macosx-10.7-x86_64.tar.gz         | 2019-07-24 17:42:03.053786
 htrc-0.1.51b0.macosx-10.7-x86_64.tar.gz         | 2019-07-24 15:36:59.304947
 airspeed-0.5.12.macosx-10.14-x86_64.tar.gz      | 2019-07-24 06:32:12.329112
 pysodium-0.7.2.linux-x86_64.tar.gz              | 2019-06-25 14:30:59.086794
 htrc-0.1.50.macosx-10.7-x86_64.tar.gz           | 2019-06-21 14:43:12.68958
 htrc-0.1.50b0.macosx-10.7-x86_64.tar.gz         | 2019-06-20 17:23:49.411197
(10 rows)

bdist_msi:

warehouse=> select filename, upload_time from release_files where packagetype='bdist_msi' order by upload_time desc limit 10;
              filename               |        upload_time
-------------------------------------+----------------------------
 pywincffi-0.5.0.win32-py3.6.msi     | 2017-11-18 18:55:27.694295
 pywincffi-0.5.0.win32-py3.5.msi     | 2017-11-18 18:55:26.221545
 pywincffi-0.5.0.win32-py3.4.msi     | 2017-11-18 18:55:25.067028
 pywincffi-0.5.0.win32-py3.3.msi     | 2017-11-18 18:55:23.828515
 pywincffi-0.5.0.win32-py2.7.msi     | 2017-11-18 18:55:22.319013
 pywincffi-0.5.0.win-amd64-py3.6.msi | 2017-11-18 18:55:21.128075
 pywincffi-0.5.0.win-amd64-py3.5.msi | 2017-11-18 18:55:20.016371
 pywincffi-0.5.0.win-amd64-py3.4.msi | 2017-11-18 18:55:18.597794
 pywincffi-0.5.0.win-amd64-py3.3.msi | 2017-11-18 18:55:17.357411
 pywincffi-0.5.0.win-amd64-py2.7.msi | 2017-11-18 18:55:16.060902
(10 rows)

bdist_rpm:

warehouse=> select filename, upload_time from release_files where packagetype='bdist_rpm' order by upload_time desc limit 10;
              filename               |        upload_time
-------------------------------------+----------------------------
 Aglyph-3.0.0-1.noarch.rpm           | 2018-03-16 02:39:14.304743
 Aglyph-3.0.0-1.src.rpm              | 2018-03-16 02:39:08.195363
 toughradius-5.0.0.6-1.noarch.rpm    | 2017-11-19 04:45:14.203082
 toughradius-5.0.0.6-1.src.rpm       | 2017-11-19 04:45:09.635322
 python-otopi-mdp-0.2.2-1.noarch.rpm | 2017-10-02 10:23:32.79819
 python-otopi-mdp-0.2.2-1.src.rpm    | 2017-10-02 10:23:23.707941
 toughradius-5.0.0.5-1.noarch.rpm    | 2017-08-20 07:48:30.370971
 toughradius-5.0.0.5-1.src.rpm       | 2017-08-20 07:48:26.664831
 cx_Oracle-6.0rc1-py35-1.x86_64.rpm  | 2017-06-17 00:14:38.838786
 cx_Oracle-6.0rc1-py27-1.x86_64.rpm  | 2017-06-17 00:14:20.067593
(10 rows)

bdist_wininst:

warehouse=> select filename, upload_time from release_files where packagetype='bdist_wininst' order by upload_time desc limit 10;
                filename                 |        upload_time
-----------------------------------------+----------------------------
 GPy-1.9.9.win-amd64-py3.7.exe           | 2019-10-17 08:37:05.494866
 GPy-1.9.9.win-amd64-py3.6.exe           | 2019-10-17 08:28:54.341581
 GPy-1.9.9.win-amd64-py3.5.exe           | 2019-10-17 08:10:01.336052
 GPy-1.9.9.win-amd64-py2.7.exe           | 2019-10-17 08:01:52.12953
 Trac-1.0.19.win-amd64.exe               | 2019-10-15 00:36:35.161141
 Trac-1.0.19.win32.exe                   | 2019-10-15 00:36:30.520495
 xrayutilities-1.5.3.win-amd64-py3.7.exe | 2019-10-09 10:12:48.375835
 xrayutilities-1.5.3.win-amd64-py3.6.exe | 2019-10-09 10:12:44.698438
 xrayutilities-1.5.3.win-amd64-py3.5.exe | 2019-10-09 10:12:41.362779
 xrayutilities-1.5.3.win-amd64-py2.7.exe | 2019-10-09 10:12:38.179182
(10 rows)

Of these it looks like bdist_dmg, bdist_msi, and bdist_rpm can just be shut off.
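The same conclusion can be checked mechanically from the newest upload per file type in the listings above. This is only a sketch: taking the most recent upload shown as the reference date and using a 365-day window are assumptions made here, not criteria stated at this point in the thread.

```python
from datetime import date, timedelta

# Most recent upload date per file type, copied from the query output above.
latest_upload = {
    "bdist_dmg": date(2015, 6, 5),
    "bdist_dumb": date(2019, 10, 22),
    "bdist_msi": date(2017, 11, 18),
    "bdist_rpm": date(2018, 3, 16),
    "bdist_wininst": date(2019, 10, 17),
}

# Assumptions of this sketch: the reference date is the newest upload shown,
# and "dormant" means no upload in the 365 days before it.
reference = max(latest_upload.values())
cutoff = reference - timedelta(days=365)

dormant = sorted(t for t, d in latest_upload.items() if d < cutoff)
active = sorted(t for t, d in latest_upload.items() if d >= cutoff)

print("dormant:", dormant)  # ['bdist_dmg', 'bdist_msi', 'bdist_rpm']
print("active:", active)    # ['bdist_dumb', 'bdist_wininst']
```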

The bdist_wininst filetype is still getting a lot of uploads, but PEP 527 says that this is misleading:

It's quite easy to look at the low usage of bdist_dmg and bdist_msi and conclude that removing them will be fairly low impact, however bdist_wininst has several orders of magnitude more usage. This is somewhat misleading though, because although it has more people uploading those files the actual usage of those uploaded files is fairly low. Taking a look at the previous 30 days, we can see that 90% of all downloads of bdist_wininst files from PyPI were generated by the mirroring infrastructure and 7% of them were generated by setuptools (which can currently be better covered by bdist_egg files).

Also bdist_dumb is still getting the occasional upload, but these projects would probably be better served by uploading wheels if they want platform-specific built distributions.

Thanks. From the PEP's removal process:

Finally, an email will be generated to the maintainers of all projects still given the legacy flag, which will inform them of the upcoming new restrictions on uploads and tell them that these restrictions will be applied to future uploads to their projects starting in 1 month. Finally, after 1 month all projects will have the legacy file type flag removed, and support for uploading these types of files will cease to exist on PyPI.

Would now be a good time to send out the email?

And to just bdist_dmg, bdist_msi, and bdist_rpm users first, or to all legacy format users?

ooooh! Nice find @hugovk! ^>^

I'm in favor of dropping all legacy formats if the PEP has a clear mechanism to do so.

I don't really think it's necessary to email all 4,678 projects when only a small fraction have actually used their legacy flag recently. Perhaps we should set a timeframe instead: if they've uploaded a deprecated distribution type in the last year?

Yes, that sounds reasonable.

Gentle nudge on this, given https://discuss.python.org/t/3115.

OK, next steps would be:

  1. querying to get the subset of the 4,678 projects with allow_legacy_files set that have uploaded a legacy file type in the last year
  2. querying for email addresses for all their maintainers
  3. drafting the email
  4. sending a bulk email

Would anyone like to help with step 3?

  3. Something along the lines of this?

Hello,

We're emailing because you're listed as the maintainer for a package that has uploaded a legacy file type to PyPI in the past year:

bdist_dmg
bdist_dumb
bdist_msi
bdist_rpm
bdist_wininst

Following PEP 527, it will soon not be possible to upload legacy file types. Existing uploads will remain on PyPI, but soon new ones cannot be uploaded.

https://www.python.org/dev/peps/pep-0527/

This restriction will apply to new uploads after 2020-04-01 [TODO decide exact date, must be at least 1 month from email date].

See PEP 527 for suggestions of replacement file types, and if you have any questions, please visit https://github.com/pypa/warehouse/issues/6792 [TODO or https://discuss.python.org/somewhere or somewhere else?].

Thank you,

[TODO]

OK, I've sent the notices to everyone that's uploaded one of these packages in the last year. The shutoff date is 30 days from today (2020-04-12).
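For reference, the date arithmetic behind that deadline can be checked in a couple of lines. The send date printed below is derived from the stated shutoff date and the 30-day notice period; it is not stated explicitly in the thread.

```python
from datetime import date, timedelta

shutoff = date(2020, 4, 12)         # shutoff date stated in the notice
notice_period = timedelta(days=30)  # "30 days from today"

# Implied send date of the notices (derived, not stated in the thread).
sent = shutoff - notice_period
print(sent.isoformat())  # 2020-03-13
```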

For posterity, here's the SQL script I used to generate the affected users/projects:

-- Map each recent legacy upload to the users holding a role on its project.
SELECT
  user_id,
  projects.name as project_name,
  packagetype
FROM
  (
    -- One row per (user, project, file type).
    SELECT
      roles.user_id as user_id,
      roles.project_id as project_id,
      packagetype
    FROM
      (
        -- One row per (project, file type).
        SELECT
          project_id,
          packagetype
        FROM
          (
            -- Releases with a legacy file uploaded in the last 365 days.
            SELECT
              release_id,
              packagetype
            FROM
              release_files
            WHERE
              (
                packagetype IN (
                  'bdist_dmg', 'bdist_dumb', 'bdist_msi',
                  'bdist_rpm', 'bdist_wininst'
                )
                AND "upload_time" > (
                  localtimestamp - interval '365 days'
                )
              )
            GROUP BY
              release_id,
              packagetype
          ) f
          JOIN releases ON releases.id = f.release_id
        GROUP BY
          project_id,
          packagetype
      ) release
      JOIN roles ON release.project_id = roles.project_id
    GROUP BY
      user_id,
      roles.project_id,
      packagetype
  ) p1
  JOIN projects ON p1.project_id = projects.id;

Ran that like so:

psql service=pypi -t -A -F"," -f pep527.sql > pep527.csv

Then used the following script to turn that output into a CSV of mass emails:

import csv
from collections import defaultdict

users = defaultdict(list)
subject = "[PyPI] Notice: Deprecation of underused file types/extensions"
body_template = """
Hello,

We're emailing because you're listed as a maintainer or owner for a package that has uploaded a legacy file type to PyPI in the past year:

{project_list}

Following PEP 527, it will soon not be possible to upload legacy file types.

https://www.python.org/dev/peps/pep-0527/

This restriction will apply to new uploads after 30 days from today (2020-04-12). Existing uploads will remain on PyPI, but soon new ones cannot be uploaded.

See PEP 527 for suggestions of replacement file types, and if you have any questions, please comment on the tracking issue for this deprecation:

https://github.com/pypa/warehouse/issues/6792

Thank you,
The PyPI Administrators
"""

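# psql -t emits rows without a header line, so name the columns explicitly.
with open("pep527.csv", newline="") as f:
    reader = csv.DictReader(
        f, fieldnames=["user_id", "project_name", "packagetype"]
    )

    # Collect every (project, file type) pair under the user who maintains it.
    for row in reader:
        users[row["user_id"]].append((row["project_name"], row["packagetype"]))

# newline="" lets the csv module control line endings in the multi-line bodies.
with open("pep527-complete.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["user_id", "subject", "body_text"])
    writer.writeheader()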

    for user_id, projects in users.items():
        project_list = "\n".join(
            f"* Project: {project_name}, package type: {packagetype}"
            for project_name, packagetype in projects
        )

        writer.writerow(
            {
                "user_id": user_id,
                "subject": subject,
                "body_text": body_template.format(project_list=project_list),
            }
        )
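
When reusing a script like the one above, it's worth remembering that `psql -t` prints rows only, with no column-name header, while `csv.DictReader` without `fieldnames` treats the first line of its input as the header. The sketch below shows the difference on an in-memory file; the sample rows (user IDs and project names) are made up for illustration.

```python
import csv
import io

# Made-up sample of headerless psql output (IDs and names are illustrative).
sample = "u1,example-a,bdist_wininst\nu2,example-b,bdist_dumb\n"
columns = ["user_id", "project_name", "packagetype"]

# Without fieldnames, the first data row is consumed as the header.
implicit = list(csv.DictReader(io.StringIO(sample)))

# With explicit fieldnames, every row is returned as data.
explicit = list(csv.DictReader(io.StringIO(sample), fieldnames=columns))

print(len(implicit), len(explicit))  # 1 2
print(explicit[0]["project_name"])   # example-a
```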