Warehouse: Valid `Author-email` and `Maintainer-email` fields are rejected

Created on 15 Dec 2017 · 9 Comments · Source: pypa/warehouse

Per https://packaging.python.org/specifications/core-metadata/#author-email-optional:

A string containing the author's e-mail address. It can contain a name and e-mail address in the legal forms for a RFC-822 From: header.

Example:

Author-email: "C. Schultz" <[email protected]>

However, if the provided Author-email field is set to the example value, the resulting distribution is rejected:

HTTPError: 400 Client Error: author_email: Invalid email address. for url: http://warehouse.local/legacy/

This happens because we're validating that this field is a bare email address, rather than a valid RFC-822 style header value.
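To illustrate the difference, here is a minimal sketch of a validator that accepts RFC 822 style values rather than only bare addresses. It is not Warehouse's actual validation code: the function name `is_valid_author_email` and the example address are hypothetical, and the "has a local part and a domain" check is a deliberately loose assumption rather than full RFC 822 validation.

```python
from email.utils import getaddresses


def is_valid_author_email(value: str) -> bool:
    """Hypothetical check: accept one or more RFC 822 style addresses."""
    # getaddresses() splits a header value into (display name, address) pairs.
    parsed = getaddresses([value])
    if not parsed:
        return False
    for _name, address in parsed:
        # Loose sanity check (an assumption, not full RFC 822 validation):
        # every address needs a non-empty local part and domain.
        local, sep, domain = address.rpartition("@")
        if not (sep and local and domain):
            return False
    return True


# Hypothetical example values, modelled on the one in the specification.
print(is_valid_author_email('"C. Schultz" <c.schultz@example.com>'))
print(is_valid_author_email("c.schultz@example.com"))
print(is_valid_author_email("not an email"))
```

With a check along these lines, both the name-plus-address form and a bare address pass, while a value with no address in it is still rejected.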

bug good first issue

All 9 comments

I am willing to take a shot at this. Would it be fine if I ask my possibly dumb questions here?

@pradyunsg no question is a dumb question :)
Yes, if you have any questions, please ask here - we will do our best to help you.

Just wanted to point out to any potential new contributors that @pradyunsg hasn't submitted a PR for this issue (yet!) so it's still up for grabs.

I am not sure that the definition of the format matches the data model of the Metadata PEP here. There are distinct fields for author and author email, so are we sure that all RFC 822 formats should be accepted? Would this break the various Metadata file parsers?

Same question for multiple authors.

I am not sure that the definition of the format matches the data model of the Metadata PEP here. There are distinct fields for author and author email, so are we sure that all RFC 822 formats should be accepted?

@merwok If I follow you correctly, you're pointing out that an RFC 822 style email may contain the author's name, which seems redundant when there is an Author field as well. I don't think this is a problem.

Would this break the various Metadata file parsers?

I don't think this would break anything in Warehouse. The pkginfo project already interprets these fields via the email stdlib module, so I don't think it would break anything there either. To my knowledge, nothing depends on these fields being a single valid email address, because this restriction is a relatively new addition and there are a fair number of packages that don't even have valid RFC 822 style values.

To be clear, this should be considered a regression: there are packages on PyPI already that are using this format for these fields, which were uploaded to pypi-legacy, and would fail if they were uploaded to Warehouse today.
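As a rough illustration of the point about the email stdlib module, the sketch below parses a PKG-INFO style document and splits the `Author-email` value into a name and an address. It does not reproduce pkginfo's own code; the metadata text, package name, and address are made up for the example.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical PKG-INFO content; core metadata uses RFC 822 style headers.
PKG_INFO = """\
Metadata-Version: 2.1
Name: example-package
Version: 1.0
Author: C. Schultz
Author-email: "C. Schultz" <c.schultz@example.com>
"""

msg = message_from_string(PKG_INFO)

# The raw header value is returned as written in the metadata.
raw_author_email = msg["Author-email"]

# parseaddr() separates the display name from the address itself.
name, address = parseaddr(raw_author_email)
print(name)
print(address)
```

A parser that reads the field this way sees the name-plus-address form as an ordinary header value, which is consistent with the expectation that accepting it would not break such tools.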

@pradyunsg hasn't submitted a PR for this issue (yet!)

Yeah... I've been swamped by some other work, so, I've not been able to get to this.

Hmm, so my thinking when I added the new validation was that people would sometimes just put complete garbage in that field. I think validating data here is still the right thing to do (because having nonsense values is bad, even if previous releases were able to upload with that). I don't think it's a bad idea to allow RFC 822 email addresses here, though it raises the question of how we should display them. Right now we generate a link that looks like <a href="mailto:THEEMAILADDRESS">THEAUTHORNAME</a>, but what do we do if we have Author-Name and an email address like Foo <[email protected]>? Do we just extract the email address? Change the display?

@dstufft I think that anything that is a valid RFC 822 value _should_ be a valid value for a mailto link, e.g. <a href="mailto:... foo@example.com">a link like this</a> would work just fine.

Heh, welp. I guess I forgot that worked.
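For the display question raised above, one of the options mentioned was to extract just the email address. The sketch below shows what that could look like. It is not Warehouse's template code, the helper `author_link` is hypothetical, and falling back from the Author field to the display name and then to the address is an assumption made for the example.

```python
from email.utils import parseaddr
from html import escape
from urllib.parse import quote


def author_link(author: str, author_email: str) -> str:
    """Hypothetical helper: build a mailto link from an RFC 822 style value."""
    name, address = parseaddr(author_email)
    # Assumed preference order for the link text: the Author field,
    # then the display name in the email value, then the bare address.
    label = author or name or address
    # Percent-encode the address for use in a URL, keeping "@" readable.
    href = "mailto:" + quote(address, safe="@")
    return f'<a href="{escape(href)}">{escape(label)}</a>'


print(author_link("", "Foo <foo@example.com>"))
print(author_link("Bar", "foo@example.com"))
```

Extracting the address keeps the existing link shape unchanged. Putting the full RFC 822 value into the mailto target, as suggested in the thread, would be the alternative.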
