Warehouse: Author of package not shown

Created on 7 Apr 2016  ·  29 Comments  ·  Source: pypa/warehouse

My package, Requests, is registered and owned by me. I also have two maintainers who contribute significantly to the project.

Currently, in the Warehouse UI, there appears to be no indication that I am the owner of my package. This is very important to me, as a package author.


All 29 comments

By contrast, the current PyPI explicitly shows "Author: Kenneth Reitz".

So, perhaps this isn't as much about surfacing ownership of the package namespace as it is surfacing the "Author" metadata.

I personally think that should be one of the most visible pieces of information presented for a package — it's one of the first things I typically look at.

In my mind, it belongs up in the blue area, next to the package name, in a smaller typeface.

e.g.

requests 2.9.1
by Kenneth Reitz
Python HTTP for Humans.

To be clear, Author: Kenneth Reitz is a different thing than the Owners/Maintainers. What now looks like:

screen-shot-2016-04-07-13-44-01

used to look like:

screen-shot-2016-04-07-13-45-39

As far as the "What users can manage this project", I personally think making the list "flat" and making Owners and Maintainers (in the PyPI permissions sense) show up the same is a reasonable thing to do.

That being said, there is some metadata that we're not currently exposing: Platform, Supported-Platform, Author, Author-Email, Maintainer, Maintainer-Email, License, Classifier, Requires-Dist, Provides-Dist, Obsoletes-Dist, Requires-Python, Requires-External, Requires, Provides, Obsoletes (but some of these we're not exposing by design).

Related to this specific issue, we're not exposing the Author and Maintainer information and we should be. In the current metadata Author, Author-Email, Maintainer, and Maintainer-Email just take a single arbitrary string so we mostly just assume it's a single thing, however in the future we will likely make this able to take a list of values, so whatever design we pick should be able to handle multiple entries.

I don't know what we should do on the design side of this, that is @nlhkabu's realm, but in the templates adding support for these should be as simple as referencing {{ release.author }}, {{ release.author_email }}, {{ release.maintainer }}, and {{ release.maintainer_email }}, which should all be falsy if they are not set.

One thing to point out is that in the current design, Cory is arguably the most prominently displayed user on the requests page currently, largely because he was the last person to actually upload a file for requests. We already had one report about this, back when it said "by ..." instead of "uploaded by ...", that it was a bit confusing and wrongly placed emphasis on one particular author, so perhaps we should just ditch the "uploaded by ..." entirely?

I wouldn't be against that. For projects with multiple pypi maintainers (at least mine), the uploader is effectively just an implementation detail of who is available or who said "i'll do it".

On the other hand, if it was to remain the way it is, it would encourage me to want to once-again cut all of my own releases :)

Oh, and another reason why I think making the PyPI concept of Owners/Maintainers "flat" makes sense is that it's very case by case. In some cases you have projects where the original author (or the "Owner") stays heavily involved and the majority of the code is written by them. In other cases you have the original author who handed off the project (but still has the "Owner" flag on PyPI) and has stepped back, and someone else is in charge of the project now, possibly having completely rewritten the code so none of it is authored by the original author anymore. In addition, I think it more accurately represents the fact that as far as PyPI is concerned, there isn't much difference between an "Owner" and a "Maintainer" (the only real one being that Owners can add other people to the project, Maintainers can't).

I do think that we should keep the "Author" and "Maintainer" fields separate, and let users populate them however makes sense for their particular project.

Yeah, I have nothing against the maintainer's list being flat. My only concern is the lack of the Author field (ideally, shown with a gravatar, if possible).

Thanks for flagging this @kennethreitz

This is a result of some confusion at my end between maintainers, owners and authors.

My initial intention was to display the _author_ on the right hand side of the links bar. Likewise, I wanted the author to be displayed on each package card.

Somewhere in the mix, it ended up being the last maintainer to upload a package :(

So, let's change:

screenshot from 2016-04-18 08-33-53

to be:

By author. Last updated about an hour ago.

and

screenshot from 2016-04-18 08-36-04

to be:

2.10 by author.

@nlhkabu We can do that, the big question really is what do we do in edge cases. The Author and Author-Email field are free form text fields so we have a few possible scenarios:

  • Author-Email is a single email and it matches one that we have recorded for someone.
  • Author-Email is a single email and it doesn't match one that we have recorded for someone.
  • Author-Email is some sort of ad hoc list of multiple email addresses, which may be separated by commas, spaces, or something else.
  • Author-Email is non-existent, but we have an Author field.
  • Author and Author-Email are non-existent.

The first option is pretty easy to handle: match the email address up with the account on file and then use that to show the information we have available. The second option isn't too hard either, we'll just do what GitHub does and show the text without any sort of linking. For the third option we can probably try to guess whether it's a list of addresses and turn that into a "by foo, bar, and that other one" instead of just a single entry (maybe?). Probably the same thing with the fourth one, and the last one is probably best handled by just omitting the field altogether.
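As an illustration only, the five scenarios listed above could be told apart with something like the following Python sketch. The function name, the returned labels, the `known_emails` set, and the separator pattern are all hypothetical and are not part of Warehouse. The splitting is deliberately naive: a value such as `Name <addr>` would be misread as a list.

```python
import re

# Assumption: commas, semicolons, and whitespace are the likely ad hoc separators.
_SEPARATORS = re.compile(r"[,;\s]+")


def classify_author_email(author, author_email, known_emails):
    """Return a label naming which of the five scenarios applies.

    `known_emails` is a hypothetical set of lowercased addresses that are
    already on file for registered users.
    """
    author = (author or "").strip()
    parts = [p for p in _SEPARATORS.split(author_email or "") if p]

    if not parts:
        # No usable email: either only an Author name, or nothing at all.
        return "author_only" if author else "missing"
    if len(parts) > 1:
        # Looks like an ad hoc list of several addresses.
        return "email_list"
    if parts[0].lower() in known_emails:
        return "single_known"
    return "single_unknown"
```

Each label would then map to one of the display strategies described above: link to the matching account, show plain text, attempt a joined list, or omit the field.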

Looking into this trying to help out a little, @nlhkabu Have you had the opportunity to look into the latest question from @dstufft ? (I am assuming that you're the main designer :))

Hi @mrasband - indeed I am :) And I think @dstufft's solutions all sound reasonable.

We'll also need to account for the avatar on the project detail page. If it is a single author with an email address, we should show the avatar. If we don't have the email address, we omit it. If it is a list of authors, we omit it.

Does this make sense?
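A minimal Python sketch of that avatar rule, assuming the avatar comes from Gravatar (suggested earlier in the thread, but not confirmed as what Warehouse uses). The `avatar_url` helper and its list-of-emails argument are hypothetical. The hash follows Gravatar's long-standing scheme of an MD5 digest over the trimmed, lowercased address.

```python
import hashlib


def avatar_url(author_emails):
    """Return a Gravatar URL for exactly one author email, else None."""
    # A list of authors, or no email at all, means no avatar is shown.
    if len(author_emails) != 1:
        return None
    email = author_emails[0].strip().lower()
    if not email:
        return None
    digest = hashlib.md5(email.encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}"
```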

If you need any help on this issue, please log into #pypa on freenode and ping me (my nick is nlh). I'm happy to answer any questions :)

sorry, make that #papa-dev on freenode :)

#pypa-dev - Autocorrect I assume ;)

No... It was just that I made the comment before I had properly woken up :zzz:

I know that feeling :D

Perfect, thank you 👍

Ok, I've been messing around with this, and I've changed my answer. I just don't think we can/should support multiple authors/maintainers at this point. The metadata doesn't provide a structured way of dealing with that and attempting to hack it in is proving to be very frustrating. So for now, single author/maintainer only, and we can improve the metadata to allow more, and when we do that we can revisit adding more.

Correction, I've been fighting this some more, and I've been working through some issues with it, and I'm feeling like our current author/maintainer system simply isn't powerful enough to handle this in anything that's resembling a reasonable approximation. So let's back this up, we have several different kinds of information available to us right now:

  • A free form Author and Author-Email field.
  • A free form Maintainer and Maintainer-Email field.
  • A list of users that have permission to release on PyPI.

Trying to extract a list out of Author / Author-Email and Maintainer / Maintainer-Email is a no go, I've been attempting to do it and it doesn't really work in practice.

I've been thinking about the kind of things that people might want to indicate with this data, we have things like:

  • This is a project primarily by a single person.
  • This is a project primarily by a small group of people.
  • This is a project with a lot of people who work on it.

The top bar is really only going to work for 1-2, maybe 3 people before it's just filled up, but trying to shove this same information into the side bar I feel like it's going to be confusing when we already have "Maintainers" in the side bar that represents "Who has permission to release on PyPI".

Given the top bar has a limited amount of space, I feel like the right answer for anything that has more than a single answer is going to be putting it in the side bar, but that leaves the question of what data do we expose where?

Looking at some other systems:

  • _npm:_ They allow a singular author field (though I can't find where they surface that data in their design) and N contributor fields which they show in their side bar.
  • _crates.io:_ They allow a list of N strings that will represent the authors in a list on their web page, if these strings match some sort of formatting they'll do smart things to give a better representation. They also have an "Owners" section which matches our current "Maintainers" section.
  • _Rubygems:_ Similar to crates.io
  • Others ?

There's also the question of what we should do with #1261 for surfacing information on someone's user page.

To make this mess even more confusing, there are currently very few releases on PyPI which don't have _something_ in their author fields, but there are very few releases on PyPI which have _anything_ in their maintainer field. Of course every project on PyPI has associated accounts with permission to release.

To even further make this confusing, it appears that distutils doesn't currently even attempt to read the maintainer field, and if you do set it, it overrides the Author // Author-Email fields when it writes out the PKG-INFO.

Given all of this, here are my suggestions:

  • We ignore the Maintainer // Maintainer-Email field, it doesn't appear to be offering much value, particularly since distutils doesn't appear to actually surface this information.
  • We treat Author // Author-Email field as a single free form text field, with the following behaviors:

    • If there is just an Author field, with no Author-Email we render it as plain text in the side bar, under a header like "Author".

    • If there is just an Author-Email field, with no Author we render it as a mailto: link, with the email address itself as the anchor text.

    • If there is both an Author field and an Author-Email field we render it as a mailto: link to Author-Email, with the Author field as the anchor text.
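The three rendering rules above could be sketched in Python roughly as follows. This is only an illustration: `render_author` is a hypothetical helper, not Warehouse's template code, and it does no validation of the email value.

```python
from html import escape
from urllib.parse import quote


def render_author(author=None, author_email=None):
    """Return an HTML fragment for the sidebar, or None if nothing is set."""
    author = (author or "").strip()
    author_email = (author_email or "").strip()

    if not author and not author_email:
        return None  # Projects are free to omit this information.
    if not author_email:
        return escape(author)  # Author only: plain text.

    # Email present: use Author as the anchor text if we have it,
    # otherwise fall back to the email address itself.
    anchor = author or author_email
    href = "mailto:" + quote(author_email, safe="@")
    return f'<a href="{escape(href)}">{escape(anchor)}</a>'
```

For example, `render_author("The Warehouse Developers")` yields plain text, while passing both fields yields a mailto: link with the name as the anchor.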

This allows people with a project that is done by a single person to simply put their name/email there, and projects done by a group of people can omit the email (or use a mailing list) and have an author of something like "The Warehouse Developers". Of course projects will be free to omit this information completely too.

Now we move on to richer metadata. Currently we have a list of all users who have release permissions, but it strikes me that this isn't always going to be the correct information. For instance, the OpenStack project is going to have their "Maintainer" information in the sidebar pretty much always be whoever originally registered the project plus the openstackci bot. Other projects will want to recognize more people beyond the people who are allowed to release to PyPI.

Which of these scenarios do we want to support?

If we're OK with excluding the projects who want to recognize more people beyond those who are allowed to release to PyPI we can simply just use the list of users that are registered on PyPI and be done with it (and we could even handle that case a little better by adding a third role that doesn't have release permission and is just a contributor or something). However, this data is specific to PyPI itself and thus isn't shipped as part of the release artifact so it's less than ideal.

Another option is to add a new field to the metadata which is a list of contributors (Name/Email pairs) that we surface in the UI. This will take a bit more effort since it's adding new metadata to the ecosystem, but it travels along with the release artifacts which is great. There's also the question of how this would interact with the current data we have for "Who can release to PyPI", would we just combine the lists? show it separately? Hide the people who can release and show this new list instead? If we hide the people who can release how do we back populate data for the half a million releases that don't have this new metadata? Do we use the list of people who can release as a fallback if there's no-one in this new list? If we don't hide the people who can release and we show both of them, what do we call these three pieces of information we have available now (Author, People-Who-Can-Release-on-PyPI, New-List-of-Contributors) so that it's not confusing to people?

The simplest thing to do right now is to just treat the `Author` / `Author-Email` field as free form text in the side bar, and just use the list of users who can release for the rest and ignore anything else for the time being.

Oh, and I think we should just ditch any sort of user information in the project summary cards on the index page, search results, etc. I feel like there's no universally right answer for what should go in that field and the text is free form enough that we should just resist the temptation to guess, and instead just omit that information.

@dstufft is there metadata around who originally registered a package? If so, I would be in favor of it saying "created by <>".

That makes a subtle distinction between who is maintaining it and who created it. With Flake8, for example, one person created it and is no longer involved with it at all. Most people also think of Tarek as the person behind Flake8, so if they see "Flake8 created by Tarek Ziadé" they'll have a way of feeling certain that is the right package versus "FlakeEight created by Someone Else". It's not meant to be used for security purposes but I think it might be helpful if it is possible.

The simplest thing to do right now is to just treat the `Author` / `Author-Email` field as free form text in the side bar, and just use the list of users who can release for the rest and ignore anything else for the time being.

For launch I think this is the best solution.

So - let's:

  • [x] Add Author/Author-Email info into meta data section in sidebar. This should be the first meta data
  • [x] Move the meta data to above the maintainers list to place more importance on the author info
  • [x] Remove 'uploaded by' information on cards, as this may be confused with the author

@dstufft do you want to open another issue for improving the way we handle this in the long term (e.g. with new fields, etc?), or do you think this solution will suffice?

I really like this proposal. 👍

@nlhkabu I think this should suffice, we also probably want to render the Maintainer / Maintainer-Email directly below the Author / Author-Email but that should be trivial. I think any future improvements here are going to require changes to our data model and metadata fields.

I started to work on this but need some help with the rendering of the email addresses for the author and maintainer.

Should we just assume that if the email is provided, it really is an email, and render it as a link? Or should we display it after the name, like: Author: Timo Furrer, [email protected]
And should we display the email even if the name is not given?

We could also keep it really simple for now and just do something like:

Author: Timo Furrer
Author-Email: [email protected]
....

What do you think @nlhkabu ?

We shouldn't assume that the email is provided if the author is - the database is far too difficult to predict :P

From a design perspective, I'd prefer to see the author wrapped in a mailto link - but only if the email field exists.

@dstufft may be able to give some insight on what (if any) validation we can do to confirm that an author email is in fact a real email.

We certainly can do some validation on the email field. And if we present it in a mailto link I'd prefer to do a validation.

For the case which has an existing email address but no name I'd just show the email address as Author in a mailto link. Is that ok?
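The thread does not say what validation was chosen. As a hedged sketch, a deliberately loose check before wrapping a value in a mailto: link might look like this in Python. `looks_like_email` and its pattern are assumptions for illustration, not Warehouse's actual validator.

```python
import re

# Loose on purpose: exactly one "@", no whitespace, and a dot in the domain.
# This rejects ad hoc lists such as "a@x.org, b@y.org" but is not a full
# RFC 5322 address check.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value):
    """Return True if `value` is plausibly a single email address."""
    return bool(value) and _EMAIL_RE.match(value.strip()) is not None
```

A value that fails the check could still be shown as plain text instead of a link, which matches the "show the text without any sort of linking" fallback discussed earlier.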

Sounds great @timofurrer :)

Fixed in #1454.
