Hugo: proposal: Author model

Created on 23 Feb 2017  ·  23 Comments  ·  Source: gohugoio/hugo

Most of the work I've done on Hugo lately has been with simplicity in mind: mostly for the end users, but also to simplify future maintenance and extensions. We may have confused some people with the _index.md thing, but the code cleaned up nicely (we now have only one pageRender method, for starters), and the multilingual feature landed nicely.

So any work to make authors a first-class Hugo citizen should not need to be accompanied by a three-page manual.

Here is my design proposal, in pseudo code.

Note that the author attributes below were picked at random and are not really important to this discussion. We should agree on a set of common attributes so they can be used in themes, but author should be schema-less, i.e. a map, so if someone wants to add _favourite_colour_, so be it.

Pseudo-coded site config:

authors:
  - id: spf13
    weight: 1
    name: Steve Francia
    favouriteColour: green

languages:
  no:
    authors:
      - id: bep
        name: Bjørn Erik Pedersen
        favouriteColour: blå
      - id: spf13
        favouriteColour: grønn
        slogan: "Hugo er best!"
  de:
    authors:
      - id: spf13
        favouriteColour: blau

So given the above, I see the following rule set:

  • Author and Authors should be on Page; Author should be the first in a sorted list, sorted by weight etc., see menus.
  • Both author and authors (a slice) should work in frontmatter. This will be the id of the authors, e.g. bep in the example above.
  • It should also be possible to set author and authors in site config. These will be the default values for pages that set neither.
  • The author configuration can be parsed top-down: Each language can override author attributes from the global configuration, but not from siblings (spf13 will only have a slogan in Norwegian).
  • The configuration loading should take inspiration from the output and media package.
  • I guess this should even have its own package ... people, or author?

  • The author config should also be externalised at some point, see #3090.
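The top-down override rule can be sketched as follows. This is a minimal illustration in Python rather than Hugo's Go; the function name `resolve_authors` and the list-of-dicts layout are assumptions made for the example, not part of the proposal.

```python
from copy import deepcopy


def resolve_authors(global_authors, language_authors):
    """Merge one language's author overrides on top of the global definitions.

    Both arguments are lists of dicts carrying an "id" key (an assumed layout).
    Attributes from the language block win; sibling languages are never consulted.
    """
    merged = {author["id"]: deepcopy(author) for author in global_authors}
    for override in language_authors:
        merged.setdefault(override["id"], {}).update(deepcopy(override))
    return merged


site = [{"id": "spf13", "weight": 1, "name": "Steve Francia", "favouriteColour": "green"}]
norwegian = [
    {"id": "bep", "name": "Bjørn Erik Pedersen", "favouriteColour": "blå"},
    {"id": "spf13", "favouriteColour": "grønn", "slogan": "Hugo er best!"},
]
german = [{"id": "spf13", "favouriteColour": "blau"}]

no_authors = resolve_authors(site, norwegian)
de_authors = resolve_authors(site, german)
```

With the example config, `spf13` keeps the global name in both languages, gets a language-specific colour in each, and has a slogan only in Norwegian.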

Social links

I suggest we just name it socialLinks, so .Author.socialLinks will give you an ordered (ordered by Weight) list of:

id: "twitter"
name: "Twitter"
weight: 1
url: "https://twitter.com/bepsays"

The SocialLinks can be defined like this:

[socialLinks]
[socialLinks.facebook]
name = "Facebook"
weight = 1
urlTemplate = "https://www.facebook.com/%s"

Hugo should ship default definitions for the most common ones; users can add or redefine them in site config (see MediaType and OutputFormat).

The social links can then be added to a given author:

authors:
  - id: bep
    socialLinks:
      twitter: ["bepsays"]

Note the above is pseudo-code, and it should probably support both slices and a single string as placeholders.
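A rough sketch of how the definitions and the per-author placeholders could combine into the ordered list, written in Python for brevity. The function name `expand_social_links`, the Twitter `urlTemplate`, the Facebook handle, and the Facebook weight of 2 are assumptions for illustration only.

```python
def expand_social_links(definitions, author_links):
    """Build an author's ordered social links from site-wide definitions.

    definitions:  {id: {"name": ..., "weight": ..., "urlTemplate": ...}}
    author_links: {id: "handle"} or {id: ["handle", ...]}
    """
    links = []
    for link_id, placeholders in author_links.items():
        definition = definitions[link_id]
        # Accept a single string as well as a slice of placeholders.
        if isinstance(placeholders, str):
            placeholders = [placeholders]
        links.append({
            "id": link_id,
            "name": definition["name"],
            "weight": definition["weight"],
            "url": definition["urlTemplate"] % tuple(placeholders),
        })
    return sorted(links, key=lambda link: link["weight"])


definitions = {
    "twitter": {"name": "Twitter", "weight": 1, "urlTemplate": "https://twitter.com/%s"},
    "facebook": {"name": "Facebook", "weight": 2, "urlTemplate": "https://www.facebook.com/%s"},
}
links = expand_social_links(definitions, {"facebook": "bep", "twitter": ["bepsays"]})
```

Here `links` comes back with Twitter first because of its lower weight, and each entry carries the fully expanded `url`.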

General

We should consider casing issues in all of the above. I suggest:

  • Creating one "big" map from the structs involved for a given author.
  • Having author info in the page and site param will help, as .Param "author.firstNAME" will work fine.
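One way to picture the "one big map" idea is to lower-case every key once and then resolve dotted lookups without regard to case. The sketch below is in Python; the helper names `lower_keys` and `param` and the sample values are hypothetical and only mimic the behaviour of `.Param`.

```python
def lower_keys(value):
    """Recursively lower-case every key of a nested dict."""
    if isinstance(value, dict):
        return {str(key).lower(): lower_keys(inner) for key, inner in value.items()}
    return value


def param(params, dotted_key):
    """Look up a dotted key such as "author.firstNAME" without regard to case."""
    node = lower_keys(params)
    for part in dotted_key.lower().split("."):
        node = node[part]
    return node


page_params = {"Author": {"firstName": "Bjørn Erik", "favouriteColour": "blå"}}
first_name = param(page_params, "author.firstNAME")
```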

The above is not complete, but it covers the most important aspects.

Keep Proposal

All 23 comments

Is this issue a revised proposal of #1850?

This proposal defines the authors in the config file instead of using data files as in #1850. What advantages do you see here? I would rather try to keep the config file flat.

This is "in progress"; we can take the discussion when I'm happy about my proposal.

> This is "in progress"; we can take the discussion when I'm happy about my proposal.

Sorry about that. I interpreted "in progress" as ready to be discussed so that the draft can be shaped iteratively. Let's forget what I wrote above for now.

/cc @digitalcraftsman @derekperkins @spf13 @moorereason

The above is my take on how Hugo should unify its author handling. The author schema itself is almost identical to @derekperkins's, but with some fundamental changes in how it is wired up.

Comments welcomed.

Overall, I'm fine with the proposal.

> Note that the attributes on author below are picked by random, and not really important to this discussion. We should agree on a set of common attributes, so they can be used in themes, but author should be schema-less

@derekperkins already defined a list of common attributes in his initial attempt to implement this feature. I propose that we keep the following list so that themes can work with a standardized set of attributes.

  • ID - used to identify an author with a short handle, i.e. "bep" to identify "Bjørn Erik Pedersen"
  • GivenName - givenName OR firstName of the author
  • FirstName - alias for GivenName (perhaps this is redundant)
  • FamilyName - familyName OR lastName of the author
  • LastName - alias for FamilyName of the author (perhaps this is redundant as well)
  • DisplayName - the displayed name of the user; it could differ from the author's real name
  • Thumbnail - a smaller picture of the author (i.e. for the info box of an author at the end of a post)
  • Image - a larger full-sized version of the Thumbnail
  • ShortBio - a shortened version of the author's biography
  • Bio - an extended version of the author's bio
  • Email
  • Params - should work analogous to the same-named block in the config file
  • Weight - ranks authors, e.g. by importance

> I guess this should even have its own package ... people, or author?

author should be the more descriptive package name

> The author config should also be externalised at some point, see #3090.

I like the idea. The authors files should live in data/_authors. In #3090 there was a suggestion to use data/_config, so this would follow the same scheme. As a general rule, all Hugo-specific data folders should be prefixed with _.

Social links

What's actually the difference between id and name in the example for the social links above? Specifically, I'm talking about the first example with twitter.

> What's actually the difference between id and name in the example for the social links above.

The ID is the key and is expected to be (more) constant (as it is used as a "foreign key" in author definitions).

So if you have:

id: gplus
name: "Google Plus"

And want to change the name to "Google+", you can do so safely without having to update all the authors.

I agree that this is great, from https://github.com/spf13/hugo/issues/3090:

> Site config belongs in config.toml and cousins, but it can optionally be put in files below /data/_config/. Keys in config.toml will always win.

I would do this with authors, even if there were only three, and would probably move .Site.Params in there too, but that's a different convo...

@digitalcraftsman I know that this is only a draft. I would be careful with FirstName and the like, since not all cultures put first names first (e.g., Japanese), but I understand this can be discussed later too :smile:

A Consideration (Maybe)

My question is re: weight and how it would be used. I come from STM publishing, where author order can have very important and inherent meaning at the page metadata level. Assuming that authors is no. 3 or no. 4 among the most popular taxonomies, after tags, categories, and series, for many Hugo sites:

authors: ["Noam Chomsky","Howard Zinn","Christopher Hitchens"]
---

The order of this could mean that Chomsky is the primary author with Zinn and Hitchens as co-authors OR Chomsky as primary with Zinn as co-author and Hitchens as technical reviewer, etc. Or I might add taxonomic *_weight for these authors based on certain criteria.

I appreciate this may very well have _zero_ impact on what you're considering, but I'm throwing it out there in hopes it may bring to light a less-considered perspective.

> authors: ["Noam Chomsky","Howard Zinn","Christopher Hitchens"]

The usage of the order is contextual, i.e. it depends on the theme/site.

The example you're giving does not imply any order (as we have defined it).

A complete example would be:

Site config:

authors:
  a:
    name: Noam Chomsky
    weight: 100
  b:
    name: Howard Zinn
    weight: 200
  c:
    name: Christopher Hitchens
    weight: 300
  d:
    name: Charles Dickens
    weight: 300

And in front matter:

````

authors: ["c", "d", "b", "a"]

````

Given the above:

Author => a
Authors => a, b, c, d (c and d tie on weight)

The above may not be _the way it should be_, but that is what this discussion is all about.

@bep, I think @rdwatters is proposing that the order of the authors, as defined in the content frontmatter, could be the default sort order instead of weight. I tend to agree. It would be easier and more intuitive for the content editor to simply change the order in the frontmatter than it would be to add weights.

So, in your last example, you would get:

Author => c
Authors => c, d, b, a

If you wanted to sort them by weight, you would do AuthorsByWeight or something. Alternatively, provide a way to get them in the order they were defined (AuthorsBy???).
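The two orderings under discussion can be compared side by side. This is a small Python sketch using the example data from this thread; the function names `authors` and `authors_by_weight` are placeholders, not an agreed API.

```python
site_authors = {
    "a": {"name": "Noam Chomsky", "weight": 100},
    "b": {"name": "Howard Zinn", "weight": 200},
    "c": {"name": "Christopher Hitchens", "weight": 300},
    "d": {"name": "Charles Dickens", "weight": 300},
}
front_matter_authors = ["c", "d", "b", "a"]


def authors(ids):
    """Authors in the order the content editor listed them in front matter."""
    return list(ids)


def authors_by_weight(ids, definitions):
    """Authors sorted by weight; ties keep their front matter order (stable sort)."""
    return sorted(ids, key=lambda author_id: definitions[author_id]["weight"])


default_order = authors(front_matter_authors)
weight_order = authors_by_weight(front_matter_authors, site_authors)
```

With front matter order as the default, the primary author is simply `default_order[0]`, i.e. `c`; sorting by weight instead puts `a` first.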

Background information: "Falsehoods Programmers Believe About Names". I don't think we have to worry about people who don't have names yet, but…

- GivenName - givenName OR firstName of the author
- FirstName - alias for GivenName (perhaps this is redundant)
- FamilyName - familyName OR lastName of the author
- LastName - alias for FamilyName of the author (perhaps this is redundant as well)
- DisplayName - the displayed name of the user; it could differ from the author's real name

I strongly suspect the existence of FirstName and LastName will encourage Western European theme authors to write things like "{{person.FirstName}} {{person.LastName}}" when Hungarians will want "{{person.FamilyName}} {{person.GivenName}}" and Japanese, Chinese, and Koreans will want "{{person.FamilyName}}{{person.GivenName}}". And then there are the Indonesians with one name, Singaporeans with double-ended names, people with multiple middle names, and super-long Arabic names.

Suppose Abraham Lincoln, Rubik Ernő, 홍길동, Daryl Koh Pei Xiang, Suharto, Muhammad ibn Salman ibn Ameen al-Farsi, and Stefani "Lady Gaga" Germanotta all start a group blog* and they all want their names written as I've written them here. How should almost all themes generally handle names? As far as I can tell, theme authors should be strongly encouraged by the docs to use _only_ DisplayName, and site authors should be strongly encouraged by the docs to use _only_ DisplayName unless they have very specific sorting problems. While it's unlikely that any given Hugo site will have names in all of these different styles, any given theme has a good chance of being used by people who have names in three or more of these styles.

* I was tempted to write "walk into a bar", but Github Flavored Markdown doesn't have strikethrough. Pity.

Just a wild thought after reading @adiabatic's comment:

With {{ .DisplayName }} I think it would make sense to have a configurable default "template" in the site config, for example:

authors:
  default:
    DisplayName: "{{ .FirstName }} {{ .LastName }}"

If one of the authors needs a different schema, it can be overwritten in the author config.

authors:
  a: 
    firstname: Bob
    lastname: Test
  b:
    firstname: Alice
    lastname: Test
    DisplayName: "{{ .LastName }}, {{ .FirstName }}"

Now using {{ .DisplayName }} for a would result in "Bob Test" and for b in "Test, Alice".

Theme authors could then use {{ .DisplayName }} everywhere, and theme users could configure their name templates in the site's config file, both globally and on a per-author basis.

@adiabatic thanks for pointing out that cultures have different habits regarding names.

@kevingimbel's suggested approach looks like a very flexible solution to this problem. And it also defines a de facto standard for a template variable that should be used consistently across all themes.

But when customizing the user names we have to prevent users from adding recursion (e.g. DisplayName: "{{ .DisplayName }}") or mutual dependencies where B depends on A and A requires the variable B in its template.
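A minimal sketch of the default-template idea with a guard against self-reference. It is written in Python with a tiny regex-based renderer standing in for Go templates; `display_name`, `PLACEHOLDER`, and the single-pass substitution are assumptions of this example, not how Hugo would necessarily do it.

```python
import re

# Matches Go-template-style field references such as "{{ .FirstName }}".
PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")
DEFAULT_TEMPLATE = "{{ .FirstName }} {{ .LastName }}"


def display_name(author, default_template=DEFAULT_TEMPLATE):
    """Render DisplayName from the author's own template or the site default."""
    template = author.get("DisplayName", default_template)
    fields = {key.lower(): value for key, value in author.items()}

    def substitute(match):
        name = match.group(1).lower()
        if name == "displayname":
            # A template that refers to itself could never terminate.
            raise ValueError("DisplayName template must not reference DisplayName")
        return str(fields.get(name, ""))

    return PLACEHOLDER.sub(substitute, template)


bob = {"firstname": "Bob", "lastname": "Test"}
alice = {"firstname": "Alice", "lastname": "Test",
         "DisplayName": "{{ .LastName }}, {{ .FirstName }}"}
```

Because only one substitution pass is made, a field's value is never itself rendered as a template, which sidesteps mutual dependencies in this sketch.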

Just so I can make more informed comments on this thread: are we discussing

  1. Implementation w/r/t how functions would be called and how authors would be configured
  2. Standardizing on metadata
  3. Both 1 & 2

???

And @moorereason Yes, that is what I was saying. If prioritizing convenience for content authors, it makes more sense for the author array to be an explicit ordering...

I think it's 3. All of it is important if we want to get authors right and make themes truly portable.

Given / Family / Display name

I used those first when I built the author integration and later added FirstName and LastName aliases when people complained. Based on @adiabatic's reasoning, I think they should be removed as aliases to prevent theme authors from only using them "incorrectly".

I had the concept of a default display name template originally. If there's a way to safely allow for a template override like @kevingimbel suggested, I think that's a great idea.

Author sorting

  • When referenced from front matter, the output should definitely default to the order that the user specified the authors.
  • When referencing site authors, I agree with @bep that it should default to weight sorting first, then it should have a secondary sort by family name and given name
  • There should be multiple built-in sorting options: ByWeight, ByFamilyName, ByGivenName and ByDisplayName for sure.
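The site-level sort with its tie-breakers might look like the following Python sketch. The helper names `by_weight` and `by_family_name` and the `givenName`/`familyName` keys are hypothetical; the names and weights are taken from the example earlier in this thread.

```python
from operator import itemgetter

site_authors = [
    {"id": "c", "givenName": "Christopher", "familyName": "Hitchens", "weight": 300},
    {"id": "d", "givenName": "Charles", "familyName": "Dickens", "weight": 300},
    {"id": "b", "givenName": "Howard", "familyName": "Zinn", "weight": 200},
    {"id": "a", "givenName": "Noam", "familyName": "Chomsky", "weight": 100},
]


def by_weight(authors):
    """Weight first, then family name, then given name as tie-breakers."""
    return sorted(authors, key=itemgetter("weight", "familyName", "givenName"))


def by_family_name(authors):
    """Family name first, then given name."""
    return sorted(authors, key=itemgetter("familyName", "givenName"))


weight_ids = [author["id"] for author in by_weight(site_authors)]
family_ids = [author["id"] for author in by_family_name(site_authors)]
```

Note that the secondary sort changes the outcome for the two authors who share weight 300: Dickens now precedes Hitchens.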

Common fields

I still agree with myself from the original proposal. :)

  • GivenName
  • FamilyName
  • DisplayName - the displayed name of the user; it could differ from the author's real name
  • Thumbnail - url to a smaller picture of the author (i.e. for the info box of an author at the end of a post)
  • Image - url to a larger full-sized version of the Thumbnail
  • ShortBio - a shortened version of the author's biography
  • Bio - an extended version of the author's bio
  • Email
  • Params - should work analogous to the same-named block in the config file

I think we could be smart about the bio fields, either truncating the bio to N characters if the short bio is empty, or falling back on shortBio if Bio is empty.
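The two fallbacks could be sketched like this in Python. The function names, the default limit of 140 characters, and the choice to cut at a word boundary and append an ellipsis are all assumptions made for the example.

```python
def short_bio(author, limit=140):
    """Return ShortBio, or Bio truncated to `limit` characters when ShortBio is empty."""
    short = author.get("shortBio", "")
    if short:
        return short
    bio_text = author.get("bio", "")
    if len(bio_text) <= limit:
        return bio_text
    # Cut at the last space before the limit so words are not split.
    cut = bio_text.rfind(" ", 0, limit)
    return bio_text[: cut if cut > 0 else limit].rstrip() + "…"


def bio(author):
    """Return Bio, falling back on ShortBio when Bio is empty."""
    return author.get("bio") or author.get("shortBio", "")


truncated = short_bio({"bio": "Writes Go and maintains a static site generator."}, limit=20)
```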

Author config location

I like the cascading approach, with the default being under data/_authors, which can be overridden in the site config.

What about multiple files? My original proposal used the filename as the ID, so you could have one file per author. Definitely pros and cons, so I'm not tied to that. In fact, I think the config should ingest all the files in the directory, ignoring filenames. That gives people the flexibility to split files however they want, with the ID explicitly specified with the other fields.

Social links

I like having the urlTemplate by default. People are notorious for using the wrong social links, e.g. not using https, including/excluding www improperly. Out of the box, I think Hugo should scrub and fix those by default. The other implementation details look good to me.
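Scrubbing could mean reducing whatever the user typed to a bare handle before applying the urlTemplate. The Python sketch below shows one possible approach; `scrub_handle`, `social_url`, and the Twitter template are hypothetical, and real-world profile URLs will have more edge cases than this handles.

```python
from urllib.parse import urlparse


def scrub_handle(value):
    """Reduce a user-supplied handle or profile URL to a bare handle."""
    value = value.strip()
    if "://" not in value and "/" in value:
        # Treat "twitter.com/bepsays" like a URL that is missing its scheme.
        value = "https://" + value
    if "://" in value:
        path = urlparse(value).path
        value = path.strip("/").split("/")[-1]
    return value.lstrip("@")


def social_url(url_template, value):
    """Apply the site-wide template to the scrubbed handle."""
    return url_template % scrub_handle(value)


template = "https://twitter.com/%s"
variants = ["@bepsays", "http://www.twitter.com/bepsays/", "twitter.com/bepsays"]
urls = [social_url(template, variant) for variant in variants]
```

All three input variants end up as the same canonical https URL, regardless of scheme, `www`, trailing slash, or a leading `@`.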

schema.org has honorificPrefix (Dr./Mr./Mrs.) and honorificSuffix (M.D./Ph.D/MSCSW).

Adding HonorificPrefix(es) and HonorificSuffix(es) may be either a good idea or a bad idea, but I think we ought to think about them if only to conclude that they're a bad idea and/or we don't need them.

@bep I may be wrong here, but was this proposal implemented? From what I understand it has not been, so maybe I could chime in.

I would personally prefer keeping things as simple as possible, which means that probably just a DisplayName could be used, as suggested by @kevingimbel. The sorting could also initially be based only on the weights, as originally suggested. This might keep things simple while thinking it through, and maybe even while implementing.

I base this call for simplicity on the Keep It Simple Stupid mantra.

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

I am commenting because of the stale label attached to the issue.

I think this feature request is important. It should be tagged with the "Keep" label since no activity in another 21 days would automatically close the issue (based on .github/stale.yml). The issue number #1850 could be closed in favour of either this or #3776. This is based on my limited understanding, and could be wrong.

Additionally, I would request that the stale timeout be increased to 60 days. This is based on the totally unscientific reasoning that between December 6 and December 27 a lot of folks will be more engaged elsewhere, and hence may end up missing the stale-to-closed transition for valid issues. This may lead to many valid issues being closed and subsequent requests to reopen them next year.

OK, I'm probably not the only one confused: what does this implementation refer to as of now?
/cc @acl2358
EDIT: Answer https://github.com/gohugoio/hugo/issues/4458. This issue needs a mind-map to keep track, already. Hopefully nobody is scared away by this fact in promoting implementation...

A more useful set of questions:

  • What's the state of the common attribute set? Discussion seems pretty advanced. Maybe a decision can be confined to this domain.
  • What's the state of implementation details? I also see discussion in an advanced state.
  • Still somewhat open: what the configuration should look like.

As for that one, an author database seems the most compelling approach to me. But for a single author, the site config is enough, I guess.

Is there anything non-trivial holding this back?

Can this be added soon? But instead of hard-coding data files, make it configurable where author data is stored. It can be in data or in a taxonomy, as many sites use taxonomies for multiple authors. This data can be made available through .Site.Author and .Page.Author.

I think it is best not to have a model at all. Let the values be loaded from a config file, data file or taxonomy. The configuration could be as simple as:

[author]
type = <'config'|'data'|'taxonomy'>
data = 'author'

It looks for the data file, config file or taxonomy called author and populates the data. The minimum requirement should be that the author has an ID and at least a name in a given language's character set.

One thing I'm surprised not to see here is consideration of the existing standards for describing authors and/or persons, like Dublin Core, FOAF, OpenGraph and other metadata. Sure, themes can build them from these configurations, but if some of the metadata are to be predefined, using previously agreed-upon items like those could make things more predictable and overall easier.
