Ghost: Assets & Resources URL rules

Created on 11 Feb 2019 · 14 Comments · Source: TryGhost/Ghost

Context

Ghost deals with different types of URL:

  1. assets (images, js files, css files, maybe one day movies/pdfs/who knows) in fields
  2. assets in the HTML / content fields
  3. resource urls: the url for a post, a tag or an author - generated by Ghost
  4. relative paths in the HTML / content fields

At the moment Ghost treats all 4 of these the same. They are all returned as absolute by default since API v2.

Assets

Assets that are served by Ghost, i.e. anything with a path that starts with {subdir}/content/images/..., always need to be absolute URLs. The file was uploaded to Ghost, so it will be served from Ghost. All assets MUST always be served as an absolute URL from any API. This means that any content/images URL should appear absolute anywhere other than in the DB.
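As a rough illustration of the asset rule above, the sketch below turns a stored relative asset path into an absolute URL for output. The helper name `toAbsoluteAssetUrl` is hypothetical and not part of Ghost. It also assumes that stored paths already include the subdirectory, as in {subdir}/content/images/....

```javascript
// Hypothetical helper, not Ghost's actual implementation.
// siteUrl is the configured site URL and may include a subdirectory.
function toAbsoluteAssetUrl(storedPath, siteUrl) {
    const site = new URL(siteUrl);
    // '/blog/' -> '/blog', '/' -> ''
    const subdir = site.pathname.replace(/\/$/, '');
    const assetPrefix = `${subdir}/content/images/`;

    // Only touch paths that Ghost itself serves; leave everything else as-is.
    if (!storedPath.startsWith(assetPrefix)) {
        return storedPath;
    }

    return site.origin + storedPath;
}

console.log(toAbsoluteAssetUrl('/blog/content/images/2019/02/a.png', 'https://example.com/blog/'));
// -> https://example.com/blog/content/images/2019/02/a.png
```

Anything that does not start with the asset prefix is returned untouched, which matches the idea that only URLs Ghost owns should be rewritten.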

Long term, we should either use a separate API endpoint, or treat /content/images as an API endpoint. URLs for assets shouldn't change.

Resource URLs & Relative Paths

Resource URLs also need to appear absolute via the API by default. Right now the url helper in Ghost takes care of changing this back for the one case (Ghost themes).

Paths in content should not be manipulated unless specifically requested. The absolute_urls=true flag should be reinstated to cover just this case.
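One way to picture the opt-in behaviour for content paths is an output step that only rewrites relative paths when the flag is set. This is a minimal sketch, not how Ghost implements it: the function name `serializeHtml` and the regex-based rewriting are assumptions, and only the `absolute_urls` flag name comes from the issue. Asset URLs, which must always be absolute, would be handled separately.

```javascript
// Hypothetical output step: rewrite root-relative href/src values only on request.
function serializeHtml(html, siteUrl, options = {}) {
    // Default: paths in content are left exactly as the author wrote them.
    if (!options.absolute_urls) {
        return html;
    }

    const origin = new URL(siteUrl).origin;

    // Matches href="/..." and src="/...", but not protocol-relative "//host/...".
    return html.replace(/\b(href|src)="(\/(?!\/)[^"]*)"/g, (match, attr, path) => {
        return `${attr}="${origin}${path}"`;
    });
}

const html = '<a href="/about/">About</a>';
console.log(serializeHtml(html, 'https://example.com'));
// -> <a href="/about/">About</a>
console.log(serializeHtml(html, 'https://example.com', {absolute_urls: true}));
// -> <a href="https://example.com/about/">About</a>
```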

Fixes needed to conform to rules above

  • [x] Transform absolute to relative paths in mobiledoc/html https://github.com/TryGhost/Ghost/issues/10472
  • [x] Do an audit of all URLs that are used in Ghost and stored in the db
  • [ ] Refactor absolute -> relative transformation from API layer to model layer
api server / core stale

All 14 comments

We had two bug reports in the last few months because we suddenly stored absolute asset URLs in our database. The cause: incoming absolute asset URLs are transformed into relative URLs on the API layer, and we simply forgot to add that protection in these cases. It happened once for posts and now for settings (https://github.com/TryGhost/Ghost/issues/10590).

should appear absolute anywhere other than in the DB.

The model layer does not protect against storing absolute paths. This needs sorting out asap @gargol

Regarding Audit:

Are there any other resources which have url fields, which could potentially get stored absolute?

Results of an audit for URL related fields we currently use. Fields without any comments do input/output serialization and transform local absolute URLs into relative ones:
posts:

  • feature_image
  • og_image
  • twitter_image
  • mobiledoc (url transformation for input content, NO output transformation)
  • html (url transformation for content in both input & output)
  • canonical_url
  • url (virtual)

users

  • profile_image
  • cover_image
  • website? (no transformation, but could be in case somebody puts a url to local instance e.g.: website: http://ghost.local/author/naz)
  • url (virtual)

tags

  • feature_image
  • url (virtual)

settings

  • cover_image
  • icon
  • logo

subscribers

  • subscribed_url
  • subscribed_referrer (not transformed)
  • unsubscribed_url (doesn't seem to be used in codebase, should we remove it?)

webhooks (no transformations)

  • target_url (should transform?)

Non DB resources:
images

  • url (output transformation from local path)

Would suggest moving transformation handling for non-virtual URL-containing fields (marked in bold) into the model layer to keep the logic more central. The rest of the fields need discussion/feedback. cc @kirrg001
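To make the model-layer suggestion more concrete, here is a minimal sketch of a single normalization step that runs before anything is stored. The field names come from the audit above. The `URL_FIELDS` map, the function names, and the flat-object treatment of every table are assumptions for illustration only, not Ghost's model code.

```javascript
// Field names from the audit; the map itself is illustrative.
const URL_FIELDS = {
    posts: ['feature_image', 'og_image', 'twitter_image', 'canonical_url'],
    users: ['profile_image', 'cover_image'],
    tags: ['feature_image'],
    settings: ['cover_image', 'icon', 'logo']
};

// Turn a local absolute URL into a relative one; leave everything else alone.
function toRelative(value, siteUrl) {
    if (typeof value !== 'string') {
        return value;
    }

    let parsed;
    try {
        parsed = new URL(value);
    } catch (err) {
        return value; // already relative, or not a URL at all
    }

    if (parsed.origin !== new URL(siteUrl).origin) {
        return value; // external URL: stays absolute
    }

    return parsed.pathname + parsed.search + parsed.hash;
}

// Hypothetical hook a model could call before saving.
function normalizeForStorage(table, attrs, siteUrl) {
    const result = {...attrs};
    for (const field of URL_FIELDS[table] || []) {
        if (field in result) {
            result[field] = toRelative(result[field], siteUrl);
        }
    }
    return result;
}

const saved = normalizeForStorage('posts', {
    title: 'Hello',
    feature_image: 'https://example.com/content/images/a.png',
    og_image: 'https://cdn.example.org/og.png'
}, 'https://example.com');

console.log(saved.feature_image); // -> /content/images/a.png
console.log(saved.og_image);      // -> https://cdn.example.org/og.png
```

Keeping this in one place means every write path (API, importer, internal calls) goes through the same protection, which is the gap the bug reports above exposed.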

Let's discuss on Monday 👍

@kirrg001 think you meant some other issue as the above is linking back to here :sweat_smile:

Related: https://github.com/TryGhost/Ghost/issues/10069 and https://github.com/TryGhost/Ghost/issues/10598

From https://github.com/TryGhost/Ghost/issues/10069 we can see there is not enough detail in our URL types:

When we talk about assets, we mostly mean images served from Ghost, e.g. they start with /content/images/. These are served from Ghost and belong to Ghost. We therefore "own" these URLs and should serve them as absolute.

When we look at URLs inside of content (and also code injection!):

There are also theme assets, which may have a relative path, and be referenced in content. Ghost also owns these assets, but it may be too hard to detect them right now. If possible, they should be served absolute.

Additionally, there are external assets, which may have a relative path, and be nothing to do with Ghost. We don't own these URLs, we should not change them.
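The distinction between these URL types could be sketched as a small classifier. This is only an illustration under stated assumptions: the function name and category labels are made up, and treating `/assets/` as the theme asset prefix is an assumption as well. As the comment above notes, same-origin paths that Ghost does not own cannot be reliably detected, so the sketch reports them as unknown rather than guessing.

```javascript
// Hypothetical classifier; the category names are not Ghost terminology.
function classifyUrl(url, siteUrl) {
    const site = new URL(siteUrl);
    const resolved = new URL(url, site); // relative input is resolved against the site
    const subdir = site.pathname.replace(/\/$/, '');

    if (resolved.origin !== site.origin) {
        return 'external'; // not ours: never change it
    }
    if (resolved.pathname.startsWith(`${subdir}/content/images/`)) {
        return 'ghost-asset'; // uploaded to and served by Ghost
    }
    if (resolved.pathname.startsWith(`${subdir}/assets/`)) {
        return 'theme-asset'; // assumption: theme assets live under /assets/
    }
    // Could be a resource URL, or an unrelated file that merely shares the domain.
    return 'unknown';
}

console.log(classifyUrl('/content/images/a.png', 'https://example.com')); // -> ghost-asset
console.log(classifyUrl('https://other.example.org/x.png', 'https://example.com')); // -> external
console.log(classifyUrl('/downloads/file.pdf', 'https://example.com')); // -> unknown
```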

This rule guide needs to take into account what happens:

  • In storage

    • TLDR here is only external URLs should be absolute, everything else MUST be stored relative.

  • In the API

    • Here the behaviour is different for the type of URL

  • In clients (e.g. SDK, or theme helpers)

    • Here the behaviour is different for the type of URL, and depending on whether the client is external (SDK) or internal (theme helpers).


I know this is an active concern, but reinforcing that I am hitting my head against the lack of clarity in these rules again today.

It looks to me like the absolute_urls parameter still exists in the v2 API, but I don't know what it does anymore, or whether it's useful or should have been removed.

Once we have the rules set, then I can at least document the rules as they are meant to be, and we can look for places where we are not following it.

Summary from today:

  • the editor suffered from a bug, because we transform internal image URLs in mobiledoc from absolute to relative for storage
  • but the server serves relative internal image paths
  • editor marks the changes as "dirty"
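The mismatch described in this summary can be reproduced in a few lines. This is a simplified sketch: the mobiledoc object is cut down to a single image card, and both `toStoredMobiledoc` and `isDirty` are hypothetical stand-ins for the storage transformation and the editor's change detection.

```javascript
// Simplified mobiledoc: real documents carry more keys (version, sections, ...).
const editorDoc = {
    cards: [['image', {src: 'https://example.com/content/images/a.png'}]]
};

// Stand-in for the storage transformation: internal image URLs are
// rewritten from absolute to relative before the document is saved.
function toStoredMobiledoc(doc, siteUrl) {
    const origin = new URL(siteUrl).origin;
    const json = JSON.stringify(doc)
        .split(`${origin}/content/images/`)
        .join('/content/images/');
    return JSON.parse(json);
}

// A naive dirty check: compare the serialized documents.
function isDirty(localDoc, serverDoc) {
    return JSON.stringify(localDoc) !== JSON.stringify(serverDoc);
}

// The server hands back what it stored, without an output transformation.
const serverDoc = toStoredMobiledoc(editorDoc, 'https://example.com');

console.log(serverDoc.cards[0][1].src);    // -> /content/images/a.png
console.log(isDirty(editorDoc, serverDoc)); // -> true
```

Because input is transformed but output is not, the document the editor gets back never equals the one it sent, even though the user changed nothing.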

We tried fixing:

  • returning mobiledoc with absolute internal image paths (https://github.com/TryGhost/Ghost/pull/10631) -> did not work out, too weak & too complicated
  • storing absolute urls for mobiledoc again, but html is a generated field and ends up with absolute urls too (https://github.com/TryGhost/Ghost/pull/10632) -> takes longer than we thought

Both did not work 😈

We need a whole new plan!

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Not stale. In progress by @kevinansfield :wink:

Sorry for random rambling, but I don't understand why Ghost must serve assets as absolute urls?

Themes serve their internal assets (js/css/images) relative, and it works just fine.

I can see many issues referred in this issue being caused by transforming between relative and absolute urls.

What would be the issue, if everything served by Ghost would be relative, and there were zero transformations between layers?

Thanks for your good work, eagerly waiting 3️⃣.0️⃣ to arrive 🚚

What would be the issue, if everything served by Ghost would be relative, and there were zero transformations between layers?

@sampumon Relative URLs won't work if the content is being served under a completely different domain. To give one example of why: Ghost is a headless CMS. It should support the very basic case of static site generators being pure consumers of data, served from whatever domain the static site is running on :smiley:
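To see why, it helps to look at how a browser resolves a relative path: always against the page it is currently on. The sketch below uses the standard `URL` constructor; the domain names and the helper name `resolveInBrowser` are made up for illustration.

```javascript
// Resolve a path the way a browser would, relative to the current page.
function resolveInBrowser(pathFromApi, pageUrl) {
    return new URL(pathFromApi, pageUrl).href;
}

const imagePath = '/content/images/2019/02/photo.png';

// Rendered by a Ghost theme on the Ghost domain: the image is found.
console.log(resolveInBrowser(imagePath, 'https://cms.example.com/my-post/'));
// -> https://cms.example.com/content/images/2019/02/photo.png

// Rendered by a static site on another domain: the request goes to the wrong host.
console.log(resolveInBrowser(imagePath, 'https://www.example.org/blog/my-post/'));
// -> https://www.example.org/content/images/2019/02/photo.png
```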

@kevinansfield do you think we need to keep this issue around or want to reuse it for future work when optimizing how the relative->absolute->relative transformations work?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
