Wp-rocket: When a slug is shared amongst a page and posts, all the posts might be cleared during partial purge

Created on 1 Jun 2020 · 7 Comments · Source: wp-media/wp-rocket

Describe the bug
When a page shares a slug with posts, any purge of that page also clears the cache of all those posts.

To Reproduce
Steps to reproduce the behavior:

  1. Create a page named Blog, with slug /blog
  2. Modify the permalink structure to display posts under /blog/%postname%, like this: https://jmp.sh/BcDMGIE
  3. Add some posts and preload the cache.
  4. The posts should be cached inside the /blog folder, in paths like /blog/postname/index-https.html, and the Blog page should be cached at _/blog/index-https.html_
  5. Edit the Blog page, or a sibling page.
  6. The whole /blog folder will be deleted, including all the posts that are sharing the parent folder.
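
The folder nesting described in the steps above can be sketched as follows. This is only an illustration: `sketch_cache_file_for()` is a hypothetical helper, `example.com` is a placeholder host, and the `<host>/<path>/index-https.html` layout (with `index.html` for plain HTTP) is an assumption based on the paths quoted in the steps.

```php
<?php
// Hypothetical helper: maps a URL to a cache file path relative to the cache root.
// The "<host>/<path>/index-https.html" layout is an assumption, not WP Rocket's API.
function sketch_cache_file_for( string $url ): string {
	$host   = (string) parse_url( $url, PHP_URL_HOST );
	$path   = trim( (string) parse_url( $url, PHP_URL_PATH ), '/' );
	$scheme = (string) parse_url( $url, PHP_URL_SCHEME );
	$file   = 'https' === $scheme ? 'index-https.html' : 'index.html';

	return rtrim( $host . '/' . $path, '/' ) . '/' . $file;
}

$page_file = sketch_cache_file_for( 'https://example.com/blog/' );
$post_file = sketch_cache_file_for( 'https://example.com/blog/postname/' );

// The page's cache directory is a parent of the post's cache directory,
// so anything that removes the page's directory also removes the post.
$page_dir      = dirname( $page_file ) . '/';
$post_is_below = 0 === strpos( $post_file, $page_dir );
```

With this layout, the post's cache file lives below the directory that holds the page's cache file, which is the collision the steps reproduce.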

Expected behavior
Posts cached under a folder that shares a page's slug should not be deleted when the page is edited.

Screenshots
Page:
Screen Shot 2020-06-01 at 5 13 49 p  m

Permalinks settings, including the same slug of one of the pages:
Screen Shot 2020-06-01 at 3 56 15 p  m

Resulting cache folder Structure:
Screen Shot 2020-06-01 at 3 05 09 p  m

Additional context
This purge can be triggered by editing the page itself, or by editing a sibling page, since the page is then included in the array by get_adjacent_post.

It seems pretty similar to this issue: https://github.com/wp-media/wp-rocket/issues/2549
The additional context there looks pretty close to this one.

Backlog Grooming (for WP Media dev team use only)

  • [x] Reproduce the problem
  • [x] Identify the root cause
  • [ ] Scope a solution
  • [ ] Estimate the effort
epics 🔥 cache grooming medium moderate bug

All 7 comments

Related ticket: https://secure.helpscout.net/conversation/1181295666/169480/

In this case, these are pages created with Elementor. However, the customer has added categories using the Add Category to Pages plugin.

Log: http://snippi.com/s/2r9twaz

✅ Reproduced this. Happens in the common cache and in the logged-in-user-specific cache. When the cache is preloaded for both common and logged-in, both cache sub-folders are deleted.

✅ Root cause is that we are recursively cleaning from rocket_clean_files() called by rocket_clean_post(). With this custom permalink structure there is a permalink collision where the post permalinks share their permalink root with the page permalink slug (and any sub-page permalinks).
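
To make the root cause concrete, here is a minimal sketch that contrasts a recursive clean with a shallow one on a throwaway directory tree. All function names are hypothetical and this is not the code of rocket_clean_files(); it only assumes the /blog and /blog/postname layout from the bug report.

```php
<?php
// Hypothetical helpers for illustration only; not WP Rocket's API.

// Recursive clean: removes the directory and everything below it.
function sketch_clean_recursive( string $dir ): void {
	foreach ( array_diff( scandir( $dir ), array( '.', '..' ) ) as $name ) {
		$path = $dir . '/' . $name;
		if ( is_dir( $path ) ) {
			sketch_clean_recursive( $path );
		} else {
			unlink( $path );
		}
	}
	rmdir( $dir );
}

// Shallow clean: removes only the files directly inside the directory.
function sketch_clean_shallow( string $dir ): void {
	foreach ( scandir( $dir ) as $name ) {
		$path = $dir . '/' . $name;
		if ( is_file( $path ) ) {
			unlink( $path );
		}
	}
}

// Builds a throwaway cache tree: /blog (page) and /blog/postname (post).
function sketch_build_tree(): string {
	$root = sys_get_temp_dir() . '/cache-sketch-' . uniqid();
	mkdir( $root . '/blog/postname', 0777, true );
	file_put_contents( $root . '/blog/index-https.html', 'page' );
	file_put_contents( $root . '/blog/postname/index-https.html', 'post' );
	return $root;
}

// Recursive clean of the page folder takes the post with it.
$root_a = sketch_build_tree();
sketch_clean_recursive( $root_a . '/blog' );
$post_survives_recursive = file_exists( $root_a . '/blog/postname/index-https.html' );

// Shallow clean removes the page file and leaves the post in place.
$root_b = sketch_build_tree();
sketch_clean_shallow( $root_b . '/blog' );
$post_survives_shallow = file_exists( $root_b . '/blog/postname/index-https.html' );
$page_survives_shallow = file_exists( $root_b . '/blog/index-https.html' );

// Remove the throwaway trees.
sketch_clean_recursive( $root_a );
sketch_clean_recursive( $root_b );
```

A shallow clean is not a complete fix on its own, because sub-pages of the edited page may legitimately need purging too; it only shows why the recursive approach pulls in posts.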

Furthermore, if the page happens to be the page_for_posts we probably do want to clean recursively, whereas if the page is unrelated to posts except for the permalink collision, we want to treat it differently.
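
The distinction above could be expressed as a small decision helper. This is a sketch under assumptions: `sketch_should_clean_recursively()` is a hypothetical name, and in WordPress the second argument would come from `(int) get_option( 'page_for_posts' )`, which is 0 when no posts page is configured.

```php
<?php
// Hypothetical decision helper; not part of WP Rocket.
// $page_for_posts_id is expected to be (int) get_option( 'page_for_posts' ).
function sketch_should_clean_recursively( int $page_id, int $page_for_posts_id ): bool {
	// 0 means no posts page is configured, so no page qualifies.
	return $page_for_posts_id > 0 && $page_id === $page_for_posts_id;
}

// Sample IDs, chosen only for illustration.
$is_posts_page   = sketch_should_clean_recursively( 10, 10 ); // The edited page is the posts page.
$is_regular_page = sketch_should_clean_recursively( 10, 25 ); // A different page is the posts page.
$no_posts_page   = sketch_should_clean_recursively( 10, 0 );  // No posts page is set.
```

Keeping the option value as a parameter keeps the rule testable outside WordPress; a real implementation would likely need more rules than this single check.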

As @Tabrisrp and I have discussed, we likely need a bigger change in cleaning strategy, one that can distinguish between pages and posts and will not pull in posts in these instances.

I concur that we do need a better cleaning strategy where we apply a set of rules (heuristics) to determine all the different combinations for cleaning.

We could create a Cleaning Module or add a Cleaner to the Filesystem Module.

This particular issue is likely an epic.

I absolutely agree. There are a number of cleaning scenarios that are needed to minimize host resource usage and CDN activity when editing a WordPress website. There should also be an easy way to disable WP-Rocket cache syncing to the CDN while editing a WordPress website.

--- CLEAR CACHE

  1. Clear Cache (all pages and posts)
  2. Clear this page/post (only related pages/posts) *Could include subdirectories if path is modified.
  3. Clear this page/post (only this page/post)

--- PRELOAD

  1. Preload (all pages and posts)
  2. Preload (only pages and posts that have not been preloaded) *Could include subdirectories if path is modified.
  3. Preload (only this page/post)

--- CDN SYNC

  1. Ability to disable WP-Rocket CDN syncing while editing a website and then have the CDN syncing updated when sitewide editing is done.

The flaw in the design is broader than what has been documented here. In some cases, clearing a sibling cached page will delete all of the cached directories in the parent's directory, even if they do not share a slug; in other cases, clearing a single page can delete all of the cached pages for your entire website, even though they do not share a slug. So far I have not discovered any logic behind this. Edit: If you clear a page, every cached page it's linked to may be deleted.

This isn't a simple flaw. This is a highly destructive bug that can delete the cache for your entire website. What's worse is that the cache files will not be automatically rebuilt, which will have a dramatic effect on the SEO for the entire website.

I'm guessing a lot of people don't monitor the cache folder and are unaware that clearing the cache for a single page may affect the cache for their entire website almost at random.

The bottom line is that cached pages which do not share a "slug" may be deleted, up to and including the cache for the entire website.

To temporarily work around this issue, we set up a cron job that runs run_rocket_sitemap_preload() every 30 minutes. This ensures that any pages unexpectedly deleted after a theme/plugin/page/post update are re-cached in a timely manner.

The WP Rocket team could easily solve this with an interim fix that does the same every time a partial cache clear occurs, instead of leaving customers to rely on a scheduled cron.

Cron (every 30 minutes):
*/30 * * * * wget -q -O - https://yourwebsite.com/preload.php

preload.php code:

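```php
<?php
// Load WordPress (assumes preload.php sits in the WordPress root, next to wp-load.php).
require __DIR__ . '/wp-load.php';

// Preload the cache if WP Rocket's preload function is available.
if ( function_exists( 'run_rocket_sitemap_preload' ) ) {
	run_rocket_sitemap_preload();
}
```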
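
The interim fix suggested above, preloading after every partial clear instead of on a schedule, might look roughly like this. It is a sketch, not a tested fix: it assumes WP Rocket fires an `after_rocket_clean_post` action once a post's cache has been purged, which should be verified against the installed version, and `sketch_preload_after_partial_purge()` is a hypothetical name.

```php
<?php
// Sketch only. Assumes WP Rocket fires an 'after_rocket_clean_post' action after a
// partial purge; verify the hook name against the installed version.
function sketch_preload_after_partial_purge(): bool {
	if ( ! function_exists( 'run_rocket_sitemap_preload' ) ) {
		return false; // WP Rocket's preload function is not available.
	}

	run_rocket_sitemap_preload();
	return true;
}

// Register the callback only when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action( 'after_rocket_clean_post', 'sketch_preload_after_partial_purge' );
}
```

Running a full sitemap preload on every edit may be expensive on large sites, so the scheduled cron above may still be the gentler option.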
