Hi folks,
I disovered a potential bug regarding the layout cache which leads to a blank page. If this happens, customers may see a white page without any hmtl inside the body tags. It might be related to #6942, although there is no php-fpm or libxml in the game here.
bin/magento cache:clean
curl -gI "http://magento2.local/page_cache/block/render/?ajax=1&blocks=[%22header.login%22]&handles=[%22default%22%2C%22cms_index_index%22%2C%22cms_page_view%22%2C%22cms_index_index_id_home%22]&originalRequest=a&version="
curl -s http://magento2.local | grep "<body" -A 3 -B 3
<body data-container="body" data-mage-init='{"loaderAjax": {}, "loader": { "icon": "http://magento2.local/pub/static/version1487104262/frontend/Magento/luma/de_DE/images/loader-2.gif"}}' class="cms-index-index page-layout-1column">
</body>
This ajax call seems to destroy the layout cache. If it is not present, structure which gets loaded and unserialized in lib/internal/Magento/Framework/View/Layout.php:generateElements from cache does not contain any root element anymore:
$ awk -F "root" '{print NF-1}' var/cache/mage--0/mage---792_STRUCTURE_LAYOUT_FRONTEND_STORE1_2C7461AF04BB11FE209659D7147D2C579
0
0
Cleaning the cache again and loading the front page normally generates a valid layout cache, where root is present:
$ awk -F "root" '{print NF-1}' var/cache/mage--0/mage---792_STRUCTURE_LAYOUT_FRONTEND_STORE1_2C7461AF04BB11FE209659D7147D2C579
0
98
I think it has something do to with the used handles [default, cms_index_index, cms_page_view, cms_index_index_id_home]
which are
used for generating the cache key in getCacheId
. If there is no cache present, the page cache block just writes its content into the layout cache although there is no root layout used.
I need to dive deeper into this issue. It normally should not happen, because the DOM should get loaded first. And subsequent ajax calls should always have a layout cache present. But if cache gets cleared manually and there is already an ajax call on its way it can lead to a broken layout cache.
Ok, I did some debugging. It seems to only affect CMS pages and is related to the way how Magento_PageCache
module works. This module renders a script into the DOM to fetch some blocks asynchronously.
<script type="text/x-magento-init">
{
"body": {
"pageCache": {"url":"http:\/\/magento2.local\/index.php\/page_cache\/block\/render\/","handles":["default","cms_index_index","cms_page_view","cms_index_index_id_home"],"originalRequest":{"route":"cms","controller":"index","action":"index","uri":"\/index.php\/"},"versionCookieName":"private_content_version"} }
}
</script>
This is done in PageCache/view/frontend/web/js/page-cache.js
. This call hits PageCache/Controller/Block/Render.php
and PageCache/Controller/Block.php:_getBlocks
triggers the loading of the layout:
$this->_view->loadLayout($handles, true, true, false);
However, because of the way how lib/internal/Magento/Framework/View/Model/Layout/Merge.php:_fetchRecursiveUpdates
works, the pageLayout will be set to admin-1column
defined in app/code/Magento/Theme/view/adminhtml/page_layout/admin-1column.xml
. Unfortunately, in this case lib/internal/Magento/Framework/View/Page/Layout/Reader.php:read
just gets an empty layout <layout/>
for $xml
.
public function read(Layout\Reader\Context $readerContext, $pageLayout)
{
$this->getPageLayoutMerge()->load($pageLayout);
$xml = $this->getPageLayoutMerge()->asSimplexml();
$this->reader->interpret($readerContext, $xml);
}
I think the layout is not loaded because our area in this case is frontend
not adminhtml
. So this layout won't be loaded in this case. And therefore there are some critical elements missing inside scheduledStructure
afterwards, like e.g. the root
container.
Now, the bad thing happens. This structure gets stored into the cache, if there is none. This is done in lib/internal/Magento/Framework/View/Layout.php:generateElements
.
$cacheId = 'structure_' . $this->getUpdate()->getCacheId();
$result = $this->cache->load($cacheId);
if ($result) {
$this->readerContext = unserialize($result);
} else {
\Magento\Framework\Profiler::start('build_structure');
$this->readerPool->interpret($this->getReaderContext(), $this->getNode());
\Magento\Framework\Profiler::stop('build_structure');
$this->cache->save(serialize($this->getReaderContext()), $cacheId, $this->getUpdate()->getHandles());
}
The $cacheId
will be the same as for a normal request because it is based on used handles. But remember, we have missing elements in the structure like root
. Next time, somebody visits the page, he will get a blank page without body, because this broken layout structure has been loaded from the cache.
For normal requests on CMS pages, this is not happening, because the controller sets the pageLayout
explicitly in app/code/Magento/Cms/Helper/Page.php:setLayoutType
before the layout is loaded. This is usually 1column
defined in app/code/Magento/Theme/view/frontend/page_layout/1column.xml
and can then easily be loaded by lib/internal/Magento/Framework/View/Page/Layout/Reader.php
.
I would really appreciate any advice on how to fix this.
@bka Thanks for sharing your findings. I'm curious whether you spent time to find a solution to this issue?
I'm experiencing a very similar issue with an empty homepage (where the body tag is empty). Until I read your issue description, I wasn't able to consistently reproduce the issue, but I'm now able to reproduce it if I follow these steps:
bin/magento cache:flush
curl -s http://example.dev/page_cache/block/esi/blocks/%5B%22catalog.topnav%22%5D/handles/%5B%22default%22,%22cms_index_index%22,%22cms_page_view%22,%22cms_index_index_id_home%22%5D/
curl -s http://example.dev
If I reverse steps 2 and 3, the homepage content will contain content like expected, and the curl request for the menu/navigation content will also contain content.
On a side note, I initially suspected Varnish was at fault with this issue as I know that an improper Varnish configuration can cause the catalog navigation to go missing. I've been experiencing issues with both a blank homepage and then sometimes the navigation is missing (but the rest of the homepage content is loaded). I was able to figure out the http://example.dev/page_cache/block/esi/blocks…
url used in step 2 by using the Gist that was shared in this comment. After finding this issue, I tried requesting that ESI url in the manner you described and that's how I was able to consistently reproduce the issue.
I was curious to see what combination of handles were successful vs not, so I ran varnishlog
while loading a category page to get an example query that did work. Here is what I found:
Broken requests:
curl -s http://example.dev/page_cache/block/esi/blocks/%5B%22catalog.topnav%22%5D/handles/%5B%22default%22,%22cms_index_index%22,%22cms_page_view%22,%22cms_index_index_id_home%22%5D/
curl -s http://example.dev/page_cache/block/esi/blocks/%5B%22catalog.topnav%22%5D/handles/%5B%22default%22%5D/
curl -s http://example.dev/page_cache/block/esi/blocks/%5B%22catalog.topnav%22%5D/handles/%5B%22cms_index_index%22,%22cms_page_view%22,%22cms_index_index_id_home%22%5D/
Working requests:
curl -s http://example.dev/page_cache/block/esi/blocks/%5B%22catalog.topnav%22%5D/handles/%5B%22default%22,%22catalog_product_view%22,%22catalog_product_view_id_38640%22,%22catalog_product_view_sku_XX-XXX-XXX%22,%22catalog_product_view_type_simple%22%5D/
curl -s http://example.dev/page_cache/block/esi/blocks/%5B%22catalog.topnav%22%5D/handles/%5B%22default%22,%22catalog_product_view%22%5D/
Hi @erikhansen,
thanks for your feedback. I have some additional insights. This issue relates to usage of $_isScopePrivate
. Devdocs say it wont work properly. This also explains why this issue did not appear on standard luma. Indeed, we had some blocks with isScopePrivate.
Having a block defined with isScopePrivate will trigger async loading of blocks also known as Varnish Hole Punching. Sad thing is, if this Block is present on the homepage, indicated by handles default, cms_index_index, cms_page_view, cms_index_index_id_home
it corrupts the layout cache if it is empty.
Related source files are:
I fixed this by patching Module_PageCache/Controller/Block.php
to include an additional pseudo handle which changes the generated cache id like this:
/**
* Get blocks from layout by handles
*
* @return array [\Element\BlockInterface]
*/
protected function _getBlocks()
{
$blocks = $this->getRequest()->getParam('blocks', '');
$handles = $this->getRequest()->getParam('handles', '');
if (!$handles || !$blocks) {
return [];
}
$blocks = json_decode($blocks);
$handles = json_decode($handles);
// fix for layout corruption bug #8554
if (is_array($handles)) {
$handles[] = 'fpc_block';
}
$this->_view->loadLayout($handles, true, true, false);
$data = [];
$layout = $this->_view->getLayout();
foreach ($blocks as $blockName) {
$blockInstance = $layout->getBlock($blockName);
if (is_object($blockInstance)) {
$data[$blockName] = $blockInstance;
}
}
return $data;
}
@bka - Thanks for your feedback. While your workaround might fix your specific manifestation of this issue, it doesn't fix mine. And I don't think my manifestation of the issue has anything to do with $_isScopePrivate
. I'll post here if I find the solution to my problem.
@Magento - I was able to easily reproduce my issue on a vanilla Magento 2 install using the steps I outlined in my comment above.
@bka - I have a solution to the issue. You can reference the fix in this commit: https://github.com/magento/magento2/commit/99bb170a805f3e9d0ab5c9b9bb801f739ac161c1
Try applying this patch to your environment and see if the issue goes away:
diff --git a/vendor/magento/module-ui/view/base/layout/default.xml b/vendor/magento/module-ui/view/base/layout/default.xml
index 7f2efbd..64d5f14 100644
--- a/vendor/magento/module-ui/view/base/layout/default.xml
+++ b/vendor/magento/module-ui/view/base/layout/default.xml
@@ -5,7 +5,7 @@
* See COPYING.txt for license details.
*/
-->
-<page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" layout="admin-1column" xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">
+<page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">
<body>
<referenceContainer name="after.body.start">
<block class="Magento\Ui\Block\Logger" name="logger" template="Magento_Ui::logger.phtml"/>
If the issue does go away, I'd recommend closing this issue.
@bka Have you had a chance to test my suggested patch yet?
@bka Any update on this?
The root cause of this issue is that the "pageLayout" parameter for the layout builder is missing. However before starting blocks and XML generation Magento adds additional blocks to the list based on selected pageLayout: 1column, 2columns or empty. By default for the most of the pages the layout is: 1column. As the result: almost all default pages are working as expected. But once you customize your theme a little bit, using "move" actions, the full page cache becomes broken. The solution is pretty complex: I have overridden Magento\Framework\View\Page\Builder, Magento\Framework\View\Layout, Magento\PageCache\Observer\ProcessLayoutRenderElement and Magento\PageCache\Controller\Block\Esi, by adding a new parameter to the URL and to the controller, named: "page_layout" and then I set it into the view in the Esi controller. Works for me.
@andreymoskvenkov Can you provide a patch so we can verify your solution?
Sorry, but I can provide only a tip, which I have already done. Tips are free :)
@andreymoskvenkov C'mon man, this is opensource. Take a little, give a little ;)
All kidding aside, I appreciate your tip, but I'm sure your patch would be appreciated even more and it could be validated. If you can't share it for whatever reason, I understand.
@cherreman Did you happen to try the patch I included in my comment on February 28th?
Yeah, there are many reasons of not sharing the code, one of them is - NDA of the company, where I am working for. We have fixed a lot of core bugs of M2 internally and this is one of the most difficult to find. And why have I posted a tip here is - because this issue number #8554 has also given me a tip about layout type such as "1column", "empty", etc. I was debugging it almost whole day and was starting to become a crazy, because I could not understand: why in the Layout object's "_blocks" property the list of generated blocks are only a few of them in page_cache controller, however in CMS_index controller the list of blocks in "_blocks" property is big: about 79 blocks there. And at the same time the final XML in both actions is almost the same! The cause of that was: because Magento adds some blocks to the "_blocks" property manually in a separate method depending on what "pageLayout" is set. As the result: in page_cache action no page layout was passed and no page layout was set.
I don't need a validation of my fix :) It is on Production already and it works for us 100% :)
https://github.com/magento/magento2/pull/9560 seems to fix this issue, can you try that fix and report back? I'd like to merge that PR but I will need some help from you :smile:
:-D
Sorry, but no, this PR will not fix the issue. In the fix need to Pass "page_layout" code into an API request together with the list of applied handlers on a page. Your fix just adds an additional handler, which is used only in the FullPageCache's controller.
@adrian-martinez-interactiv4 can you chime in here?
@andreymoskvenkov If I have understood well, you are proposing to include the additional handle in the ajax request, instead of putting it within the block page cache controller. I don't know if I'm missing something, could you clarify a bit your previous explanation pls? Thanks in advance.
No, I propose to add an additional parameter to the PageCache's URL: 'page_layout', where to put what Layout type was used on the current page and then to add a processing of this parameter in the ESI controller, which should just take the value from that parameter from the request and to set it into Layout object. Then everything will work correctly. And this is not easy btw, because of crazy Magento 2 architecture (from what I see the core developers instead of thinking more deeply, they just split responsibilities between classes randomly and apply "protected", "public" and "private" keywords also randomly :) ). It is not a "handler", it is an additional characteristics (or parameter, or property) of Layout entity.
Well, I'm been doing some research about this issue. I'll be explaining each point later, but long story short:
About point 1:
As said before, I'm able to reproduce it in CE 2.1.6, this one is ok:
And this one shows the empty body:
Same steps reproduced at 2.2.0-dev, first shows non-empty body, as expected:
But the same steps that resulted in an empty body in CE 2.1.6, return non-empty body in 2.2.0-dev:
Without going too deep in what commits are involved in solving this issue, what seems clear is that this issue, as long as #9050 , that I consider a duplicate of this one, are already solved.
About point 2:
Unfortunately, #9560 is not solved yet. As explained in comment https://github.com/magento/magento2/pull/9560#issuecomment-300349007, page structure is cached based on layout handles, and it's not correctly generated due to cms page not being loaded to include its layout updates within the cached page structure. This leads us to point 3.
About point 3:
At this point, I see two ways to solve the issue, as described above. I chose the additional / unique layout handle solution, as the easiest and least intrusive option. In the other hand, I think it would be a good idea to have a custom property in the layout that may be used to generate the cache id if defined, along with layout handles. This would make it possible to generate unique ids for layout cache if specified from anywhere. I'm giving a try to this approach, but I see no reason to include a new param in the request, as long as the controller should be responsible to set the property in the layout to ensure a different, unique layout cache id is generated for that request. Anyway, I'm giving a try on that too.
As this issue seems to be solved, I'll continue giving feedback about #9560 in the thread of that PR.
:-D
Ok, continue adding new handlers for FPC, up to you!
I also don't see no reason in adding a new param to the request, until it was a core Magento approach how they build the final layout XML: they phisically set this layout type into Layout object and then even Create some blocks from PHP code, if a layout type was set. That's why any other fixes, such as making CacheID unique, etc, do Not fix he issue Completely for ALL cases.
I have successfully fixed it for our client in March and we continue shareing this fix internally from project to project.
But you may continue having fun in fixing other bugs many times instead of fixing the root cause once.
:-D
@andreymoskvenkov can you clarify what "we have an internal patch" means? If you have a patch for this or other issues, would you kind sharing them for others to benefit? Thanks!
It meas, that I have fixed this issue and wrapped it up as a separate module. However as I said earlier, I won't share it. Magento Community has helped me to find the root cause of the issue and that's why I was able to fix it. That's why I want to thank the community for this and share the root cause of it. However I will not share the fix, because if after my notes you cannot fix the issue like I fixed it, then you are useless as Magento specialists and will not be able to provide high quality assistance to your clients on Magento 2 platform. So the goal of my posts here is: to introduce an exercise, which is based on one of real problems, which faces a Magento developer almost every day. I want to make the community stronger. Just providing a final fix will not improve the level of community developers.
That is not how any of this works.
If you really think you are setting the bar higher let me tell you that you are doing the complete opposite, leaving a bug open because you think people should be "smart" like you to fix it. This is how you make a community weaker because you say that if anyone is not as good as you it should not participate at all and that is completely false. But hey, this is OSS and we deal with this kind of cases every day so chances are someone will probably come and fix it. In the meantime @adrian-martinez-interactiv4 keep up the good work!
if one could see and read the fix, one can learn from it, trying to understand what the problem was and how the solution was build. For me thats a giant part of OSS. If you have a patch and don't want to share it, okay, go your way, but please don't brag (it's nothing more to me) about it.
(And yes, it is about you, since you were implying, that you are good and if one can't do one thing you can, he or she is useless)
A proprietary patch? When will this land in Magento EE? :trollface:
The thread was cleaned from not-related to issue comments. Please refrain from negative messaging and be respectful to each other.
It seems this issue is fixed in develop. Cannot reproduce it anymore, at least the manifestation described above. It seems the change of layout in module-ui/view/base/layout/default.xml pointed out by @erikhansen did the trick. I dont know about other manifestiations but this one seems to be resolved.
Just to add some more information, I quickly looked in the git history and like @bka said, @erikhansen's proposed solution was already executed on the develop
branch in scope of MAGETWO-55341, see: https://github.com/magento/magento2/commit/f42472a3753c8b87969a8e2dde6645525bf797d8
(Not sure if this is the correct and definitive solution though, based on the above comments...)
Internal ticket to track issue progress: MAGETWO-69018
@bka, thank you for your report.
The issue is already fixed in develop branch, 2.2.0
Internal ticket to track issue progress: MAGETWO-80647
Most helpful comment
That is not how any of this works.
If you really think you are setting the bar higher let me tell you that you are doing the complete opposite, leaving a bug open because you think people should be "smart" like you to fix it. This is how you make a community weaker because you say that if anyone is not as good as you it should not participate at all and that is completely false. But hey, this is OSS and we deal with this kind of cases every day so chances are someone will probably come and fix it. In the meantime @adrian-martinez-interactiv4 keep up the good work!