Magento2: Page Builder auto-completing malformatted HTML tags for HTML-ready Product fields (Description, Short Description, etc), causing Admins to miss issues they need to fix / resulting in broken PDPs for customers

Created on 27 Nov 2019 · 9 Comments · Source: magento/magento2

Preconditions (*)

  1. Magento Commerce 2.3.3
  2. Firefox / Chrome (latest versions as of 2019-11-26)
  3. Have products with malformatted HTML tags in product fields that allow HTML (short description, description). For us, our short descriptions for some products had malformatted HTML tags from a test migration in our M2 staging.
    _Note: Our product data with malformatted HTML tags was migrated on Magento 2.3.2, but we've recently upgraded our staging environment to 2.3.3. So this is happening on 2.3.3, but the data was migrated on 2.3.2_

Steps to reproduce (*)

  1. As per Precondition 3, you simply need malformed HTML tags in the short description, or in any description field that allows HTML. In our case, our short description had the following:
    Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc
    *Notice the closing p tag with no matching start tag, and the p, span, and strong tags that are never closed.
    Edit the short description value in your catalog_product_entity_text table so that it contains the malformed HTML above.
  1. Edit your product in Magento Admin, without "Show Editor" selected, so you should see the raw malformatted HTML from the database for the short description.
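
To confirm that a stored value really is malformed without relying on the Admin editor, a strict tag-balance check can be run against the raw string. Below is a minimal Python sketch using only the standard library. `TagBalanceChecker` and `find_tag_problems` are hypothetical names for illustration, not part of Magento, and the check is deliberately strict: it also flags end tags that HTML allows you to omit.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Collects stray closing tags and tags that are never closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open tag names
        self.problems = []    # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            self.problems.append(f"closing </{tag}> without an opening tag")
            return
        # Anything opened after the matching tag was left unclosed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.problems.append(f"<{top}> is never closed")


def find_tag_problems(html_fragment):
    checker = TagBalanceChecker()
    checker.feed(html_fragment)
    checker.close()
    # Whatever is still on the stack at the end was never closed.
    for tag in reversed(checker.open_tags):
        checker.problems.append(f"<{tag}> is never closed")
    return checker.problems


# The short description from the steps above.
short_description = (
    "Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc"
)
for problem in find_tag_problems(short_description):
    print(problem)
```

For the sample value this reports one stray closing p tag and three unclosed tags (strong, span, p), which matches the note in step 1.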

Expected result (*)

  1. I expect Page Builder to show my short description without having my malformatted HTML tags corrected for me, so I know to fix them. I should see:
    Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc
    In Page Builder, the raw HTML editor would show that same uncorrected value, which would let me know to fix my malformed HTML tags.

Actual result (*)

  1. The raw HTML shown by Page Builder for the Short Description shows with all HTML tags properly opened / closed, where applicable, even if they're malformatted in the database.
    In other words, I see:
    <p>Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc</strong></span></p>
    But this is WRONG, as the actual value in the database does not have the corrected HTML tags. An Admin user never knows there's an issue.
  1. This breaks the storefront, because what is eventually rendered to the end user is still the malformed HTML from the database; Page Builder only shows what the corrected HTML would be. Admin users therefore never learn that their migrated data has malformed HTML tags, and will have a hard time debugging the problem unless they know how to query the raw M2 database, since the Admin edit page for the product does not show the malformed HTML that causes the display issue on the storefront.
    End users see the Short Description far out of place in the layout. The malformed HTML closes a tag too early, so the layout is built incorrectly and the details box ends up in the wrong place.

(The original issue included a screenshot of a product with NO malformed HTML, showing where the details box SHOULD normally be located.)

To fix this bug we would ask that you please do not auto-correct malformed HTML that is read out of the database when populating Page Builder HTML editors, or, if better for everyone, add an option to disable auto-correcting HTML tags in Page Builder HTML editors, so Admin users can see there's malformatted HTML tags and fix them.

Labels: Format is valid, non-issue

All 9 comments

Hi @not-art. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • [ ] Summary of the issue
  • [ ] Information on your environment
  • [ ] Steps to reproduce
  • [ ] Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.3-develop instance - upcoming 2.3.x release

For more details, please, review the Magento Contributor Assistant documentation.

@not-art do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?

  • [ ] yes
  • [ ] no

An alternative solution would be having Page Builder / Magento still auto-open / auto-close malformed HTML tags, but if they do auto-correct, the Edit page for the product MUST alert the Admin User editing the Product that they should make a non-consequential change to the HTML-corrected field & save the product.

I say make a non-consequential change because saving the product as-is, even with the corrected HTML tags from Page Builder's auto-complete, won't fix the issue. The "pre-save" short description (with fixed HTML tags) is the same as the "post-save" short description (with fixed HTML tags), so Magento sees no change, the short description is NOT saved, and the value in the database still contains the malformed HTML. You would have to add an nbsp or an empty open & close span tag for Magento to recognize the field as edited and thus save the now-automatically-fixed HTML tags.
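
The behavior described above can be modeled in a few lines. This Python sketch is only an illustration of the reported symptom, assuming the save step compares submitted values against the values the form was loaded with; `fields_to_save` is a hypothetical helper and is not Magento's actual save logic.

```python
def fields_to_save(loaded_form_values, submitted_form_values):
    """Return only the fields whose submitted value differs from the loaded one."""
    return {
        name: value
        for name, value in submitted_form_values.items()
        if loaded_form_values.get(name) != value
    }


# Value as stored in the database (malformed).
stored = "Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc"
# Value as shown in the editor (auto-corrected).
corrected = (
    "<p>Product XYZ (Model ABC)</p>"
    "<p><span><strong>Lorem Ipsum Etc Etc</strong></span></p>"
)

# The editor is populated with the corrected markup, not the stored value.
loaded = {"short_description": corrected}

# Saving without touching the field: nothing differs, so nothing is written
# and the database keeps the malformed `stored` value.
print(fields_to_save(loaded, {"short_description": corrected}))  # {}

# Appending an empty span makes the field differ, so it would be written.
edited = corrected + "<span></span>"
print(fields_to_save(loaded, {"short_description": edited}))
```

Under this assumption the comparison never involves `stored` at all, which is why an untouched save leaves the malformed value in place.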

That said, I think it makes more sense (both logically and for usability) to simply make it a Store setting to disable auto-completion of malformed HTML tags in HTML-editable fields, so after-migration, Admins can see potentially malformed HTML tags in their product descriptions, and turn the auto-complete back on once migrated data is cleaned up, to ensure no other malformatted HTML tags make their way into the database.

Let me know if I can provide any additional details to reproduce the problem, I think my original post & screenshots should show you the full extent of the problem, however.

Cheers & thanks in advance!
-Art

Hi @engcom-Echo. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

  • [ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
    Details: If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.
  • [ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • [ ] 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • [ ] 4. Verify that the issue is reproducible on 2.3-develop branch

    Details:
    - Add the comment @magento give me 2.3-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and _stop verification process here_!

  • [ ] 5. Add label Issue: Confirmed once verification is complete.

  • [ ] 6. Make sure that automatic system confirms that report has been added to the backlog.

Hello @not-art

Thank you for your contribution and collaboration!
I moved this issue into the Magento Commerce private repository, where it can be fixed and delivered through the Solution Partners Contribution Program.
The current repository and issue tracker are aimed at the Magento Open Source version only, and the main focus is community contribution and collaboration. This is described in the Issue reporting guidelines and is part of the issue report template:

Verify, that the issue you are about to report does not relate to the Magento Commerce. GitHub is intended for Magento Open Source users to report on issues related to Open Source only. There are no account management services associated with GitHub. You can report Commerce-related issues in one of two ways:

  • You can use the Support portal associated with your account
  • If you are a Partner reporting on behalf of a merchant, use the Partner portal.

I believe Page Builder's use of DOMDocument is to blame for the auto-formatting of your content. It has a tendency to modify HTML while processing it; in your case, it's fixing your broken HTML.

This is handled by app/code/Magento/PageBuilder/Model/Filter/Template.php and you'll be able to see where we make detections on your description to determine if we need to do additional processing.
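
To make the effect concrete, here is a toy Python sketch of a lenient repair pass. It is not DOMDocument, libxml2, or Page Builder code, and it does not follow their algorithm. `LenientRepairer` and `repair` are hypothetical names, and the two rules it applies (wrap the text before a stray end tag, close whatever is still open) are assumptions chosen only because they reproduce the output reported in this issue.

```python
from html import escape
from html.parser import HTMLParser

# Elements with no closing tag; they are emitted but never tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LenientRepairer(HTMLParser):
    """Toy repair: wraps text before a stray end tag and closes open tags."""

    def __init__(self):
        super().__init__()
        self.out = []            # output pieces, in document order
        self.open_tags = []      # stack of open tag names
        self.text_start = None   # index in self.out where the current text run began

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)
        self.out.append(self.get_starttag_text())  # keep the tag as written
        self.text_start = None

    def handle_data(self, data):
        if self.text_start is None:
            self.text_start = len(self.out)
        self.out.append(escape(data, quote=False))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag in self.open_tags:
            # Close everything opened after the matching tag, then the tag itself.
            while True:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break
        elif self.text_start is not None:
            # Stray end tag: wrap the preceding text run in the missing start tag.
            self.out.insert(self.text_start, f"<{tag}>")
            self.out.append(f"</{tag}>")
        # A stray end tag with no preceding text is silently dropped.
        self.text_start = None


def repair(html_fragment):
    parser = LenientRepairer()
    parser.feed(html_fragment)
    parser.close()
    # Close anything still open, innermost first.
    closing = [f"</{tag}>" for tag in reversed(parser.open_tags)]
    return "".join(parser.out + closing)


stored = "Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc"
print(repair(stored))
# <p>Product XYZ (Model ABC)</p><p><span><strong>Lorem Ipsum Etc Etc</strong></span></p>
```

The point of the sketch is that any parser that builds a tree and serializes it back will hand the editor well-formed markup, regardless of what the stored string looked like.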

While I can appreciate this is annoying I don't believe this is a bug. Trying to utilize invalid data within a system solely responsible for modifying and generating valid HTML is going to have side effects.

My recommendations would be to either disable Page Builder programmatically in areas you do not require it or to look into re-importing your product descriptions from a known good source, without the invalid markup.

After review and discussion with the internal "Page Builder" team, it was confirmed that this is not a bug.
Details are in the comment above.

We are provided product feeds from 3rd party vendors we source inventory from. While most of the time their product data has valid HTML, sometimes it does not. We have no way of easily checking if every product's data is correct when importing, and vendors WILL make mistakes; human nature and all.

In those scenarios, my above-mentioned issue is indeed a bug, I would argue; there is no way to see in the admin system that the HTML stored in the database is malformed, AND, on top of that, PageBuilder isn't saving the "corrected" HTML it fixes, since the "pre-save" form value matches the "post-save" value, since the pre-save data's HTML is already corrected. So even if you save the product, the short description won't write the fixed HTML to the database unless the Admin happens to know the HTML was broken, and adds some whitespace or an empty tag.

Our merch team reviews each product we import before putting it live on the site, but thanks to how Page Builder is working, our Merch team will NEVER see bad HTML from a vendor feed, and on top of that, Magento won't save the corrected HTML if the admin user activates the product on the site.

Given this workflow, you are causing our customers to have to smoke-test our merchandise. That is not a correct workflow for a store, and in that case, this is indeed a bug. It MUST save the corrected HTML if it's auto-editing my product data.

I will agree that it may not be a bug that Page Builder is automatically fixing malformed HTML in product fields, but at the very least, it's a bug that Magento Admin won't save the automatically fixed HTML to the database without adding some random data to the field to "change" the description and trigger an update.

AND, if it auto-fixes the HTML, it should show some warning to the Admin user TELLING them to save the product. What is the point of auto-fixing content, and then both not telling the admin user about the change, on top of not saving the fixed HTML if they were to happen to click save?

You need to tell the user when malformed HTML is fixed by Page Builder, and suggest that they save the product. Otherwise, please stop auto-correcting the HTML in raw HTML edit mode. As it stands, you're preventing our Merch team from doing its own QA and forcing our customers to be our QA. :/

@not-art

I think the best way to resolve this issue is to report it directly to Enterprise support for your Magento Commerce account.

The current repository and issue tracker are aimed at the Magento Open Source version only, and the main focus is community contribution and collaboration. This is described in the Issue reporting guidelines and is part of the issue report template:

Verify, that the issue you are about to report does not relate to the Magento Commerce. GitHub is intended for Magento Open Source users to report on issues related to Open Source only. There are no account management services associated with GitHub. You can report Commerce-related issues in one of two ways:

  • You can use the Support portal associated with your account
  • If you are a Partner reporting on behalf of a merchant, use the Partner portal.

Thanks @sdzhepa, that makes sense. We'll submit a Commerce ticket. Thank you for the help / explanation.
