Hugo: Chops off trailing .Content in one multioutput setup

Created on 7 May 2018  ·  20 Comments  ·  Source: gohugoio/hugo

Hello,

I realized that Hugo 0.40.1 introduced a serious regression.

It chops off some of the trailing .Content!

  • This issue is not seen in hugo server.
  • It is only seen when the site is built to public/ using the hugo command.

Bug example

Notice the malformed HTML:

image

This is what I have in the template:

    <h3>Content</h3>
    {{ .Content }}

    {{ if (not .Params.disable_debug) }}
        {{ with .Resources }}
            <hr />
            <h3>Resources</h3>

And this is the HTML around the part where the .Content ends:

<table>
<thead>
<tr>
<th>Inside <code>&lt;ORG_FILE_DIR&gt;</code></th>
<th>Copied-to location inside BUNDLE</th>
<th>Explanation</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>&lt;ORG_FILE_DIR&gt;/bar/baz/foo.png</code></td>
<td><code>&lt;HUGO_BASE_DIR&gt;/content/&lt;SECTION&gt;/&lt;BUNDLE&gt;/bar/baz/foo.png</code></td>
<td>Even if the <strong>outside</strong> path does not have <code>/static/</code>, it is still inside the same dir as the Org file, 



            <hr />
            <h3>Resources</h3>

Notice that the ending portion of the table got truncated!

If I roll back to Hugo 0.40, this bug is gone. See that the table is generated in its entirety:

<table>
<thead>
<tr>
<th>Inside <code>&lt;ORG_FILE_DIR&gt;</code></th>
<th>Copied-to location inside BUNDLE</th>
<th>Explanation</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>&lt;ORG_FILE_DIR&gt;/bar/baz/foo.png</code></td>
<td><code>&lt;HUGO_BASE_DIR&gt;/content/&lt;SECTION&gt;/&lt;BUNDLE&gt;/bar/baz/foo.png</code></td>
<td>Even if the <strong>outside</strong> path does not have <code>/static/</code>, it is still inside the same dir as the Org file, so the directory structure is preserved.</td>
</tr>
</tbody>
</table>

<p>See <a href="https://ox-hugo.scripter.co/test/images-in-content/page-bundle-images-in-same-dir/">this other test</a> for an example.</p>




            <hr />
            <h3>Resources</h3>

How to recreate this issue

  1. git clone --recursive -j8 https://github.com/kaushalmodi/ox-hugo
  2. cd ox-hugo/test/site
  3. hugo -b https://ox-hugo.scripter.co/test/
  4. HTML referenced in this report: ./public/bundles/page-bundle-a/index.html.

I have not yet been able to create a smaller reproducible version of this bug. But it is 100% reproducible using the above steps.

  • The bug is absent in Hugo 0.40
  • The bug is present in Hugo 0.40.1 onwards.
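
The truncation can also be checked mechanically instead of by reading the generated HTML. The following is a minimal sketch, not part of the original report: it uses Python's standard `html.parser` to list table-related tags that are opened but never closed. The function name `unclosed_table_tags` and the set of tracked tags are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Only table-related tags are tracked; everything else (e.g. <hr />, <h3>) is ignored.
TRACKED = {"table", "thead", "tbody", "tr", "td", "th"}


class UnclosedTagFinder(HTMLParser):
    """Keeps a stack of tracked tags that are currently open."""

    def __init__(self):
        super().__init__()
        self.stack = []

    def handle_starttag(self, tag, attrs):
        if tag in TRACKED:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in TRACKED and tag in self.stack:
            # Pop up to and including the matching opening tag, which also
            # discards inner tags that were closed implicitly.
            while self.stack.pop() != tag:
                pass


def unclosed_table_tags(html):
    """Return the tracked tags still open at the end of the document."""
    finder = UnclosedTagFinder()
    finder.feed(html)
    finder.close()
    return finder.stack
```

Run against the contents of ./public/bundles/page-bundle-a/index.html, a non-empty result would point at the kind of cut-off table shown above. HTML allows some closing tags (such as `</td>` and `</tr>`) to be omitted, so a non-empty result is a hint to look closer, not proof of truncation.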

All 20 comments

I wish I could understand the changes in this commit (https://github.com/gohugoio/hugo/commit/288c39643906b4194a0a6acfbaf87cb0fbdeb361). It is the only functional difference between 0.40 and 0.40.1, so it must be what introduced this issue.

For the future, please keep bug titles in a neutral tone. Using movie-style titles to draw attention really doesn't work.

I needed to make the point that this was serious: it caused content loss. It's not a big deal for my test site, but it could cause a problem with professional sites.

I am not sure what was "movie title like" in the old title. I mentioned the key points of information:

  • regression
  • since 0.40.1
  • serious, as causes content loss
  • bug description

I cannot see in the example above that the .Content is chopped off? You say it is chopped off after the content?

You say it is chopped off after the content?

No, the trailing portion of the .Content is chopped.

If you look at the Markdown source:

|----------------------------------|--------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|
| `<ORG_FILE_DIR>/bar/baz/foo.png` | `<HUGO_BASE_DIR>/content/<SECTION>/<BUNDLE>/bar/baz/foo.png` | Even if the **outside** path does not have `/static/`, it is still inside the same dir as the Org file, so the directory structure is preserved. |

See [this other test](/images-in-content/page-bundle-images-in-same-dir/) for an example.

you will see that the HTML contains the content only up to "it is still inside the same dir as the Org file,":

<td>Even if the <strong>outside</strong> path does not have <code>/static/</code>, it is still inside the same dir as the Org file, 

This part from the Markdown file (towards the end) is missing in the HTML:

so the directory structure is preserved. |

See [this other test](/images-in-content/page-bundle-images-in-same-dir/) for an example.

Note the "good HTML" snippet generated using Hugo 0.40 has this portion:

so the directory structure is preserved.</td>
</tr>
</tbody>
</table>

<p>See <a href="https://ox-hugo.scripter.co/test/images-in-content/page-bundle-images-in-same-dir/">this other test</a> for an example.</p>

which is absent in the live example.

As the table-closing is chopped off, the "Resources" heading which should be a separate h3 heading ends up in that incomplete table:

image

Also, I don't understand how to build the above project (seems to need Emacs and vc-git etc).

Sorry, I edited the steps a few minutes ago. Try:

  1. git clone --recursive -j8 https://github.com/kaushalmodi/ox-hugo
  2. cd ox-hugo/test/site
  3. hugo -b https://ox-hugo.scripter.co/test/
  4. HTML referenced in this report: ./public/bundles/page-bundle-a/index.html.

Screencast (asciinema)

Yes, I can reproduce it on your site. I understand the _what_, but not the _why_. I will investigate, and issue a patch release tomorrow. Thanks for the report.

And re. the issue title, a movie title is of type "Scary movie in theatres on May 8th"

Issue titles should be short and tell _what_ it is. The when and how serious etc. may belong in the body text, but it should be obvious from the what. I see too many drama titles of type "Hugo 0.x seriously blew up my site." A bug is super-serious to the issue reporter, I understand that, but ...

A bug is super-serious to the issue reporter, I understand that, but ...

Understood.

Yes, I can reproduce it on your site. I understand the what, but not the why. I will investigate, and issue a patch release tomorrow. Thanks for the report.

Thanks!

Looks like this issue affects a lot more pages than I thought; surprised that no one else spotted this.

Another example:

Note that all content after "It is necessary to set the Hugo site config variable" below is lost:

**It is necessary to set the Hugo site config variable
`pygmentsCodeFences` to `true` for syntax highlighting to work for
fenced code blocks.**

[^fn:1]: Even if the user has set the `HUGO_CODE_FENCE` value to `t` (via variable, keyword or subtree property), the Hugo `highlight` shortcode will be used automatically instead of code fences if either (i) the user has chosen to either show the line numbers, or (ii) has chosen to highlight lines of code (See the `ox-hugo` documentation on [Source Blocks](https://ox-hugo.scripter.co/doc/source-blocks)).

Here's the truncated .Content HTML:

<p><strong>It is necessary to set the Hugo site config variable
<





        <hr />
        <a id="debug"></a>

Something special about me using <hr /> immediately after {{ .Content }}? :)

image

surprised that no one else spotted this.

Your site does not represent the average Hugo site.

If you could take the PR above for a spin and tell me if that fixes this issue.

If you could take the PR above for a spin and tell me if that fixes this issue.

I confirm that the PR fixes this issue. (I have now rebuilt my site using the fix in the PR. The only way to see the bug now is to use Hugo v0.40.1 or v0.40.2 and generate the site locally.) Thanks!

Your site does not represent the average Hugo site.

OK. Not sure what would represent an average Hugo site. The only thing different I am doing in this site is printing out all the debug info and implementing search. Apart from that, it's pretty bare bones. I don't even use any shortcodes other than maybe figure (apart from a few test pages where I test shortcodes).

In any case, I'd like to understand how my site turned out to be a corner case (if so). Hopefully looking at the tests you add helps me understand that.

Also note that this issue showed up only when running hugo, not hugo server. That might be a useful piece of this puzzle too.

The only thing different I am doing in this site is print out all the debug info and implement search. Apart from that, it's pretty bare bones.

I will look at this more tomorrow. This is a bug, but I suspect that it relates to your not very _bare bones_ page output definitions:

▶ find content -name "*.md" | xargs grep outputs
content/search.md:outputs = ["html", "json"]
content/posts/output-html-and-json.md:outputs = ["html", "json"]
content/posts/output-html-and-json.md:tags = ["outputs", "json"]
content/posts/output-html-and-json.md:template lookup hierarchy for the JSON outputs to be created.
content/posts/output-empty.md:title = "Setting empty outputs is fine"
content/posts/output-empty.md:tags = ["outputs", "empty"]
content/posts/output-empty.md:will not set the `outputs` variable in the front-matter at all. So
content/posts/keyword-collection.md:outputs = ["html", "json"]
content/posts/keyword-collection.md:-   [X] `#+hugo_outputs`

Aha, to separate out the matches with "outputs" in that:

Files with outputs front-matter

> find .  -name "*.md" | xargs grep -P 'outputs ='
./posts/keyword-collection.md:outputs = ["html", "json"]
./posts/output-html-and-json.md:outputs = ["html", "json"]
./search.md:outputs = ["html", "json"]

The first two are my tests for setting outputs. The last is what implements the test-site-wide search.

outputs as a tag

> find . -name "*.md" | xargs grep -P 'tags.*"outputs?"'
./posts/output-empty.md:tags = ["outputs", "empty"]
./posts/output-html-and-json.md:tags = ["outputs", "json"]
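
The same separation can be done without relying on grep patterns by looking only inside the front matter. This is a rough sketch under stated assumptions: the content files use TOML front matter delimited by `+++` lines, and `outputs` is written as a single-line array of double-quoted strings, as in the matches above. The helper names `front_matter_outputs` and `pages_with_outputs` are made up for this example.

```python
import json
import re
from pathlib import Path

# Assumption: TOML front matter sits between two "+++" lines at the top of the file.
FRONT_MATTER = re.compile(r"\A\+\+\+\n(.*?)\n\+\+\+", re.DOTALL)
# Matches a line such as: outputs = ["html", "json"]
OUTPUTS_LINE = re.compile(r"^outputs\s*=\s*(\[.*\])\s*$", re.MULTILINE)


def front_matter_outputs(text):
    """Return the outputs list from the front matter, or None if it is not set."""
    block = FRONT_MATTER.match(text)
    if block is None:
        return None
    line = OUTPUTS_LINE.search(block.group(1))
    if line is None:
        return None
    # A single-line TOML array of double-quoted strings is also valid JSON.
    return json.loads(line.group(1))


def pages_with_outputs(content_dir):
    """Map each Markdown file under content_dir to its front-matter outputs."""
    result = {}
    for path in sorted(Path(content_dir).rglob("*.md")):
        outputs = front_matter_outputs(path.read_text(encoding="utf-8"))
        if outputs is not None:
            result[str(path)] = outputs
    return result
```

Calling `pages_with_outputs("content")` from the site directory would list only the pages that actually set `outputs`, ignoring mentions of the word in tags or body text.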

Problem to use outputs as a tag name?


I will look at this more tomorrow.

Thank you! I'll stay tuned! :)

Problem to use outputs as a tag name?

No, that was just me being lazy with the grep.

Hi @bep! Sorry for the interruption, but I caught a possible typo in your commit log:

There have been one report of a site with truncated .Content after the Hugo 0.41 release.

I guess you meant to say 0.40 or 0.40.1 release there? Many thanks!

Just a quick note. I will create a patch release with this tomorrow.

Also note that even if the fix was semi-obvious reading the code, I have not been able to reproduce this outside of the synthetic test site from @kaushalmodi -- I have tested the output before/after this patch for a set of sites, including the pretty big Kubernetes site with 1000+ long content pages in both English and Chinese, with exact same result.
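
A before/after comparison of this kind can be sketched as follows. This is only an illustration of the idea, not the procedure actually used in this thread: it assumes the same site was built twice into two separate directories (for example via hugo's `--destination` flag), and the function names `tree_digests` and `diff_builds` are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def diff_builds(before_dir, after_dir):
    """Report files that exist on one side only or whose contents differ."""
    before = tree_digests(before_dir)
    after = tree_digests(after_dir)
    common = before.keys() & after.keys()
    return {
        "only_before": sorted(before.keys() - after.keys()),
        "only_after": sorted(after.keys() - before.keys()),
        "changed": sorted(name for name in common if before[name] != after[name]),
    }
```

If all three lists come back empty, the two builds are byte-for-byte identical. Any page listed under "changed" is a candidate for the truncation described in this issue.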

I will create a patch release with this tomorrow.

Thanks!

I have not been able to reproduce this outside of the synthetic test site from @kaushalmodi

That definitely makes me feel special 😁. But was the issue caused just because of those outputs front-matter in those 3 pages? Apparently no one/very few people use that front-matter?

Note that my test site is testing every Hugo front-matter, for the sake of my ox-hugo package.

@kaushalmodi I tried to pin-point the exact error case, but I got tired. The tests in this area should be greatly improved as a result of this, and that is great.

I tried to pin-point the exact error case, but I got tired.

No problem. Thank you for the prompt fix, and in addition adding more test cases!
