We are trying to create a Gatsby theme that programmatically searches the DOM for certain HTML elements (`h1` tags, etc.) so that we can decorate them (add attributes) before they are written to disk. We have looked into hooks such as onPreRenderHTML (gatsby-ssr.js) and onCreatePage (gatsby-node.js), but neither seems to give us access to the page contents.
```
System:
OS: macOS 10.14.6
CPU: (12) x64 Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
Shell: 3.2.57 - /bin/bash
Binaries:
Node: 12.8.0 - ~/.nvm/versions/node/v12.8.0/bin/node
Yarn: 1.17.3 - ~/.yarn/bin/yarn
npm: 6.10.3 - ~/.nvm/versions/node/v12.8.0/bin/npm
Languages:
Python: 2.7.16 - /usr/bin/python
Browsers:
Chrome: 81.0.4044.113
Safari: 13.1
npmPackages:
gatsby: 2.20.8 => 2.20.8
gatsby-plugin-react-helmet: ^3.1.22 => 3.1.22
gatsby-plugin-sass: ^2.1.29 => 2.1.29
gatsby-source-contentful: ^2.1.89 => 2.1.89
gatsby-transformer-remark: ^2.6.53 => 2.6.53
npmGlobalPackages:
gatsby-cli: 2.7.58
gatsby-core-utils: 1.0.13
gatsby-dev-cli: 2.5.35
gatsby-theme-minimal: 1.0.0
gatsby: 2.8.8
### File contents (if changed)
`gatsby-config.js`: N/A
`package.json`: N/A
`gatsby-node.js`:
exports.createPages = ({ page }) => {
  ...
  console.log(page)
  //logging undefined
}
`gatsby-browser.js`: N/A
`gatsby-ssr.js`:
exports.onPreRenderHTML = ({ getPreBodyComponents }) => {
let preBodyComponents = getPreBodyComponents()
console.log(JSON.stringify(preBodyComponents, null, 2))
//logging empty []
}
```
Can you elaborate a bit on what attributes you're talking about? There are a few options but they are really advanced, like adding your own React.createElement wrapper, ....
Would hooking into onPostBuild and modifying the HTML files that have already been written to disk work?
note: this hasn't been tested
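const fs = require('fs')
const glob = require('glob')
const cheerio = require('cheerio')

exports.onPostBuild = () => {
  // glob.sync returns the matching paths directly (plain glob() is asynchronous)
  const htmlFiles = glob.sync(`public/**/*.html`)
  htmlFiles.forEach((file) => {
    // parse each generated page, add the attribute to its h1 elements, write it back
    const $ = cheerio.load(fs.readFileSync(file).toString())
    $('h1').attr('new-attr', 'value')
    fs.writeFileSync(file, $.html())
  })
}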
Hi @wardpeet - I've been working with Sam on this problem and our thinking was that there would be a moment in the build when we would have access to some sort of data structure that we could manipulate before things are written to disk.
The context is that we're trying to add the elementtiming attribute to elements on our pages, but the pages are built from third-party components that we don't control. Looking at this example, it looks like onCreatePage should receive a page argument, and the types seem to say the same. When we tried that, though, page was undefined (see our first example snippet).
While your suggestion for onPostBuild makes sense, I had hoped to run this process as a part of the page-building process because, as I understand it, Gatsby will parallelize that work across my cores by page whereas onPostBuild would only run on the JS main thread.
You should get the page when running exports.onCreatePage = ({ page }) => {. It's possible to inject context variables into the page but you won't be able to change all h1 components.
Another option is to run Babel/webpack AST transformations to accomplish your task. Can you give a concrete example of what you're trying to do? A small before-and-after snippet of the actual React code you would like to end up with would help.
Ah, sure. Whether JSX or HTML, what we want to do is to take
<html>
<head>
...
</head>
<body>
<div>
<h1>I'm the first H1 on the page</h1>
</div>
</body>
</html>
and turn it into
<html>
<head>
...
</head>
<body>
<div>
<h1 elementtiming="first-h1">I'm the first H1 on the page</h1>
</div>
</body>
</html>
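As a rough illustration of the before/after above, the transformation amounts to tagging the first `<h1>` in an HTML string. This is only a sketch: `addElementTiming` is a hypothetical helper name, and it uses a regular expression purely to stay dependency-free. A real build would more likely use an HTML parser such as cheerio, since a regex can be fooled by comments or inline scripts.

```javascript
// Hypothetical helper: add elementtiming to the first <h1> in an HTML string.
// Assumes reasonably well-formed markup; a regex is a simplification, not a parser.
function addElementTiming(html, value = 'first-h1') {
  // No "g" flag, so only the first opening <h1 ...> tag is considered.
  return html.replace(/<h1(\s[^>]*)?>/i, (tag, attrs = '') => {
    // Leave the tag alone if it already carries the attribute.
    if (/\selementtiming\s*=/i.test(attrs)) return tag;
    return `<h1${attrs} elementtiming="${value}">`;
  });
}
```

Calling `addElementTiming(html)` on the first document should produce the second one, and later `h1` elements on the same page are left untouched.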
Perhaps I'm doing something wrong with my onCreatePage because when I add
exports.onCreatePage = ({ page }) => {
  console.log(`OnCreatePage`);
  console.log(page);
}
to my gatsby-node.js, those console.logs never fire. That's not strictly related to the issue of rewriting HTML, but it is impeding my ability to report what I'm receiving. I think this was caused by intentional behavior within Gatsby, and I have a workaround.
You can use pageContext to inject variables in your page but you still have to wire them correctly. See https://www.gatsbyjs.org/docs/creating-and-modifying-pages/#pass-context-to-pages
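For reference, injecting a value through pageContext might look roughly like the sketch below, which follows the delete-and-recreate pattern from the linked docs. The context key `elementTimingId` is a made-up name for illustration, and the page template would still have to read it from `pageContext` and apply it, so this only helps for components you control.

```javascript
// gatsby-node.js (sketch). `elementTimingId` is a hypothetical context key.
const onCreatePage = ({ page, actions }) => {
  const { createPage, deletePage } = actions;
  // Replace the page with a copy whose context carries the extra value.
  deletePage(page);
  createPage({
    ...page,
    context: { ...page.context, elementTimingId: 'first-h1' },
  });
};

exports.onCreatePage = onCreatePage;
```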
I don't have the full use case, but I feel like onPostBuild is your best bet. You can still use https://github.com/facebook/jest/tree/master/packages/jest-worker to distribute the work over cores. Also, Gatsby currently distributes only part of the build across cores; we'll work more on this.
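To make the idea of distributing the work over cores concrete, here is an untested sketch. Note that it swaps in Node's built-in `worker_threads` module instead of jest-worker so that it has no dependencies. `chunk`, `workerSource`, and `processInWorkers` are hypothetical names, and the regex inside the worker is a stand-in for a proper HTML parser.

```javascript
const os = require('os');
const { Worker } = require('worker_threads');

// Split a list into at most `n` roughly equal chunks.
function chunk(items, n) {
  const size = Math.ceil(items.length / n) || 1;
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Source run inside each worker: rewrite every file listed in workerData.
// The regex tags the first <h1> that lacks the attribute (a simplification).
const workerSource = `
  const fs = require('fs');
  const { workerData, parentPort } = require('worker_threads');
  const firstH1 = /<h1(?=[ >])(?![^>]*elementtiming)([^>]*)>/i;
  for (const file of workerData) {
    const html = fs.readFileSync(file, 'utf8');
    fs.writeFileSync(file, html.replace(firstH1, '<h1$1 elementtiming="first-h1">'));
  }
  parentPort.postMessage(workerData.length);
`;

// Start one worker per chunk; resolves with the number of files processed.
function processInWorkers(files, workerCount = os.cpus().length) {
  const jobs = chunk(files, workerCount).map(
    (part) =>
      new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, { eval: true, workerData: part });
        worker.once('message', resolve);
        worker.once('error', reject);
        worker.once('exit', (code) => {
          if (code !== 0) reject(new Error(`worker exited with code ${code}`));
        });
      })
  );
  return Promise.all(jobs).then((counts) => counts.reduce((a, b) => a + b, 0));
}
```

In gatsby-node.js this could be wired up by returning `processInWorkers(...)` from onPostBuild with the list of generated HTML paths, for example the result of `glob.sync('public/**/*.html')` if the glob package is installed.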
@wardpeet - I've got a working solution using onPostBuild.
As feedback for Gatsby, it would be helpful if I could access these HTML files via webpack in order to alter them (for instance with a critical CSS generator) but this issue can be closed.
Hiya!
This issue has gone quiet. Spooky quiet. 👻
We get a lot of issues, so we currently close issues after 30 days of inactivity. It's been at least 20 days since the last update here.
If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!
As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!
Thanks for being a part of the Gatsby community! 💪💜
@abmagil, this could slow down the build by a ton and seems like a niche use case. Thanks for the feedback, I'll share it with the team!