We pushed lots of importer improvements in the last year. I think the performance of the 1.x importer decreased a little because we refactored the importer code base, fixed bugs and added more features.
One last improvement for the 1.x importer is missing. If you import a large JSON file via the browser, a browser or nginx default timeout can kick in too early. Even though the server should still finish the import in the background, we should improve this, because the user doesn't really understand what's going on.
I think it's impossible to extend (?) the default timeouts for all supported/used browsers, and even if we could, there is still a web-server component that could time out the connection.
So @kevinansfield and I think that adding short polling for the db import will resolve this. For that, I think we need to do the following:
/db/import/result) - could simply return inProgress: true|false and the import result with warnings and whether the import succeeded or failed.
Triggering another import while one is active should return an error.
We can keep the import result in process; there is no need to persist it. If you restart Ghost while the import is running, the import won't finish and nothing gets changed in the database, because we run everything in a transaction. This is the easiest solution I can think of that should work.
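To make the in-process idea concrete, here is a minimal sketch of the state the polling endpoint could read from. Everything in it is an assumption for illustration: `ImportTracker`, `start`, `finish`, `status` and the result shape are hypothetical names, not existing Ghost code.

```typescript
// Hypothetical sketch: none of these names exist in Ghost.
type ImportResult = { imported: boolean; warnings: string[]; error?: string };
type ImportStatus = { inProgress: boolean; result: ImportResult | null };

class ImportTracker {
    // State lives only in memory, so it is lost when the process restarts.
    private inProgress = false;
    private lastResult: ImportResult | null = null;

    // Call before starting an import; rejects a second import while one is active.
    start(): void {
        if (this.inProgress) {
            throw new Error('An import is already in progress');
        }
        this.inProgress = true;
        this.lastResult = null;
    }

    // Call once the import has either succeeded or failed.
    finish(result: ImportResult): void {
        this.inProgress = false;
        this.lastResult = result;
    }

    // What a polling endpoint could return to the client.
    status(): ImportStatus {
        return { inProgress: this.inProgress, result: this.lastResult };
    }
}
```

Because nothing is persisted, a restart simply resets the tracker to "no import running", which matches the transaction behaviour described above.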
NOTE: Currently, if you import a database, you are still able to send any GET/POST/... request to the Ghost API. These requests are not blocked or denied while the import is running. That is a different topic that we might solve in the future, and it is unrelated to short polling.
@kevinansfield Leave any concerns/thoughts as a comment 👍
Why are 1 & 2 separate? It seems like the response the client gets back as soon as the upload finishes should be whatever the suggested response for 2 is.
Is there a use-case for uploading an import file and not triggering the import?
We can also merge 1 & 2 into one step if we want. I thought we could follow our upload endpoint pattern.
If it's a lot easier to do that, then that's fine. Otherwise I think it makes more sense for a successful upload to automatically trigger the import.
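From the client side, the merged flow could look roughly like the sketch below: the upload response already carries the import status, and the client keeps polling until inProgress is false. The function name, the status shape, the interval and the poll limit are all assumptions for illustration, and the upload, status and sleep calls are injected rather than tied to any real endpoint.

```typescript
// Hypothetical shapes and names; this is not Ghost's actual API.
type PollStatus = { inProgress: boolean; failed: boolean; warnings: string[] };

async function uploadAndWait(
    upload: () => Promise<PollStatus>,      // assumed: sends the file, resolves with the first status
    fetchStatus: () => Promise<PollStatus>, // assumed: asks the server for the current status
    sleep: (ms: number) => Promise<void>,   // injected so the wait can be replaced in tests
    intervalMs = 2000,
    maxPolls = 300
): Promise<PollStatus> {
    // A successful upload triggers the import and returns its status right away.
    let status = await upload();

    // Short polling: ask again at a fixed interval until the import is done.
    for (let polls = 0; status.inProgress; polls++) {
        if (polls >= maxPolls) {
            throw new Error('Import still running after ' + maxPolls + ' polls');
        }
        await sleep(intervalMs);
        status = await fetchStatus();
    }
    return status;
}
```

Each request in this flow is short-lived, so neither the browser nor a proxy in front of the server has to hold one connection open for the whole import.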
Is there a workaround for this, and what are you defining as “large”?
@lordnothing "large" means over ~10-15 MB.
There is an open PR that will speed up the importer by ~50%. It will be released in the next few weeks.
As for a workaround: the server should still finish the import in the background.
If you have a large file to import, I would ignore the client timeout and watch the server logs. Wait until the import has fully finished in the background.
This issue might become less important because of https://github.com/TryGhost/Ghost/pull/9431.
But it is still something we should consider reserving some time for in the future.
Hey folks - is this issue separate from the Node server closing the connection? I'm experiencing consistent import failures (the file is 4 MB), receiving 502 errors in the browser from the POST request after 2 minutes. I believe that's because Node's default server timeout is set to 2 minutes? https://nodejs.org/api/http.html#http_server_timeout
Is there a way to override that default setting? Should I open a separate issue for this or is it related?
Thanks!
@sbrichards same issue, no need to open a separate one. Until an async background process and polling are implemented, it's typically safe to ignore the client timeout:
If you have a large file to import, I would ignore the client timeout and watch the server logs. Wait until the import has fully finished in the background.
Would be really nice to tackle this soon. The browser timeout sucks 🙊
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This wasn’t stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale. Plan to review https://github.com/TryGhost/Ghost/pull/10340 coming Monday.
I have a ~700 MB import from WordPress. I increased the timeouts in nginx and imported the file. The latest log line I see is:
Dec 01 11:30:08 localhost.localdomain node[812]: [2019-12-01 17:30:08] INFO "POST /ghost/api/v3/admin/db/" 200 67608ms
But now Ghost is unresponsive. Is there some database migration happening in the background that I have to wait for? I gave up on waiting the first time and tried to restart Ghost, but it never stopped.
Hi, is there any workaround for this? I have a 7 MB file to import that seems to time out.
The issue seems to have been with nginx. I fixed it by raising the proxy read timeout to 1200 seconds in nginx.conf:
proxy_read_timeout 1200;