Pg-promise: Trying to restore database from dump file using QueryFile

Created on 16 Apr 2018  路  11Comments  路  Source: vitaly-t/pg-promise

I've got a big database dump file (created with the pg_dump command) which restores the database totally fine using psql or pg_restore.

Now I'm trying to restore the database from this file in Node without using psql or pg_restore, and tried using your pg-promise library. Unfortunately it does not seem to work.

By running the following code:

const queries = pgp.QueryFile('data.sql', { minify: true })
db.many(queries)
    .then(res => {
        console.log(res)
    })
    .catch(e => {
        console.log(e)
    })

I get the following error message:

QueryFileError {
    message: "Failed to parse the SQL."
    options: {"debug":false,"minify":true,"compress":false,"noWarnings":false}
    file: "data.sql"
    error: SQLParsingError {
        code: parsingErrorCode.multiLineQI
        error: "Multi-line quoted identifiers are not supported."
        position: {line: 39139, col: 299}
    }
}

While running the same code with { minify: false } returns this error:

{ error: syntax error at or near "1"
    at Connection.parseE (.../search-benchmark/node_modules/pg/lib/connection.js:545:11)
    at Connection.parseMessage (.../search-benchmark/node_modules/pg/lib/connection.js:370:19)
    at Socket.<anonymous> (.../search-benchmark/node_modules/pg/lib/connection.js:113:22)
    at emitOne (events.js:115:13)
    at Socket.emit (events.js:210:7)
    at addChunk (_stream_readable.js:266:12)
    at readableAddChunk (_stream_readable.js:253:11)
    at Socket.Readable.push (_stream_readable.js:211:10)
    at TCP.onread (net.js:585:20)
  name: 'error',
  length: 93,
  severity: 'ERROR',
  code: '42601',
  detail: undefined,
  hint: undefined,
  position: '64306',
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: undefined,
  table: undefined,
  column: undefined,
  dataType: undefined,
  constraint: undefined,
  file: 'scan.l',
  line: '1087',
  routine: 'scanner_yyerror' }

I can't provide the SQL-file. Do you see any obvious mistakes I've made or do you have any idea about what might be wrong? Thanks.

Edit: To be clear, I know what the "Multi-line quoted identifiers are not supported." means, but I do not see any easy way to fix this without manually going through the large SQL-file.

question

All 11 comments

I do not see any easy way to fix this without manually going through the large SQL-file

The error tells you exactly the line and column where the problem is. Should be easy to locate it and fix.

position: {line: 39139, col: 299}

I know, but I've got about 100 occurrences of the same problem in the SQL-dump. And I'm trying to automate the process of dumping the db > drop the db > create new clean db and then execute the queries from the SQL-dump in the new clean db, all without having to manually edit the SQL-file every time.

There's no way around the error? When querying with minify: false I get a different type of error. Do you think the same problem is the origin of that error as well?

There are some apparent issues with the format of the SQL file that you generate.

psql is known to pre-parse and work-around some formatting issues, while neither this library no its underlying driver can, i.e. the file is either valid SQL or it is not.

I would suggest looking into how you end up getting quoted multi-line identifiers there, so may be you can avoid it when generating such SQL files. Normally, one never uses multi-line quoted identifiers, so it looks like an anomaly in the output file, like a formatting issue.

Do you think the same problem is the origin of that error as well?

Quite possibly, but I can only speculate there, since I do not have the SQL file.

@ferdizz I understand your SQL file is huge, but I still recommend that you upload it somewhere and post the link here, or else I won't be able to help any further with this.

I can't provide the SQL-file because of the private data it contains. However I've got a workaround solution working by executing the psql-command psql database < data.sql in the Node-script, which will work for now (downside is it requires psql to be installed). Thank you anyways.

I get the same error with minify : false It's because I have a double quota in my data "
How can I solve this? don't tell me to modify data please.

It depends on the contents of the SQL file. If your files contains COPY statements, linebreaks instead of semicolons, tabs instead of commas and \N instead of null statements (typical copy dump format) then you won't be able to parse it with pg-promise nor any pg syntax interpreter.

In postgres tools such as pgAdmin (or phpPgAdmin) you don't load the dumps in the same panel you execute queries. You have to upload such scripts as attachments which will in turn execute through an asynchronous task.

This means you don't execute such a file. You copy it from STDIN.

Executing such scripts is completely out of the scope of syntactic sugar provided by pg-promise

What I am trying to achieve is to dump a .sql file to create the whole database (part of an automated process) I don't do it by hand (CLI or using pgadmin). Other counter part is that I run it in serverless I don't have a CLI where I can use pg_restore. Is there a way to generate the backup with this characters you mentioned scaped in order to pg-promise to parse it?

Other solution I tried was using pg-minify but it parses strings not files.

Well, I'm the current mantainer of https://github.com/HuasoFoundries/phpPgAdmin6 which is a fork of a long abandoned project.

The part to process such a file is a very, very convoluted method taken from the very source code of bin/psql/mainloop.c lexer.

Since the driver won't provide a lexer, and pg-promise is a superset of a driver, the only approach I see is to build a lexer on JS.

@amenadiel Thank you for the follow-ups!

pg-promise wasn't designed to handle large SQL dumps, and especially that use non-standard SQL syntax. The library was designed to handle SQL files that fit well in memory and can be processed as a single string. Anything else is out of the scope, and underlying driver node-postgres doesn't support it either.

The latest version of pg-minify is much more functional. It now supports nested multi-line comments, better special-case handling, and it can process a 5MB SQL in under 50ms.

But it still cannot do any magic with SQL that is simply invalid. The library is trying to be consistent with how PostgreSQL itself handles the SQL, and to fail on invalid SQL in the exact same way. And this is what it does. If you've got invalid SQL as the input, the library will tell you so, with details about the error and the position. But the onus is on you to further locate and fix the problem in the SQL.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

msjoshi picture msjoshi  路  4Comments

illarionvk picture illarionvk  路  3Comments

calibermind picture calibermind  路  3Comments

vitaly-t picture vitaly-t  路  3Comments

dzaman picture dzaman  路  3Comments