Pg-promise: Setting application_name to current application user on global db object

Created on 14 Oct 2017 · 24Comments · Source: vitaly-t/pg-promise

I'm trying to achieve something similar to #71, #100 — setting the application_name to include the current application user's id for use in audit tables (via trigger functions).

I haven't been able to set the application_name on a configured db object (despite this SO post) and the furthest I've got is this example from Vitaly of using event extend:

// with extra parameter:
var options = {
    extend: obj=> {
        obj.myMethodWithParam = (query, values, context)=> {
            // append/pre-pend query parameters, using the context;
            return obj.query(query, values);
        }
    }
};

How can I set application_name and would use of extend mean having to use db.myMethodWithParam vs. db.any, db.one etc. or does it 'extend' the base methods?

Thanks,
Lloyd

question

Source

lloydh

All 24 comments

Do I understand it right that you want to extend the protocol according to what database it is?

If so, then you should use the Database Context, see the [Database] constructor.

var options = {
    extend: (obj, dc) => {
        // check the `dc` and apply the extension according to the context
    }
};

The dc is available everywhere in the protocol ;)

does it 'extend' the base methods?

You add new methods that can call an existing method + do something else.

vitaly-t on 14 Oct 2017

Sorry if my question was misleading, or perhaps I've been looking at the wrong solutions.

In my case there's only 1 database and the goal is to be able to set the logged-in application user's id in some way so that this can be used for an audit trail.

lloydh on 14 Oct 2017

I just did a test against the latest pg-promise v7.0.1 + PostgreSQL 10, and it works as expected:

const pgp = require('pg-promise')();

const appName = 'my-app';

const db = pgp({
    database: 'pg_promise_test',
    port: 5432,
    user: 'postgres',
    application_name: appName
});

db.any('SELECT application_name FROM pg_stat_activity WHERE application_name = $1', appName)
    .then(data => {
        console.log(data);
        pgp.end();
    });

Outputs:

[ anonymous { application_name: 'my-app' } ]

vitaly-t on 14 Oct 2017

That would only work if I created a new db object for each request with the user id _from that request_ but my understanding is that's horrible for performance.

lloydh on 14 Oct 2017

You want the interactive user from the request? That's quite unrelated to all of this. If you have the user from the request, you log right there, so what's the problem?

And your question in this case is completely misleading. It got nothing to do with event extend or the application_name. It's just basic logging logic that you implement yourself.

vitaly-t on 14 Oct 2017

Also, what do you think your method myMethodWithParam should do? You cannot amend query like that.

You can either execute two queries, like log insert + the actual query, or you can use a template for the query, like from the external SQL file that would expect the extra parameters, so it can amend it.

vitaly-t on 14 Oct 2017

I hadn't figured out what the myMethodWithParam example was going to do but I believe the author of that post was trying to accomplish the same thing as me.

I have the user id at the application layer so in the business logic I could insert into the foo_audit table whenever changing foo. This works but will add noise to the business logic and is more prone to errors should another dev update foo without manually updating foo_audit correctly.

I think I follow your suggestion of using a template but the audit for inserts will need the id returned, so two statements (or a common table expression) are required.

After insert/update trigger functions seem to solve things nicely however the issue is having access to the user id from the application layer — and pg-promise is that bridge.

If there's not an elegant way to achieve this then I'll go with one of the other options.

lloydh on 14 Oct 2017

I cannot advise you on how to template it into a single query, as it depends on the database architecture, and what it is exactly you are expecting.

At best, I can show you how you can do it via 2 queries and event extend:

var options = {
    extend: obj => {
        obj.logContext = context => {
            return obj.none('INSERT INTO logs(context) VALUES($1)', context);
        };
        obj.queryWithContext = (query, values, context) => {
            return obj.logContext(context).then(() => {
                return obj.query(query, values);
            });
        }
    }
};

vitaly-t on 14 Oct 2017

Thanks for the help.
Lloyd

lloydh on 14 Oct 2017

🎉1

@vitaly-t: Hello again! Since this keeps coming up, for the record: what's the recommend way to prepend/append a single SET statement with a per-request unique identifier*?

*Allowing re-use of connections/pg-instances between users/requests and use of pg-monitor.

SET application_name = <per-request-unique-uidentifier>

I've been version locked with my nasty hack and am just now trying to fix this the correct way :-)

jmealo on 25 Oct 2017

@jmealo you want to execute SET application_name on every HTTP request? That doesn't sound right, because due to parallel HTTP on top of parallel query execution you are likely to keep overriding it, thus ending up with invalid current application.

The only way it won't override the setting is by using a transaction. But using a transaction for every HTTP request is a horrible idea, because transactions are blocking operations.

So I don't think you'd want to do it :wink:

vitaly-t on 25 Oct 2017

Anyhow, as a pure exercise, you can play with it. One approach is to extend the protocol with your custom task method that will execute SET application_name = ... in the beginning of every task:

const initOptions = {
    extend: obj => {
        obj.appTask = (appName, cb) => {
            return obj.task('app-task', t => {
                cb = cb.bind(t, t);
                return t.none('SET application_name = $1', appName).then(cb);
            });
        }
    }
};

Then you can call:

db.appTask('app-name', t => {
    // do the task....
}).then().catch();

vitaly-t on 25 Oct 2017

@vitaly-t Use case: if you set the application name GUC to a combination of a session id and correlation id you can match any PostgreSQL log output to a given correlation id or web application session. The application_name GUC is the only way to influence the CSV logging at run-time. Extending Postgres to do this would be an issue for those using hosted solutions.

So to be clear, I do not need a transaction per query, I need a SET statement prepended to allow queries issued by pg-promise so that the query is attributed to the correct correlation id. It's strictly for correlating logs and can be used in RLS rules to reference the web application user.

If there is something I misunderstand about the internals of pg-promise that would break the above use case please let me know, I've had a shim in place in production for 2 years but I'd really like to get rid of it and use the latest version of pg-promise. The shim wraps each method and munges the input strings. I only have very limited use of the library because of this.

jmealo on 28 Oct 2017

@jmealo The example I showed above is a task, not a transaction.

And of course you can alternatively prepend a set to each query either manually or by extending the protocol.

Here, for example, we introduce a new special appQuery method that extends the existing [query] method by prepending each query with SET application_name = :

const initOptions = {
    extend: obj => {
        obj.appQuery = (appName, query, values, qrm) => {
            const sql = pgp.helpers.concat([
                {query: 'SET application_name = $1', values: appName},
                {query, values}
            ]);
            return obj.query(sql, [], qrm);
        }
    }
};

It would be well-performing, as the query is concatenated and executed only once.

See also method helper.concat ;)

vitaly-t on 28 Oct 2017

You can also set the PGAPPNAME environment variable. If you are using dotenv in yout project, then it will be sent by default when opening the connection. This also works natively on Heroku.

ffflabs on 19 Apr 2018

@amenadiel @vitaly-t Thank you for your suggestions, however, they do not accomplish the stated goal, just to re-iterate:

The app name can be included in PostgreSQL logging.
App name is the only value that can be output via standard logging AND set per-query via GUC.
Setting the app name enables us to output an arbitrary value that will be associated with a given query in the logging facilities provided by PostgreSQL.
If we prepend a SET statement to each query that sets the application name to a correlation/request id/web application user name we're able to do application-level/user-level tracing without adding logging facilities to our web application.
To make absolutely sure that every query issued by pg-promise on behalf of a given request (or given user), we'd want to inject values into a SET statement along with each query.

Every suggestion that I've received does not accomplish the goals above. I had to wrap pg-promise and am locked into an older version which allowed such monkey patching.

When using something like Koa where we can have a middleware that presents pg-promise to the request context it would be hepful to set these state variables without entirely wrapping pg-promise.

jmealo on 19 Apr 2018

Another thought, if the number of apps is very limited, you can throw each app-specific stuff into its own schema, and then use separate database objects ;)

vitaly-t on 19 Apr 2018

@vitaly-t: This works with older versions of pg-promise but I cannot upgrade, hopefully this makes my user case clear: https://gist.github.com/jmealo/58c96de7add64ebbab86a637a46d3417

guc = {
    'spark.user_id': ctx.userId,
    'spark.role': ctx.role,
    'spark.request_id': requestId,
    application_name: `spark-api_${ctx.username}_${requestId}`
};

This populates a set of GUC variables that can be used both for logging and row-level security.

The goal here is to enforce that a given set of GUC variables be SET before issuing every query leaving pg-promise. The GUC variables depend on some state, so there would need to be a way to bind that state.

pg-promise has tons of facilities for modifying the query, however, I have been unable to find one that will work appropriately within a request/response workflow where we want to use state from the request in the GUC variables (see example above).

jmealo on 19 Apr 2018

@jmealo This is an old ticket now, but I want to do the same, did you ever land on a good solution? I'm using Koa2 and want to be able to set GUC variables within the request workflow. It seems a transaction per request is the way to go :?

abeluck on 24 Jan 2020

@abeluck: I just stayed locked at an older version. If a transaction per-request works for your application, that should be safe but may not meet your performance requirements. This gist may be a good starting point.

IIRC, @vitaly-t warned that this wrapper might break some guarantees with tasks. I encountered no issues by prepending the queries. It's worth noting that I limited most requests to a single round trip using a CTE where I would grab everything in one go.

If you design your app/schema/queries around the premise that you want end-to-end logging, multi-tenancy (using schemas) and RLS using application roles rather than PostgreSQL roles and are going to prepend each query with the correct application name you should be fine. You can enforce this pretty easily by requiring the GUC to be set via RLS and then any logic errors should be caught in development. YMMV. Good luck!

jmealo on 12 Feb 2020

👍1

I generally advise against wrapping base methods like that, because it won't work within tasks and transactions. If you want to extend the protocol, you should use event extend instead.

vitaly-t on 12 Feb 2020

👍1

@vitaly-t: I hope that you are healthy/well. Please consider the use case of row-level-security that uses web application roles instead of database roles:

https://www.graphile.org/postgraphile/security/#how-it-works
https://postgrest.org/zh/stable/auth.html

We need an API that allows passing in an application context (in our case, an http request) that will be used to populate settings (GUC) based upon that context by prepending SET commands to outgoing queries.

Here is an example of one such API. .

How can we implement this with pg-promise?

I still don't see a way to use extend for this use case (set guc values based on an incoming web request [[ or custom context]] prior to issuing each query)

I'm working on a new project and I'd really like to have this functionality as both postgraphile and postgrest support using GUC settings to enforce RLS rules based on application roles to great success.

jmealo on 13 Apr 2020

@jmealo Something like this might help?

prepending SET commands to outgoing queries.

Do you intend to pretend SET to every single query? That would be not a good idea. Especially, inside a task or transaction, where you would end up prepending and executing useless SET queries that just repeat each other.

It is better to do inside a task or transaction, because there you can check for a fresh connection, and execute SET only for fresh connections...

await db.task('my-task', async t => {
    const isFreshConnection = t.ctx.useCount === 0;

    if(isFreshConnection) {
        await t.none('SET ...');
        // etc.
     }

   // then your queries...

});

vitaly-t on 13 Apr 2020

@vitaly-t: Thanks for the quick reply. I looked over #710 and what you have here and I'm still not sure how to expose full-functionality of pg-promise while requiring every query to execute a "pre-query" first. For my use case it's critical that there's no way to issue a query without the search_path and set_config/current_setting values being set / bound to the user session making the request/query.

Here's what I get by prepending queries:

The settings are used in row-level security rules
The search_path is used to select the correct tenant
Setting the application_name to a correlation id for each web request allows tracing of any outgoing queries made while fulfilling that request (application_name is available in the postgresql logs)

My primary concern is that I want to make it impossible for an outgoing query to be made without setting the search_path and settings.

If at all possible, I don't want it to be on developers to only use a subset of pg-promise features, or through later advanced usage somehow make a query without the SET statement.

jmealo on 13 Apr 2020

Was this page helpful?

0 / 5 - 0 ratings