Pg-promise: Auto camelCase

Created on 14 Jan 2016 · 25Comments · Source: vitaly-t/pg-promise

Hi Vitaly,

I noticed your issue in humps, are you considering the possibility of (optionally) auto camelCasing a returned object from a query?

That would be great!

question

Source

jcristovao

Most helpful comment

You are welcome!

And here's a seriously optimized version of it, way better then my previous implementation:

function camelizeColumnNames(data) {
    var template = data[0];
    for (var prop in template) {
        var camel = humps.camelize(prop);
        if (!(camel in template)) {
            for (var i = 0; i < data.length; i++) {
                var d = data[i];
                d[camel] = d[prop];
                delete d[prop];
            }
        }
    }
}

In fact, this one has such a good performance, one can even use it in performance-critical stuff ;)

vitaly-t on 15 Jan 2016

❤5 👍1

All 25 comments

I wasn't considering adding that feature, no, but the possibility of implementing it, if necessary.

This is why version 2.8.0 introduced support for event receive, during which you can do whatever you like on the original data set before it reaches the client.

And as an internal test, I did use that humps library to convert the column names. Read further.

vitaly-t on 14 Jan 2016

I used the following implementation as a test, also optimized for performance:

var options = {
    receive: function (data /* , result, e */) {
        camelizeColumnNames(data);
    }
};

var humps = require('humps');
var pgp = require("pg-promise")(options);

function camelizeColumnNames(data) {
    var names = Object.keys(data[0]);
    var camels = names.map(n=> {
        return humps.camelize(n);
    });
    data.forEach(d=> {
        names.forEach((n, i)=> {
            var c = camels[i];
            if (!(c in d)) {
                d[c] = d[n];
                delete d[n];
            }
        });
    });
}

One should always consider the performance implication such transformation may carry when pre-processing large data sets.

vitaly-t on 14 Jan 2016

Thanks! It works perfectly!

Yes, I am aware of it, I'm considering for special cases to have two separate pgp instances, one 'camelCasing' for non-performance-critical stuff, and another one for critical stuff.

jcristovao on 15 Jan 2016

You are welcome!

And here's a seriously optimized version of it, way better then my previous implementation:

function camelizeColumnNames(data) {
    var template = data[0];
    for (var prop in template) {
        var camel = humps.camelize(prop);
        if (!(camel in template)) {
            for (var i = 0; i < data.length; i++) {
                var d = data[i];
                d[camel] = d[prop];
                delete d[prop];
            }
        }
    }
}

In fact, this one has such a good performance, one can even use it in performance-critical stuff ;)

vitaly-t on 15 Jan 2016

❤5 👍1

@jcristovao Just made the example part of the official documentation, see receive

vitaly-t on 15 Jan 2016

@jcristovao Check out the new 2.9.0 feature - Query Files ;)

vitaly-t on 18 Jan 2016

@jcristovao

Check out - my latest meditation about the query performance: Performance Boost ;)

vitaly-t on 26 Jan 2016

Was just wondering if this is possible. You got me covered! Thanks.

ferdinandsalis on 25 Feb 2017

@ferdinandsalis there is even some user's re-publishing of it:

Pg-promise and case sensitivity in column names :wink:

vitaly-t on 25 Feb 2017

Thank you all. This is really helpful.

I believe the code below will be more optimised, especially when a lot of rows are returned. (in my test, 204,000 rows were camelised in 3019ms compare to 5600ms using the function above)

function cameliseColumnNames(data) {
    const firstRow = data[0];
    const camelObj = {};
    Object.keys(firstRow).forEach((prop) => {
        const camel = pgp.utils.camelize(prop);
        if (camel !== prop) {
            camelObj[prop] = camel;
        }
    });
    for (let i = 0; i < data.length; i++) {
        const row = data[i];
        Object.keys(camelObj).forEach((key) => {
            row[camelObj[key]] = row[key];
            delete row[key];
        });
    }
}

arm5472 on 5 Oct 2017

👍1

@arm5472 couple issues in your code...

Your optimization is made in such a way that it will only be of benefit when some of the columns do not need to be camel-cased. I think if all columns need to be camel-cased, it is likely to become slower.

Second problem - you do not check for camel-case overrides, like I do here: http://vitaly-t.github.io/pg-promise/global.html#event:receive

if (!(camel in tmp)) {

This check prevents us from overriding an existing column when after caramelizing it matches another existing column. And you do not do this check in your code ;)

Example: column _user-id can override column User ID, etc. Those are unique cases, but still ;)

Still, it is a good effort, although I wouldn't use forEach , rather for for optimum performance ;)

And I might even update my algorithm later on ;) I will post an update, if I do ;)

vitaly-t on 5 Oct 2017

@vitaly-t many thanks for your reply and pointing out the matching issue.

I added the if condition which you mentioned and also only selected columns which required camelisation (38 columns in my new test). The result still shows a big gap with the same number of records; 1933ms vs 3529ms.

Try it for yourself, if you have a moment, please.

arm5472 on 5 Oct 2017

@arm5472 which version of Node.js did you use?

I've run tests against the latest, Node 8.6.0, and results are the opposite, your version is slower than mine by about 10%. Maybe you didn't measure it right? Here's the complete test.

const data = [];

for (var i = 0; i < 1000000; i++) {
    data.push({
        'first-col': i + 1,
        'second-col': i * 2,
        'third-col': i * 3,
        'fourth-col': i * 4,
        'fifth-col': i * 5
    });
}

function cameliseColumnNames1(data) {
    const tmp = data[0];
    for (let prop in tmp) {
        const camel = pgp.utils.camelize(prop);
        if (!(camel in tmp)) {
            for (let i = 0; i < data.length; i++) {
                const d = data[i];
                d[camel] = d[prop];
                delete d[prop];
            }
        }
    }
}

function cameliseColumnNames2(data) {
    const firstRow = data[0];
    const camelObj = {};
    Object.keys(firstRow).forEach((prop) => {
        const camel = pgp.utils.camelize(prop);
        if (camel !== prop) {
            camelObj[prop] = camel;
        }
    });
    for (let i = 0; i < data.length; i++) {
        const row = data[i];
        Object.keys(camelObj).forEach((key) => {
            row[camelObj[key]] = row[key];
            delete row[key];
        });
    }
}

function test() {
    const start = Date.now();
    cameliseColumnNames2(data); // change it between 1 and 2
    const duration = Date.now() - start;
    console.log(duration);
}

test();

And that's your original version, without the extra check even.

vitaly-t on 5 Oct 2017

I have done a lot more tests, and can state conclusively, your tests are wrong. The implementation that I provided is as fast as can possibly be. You cannot do it faster. Something is wrong with your measurement.

vitaly-t on 5 Oct 2017

Thanks for your test.
It turns out that the more columns you have, the more beneficial my code would be compare to yours.
You have only 5 columns in your test. If you have 10 or 20 (or in my test 38), you see the gap in performance widens hugely. (make 30 columns and it runs 3 times faster)
I guess it's now down to users as to which one to pick, based on their situation. In my case, the loss with small number of columns is little, but the gain with more columns is substantial. But I understand this is not the case for everyone.
Thanks again @vitaly-t

arm5472 on 6 Oct 2017

I don't see how this is possible. Your algorithm is doing the same as mine, except that you are doing it in 2 steps what mine is doing in 1. I do not see how it can be faster.

Your condition if (camel !== prop) is covered by mine if (!(camel in tmp)), i.e. I do all in one step, so your algorithm adds nothing extra, except it makes it longer.

vitaly-t on 6 Oct 2017

I have increased the number of columns, and I still see your algorithm being slower than mine. I don't know what kind of tests you are running, but it doesn't seem right.

vitaly-t on 6 Oct 2017

The main difference is I don't loop through all rows for every camelised column. I only go through each row once, replacing all necessary columns. You can see the performance difference if you add more columns to your test.

As I mentioned before, it's a matter of choice. I'm writing an API which returns a lot of data (both columns and rows). But for the calls that return small data, the difference is negligible anyway.

arm5472 on 6 Oct 2017

Again, what is your version of Node.js?

vitaly-t on 6 Oct 2017

I'm using node 8.3.0.

arm5472 on 6 Oct 2017

Tried that, the results are the same.

Oh, and your algorithm also will show very poor performance when only a subset of column needs to be changed. You may not know it, but delete is the heaviest operation in JavaScript, and you are using it for all the columns, even when it is not needed at all.

vitaly-t on 6 Oct 2017

Anyway, I can see this is not going anywhere, so I want to park where it is ;)

vitaly-t on 6 Oct 2017

Hello. First of all, thank you @vitaly-t for replay! =)
I tried your solution and it works nice.
I have one more additional question to this case.
What if after retrieving an object from the database I want to modify it in code and update it in the database with 'this' syntax. Or for example, I have an object with typescript interface based on camel case and I want to insert it into the databesa.

'this' syntax.

const obj = {
    one: 1,
    two: 2
};
db.query('INSERT INTO table(${this:name}) VALUES(${this:csv})', obj);
//=> INSERT INTO table("one","two") VALUES(1, 2)

I will try to explain with code what I want to do

export class User {
    name?: string = null;
    password?: string = null;
    registrationDate?: Moment = moment.utc();
}

let user: User = new User();
user.name = "Taras"
user.password = 'pass'

db.none('INSERT INTO users (${this:name}) VALUES (${this:csv})', user)

So now I have backward problem. I have camel cased property names but in the database, they are underscore cased. Is there some similar solution for it?

Best regards =)

Halynsky on 12 Aug 2018

@Halynsky You cannot use this syntax in this case, as it is only suitable for simple cases, where conversion is not required. For more complex cases you would just need to manually specify the fields.

The comprehensive solution for inserts would be in using the [helpers] namespace. See this example: https://stackoverflow.com/questions/37300997/multi-row-insert-with-pg-promise

vitaly-t on 12 Aug 2018

I convert from camelCase to snake_case this way:

const pgPromise = require('pg-promise');
const camelcaseKeys = require('camelcase-keys');

const options = {
  receive(data, result) {
    result.rows = camelcaseKeys(data);
  },
};
const pgp = pgPromise(options);

and vice versa, from snake_case to camelCase

const decamelcaseKeys = require('decamelcase-keys');

const user = {
  name: 'Taras',
  password: 'pass',
  registrationDate: new Date(),
  userStatus: 'needEmail',
};

db.none('INSERT INTO users ($1:name) VALUES ($1:csv)', [decamelcaseKeys(user)]);
// INSERT INTO users ("name","password","registration_date","user_status")
//      VALUES ('Taras','pass','2020-02-04T19:59:25.466+00:00','needEmail')

The npm modules camelcase-keys anddecamelcase-keys use caching of converted keys (DB columns), so they are very fast - this is the fastest conversion solution I have found.