A lot of people are using vendor-specific JSON (or in PostgreSQL: JSONB) data types. We should support those types natively and bind them to a string representation in Java.
If users prefer to bind the types to something like Jackson or gson, they can do so easily by using a Converter. The same can be done for JSR 367 bindings.
Some useful information about JSON standards compliance:
https://obartunov.livejournal.com/200076.html
Support is being added for:
See also: #7235
Would it be possible to at least provide an example implementation of Binding that simply translates any Postgres jsonb field to String and vice versa?
I read the docs (https://www.jooq.org/doc/3.11/manual/code-generation/custom-data-type-bindings/) and while most of it looks relatively straight-forward, from the example the sql() method looks relatively advanced and (more importantly) prone to errors and bugs when the implementor does not know anything about Jooq internals (but maybe it's not - hard to judge from my perspective).
In general I am a little disappointed that basic "jsonb as String" support is still not currently supported in Jooq. In plain JDBC it's actually relatively simple to deal with jsonb by manually dealing with PGobject, which I also wouldn't mind Jooq providing. Jooq implicitly does this, by just returning the JDBC object as Object by default, which happens to be a PGobject in the case of jsonb. But unfortunately it's a little ugly to deal with (with @Deprecated annotations and manual type casting).
I understand you have a lot on your plate, so perhaps this reference implementation of Binding that people can copy-paste without much worry could be a good first step to at least help people on their way.
Thanks for the ping, @t9t
Indeed, it has already been decided that we should offer a Binding<"JSON", String> data type mapping out of the box, and let users override this with a more sophisticated binding if needed. I'm not sure anymore, why this got postponed again, but I think we should finally deliver on this, given the number of upvotes on the issue.
Exposing the PGobject is not an option, as this is vendor specific. A jOOQ provided solution must also work with MySQL, Oracle, SQL Server, etc.
I'll look into what this means in detail right away.
An additional option: We could roll our own JSON type that wraps a string and might offer additional functionality in the future, e.g. once the JDK offers a standard JSON binding API... jOOQ already uses https://github.com/fangyidong/json-simple internally, to import/export JSON data. Reusing that library is another option.
Having a custom JSON type seems imperative to type safety, otherwise, users would not be able to distinguish between ordinary string types and actual JSON types, or even compare them, or pass JSON expressions to VARCHAR expecting functions.
One PostgreSQL specific caveat will be the distinction between json and jsonb types. It is not unlikely for jOOQ to need to know the exact type, not just knowing that it is a "json-ish" type, as jOOQ often has to generate explicit type casts...
This will be a breaking change in PostgreSQL, as users will have implemented their own Binding<Object, ?> implementations. Whatever we do now, the database type will no longer be Object
A simple benchmark shows that the distinction between json and jsonb types in PostgreSQL is relevant:
create table if not exists t_json (j json);
create table if not exists t_jsonb (j jsonb);
DO $$
DECLARE
v_ts TIMESTAMP;
v_repeat CONSTANT INT := 200000;
v_json text := '{}';
rec RECORD;
BEGIN
-- Repeat the whole benchmark several times to avoid warmup penalty
FOR r IN 1..5 LOOP
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_json values (v_json::json);
END LOOP;
RAISE INFO 'Run %, Statement 1: %', r, (clock_timestamp() - v_ts);
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_json values (v_json::jsonb);
END LOOP;
RAISE INFO 'Run %, Statement 2: %', r, (clock_timestamp() - v_ts);
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_jsonb values (v_json::json);
END LOOP;
RAISE INFO 'Run %, Statement 3: %', r, (clock_timestamp() - v_ts);
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_jsonb values (v_json::jsonb);
END LOOP;
RAISE INFO 'Run %, Statement 4: %', r, (clock_timestamp() - v_ts);
RAISE INFO '';
END LOOP;
END$$;
drop table if exists t_json;
drop table if exists t_jsonb;
The output is consistently slower when a cast is needed. It doesn't matter in which direction we're casting, and it doesn't matter if we're using json or jsonb in this simple case:
00000: Run 1, Statement 1: 00:00:00.524751
00000: Run 1, Statement 2: 00:00:00.659905
00000: Run 1, Statement 3: 00:00:00.628354
00000: Run 1, Statement 4: 00:00:00.51718
00000:
00000: Run 2, Statement 1: 00:00:00.516217
00000: Run 2, Statement 2: 00:00:00.64406
00000: Run 2, Statement 3: 00:00:00.642708
00000: Run 2, Statement 4: 00:00:00.526859
00000:
00000: Run 3, Statement 1: 00:00:00.502883
00000: Run 3, Statement 2: 00:00:00.636907
00000: Run 3, Statement 3: 00:00:00.636343
00000: Run 3, Statement 4: 00:00:00.525677
00000:
00000: Run 4, Statement 1: 00:00:00.499228
00000: Run 4, Statement 2: 00:00:00.650017
00000: Run 4, Statement 3: 00:00:00.638345
00000: Run 4, Statement 4: 00:00:00.524202
00000:
00000: Run 5, Statement 1: 00:00:00.504289
00000: Run 5, Statement 2: 00:00:00.649801
00000: Run 5, Statement 3: 00:00:00.635239
00000: Run 5, Statement 4: 00:00:00.535
The results are "obvious" of course. We'd hope that using the correct data type should be faster than needing to convert it, but I was curious about how significant this difference is. So we need to get this right. I guess native support for jsonb for all dialects would be an option here. After all, varchar and varbinary are standard SQL types as well, even if on a very technical level, they mean the same thing.
The SQL standard doesn't distinguish between the two physical JSON representations, and I'm not aware of any other dialects than PostgreSQL doing this.
In the mean time I had created this little one-class library: https://github.com/t9t/jooq-postgresql-json
At least to have a reusable away to bind json and jsonb fields to String in my projects. My plan was to also add the various jsonb functions and operators (such as the ?, |?, ->>, etc) to the jooq-postgresql-json library in the coming weeks.
I naively and without any performance test just used ?::json there, but it looks like you have pretty conclusively proven that that is not a good idea. :)
In the mean time I had created this little one-class library: https://github.com/t9t/jooq-postgresql-json
Yes, I've seen it :)
My plan was to also add the various jsonb functions and operators (such as the ?, |?, ->>, etc) to the jooq-postgresql-json library in the coming weeks.
Sure, that's something that would be useful in PostgresDSL, for example.
I naively and without any performance test just used ?::json there, but it looks like you have pretty conclusively proven that that is not a good idea. :)
That was a very naive benchmark indeed, casting only empty objects {}. I'm assuming this gets worse for larger JSON data points. Easy to reproduce:
create table if not exists t_json (j json);
create table if not exists t_jsonb (j jsonb);
DO $$
DECLARE
v_ts TIMESTAMP;
v_repeat CONSTANT INT := 200000;
v_json text := '{"a":1,"b":2,"c":[3, 4, 5, 6]}';
rec RECORD;
BEGIN
-- Repeat the whole benchmark several times to avoid warmup penalty
FOR r IN 1..5 LOOP
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_json values (v_json::json);
END LOOP;
RAISE INFO 'Run %, Statement 1: %', r, (clock_timestamp() - v_ts);
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_json values (v_json::jsonb);
END LOOP;
RAISE INFO 'Run %, Statement 2: %', r, (clock_timestamp() - v_ts);
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_jsonb values (v_json::json);
END LOOP;
RAISE INFO 'Run %, Statement 3: %', r, (clock_timestamp() - v_ts);
v_ts := clock_timestamp();
FOR i IN 1..v_repeat LOOP
insert into t_jsonb values (v_json::jsonb);
END LOOP;
RAISE INFO 'Run %, Statement 4: %', r, (clock_timestamp() - v_ts);
RAISE INFO '';
END LOOP;
END$$;
drop table if exists t_json;
drop table if exists t_jsonb;
Now getting:
00000: Run 1, Statement 1: 00:00:00.569626
00000: Run 1, Statement 2: 00:00:01.169486
00000: Run 1, Statement 3: 00:00:01.034296
00000: Run 1, Statement 4: 00:00:00.855842
00000:
00000: Run 2, Statement 1: 00:00:00.567955
00000: Run 2, Statement 2: 00:00:01.154019
00000: Run 2, Statement 3: 00:00:01.033361
00000: Run 2, Statement 4: 00:00:00.897972
00000:
00000: Run 3, Statement 1: 00:00:00.57026
00000: Run 3, Statement 2: 00:00:01.159787
00000: Run 3, Statement 3: 00:00:01.019552
00000: Run 3, Statement 4: 00:00:00.853984
00000:
00000: Run 4, Statement 1: 00:00:00.576009
00000: Run 4, Statement 2: 00:00:01.193102
00000: Run 4, Statement 3: 00:00:01.027302
00000: Run 4, Statement 4: 00:00:00.916919
00000:
00000: Run 5, Statement 1: 00:00:00.570805
00000: Run 5, Statement 2: 00:00:01.154218
00000: Run 5, Statement 3: 00:00:01.02407
00000: Run 5, Statement 4: 00:00:00.861866
Again, the conversion between types is the worst case. json -> json performs best, because no data processing is being done, which is a well known property of the json type. However, when reading and querying json or jsonb, jsonb will heavily outperform json.
This is still for very small json documents.
An interesting article can be seen here:
https://www.2ndquadrant.com/en/blog/processing-json
Apart from performance, if you're always casting to ::json, you will not be able to correctly use a bunch of operators / functions, which may work with json and/or jsonb, so we really have to maintain this type information in jOOQ's runtime, in order to apply the correct cast.
This was surprisingly easy to do... I've added support for PostgreSQL's json and jsonb data types. Will look into supporting other dialects later this week
Once we have support for these types, we can also support some operators / functions out of the box. It is unlikely we'll support PostgreSQL's ASCII syntax directly, but other databases (like MySQL) have useful functions that should be "obvious" to emulate on PostgreSQL as well: #8944
I guess we'll have to do the same thing for XML, add an org.jooq.XML type, and support it in generated code... I've added another feature request for this: #8945
In Oracle, we have to check the all_json_columns dictionary view to see if a column is a JSON column, as Oracle doesn't have an actual JSON data type for DDL - it's a constraint:
CREATE TABLE t (j VARCHAR2 (4000)
CONSTRAINT ensure_json CHECK (po_document IS JSON)
);
We could think of Oracle's BLOB backed JSON columns as JSONB:
CREATE TABLE j (
s clob CHECK (s IS JSON),
b blob CHECK (b IS JSON)
);
select *
from user_json_columns;
Yielding
TABLE_NAME OBJECT_TYPE COLUMN_NAME FORMAT DATA_TYPE
----------------------------------------------------------
J TABLE B TEXT BLOB
J TABLE S TEXT CLOB
Oracle's "JSONB" type has to be bound / inlined like ordinary blobs. ~The same is true for SQL Server~
SQL Server doesn't seem to offer a way of recognising JSON columns like Oracle. We'll think about this in a future version of jOOQ (check constraints could be reverse engineered).
Implemented for the most popular dialects in jOOQ 3.12 (see list in issue description). More dialects will have JSON support in jOOQ 3.13: #8948
It was breaking my expectations that the only way to retrieve the value from JSON and JSONB instances is using toString() and not an accessor (eg. getData()). And to me it is a little bit awkward that in the case of an SQL NULL, toString() returns null (at least in the case of PostgreSQL).
I would definitely be nice if there was some documentation what happens in the cases of SQL NULL and the JSON null value (which translate to null and "null" respectively, for PostgreSQL).
Thanks for your comment, @t9t. You're right, we'll fix these two issues ASAP: https://github.com/jOOQ/jOOQ/issues/9265
Much appreciated! I'll track that one and related issues.
Hi Lukas,
I'm having some troubles with the new JSONB type. Maybe I misunderstood something:
Prior to the JSONB type in jooq, I had this (simplified) query:
jooq.insertInto(MY_TABLE)
.set(MY_TABLE.MY_COLUMN, DSL.val(aString))
.execute();
If aString was null, I would have a null postgres column ;
If aString was equal to "null", I would have a null jsonb object in my column ;
With the new version of jooq, MY_COLUMN is typed as a JSONB on the java side, the query becames:
jooq.insertInto(MY_TABLE)
.set(MY_TABLE.MY_COLUMN, DSL.val(JSONB.valueOf(aString)))
.execute();
If aString is null, I will have a null jsonb object in my column due to the wrapping of the JSONB class, instead of null value in my column.
Since the JSONB class seems to be a convenient wrapper around the jsonb data, I was expecting the same behaviour as before:
JSONB.valueOf(null) would represent a null columnJSONB.valueOf("null") would represent a null jsonb JSONB object in my columnCurrently, JSONB.valueOf(null) and JSONB.valueOf("null") are not equals, but will have both insert a null jsonb object in database, which I find a little confusing.
Of course, I could simply work around that by cheking the nullity of the String value at every insert.
I was just wondering if I misunderstood something, and if there is a way that would be easier to keep the JSONB type and the previous behaviour ?
@paulomatias: Thanks for your message.
If
aStringis null, I will have anulljsonb object in my column due to the wrapping of the JSONB class, instead of null value in my column.
To get a null value, please use DSL.val(null, SQLDataType.JSONB) instead. You don't even have to wrap things in DSL.val(), you can just write this:
.set(MY_TABLE.MY_COLUMN, aString == null ? null : JSONB.valueOf(aString))
Currently,
JSONB.valueOf(null) and JSONB.valueOf("null")are not equals, but will have both insert anulljsonb object in database, which I find a little confusing.
Yes, see: https://github.com/jOOQ/jOOQ/issues/10343
I hope this helps
Most helpful comment
Implemented for the most popular dialects in jOOQ 3.12 (see list in issue description). More dialects will have JSON support in jOOQ 3.13: #8948