I'm currently trying to insert data into fts4 4table, created according to https://www.sqlite.org/fts3.html, but jOOQ can't detect data types in this table, whilst they are quite simple, for example here:
create virtual table pages using fts4(
content TEXT,
book_id INTEGER REFERENCES title(id)
);
I believe generator could detect data types in such kind of tables
Thanks for your suggestion. We can't easily detect the types with queries on SQLITE_MASTER and PRAGMA_TABLE_INFO, as you can verify here:
select *
from sqlite_master t, pragma_table_info(t.tbl_name) c
where t.tbl_name = 'pages';
Resulting in:
|type|name|tbl_name|rootpage|sql|cid|name|type|notnull|dflt_value|pk|
|----|----|--------|--------|---|---|----|----|-------|----------|--|
|table|pages|pages|0|CREATE VIRTUAL TABLE pages using fts4(content TEXT,book_id INTEGER REFERENCES title(id))|0|content||0||0|
|table|pages|pages|0|CREATE VIRTUAL TABLE pages using fts4(content TEXT,book_id INTEGER REFERENCES title(id))|1|book_id||0||0|
The type column being empty.
We could attempt to parse the SQL text and reverse engineer data types from there, though. That's what we're already doing to detect check constraints. I guess the parser would have to be able to parse and ignore the virtual and using clauses.
Do you have any other ideas on how to reverse engineer these columns, whose type information / affinity is missing?
Interesting, this syntax isn't documented in CREATE TABLE: https://sqlite.org/lang_createtable.html
It's described in separate place: https://sqlite.org/lang_createvtab.html
Not only for check constraints, but also for views, we've already parsed the SQL sources to find out missing data types: #8278
Regarding types, it actually looks like a bug in SQLite:
create virtual table pages using fts4(
content TEXT,
book_id INTEGER REFERENCES title(id)
);
insert into pages values (1, 'a');
-- Query OK, 1 row affected
select * from pages;
-- +---------+---------+
-- | content | book_id |
-- +---------+---------+
-- | 1 | a |
-- +---------+---------+
If it's meant that SQL should be typesafe — it isn't here.
Also according to docs if one creates a virtual table without arguments (using fts3();) resulting table will contain a single non-hidden content column.
Everything derived from such virtual tables won't have types also:
create temp table tmp2 as select * from pages limit 1;
PRAGMA table_xinfo(tmp2);
This PRAGMA statement won't output any types.
If it's meant that SQL should be typesafe — it isn't here.
"Feature", not a bug: https://www.sqlite.org/datatype3.html. It's called "type affinity".
Oh, shoot. But then there is a bug in the documentation:
Each column in an SQLite 3 database is assigned one of the following type affinities
And we've just found a way to create columns without affinities :)
Anyway. While we're at it, I wonder if we should implement some logic to recognise all the implementation tables and ignore them? Creating this table:
CREATE VIRTUAL TABLE t_11172 USING fts4 (
content text,
id integer not null primary key
)
I'm getting these implementation tables for the virtual table in sqlite_master:
|type|name|tbl_name|rootpage|sql|
|----|----|--------|--------|---|
|table|t_11172|t_11172|0|CREATE VIRTUAL TABLE t_11172 USING fts4 (content text, id integer not null primary key)|
|table|t_11172_content|t_11172_content|2|CREATE TABLE 't_11172_content'(docid INTEGER PRIMARY KEY, 'c0content', 'c1id')|
|table|t_11172_segments|t_11172_segments|4|CREATE TABLE 't_11172_segments'(blockid INTEGER PRIMARY KEY, block BLOB)|
|table|t_11172_segdir|t_11172_segdir|5|CREATE TABLE 't_11172_segdir'(level INTEGER,idx INTEGER,start_block INTEGER,leaves_end_block INTEGER,end_block INTEGER,root BLOB,PRIMARY KEY(level, idx))|
|table|t_11172_docsize|t_11172_docsize|9|CREATE TABLE 't_11172_docsize'(docid INTEGER PRIMARY KEY, size BLOB)|
|table|t_11172_stat|t_11172_stat|10|CREATE TABLE 't_11172_stat'(id INTEGER PRIMARY KEY, value BLOB)|
I don't think users would like to see any of them in generated code, apart from the one they created? But how to recognise them formally?
I think this is implementation-specific detail. Ideally, we should know that fts3(4) will create this set of implementation tables, as well as that fts3(4) without arguments will create a table with a single content column.
Thinking about this again...
I think this is implementation-specific detail. Ideally, we should know that fts3(4) will create this set of implementation tables
Users can just exclude all .*?_(content|segments|segdir|docsize|stat) tables from being generated in SQLite. Or they don't if they don't care about the extra classes.
I'll also add a flag <includeSystemTables/> (similar to other <includeSystemXYZ/> flags) that allows for doing this automatically, defaulting to true: https://github.com/jOOQ/jOOQ/issues/11176
as well as that fts3(4) without arguments will create a table with a single
contentcolumn.
Thanks for the reminder. This could be done entirely in the parser
Thank you very much, I believe this is perfect decision!
But the latest fix looks not completely correct, for example
CREATE VIRTUAL TABLE stat2 USING dbstat();
describe stat2;
outputs
| cid | name | type | notnull | dflt_value | pk |
+-----+------------+---------+---------+------------+----+
| 0 | name | TEXT | 0 | <null> | 0 |
| 1 | path | TEXT | 0 | <null> | 0 |
| 2 | pageno | INTEGER | 0 | <null> | 0 |
| 3 | pagetype | TEXT | 0 | <null> | 0 |
| 4 | ncell | INTEGER | 0 | <null> | 0 |
| 5 | payload | INTEGER | 0 | <null> | 0 |
| 6 | unused | INTEGER | 0 | <null> | 0 |
| 7 | mx_payload | INTEGER | 0 | <null> | 0 |
| 8 | pgoffset | INTEGER | 0 | <null> | 0 |
| 9 | pgsize | INTEGER | 0 | <null> | 0 |
+-----+------------+---------+---------+------------+----+
Well, how can we reliably handle this for all possible values of USING <xyz> without re-implementing 80% of SQLite in jOOQ? After all, this is just a somewhat wonky workaround for something that is really a flaw in SQLite...
... I mean, one solution is to just ignore these tables again, if they don't have an explicit column list.
Another solution would be to perform sequential checks: 1. Rad table descriptions. If types are there — we're fine and don't need to parse CREATE TABLE. 2. Parse DDL and try to find types there. 3. If nothing found — fallback to CLOB or Any. CLOB should work for anything because of type affinity. And a user will be able to define custom type if CLOB won't work for them.
If types are there — we're fine
That's already being done this way
Parse DDL and try to find types there
That too
If nothing found — fallback to CLOB or Any
I prefer to stick with OTHER (which also produces a @Deprecated annotation) than pretend we know what it is
Agreed, thank you for the clarification!
There, I reverted that part of the change.
Perhaps, a reasonable solution for implicit column definitions appears in the future. The current solution will definitely work for what you've presented in your original issue description, which I believe will be a more common case. After all, you probably want to have a foreign key in these tables, and possibly, some additional meta data.