Right now cockroach supports the PostgreSQL wire protocol (which is pretty sweet). On a lark I decided to see what happened if I pointed the django test suite against it. Right now it fails pretty early with:
File "/Users/alex_gaynor/projects/django/django/db/migrations/recorder.py", line 52, in ensure_schema
if self.Migration._meta.db_table in self.connection.introspection.table_names(self.connection.cursor()):
File "/Users/alex_gaynor/projects/django/django/db/backends/base/introspection.py", line 58, in table_names
return get_names(cursor)
File "/Users/alex_gaynor/projects/django/django/db/backends/base/introspection.py", line 53, in get_names
return sorted(ti.name for ti in self.get_table_list(cursor)
File "/Users/alex_gaynor/projects/django/django/db/backends/postgresql/introspection.py", line 66, in get_table_list
AND pg_catalog.pg_table_is_visible(c.oid)""")
File "/Users/alex_gaynor/projects/django/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/Users/alex_gaynor/projects/django/django/db/utils.py", line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/Users/alex_gaynor/projects/django/django/db/backends/utils.py", line 62, in execute
return self.cursor.execute(sql)
django.db.utils.InternalError: sql/select.go:324: TODO(pmattis): unsupported FROM: FROM pg_catalog.pg_class AS c LEFT JOIN pg_catalog.pg_namespace AS n ON n.oid = c.relnamespace
I believe the problem is a lack of support for pg_catalog.pg_class. Is the intent to only support PostgreSQL at the wire protocol level, or also the full API surface (notably these tables).
The exact query being run is: https://github.com/django/django/blob/master/django/db/backends/postgresql/introspection.py#L56-L69
Actually, slightly closer inspection of sql/select.go indicates that maybe this is a far simpler parsing issue on the FROM statement.
The immediate issue in that stack trace is not that the pg_catalog.pg_class table is missing, but that we don't yet support JOIN.
We are not planning to implement the pg_catalog tables; frameworks that rely on introspection will need adapters to use our own introspection commands. I have some work towards doing this for django in cockroachdb/cockroach-python#14, although it's not working yet (blocked on our lack of support for changing the default value of a column with ALTER TABLE, which django's migration system does when setting up a new database).
:+1: I suppose this should be closed in favor of #2970 then.
Some sort of virtual table mechanism would be nice, so that all of these PG system tables could be implemented virtually via a plugin that in turn used native introspection.
That said, the real trick or fly in the ointment might be achieving near-100% compatibility with the actual values in the tables, like for entity types that simply don't exist in CockroachDB.
I think it would be difficult to get close enough to postgresql's system tables to be useful. In my work so far with django replacing the introspection code was only a small part of what was required.
Also ruins using rust-postgres, which queries a bunch of the meta tables https://github.com/sfackler/rust-postgres/blob/master/src/lib.rs#L466
at connect time:
https://sfackler.github.io/rust-postgres/doc/v0.11.4/src/postgres/src/lib.rs.html#459
Kind of a bummer to have to fork all of these.
The SQL standard defines an information_schema database which contains this kind of information in a vendor-neutral format. Implementing that would be more palatable than the postgres-specific tables. However, I'm not sure how beneficial it would be in practice - every time this has come up the library is using the pg-specific tables and not information_schema. I wonder if that's because they need some information that is not available in information_schema or just because they predated widespread support for information_schema (or the authors weren't aware of it).
Looks like django uses information_schema for the MySQL backend. Is the backend tied to the client protocol for these ORMs?
It varies. Sometimes you can swap out the network protocol independently of the rest, sometimes it's configured as one big unit (and even if the protocol were separate, using the mysql introspection code would typically pull in a lot of other mysqlisms). In the specific case of django, you can't make this change without a custom plugin (which I have in a branch in cockroach-python), and once we're going that far it's easier to just implement the necessary introspection commands with SHOW TABLES and the like instead of sharing information_schema with the mysql backend.
I strongly suspect the answer is that folks are unaware of
infromation_schema, I'm sure if you sent a PR to djang to use it that'd be
accepted.
On Fri, Apr 8, 2016 at 3:48 PM, Ben Darnell [email protected]
wrote:
It varies. Sometimes you can swap out the network protocol independently
of the rest, sometimes it's configured as one big unit (and even if the
protocol were separate, using the mysql introspection code would typically
pull in a lot of other mysqlisms). In the specific case of django, you
can't make this change without a custom plugin (which I have in a branch in
cockroach-python), and once we're going that far it's easier to just
implement the necessary introspection commands with SHOW TABLES and the
like instead of sharing information_schema with the mysql backend.—
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
https://github.com/cockroachdb/cockroach/issues/5194#issuecomment-207575365
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: D1B3 ADC0 E023 8CA6
It's not just a lack of awareness (although it's possible that that's the root cause and no one has had any reason to update the old code): django uses both information_schema and pg_catalog in adjacent methods. Anyway, we don't support information_schema yet either, so there's not much point in trying to promote its use at this point.
Outside the technical reasons for meta-fields (and the cited standards above), from an end-user point of view there are some valid points to have queryable metadata on table.
[RFC] meta-fields on tables for end-users
per-table meta-fields:
per-column meta-fields:
With this setting, visible rowname could link the meta_column_rowname value, be easy to change/modify. easy to move in the columns order.
per-row meta-fields:
Values should be native/internally triggered on insert/update, as option, not by default.
Every cited meta-field should be queryable (select).
Almost all of these meta-fields should be ready for insert/update.
Comments are welcome.
HTH
@dmedri Mind filing a separate issue for your feature request? It feels sufficiently different than the original issue which is about exposing existing table metadata in a way that is compatible with ORMs.
Tables and columns have associated IDs, but those IDs are not UUIDs. The table ID and column IDs are encoded in the keys used to store the rows for table data so space efficiency is a concern. Small integer IDs are much more space efficient than UUIDs.
It would be helpful if people could identify some specific ORM usage examples that attention could then focus on.
If I Google "ORM example", SQLAlchemy comes up on top:
http://docs.sqlalchemy.org/en/latest/orm/examples.html
One question about ORMs - is the interest primarily in simply having an object-oriented interface to traditional SQL data models, or... is the primary interest in how to map non-relational, object-oriented data structures to the relational data model. Looking at the SQLAlchemy example list, it seems like the latter. If that's the true goal of ORM support, fine, but the goal should be made explicit.
As a side comment, with CockroachDB and its flexible underlying KV store, focusing too heavily on ORM for supporting graphs and other object-oriented structures may be a distraction from eventually focusing on a native object-oriented keystore data model and direct, non-SQL data access. That said, ORM support seems like a relatively easy way to get a lot of traction in the near-term.
Looks like pg_catalog exists, and so do JOINs, so I suspect this can be closed.
We're still working on adding more tables to pg_catalog. Once we support all meta tables needed by major ORMs, then I think this can be safely closed.
@nvanbenschoten FWIW, the next place the django tests bail is the missing function pg_catalog.pg_table_is_visible
I'm going to close this issue in favor of more targeted issues for missing meta tables and functions, as I think we've gotten the bulk of them at this point. I've opened #12538 to track the missing pg_table_is_visible function.
Most helpful comment
We're still working on adding more tables to
pg_catalog. Once we support all meta tables needed by major ORMs, then I think this can be safely closed.