Cockroach: prod: Is there an API to embed CockroachDB in a Go application?

Created on 9 Feb 2016 · 34 Comments · Source: cockroachdb/cockroach

I'm interested in using cockroachdb as a Go library, to avoid the cost of one more client-server network roundtrip. Is there an API for running embedded? For example, I'd like to pass SQL queries from my Go main() into cockroachdb directly. (This would also make writing tests for applications that use cockroachdb easy.)

C-question C-wishlist O-community X-wontfix docs-todo

All 34 comments

Jason, is running the binary on the same node as the client a solution for
you? It's best not to embed cockroachdb with your binary. Cockroach as such
contains the entire code for a database; adding that code to your client is
going to make it even more monolithic. Message passing between your client
and cockroachdb through a socket is the preferred solution.

Sadly, as you know, that is much slower.

CockroachDB is going to send RPCs to other nodes for replication anyway. Can you
explain why this is a big deal for you? Thanks!

For reads that can be serviced in memory and without network roundtrip, using an embedded db will be 1000-10,000x faster than client-server mode.

@glycerine I believe all the necessary mechanics are already exported, so you should be able to do this, but I wouldn't depend on the API remaining stable.

If you do work up a prototype, do let us know.

@tamird, thanks for the perspective. I'm still getting oriented to the code. Is github.com/cockroachdb/cockroach/sql the first layer in the server process after the wire?

Yep.

Got it. Thanks :-)

Yeah, sql/executor.go is the first layer in SQL processing after the wire,
where statements are executed one by one.

We would be grateful if you could benchmark this and share your findings, so
that the approach can be validated as valuable and we can consider developing
an API for it.
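
A rough way to start on such a benchmark, as a minimal sketch using only the Go standard library: it times the same trivial read once as a direct function call and once as a request/reply over a loopback TCP connection. The `lookup` function and the in-memory map are hypothetical stand-ins, not CockroachDB code. The sketch measures only transport overhead, so it says nothing about how large a share of a real SQL query that overhead would be.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"time"
)

// lookup stands in for a read served from memory. It is a hypothetical
// placeholder, not CockroachDB code.
func lookup(store map[string]string, key string) string {
	return store[key]
}

// serve accepts connections and answers each newline-terminated key
// with its value until the listener is closed.
func serve(ln net.Listener, store map[string]string) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			r := bufio.NewReader(c)
			for {
				key, err := r.ReadString('\n')
				if err != nil {
					return
				}
				fmt.Fprintf(c, "%s\n", lookup(store, key[:len(key)-1]))
			}
		}(conn)
	}
}

// measure returns the total time for n in-process reads and for n reads
// issued one at a time over a loopback TCP connection.
func measure(n int) (inProc, loopback time.Duration, err error) {
	store := map[string]string{"k": "v"}

	start := time.Now()
	for i := 0; i < n; i++ {
		_ = lookup(store, "k")
	}
	inProc = time.Since(start)

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer ln.Close()
	go serve(ln, store)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return 0, 0, err
	}
	defer conn.Close()
	r := bufio.NewReader(conn)

	start = time.Now()
	for i := 0; i < n; i++ {
		if _, err := fmt.Fprint(conn, "k\n"); err != nil {
			return 0, 0, err
		}
		if _, err := r.ReadString('\n'); err != nil {
			return 0, 0, err
		}
	}
	loopback = time.Since(start)
	return inProc, loopback, nil
}

func main() {
	const n = 10000
	inProc, loopback, err := measure(n)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d reads: in-process %v, loopback TCP %v\n", n, inProc, loopback)
}
```

A loop like this is only a first approximation: the compiler may optimize part of the in-process loop, and timings vary by machine. For numbers worth reporting, the same two code paths could be moved into `testing.B` benchmarks and run against real queries.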

@vivekmenezes, thanks for that. Perhaps I have misunderstood what is possible though -- Is the transaction model for reads in CockroachDB such that consistent reads demand network roundtrips?

No. You are right. If all the range leaders are on the local node, your
reads will be local. But things do change with rebalancing and so on, so you
can't depend on that (unless you also add a way to pin the leader).

Makes sense. Thanks for clarifying.

@glycerine for a simple example of an entry point, you can also reuse or study the code from the command-line interface in cli/sql.go.

@knz, thanks!

Are you talking about a single-node embedded DB, or do you intend that your
app itself is running on multiple nodes with each node replicating some or
all of the same data?

-- Jack Krupansky

@JackKrupansky, in my mind there's a natural evolution through each of those phases as an application grows. My immediate target is midway between the two you describe. Start with a single node embedded DB, then let the storage scale out horizontally as the application remains in one process for simplicity. Finally, when you need fault tolerance in your application, you bite the bullet and pay the complexity cost to replicate it as well. But by that time your design has stabilized, so the replication doesn't seriously slow down your ability to evolve the design to meet the discovered requirements.

@glycerine I wanted to reiterate that if you try to use CockroachDB as a library using the current packages and interfaces we will make changes that break you. Not intentionally, but as a side-effect of normal development. While we've talked about supporting such a use case, it isn't something we want to tackle right now. I would highly discourage trying to use CockroachDB in this manner at this time.

That said, it's fascinating that you've brought this up. While I agree that talking directly to an in-process database will be faster, I'm not sure the speedup will be significant. I would imagine the RPC overhead to be a fairly small part of anything but the most trivial database usage. When we've talked about embedding a CockroachDB node into an application it was to ease deployment and to scale the database at the same rate as the application.

Hi Peter, I appreciate you reiterating that point. I did understand as much from Tamir's comment. It makes perfect sense to me that you don't want to focus on making CockroachDB embeddable at present. I thought that might have been a use case, but I'm now well aware that it is not for you, and I'm on my own if I go that route. - Jason

Not to resurrect an old thread, but an additional compelling use case to make a simple API for embedding cockroachdb is to more easily support unit testing of consumers. If I'm writing a service that depends on cockroachdb, I'd like to be able to make it part of my unit testing framework so it doesn't have external dependencies/port allocations/etc, and lives and dies as the unit test does.

Take a look at server.TestServer and serverutils.StartServer([...]), though I'm sure we don't want to officially endorse these (at least not at this point). It doesn't look like anything would keep you from using those, though.

Further resurrecting this thread: I have a use case for embedding CockroachDB within my golang app, so that I can ship it as a single executable / single process.
To that effect I don't need to use an internal API, and am fine to use the PostgreSQL protocol locally. There is a small amount of C++ involved in this project, and so I'm wondering whether that part is at all avoidable: I would just want to have everything run as golang, and while I could make a binary executable, it wouldn't be a trivial feat for users of my app.

Question being: is there a way to embed cockroach within my golang app without having to compile C++ code?

In short: no, not if you intend your data to persist when your process isn't running.
CockroachDB is fully dependent on RocksDB for on-disk storage at this point, and RocksDB is written in C++.

@glycerine, any status update on this? I was thinking of attempting the same thing.

@gedw99, There's no change I'm aware of to the answers that you read above.

Sorry if this has already been answered, but does cockroach support embedding?
I don't want to use in-process communication (I'm happy with a local Unix PostgreSQL socket), but I also don't want to force users to install or run additional software.

@vtolstov it's not directly supported but possible, albeit a bit inconvenient. You'd need to prepare the cockroachdb build tree so that the only remaining step is linking (i.e. pre-build all the dependencies with make), then import the code for the cockroachdb start command in your app, run that in a separate goroutine, and have the rest of your process connect to it (via Unix socket or TCP/IP). You'll also need to tweak the link process for your app to pass all the build/link flags needed by the cockroachdb code.

So, long story short: "it's possible, but you're on your own".
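
The process layout described here (server code running in a goroutine, the rest of the process connecting to it over a local socket) can be sketched with the standard library alone. In this sketch `runServer` is a hypothetical stand-in that replies "ok" to every line; it is not the cockroachdb start code, and none of the build or link steps mentioned above are covered.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// runServer is a hypothetical stand-in for the database's start code.
// It serves on ln until ctx is cancelled, replying "ok" to every line.
func runServer(ctx context.Context, ln net.Listener) {
	go func() {
		<-ctx.Done()
		ln.Close() // unblocks Accept so the loop below exits
	}()
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		go func(c net.Conn) {
			defer c.Close()
			sc := bufio.NewScanner(c)
			for sc.Scan() {
				fmt.Fprintln(c, "ok")
			}
		}(conn)
	}
}

// startEmbedded listens on a Unix socket in a temporary directory, runs the
// server in a goroutine, and returns the socket path plus a stop function.
func startEmbedded() (sock string, stop func(), err error) {
	dir, err := os.MkdirTemp("", "embedded")
	if err != nil {
		return "", nil, err
	}
	sock = filepath.Join(dir, "db.sock")
	ln, err := net.Listen("unix", sock)
	if err != nil {
		os.RemoveAll(dir)
		return "", nil, err
	}
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		runServer(ctx, ln)
	}()
	stop = func() {
		cancel()
		<-done // wait for the server goroutine to finish
		os.RemoveAll(dir)
	}
	return sock, stop, nil
}

// ping connects to the socket the same way an external client would.
func ping(sock string) (string, error) {
	conn, err := net.Dial("unix", sock)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintln(conn, "SELECT 1"); err != nil {
		return "", err
	}
	sc := bufio.NewScanner(conn)
	if !sc.Scan() {
		return "", fmt.Errorf("no reply: %v", sc.Err())
	}
	return sc.Text(), nil
}

func main() {
	sock, stop, err := startEmbedded()
	if err != nil {
		panic(err)
	}
	defer stop()
	reply, err := ping(sock)
	fmt.Println(reply, err)
}
```

The listener is bound before the server goroutine starts, so a client that dials immediately is queued rather than refused. With a real database in place of the stand-in, the client side would be an ordinary PostgreSQL driver pointed at the same socket or TCP address.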

Can you point me to where in the source code I can find the code for the cockroachdb start command that I would need to import?

pkg/cli/start.go

@vtolstov Note that this is not something we support, nor recommend.

Is it possible to disable ncurses when building from source?

Yes, you can manually remove pkg/cli/sql.go from your tree and remove the references to it from other files. (This is the interactive command-line client, which will not be used in an embedded build.)

@petermattis I think @vtolstov is running a project/business to provide successful open source components in an "operating system as library" to run on lightweight VMs. I expect this will be hard to achieve with CockroachDB due to the sheer number of external requirements, but I find it personally fun to see how far the attempt goes.

@knz, since you bundle RocksDB inside cockroachdb, I want to avoid additional software. Also, without bundling I need to pack a 57 MB binary into my assets, and if I want to provide versions for Mac or ARM I need to hack more things.

Yeah, you will need to compile RocksDB yourself for Mac and ARM, then compile cockroachdb against it.

If you put up the code on GitHub I think I and others can help with it.
