The docker images for postgres and mysql support environment variables (POSTGRES_DB and MYSQL_DATABASE) that will cause a default database to be created (among other things; looks like a user can be created too). We may want to allow similar automatic configuration for use in testing.
As I see it, it is usually done through some bash scripts that get run as part of the ENTRYPOINT in docker.
Looks like /cockroachdb/cockroach.sh is a good place to set it up. Not sure how it works in clustered mode though.
Mysql does some sort of double boot of mysql, starts it in the background, does the setup magic against that instance, stops it, and starts the real thing.
As there is no clear workaround, I add mine. It allows me to define a COCKROACH_DB or COCKROACH_DATABASE at startup in a docker-compose. Pretty handy when you run your test of integration with docker-compose.
article_cockroachdb:
image: mycorp/cockroach
command: start --insecure
ports:
- "8080"
- "26257"
restart: unless-stopped
article-cockroachdb-init:
image: mycorp/cockroach
entrypoint: /cockroach/init.sh
command: article
depends_on:
- article_cockroachdb
restart: on-failure
#!/bin/sh
host=$1_cockroachdb
./cockroach sql --insecure --host="$host" --execute=" DROP USER IF EXISTS cockroach; \
DROP DATABASE IF EXISTS $1 CASCADE; \
CREATE DATABASE IF NOT EXISTS $1; \
CREATE USER cockroach WITH PASSWORD 'password'; \
GRANT ALL ON DATABASE $1 TO cockroach;"
FROM cockroachdb/cockroach:v1.1.4
COPY init.sh /cockroach
Either you create your own Docker image with this init.sh or you mount a volume in article-cockroachdb-init.
Hope it will help.
cc @dianasaur323 for prioritization
since this is docker, let's have @kannanlakshmi take a look
We could make this sort of thing work for the single-node case, but it can't really work for the multi-node case given the inability to create DB/users on an uninitialized cluster.
To the folks that are commenting/upvoting this:
cockroach init?Note that we're also shipping #24735 in 2.1 which will provide two predefined dbs in all cases. Maybe this will also help?
@knz Your pull request will definitely help! Thx for it!
Just one thing, IMHO you don't cover one important use case:
As a developer I want to run my test of integration in a specific cockroach DB with a docker_compose.
In this scenario, you need to define in your docker-compose a COCKROACH_DB (e.g article)
As mention earlier, this use case occurs when you run a single instance of crdb. When you run your test locally, this is the case :)
Not exactly this issue, but I couldn't find a way to initialise cockroach easily with arbitrary SQL commands for purposes of testing. I therefore created a simple script that will run the set of SQL files contained within a specified folder. You can use this to create databases as well as initialise that database with whatever structure and data you wish (source code on github as linked from repo):
https://hub.docker.com/r/episub/cockroach-init/
If there already exists a way to do this, please let me know.
Zendesk ticket #2296 has been linked to this issue.
The solutions I see here are correct, but they seem heavyweight to me. In general in other people's code I read I also see Docker being used in some heavyweight ways to work around challenges that are created by various images.
In the case of cockroach, I have the same issue... I want to run an init script immediately after db creation. Yes, I realize that in true cluster mode (>1 node) this does not make immediate sense... Because you would not want to run your init script on every node.
I would argue that the developer should be able to init a db or a cluster of db nodes, using the same script or function call. The implementation may change (depending on if single node or multi-node), but the intended outcome remains the same.. The developer needs the db/cluster to have a particular state immediately after creation. This is so clearly necessary, the cockroach start command should have a command line flag to specify a shell script to execute or perhaps an SQL command to execute.
Anyways... here is a somewhat simpler solution requiring only docker-compose:
db_cockroach:
......
command: shell -c '/etc/cockroach/conf/init-user-db.sh & /cockroach/cockroach start --certs-dir=/etc/cockroach/certs;'
You simply fork (using "&") your init script before you launch cockroach... BUT, you must make sure your init script is efficiently polling the port that cockroach will be listening on when it wakes up... so that when cockroach actually wakes up, your init script can run.
I made use of the "nc" program a.k.a. netcat to poll... and my solution (init-user-db.sh) looks like:
#!/bin/bash
set -e
while ! nc -w 1 -z localhost 26257; do sleep 0.1; done;
/cockroach/cockroach sql --certs-dir=/etc/cockroach/certs --user=root --execute="
CREATE USER IF NOT EXISTS $COCKROACH_USER;
CREATE DATABASE IF NOT EXISTS $COCKROACH_DATABASE;
GRANT CREATE, SELECT, DROP, INSERT, DELETE, UPDATE ON DATABASE $COCKROACH_DATABASE TO $COCKROACH_USER;
"
notice nc is used in a loop, whereby it checks to see if any process is listening on the cockroach port, then sleeps for 1/10 of a second.
This solution does not require any new Docker containers to be created... it basically adds one line to your init script, requires you to apt-get install netcat, and adds a little more muck to your docker-compose command.
In my final analysis, to bring cockroach up to speed with mysql/maria/postgres, it should support the ability to create an initial non-root user and an initial database (both with the name of the users choosing). It should also allow the user to supply an init script. There is a best practice for doing this in Docker... see what mysql/maria/postgres did. cockroach could also address the issue for non-Docker situations by allowing the user to supply cockroach start with a command line flag that specifies an init shell script or SQL script. All this can be done in a way that achieves the developers intended result regardless of single-node or multi-node.
It is worth mentioning that this solution is very similar to the classic Docker/nginx/database problem of: "How can I make my webserver (nginx) accept incoming requests only after the db is up and listening for new connections?"
The solution is equivalent (in both beauty and ugliness)... poll with netcat.
Also worth noting, I attempted various other "one line" solutions (one line in docker-compose command), and encountered various problems. I tried to start cockroach in the background (with --background), then run my init script, then kill cockroach, then start cockroach in the foreground... I think there was a timing conflict between the kill and start, or between the init script and the kill. It was not looking like a promising path. But that would have been a slightly more ideal one line solution (not requiring netcat) if only it worked ;)
Kubernetes (secure) solution here.
This job will initialize a cluster + run an init script against the cluster. Note that this assumes a secret cr-db-secret with a comma-separated list of DB's to create, in addition to CR_USER and CR_PASSWORD variables.
You will have to do the kubectl get csr and kubectl certificates approve xyz steps for this job.
apiVersion: batch/v1
kind: Job
metadata:
name: cluster-init-secure
labels:
app: cockroachdb
spec:
backoffLimit: 1
activeDeadlineSeconds: 3600
template:
spec:
restartPolicy: Never
serviceAccountName: cockroachdb
initContainers:
- name: init-certs
image: cockroachdb/cockroach-k8s-request-cert:0.4
imagePullPolicy: IfNotPresent
command:
- '/bin/ash'
- '-ecx'
- '/request-cert -namespace=${POD_NAMESPACE} -certs-dir=/cockroach-certs -type=client -user=root -symlink-ca-from=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt'
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: client-certs
mountPath: /cockroach-certs
restartPolicy: OnFailure
volumes:
- name: client-certs
emptyDir: {}
containers:
- name: cluster-init
image: cockroachdb/cockroach:v2.1.4
imagePullPolicy: Always
volumeMounts:
- name: client-certs
mountPath: /cockroach-certs
env:
- name: CR_DBS
value: accounts,classes,discussions,exercises,votes,workspaces
- name: CR_USER
valueFrom:
secretKeyRef:
key: CR_USER
name: cr-db-secret
- name: CR_PASSWORD
valueFrom:
secretKeyRef:
key: CR_PASSWORD
name: cr-db-secret
command:
- /bin/sh
- -c
- |
{
echo "Initializing cluster..."
while ! /cockroach/cockroach init --certs-dir=/cockroach-certs --host=cockroachdb-0.cockroachdb 2>&1 >/dev/null | grep AlreadyExists; do sleep 1; done;
} && {
echo "Creating user"
/cockroach/cockroach sql --user=root --certs-dir=/cockroach-certs --host=cockroachdb-0.cockroachdb --execute="
CREATE USER IF NOT EXISTS $CR_USER WITH PASSWORD $CR_PASSWORD;
"
echo "Initializing databases..."
for db in $(echo $CR_DBS | sed "s/,/ /g")
do
/cockroach/cockroach sql --user=root --certs-dir=/cockroach-certs --host=cockroachdb-0.cockroachdb --execute="
CREATE DATABASE IF NOT EXISTS $db;
GRANT ALL ON DATABASE $db TO $CR_USER;
"
echo Created db $db
done
}
Sample cr-db-secret secret:
kubectl create secret generic cr-db-secret \
--from-literal=CR_USER=devuser \
--from-literal=CR_PASSWORD=devpassword
Just to note, since 2.1 two default databases are created: defaultdb and postgres.
We currently create two default databases as Rio pointed out above. We don't create a default non-root user which is a separate more involved request. Closing this for now as the minimum requirement - default db is met.
Honestly I'm a little bummed this was furloughed, setting up psql, mysql, and mariadb using docker-compose on a local dev machine is super easy. It's significantly more difficult with cockroachDB without the default environment variables and configs.
Look at the wealth of options afforded on the MySQL image for example - https://hub.docker.com/_/mysql/
Maybe it'd be worth having a cockroachdb/cockroach-local image for folks setting up a local environment.
@rathboma I appreciate your concern. Note that you can use the defaultdb and postgres databases that are pre-created and empty, with no special meaning for cockroachdb (ie they are fully for you to use).
Additionally, we still plan to address issue #26722 which will give you the ability to add SQL initialization to the initial cluster setup.
Hey! Yes that would certainly help for sure.
I think Dev environments are a big use case for the docker image, so for sure the more documentation and similarity with other DB images aimed at that use case the better.
Thanks for the hard work on a good DB, keep it up!
@johnrk why do you re-open this issue as it is superseded #26722
I was just talking to John about this and I think ease-of-use for the docker image is a separate issue from #26722's changes to the init command. For one thing the use cases we're looking at here are more about single-node clusters (for integration testing?) that will be using start-single-node instead of start and init.
I'd like to see a single-node mode of the docker image (or maybe a separate derived docker image?) with env var customization to create a database and a user and apply settings, as well as some thought around how to do this in secure mode (how to pass certs in/out of the container, etc).
Awesome. If you need to see how I use it on my project, you can look at my Docker Compose, you can see we set up a variety of databases for testing, and seed them with credentials and data. Currently my CockroachDB version doesn't do anything beyond starting the app.
Just to add my 2c based upon a separate conversation with @vy-ton: for the Testcontainers module for CockroachDB we'd love it if the docker images supported an environment variable based approach for setting username, password and DB name.
For the other databases that Testcontainers works with, it's more or less a defacto standard that these are settable as environment variables. For our integration with CockroachDB, our implementation is currently rather sad 馃槃.
While having an init script capability would be a definite improvement too, I'd just like to add a 馃憤 for environment variable support as a simple and user friendly option.
Omg testcontainers looks amazing.
This seems to relate to this other issue I made: https://github.com/cockroachdb/cockroach/issues/48647
That one describes the convention used by Postgres (and I believe other DBs) to run init scripts that are in /docker-entrypoint-initdb.d.
Until this issue is resolved, check out this docker image written by our own @timveil that provides the functionality of creating a database on startup based on an environment variable. https://github.com/timveil-cockroach/cockroachdb-remote-client
Most helpful comment
As there is no clear workaround, I add mine. It allows me to define a COCKROACH_DB or COCKROACH_DATABASE at startup in a docker-compose. Pretty handy when you run your test of integration with docker-compose.
docker-compose
init.sh
Docker file
Either you create your own Docker image with this init.sh or you mount a volume in
article-cockroachdb-init.Hope it will help.