Cockroach: prod: need to document better how to set up cockroach within docker on CoreOS

Created on 23 May 2016 · 11 comments · Source: cockroachdb/cockroach

Requested by @mhausenblas.

  • What did you try?

Using the Dockerfile at https://hub.docker.com/r/mhausenblas/cockroachdb-minimal/, e.g.

FROM ubuntu:14.04
MAINTAINER Michael Hausenblas "[email protected]"
ENV REFRESHED_AT 2016-05-23T14:32Z
EXPOSE 8080
COPY cockroach-beta-20160519.linux-amd64/cockroach /
ENTRYPOINT ["/cockroach"]
CMD ["start", "--logtostderr"]

Then running with docker run -d -P ...

This is running using CoreOS on AWS as docker host.

  • Observed behavior:

The container starts and the log output reports that the cockroach server started properly; however, in this simple/default configuration the network ports for the HTTP UI and the DB are not reachable from outside the Docker container.

  • Expected behavior:

The DB server is reachable from outside the Docker container.

  • Desired resolution

CockroachDB should ship with an example Dockerfile and pre-built Docker images.

  • Alternate data point

Using Ubuntu 15.10 on GCE as the host, @knz was able to start the Docker container and access the DB server from the host.

C-enhancement docs-todo

All 11 comments

@mhausenblas I'm not sure precisely what you were seeing, but the following worked for me on Mac OS X:

~ docker run -d -P mhausenblas/cockroachdb-minimal:0.7 start --logtostderr --insecure
4c33d67fcbfcfc0d8aa8893b68cbfe7328306c0dbfebc87767da6adc2b04e353
~ docker ps -a
CONTAINER ID        IMAGE                                 COMMAND                  CREATED             STATUS              PORTS                     NAMES
4c33d67fcbfc        mhausenblas/cockroachdb-minimal:0.7   "/cockroach start --l"   1 seconds ago       Up 2 seconds        0.0.0.0:32773->8080/tcp   backstabbing_booth

Notice the --insecure flag. By default, cockroach start only listens on localhost, but when running inside a container we want to bind to the external IP address of the Docker container so that the server is accessible from outside the container.

Since I'm running on Mac OS X (using docker-machine) I need to get the IP address of the machine:

~ docker-machine env default | grep DOCKER_HOST
export DOCKER_HOST="tcp://192.168.99.100:2376"
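
Equivalently, docker-machine can print just the IP (assuming the machine is named default):

~ docker-machine ip default
192.168.99.100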

And then I can access the admin UI:

~ curl -o - http://192.168.99.100:32773 | head -20
...
<!doctype html>
<html>
  <head>

@mhausenblas Let me know if this helped. Sounds like you were trying to run using DC/OS and Marathon which I have no experience with. I'd be happy to use this as an excuse to learn more if you lead me through what commands you were using.

So the big issue here is that either --insecure or a certificate is necessary to make a cockroachdb server accessible from anything but localhost, but this is under-documented and fails in confusing ways (how many cockroachdb team members looked at this before anyone noticed it was missing?). With that solved, I've made some progress running cockroachdb on DC/OS on AWS. The UI isn't quite loading for me, but parts of it are coming through, so I've made it past the networking hurdle.

@mhausenblas, forgive me if this is retreading things you already know, but here's what I've been able to figure out. The DC/OS docs are amazingly unhelpful for this (the intro usage docs at https://dcos.io/docs/1.7/usage/ say "here's the command to run a web server! you can see it running on the DC/OS status page!", but then say nothing about how to actually get a request routed to that web server).

In general, the system is geared toward having marathon-lb (an instance of haproxy) as the only externally-visible service. The AWS template sets up an ELB that exposes ports 80 and 443 of the "public slave" node, with a health check that assumes that that node will be running haproxy. To expose anything else (if you're running on AWS), you need to set up more ELBs or modify this one to expose more ports.

But instead of modifying the ELB, it's probably better to configure marathon-lb. The magic incantation is to put this in your app specification:

 "ports": [
        80
    ],
    "labels":{
        "HAPROXY_GROUP":"external"
    }

Note that the 80 here doesn't mean that cockroach gets the flag --http-port=80; this is an instruction to marathon-lb to expose the $PORT0 port on port 80. I cheated a bit here because cockroachdb is the only thing I'm running; more realistically you'd want to use options like HAPROXY_0_VHOST to give cockroachdb its own hostname instead of taking over the entire cluster's public port.
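
For example, a sketch of the vhost variant (the hostname is purely illustrative; marathon-lb routes requests whose Host header matches it to the app):

    "labels": {
        "HAPROXY_GROUP": "external",
        "HAPROXY_0_VHOST": "cockroachdb.example.com"
    }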

Even once I've done this, I get a lot of 503 errors, usually enough of them while loading javascript that the UI doesn't load. These errors appear to be coming from the marathon-lb haproxy, but I haven't dug deeper to see what's going on there.

Here's my final configuration file for dcos marathon app add:

{
    "id": "cockroachdb",
    "instances": 1,
    "cpus": 1,
    "mem": 1024,
    "cmd": "curl -O https://binaries.cockroachdb.com/cockroach-beta-20160519.linux-amd64.tgz && tar xfz cockroach-beta-20160519.linux-amd64.tgz && cockroach-beta-20160519.linux-amd64/cockroach start --http-port=$PORT0 --logtostderr --insecure",
    "ports": [
        80
    ],
    "labels":{
        "HAPROXY_GROUP":"external"
    }
}
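
To launch it (assuming the spec above is saved as cockroachdb.json):

dcos marathon app add cockroachdb.json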

And here's a config using our published docker image:

{
    "id": "cockroachdb",
    "instances": 1,
    "cpus": 1,
    "mem": 1024,
    "container": {
        "docker": {
            "image": "cockroachdb/cockroach",
            "network": "BRIDGE",
            "portMappings": [
                { "containerPort": 8080, "hostPort": 0, "servicePort": 80, "protocol": "tcp" }
            ]
        }
    },
    "args": ["start", "--logtostderr", "--insecure"],
    "labels":{
        "HAPROXY_GROUP":"external"
    }
}

Thanks a bunch for looking into this @petermattis and @bdarnell! So we made it work now (big kudos to @jfrazelle, who spotted this and resolved it within minutes): it turned out that both --insecure and --host=0.0.0.0 were missing.
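
For anyone hitting the same wall outside DC/OS, a minimal standalone sketch with both flags (the image tag and port mapping are illustrative):

docker run -d -p 8080:8080 cockroachdb/cockroach:beta-20160512 start --insecure --host=0.0.0.0 --logtostderr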

Concerning the last Marathon app spec you posted, @bdarnell: my suggestion is to always specify the exact image, i.e. cockroachdb/cockroach:beta-20160512 rather than cockroachdb/cockroach, since otherwise you don't know what you get (in other words: latest is unreliable).

Now, concerning routing and service discovery: I always start with the simplest scenario, think of it as a smoke test: I just expose the app on the public node directly, using acceptedResourceRoles. Once that works, I move on to VIPs (Minuteman) for internal service discovery and marathon-lb as an edge router.

So, the following is the minimal (dev/test, non-prod) config with which I was able to launch CockroachDB and access the Admin WebUI:

{
    "id": "cockroachdb",
    "instances": 1,
    "cpus": 1,
    "mem": 500,
    "cmd": "./cockroach start --insecure --host=0.0.0.0 --logtostderr",
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "cockroachdb/cockroach:beta-20160512",
            "network": "BRIDGE",
            "portMappings": [
                {
                    "containerPort": 8080,
                    "hostPort": 0
                }
            ]
        }
    },
    "acceptedResourceRoles": [
        "slave_public"
    ]
}
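
Since hostPort is 0, Marathon assigns a random host port. A quick smoke test, assuming the dcos CLI is configured and you know the public agent's IP (the placeholders are illustrative):

# show the running task, including the assigned host port
dcos marathon task list cockroachdb
# then hit the Admin UI on the public agent
curl http://<public-agent-ip>:<assigned-host-port>/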

BTW, I'm happy for this issue to be closed now, but I'd suggest you mention it in the docs somewhere. Also, I plan to blog about it; maybe that can be used as the basis for a 'How to deploy/run CockroachDB on DC/OS?' guide.

@mhausenblas Glad we figured this out. As I mentioned earlier, I'm not at all familiar with DC/OS. How is your configuration above providing persistence for the data CockroachDB stores? That is, if the node restarts, how is it going to locate the on-disk state? With kubernetes I experimented with using persistent volumes to do this. Does DC/OS have something similar?

So yeah @petermattis, that's why I labelled it 'dev/test': this setup is certainly not prod-ready. The way I set it up means that when the container dies, the data is lost.

The first improvement now is to use persistent volumes in order to make sure that when the container restarts (on the same host) it gets its data back.
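
A sketch of what that could look like in the Marathon spec, using Marathon's local persistent volumes (the size and path are illustrative, and cockroach would also need --store pointed at the volume):

    "residency": { "taskLostBehavior": "WAIT_FOREVER" },
    "container": {
        "type": "DOCKER",
        "docker": { ... },
        "volumes": [
            { "containerPath": "cockroach-data", "mode": "RW", "persistent": { "size": 1024 } }
        ]
    }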

In order to make sure that the data stays around even when the node (host) goes down, one would use external volumes, which is comparable to what you know from K8S. I'll keep improving the setup ;)
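
Again just a sketch, assuming the REX-Ray volume driver that DC/OS documents for external volumes (the volume name and mount path are illustrative):

    "container": {
        "type": "DOCKER",
        "docker": { ... },
        "volumes": [
            {
                "containerPath": "/var/lib/cockroach",
                "mode": "RW",
                "external": {
                    "name": "cockroach-data",
                    "provider": "dvdi",
                    "options": { "dvdi/driver": "rexray" }
                }
            }
        ]
    }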

@mhausenblas Thanks for the documentation pointers. I'll go do some reading.

Keyword stuffing to make this issue easier to search for: mesos, mesosphere

@bdarnell still a 1.0? I didn't fully grok this long discussion, so could you comment on what remains to be done and assign this?

This is basically our umbrella issue for mesosphere support, but I think it's better to tackle this on the docs side (like we did with docker swarm, etc). Closing this in favor of https://github.com/cockroachdb/docs/issues/507

