Origin: oc cluster join -- how to use?

Created on 10 Mar 2017 · 27 Comments · Source: openshift/origin

It would be great if a guide / README were provided on how to use the oc cluster join command.

Version

oc v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Server https://127.0.0.1:8443
openshift v1.5.0-alpha.3+cf7e336
kubernetes v1.5.2+43a9be4

Steps To Reproduce

oc cluster up
oc cluster join

Current Result

Prompts for
Please paste the contents of your secret here and hit ENTER:
and does not accept any input; I have to Ctrl+C to exit.

Expected Result

join another cluster

Labels: area/documentation, component/cli, priority/P2


All 27 comments

Let me know if you need help with this task, I will try my best.

@debianmaster sorry I haven't had time to look into this. But, if you want to try it out, the secret that it's expecting is the contents of admin.kubeconfig from the master's config. You need to make sure that the address specified inside that config is accessible to the node you're trying to add.
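A minimal sketch of preparing that secret, assuming the default oc cluster up config path (the same path appears later in this thread) and a placeholder master address:

# MASTER_IP is a placeholder; use an address of the master that the new node can reach
MASTER_IP=192.0.2.10
sed "s#https://127.0.0.1:8443#https://${MASTER_IP}:8443#" \
    /var/lib/origin/openshift.local.config/master/admin.kubeconfig > join-secret.kubeconfig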

I tried pasting the contents of admin.kubeconfig and hit ENTER.
The prompt never moves on to the next step; it's stuck asking for the secret contents.

@csrwng can you help me with this? I'm trying to automate platform scaling and I do not want to do it in a non-standard way.

@debianmaster cluster join is at a very early, experimental stage. Using it to scale the platform is definitely not standard.

I am interested in this as well, although I haven't had time to test it, yet.

@debianmaster I think the intended use case is a bit different from yours, as stated in https://github.com/openshift/origin/pull/9547:

Add a new command oc cluster join which launches a container that acts as a node.

As I understand it, you'd be bootstrapping that container on your machine as a node host for a cluster that exists somewhere else. The question is: are you supposed to run cluster up on that machine prior to cluster join? Have you tried it without?

I am imagining a use case where, on one reasonably available bare-metal host, I want to run an all-in-one setup (like cluster up) that provides the ability to scale out into a federated HA setup. Hence, my next question:

Can anybody give an indication whether joining clusters to federations is something that will likely be supported within oc cluster in the future? (>v1.6)

😢 waiting.....

Just a heads up to those following this thread: the paste is waiting for an EOF on the file, so paste it in and then hit CTRL-D.
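If the prompt really reads stdin until EOF, the interactive paste can be skipped entirely; a sketch (the --secret flag is used successfully further down this thread):

oc cluster join < admin.kubeconfig
# or pass the file contents directly:
oc cluster join --secret="$(cat admin.kubeconfig)"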

Did that work for you? What config file did you use? kubeconfig? @joshprismon

@smarterclayton Do you have a two-liner on how to use this?

oc cluster join doesn't work.

[root@c7 ~]# oc cluster join
Please paste the contents of your secret here and hit ENTER:
123456-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... 
   Deleted existing OpenShift container
-- Checking for openshift/origin:v3.6.0-alpha.2 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ... 
   Using nsenter mounter for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ... 
   Using 127.0.0.1 as the server IP
-- Joining OpenShift cluster ... 
   Starting OpenShift Node using container 'origin'
FAIL
   Error: could not start OpenShift container "origin"
   Details:
     Last 10 lines of "origin" container log:
     error: --kubeconfig must be set to provide API server connection information

:) waiting......

@debianmaster Hi, can you please tell me what the "secret" is? I checked admin.kubeconfig but there was no "secret" attribute.
Thanks!

@wydwww Still no luck; there is no good doc.
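Judging from the --secret usage further down in this thread, the "secret" is the entire kubeconfig file rather than an attribute inside it; a sketch, assuming the default config path:

oc cluster join --secret="$(cat /var/lib/origin/openshift.local.config/master/admin.kubeconfig)"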

@xiaoping378: after it fails, try to run docker logs origin to get the full logs of the failed origin container. For me it says

# docker logs origin
Error: unknown flag: --bootstrap
Usage:
  openshift start node [options]
Options:
      --bootstrap-config-name='': On startup, the node will request a client cert from the master and get its config from this config map in the openshift-node namespace (experimental).
      --config='': Location of the node configuration file to run from. When running from a configuration file, all other command-line arguments are ignored.
      --disable='': The set of node components to disable
      --enable='dns,kubelet,plugins,proxy': The set of node components to enable
      --expire-days=730: Validity of the certificates in days (defaults to 2 years). WARNING: extending this above default value is highly discouraged.
      --hostname='node.example.com': The hostname to identify this node with the master.
      --images='registry.access.redhat.com/openshift3/ose-${component}:${version}': When fetching images used by the cluster for important components, use this format on both master and nodes. The latest release will be used by default.
      --kubeconfig='': Path to the kubeconfig file to use for requests to the Kubernetes API.
      --kubernetes='https://localhost:8443': removed in favor of --kubeconfig
      --latest-images=false: If true, attempt to use the latest images for the cluster instead of the latest release.
      --listen='https://0.0.0.0:8443': The address to listen for connections on (scheme://host:port).
      --network-plugin='': The network plugin to be called for configuring networking for pods.
      --recursive-resolv-conf='': An optional upstream resolv.conf that will override the DNS config.
      --volume-dir='openshift.local.volumes': The volume storage directory.
Use "openshift options" for a list of global command-line options (applies to all commands).

so it looks like there is some issue with passing the correct arguments around. But I'm trying it with OSE 3.7-to-be.

Of course, that Using 127.0.0.1 as the server IP also looks suspicious -- I would hope to see the IP address of the master there.
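One way to check which API server address a kubeconfig secret actually points at (a sketch; secret.kubeconfig is a placeholder file name):

grep 'server:' secret.kubeconfig
# or, using the standard kubeconfig tooling:
oc config view --kubeconfig=secret.kubeconfig -o jsonpath='{.clusters[0].cluster.server}'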

I've now retried with OpenShift Origin v3.6.1, on RHEL.

On the master:

# curl -LO https://github.com/openshift/origin/releases/download/v3.6.1/openshift-origin-client-tools-v3.6.1-008f2d5-linux-64bit.tar.gz
# tar xvzf openshift-origin-client-tools-v3.6.1-008f2d5-linux-64bit.tar.gz
# cp openshift-origin-client-tools-v3.6.1-008f2d5-linux-64bit/oc /usr/bin/oc
# yum install -y docker
# cat <<EOF >> /etc/containers/registries.conf

insecure_registries:
 - 172.30.0.0/16

EOF
# systemctl restart docker
# oc cluster up --public-hostname=$(hostname) --use-existing-config=true --host-data-dir=/var/lib/origin/openshift.local.etcd
# oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper > node-bootstrapper

The node-bootstrapper file contains

apiVersion: v1
clusters:
- cluster:
    api-version: v1
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM2akNDQWRLZ0F3SUJBZ0lCQVRBTkJna3Foa2lHOXcwQkFRc0ZBREFtTVNRd0lnWURWUVFEREJ0dmNHVnUKYzJocFpuUXRjMmxuYm1WeVFERTFNRGswTmpBek9EY3dIaGNOTVRjeE1ETXhNVFF6TXpBM1doY05Nakl4TURNdwpNVFF6TXpBNFdqQW1NU1F3SWdZRFZRUUREQnR2Y0dWdWMyaHBablF0YzJsbmJtVnlRREUxTURrME5qQXpPRGN3CmdnRWlNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUROY29LYU1yN1owbTB4OThTSGxuTHIKdUtHWXExMWhtTEN6NVBHR0JCcXFwTGxmQzk3T1ZyVnJRMElDTXl2NTFmRTZRd01tdko1b3FMMW5FT3YzdFEyYgpCM0xVQUl3ZFBwL1ZTSy93QVhlUUxtblVLVFJOQU9DZTArcmhiSGpsTXoxKzdtbGkrZnlHVXpUQk5mdlNhMmhDCnhlcEt4b3RvcDUrbVIvVSt4N3JUTFV5eVZFRkVQckJqd1VjR2dtNmIrQmdJaXRGd3A5cU1Xb1JkTXkvZEZIc0wKai91TVVHVGtEOFJDeTZIMzhFQWFrblF4bG9BWEljOEFUR0N2bXo3U1lzSnVqazFQcmhnU3lnQnc3Uk05cE1vNQozeFdWSytXeTFQc1pmYXBQZVRad2RPdTZDTGlWZUp0Ym5lY1J4TWhibmdBZUwwYk5uU2lxbzJrVTV1NE9wV0loCkFnTUJBQUdqSXpBaE1BNEdBMVVkRHdFQi93UUVBd0lDcERBUEJnTlZIUk1CQWY4RUJUQURBUUgvTUEwR0NTcUcKU0liM0RRRUJDd1VBQTRJQkFRRERPQnBoR2R4ODlnYnA3K1pGUUpvbmhuM1gzUmhYMDN3UG8ySGlRL01iVTlQWgpWMFpCVE0xeDJoU3AvTlpRRDBQWm1Bdk94ZVZlTEdSN3FHWjk0Unp3Z093ekhpS3VGdG1DSVRCOW8wL2sxbHA1CnpScjFLbFZ1elhxSzRCUGJRa1grSEozSzRiVnN5SExrS2lTZHFjTEYwbk5iU0ptTlM1NW1mMStqVFpEanR2NUQKQWU1ekk5OTZwck9XRjBTSFRGMEJBQ0dnVU1KdHJTR2ZxNklEQU84L2lSdkthRGhQeUZzUnpTbVpwMjRCYUo3bQpYaWdzNG5paVJLNTNVZzJFYzV2U2JScldzWkZoT2VSTUNvVCtBVGJIWW5kSndSWHdBQ0NlNVNiY1Q5cVorOEo4CkZCdlRCWDFYMHBjVDBrd0ZNWGk5bjFGNy9lUTVrREZBU2pOdWQvanUKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    server: https://127.0.0.1:8443
  name: 127-0-0-1:8443
contexts:
- context:
    cluster: 127-0-0-1:8443
    namespace: openshift-infra
    user: node-bootstrapper
  name: node-bootstrapper
current-context: node-bootstrapper
kind: Config
preferences: {}
users:
- name: node-bootstrapper
  user:
    token: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJvcGVuc2hpZnQtaW5mcmEiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlY3JldC5uYW1lIjoibm9kZS1ib290c3RyYXBwZXItdG9rZW4tMnd4NDYiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibm9kZS1ib290c3RyYXBwZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiI3ZmEwZTRlZC1iZTQ4LTExZTctOWJhNC0wMDEzMjBmOWJjMzkiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6b3BlbnNoaWZ0LWluZnJhOm5vZGUtYm9vdHN0cmFwcGVyIn0.NAhoU4Sk7psrwWtjM5-wPNkN6CX2iKbsdPMMotE2UJoJn8xhVG3PdJi1OoJzAcEl7mT5OrUzNJ_NviTjkeHPC8MzjMjdcSBEdxav9AVdbnnrFWsxyTyM6TnEMvehmOVbwSN9RgWMT2QeB_gbTuetsz3G14tgEsDtJ8QiNRT-toLtLtDwiqnRiRMoy1o9PC6mkM6NGZRkH7tRSrrbfZwL5KHRfZNH4Icuy2yATcOyxdHl_kYdP8nBNjZkgWB1-m9P2SX0Fmy2WS4S9WP2Ljehld6ROe1rtibnkqt4jeADC9eFwZFMi1GRj1h2LrW74rd7n2tKAS9hjClE1S5hR2kU4g

I've copied that file to a second RHEL machine on which I plan to run oc cluster join.

On the second machine:

# curl -LO https://github.com/openshift/origin/releases/download/v3.6.1/openshift-origin-client-tools-v3.6.1-008f2d5-linux-64bit.tar.gz
# tar xvzf openshift-origin-client-tools-v3.6.1-008f2d5-linux-64bit.tar.gz
# cp openshift-origin-client-tools-v3.6.1-008f2d5-linux-64bit/oc /usr/bin/oc
# yum install -y docker
# cat <<EOF >> /etc/containers/registries.conf

insecure_registries:
 - 172.30.0.0/16

EOF
# systemctl restart docker
# oc cluster join --secret="$(cat node-bootstrapper)"
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for openshift/origin:v3.6.1 image ... 
   Pulling image openshift/origin:v3.6.1
   Pulled 1/4 layers, 26% complete
   Pulled 2/4 layers, 78% complete
   Pulled 3/4 layers, 97% complete
   Pulled 4/4 layers, 100% complete
   Extracting
   Image pull complete
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ... 
   Using nsenter mounter for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ... 
   Using 127.0.0.1 as the server IP
-- Joining OpenShift cluster ... 
   Starting OpenShift Node using container 'origin'
FAIL
   Error: could not start OpenShift container "origin"
   Details:
     No log available from "origin" container

# docker logs origin
I1031 14:45:44.455576   30088 bootstrap_node.go:266] Bootstrapping from API server https://127.0.0.1:8443 (experimental)
F1031 14:45:44.846586   30088 start_node.go:140] Post https://127.0.0.1:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: dial tcp 127.0.0.1:8443: getsockopt: connection refused

So I believe the problem is that oc cluster join is using the 127.0.0.1 IP address, instead of connecting to the master.

I have even tried to use

oc cluster join --secret="$(cat node-bootstrapper)" --public-hostname=$MASTER

or replace the IP address in node-bootstrapper's line

    server: https://127.0.0.1:8443

with the master, to no avail.
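For reference, the rewrite described above amounts to something like this sketch ($MASTER being the master's reachable hostname), which per the report still did not help:

sed -i "s#server: https://127.0.0.1:8443#server: https://${MASTER}:8443#" node-bootstrapper
oc cluster join --secret="$(cat node-bootstrapper)"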

@smarterclayton, how is oc cluster join supposed to figure out the hostname / IP address of the master to which it should be joining?

I don't want to create a new ticket for this issue, so I will comment on this one.
The oc cluster join command is broken in v3.6.1; here's how:

--host-data-dir doesn't work
When executing oc cluster join --use-existing-config --host-data-dir=/var/lib/origin/openshift.local.volumes --secret="$(cat kubeconfig)", the generated docker container doesn't have the proper volume created.
--volume="/var/lib/origin/openshift.local.config:/var/lib/origin/openshift.local.config:z" is missing.

Volume /var/lib/origin/openshift.local.config not defined
Even without --host-data-dir, the docker container doesn't have any volume for /var/lib/origin/openshift.local.config. Basically, the certs will be lost the next time we run oc cluster join. You will get something like this:

bootstrap client certificate does not match private key, you may need to delete the client CSR: tls: private key does not match public key
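As the error message itself suggests, the stale CSR can be deleted on the master before retrying; a sketch (the CSR name is illustrative):

oc get csr
oc delete csr node-csr-example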

wrong cgroup driver used
See my own ticket #17190.
The node configuration file /var/lib/origin/openshift.local.config/node/node-config.yaml is missing the following lines:

kubeletArguments:
  cgroup-driver:
  - cgroupfs

Otherwise the kubelet refuses to start with the following error message:

failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
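A quick way to confirm which cgroup driver the local Docker daemon actually uses, so that node-config.yaml can be made to match (sketch):

docker info --format '{{.CgroupDriver}}'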

For adventurous people: I managed to get the node started by running something similar to this:

docker run --name=origin --hostname=sylve \
  --env="HOME=/root" --env="OPENSHIFT_CONTAINERIZED=true" \
  --env="KUBECONFIG=/var/lib/origin/openshift.local.config/master/admin.kubeconfig" \
  --env="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" \
  --volume="/var/log:/var/log:rw" --volume="/var/run:/var/run:rw" \
  --volume="/sys:/sys:rw" --volume="/sys/fs/cgroup:/sys/fs/cgroup:rw" \
  --volume="/dev:/dev" --volume="/:/rootfs:ro" \
  --volume="/var/lib/origin/openshift.local.volumes:/var/lib/origin/openshift.local.volumes:rslave" \
  --volume="/var/lib/origin/openshift.local.config:/var/lib/origin/openshift.local.config:z" \
  --network=host --privileged --restart= --detach=true \
  --label io.openshift.tags="openshift,core" --label license="GPLv2" \
  --label io.k8s.description="OpenShift Origin is a platform for developing, building, and deploying containerized applications." \
  --label build-date="20170911" --label name="CentOS Base Image" \
  --label io.k8s.display-name="OpenShift Origin Application Platform" --label vendor="CentOS" \
  openshift/origin:v3.6.1 start node --bootstrap \
  --kubeconfig=/var/lib/origin/openshift.local.config/node/node-bootstrap.kubeconfig

Re: join - starting with 3.7 we are officially supporting the bootstrap path for nodes. join needs to be updated to work with it (and has not done so), so please stay tuned. To see how bootstrapping is used, you can look at openshift-ansible: https://github.com/openshift/openshift-ansible/blob/ec564267f4a25036c92a71be481cfd9e4c03537a/roles/openshift_node/tasks/bootstrap.yml

Briefly, start a master with the following settings in master-config.yml:

kubernetesMasterConfig:
  controllerArguments:
    cluster-signing-cert-file:
    - ca.crt
    cluster-signing-key-file:
    - ca.key
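With oc cluster up, the master configuration is assumed (based on the paths appearing earlier in this thread) to live under /var/lib/origin/openshift.local.config/master/; a sketch of applying the change:

# edit the config, then restart the master container ('origin' is the name cluster up uses)
vi /var/lib/origin/openshift.local.config/master/master-config.yaml
docker restart origin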

Create a new bootstrap config:

oc sa create-kubeconfig -n openshift-infra node-bootstrapper > /tmp/kubeconfig

Create a new node-config.yml file inside of a config map in the openshift-node namespace as node-config (you can give it any name); this will form the template for the node, e.g. as sketched below.
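A sketch of creating that config map, assuming a node-config.yml prepared locally:

# create the openshift-node namespace first if it does not exist yet
oc create configmap node-config -n openshift-node --from-file=node-config.yml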

Then run:

openshift start node --bootstrap-config-name=node-config --kubeconfig=/tmp/kubeconfig --config=/etc/origin/node/node-config.yml

The last flag determines where the bootstrap configuration is written to; if the file already has contents, it is overwritten.

When the node starts, it'll say it's waiting for signed credentials. As an
admin on the master, do:

oc get csr -o name | xargs oc adm certificate approve

once, and you should see a single cert approved (the node's client cert).
Then run it again and you should see the serving cert signed.

The node should then start up, download the node-config config map, update
its arguments, and then run.


@smarterclayton yep, I've read your PR #16571.
While openshift start node might work, it doesn't get called properly when trying to use oc cluster join.
But did you have a look at #17331? I've raised this problem there. It seems related to wrong arguments getting passed to the container.

oc cluster join is experimental and currently broken. It's on track to be fixed for 3.9.


@smarterclayton I'm not sure it's the right place to continue discussion. If you have a better place, please tell me.
With 3.7.0-rc0, I gave your command a try:

origin start node \
    --bootstrap-config-name=node-config \
    --kubeconfig=/var/lib/origin/openshift.local.config/node/node-bootstrap.kubeconfig \
    --config=/var/lib/origin/openshift.local.config/node/node-config.yml \
    --enable=kubelet \
    --loglevel=4

In the logs, I see:

I1117 20:29:51.302904       1 start_node.go:274] Bootstrapping from master configuration
I1117 20:29:51.303009       1 bootstrap.go:58] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
I1117 20:29:51.550320       1 csr.go:104] csr for this node already exists, reusing
I1117 20:29:51.552579       1 csr.go:112] csr for this node is still valid

Then nothing else happens. From the master server, I only received one cert, probably the client cert; I didn't get any serving cert to approve.
Also, my node is not showing in oc get nodes.
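When stuck in this state, inspecting the CSRs on the master directly may show what was actually requested and approved (sketch; the CSR name is a placeholder):

oc get csr
oc describe csr node-csr-example
oc get nodes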

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

/remove-lifecycle stale

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

/remove-lifecycle rotten
