Che: Problem using Bridge Mode for Server and Workspace Containers (can't use --net=host)

Created on 14 Sep 2016 · 28 Comments · Source: eclipse/che

I'm running the codenvy/che-server image with Docker on a host machine where I'm not able to change the /etc/resolv.conf file. So, to allow containers to connect to the internet, I have to use bridge network mode, which I do with the following command:

VOLUMES="-v /var/run/docker.sock:/var/run/docker.sock -v /home/user/che/lib:/home/user/che/lib-copy -v /home/user/che/workspaces:/home/user/che/workspaces -v /home/user/che/storage:/home/user/che/storage"

docker run --name che $VOLUMES -P --dns my_dns1 --dns my_dns2 --env http_proxy=myproxy:8080 codenvy/che-server --remote:REMOTE_IP

Up to that point I can ssh into the server container and wget / clone any internet URL. However, when I create a new workspace I always get a message saying "Internal Server Error: https://github.com/che-samples/blank: cannot open git-upload-pack" (which is not very meaningful).

Then I ssh into the newly created workspace container and try to wget / clone any webpage/repo from the internet, with no luck.

If I do "cat /etc/resolv.conf" in both the server and workspace containers, I can see that only the server has the correct DNS configuration. This may be logical, but it is problematic because the --net=host option is not really an option in my case.
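
To make that comparison concrete, here is a minimal sketch that prints only the nameserver entries of a resolv.conf-style file. The helper name and the file names are assumptions for illustration; the files would first have to be captured from each container, for example with docker exec.

```shell
#!/bin/sh
# Print the nameserver addresses listed in a resolv.conf-style file.
# list_nameservers is an illustrative helper, not part of Che or Docker.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Example (file names are assumptions; capture them first, e.g. with
#   docker exec che cat /etc/resolv.conf > server.resolv.conf):
#   list_nameservers server.resolv.conf
#   list_nameservers workspace.resolv.conf
```

If the two outputs differ, the workspace container did not inherit the --dns settings given to the server container.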

Is there any way I can instruct che-server to use bridge mode for workspace containers as well, and to pass other settings such as DNS or proxy?

Reproduction Steps:

  1. Configure host DNS so you can't resolve google, github or repo.maven.apache.org
  2. Launch codenvy/che-server with default bridge mode, adding DNS and proxy parameters in the run command
  3. Go to http://REMOTE_IP:8080/ (or your assigned port) and create a new workspace with the default blank project

Expected behavior:

Server and workspace containers should share their network configuration or at least provide a way to configure docker run arguments in the che.properties.

Observed behavior:

The server launches correctly, whereas the workspace container isn't able to clone the template project from GitHub. As the workspace container is still launched with the --net=host parameter, it doesn't have the proper DNS or proxy configuration.


OS and version: RHEL 7
Docker version: 1.8.0

Additional information:

  • Problem started happening recently, didn't happen in an older version of Che: NO
  • Problem can be reliably reproduced, doesn't happen randomly: YES

All 28 comments

@garagatyi, @l0rd - this would be in your areas, for your thoughts and comments.

I am thinking out loud. Is bridge mode the way to handle this or is it better to configure the docker daemon with the appropriate proxy configuration?

Thanks for reporting this issue. This sounds like an important area to consider. I am not our networking expert so will need to pull in others that can consider what to do.

Please provide the version of Che.

We should always avoid --net=host. Bridge mode is more flexible. In the long term we should consider creating a custom che network. I think that @garagatyi was already considering that solution and I would be happy to help on this topic.

@raphsoft can you clone with Git cli in the workspace terminal? It can be a problem with JGit.

@eivantsov nope, I can't. I don't think it's a JGit problem, as the message I receive is "unknown host". I tried setting -Dhttp.proxyHost, -DsocksProxyHost and their ports in the che.conf file, with no luck. Also, as I said before: if I do "cat /etc/resolv.conf" in both the server and workspace containers, I can see that only the server has the correct DNS configuration.

Basically I have this configuration:

  • A host machine whose network configuration in resolv.conf I can't change
  • A server container I started with --dns parameters and bridged network mode
  • A workspace container that the server starts with host networking when I create a workspace; it can't resolve names, so cloning templates or downloading Maven artifacts is not possible.

@garagatyi Honestly I don't know what exact version of che-server I'm running. I looked for a version number everywhere in the container under /home/user/che and in the application but couldn't find any. The most I can say is that I pulled the Docker image two weeks ago and the last update I did was 4 days ago. So it's labeled as "latest".

@raphsoft - if it was "latest" from 4 days ago it is 4.7.2. I have added an issue to add an improvement to the Che CLI so that if you are running a che-server, it will connect to it to discover the version of that server.

@raphsoft - this pull request for the che-launcher adds the version number of the running server to the output of "che start" and "che info". We had to do a little bit of hackery - but the server does report its version through an API, so we used that.
https://github.com/eclipse/che-dockerfiles/pull/14

@TylerJewell nice! So the version should be available when I request http://dockerserver:8080/api/? I just did that and can't see it :(

I have also compiled my fork of Che and want to run it on a different server using the /bin/che.sh file. Can you please answer the following questions?

  1. Does the new server need to have docker installed?
  2. Do I need anything other than the contents of the che/assembly/assembly-main/target/eclipse-che-5.0.0-M2-SNAPSHOT/eclipse-che-5.0.0-M2-SNAPSHOT folder? It seems to be the same content as the /home/users/che folder of the Docker-based version (so I would say I don't need anything else, do I?)
  3. Say I want to modify how che-server launches its workspace containers, any clue where I can look?

I've merged the pull request - but it will be a day or two before those changes are rolled into the codenvy/che-launcher:nightly image. But it will be there. All you will need to do is run "che info", and if you have a running Che server, it will find the version of that server.

1: Not sure I understand your question. The /bin/che.sh of an assembly is just the command that is running inside of our che-server.

2: As a result, I recommend you set CHE_LOCAL_BINARY to the directory that you have built. Then you can use our Che CLI with che start and che stop, and it will use the assembly that you built. The directory that you pointed to will have everything that you need in the binary.

3: That will be in the plugin-docker section of our source code. @garagatyi - would be able to point you to it better than I can. But somewhere in all of that is our docker run syntax :)

I tried to run my local build using both the CLI and the che.sh script. In both cases I got the following error.
Initially, I thought I had forgotten to set and export the CHE_HOME variable, but after doing that, I still got the same error.

I will appreciate any guidance you can provide.

2016-09-15 23:33:39,502[main]             [INFO ] [c.m.JmxRemoteLifecycleListener 332]  - The JMX Remote Listener has configured the registry on port 32001 and the server on port 32101 for the Platform server
2016-09-15 23:33:39,504[main]             [INFO ] [o.a.c.core.StandardService 435]      - Starting service Catalina
2016-09-15 23:33:39,504[main]             [INFO ] [o.a.c.core.StandardEngine 259]       - Starting Servlet Engine: Apache Tomcat/8.0.32
2016-09-15 23:33:39,623[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 910]        - Deploying web application archive /home/user/che/tomcat/webapps/dashboard.war
2016-09-15 23:33:57,188[ost-startStop-1]  [INFO ] [o.a.c.u.SessionIdGeneratorBase 241]  - Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [17,025] milliseconds.
2016-09-15 23:33:57,221[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 974]        - Deployment of web application archive /home/user/che/tomcat/webapps/dashboard.war has finished in 17,598 ms
2016-09-15 23:33:57,225[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 910]        - Deploying web application archive /home/user/che/tomcat/webapps/ide.war
2016-09-15 23:33:57,411[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 974]        - Deployment of web application archive /home/user/che/tomcat/webapps/ide.war has finished in 186 ms
2016-09-15 23:33:57,413[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 910]        - Deploying web application archive /home/user/che/tomcat/webapps/wsmaster.war
2016-09-15 23:34:01,456[ost-startStop-1]  [WARN ] [p.DockerExtConfBindingProvider 51]   - CHE_LOCAL_CONF_DIR set to the /home/user/che/conf/ but it must be directory not file
2016-09-15 23:34:02,539[ost-startStop-1]  [ERROR] [o.a.c.c.C.[.[.[/wsmaster] 4818]      - Exception sending context initialized event to listener instance of class org.eclipse.che.inject.CheBootstrap
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) No implementation for java.lang.String[] annotated with @com.google.inject.name.Named(value=che.account.reserved_names) was bound.
  while locating java.lang.String[] annotated with @com.google.inject.name.Named(value=che.account.reserved_names)
    for the 4th parameter of org.eclipse.che.api.user.server.UserManager.<init>(UserManager.java:65)
  while locating org.eclipse.che.api.user.server.UserManager
    for the 1st parameter of org.eclipse.che.api.user.server.UserService.<init>(UserService.java:78)
  at org.eclipse.che.api.deploy.WsMasterModule.configure(WsMasterModule.java:41)

2) No implementation for java.lang.String[] annotated with .....(omitted)

3) No implementation for org.eclipse.che.account.spi.AccountDao was bound......(omitted)

4) No implementation for org.eclipse.che.account.spi.AccountDao was bound......(omitted)

5) No implementation for org.eclipse.che.account.spi.AccountDao was bound......(omitted)

6) No implementation for org.eclipse.che.account.spi.AccountDao was bound......(omitted)

7) No implementation for org.eclipse.che.account.spi.AccountDao was bound......(omitted)

8) No implementation for org.eclipse.che.account.spi.AccountDao was bound.
  while locating org.eclipse.che.account.spi.AccountDao
    for the 1st parameter of org.eclipse.che.account.api.AccountManager.<init>(AccountManager.java:34)
  while locating org.eclipse.che.account.api.AccountManager
    for the 4th parameter of org.eclipse.che.api.workspace.server.WorkspaceManager.<init>(WorkspaceManager.java:107)
  while locating com.google.inject.Provider<org.eclipse.che.api.workspace.server.WorkspaceManager>
    for the 2nd parameter of org.eclipse.che.plugin.docker.machine.local.node.provider.LocalWorkspaceFolderPathProvider.<init>(LocalWorkspaceFolderPathProvider.java:85)
  at org.eclipse.che.plugin.docker.machine.local.LocalDockerModule.configure(LocalDockerModule.java:52) (via modules: org.eclipse.che.api.deploy.WsMasterModule -> org.eclipse.che.plugin.docker.machine.local.LocalDockerModule)

8 errors
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:470) ~[guice-4.1.0.jar:na]
        at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155) ~[guice-4.1.0.jar:na]
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107) ~[guice-4.1.0.jar:na]
        at com.google.inject.Guice.createInjector(Guice.java:99) ~[guice-4.1.0.jar:na]
        at org.everrest.guice.servlet.EverrestGuiceContextListener.getInjector(EverrestGuiceContextListener.java:140) ~[everrest-integration-guice-1.13.1.jar:na]
        at com.google.inject.servlet.GuiceServletContextListener.contextInitialized(GuiceServletContextListener.java:47) ~[guice-servlet-4.1.0.jar:na]
        at org.everrest.guice.servlet.EverrestGuiceContextListener.contextInitialized(EverrestGuiceContextListener.java:85) ~[everrest-integration-guice-1.13.1.jar:na]
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4812) [catalina.jar:8.0.32]
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5255) [catalina.jar:8.0.32]
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147) [catalina.jar:8.0.32]
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:725) [catalina.jar:8.0.32]
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:701) [catalina.jar:8.0.32]
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:717) [catalina.jar:8.0.32]
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:939) [catalina.jar:8.0.32]
        at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1812) [catalina.jar:8.0.32]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_92-internal]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_92-internal]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_92-internal]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_92-internal]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92-internal]
2016-09-15 23:34:02,611[ost-startStop-1]  [ERROR] [o.a.c.c.C.[.[.[/wsmaster] 4818]      - Exception sending context initialized event to listener instance of class org.eclipse.che.everrest.ServerContainerInitializeListener
java.lang.NullPointerException: null
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:210) ~[guava-18.0.jar:na]
        at org.everrest.core.impl.RequestDispatcher.<init>(RequestDispatcher.java:84) ~[everrest-core-1.13.1.jar:na]
        at org.eclipse.che.everrest.ServerContainerInitializeListener.getEverrestProcessor(ServerContainerInitializeListener.java:165) ~[che-core-api-core-5.0.0-M2-SNAPSHOT.jar:5.0.0-M2-SNAPSHOT]
        at org.eclipse.che.everrest.ServerContainerInitializeListener.createWsServerEndpointConfig(ServerContainerInitializeListener.java:123) ~[che-core-api-core-5.0.0-M2-SNAPSHOT.jar:5.0.0-M2-SNAPSHOT]
        at org.eclipse.che.everrest.ServerContainerInitializeListener.contextInitialized(ServerContainerInitializeListener.java:91) ~[che-core-api-core-5.0.0-M2-SNAPSHOT.jar:5.0.0-M2-SNAPSHOT]
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4812) [catalina.jar:8.0.32]
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5255) [catalina.jar:8.0.32]
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147) [catalina.jar:8.0.32]
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:725) [catalina.jar:8.0.32]
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:701) [catalina.jar:8.0.32]
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:717) [catalina.jar:8.0.32]
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:939) [catalina.jar:8.0.32]
        at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1812) [catalina.jar:8.0.32]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_92-internal]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_92-internal]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_92-internal]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_92-internal]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92-internal]
2016-09-15 23:34:02,612[ost-startStop-1]  [ERROR] [o.a.c.core.StandardContext 5256]     - One or more listeners failed to start. Full details will be found in the appropriate container log file
2016-09-15 23:34:02,613[ost-startStop-1]  [ERROR] [o.a.c.core.StandardContext 5307]     - Context [/wsmaster] startup failed due to previous errors
2016-09-15 23:34:02,638[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 974]        - Deployment of web application archive /home/user/che/tomcat/webapps/wsmaster.war has finished in 5,224 ms
2016-09-15 23:34:02,640[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 910]        - Deploying web application archive /home/user/che/tomcat/webapps/ROOT.war
2016-09-15 23:34:02,709[ost-startStop-1]  [INFO ] [o.a.c.startup.HostConfig 974]        - Deployment of web application archive /home/user/che/tomcat/webapps/ROOT.war has finished in 69 ms

This looks suspiciously like a classpath error of some sort. @vparfonov @azatsarynnyy @musienko-maxim - any ideas?

@azatsarynnyy - how do you recommend a fix for @raphsoft?

@raphsoft check whether the che.account.reserved_names property exists in your che.properties file

I just saw that PR #1790 was merged yesterday, where this property was renamed from user.reserved_names to che.account.reserved_names.
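
A quick way to run that check is sketched below. The helper name is illustrative and the path to che.properties is an assumption (adjust it to wherever your assembly keeps its configuration); the function only tests whether an uncommented line defines the key.

```shell
#!/bin/sh
# Succeed if the properties file defines the given key on an uncommented line.
# has_property is an illustrative helper; it is not part of Che.
has_property() {
    file=$1
    key=$2
    # Escape the dots in the key so they match literally, then look for
    # "key=" or "key:" at the start of a line (leading whitespace allowed).
    pattern=$(printf '%s' "$key" | sed 's/[.]/\\./g')
    grep -Eq "^[[:space:]]*${pattern}[[:space:]]*[=:]" "$file"
}

# Example (the path is an assumption):
#   has_property /home/user/che/conf/che.properties che.account.reserved_names \
#       && echo "property present" || echo "property missing"
```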

Yep, user.reserved_names was present and I also added che.account.reserved_names just in case. Same thing. I'm now building 4.7.2 tag. Will let you know how it goes.

Maybe @akorneta can help here.

@raphsoft Looks like you are using the latest build with an old che.properties. Can you share/attach the build version with its configs, and I'll figure out what the problem is.

@raphsoft @TylerJewell my 2 cents here:

  1. Exporting the http proxy in a workspace container will allow utilities like wget, curl or git (native) to connect to the internet.
  2. At the same time, since JGit is deployed with the workspace agent, I am not sure Java will pick this env - since it has to be added to JAVA_OPTS.
  3. Further problems include configuring Maven to work with the proxy (settings.xml edit) or any other tools like npm etc.
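
Points 1 and 2 could be sketched roughly as follows, run inside the workspace container. The proxy host and port (myproxy, 8080) are taken from the original docker run command; the helper name is made up, and whether the workspace agent actually picks up JAVA_OPTS is an assumption that would need to be verified.

```shell
#!/bin/sh
# Build JVM proxy flags from a host and port.
# java_proxy_opts is an illustrative helper; http.proxyHost, http.proxyPort,
# https.proxyHost and https.proxyPort are standard Java networking properties.
java_proxy_opts() {
    host=$1
    port=$2
    printf '%s %s %s %s\n' \
        "-Dhttp.proxyHost=$host" "-Dhttp.proxyPort=$port" \
        "-Dhttps.proxyHost=$host" "-Dhttps.proxyPort=$port"
}

# Native tools (wget, curl, git) read these environment variables.
export http_proxy="http://myproxy:8080"
export https_proxy="$http_proxy"

# The JVM does not read http_proxy by default, so the same settings are
# passed as system properties (assuming the Java process honors JAVA_OPTS).
JAVA_OPTS="${JAVA_OPTS:-} $(java_proxy_opts myproxy 8080)"
export JAVA_OPTS
```

Maven and npm keep their own proxy settings, so point 3 still has to be handled per tool.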

I tried a lot of things with no luck :( My best option was:

  1. I create a docker-in-docker setup so I launch a first "wrapper container" with --dns options and bridged mode.
  2. Then I run che-server within the wrapper and it should catch the correct configuration using --net=host.
  3. The che-server launches its children workspace containers (actually children to the wrapper) with --net=host as well and everything works just fine.

It worked pretty well for 1 and 2. I can expose 8000 and 8000 from the inner che-server to the wrapper container and then to the host, which allows me to access the Che dashboard from any computer in the network.

The problem is that when I create the workspace it launches a ws-container on a random port and I can't map it in the same way as the che-server container. Maybe the question here is more about docker-in-docker than Eclipse Che, but I wanted to let you know.

If I could restrict the ports for the workspaces to, say, only 10, I would be able to expose them all in the wrapper, and that might solve my problems temporarily. There's some info here, but it's not very descriptive: https://eclipse-che.readme.io/docs/usage-docker-server. I tried editing the file /proc/sys/net/ipv4/ip_local_port_range, but even as root I can't edit it.

Do you have any idea? I've spent so many hours on this that I'm about to drop it because of other priorities. To me it's just not fair to miss the chance of doing a full test of Che in my company just because I can't set up the DNS in resolv.conf at the host level.

@raphsoft - yeah, you got pretty far. There are other threads on our GitHub issues that also talk about port limiting. And everything that has been tried so far seems more like a partial solution, and very difficult to configure.

Now, one thing is that Docker always uses the same starting ports in the ephemeral range. So if you knew how many workspaces you were going to open, you could expose them all in the wrapper, as they start from the same position. The more ports you expose, however, the slower Docker will start a container.
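
As a rough sketch of that idea, the helper below builds a single docker -p flag that publishes a contiguous block of ports on the wrapper. The helper name is illustrative, and the starting port is an assumption: 32768 is a common start of the Linux ephemeral range, but the actual value on a given host should be read from /proc/sys/net/ipv4/ip_local_port_range.

```shell
#!/bin/sh
# Build a docker "-p" flag that publishes a contiguous block of ports.
# publish_range is an illustrative helper, not part of Che or Docker.
publish_range() {
    start=$1
    count=$2
    end=$((start + count - 1))
    printf '%s\n' "-p ${start}-${end}:${start}-${end}"
}

# Example (wrapper-image is a hypothetical image name):
#   docker run $(publish_range 32768 10) --dns my_dns1 wrapper-image
```

Keeping the count small matters here, given that container start-up slows down as the number of published ports grows.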

The other thing that we have thought about as a long term solution is to implement some form of workspace tunneling within the Che server, or the che launcher. So within the che launcher, we'd open up a reverse-proxy, like traefik, and then have a single port that channels all workspace traffic into it. Each URL that goes through the common port would have embedded within it the actual URL that the browser is trying to communicate with. So there would be some sort of logic around encoding URLs that are sent to the browser and decoding them when they come back to the reverse-proxy.

We generally know what we should do for such a situation, but we think it's a lot of plumbing work, as we'd need to have a property in Che that enables this mode, and then within Che we'd need to encode all outgoing URLs when the UD or IDE needs to receive a workspace URL. And then we'd need to have a reverse proxy launched inside of the che-launcher. And then we'd need some sort of decoding algorithm for the reverse proxy.
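
To illustrate what such encoding and decoding might look like, here is a purely hypothetical sketch. Nothing like this exists in Che; the /ws/host/port/path layout, the function names and the example addresses are all assumptions, and the sketch only handles plain http URLs with an explicit port.

```shell
#!/bin/sh
# Hypothetical URL scheme for the tunneling idea: a workspace URL such as
#   http://10.0.0.5:32768/api/project
# is rewritten to go through one proxy port as
#   http://che.example.com:8080/ws/10.0.0.5/32768/api/project

# encode_ws_url PROXY_BASE WORKSPACE_URL
encode_ws_url() {
    proxy_base=$1
    rest=${2#http://}          # 10.0.0.5:32768/api/project
    hostport=${rest%%/*}       # 10.0.0.5:32768
    path=${rest#"$hostport"}   # /api/project (may be empty)
    host=${hostport%%:*}
    port=${hostport##*:}
    printf '%s/ws/%s/%s%s\n' "$proxy_base" "$host" "$port" "$path"
}

# decode_ws_url PROXY_BASE ENCODED_URL
decode_ws_url() {
    rest=${2#"$1"/ws/}         # 10.0.0.5/32768/api/project
    host=${rest%%/*}
    rest=${rest#*/}            # 32768/api/project
    port=${rest%%/*}
    path=${rest#"$port"}       # /api/project (may be empty)
    printf 'http://%s:%s%s\n' "$host" "$port" "$path"
}
```

The server side would call something like encode_ws_url before handing a workspace URL to the browser, and the reverse proxy would apply decode_ws_url to route each incoming request.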

Can you please let me know if this is true:

As I saw in the source code, the che-server connects to port 2375 by default. So even if I launch a wrapper with the Docker service on port 2399 (with my desired networking), the che-server will still connect to the host on 2375 and therefore launch workspaces with restricted network capabilities (not sure about it).

I'm guessing because even after doing the wrapper thing and pre-defined ports I was not able to get the workspace container to download anything from the outside world :(

Finally, if all I said before is true, I don't have a way to tell che-server my custom Docker connection URL, do I?

Hi @raphsoft - let me see if I can parse what you are saying.

The che-server container can connect to Docker on any host and port. You need to configure the che-server Docker container so that the internal Docker services know how to connect to the daemon. If you look on this networking configuration page, we have a table that talks about the various configurations that you need to change. The first row is the instructions on how to tell the Che server to reach a Docker daemon. By default, we have the che-server Docker container volume mount /var/run/docker.sock so that it can access the daemon over a Unix socket. But you don't have to use that.

Now, keep in mind that the configuration that you give to the Che server and how it connects to the Docker daemon is independent of how the workspace containers connect to the outside world. If you want to have the services inside of a container connect to the public Internet using a specialized port, you could have the Docker image that they are created from have these configuration items within their OS boot configuration.

@raphsoft you may export DOCKER_HOST and configure Docker to listen over TCP; this way Che won't communicate with Docker using the Unix socket.
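
A minimal sketch of that suggestion, using the port 2399 mentioned earlier in the thread. The helper name and the 127.0.0.1 address are assumptions, and the daemon itself must separately be started so that it listens on that address (its -H tcp://... option).

```shell
#!/bin/sh
# Compose a DOCKER_HOST value for a daemon listening on TCP.
# docker_host_url is an illustrative helper; host and port are assumptions.
docker_host_url() {
    printf 'tcp://%s:%s\n' "$1" "$2"
}

# Point Docker clients at the daemon on port 2399 instead of the Unix socket.
DOCKER_HOST=$(docker_host_url 127.0.0.1 2399)
export DOCKER_HOST
```

How the variable reaches the che-server process (for example through an --env option on its container) depends on how the server is launched.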

So what exactly can we do so that Che can communicate with the world?

Your workspace does not communicate using Docker to the outside world. If you want your internal workspace to communicate with the outside world, you need to configure the internal proxy configuration parameters inside of the workspace after it has been created.

Closing as I think the original issue has been resolved. We have a new CLI which also simplifies how proxy configurations within workspaces are set up. If you have a specific configuration issue please open a new issue.
