Using the standalone-chrome or node-chrome images, I often observe a timeout when attempting to launch a new Chrome session.
Steps to reproduce:
1) launch a standalone chrome instance: docker run -d --name chrome selenium/standalone-chrome:2.47.1
2) build the test container: docker build -t selenium/test:local ./Test
3) run the repeat test script: ./test-repeat.sh chrome
4) If necessary, repeat steps 1 and 3 until the problem is observed
Note:
In order to narrow the focus of the test to launching new sessions, you may want to temporarily modify Test/smoke-test.js to omit the part of the test that tries navigating to github.com after starting the session. After modifying the script, make sure to rebuild the selenium/test:local image to pick up the changes.
Expected results:
All 50 sessions launch and quit successfully - the test passes.
Actual results:
One of the session launch commands will hang indefinitely.
Docker host: boot2docker v1.7.0 (Tiny Core Linux)
I can reproduce this issue. My first run was actually clean, but it hung on the 41st attempt during my second run.
Here's a bit more detail about what I see happening on the system when the launch hangs.
The last line from the selenium-server stdout just says that it's launching a new Chrome session:
➜ ~ docker logs 9251a | tail -n 4
12:26:19.769 INFO - Executing: [new session: Capabilities [{platform=ANY, javascriptEnabled=true, browserName=chrome, version=}]])
12:26:19.771 INFO - Creating a new session for Capabilities [{platform=ANY, javascriptEnabled=true, browserName=chrome, version=}]
Starting ChromeDriver 2.16.333243 (0bfa1d3575fc1044244f21ddb82bf870944ef961) on port 17315
Only local connections are allowed.
Inside of the container, I see that some Chrome processes are in fact running:
root@9251a325813b:/# ps auxww
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
seluser 1 0.1 0.1 17968 2796 ? Ss 12:25 0:00 /bin/bash /opt/bin/entry_point.sh
seluser 5 0.0 0.0 4448 1640 ? S 12:25 0:00 /bin/sh /usr/bin/xvfb-run --server-args=:99.0 -screen 0 1360x1020x24 -ac +extension RANDR java -jar /opt/selenium/selenium-server-standalone.jar
seluser 16 0.6 1.2 207472 26296 ? Sl 12:25 0:00 Xvfb :99 :99.0 -screen 0 1360x1020x24 -ac +extension RANDR -nolisten tcp -auth /tmp/xvfb-run.qPOeDR/Xauthority
seluser 27 17.2 5.6 3042680 116716 ? Sl 12:25 0:09 java -jar /opt/selenium/selenium-server-standalone.jar
root 241 0.1 0.1 18144 3376 ? Ss 12:26 0:00 bash
seluser 811 0.3 0.5 381852 11012 ? Sl 12:26 0:00 /opt/selenium/chromedriver-2.16 --port=17315
seluser 816 0.8 3.8 555444 79020 ? Sl 12:26 0:00 /opt/google/chrome/chrome --no-sandbox --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-monitor --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-logging --ignore-certificate-errors --load-extension=/tmp/.com.google.Chrome.HFdqNc/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12545 --safebrowsing-disable-auto-update --safebrowsing-disable-download-protection --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.XIexPl data:,
seluser 824 0.0 0.0 4368 660 ? S 12:26 0:00 cat
seluser 825 0.0 0.0 4368 656 ? S 12:26 0:00 cat
seluser 827 0.0 1.9 347672 40388 ? S 12:26 0:00 /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --no-sandbox --user-data-dir=/tmp/.com.google.Chrome.XIexPl
seluser 828 0.0 0.3 87176 6432 ? S 12:26 0:00 /opt/google/chrome/nacl_helper --no-sandbox
seluser 845 0.1 2.2 441212 46220 ? Sl 12:26 0:00 /opt/google/chrome/chrome --type=gpu-process --channel=816.0.1041773465 --enable-logging --log-level=0 --no-sandbox --user-data-dir=/tmp/.com.google.Chrome.XIexPl --v8-natives-passed-by-fd --v8-snapshot-passed-by-fd --supports-dual-gpus=false --gpu-driver-bug-workarounds=2,45,57 --disable-accelerated-video-decode --gpu-vendor-id=0x0000 --gpu-device-id=0x0000 --gpu-driver-vendor --gpu-driver-version --user-data-dir=/tmp/.com.google.Chrome.XIexPl --v8-natives-passed-by-fd --v8-snapshot-passed-by-fd --enable-logging --log-level=0
seluser 874 0.0 0.0 0 0 ? Z 12:26 0:00 [chrome] <defunct>
seluser 879 3.7 0.7 555444 15400 ? S 12:26 0:01 /opt/google/chrome/chrome --no-sandbox --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-monitor --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-logging --ignore-certificate-errors --load-extension=/tmp/.com.google.Chrome.HFdqNc/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12545 --safebrowsing-disable-auto-update --safebrowsing-disable-download-protection --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.XIexPl data:,
I am a bit suspicious of the defunct proc there, but it doesn't seem to consistently appear along with the problem, so I'm not sure that it's related to the hang at all.
For what it's worth, I don't see anything out of the ordinary in the Chrome debug log:
root@9251a325813b:/# cat /tmp/.com.google.Chrome.XIexPl/chrome_debug.log
[816:816:0814/122620:ERROR:browser_main_loop.cc(185)] Running without the SUID sandbox! See https://code.google.com/p/chromium/wiki/LinuxSUIDSandboxDevelopment for more information on developing with the sandbox on.
[816:816:0814/122620:INFO:audio_manager_pulse.cc(258)] Failed to connect to the context. Error: Connection refused
[845:845:0814/122620:ERROR:sandbox_linux.cc(345)] InitializeSandbox() called with multiple threads in process gpu-process
[816:816:0814/122620:WARNING:password_store_factory.cc(346)] Using basic (unencrypted) store for password storage. See http://code.google.com/p/chromium/wiki/LinuxPasswordStorage for more information about password storage options.
[874:874:0814/122620:ERROR:renderer_main.cc(200)] Running without renderer sandbox
[874:874:0814/122635:INFO:child_thread_impl.cc(666)] ChildThreadImpl::EnsureConnected()
(Those errors appear even when the session launch succeeds.)
As a sort of control I've executed the same test (repeatedly launch/quit Chrome sessions) against an Ubuntu 14.04 VM that is set up just like the Docker images -- the same Chrome and chromedriver versions, using xvfb with the same screen geometry, etc... The session launches are successful there 100% of the time, as far as I can tell, so the problem does appear to be specific to running chromedriver within a container.
Happens to me on an Ubuntu 14.04 VM (not in a container):
chrome_debug.log:
[2854:2854:0825/123047:ERROR:nss_util.cc(97)] Failed to create /.pki/nssdb directory.
[2891:2891:0825/123047:ERROR:sandbox_linux.cc(345)] InitializeSandbox() called with multiple threads in process gpu-process
[2854:2854:0825/123047:WARNING:password_store_factory.cc(346)] Using basic (unencrypted) store for password storage. See http://code.google.com/p/chromium/wiki/LinuxPasswordStorage for more information about password storage options.
[1:1:0825/123102:INFO:child_thread_impl.cc(666)] ChildThreadImpl::EnsureConnected()
Chrome process list:
selenium 1347 1 0 12:04 ? 00:00:11 /usr/bin/java -jar /usr/local/selenium/server/selenium-server-standalone.jar -role node -nodeConfig /usr/local/selenium/config/selenium_node.json -Dwebdriver.chrome.driver=/usr/local/selenium/drivers/chromedriver/chromedriver
selenium 2848 1347 0 12:30 ? 00:00:00 /usr/local/selenium/drivers/chromedriver_linux64-2.18/chromedriver --port=22923
selenium 2854 2848 0 12:30 ? 00:00:00 /opt/google/chrome/chrome --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-logging --ignore-certificate-errors --load-extension=/tmp/.com.google.Chrome.oy5xLY/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12573 --safebrowsing-disable-auto-update --safebrowsing-disable-download-protection --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.nwXnbU data:,
selenium 2869 2854 0 12:30 ? 00:00:00 /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.nwXnbU
selenium 2870 2869 0 12:30 ? 00:00:00 /opt/google/chrome/nacl_helper
selenium 2873 2869 0 12:30 ? 00:00:00 /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.nwXnbU
selenium 2891 2854 0 12:30 ? 00:00:00 /opt/google/chrome/chrome --type=gpu-process --channel=2854.0.384655797 --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.nwXnbU --v8-natives-passed-by-fd --v8-snapshot-passed-by-fd --supports-dual-gpus=false --gpu-driver-bug-workarounds=2,45,57 --disable-accelerated-video-decode --gpu-vendor-id=0x15ad --gpu-device-id=0x0405 --gpu-driver-vendor --gpu-driver-version --user-data-dir=/tmp/.com.google.Chrome.nwXnbU --v8-natives-passed-by-fd --v8-snapshot-passed-by-fd --enable-logging --log-level=0
selenium 2916 2873 0 12:30 ? 00:00:00 [chrome] <defunct>
selenium 2923 2854 0 12:30 ? 00:00:05 /opt/google/chrome/chrome --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-logging --ignore-certificate-errors --load-extension=/tmp/.com.google.Chrome.oy5xLY/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12573 --safebrowsing-disable-auto-update --safebrowsing-disable-download-protection --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.nwXnbU data:,
I have also experienced this and trying to discover the cause has not been very successful. If I switch our tests to run using firefox, I do not experience this issue. When using the selenium-node-chrome-debug, I can see the browser open and running tests. During a hang, however, the browser is only open on the task bar and not able to be interacted with.
I am also having this issue. This will prevent me from using docker-selenium. I need it to be stable so I can have continuous integration and monitoring over it. And I can't use Firefox since I am using some chromeOptions.
Seeing this issue as well.
Checking in to report I am also seeing this issue. Google search led me here...
I am having these issues as well, I found that decreasing the memory available to docker reduced how frequently this happens.
Seeing this as well
Any progress here? I had this issue on rare occasions in previous releases, but after updating it occurs so often that I can't run my tests anymore, no matter how often I try.
I am having similar issues that I haven't investigated closely yet.
Is there any reason to believe that the container should be running an init process? Lack of proper zombie killing is one reason containers can behave strangely, right?
I think I'll try to build a Chrome WebDriver container that uses Yelp's dumb_init (http://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html) and see if that helps. Otherwise I'll have to make a timeout on a higher level to force delete and recreate the containers.
This has definitely gotten worse in the last month or two. I can't run our tests at all now.
Have you guys seen the instructions in https://github.com/SeleniumHQ/docker-selenium#running-the-images?
it says to make sure you mount /dev/shm on the container. It has already improved robustness for me greatly.
Yes, I have /dev/shm bind mounted.
My team has been seeing these issues as well and we're not using docker, just a regular grid setup based on Ubuntu 14.04, so perhaps this is not specifically related to docker?
To reproduce, we wrote a simple selenium script that repeatedly opens Chrome, visits a generic web page and quits the browser. After about 50 iterations the script hangs trying to start a new Chrome session. This only happens using a RemoteWebDriver, though. Using a local Chrome instance we haven't seen it hanging. HTH
@sterago
> wrote a simple selenium script that repeatedly opens Chrome
Is it with or without selenium grid? Probably simple node would work?
@Vanuan
The script fails when using a browser handle obtained by contacting a selenium hub. Hub and node are running on the same machine.
So it's anywhere in this chain:
hub -> node -> chromedriver -> chrome
                                 |
hub <- node <- chromedriver <-----
According to the log:
04:44:24.140 INFO - Creating a new session for Capabilities [{rotatable=false, nativeEvents=false, browserName=chrome, takesScreenshot=false, javascriptEnabled=false, version=, platform=ANY, cssSelectorsEnabled=false}]
Starting ChromeDriver 2.20.353124 (035391233162d32c80f1dce587c8154a13830c3b) on port 20575
Only local connections are allowed.
the hub reaches the node, but the node is unable to reach chromedriver because this message isn't printed:
04:44:24.450 INFO - Done: [new session: Capabilities [{rotatable=false, nativeEvents=false, browserName=chrome, takesScreenshot=false, javascriptEnabled=false, version=, platform=ANY, cssSelectorsEnabled=false}]]
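A rough way to spot this signature automatically is to scan the node log for a "Creating a new session" line that is never followed by a "Done: [new session" line. This is only a sketch that assumes the log format quoted in this thread; the function name `has_pending_session_launch` and the abbreviated sample lines are made up for illustration.

```python
def has_pending_session_launch(log_lines):
    """Return True if a session launch started but never reported completion.

    Assumes the selenium-server log format quoted in this thread:
    "Creating a new session ..." marks the start of a launch and
    "Done: [new session ..." marks a successful one.
    """
    pending = False
    for line in log_lines:
        if "Creating a new session" in line:
            pending = True
        elif "Done: [new session" in line:
            pending = False
    return pending


# Abbreviated sample lines modelled on the logs above.
hung_log = [
    "04:44:24.140 INFO - Creating a new session for Capabilities [{browserName=chrome}]",
    "Starting ChromeDriver 2.20.353124 on port 20575",
    "Only local connections are allowed.",
]
ok_log = hung_log + [
    "04:44:24.450 INFO - Done: [new session: Capabilities [{browserName=chrome}]]",
]

print(has_pending_session_launch(hung_log))  # True
print(has_pending_session_launch(ok_log))    # False
```

A launch that is merely slow looks the same as a hung one here, so in practice this would need to be combined with a timestamp check.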
@Vanuan Yes, and considering that when using a local ChromeDriver instance on that same machine it never hung during our tests, one could assume that the chromedriver <-> chrome part of the chain can be excluded as, at least in isolation, it seems not to trigger the issue. Perhaps there is some kind of deadlock happening during the communication between node and chromedriver? The result of some quick tests we did strace'ing all the parts involved seemed to suggest something along those lines, but I wouldn't bet on it.
I've just reproduced it by directly connecting to the node (without a hub).
@Vanuan Is that using Docker? What's the OS? Our test script is using the Python selenium bindings, yours as well?
No, mine is using ruby bindings. Yes. It's a docker node. Here's a run command:
docker run -d \
  -p 5555:5555 \
  -e HUB_PORT_4444_TCP_ADDR=${HUB_HOST} \
  -e HUB_PORT_4444_TCP_PORT=4444 \
  -e REMOTE_HOST=${CURRENT_HOST}:5555 \
  -v /dev/shm:/dev/shm \
  --name=chrome \
  selenium/node-chrome:2.52.0
And I reproduce the timeout by connecting directly to ${CURRENT_HOST}:5555
It happens less frequently though. Client, hub and node are all separate machines.
This does look strange:
07:44:44.661 INFO - Creating a new session for Capabilities [{rotatable=false, nativeEvents=false, browserName=chrome, takesScreenshot=false, javascriptEnabled=false, version=, platform=ANY, cssSelectorsEnabled=false}]
Starting ChromeDriver 2.20.353124 (035346203162d32c80f1dce587c8154a1efa0c3b) on port 1317
Only local connections are allowed.
...
08:37:00.129 INFO - Command failed to close cleanly. Destroying forcefully (v2). [/opt/selenium/chromedriver-2.20, --port=20521][ {}]
08:37:01.137 ERROR - Unable to kill process with PID 1563
08:37:01.138 WARN - Exception thrown
...
Caused by: org.openqa.selenium.WebDriverException: java.lang.reflect.InvocationTargetException
Driver info: driver.version: unknown
at org.openqa.selenium.remote.server.DefaultDriverProvider.callConstructor(DefaultDriverProvider.java:113)
...
java.util.concurrent.ExecutionException: org.openqa.selenium.WebDriverException: java.lang.reflect.InvocationTargetException
Build info: version: '2.52.0', ...
...
Driver info: driver.version: ChromeDriver
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:665)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:249)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:131)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:144)
at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:170)
at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:138)
... 14 more
Caused by: org.openqa.selenium.WebDriverException: java.net.SocketTimeoutException: Read timed out
...
Driver info: driver.version: ChromeDriver
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:91)
...
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
...
at org.openqa.selenium.remote.internal.ApacheHttpClient.execute(ApacheHttpClient.java:90)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:142)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:82)
chromedriver is running there though:
docker exec chrome ps aux|grep chromedriver
seluser 3931 0.0 0.1 391412 9292 ? Sl 07:44 0:00 /opt/selenium/chromedriver-2.20 --port=1317
seluser 4689 0.0 0.0 391412 7432 ? Sl 08:23 0:00 /opt/selenium/chromedriver-2.20 --port=30084
And it's listening:
netstat -tl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:12698 *:* LISTEN
tcp 0 0 localhost:30084 *:* LISTEN
tcp 0 0 localhost:1317 *:* LISTEN
tcp 0 0 localhost:12106 *:* LISTEN
tcp 0 0 localhost:12050 *:* LISTEN
tcp 0 0 *:5555 *:* LISTEN
Reproduced again. Port used by chromedriver: 1081
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
# Selenium-standalone.jar talking to chromedriver:
tcp 0 0 localhost:1081 localhost:41303 ESTABLISHED 14332/chromedriver-
tcp 0 0 localhost:41303 localhost:1081 ESTABLISHED 21/java
# Chrome talking to chromedriver
tcp 0 0 localhost:12544 localhost:38406 ESTABLISHED 14337/internal --lo
tcp 0 0 localhost:38406 localhost:12544 ESTABLISHED 14332/chromedriver-
# Selenium-standalone.jar talking to selenium hub
tcp 0 0 b2962a5d73da:49516 $SELENIUM_HUB:4444 ESTABLISHED 21/java
tcp 1 0 b2962a5d73da:49506 $SELENIUM_HUB:4444 CLOSE_WAIT 21/java
# Selenium-standalone.jar talking to itself?
tcp 0 0 b2962a5d73da:5555 $CONTAINER_IP:33263 ESTABLISHED 21/java
# Client (tests) talking to Selenium-standalone.jar server
tcp 0 0 b2962a5d73da:5555 $DRIVER_CLIENT:60783 ESTABLISHED 21/java
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
# chromedriver
tcp 0 0 localhost:1081 *:* LISTEN 14332/chromedriver-
# chrome extension that communicates with chromedriver
tcp 0 0 localhost:12544 *:* LISTEN 14337/internal --lo
# selenium node that communicates with chromedriver and exposes a port
tcp 0 0 *:5555 *:* LISTEN 21/java
How can I check where the connection times out? Which code is responsible for writing "Done: [new session:"?
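As a first step, each hop can at least be checked for accepting TCP connections within a deadline. The sketch below uses only the Python standard library; the port numbers are copied from the netstat output above and change on every run, and the helper name `can_connect` is hypothetical. It only rules out refused or silently dropped connections. It does not prove that the process behind the port ever answers a request, which is the failure seen in the stack trace.

```python
import socket


def can_connect(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Ports taken from the netstat listing in this thread (assumed, per-run values).
    for name, port in [("chromedriver", 1081), ("chrome", 12544), ("selenium node", 5555)]:
        state = "reachable" if can_connect("localhost", port) else "unreachable"
        print(name, port, state)
```

Run it inside the container (for example via docker exec), since chromedriver only listens on localhost.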
For me, this NEVER happens if I start chrome from inside an X11 session. No matter if it runs against Xvfb or not, or whether Xvfb has been started on the console. The problem appears to be X11 resource usage related to a live/real X11 user desktop session. I traced it down to DBUS. Setting DBUS_SESSION_BUS_ADDRESS=/some/nonsense in Jenkins fixed my testing there and chrome/chromium are starting up there again without any problems at all.........
@jjYBdx4IL how exactly did you configure the DBUS_SESSION_BUS_ADDRESS environment variable? Is it configured for the environment where the selenium node process runs or somewhere else?
Could you guys try installing dbus-x11 and see if that helps?
@elgalu Thanks for the tip - in our case, it's already installed on the machine running the selenium node
All I can say is that I set a fake env var with random content in jenkins. I don't use a permanent selenium server, my browsers get started by chromedriver on the fly.
Anecdotally, setting DBUS_SESSION_BUS_ADDRESS=/dev/null appears to have made my selenium environment reliable. Very very excited to see progress on this!
Me too @pwaller. Where does one need to set this variable? For the selenium server?
@adeslade: You need to set it on the selenium/node-chrome (or selenium/node-chrome-debug) container. So my selenium docker-compose.yml now looks like this. Or if you're running it with straight docker: docker run --env DBUS_SESSION_BUS_ADDRESS=/dev/null ... selenium/node-chrome-debug
That's great, thanks for the example.
thanks @pwaller - I would like to try this and see if it helps also in a non-docker environment but as I am not familiar with docker I am not sure I understand for which process this environment variable should be set. Should it be for the java process running the selenium node jar perhaps? Or for the chromedriver executable?
@sterago it's the thing under which the chromedriver runs, as I understand it.
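To illustrate why setting it on the parent is enough: child processes inherit the environment of whatever starts them, so a variable set for the selenium node process is also seen by the chromedriver and chrome processes it spawns. A minimal sketch with the standard library, using a throwaway Python child as a stand-in for chromedriver (the helper name `env_with_dbus_disabled` is made up for illustration):

```python
import os
import subprocess
import sys


def env_with_dbus_disabled(base_env=None):
    """Copy an environment and point DBUS_SESSION_BUS_ADDRESS at /dev/null."""
    env = dict(os.environ if base_env is None else base_env)
    env["DBUS_SESSION_BUS_ADDRESS"] = "/dev/null"
    return env


# Stand-in for the selenium node / chromedriver: a child process that just
# prints the value it inherited from its parent.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DBUS_SESSION_BUS_ADDRESS'])"],
    env=env_with_dbus_disabled(),
    capture_output=True,
    text=True,
    check=True,
)
print(child.stdout.strip())  # /dev/null
```

In a non-docker setup the equivalent is exporting the variable in the shell or init script that starts the selenium node jar.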
thanks. I'll try it out and report back
The DBUS fix is totally working for me! I'm running 32 concurrent nodes and was running into this about 2-5 times every ten minutes, and the DBUS stuff has finally eliminated those darned startup errors.
The picture explains it more clearly:

Wow, the DBUS fix looks promising!
Didn't have any timeout so far
We added DBUS_SESSION_BUS_ADDRESS=/dev/null to the list of system-wide environment variables (/etc/environment). Hopefully this is not so invasive that it causes problems in other parts of the system, but so far it seems to have helped with the problem of chrome randomly hanging at launch in a non-docker environment as well. Thanks for all the pointers - hopefully there will be an official fix for this issue soon, but I guess it wouldn't be from this very project (selenium-docker) but rather the main selenium project, as far as I understand?
The cause of this and why the fix works is still unknown.
@jjYBdx4IL might explain it better
I have no clue and reported the issue to Google.
@jjYBdx4IL can you link your report?
don't have a link sry
@jjYBdx4IL :+1:
Just tried the DBUS_SESSION_BUS_ADDRESS=/dev/null solution and it appears to be working for me!
I've been battling with this for months, so it's great to finally have a workaround _(though I may be counting my chickens early; still have yet to test extensively)_.
Also, I'm confident this is an underlying selenium or docker issue rather than a chromedriver issue, as I had previously tried replacing Chrome with Firefox and inevitably ran into the same thing (random hangs whilst creating a new session).
@pwaller I have a similar issue, but my configuration does not use selenium-standalone or docker. The chrome browser freezes when running cucumber-jvm/selenium tests on Jenkins on an Ubuntu Linux box. The tests run sequentially.
@SaiPawan your issue sounds unrelated. This issue has to do with the chromedriver hanging when trying to create a new session. With your issue, is the browser starting and at least performing part of your test? With this particular issue, the console log will show:
Only local connections are allowed.
With no console output afterwards, it just hangs.
I've been running configs with the ENV variable suggested and have not had issue since. We could always make that a preset ENV variable in the node base image, assuming there are no ill side effects associated with that.
I hope this is the answer, this instability is annoying!
Why isn't this just a default part of the Container? Or would you want to use this in some instances?
Thanks
@alexkogon because it looks to be a bug in chrome or chromedriver, or xvfb-related, or something else. We don't know the root cause.
@charford In my scenario some tests do run before the browser freezes, and somewhere during the execution the chrome browser freezes. If left as it is, it takes a lot of time to recover (sometimes about 4 hours), but once it recovers it runs the other tests fine. The main issue I have is that the total script execution time increases a lot.
@Vanuan ok might be nice to incorporate it into the container, being the best implementation of the current software and not the ideal one, such that you just have to use the container and it works rather than knowing you have to work around the bug of some of the software in the container. If we know there is a bug, let's adapt the implementation to work around that bug if we know how. A matter of philosophy I suppose though.
@alexkogon you already have to use /dev/shm workaround. And there's no suggestion to incorporate it into an image.
@Vanuan I already did :)
What do I need to do to use DBUS_SESSION_BUS_ADDRESS=/dev/null? I am using the selenium standalone server with the hub on a Windows machine and trying to use the chrome browser on a Mac. I have 70 tests for which the hub creates a session on the node, the test steps are fired and the browser is quit. At random times the chrome browser session is not created and the hub has to wait until a timeout occurs. Thanks
@charansethi unless you explicitly installed a port of DBUS for Windows, by default there is no DBUS system there, so I am afraid that this work around is not applicable in this case
@Vanuan looks like it was incorporated after all, despite there being "no suggestion" to do so.
@kayabendroth thanks!
Wow! Thanks!
I set the environment variable with export DBUS_SESSION_BUS_ADDRESS=/dev/null and no longer have any hanging chromedriver processes!
I ran one file that had an empty it statement with 12 concurrent browser sessions, and looped it 6000 times. After almost 2 hours of continuously running, there was not a single hanging process.
[12:20:33] Protractor ran 6000 files (6000 specs) -- 6000 passed
[12:20:33]
[12:20:33] Finished 'protractor' after 1.92 h
Had this same issue and looks like adding this to my selenium /etc/init.d script did the trick:
export DBUS_SESSION_BUS_ADDRESS=/dev/null
This has been running for some days now without leaving any chromium processes open. Thanks for the tip!
What a crazy fix. Been spinning my tests for ~30 times without a single failure. Thanks guys.
Crazy fix indeed. So the DBUS_SESSION_BUS_ADDRESS=/dev/null magic fix works, but it would be nice to know what exactly we are doing. If anyone can enlighten us, that would be great. I see this in the chromedriver logs:
(google-chrome-base:544): LIBDBUSMENU-GLIB-WARNING **: Unable to get session bus: Address element '/dev/null' does not contain a colon (:)
I'll try a new alternative, if anyone has an input please let me know
sudo mkdir -p /var/run/dbus
sudo service dbus start
It seems I also need to pass -v /var/run/dbus:/var/run/dbus during docker run. Does anybody know what the implications of doing that are?
Hi Leo you are mapping a parent OS directory to the Docker "VM" with the -v command. I'm not sure you have to do that though and I believe this fix to the Docker Selenium Chrome Node to workaround a bug in the Selenium ChromeDriver itself was incorporated directly into the base image, after some discussion as to whether or not that was appropriate (which I was involved in on the "yes do it" side?)
BTW FYI the Windows version is coming along well; I saw the hardest part (getting a Windows VM to appear when running a Docker image) working when I was on site last week, we spent a few hours trying to get the Ansible script which installs Selenium software into windows working then discovered we had the wrong version of PowerShell installed and that's why it wasn't working. We're all busy with other stuff now but this should be there soon I let you know...
Cool news about Windows @alexkogon it will be awesome to have IE in docker-selenium at some point!
Regarding dbus, I found a way without having to share /var/run/dbus, we are testing it and seems stable so far, the hack DBUS_SESSION_BUS_ADDRESS=/dev/null is removed and doing this instead:
# Dockerfile
sudo apt-get -qyy install dbus-x11
# entry.sh
sudo rm -f /var/lib/dbus/machine-id
sudo mkdir -p /var/run/dbus
sudo service dbus restart
service dbus status
export $(dbus-launch)
export NSS_USE_SHARED_DB=ENABLED
echo "-- INFO: DBUS_SESSION_BUS_ADDRESS=${DBUS_SESSION_BUS_ADDRESS}"
#=> e.g. DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-APZO4BE4TJ,guid=6e9c098d053d3038cb0756ae57ecc885
echo "-- INFO: DBUS_SESSION_BUS_PID=${DBUS_SESSION_BUS_PID}"
#=> e.g. DBUS_SESSION_BUS_PID=44
@elgalu being old fashioned I also don't believe code works on Firefox or Chrome for Windows or Mac OSs unless I've seen it working on those OSs, but hopefully I'm wrong :)
Am I correct in thinking you are working on your own Docker grid to incorporate the Chrome fix here and that's why you aren't just using the standard one with it already in there, or is the image still hanging?
It was hanging, in the output we could only see was:
Starting chromedriver on port {{some_port}} Only local connections are allowed.
But then nothing.
After implementing the above dbus solution it works consistently (so far).
@elgalu don't confuse docker on windows with being able to run windows binaries under docker. Docker on Windows still uses linux vm.
@Vanuan if you are referring to what I was talking about it is not Docker on Windows, it is running a Windows Selenium Grid Node via Docker. What we have done is embed a Windows VM inside a Docker image, so that when you run Docker (on Linux of course) the Container runs a Windows VM internally and then runs Selenium on Windows from there.
Whoa, VM inside Docker...
+1 DBUS_SESSION_BUS_ADDRESS=/dev/null worked for me and our build server.
One other dev does still see a hang though but might be old docker containers potentially. Definitely more stable with this magic!
This works for me as well. I can duplicate the issue by running this on my linux box (no docker, no grid):
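from selenium import webdriver
from pyvirtualdisplay import Display

count = 0
while True:
    count += 1
    print("Attempt: ", str(count).rjust(4))
    with Display(visible=False, size=(500, 500)):
        driver = None
        try:
            # Hangs here intermittently when the problem occurs.
            driver = webdriver.Chrome()
        finally:
            # driver stays None if the constructor raised.
            if driver is not None:
                driver.quit()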
Running export DBUS_SESSION_BUS_ADDRESS=/dev/null before running the above script prevents the hanging from happening.
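Because the launch blocks forever when the problem occurs, a reproduction loop like the one above never reports the hang by itself. One possible way to detect it is to run each launch in a daemon thread and give up after a deadline. This is only a sketch: `finishes_within` and `launch_and_quit_chrome` are hypothetical names, the demonstration uses sleeps as stand-ins so it runs without a browser, and a hung launch still leaves its chromedriver/chrome processes behind for the caller to clean up.

```python
import threading
import time


def finishes_within(action, timeout_s):
    """Run action() in a daemon thread; return True if it ends within timeout_s.

    Only hangs are detected: an action that raises also counts as finished.
    """
    worker = threading.Thread(target=action, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return not worker.is_alive()


def launch_and_quit_chrome():
    # Same launch/quit cycle as the script above; needs selenium and chromedriver.
    from selenium import webdriver
    driver = webdriver.Chrome()
    driver.quit()


# Demonstration with stand-in actions so the sketch runs without a browser.
print(finishes_within(lambda: time.sleep(0.1), timeout_s=2.0))  # True
print(finishes_within(lambda: time.sleep(5.0), timeout_s=0.5))  # False
```

In a real loop the call would be something like finishes_within(launch_and_quit_chrome, timeout_s=60), stopping at the first False.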
We're running regression tests using Selenium & Java via Go CD. The test agents are AWS with Centos. Firefox works just fine, Chrome shows the intermittent hang problem. In the Go task config we set the environment variable DBUS_SESSION_BUS_ADDRESS to /dev/null and now Chrome works just fine.
Setting it to /dev/null works on Ubuntu 16.04 with Jenkins running 64-bit chrome in an Xvnc environment, running roughly 70 tests and restarting chrome around 100 times. Before, it hung every 2nd or 3rd run.
Likewise. Running this in a loop would reliably fail within a couple of minutes:
ruby -e "require 'selenium-webdriver'; Selenium::WebDriver.for(:chrome).navigate.to 'http://google.com'"
Changing it to the following appears to have magically fixed things 🎉
DBUS_SESSION_BUS_ADDRESS=/dev/null ruby -e "require 'selenium-webdriver'; Selenium::WebDriver.for(:chrome).navigate.to 'http://google.com'"
Thanks @pwaller, it works for me.
This just started happening to me with the latest version of docker, and DBUS_SESSION_BUS_ADDRESS=/dev/null solved the problem.
Hi,
I have been facing this issue despite using node-chrome:3.1.0, which should already be immune according to this line in the Dockerfile:
https://github.com/SeleniumHQ/docker-selenium/blob/3.1.0-astatine/NodeChrome/Dockerfile#L61
However, if I run docker run selenium/node-chrome:3.1.0 env|grep DBUS I see no output.
The environment setting is there, though:
test@host:~$ docker run selenium/node-chrome:3.1.0 env|grep DBUS
test@host:~$ docker run selenium/node-chrome:3.1.0 cat /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
DBUS_SESSION_BUS_ADDRESS=/dev/null
test@host:~$
Specifying DBUS_SESSION_BUS_ADDRESS as a container environment variable seems to solve the problem.
Do the variables defined in /etc/environment get into the environment as intended?
I am running Docker 1.9.1 and docker-compose 1.8.1 on an Ubuntu 14.04.5 with Gnome desktop.
Best regards,
Csaba
The Debian initialization scripts do expand /etc/environment into environment variables.
But in Docker you have your own entrypoint, be it bash or the JVM, or in your case the env command.
Previously the variable was set the correct way, using ENV, which Docker handles itself: https://github.com/SeleniumHQ/docker-selenium/commit/88584def77585863a226ca72bbce0f9dce561242
But for some reason it was changed to this unreliable approach, which depends on your entrypoint and command.
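To check this by script rather than by eye, here is a small Python sketch. It assumes the file uses plain KEY=VALUE lines with optional quotes, as in the output above; `parse_environment_file` and `missing_from_process` are hypothetical helper names, and the sample text is shortened from that output.

```python
import os


def parse_environment_file(text):
    """Parse KEY=VALUE lines in the style of /etc/environment."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        variables[key.strip()] = value
    return variables


def missing_from_process(file_vars, environ=None):
    """Return names defined in the file but absent from the process environment."""
    environ = os.environ if environ is None else environ
    return sorted(name for name in file_vars if name not in environ)


sample = '''PATH="/usr/local/sbin:/usr/local/bin"
DBUS_SESSION_BUS_ADDRESS=/dev/null
'''
file_vars = parse_environment_file(sample)
# Simulated process environment that never read the file:
print(missing_from_process(file_vars, environ={"PATH": "/usr/bin"}))
# prints ['DBUS_SESSION_BUS_ADDRESS']
```

Run inside the container with the real file contents and the default `environ`, any name it prints is defined on disk but never reached the process.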
"But for some reason it was changed"
It looks like the reason was to make sudo work, because sudo happens to read /etc/environment: https://github.com/SeleniumHQ/docker-selenium/issues/358
It seems like the correct answer to make sudo work correctly would be to instead add the following to sudoers, or to invoke sudo with --preserve-env (aka -E), and to use ENV in the Dockerfile as before 88584de.
Defaults env_keep += "DBUS_SESSION_BUS_ADDRESS"
/cc @a-k-g and @Sovetnikov from #358.
I'm facing the same issue using docker compose. Trying to specify the var like so:
selenium:
  image: selenium/standalone-chrome-debug:latest
  environment:
    - DBUS_SESSION_BUS_ADDRESS=/dev/null
  ports:
    - 4444:4444
    - 5900:5900
  extra_hosts:
    - "app.docker:172.17.0.1"
It is generally fine locally but hangs in about 1 in 2 builds on Travis. Am I missing something?
One thing that stopped the hanging for me was adding cls.driver.close() before cls.driver.quit()
# Django
@classmethod
def tearDownClass(cls):
    cls.driver.close()
    super(BaseTestCase, cls).tearDownClass()
    cls.driver.quit()
I'm still encountering this hanging issue with selenium/hub:3.3 and selenium/node-chrome:3.3 (and I am using the /dev/shm:/dev/shm, as instructed).
Previous session ends cleanly:
ch1_1 | 15:21:01.253 INFO - Executing: [close window])
ch1_1 | 15:21:01.313 INFO - Done: [close window]
ch1_1 | 15:21:01.319 INFO - Executing: [delete session: 09c90232-daa5-4742-a0fc-78266faee5cd])
ch1_1 | 15:21:01.334 INFO - Done: [delete session: 09c90232-daa5-4742-a0fc-78266faee5cd]
A new session starts, but hangs for a couple of minutes (note the timestamps) before doing anything:
ch1_1 | 15:21:01.253 INFO - Executing: [close window])
ch1_1 | 15:21:01.313 INFO - Done: [close window]
ch1_1 | 15:21:01.319 INFO - Executing: [delete session: 09c90232-daa5-4742-a0fc-78266faee5cd])
ch1_1 | 15:21:01.334 INFO - Done: [delete session: 09c90232-daa5-4742-a0fc-78266faee5cd]
selenium-hub_1 | 15:21:01.339 INFO - Got a request to create a new session: Capabilities [{browserName=chrome, javascriptEnabled=true, version=, platform=ANY}]
selenium-hub_1 | 15:21:01.340 INFO - Trying to create a new session on test slot {seleniumProtocol=WebDriver, browserName=chrome, maxInstances=1, version=57.0.2987.133, applicationName=, platform=LINUX}
ch1_1 | 15:21:01.343 INFO - Executing: [new session: Capabilities [{browserName=chrome, javascriptEnabled=true, version=, platform=ANY}]])
ch1_1 | 15:21:01.343 INFO - Creating a new session for Capabilities [{browserName=chrome, javascriptEnabled=true, version=, platform=ANY}]
ch1_1 | Starting ChromeDriver 2.29.461571 (8a88bbe0775e2a23afda0ceaf2ef7ee74e822cc5) on port 13298
ch1_1 | Only local connections are allowed.
selenium-hub_1 | 15:22:17.417 INFO - Got a request to create a new session: Capabilities [{browserName=chrome, javascriptEnabled=true, version=, platform=ANY}]
selenium-hub_1 | 15:23:15.382 INFO - Trying to create a new session on test slot {seleniumProtocol=WebDriver, browserName=chrome, maxInstances=1, version=57.0.2987.133, applicationName=, platform=LINUX}
ch1_1 | 15:23:15.386 INFO - Executing: [new session: Capabilities [{browserName=chrome, javascriptEnabled=true, version=, platform=ANY}]])
ch1_1 | 15:23:15.387 INFO - Creating a new session for Capabilities [{browserName=chrome, javascriptEnabled=true, version=, platform=ANY}]
ch1_1 | Starting ChromeDriver 2.29.461571 (8a88bbe0775e2a23afda0ceaf2ef7ee74e822cc5) on port 1620
ch1_1 | Only local connections are allowed.
ch1_1 | 15:23:15.796 INFO - Detected dialect: OSS
...
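Gaps like the one above are easy to miss in a long log. The following Python sketch flags them automatically. It assumes the HH:MM:SS.mmm timestamp format seen in these logs and does not handle logs that cross midnight; `find_gaps` and the 30-second threshold are illustrative choices, and the sample lines are abbreviated from the log above.

```python
from datetime import datetime
import re

TIMESTAMP = re.compile(r"\b(\d{2}:\d{2}:\d{2}\.\d{3})\b")


def find_gaps(lines, threshold_seconds=30.0):
    """Yield (gap_seconds, previous_line, line) for consecutive timestamped
    lines that are more than threshold_seconds apart."""
    previous_time = previous_line = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # e.g. ChromeDriver banner lines carry no timestamp
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous_time is not None:
            gap = (current - previous_time).total_seconds()
            if gap > threshold_seconds:
                yield gap, previous_line, line
        previous_time, previous_line = current, line


sample_log = [
    "ch1_1 | 15:21:01.343 INFO - Creating a new session",
    "ch1_1 | Starting ChromeDriver 2.29.461571 on port 13298",
    "selenium-hub_1 | 15:22:17.417 INFO - Got a request to create a new session",
    "selenium-hub_1 | 15:23:15.382 INFO - Trying to create a new session on test slot",
]
for gap, before, after in find_gaps(sample_log):
    print(f"{gap:.3f}s of silence before: {after}")
```

On the sample lines this reports two silent stretches, of roughly 76 and 58 seconds.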
We had a similar issue while running high-load tests with many Chrome instances working in parallel.
After a bit of research we found that setting DBUS_SESSION_BUS_ADDRESS to /dev/null is a bad idea, because Chrome uses D-Bus a lot. You can see this with the busctl command while Chrome is running on a systemd Linux host:
:1.569 16477 chrome root :1.569 docker-e7bb264286f8b53... - -
:1.570 16476 chrome root :1.570 docker-e7bb264286f8b53... - -
:1.571 16474 chrome root :1.571 docker-e7bb264286f8b53... - -
:1.572 16475 chrome root :1.572 docker-e7bb264286f8b53... - -
:1.573 16473 chrome root :1.573 docker-e7bb264286f8b53... - -
:1.574 16489 chrome root :1.574 docker-e7bb264286f8b53... - -
:1.575 16478 chrome root :1.575 docker-e7bb264286f8b53... - -
:1.576 16472 chrome root :1.576 docker-e7bb264286f8b53... - -
:1.609 16474 chrome root :1.609 docker-e7bb264286f8b53... - -
:1.610 16478 chrome root :1.610 docker-e7bb264286f8b53... - -
:1.611 16473 chrome root :1.611 docker-e7bb264286f8b53... - -
:1.612 16477 chrome root :1.612 docker-e7bb264286f8b53... - -
:1.613 16489 chrome root :1.613 docker-e7bb264286f8b53... - -
:1.614 16475 chrome root :1.614 docker-e7bb264286f8b53... - -
:1.615 16472 chrome root :1.615 docker-e7bb264286f8b53... - -
:1.616 16476 chrome root :1.616 docker-e7bb264286f8b53... - -
:1.617 16477 chrome root :1.617 docker-e7bb264286f8b53... - -
:1.618 16474 chrome root :1.618 docker-e7bb264286f8b53... - -
:1.619 16473 chrome root :1.619 docker-e7bb264286f8b53... - -
:1.620 16472 chrome root :1.620 docker-e7bb264286f8b53... - -
:1.621 16478 chrome root :1.621 docker-e7bb264286f8b53... - -
:1.622 16476 chrome root :1.622 docker-e7bb264286f8b53... - -
:1.623 16475 chrome root :1.623 docker-e7bb264286f8b53... - -
:1.624 16489 chrome root :1.624 docker-e7bb264286f8b53... - -
The DBUS_SESSION_BUS_ADDRESS variable sets the D-Bus socket address, yet here it is simply being set to /dev/null. After spending a lot of time tracing a Chrome instance running in a Docker container, we arrived at the following container setup:
docker run -d -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /var/run/dbus/:/var/run/dbus/ \
-v /tmp:/tmp \
-v /dev/shm:/dev/shm \
-e DBUS_SESSION_BUS_ADDRESS="unix:path=/var/run/dbus/system_bus_socket" \
--privileged \
-p 4444:4444 \
-p 5900:5900 \
docker.io/selenium/standalone-chrome-debug:3.3.1-cesium
It works well with systemd distributions like CentOS 7 or Ubuntu Server 17.04 because /var/run/dbus/system_bus_socket is systemd dbus socket. On other systems (like OS X or non-systemd linux) you must use another socket address.
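Since the socket path differs between hosts, a launcher script could check for it before adding the mount. This is a hedged Python sketch under that assumption: `dbus_docker_args` is a hypothetical helper, and it only builds an argument list without running Docker.

```python
import os

SYSTEM_BUS_SOCKET = "/var/run/dbus/system_bus_socket"


def dbus_docker_args(socket_path=SYSTEM_BUS_SOCKET):
    """Return extra `docker run` arguments that share the host bus socket,
    or an empty list when the socket does not exist on this host."""
    if not os.path.exists(socket_path):
        return []
    socket_dir = os.path.dirname(socket_path)
    return [
        "-v", f"{socket_dir}:{socket_dir}",
        "-e", f"DBUS_SESSION_BUS_ADDRESS=unix:path={socket_path}",
    ]


command = ["docker", "run", "-d", *dbus_docker_args(),
           "docker.io/selenium/standalone-chrome-debug:3.3.1-cesium"]
print(" ".join(command))
```

On a host without the socket the command simply omits the D-Bus options, leaving the choice of another address to the caller.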
@vv-p That sounds like a great discovery! Can you think of a DBUS configuration that's portable across different Linux distributions (and running on MacOS, etc.)? I'm not familiar with how DBUS works.
@vv-p @peterstory please see this comment
@diemol as I wrote before, setting DBUS_SESSION_BUS_ADDRESS to /dev/null is terribly bad. So yes, you need to install and run D-Bus inside the container, or link it in from outside, to get Chrome working stably.
@vv-p But why is it terribly bad? I just want to understand how this affects the tests executed in the container.
If it is bad as in we are having errors in the logs but at the end the browser is working properly for the test, the question is, would it be ok to live with that?
Or is it bad as in the tests cannot be executed because Chrome does not work at all? If so, would be cool to have a way to reproduce it and then we can install dbus just how @elgalu did it for his image. We need to be aware that then one more process would run inside the container and I am not sure how the performance can be affected by that.
I mean, at the end, there is a way to find a solution. In my opinion we should try to find one that is practical and maintainable.
The second one. I don't think dbus affects container performance.
Anecdotal: we've been running Chrome in docker for all our Jenkins builds (with containers based on the jenkinsci/blueocean image) and it's been stable ever since I set DBUS_SESSION_BUS_ADDRESS=/dev/null in mid-January.
Also experiencing this issue in Windows 10 (no container) when using Beta Chrome release 60.0.3112.32, but only when running chrome in the new --headless mode (which chromedriver doesn't ostensibly support yet, admittedly, but for my very simple test case it seems fine when I do get chromedriver launched).
Launching chromedriver from Python with --headless and --disable-gpu gives me the intermittent error on around 2 of 30 tries, with or without --no-sandbox (which I've seen suggested a lot when googling).
As an aside, adding --remote-debugging-port=9222 breaks it since chromedriver generates its own port to try and access devtools on, but that's on me for using chromedriver for headless chrome before it's supported, and not relevant to my use case nor probably this issue.
With --headless and --disable-gpu, it simply silently fails to connect as others have observed:
[2.039][INFO]: Launching chrome: "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"
--disable-background-networking --disable-client-side-phishing-detection --disable-default-apps
--disable-gpu --disable-hang-monitor --disable-infobars --disable-notifications --disable-popup-blocking
--disable-prompt-on-repost --disable-setuid-sandbox --disable-sync --disable-web-resources
--enable-automation --enable-logging --force-fieldtrials=SiteIsolationExtensions/Control
--headless --ignore-certificate-errors --load-component-extension="C:\Users\kkoponen\AppData\Local\Temp\scoped_dir29680_29077\internal"
--log-level=0 --metrics-recording-only --no-first-run --no-sandbox --password-store=basic
--remote-debugging-port=12215 --safebrowsing-disable-auto-update --test-type=webdriver
--use-mock-keychain --user-data-dir="C:\Users\kkoponen\AppData\Local\Temp\scoped_dir29680_13552" data:,
[2.050][DEBUG]: DevTools request: http://localhost:12215/json/version
[4.060][DEBUG]: DevTools request failed
[4.111][DEBUG]: DevTools request: http://localhost:12215/json/version
[4.312][DEBUG]: DevTools request failed
[4.363][DEBUG]: DevTools request: http://localhost:12215/json/version
[6.118][DEBUG]: DevTools request failed
[6.168][DEBUG]: DevTools request: http://localhost:12215/json/version
[6.368][DEBUG]: DevTools request failed
As I understand it, that linked patch isn't relevant to Windows users.
Does anyone have any ideas about preventing this on Windows?
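The polling that chromedriver logs above can be reproduced by hand to check whether Chrome's DevTools endpoint ever comes up. This is a minimal Python sketch, not chromedriver's actual implementation: `wait_for_devtools` is a hypothetical helper, and the timeouts are arbitrary. The /json/version path is the one shown in the log.

```python
import http.client
import json
import time
import urllib.error
import urllib.request


def wait_for_devtools(port, total_timeout=10.0, interval=0.25, host="localhost"):
    """Poll http://host:port/json/version until it answers or time runs out.

    Returns the decoded JSON on success, or None if DevTools never responded.
    """
    url = f"http://{host}:{port}/json/version"
    deadline = time.monotonic() + total_timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as response:
                return json.load(response)
        except (urllib.error.URLError, http.client.HTTPException,
                OSError, ValueError):
            time.sleep(interval)  # not up yet (or not JSON); try again
    return None


if __name__ == "__main__":
    # 12215 is the --remote-debugging-port from the log above.
    print(wait_for_devtools(12215, total_timeout=3.0))
```

If this returns None while a chrome.exe process is visibly running, the hang is in Chrome's startup rather than in chromedriver's connection logic.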
This seems to be a duplicate of #89 which apparently has a fix.
Though this seems to be related to docker I'd like to share my experience with a local Selenium 3.4.0 standalone server without docker.
The symptoms were the same: Selenium was stuck when creating a new session.
It turns out the issue is no longer happening using Chrome 59.
See my MCVE with minimal client code, setup descriptions and observations: https://github.com/tholewebgods/selenium-new-session-freeze-mcve
All,
I may have some further insight into this behavior. I found that, when we were automating selenium to verify a google-login page, after the redirection occurred, we did a driver.get(...) to the page it was redirected to. This caused the driver to throw a TimeOutException waiting for a get to occur, when in reality, it never performed the get. So moral of the story? Don't do that!
Adding DBUS_SESSION_BUS_ADDRESS=/dev/null as an environment variable in docker-compose totally fixed the problem! Now, I feel that it is running faster than it was!
Now my question is, how can I set this variable in circleCI :(
@hutber Go to your project settings in Circle CI, click the link on the left for "Environment Variables", then add the key/value pair.
I don't believe this issue is strictly related to Docker, and I get different results depending on the browser driver I use. I invoke the NUnit console runner from AWS Run Command (send-command) on a remote EC2 instance. Selenium fails to navigate to a URL for the first one or two tests (inconsistently). My workaround: I made a separate TestFixture with an Order attribute of 1. The fixture contains two tests, each of which calls driver.Navigate().GoToUrl(app) and then Assert.Pass(). After that, all remaining tests run fine. Sometimes geckodriver still fails. I'm not using Selenium Grid currently, but it will be necessary to implement it.
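The warm-up idea is not specific to NUnit. As a language-neutral illustration, here is a minimal Python sketch, assuming the navigation is passed in as a callable. `warm_up` and `flaky_navigate` are hypothetical names; the flaky function only simulates a driver that fails on its first call.

```python
def warm_up(navigate, attempts=2):
    """Call navigate() `attempts` times, swallowing failures.

    Returns the number of calls that failed, so the caller can log it.
    """
    failures = 0
    for _ in range(attempts):
        try:
            navigate()
        except Exception:  # the early failures are the ones being absorbed
            failures += 1
    return failures


calls = {"count": 0}


def flaky_navigate():
    # Hypothetical stand-in for a real navigation: fails on the first call only.
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("navigation timed out")


print(warm_up(flaky_navigate))  # prints 1
```

This hides the early failures rather than fixing their cause, so logging the returned count keeps the flakiness visible.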
DBUS_SESSION_BUS_ADDRESS=/dev/null
How does this work in Excel VBA (Windows)?
Where should I enter this setting, and what is the syntax in VBA?
SeleniumBasic WebDriver 2.0.9.0 (for vba) and latest Chrome Driver (2.37) (16/03/2018).
Windows 7 64-bit.
I have a solution: I stopped using selenium and went a different way.
I hope the developers have read this and are looking for a solution to the problem, which is present on both Windows and Linux.
Hi @PS1Online,
I am not sure how this could apply to your context. This env var was necessary many releases ago for the docker-selenium images. You seem to be running in a completely different environment (Excel VBA).
Perhaps the simplest way is for you to join https://seleniumhq.herokuapp.com, where there are lots of people who can potentially help you.
For me, this NEVER happens if I start chrome from inside an X11 session. No matter if it runs against Xvfb or not, or whether Xvfb has been started on the console. The problem appears to be X11 resource usage related to a live/real X11 user desktop session. I traced it down to DBUS. Setting DBUS_SESSION_BUS_ADDRESS=/some/nonsense in Jenkins fixed my testing there and chrome/chromium are starting up there again without any problems at all.........