Hi,
I am running docker-selenium on CentOS 7 and test case execution is very slow. On a Mac, docker-selenium runs as fast as a normal Grid node setup, but when I run the tests against the Docker hub on CentOS they are very slow.
Has anyone else faced this issue on the latest Docker?
Docker Version: 1.9.1
Docker Machine version: docker-machine version 0.5.2 ( 0456b9f )
Docker Compose version 1.5.2, build 7240ff3
Hi,
We're suffering from a very similar issue. A full test run on a late 2013 MacBook Pro Retina (i7) with a VMware Fusion Docker host (Ubuntu 14.04.2 LTS, 2 cores + 2 GB of RAM) passes in around 35 minutes.
However, when we moved this to our server infrastructure, an Ubuntu 15.10 VM inside an ESXi 5.x host with 4 Xeon cores and 4 GB of RAM, the tests take around 1 hour 40 minutes to run (not pass). A lot of them fail, we think because of very high CPU usage and the machine struggling.
Watching the execution using the -debug chrome image and VNC Viewer we can see it is very slow to render pages.
Our gut instinct, comparing developer machines to the server, is that Chrome isn't getting any kind of GPU acceleration. We haven't ruled out proxy issues etc., but that seems less likely as everything is inside Docker and networked together.
Docker version 1.9.1
selenium docker images are latest
First of all, check the disk load and use tmpfs for /tmp. Chrome becomes significantly faster when its cache is placed in RAM.
Hi,
My machine configuration is listed below. My test execution time is about 3x the normal execution time. I am using 3 nodes to execute 40 test cases, which takes 35 minutes on the CentOS machine, but on my Mac (i7, 16 GB RAM) it takes 9 minutes with the same number of nodes.
When I try to run it on the CentOS machine with 10 nodes, no tests run; they all time out.
System Memory Details----------------------------------------
MemTotal: 1884808 kB
MemFree: 75176 kB
MemAvailable: 158568 kB
Buffers: 0 kB
Cached: 169076 kB
SwapCached: 358900 kB
Active: 681668 kB
Inactive: 847688 kB
Active(anon): 597608 kB
Inactive(anon): 765208 kB
Active(file): 84060 kB
Inactive(file): 82480 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2097148 kB
SwapFree: 84228 kB
Dirty: 248 kB
Writeback: 0 kB
AnonPages: 1001476 kB
Mapped: 59972 kB
Shmem: 2476 kB
Slab: 130432 kB
SReclaimable: 58952 kB
SUnreclaim: 71480 kB
KernelStack: 16848 kB
PageTables: 54640 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 3039552 kB
Committed_AS: 7059724 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 163280 kB
VmallocChunk: 34359543808 kB
HardwareCorrupted: 0 kB
AnonHugePages: 38912 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 124864 kB
DirectMap2M: 1972224 kB
df -k /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/centos-root 49746196 12384072 37362124 25% /
CPU Info-----------------------------------------
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
stepping : 2
microcode : 0x2d
cpu MHz : 2294.686
cache size : 25600 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 15
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi ept vpid fsgsbase smep
bogomips : 4589.37
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
@dgangwar
You are comparing execution time on an i7 with 16 GB of RAM against an E5 with less than 2 GB. Am I right?
I can't see how your swap usage changes over time, but it seems to me that you don't have enough memory:
SwapTotal: 2097148 kB
SwapFree: 84228 kB
and practically no free or cached memory.
Try running the tests while watching swap I/O and CPU iowait time.
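To put a number on the swap pressure described above, here is a minimal Python sketch that computes swap usage from /proc/meminfo-style text. The function names and the `SAMPLE` string are illustrative assumptions; the four values in `SAMPLE` are copied from the dump earlier in the thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_usage(meminfo):
    """Return (used_kb, used_fraction) based on SwapTotal and SwapFree."""
    total = meminfo.get("SwapTotal", 0)
    free = meminfo.get("SwapFree", 0)
    used = total - free
    fraction = used / total if total else 0.0
    return used, fraction


# Values copied from the meminfo dump posted in this thread.
SAMPLE = """\
MemTotal: 1884808 kB
MemAvailable: 158568 kB
SwapTotal: 2097148 kB
SwapFree: 84228 kB
"""

if __name__ == "__main__":
    used_kb, fraction = swap_usage(parse_meminfo(SAMPLE))
    print(f"swap used: {used_kb} kB ({fraction:.0%})")
```

On a live host the same functions can be fed the contents of /proc/meminfo instead of `SAMPLE`. A single snapshot only shows how full swap is; the swap I/O rate still has to come from something like vmstat.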
@apakhomov
After increasing the RAM to 64 GB I still could not reduce the test execution time. It now takes around 28 minutes to execute 37 test cases with 3 nodes, whereas the MacBook (i7, 16 GB) still finishes within 10 minutes.
One difference between these systems is that the MacBook has more CPUs than the Linux machine.
Do you think this could be the problem?
MAC-------
dgangwar$ sysctl hw
hw.ncpu: 8
hw.byteorder: 1234
hw.memsize: 17179869184
hw.activecpu: 8
hw.physicalcpu: 4
hw.physicalcpu_max: 4
hw.logicalcpu: 8
hw.logicalcpu_max: 8
hw.cputype: 7
hw.cpusubtype: 8
hw.cpu64bit_capable: 1
hw.cpufamily: 280134364
LINUX--------------
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 2294.686
BogoMIPS: 4589.37
Virtualization: VT-x
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0
@dgangwar
Show "vmstat -a 2", the top/atop/htop header with expanded CPU info, and "iostat -dx 2" (all while the tests are running).
@apakhomov
We only have single-CPU Linux machines right now, but I can try a machine with more CPUs if that may be the problem.
vmstat -a 2 ->>>>>>>>>>>>>>>>>>>>>>>
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
11 0 0 51067292 5061112 5668864 0 0 212 504 1231 1777 11 4 84 0 0
9 0 0 51005736 5082844 5707828 0 0 0 2826 1978 4278 52 11 36 2 0
0 0 0 50971608 5080904 5743480 0 0 0 266 4843 3429 64 9 27 0 0
top ->>>>>>>>>>>>>>>>>>
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30310 dgangwar 20 0 1122228 94080 34600 S 29.3 0.1 0:00.91 chrome
29279 dgangwar 20 0 1128816 162572 41604 S 21.0 0.2 0:09.25 chrome
29887 dgangwar 20 0 1121932 153352 33788 S 15.7 0.2 0:04.82 chrome
29084 dgangwar 20 0 586764 88268 55396 R 6.0 0.1 0:01.33 chrome
30193 dgangwar 20 0 581132 77816 48592 S 4.3 0.1 0:00.33 chrome
23632 vagrant 20 0 3575444 99960 42148 S 2.7 0.2 8:05.09 VBoxHeadless
29803 dgangwar 20 0 579852 76136 48560 S 2.7 0.1 0:00.74 chrome
29905 dgangwar 20 0 184196 44012 14148 S 2.7 0.1 0:10.30 Xvfb
784 root 20 0 4372 592 496 S 2.3 0.0 1:34.18 rngd
29069 dgangwar 20 0 385696 8324 6440 S 2.0 0.0 0:00.63 chromedriver-2.
30431 dgangwar 20 0 184196 43912 14148 S 1.3 0.1 0:07.08 Xvfb
29797 dgangwar 20 0 385696 8272 6436 S 0.7 0.0 0:00.30 chromedriver-2.
30509 dgangwar 20 0 184132 43952 14156 S 0.7 0.1 0:07.87 Xvfb
11 root 20 0 0 0 0 R 0.3 0.0 0:05.92 rcuos/0
iostat -dx 2 ->>>>>>>>>
Linux 3.10.0-327.3.1.el7.x86_64 23/12/15 _x86_64_ (1 CPU)
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.01 0.66 2.56 8.76 78.99 330.91 72.40 0.06 5.64 3.04 6.40 0.81 0.92
dm-0 0.00 0.00 2.46 9.35 72.60 323.06 66.99 0.07 5.75 3.11 6.45 0.77 0.91
dm-1 0.00 0.00 0.01 0.00 0.17 0.00 39.37 0.00 0.82 0.82 0.00 0.66 0.00
dm-2 0.00 0.00 3.91 6.77 131.85 160.49 54.71 0.01 1.08 0.31 1.52 0.23 0.24
dm-3 0.00 0.00 0.12 0.06 3.92 1.84 63.75 0.00 1.56 1.22 2.22 0.79 0.01
dm-4 0.00 0.00 0.16 0.07 3.82 1.92 51.14 0.00 0.57 0.28 1.26 0.38 0.01
dm-5 0.00 0.00 0.43 0.56 18.63 11.60 60.80 0.00 1.92 1.64 2.14 0.73 0.07
dm-7 0.00 0.00 0.43 0.87 18.65 18.06 56.22 0.00 2.07 2.39 1.91 0.94 0.12
dm-6 0.00 0.00 0.43 0.73 18.64 14.79 57.38 0.00 2.14 2.82 1.73 0.89 0.10
dm-8 0.00 0.00 0.43 0.58 18.63 12.22 60.64 0.00 2.07 2.37 1.85 0.90 0.09
dm-9 0.00 0.00 0.43 0.48 18.63 9.90 62.44 0.00 2.05 2.41 1.72 0.87 0.08
Could you provide a longer vmstat output? 3 lines is not enough. You can increase the interval to 5-10 seconds and take 20-30 or more measurements to see how CPU usage changes over time. Judging by the 3rd line, CPU idle is only 27% and is probably still falling.
How many threads/browsers do your tests use? You will likely need to add cores.
@apakhomov I am running 3 browsers in parallel. I will try to get a machine with more cores and try again.
vmstat -a 10 20
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
10 0 0 47149236 6259680 8206400 0 0 169 456 1218 235 10 4 86 0 0
4 0 0 46928536 6256164 8430304 0 0 82 4240 2750 2615 69 5 26 0 0
4 0 0 46935240 6258192 8421368 0 0 0 529 2461 2307 48 3 48 0 0
2 0 0 46928140 6263056 8423372 0 0 0 11878 2882 3084 71 6 23 0 0
0 0 0 47175648 6223544 8218032 0 0 0 1231 2336 2540 62 4 34 0 0
0 0 0 47177992 6223344 8215560 0 0 0 705 2081 2542 49 3 48 0 0
0 0 0 47188460 6223668 8205020 0 0 0 8142 2625 2856 59 4 36 1 0
0 0 0 47189932 6225764 8202296 0 0 0 1173 2109 2615 54 4 42 0 0
0 0 0 47194840 6228796 8193536 0 0 0 1139 2436 2661 58 3 39 0 0
0 0 0 47174776 6231196 8212164 0 0 0 976 2293 2455 60 4 37 0 0
6 0 0 47171472 6234908 8211608 0 0 0 1656 2469 2578 63 3 33 0 0
1 0 0 47198740 6234960 8183636 0 0 0 200 2596 2758 59 4 37 0 0
6 0 0 47180492 6235400 8201684 0 0 0 1345 2156 2667 61 4 35 0 0
0 0 0 47174580 6235932 8206248 0 0 0 632 2454 2740 53 4 43 0 0
0 0 0 47195208 6235996 8186980 0 0 1 409 1577 2461 39 3 58 0 0
0 0 0 46981336 6272448 8361708 0 0 0 2819 2127 2634 39 5 56 0 0
0 0 0 46906676 6273964 8434256 0 0 0 1332 2075 2453 44 3 53 0 0
6 0 0 46892484 6279140 8443024 0 0 0 2850 2477 2654 54 5 41 0 0
0 0 0 46832508 6281572 8499820 0 0 0 8882 2627 2792 63 4 32 0 0
0 0 0 47134704 6241460 8240628 0 0 2 863 2370 2582 53 4 43 0 0
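For anyone who wants to summarize a vmstat capture like the one above rather than eyeball it, here is a minimal Python sketch. It assumes the 17-column layout of "vmstat -a" shown in this thread; the function names are hypothetical, and the sample rows are the first three data rows of the output above. The first data row is skipped by default because vmstat's first report gives averages since boot.

```python
# Column order of "vmstat -a" as shown in this thread.
COLUMNS = ["r", "b", "swpd", "free", "inact", "active", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa", "st"]


def parse_vmstat(text):
    """Return one dict per numeric data row; header lines are ignored."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def summarize(rows, skip_first=True):
    """Average the CPU columns and report the largest run queue seen."""
    samples = rows[1:] if skip_first and len(rows) > 1 else rows
    if not samples:
        raise ValueError("no vmstat data rows found")
    count = len(samples)
    summary = {key: sum(s[key] for s in samples) / count
               for key in ("us", "sy", "id", "wa")}
    summary["max_runnable"] = max(s["r"] for s in samples)
    return summary


SAMPLE = """\
r b swpd free inact active si so bi bo in cs us sy id wa st
10 0 0 47149236 6259680 8206400 0 0 169 456 1218 235 10 4 86 0 0
4 0 0 46928536 6256164 8430304 0 0 82 4240 2750 2615 69 5 26 0 0
4 0 0 46935240 6258192 8421368 0 0 0 529 2461 2307 48 3 48 0 0
"""

if __name__ == "__main__":
    print(summarize(parse_vmstat(SAMPLE)))
```

A `max_runnable` value that stays above the number of CPUs while `wa` stays near zero points at CPU contention rather than disk or swap.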
@apakhomov
I got the CPU count increased from 1 to 2 but still do not see any improvement in test execution. I also tried attaching a normal node to the Docker grid hub and could see that test commands and rendering on the non-Docker node are very, very slow.
@philjones88, did you have any success in improving the test execution time?
I increased from 2 to 4 CPUs (Xeon in ESXi) and 8 GB of RAM. The speed is still nowhere near what my old MBP achieves.
My plan for next year is to see if it's just because it's a few-year-old Xeon and ESXi hardware. We are getting a brand new dual-CPU Xeon server with 256 GB of RAM and a few TB of storage and SSD space. I will migrate my Docker host VM to this and see if it makes a difference.
If you look at chrome://gpu when connected to a debug Chrome node, on my MBP I can see GPU is enabled, but on the Xeon server there are lots of disabled features and warnings.
I still think the tests are slower because the DOM and other work has to be done on the CPU.
If the new server doesn't speed up execution, our next step will be to try a server GPU or simply buy a few desktop-grade computers to try.
Has anyone found a solution for this one? I am facing a similar issue while dockerizing Watir-Webdriver tests, using just a single container.
Unfortunately not on my end. We threw brand new expensive hardware at the issue and it now seems closer to what the developer MBPs get.
We get the same GPU result in Chrome even with the latest ESXi version.
Still heavy CPU usage, but the new Xeons have more power so, kind of a fix...
Thanks for the reply Phil! :)
I too am looking into some solutions (to avoid adding more hardware) and will get back if I do find something.
Thanks and Regards,
Trupti Potdar
Try this solution and please report the result.
We use this config, and Chrome in Docker is significantly faster than our previous solution with KVM guests.
pre_entry_point.sh:
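#!/bin/bash
# Re-mount /dev/shm and /tmp as RAM-backed tmpfs, then start the Selenium node.
# Needs a privileged container (see docker-compose.yml below).
echo -e "Re-mounting shm and tmp\n\n"
# Lazily detach the existing mounts.
sudo umount -l /dev/shm
sudo umount -l /tmp
# Mount fresh tmpfs filesystems with explicit size limits.
sudo mount -t tmpfs -o size=1280m tmpfs /dev/shm
sudo mount -t tmpfs -o size=768m tmpfs /tmp
echo -e "\n\n"
# Print the mount table and disk usage so the result shows up in the logs.
cat /proc/mounts
echo -e "\n\n"
df -h
echo -e "\n\nStarting node\n\n"
# Source the image's original entry point to start the node.
. /opt/bin/entry_point.sh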
docker-compose.yml:
node-chrome:
  image: selenium/node-chrome-debug:2.48.2
  volumes:
    - ./pre_entry_point.sh:/opt/bin/pre_entry_point.sh
  privileged: true
  command:
    - /opt/bin/pre_entry_point.sh
  environment:
    - HUB_PORT_4444_TCP_ADDR=selenium-grid
    - HUB_PORT_4444_TCP_PORT=4444
    - TZ=Europe/Moscow
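The script above prints /proc/mounts so the re-mount can be checked by eye. As a rough sketch of automating that check, the Python below reports which of /dev/shm and /tmp are not on tmpfs. The function names are hypothetical, and `SAMPLE` is a made-up mount table (not output from this thread) whose sizes simply mirror the 1280m and 768m used in the script.

```python
def mount_types(proc_mounts_text):
    """Map each mount point to its filesystem type (the last entry wins)."""
    types = {}
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3:
            types[fields[1]] = fields[2]
    return types


def not_on_tmpfs(proc_mounts_text, paths=("/dev/shm", "/tmp")):
    """Return the paths that are not mounted as tmpfs."""
    types = mount_types(proc_mounts_text)
    return [path for path in paths if types.get(path) != "tmpfs"]


# Hypothetical mount table for illustration only.
SAMPLE = """\
overlay / overlay rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,relatime,size=1310720k 0 0
tmpfs /tmp tmpfs rw,relatime,size=786432k 0 0
"""

if __name__ == "__main__":
    missing = not_on_tmpfs(SAMPLE)
    print("all on tmpfs" if not missing else f"not on tmpfs: {missing}")
```

Inside a running node container, the same functions could be given the contents of /proc/mounts in place of `SAMPLE`.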
@apakhomov: Actually, in my case I am using Watir-Webdriver and Firefox.
I've also been experiencing this issue, are there any guides on optimizing the Selenium container?
Is this issue still present with the most recent images?
Closing, assuming this issue is not present with the latest Docker and image versions. If this assumption is incorrect, we can reopen.
This issue is still happening. I tried running my tests with the 3.4.0 selenium-firefox and selenium-firefox-debug images, and they are extremely slow.
When I run the same tests using Selenium running on my Mac, they work just fine.
Can we please reopen this ticket, and could you help us investigate what could be wrong here?
@rajatjindal83 are you using the details provided in https://github.com/SeleniumHQ/docker-selenium#running-the-images?
specifically, the /dev/shm bit?
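For readers unsure what "the /dev/shm bit" looks like on the command line, here is a small Python sketch that assembles the `docker run` arguments. It assumes the two options that README section describes: sharing the host's /dev/shm with `-v /dev/shm:/dev/shm`, or giving the container its own larger shared memory with `--shm-size`. The function name, the default port mapping, and the `selenium/standalone-firefox` image are illustrative assumptions, not something taken from this thread.

```python
def build_run_command(image, shm_size=None, port=4444):
    """Build a docker run argument list with a shared-memory setting.

    With shm_size (for example "2g") the container gets its own larger
    /dev/shm; otherwise the host's /dev/shm is mounted into the container.
    """
    cmd = ["docker", "run", "-d", "-p", f"{port}:4444"]
    if shm_size:
        cmd.append(f"--shm-size={shm_size}")
    else:
        cmd += ["-v", "/dev/shm:/dev/shm"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is started.
    print(" ".join(build_run_command("selenium/standalone-firefox")))
    print(" ".join(build_run_command("selenium/standalone-firefox", shm_size="2g")))
```

The list can be passed to `subprocess.run` once the image name and size match your setup.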
Hi @ddavison
That was for Chrome, wasn't it? I can give it a shot.
On the recommendation of someone in the #selenium IRC channel, I tried the same set of tests with the Chrome Docker container (even without the /dev/shm bit), and they were pretty fast compared to the Firefox Docker container.
I think this might be a problem with the Firefox container only. Are there any commands or debug steps I can run to find more information for you?
Thanks
Rajat Jindal
That's for both Firefox and Chrome.
For me, running a container with selenium/firefox worked fine on my Ubuntu laptop (average 8 seconds per request). After moving it into AWS ECS and putting several containers on the same host, any single request was slower, and if all containers on that host were active at the same time it would come to a screeching halt (minutes/never). After mounting /dev/shm there was a big improvement, but it was still too slow to be usable (average 90 seconds per request).
Interestingly, if I launch EC2 t2.micro instances outside of ECS, each running a single Docker container (without mounting /dev/shm), they run OK but not great (maybe an average of 20 seconds per request). On a larger host with multiple containers, each using the same amount of RAM and CPU, requests are in that 90-second range if all containers are active concurrently. However, if I test on the larger host with only about half the containers active concurrently, the performance is better (closer to the 20-second range), so the slowdown only occurs when all containers on the host are active.
I have tried multiple host instance sizes, and even more than doubling the RAM per container, and I still can't get any improvement in performance. So for now I am running outside of ECS on individual t2.micro instances. I assume this has something to do with swap. Is there a way to instruct Docker to disable use of swap? It is strange that it works OK on a t2.micro with 1 GB of RAM, but on a larger host, where I give the containers 2 GB of RAM, it's slow if the host has all containers running. I'm also seeing the same behavior with Chrome.
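On the swap question, one option for a plain `docker run` (ECS task definitions are a separate matter and not covered here) is to set `--memory-swap` to the same value as `--memory`, which Docker's resource-constraint documentation describes as leaving the container with no access to swap. A minimal Python sketch that builds those flags follows; the function names and the image are hypothetical, and the 2g figure just echoes the per-container RAM mentioned above.

```python
def no_swap_flags(memory):
    """Return docker run flags that cap memory and leave no swap allowance."""
    if not memory:
        raise ValueError("memory must be a non-empty size such as '2g'")
    # Equal --memory and --memory-swap values mean the swap allowance is zero.
    return [f"--memory={memory}", f"--memory-swap={memory}"]


def build_no_swap_command(image, memory):
    """Build a docker run argument list for a container without swap."""
    return ["docker", "run", "-d"] + no_swap_flags(memory) + [image]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(build_no_swap_command("selenium/standalone-firefox", "2g")))
```

Whether this changes the ECS behaviour described above is untested here; it only removes swap as a variable when comparing runs.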
I am running the tests inside a Docker container with Chrome. With up to 2 threads the tests run fine. However, with 3 or more threads, my Chrome browsers take forever waiting for requests to complete. Is there anything I need to change in my yml when configuring the containers?
The following is my docker-compose yml file:
version: "3"
services:
  selenium-hub:
    image: selenium/hub
    container_name: selenium-hub
    environment:
      - GRID_MAX_SESSION=5
      - GRID_BROWSER_TIMEOUT=100000
      - GRID_TIMEOUT=90000
      - GRID_NEW_SESSION_WAIT_TIMEOUT=300000
    ports:
      - "4444:4444"
  chrome1:
    image: selenium/node-chrome-debug
    restart: always
    depends_on:
      - selenium-hub
    environment:
      - NODE_MAX_INSTANCES=10
      - NODE_MAX_SESSION=10
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    ports:
      - "5904:5900"
    volumes:
      - /dev/shm:/dev/shm
  chrome2:
    image: selenium/node-chrome-debug
    restart: always
    depends_on:
      - selenium-hub
    environment:
      - NODE_MAX_INSTANCES=10
      - NODE_MAX_SESSION=10
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    ports:
      - "5905:5900"
    volumes:
      - /dev/shm:/dev/shm
  chrome3:
    image: selenium/node-chrome-debug
    restart: always
    depends_on:
      - selenium-hub
    environment:
      - NODE_MAX_INSTANCES=10
      - NODE_MAX_SESSION=10
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    ports:
      - "5906:5900"
    volumes:
      - /dev/shm:/dev/shm
  firefox1:
    image: selenium/node-firefox-debug
    restart: always
    depends_on:
      - selenium-hub
    environment:
      - NODE_MAX_INSTANCES=10
      - NODE_MAX_SESSION=10
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    ports:
      - "5901:5900"
  firefox2:
    image: selenium/node-firefox-debug
    restart: always
    depends_on:
      - selenium-hub
    environment:
      - NODE_MAX_INSTANCES=10
      - NODE_MAX_SESSION=10
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    ports:
      - "5902:5900"
  firefox3:
    image: selenium/node-firefox-debug
    restart: always
    depends_on:
      - selenium-hub
    environment:
      - NODE_MAX_INSTANCES=10
      - NODE_MAX_SESSION=10
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    ports:
      - "5903:5900"