I have two grids (Selenium Grid v3 and Selenium Grid v4), each with 1 Chrome node. You can use the following docker compose file to spin up the same environment:
docker-compose.yml.zip
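For reference, here is a minimal sketch of the Grid 4 side of that setup, assuming the official selenium/hub and selenium/node-chrome images; the image tags, networking details and the Grid 3 services in the attached docker-compose.yml will differ.

# Sketch only: a Grid 4 hub with a single Chrome node (1 slot), roughly matching
# the attached compose file. Image tags are placeholders, not the exact versions used.
version: "3"
services:
  selenium-hub:
    image: selenium/hub:latest
    ports:
      - "4442:4442"   # event bus publish port
      - "4443:4443"   # event bus subscribe port
      - "4444:4444"   # router / WebDriver endpoint
  chrome:
    image: selenium/node-chrome:latest
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443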
I am sending 2 Chrome tests in parallel but experiencing different behaviour between Grid v3 and Grid v4.
nunitProject.zip
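A rough illustration of what that project does (the attached solution is the source of truth; class names, page URLs and the hub address below are assumptions):

// Sketch of two NUnit tests hitting the Grid in parallel. With a single Chrome
// slot, one session starts immediately and the other waits in the session queue.
using System;
using NUnit.Framework;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;

[assembly: LevelOfParallelism(2)]   // run up to two tests at the same time

namespace GridQueueRepro
{
    [TestFixture]
    [Parallelizable(ParallelScope.All)]
    public class ParallelChromeTests
    {
        [Test] public void FirstChromeTest() => RunOnGrid("https://www.selenium.dev");
        [Test] public void SecondChromeTest() => RunOnGrid("https://www.selenium.dev/documentation");

        private static void RunOnGrid(string url)
        {
            var options = new ChromeOptions();
            // Hub address assumed; adjust to wherever the Grid from the compose file is exposed.
            var driver = new RemoteWebDriver(new Uri("http://localhost:4444/wd/hub"), options.ToCapabilities());
            try
            {
                driver.Navigate().GoToUrl(url);
                Assert.That(driver.Title, Is.Not.Empty);
            }
            finally
            {
                driver.Quit();
            }
        }
    }
}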
Selenium 3 Grid behaviour


Logs:
selenium3GridLogs.txt
Selenium 4 Grid behaviour
After the first test has finished, I can see the GraphQL response saying there is an available slot, yet it is not being assigned to my second test:
{
  "data": {
    "grid": {
      "nodes": [
        {
          "id": "bb800483-fb81-4fea-849b-62ec4a23cb7b",
          "uri": "http://172.19.0.3:5555",
          "status": "UP",
          "sessions": [],
          "maxSession": 1,
          "capabilities": "[\n {\n \"slots\": 1,\n \"browserName\": \"chrome\"\n }\n]"
        }
      ],
      "uri": "http://172.19.0.2:4444",
      "totalSlots": 1,
      "usedSlots": 0,
      "sessionCount": 0
    }
  }
}
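For reference, that response comes from a query along these lines, POSTed to the Grid's GraphQL endpoint at http://<hub-host>:4444/graphql. The field names are taken from the response above; the sessions sub-selection is a guess since the list is empty here, and the schema can differ between Grid 4 pre-releases.

# Reconstructed query for the node/slot status shown above.
{
  grid {
    uri
    totalSlots
    usedSlots
    sessionCount
    nodes {
      id
      uri
      status
      maxSession
      capabilities
      sessions {
        id
      }
    }
  }
}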


Logs:
seleniumGrid4Logs.txt
Thank you for providing all the details. I was able to run the setup from the docker-compose file and run the C# tests. However, I am unable to reproduce the issue. Based on the logs, I can see the first request being handled and the session deleted from the local store. The second request then gets picked up from the queue and a session is created; after a while that is deleted too. The behaviour is as desired.
Regarding the Grid UI, we are currently working on getting information related to queued sessions out via GraphQL and displaying it on the UI.
Hi @pujagani, thanks for your time. It is a bit tricky to replicate this issue, but it is still happening. It looks like this happens when you spin up a fresh instance of the grid/node (meaning this behaviour occurs only the first time you send your tests) and one of the tests has to wait a bit longer in the session queue, e.g. 20 s.
I have updated the test solution so you can trigger the tests from the command line (to rule out any weird IDE behaviour; I am using Rider, for example).
Steps to reproduce:
Expected behaviour
Current behaviour
The interesting thing is that, after the first failure, if I execute the very same tests again, grid session queuing works fine from that point onwards and both tests consistently pass every time.
Thank you for sharing the detailed steps to recreate the issue. I attempted to recreate it, and I saw the 2nd request error out only the very first time I ran the docker-compose file; I did not see it again.
To narrow down the problem area, the following attempts were made:
So far, I have not been able to recreate the issue reliably enough to triage it, fix it, and test the fix.
If possible, please share how you are creating the "fresh" instances each time. Even the docker commands might help. Thank you!
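For illustration, a completely fresh start with the attached compose file would typically look something like the commands below; the actual commands used are exactly what is being asked for above.

docker-compose down -v   # stop and remove the hub/node containers and any volumes
docker-compose up -d     # start a brand-new hub and Chrome node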
Since the issue seems to happen only once, at startup, it looks like there might be a timing issue. There is a chance that the Distributor within the Hub is not fully ready when it receives the 1st request (it checks every second whether the queue has a request and then polls), so that request is ignored, while the 2nd request arrives once the Distributor is ready and gets executed correctly.
Such a thing would, I am guessing, only happen as a one-off and not every time fresh instances are created.
To rule out any race conditions, I have created a PR to ensure the Distributor only uses the local Grid model to check for Grid capacity instead of the remote Node status: https://github.com/SeleniumHQ/selenium/pull/9120
To ensure the Grid is ready before starting the tests, please refer to https://github.com/SeleniumHQ/docker-selenium#waiting-for-the-grid-to-be-ready.
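As a sketch of such a readiness check inside the NUnit project (this is not the script from that README; the URL, timeout and class names are assumptions):

// Polls the Grid status endpoint once before any test runs and fails fast if
// the Grid never reports ready.
using System;
using System.Net.Http;
using System.Threading;
using NUnit.Framework;

namespace GridQueueRepro
{
    [SetUpFixture]
    public class WaitForGrid
    {
        [OneTimeSetUp]
        public void GridShouldBeReady()
        {
            using var client = new HttpClient();
            for (var attempt = 0; attempt < 30; attempt++)
            {
                try
                {
                    var body = client.GetStringAsync("http://localhost:4444/wd/hub/status")
                                     .GetAwaiter().GetResult();
                    if (body.Contains("\"ready\": true") || body.Contains("\"ready\":true"))
                    {
                        return;   // Grid reports it is ready to accept new sessions
                    }
                }
                catch (Exception)
                {
                    // Hub not reachable yet (connection refused etc.); retry below.
                }
                Thread.Sleep(1000);
            }
            Assert.Fail("Grid did not report ready within ~30 seconds.");
        }
    }
}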
Thanks @pujagani. One more thing I will try is to run the grid in full mode, make sure all components are ready and then run the same tests; if I experience the same problem, I will collect logs from all components and report back.
Hi @pujagani, I did more testing on this with the latest prerelease 4.0.0-beta-1-prerelease-20210128 and restarted the environment multiple times, and I am no longer seeing the reported behaviour. I think we can close this ticket. Thanks again for the time you spent on the investigation.
concurrentSessionsOk.log