Cypress: Recover from renderer / browser crashes

Created on 16 Dec 2016  ·  66 Comments  ·  Source: cypress-io/cypress

Related to #348.

It is actually possible for Cypress to implement strategies when the renderer (or browser process) crashes during a test run - something like recoverFromRendererCrashes: true by default.

There is already a mechanism for Cypress to "reload" mid-run, rebuild the state of every previous run test, skip over previously run tests, and continue with the next one in line.

In fact this is exactly what cy.visit already does under the hood.

We can utilize this same process upon a renderer / browser process crashing to continue on with the run.

So it may look something like this:

(Running Tests)

✓ test 1 - foo
✓ test 2 - bar
✓ test 3 - baz

Oh noes the renderer process crashed... we will attempt to recover

...Restarting tests at 'test 4 - quux'

✓ test 4 - quux
✓ test 5 - ipsum
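
A minimal sketch of the resume logic described above, assuming a test list and a runner that throws when the renderer dies. Every name here (`RendererCrash`, `runWithRecovery`, `runTest`, `maxRestarts`) is hypothetical and not part of Cypress; rebuilding the UI state of earlier tests is only marked with a comment.

```javascript
// Hypothetical sketch: none of these names are Cypress APIs.
class RendererCrash extends Error {}

// Runs tests in order. `runTest(name)` returns true/false for pass/fail and
// throws RendererCrash when the (simulated) renderer process dies.
function runWithRecovery(tests, runTest, maxRestarts = 3) {
  const results = []; // one entry per test that has finished
  const log = [];
  let restarts = 0;

  while (results.length < tests.length) {
    const name = tests[results.length]; // first test without a result yet
    try {
      results.push({ name, passed: runTest(name) });
    } catch (err) {
      if (!(err instanceof RendererCrash) || restarts >= maxRestarts) throw err;
      restarts += 1;
      // A real implementation would relaunch the browser here and rebuild
      // the displayed state of the tests already stored in `results`.
      log.push(`Restarting tests at '${name}'`);
    }
  }
  return { results, log, restarts };
}

// Usage: simulate one crash while running the fourth test.
let alreadyCrashed = false;
const outcome = runWithRecovery(
  ['test 1 - foo', 'test 2 - bar', 'test 3 - baz', 'test 4 - quux', 'test 5 - ipsum'],
  (name) => {
    if (name === 'test 4 - quux' && !alreadyCrashed) {
      alreadyCrashed = true;
      throw new RendererCrash('renderer process crashed');
    }
    return true;
  }
);
```

The test that was running when the crash happened is run again from the start, and the results already collected for earlier tests are kept rather than re-run.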

Taking this a step further, we are starting to see several patterns emerge with how and why renderer processes crash - it is almost always related to extremely long test runs in a memory starved environment (such as Docker).

It may even be a good idea for us to always preemptively "break up" headless runs by spec file.

In other words, we could have an option like restartBrowserBetweenSpecFiles: true which would automatically kill the renderer / browser process before moving on to a different spec file (but still rebuild the state of the UI correctly, and still have a single contiguous video recording).

To the user it would look like nothing is really different, but internally the renderer process would be killed and then restarted.

This would forcefully purge primed memory from the process, which could keep environments like docker from ever crashing to begin with.
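
Until something like that exists, the effect can be approximated from outside Cypress by starting one `cypress run` per spec file, so each spec gets a fresh browser process. The sketch below assumes `npx` is available and uses the existing `--spec` option of `cypress run`; the function names are made up for illustration. Unlike the proposal, this produces one video and one report per spec rather than a single contiguous recording.

```javascript
const { spawnSync } = require('child_process');

// Build one argument list per spec file for `npx cypress run --spec <file>`.
function buildPerSpecRuns(specFiles) {
  return specFiles.map((spec) => ['cypress', 'run', '--spec', spec]);
}

// Run every spec in its own Cypress process and return how many of those
// processes exited with a non-zero status. `spawn` is injectable so the
// function can be exercised without launching a browser.
function runSpecsSeparately(specFiles, spawn = spawnSync) {
  let failedRuns = 0;
  for (const args of buildPerSpecRuns(specFiles)) {
    const { status } = spawn('npx', args, { stdio: 'inherit' });
    if (status !== 0) failedRuns += 1;
  }
  return failedRuns;
}
```

A CI script could call `process.exit(runSpecsSeparately(files) === 0 ? 0 : 1)` with whatever list of spec paths the project uses.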

Depends on: #6170

4️⃣ proposal 💡 feature

Most helpful comment

We've started hitting this fairly frequently now too

All 66 comments

We actually have these crashes halfway in a single spec and we have stalling too. I tried debugging this with strace and it seems to be constantly trying to acquire some locks.
Our app seems to make the browser allocate 400+ MB of memory fast and the whole suite can go up to 2 GB...
So resetting between specs might not be enough. Maybe between it/test is also an option?

Setting the --ipc=host does fix this, but I wonder what happens if two instances of the test run simultaneously. Could a clash occur?

How could two instances of the test run occur simultaneously? If you wanted to parallelize you would do it over two different docker containers.

Yes, two Docker instances. It might be an unfounded fear of a clash; I simply don't know what two Docker instances share with --ipc=host.

Hi, I am running test cases on an AWS EC2 small instance and I am having this issue:
https://on.cypress.io/renderer-process-crashed
Is there any way to avoid this?

Did you try the --ipc=host fix?

But I am not using Docker

If not sandboxed, you might have multiple Chrome instances fighting over resources. What is your setup? Any concurrency? Are you open to a different setup?

Any update on this? I'm now getting the error too: Chromium usually crashes when running a number of test suites.

This issue has been superseded by this: https://github.com/cypress-io/cypress/issues/681

That will remove the need to recover since it fixes the problem at its core

We've started hitting this fairly frequently now too

I'm having this happen randomly on travis-ci with cypress 3.0.2 (I just recently started using cypress so no clue if it happened in a previous version). It might be good to add this flag even with #681 resolved.

Edit: I was able to resolve my issue by only calling .visit() once and resetting the state of the application between tests. I know that's not ideal, but it works for now.

In hindsight my fix with --ipc=host might be related to the shared memory issue I described in https://github.com/cypress-io/cypress/issues/350 and giving the container more shared memory might resolve crashes.

I'm also getting this issue now with cypress v3.1.0. Any updates?

Hi cypress team!

We are also getting this error when we use cypress run as well as cypress open

We noticed that it happens more when we use cy.wait. We can consistently reproduce it when we use cy.wait with a value greater than 20000. This is on our circle-ci linux containers fyi.

Hi, I'm currently trying to use Cypress in GitLab CI. I figured out most parts, except the browser crashing.

My current GitLab CI test job is the following:

test_dev:
  only:
    - dev
  stage: test
  image: cypress/base:10
  script:
    - npm i --save-dev cypress
    - $(npm bin)/cypress run --reporter junit --reporter-options "mochaFile=results_[hash].xml,toConsole=true"
  artifacts:
    paths:  
      - cypress/videos
    reports:
      junit: results_*.xml
    expire_in: 1 week

This works great when the browser doesn't crash, including test reporting in GitLab's merge requests. However, it fails about 50% of the time. Using the --ipc=host flag is afaik not an option in GitLab CI.

Have you tried increasing the shared memory instead, like I describe in https://github.com/cypress-io/cypress/issues/350 ?

I am using shared runners on gitlab ci, and shm-size doesn't seem to be an option for shared runners. Thanks anyway

I think you can configure it using this documentation https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-docker-section

Hi, please provide a fix or an explanation for this issue. It always happens on one test case (and only that test case). I do not think it has to do with memory, but there is no way to know. I was able to reproduce it locally without Docker. I think it has to do with origin (subdomain) changes. Thanks

EDIT: I just ran it in debug mode; unfortunately there is no way of knowing what is causing this problem.

Hi, we're also experiencing this issue in Kubernetes (using Jenkins as our CI engine). Would be happy to provide additional information if helpful.

I've recently started running into the issue, as our codebase starts to acquire more dependencies. It's intermittent and unpredictable. Sometimes I get a passing test, sometimes it fails the moment it begins.

After more experimentation, I've found that using the cypress/browsers:chrome69 image instead of the cypress/base:10 made the issue go away. This issue is likely to be tied to an older version of electron being unable to handle a larger codebase, and I think more effort should go into updating electron.

One useful thing in the meantime would be for Cypress to communicate to the caller that the browser failed, so that I could re-run the test inside CI automatically. Maybe the exit code from the npm call could be different? Or there could be some other way to tell that the run failed because Chrome crashed and not because a test failed. Could this be added in the meantime? Recovery could then be done outside of Cypress.
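
As a stopgap, a wrapper can scan the run's output for the crash message and retry only in that case. This is a rough sketch, not something Cypress provides: `runOnce`, `runWithRetries`, and `looksLikeRendererCrash` are hypothetical names, and the marker string is copied from the crash output quoted later in this thread, so it may differ between Cypress versions.

```javascript
// Assumption: this text appears in the output when the renderer crashes.
// It is taken from a log posted in this thread and may vary by version.
const CRASH_MARKER = 'We detected that the Chromium Renderer process just crashed';

function looksLikeRendererCrash(output) {
  return output.includes(CRASH_MARKER);
}

// `runOnce(attempt)` is a hypothetical callback that performs one full run
// and returns { exitCode, output }. Only crashes are retried; a normal pass
// or a genuine test failure is returned immediately.
function runWithRetries(runOnce, maxAttempts = 3) {
  let last;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    last = { ...runOnce(attempt), attempt };
    if (!looksLikeRendererCrash(last.output)) return last;
  }
  return last; // still crashing after maxAttempts
}
```

One way to implement `runOnce` would be `child_process.spawnSync('npx', ['cypress', 'run'], { encoding: 'utf8' })`, concatenating `stdout` and `stderr` into `output`. The sketch keys on the output text rather than the exit code because a later comment in this thread reports a green build despite renderer crashes.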

I think that since this issue has been made there is now a better fix for the problem by asking Chrome not to use /dev/shm. I opened #3633 for more details about this.

I'm hitting this issue on a small digital ocean droplet (no docker / container). The test runs perfectly a dozen or so times and then starts crashing with this error. If I reboot the droplet it starts working again then eventually dies. Looks like a memory leak to me.

There appears to be plenty of memory in my docker container

df -h /dev/shm
Filesystem      Size  Used Avail Use% Mounted on
shm              30G  8.0K   30G   1% /dev/shm

I'm also unable to figure out how to add the --ipc=host flag for my CircleCI build... Doesn't appear to be an option.

I have this same issue.

A few days ago I started facing the same issue even though no changes were made. It's running on Travis without Docker and against a separate app that is not installed in the same code base.
What's interesting is that switching to --browser chrome seems to help, so it looks like it is related to Electron, whether headless or not: it fails in both cases. However, with Chrome you lose the video recording.
Any progress on this topic? @brian-mann

I've become very impatient waiting for the Cypress folks to fix these crashing issues. In the meantime, I've created a very similar API using selenium and am having no memory issues. There's no recording of tests, but at least it's reliable. Here's a code snippet for you if you want to try it out.

import { Builder, ThenableWebDriver, By, WebElement, Key, Condition } from "selenium-webdriver"

/**
 * Wrap any promise coming from the Selenium driver so that we can
 * get stack traces that point to our code.
 */
async function wrapError<T>(p: Promise<T>) {
    const e = new Error()
    e["__wrapError"] = true
    try {
        const result = await p
        // Wait just a little bit in case the browser is about to navigate
        // or something.
        await new Promise(resolve => setTimeout(resolve, 20))
        return result
    } catch (error) {
        if (error["__wrapError"]) {
            throw error
        }
        e.message = error.message
        throw e
    }
}

async function waitFor(
    driver: ThenableWebDriver,
    fn: () => Promise<boolean | object>,
    timeout = 2000
) {
    await driver.wait(
        new Condition("wait", async () => {
            try {
                const result = await fn()
                return Boolean(result)
            } catch (error) {
                return false
            }
        }),
        timeout
    )
}

class Element {
    private promise: Promise<WebElement>
    then: Promise<WebElement>["then"]
    catch: Promise<WebElement>["catch"]

    constructor(
        public driver: ThenableWebDriver,
        promise: Promise<WebElement> | WebElement
    ) {
        this.promise = Promise.resolve(promise)
        this.then = this.promise.then.bind(this.promise)
        this.catch = this.promise.catch.bind(this.promise)
    }

    /** Map in the monadic sense. */
    map(fn: (elm: WebElement) => Promise<WebElement | undefined | void>) {
        return new Element(
            this.driver,
            wrapError(
                this.promise.then(async elm => {
                    const result = await fn(elm)
                    if (result) {
                        return result
                    } else {
                        return elm
                    }
                })
            )
        )
    }

    waitFor(fn: (elm: WebElement) => Promise<boolean | object>) {
        return this.map(elm => waitFor(this.driver, () => fn(elm)))
    }

    mapWait(fn: (elm: WebElement) => Promise<WebElement>) {
        return this.waitFor(fn).map(fn)
    }

    click() {
        return this.map(elm => elm.click())
    }

    clear() {
        return this.map(elm => elm.clear())
    }

    type(text: string) {
        return this.map(elm => elm.sendKeys(text))
    }

    enter() {
        return this.map(elm => elm.sendKeys(Key.RETURN))
    }

    backspace() {
        return this.map(elm => elm.sendKeys(Key.BACK_SPACE))
    }

    find(selector: string) {
        return this.mapWait(elm => {
            return elm.findElement(By.css(selector))
        })
    }

    findAll(selector: string) {
        return new Elements(
            this.driver,
            this.promise.then(elm => {
                return waitFor(this.driver, () =>
                    elm.findElements(By.css(selector))
                ).then(() => {
                    return elm.findElements(By.css(selector))
                })
            })
        )
    }

    contains(text: string) {
        return this.mapWait(elm => {
            // TODO: escape text.
            // https://stackoverflow.com/questions/12323403
            return elm.findElement(By.xpath(`//*[contains(text(), '${text}')]`))
        })
    }

    clickText(text: string) {
        return this.contains(text).click()
    }
}

class Elements {
    private promise: Promise<Array<WebElement>>
    then: Promise<Array<WebElement>>["then"]
    catch: Promise<Array<WebElement>>["catch"]

    constructor(
        public driver: ThenableWebDriver,
        promise: Promise<Array<WebElement>> | Array<WebElement>
    ) {
        this.promise = Promise.resolve(promise)
        this.then = this.promise.then.bind(this.promise)
        this.catch = this.promise.catch.bind(this.promise)
    }

    /** Map in the monadic sense. */
    map(
        fn: (
            elm: Array<WebElement>
        ) => Promise<Array<WebElement> | undefined | void>
    ) {
        return new Elements(
            this.driver,
            wrapError(
                this.promise.then(async elms => {
                    const result = await fn(elms)
                    if (Array.isArray(result)) {
                        return result
                    } else {
                        return elms
                    }
                })
            )
        )
    }

    waitFor(fn: (elm: Array<WebElement>) => Promise<boolean | object>) {
        return this.map(elm => waitFor(this.driver, () => fn(elm)))
    }

    mapWait(fn: (elm: Array<WebElement>) => Promise<Array<WebElement>>) {
        return this.waitFor(fn).map(fn)
    }

    clickAll() {
        return this.map(async elms => {
            await Promise.all(elms.map(elm => elm.click()))
        })
    }

    atIndex(index: number) {
        return new Element(
            this.driver,
            wrapError(
                this.promise.then(elms => {
                    const elm = elms[index]
                    if (!elm) {
                        throw new Error("Element not found!")
                    }
                    return elm
                })
            )
        )
    }
}

export class Browser {
    private promise: Promise<void>
    then: Promise<void>["then"]
    catch: Promise<void>["catch"]

    constructor(public driver: ThenableWebDriver, promise?: Promise<void>) {
        this.promise = Promise.resolve(promise)
        this.then = this.promise.then.bind(this.promise)
        this.catch = this.promise.catch.bind(this.promise)
    }

    visit(route: string) {
        return new Browser(
            this.driver,
            wrapError(
                this.promise.then(async () => {
                    await this.driver.get(route)
                })
            )
        )
    }

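    // NOTE: `rerender` and `flushTransactions` are helper functions that are
    // not defined in this snippet; supply your own or remove these methods.
    rerender() {
        return new Browser(this.driver, wrapError(rerender(this.driver)))
    }

    flushTransactions() {
        return new Browser(this.driver, wrapError(flushTransactions(this.driver)))
    }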

    find(selector: string) {
        return new Element(
            this.driver,
            wrapError(
                this.promise
                    .then(() => {
                        return waitFor(this.driver, async () =>
                            this.driver.findElement(By.css(selector))
                        )
                    })
                    .then(() => {
                        return this.driver.findElement(By.css(selector))
                    })
            )
        )
    }

    getClassName(className: string) {
        return this.find("." + className)
    }

    getTitle() {
        return this.driver.getTitle()
    }

    waitFor(fn: () => Promise<boolean>, timeout = 2000) {
        // Pass the caller's timeout through instead of silently using the default.
        return new Browser(this.driver, waitFor(this.driver, fn, timeout))
    }

    waitToLeave(url: string) {
        return new Browser(
            this.driver,
            wrapError(
                waitFor(
                    this.driver,
                    async () => {
                        const currentUrl = await this.driver.getCurrentUrl()
                        return url !== currentUrl
                    },
                    10000
                )
            )
        )
    }

    waitForRoute(url: string) {
        return new Browser(
            this.driver,
            wrapError(
                waitFor(
                    this.driver,
                    async () => {
                        const currentUrl = await this.driver.getCurrentUrl()
                        return url === currentUrl
                    },
                    10000
                )
            )
        )
    }
}

We're seeing this issue crop up on Drone, which also doesn't support the --ipc=host option. Our containers already have 16GB memory. Some notes on the behavior:

  1. Electron logs an error message when it crashes, but doesn't actually fail the test run. Our build is green despite the fact that half the tests caused a renderer crash.

  2. Chrome doesn't even log a message—it dies silently and the test run hangs forever.

  3. The crash does appear to happen at the exact same time on every run, but it's not clear what we're doing to cause it. Rearranging our test code or skipping certain tests resolves the problem temporarily, but it always creeps back in.

I haven't contributed to Cypress before, but I'd be willing to take a stab at fixing the problem if someone (@brian-mann ?) can show me where to start. My team has lost a ton of time troubleshooting this and I'd love to put it to bed.

@nmuth Please see our contributing guide on how to start: https://github.com/cypress-io/cypress/blob/develop/CONTRIBUTING.md

Are you using version 3.3.1?

@jennifer-shehane Yup, we're on 3.3.1. I've read the contributing guide. I'm still coming to grips with the code. It looks like the crash handler for Electron is here. Where can I hook in to provide a crash handler for Chrome? Would that be in the launcher package?

@RockChild Are you on 3.3.x? I commented in another thread that this seems to have popped up since 3.3.0 dropped about ~2 weeks ago.

@jbinto yeah, looks like it started crashing after upgrade to 3.3.1, so I'll try to downgrade to 3.3.0. Thanks for your insights!

I switched to cypress/browsers:chrome69, changed the package version to 3.3.0 and, with the following build step config in drone.io, it seems that the renderer doesn't crash anymore:

steps:
  - name: dev-tests
    image: cypress/browsers:chrome69
    shm_size: 4096000000
    mem_limit: 1000000000
    commands:
      - npm ci
      - $(npm bin)/cypress verify
      - $(npm bin)/cypress run

Later edit: it just crashed this morning, so it seems that this is not it. Isn't there any way to auto-restart the test if it crashes?

@RockChild Downgrading to 3.3.0 (or even 3.2.0) has not resolved this issue for us.

Similar to you we just started seeing this on or around May 27. No idea what has changed, and we have tried just about everything to fix this. It is gradually getting worse, with almost 100% crash rate today (when it started a few weeks ago it was closer to 5-10%).

Only happening on CircleCI. /dev/shm is 30GB there. No pattern to where the tests fail. Nothing interesting when using DEBUG=cypress:*.

If you’re seeing consistent crashes and would like this implemented, please leave a note in the issue.
Yes, please.

Any update on this? I've tried making a wrapper using Cypress in a .js file, but it seems that the renderer errors aren't caught by Cypress.catch()

Please fix.

We are hitting this problem as well. Not using Docker.
Unfortunately, this issue makes cypress way too unreliable for automated tests.

Experiencing consistent browser crashes inside of my Jenkins pipeline. Unable to get around this. I may also have to investigate alternative solutions as this is rather unstable and unpredictable (the reason I moved away from CodeCeption).

Things run great locally, but once we try running the tests on our Jenkins server the browser crashes every time and my tests never pass.

Currently on day 2 of debugging this. If I can't resolve this today I'll have to move away from Cypress.

@EvanHerman (and anyone else on this thread): FWIW, since switching to Chrome (from Electron) and setting some flags we have not seen a crash in CI for almost 2 months now.

See https://github.com/cypress-io/cypress/issues/350#issuecomment-503231128 for details.

And just to add: there is an open pull request that adds video recording to Chrome, https://github.com/cypress-io/cypress/pull/4791, which is THE main thing stopping people from using Chrome on CI.

@jbinto Thanks for the tip - I'll switch out the image and test things out. 👍

Edit: Works perfectly, thanks again @jbinto - saved me a lot of headache!

@bahmutov That's great news! Looking forward to having recorded videos back in Chrome.

Please fix this, using cypress 3.4.1

Please fix this. I’m not able to set ipc=host in my ci/cd pipeline

Please fix this or provide work-around for different environments. In my case I'm running Cypress using Jenkins and pipelines where I do not have access to flags.

Please fix this issue as I am hitting 'sad face' error with docker. I am using latest cypress 3.6.1

We're seeing this quite often lately as well with 3.7.0

Happening quite a bit with electron on 3.8

Update, I've found that since removing the CPU limits from our POD definition this issue is no longer occurring. Whether this is advisable I'm unqualified to say.

Migrated from 3.7.0 to 3.8.1 and now we are seeing this error for the first time (we do cypress run inside a docker image, with GitLab CI)

We are still seeing this issue for a couple of our apps which utilise Cypress. Others are working fine.


Update: For the remaining apps which had memory leaks, it appears updating to 3.8.1 and running the tests in Chrome rather than Electron has done the trick.

I'm seeing this issue after upgrading Cypress, especially when I run it with Percy and take many snapshots.

I am facing the same issue as well. I tried optimizing my tests and following the best practices guide: https://docs.cypress.io/guides/references/best-practices.html. I can see improvements, but it still crashes in CI most of the time.

I am also facing this issue consistently using headless Electron in CI. I'm on 3.8.3.

Also facing this issue. No changes to the code base. It just started to happen on all of our Jenkins jobs. Please advise!

We started facing this issue today, with no changes, cypress 4.2.0 on Gitlab CI in Electron browser

This is the output:

We detected that the Chromium Renderer process just crashed.
...
TypeError: onError is not a function
     at BrowserWindow.onCrashed (/root/.cache/Cypress/4.2.0/Cypress/resources/app/packages/server/lib/modes/run.js:538:7)
     at WebContents.<anonymous> (/root/.cache/Cypress/4.2.0/Cypress/resources/app/packages/server/lib/gui/windows.js:181:34)
     at WebContents.emit (events.js:215:7)

@jaroslav-kubicek I saw that yesterday on Gitlab CI and disabling the video seemed to help. Still digging in.

We're also seeing the same problem with the same error that @jaroslav-kubicek posted 😞

We solved our issues by pulling the latest image (4.3.0) and making sure to add "numTestsKeptInMemory": 1 in our cypress.json file.

We have resolved crashes on our end by passing the --disable-dev-shm-usage flag to Chrome in plugins/index.js
https://docs.cypress.io/guides/guides/continuous-integration.html#In-Docker

module.exports = (on, config) => {
  // add --disable-dev-shm-usage chrome flag
  on('before:browser:launch', (browser, launchOptions) => {
    if (browser.family === 'chromium') {
      console.log('Adding Chrome flag: --disable-dev-shm-usage');
      launchOptions.args.push('--disable-dev-shm-usage');
    }
    return launchOptions;
  });
};

We totally did the same thing - completely forgot.

module.exports = (on, config) => {
  on('before:browser:launch', (browser = {}, launchOptions) => {
    // Only Chrome gets the flag; other browsers keep their options unchanged.
    if (browser.name === 'chrome') {
      launchOptions.args.push('--disable-dev-shm-usage')
    }

    return launchOptions
  })
}

Also experiencing this issue. Tried both solutions provided by @sesamechicken with no luck.

Happening during Netlify build w/ the netlify-plugin-cypress plugin. I am just starting to add Cypress to my project and only have a single simple test to run, so the issue shouldn't be many tests exhausting memory.

Please fix it, I'm on 4.6

also just got it while running cypress github-actions
