Puppeteer: Support browser contexts to launch different sessions

Created on 17 Jul 2017 · 77 Comments · Source: puppeteer/puppeteer

Support browser contexts (Target.createBrowserContext) so as to avoid launching multiple instances when one wants a pristine session. See proposal https://github.com/GoogleChrome/puppeteer/issues/66 and original discussion at https://github.com/cyrus-and/chrome-remote-interface/issues/118.

A viable user scenario might be testing several users logged into the service simultaneously.
We might expose browser contexts as a string literal option to the browser.newPage:

browser.newPage(); // creates a new page in a default browser context
browser.newPage({ context: 'default' }); // same as previous call
browser.newPage({ context: 'another-context' }); // creates a page in another browser context
feature upstream

All 77 comments

This would be very useful for me - currently I'm forced to start multiple chrome instances if I'm wanting to have multiple simultaneous sessions on the same site.

Is there any way to work around this currently? Or do I need to revert to using chrome-remote-interface to access context creation?

+1, this feature would be crucial to me as well. Right now simulating 5+ simultaneous users is very memory intensive, since each user needs to launch its own chrome instance.

Would be useful to be able to use an unlimited number of browser contexts to simulate the desired number of users while using only one browser instance.

In case anybody is still stuck waiting on this, it's a surprisingly easy change to hack it working in the very simple case where you want a new context for every page. All that is required is adding a call to Target.createBrowserContext at the top of Browser.newPage, and passing the returned contextID into the call to Target.createTarget (viewing how this is done at cyrus-and/chrome-remote-interface#118 might make this clearer).

Note that depending on what you're doing, you will probably want to add some support to delete these browser contexts when pages are closed - but if your Puppeteer instance is short-lived then this might not matter.

@ks07 FYI: browser contexts are not supported in headful mode (which is the reason we wait on implementing this in pptr).

@aslushnikov - Dang that's a bummer, is there a roadmap for it to be implemented in headful mode?

+1
Is there a solution?

@miles-collier could you just clarify how pptr deals with parallel pages:
Does it suspend javascript in inactive page (like headful chrome does)?
Thanks!

Ok, I got this working. Thanks for the tip, @ks07 .

const puppeteer = require('puppeteer');
const Page = require('puppeteer/lib/Page');

async function newPageWithNewContext(browser) {
  const {browserContextId} = await browser._connection.send('Target.createBrowserContext');
  const {targetId} = await browser._connection.send('Target.createTarget', {url: 'about:blank', browserContextId});
  const client = await browser._connection.createSession(targetId);
  const page = await Page.create(client, browser._ignoreHTTPSErrors, browser._screenshotTaskQueue);
  page.browserContextId = browserContextId;
  return page;
}

async function closePage(browser, page) {
  if (page.browserContextId != undefined) {
    await browser._connection.send('Target.disposeBrowserContext', {browserContextId: page.browserContextId});
  }
  await page.close();
}

(async () => {
  const browser = await puppeteer.launch();
  const page = await newPageWithNewContext(browser);
  await page.goto('https://example.com');
  console.log(await page.cookies());

  await closePage(browser, page);
  await browser.close();
})();

UPDATE (2017-11-04):
I've updated the example with a suggestion on how to close the page and its context.

Are there any problems doing that @barbolo? Short or long term issues? Would closing the page close the context?

I haven't had any problems using this method - although as mentioned this probably won't work with headless: false.

@miles-collier Unfortunately, I can't answer you properly. Maybe @aslushnikov can say what he thinks about issues with this approach.

Are there any problems doing that @barbolo? Short or long term issues?

@miles-collier The major issue is that contexts are not supported in headful chrome.

Would closing the page close the context?

Protocol gives full control for context management: you can create and dispose contexts when needed.
However, in the code snippet the author issues a command to dispose the context whenever the page closes.

@barbolo Hi barbolo. If I use headless: false, there is an error.

(node:32100) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Protocol error (Target.createBrowserContext):Not supported undefined
(node:32100) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

@zhaopengme as others have pointed out, this approach doesn't work in headful chrome. That's why this issue is open.
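
The error quoted above shows up as an unhandled rejection because nothing catches the failed `Target.createBrowserContext` call. Here is a minimal sketch of catching it, assuming the same internal `browser._connection` object used in the snippet earlier in the thread (a private API that may change). The helper name `tryCreateBrowserContext` is hypothetical, and the "Not supported" check is based only on the message quoted above.

```javascript
// Hypothetical helper: returns a browserContextId, or null when the browser
// rejects Target.createBrowserContext (as headful Chrome did in this thread).
// `connection` is assumed to expose send(method, params) returning a promise,
// like the internal browser._connection used in the earlier snippet.
async function tryCreateBrowserContext(connection) {
  try {
    const {browserContextId} = await connection.send('Target.createBrowserContext');
    return browserContextId;
  } catch (error) {
    // The message text is taken from the error quoted in this thread.
    if (/Not supported/.test(String(error && error.message))) {
      return null;
    }
    throw error; // Anything else is unexpected, so let the caller see it.
  }
}
```

A caller could fall back to a regular `browser.newPage()` when the helper returns null, at the cost of losing session isolation.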

@barbolo ╮(╯▽╰)╭!thanks!

@aslushnikov Am I correct that contexts can be saved to / and resumed from disk when this lands? I remember this pointed out in a discussion somewhere, but can't seem to find it now

After updating puppeteer I get the following error on your code @barbolo:
Error: Protocol error (Page.getFrameTree): 'Page.getFrameTree' wasn't found undefined

@miles-collier you have two options:

  1. Use an older version of puppeteer;
  2. Update the code sample I've provided above according to the latest implementation of Browser.newPage:
    https://github.com/GoogleChrome/puppeteer/blob/6512ce768ddce790095e2201d8ada3c24407fc57/lib/Browser.js#L97-L103

Since puppeteer seems to be still in alpha (maybe beta?), the code will probably change a lot in the coming months. I'd go with option 1 and be locked with a version that works for the project I'm working on. When puppeteer goes v1.0.0 (stable), I'd make a change.

Valid point @barbolo, I don't know enough about the system to keep maintaining those functions. I'll have to stick with a version until maybe this becomes added into Puppeteer natively.

@aslushnikov @barbolo

I am triggering the script you shared above from a .bat file programmatically to simulate just 20 tabs, browsing the same application from each tab. I don't think it's opening tabs in the same browser instance, because CPU utilization hits 100% even for 6 tabs (concurrent users). Is there any better approach?

Thanks,
vij

Any updates to this issue now that pptr has hit v1.0.0? Supporting page contexts for isolations of sessions without having to make separate browser instances seems like a feature that I'd really like to see implemented!

Hey @barbolo,

Your code hasn't worked since @aslushnikov's commit (5368051610babf52e7df927c323d315a493fb56f) and v1.0.0. Any tips to get it working again?

EDIT:

It seems it's a Chromium issue too... this hack works well on the November commits.

With commit: ec8e40f1cb2d9214387bfbbc2e5dcf48b8db7148

const puppeteer = require('puppeteer');
const Page = require('puppeteer/lib/Page');

let browser;
let page;

(async function() {
    browser = await puppeteer.launch({headless: true, args: ['--no-sandbox']});

    let response;
    let cookies;

    page = await newPage();
    response = await page.goto('http://perdu.com'); // no cookies
    cookies = await page._client.send('Network.getAllCookies');
    console.log(cookies.cookies.length);
    await closePage(page);

    page = await newPage();
    response = await page.goto('http://www.lefigaro.fr'); // has cookies
    cookies = await page._client.send('Network.getAllCookies');
    console.log(cookies.cookies.length);
    await closePage(page);

    page = await newPage();    
    response = await page.goto('http://perdu.com'); // no cookies
    cookies = await page._client.send('Network.getAllCookies');
    console.log(cookies.cookies.length);
    await closePage(page);    

    await browser.close();
})();

async function newPage()
{
    console.time('newpage');
    const {browserContextId} = await browser._connection.send('Target.createBrowserContext');
    const {targetId} = await browser._connection.send('Target.createTarget', {url: 'about:blank', browserContextId});
    const client = await browser._connection.createSession(targetId);
    const page = await Page.create(client, browser._ignoreHTTPSErrors, browser._screenshotTaskQueue);
    page.browserContextId = browserContextId;
    console.timeEnd('newpage');
    return page;
}

async function closePage(page)
{
    console.time('closepage');
    if (page.browserContextId !== undefined) {
        await browser._connection.send('Target.disposeBrowserContext', {browserContextId: page.browserContextId});
    }
    await page.close();
    console.timeEnd('closepage');        
}

Sometimes it works.. sometimes it fails... :(

Output OK (the 3rd request should return 0 cookies):

$ nodejs js/puppeteer_debug.js 
newpage: 43.684ms
0
closepage: 1.528ms
newpage: 19.084ms
103
closepage: 2.811ms
newpage: 19.360ms
0
closepage: 2.757ms

Output fail (the 3rd request should return 0 cookies):

$ nodejs js/puppeteer_debug.js 
newpage: 45.471ms
0
closepage: 1.904ms
newpage: 22.531ms
90
closepage: 2.658ms
newpage: 24.441ms
71
closepage: 3.121ms

The screenshot function doesn't seem to work in v1.0.0 with @barbolo's solution. It fails with:
TypeError: Cannot read property 'postTask' of undefined
I think the reason is that the constructor in Page.js has changed, so I have made some changes and now it works.
The changes are available at https://gist.github.com/Ariex/6f425fbcab09e13d8bec39aba7c4a9b5

Also, I have tested my code with @HanXHX's test code, and the 3rd request returned 0 in all of my roughly 10 test runs.

@Ariex's solution worked beautifully for me and decreased the CPU load considerably. Thank you!

@Ariex's solution doesn't work for me for some reason. I'm using headless Chrome. I've got different browserContextId but page.goto(GMAIL_URL) still returns the same logged in content!

Maybe it's just me? @markusmobius do you have a code example online? Thanks.

Thanks @Ariex
I ran your test locally. It seems to work. Something is wrong with the Microsoft website; it always crashes my Chrome on the 3rd run. (Error: 7 Bus error nohup google-chrome, or 8 Segmentation fault)

With your code, it's weird that every time I test with a logged-in Gmail page, it still shows the previous tab's content. Maybe Chrome supports Gmail internally so it can remember the user even without cookies?

My test script is like this: 1. New Incognito Page 2. Login to Gmail 3. Capture screenshot - Repeat from step 1. Do that around three times and it will start showing the previous tab's (logged in) screenshot! It feels dangerous to reuse a Chrome instance for multiple users.

Hi @ngduc, that is weird. I have used that code to run 2 incognito pages at the same time and they do not affect each other. Do you have simple code for your test? Somehow I get a different page when I access mail.google.com in my real Chrome than in Puppeteer's Chrome.

@aslushnikov This would be an interesting addition. Is headful support for this in Chromium somewhere in the roadmap or not really? If not, couldn't we still add this to PPTR with a disclaimer that it only works in headless?

Agreed @alixaxel, having this built in and maintained would be nice. Perhaps throw an error or message if they are trying to run to headful mode.

+1 hopefully this is a supported feature in near future.

This is a pretty common need for apps whose use cases involve more than one user.

+1

If having different browser contexts means also having a separate localStorage, one for each context, this feature would be extremely useful for all web apps that store their state in localStorage. (e.g. multipage apps that use localStorage to remember their state during a page transition). For those kind of apps at the moment there is no way to run many concurrent instances. I definitely vote for this feature !

+1
Definitely hope this gets added soon.

A word of caution: at the moment it seems that launching Chromium with different browserContextIds is incompatible with the --single-process startup flag. In environments like Lambda, where only single-thread execution is available, it causes newPage to freeze.
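
Since the freeze described above only shows up later, at `newPage`, one option is to check the launch arguments up front. This is a minimal sketch under that assumption. Both function names are hypothetical, and the incompatibility itself is only as reported in this thread.

```javascript
// Hypothetical pre-flight check: returns true when the launch arguments
// include --single-process, which this thread reports as incompatible with
// additional browser contexts.
function usesSingleProcess(launchArgs = []) {
  return launchArgs.includes('--single-process');
}

// Throws early instead of letting newPage() hang later.
function assertContextsUsable(launchArgs) {
  if (usesSingleProcess(launchArgs)) {
    throw new Error('Browser contexts were reported not to work with --single-process');
  }
}
```

Calling `assertContextsUsable(args)` with the same array that is passed as `args` to `puppeteer.launch` turns a silent hang into an immediate error.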

+1

+1

@aslushnikov, given there's no movement on this from the Chromium side, what do you think about shipping it just for headless mode? A few important use cases (e.g. pptr in the cloud) would benefit from having this and we already have other APIs (pdf) that don't work with headful. We'd just have to throw/warn in the API call and note the differences in the docs :)

I think the benefits outweigh waiting until the feature is available in both headful and headless. Thoughts?
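
The "throw/warn in the API call" idea could look roughly like the sketch below. It reflects the proposal at this point in the thread, not what Puppeteer ships. The function name is hypothetical, and the way the effective headless value is derived (headless defaults to true unless `devtools` is true) is my reading of the v1.x launch options.

```javascript
// Hypothetical guard for a headless-only feature. `launchOptions` are the
// options that would be passed to puppeteer.launch().
function assertHeadlessForContexts(launchOptions = {}) {
  const {devtools = false, headless = !devtools} = launchOptions;
  if (!headless) {
    throw new Error('Browser contexts are only supported in headless mode');
  }
}
```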

@ebidel @aslushnikov Since I work with puppeteer mostly in serverless environments, this would be a much welcome addition given that Chromium startup times are nearly a second on AWS Lambda and over 2 seconds on Google Cloud Functions, so getting rid of this latency would be amazing.

However, after manually patching puppeteer to support this feature and running a couple of tests I noticed a couple of quirks that I think are critical and worth mentioning.

1) It doesn't work with the --single-process startup argument. Unfortunately, Lambda and GCF freeze on newPage() unless --single-process is specified, which makes browser contexts unusable for serverless use cases.
2) Another very important quirk is that browser session data (cookies and so on) seems to be shared between all the custom browser contexts created. Meaning, the browser will effectively only have two isolated contexts, the default one and every other context created. Maybe it's a problem with the implementation (@haroldhues patch for what it's worth), but assuming that's the expected behavior it doesn't seem to be of much value.

How are you using Chromium on GCF? From my experience, their VMs are too old to support some of the newer Chromium deps and you can't install custom software.

@ebidel I compiled Chromium and NSS on Debian Jessie and I patched LD_LIBRARY_PATH to look for NSS in the right place. The module is on GitLab if you're interested but I don't intend on maintaining it.

@alixaxel nice one! FWIW, the Cloud team is looking into updates that would help here. Basically removing all the headache of needing to compile and link in libs yourself.

Thanks for the heads-up, that would be awesome! :1st_place_medal:

@alixaxel I can't speak to the problems with --single-process because I haven't needed to use it, but I have not been seeing any problems with browser contexts using the changes in my fork. Pages created in different contexts shouldn't share session data (I added some sanity checks to my tests to double check that).

It is also worth noting that in recent versions of Puppeteer you can use browser contexts to create isolated pages via the CDPSession interface. example here, lightly tested. It is pretty hacky and uses the internal Target._targetId property though.

Another very important quirk is that browser session data (cookies and so on) seems to be shared between all the custom browser contexts created. Meaning, the browser will effectively only have two isolated contexts, the default one and every other context created.

@alixaxel I didn't notice any functional issues there. In my tests, I was playing with three pages in three different contexts, and both localStorage and cookies were well isolated. Do you have a repro?
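
A repro for this kind of leak usually comes down to comparing what two contexts can see. The sketch below is a small comparison helper, assuming its inputs are arrays shaped like the result of `page.cookies()` (objects with `name`, `value` and `domain` fields). The function name is hypothetical.

```javascript
// Returns the cookies from cookiesB that also appear in cookiesA with the
// same domain, name and value. An empty result suggests the two contexts
// did not share those cookies.
function findSharedCookies(cookiesA, cookiesB) {
  const key = (cookie) => `${cookie.domain}|${cookie.name}|${cookie.value}`;
  const seen = new Set(cookiesA.map(key));
  return cookiesB.filter((cookie) => seen.has(key(cookie)));
}
```

One way to use it: log in on a page in one context, load the same site on a page in a second context, then pass both `await page.cookies()` results to the helper. Identical session cookies on both sides would point to shared state.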

@alixaxel ok, I found an issue with headless browser contexts; it's fixed upstream as https://crrev.com/553657, we'll roll it in pptr soon.

@aslushnikov I just tried the master branch and noticed that every createIncognitoBrowserContext call creates a new browser window in headful mode. Is this by design? If it is, will this increase cpu & memory usage?

I just tried the master branch and noticed that every createIncognitoBrowserContext call creates a new browser window in headful mode. Is this by design?

Yes, this is expected.

If it is, will this increase cpu & memory usage?

I don't think there's much overhead due to this behavior.

Hi @aslushnikov, thanks for the work providing this new feature. Regarding your last message, just to be clear, does this mean that incognito browser contexts won't be available in headless mode?

Regarding your last message, just to be clear, does this mean that incognito browser contexts won't be available in headless mode?

@QuentinLB on the contrary: browser contexts are available in both headless and headful modes.

@aslushnikov That's great, I hope this feature will be released soon, I'm still using @barbolo & @Ariex workaround from january in production.

Can someone please share a sample?

I am trying to do a similar thing here for which I have created #2670, but, haven't heard anything on it.

I have a Node Express app which uses puppeteer to launch a browser window with a pre-configured URL and perform some tasks on it. Everything works great when there is just one window open. As soon as a second window is opened, the events of the first one are swallowed and both browser windows are left in an orphaned state.

Any ideas what I should do to resolve this problem?

CC: @aslushnikov, @QuentinLB

Hi @mhaseebkhan, as I said in my previous message I use the gist provided by @Ariex : https://gist.github.com/Ariex/6f425fbcab09e13d8bec39aba7c4a9b5

My use case is somewhat different from yours because I'm not listening to anything besides console output, but for cookies and sessions this snippet has worked great so far.

Maybe you should redesign your service, Browserless made an interesting article (https://docs.browserless.io/blog/2018/06/04/puppeteer-best-practices.html) where they suggested to avoid browser & page reuse.

Hope it'll work out for you.

@aslushnikov, @QuentinLB: Here is a test script containing a README with instructions.

Please have a look and suggest.

@mhaseebkhan Nothing wrong with your code besides page = await browser.newPage({ context: 'multiple-test-context-' + Math.floor(Math.random() * 1000) }). I couldn't find any reference to this kind of parameter for browser.newPage in the current documentation.
Also, I don't know if you're aware that since the 1.5.0 release incognito contexts are available if you need them: https://github.com/GoogleChrome/puppeteer/blob/v1.5.0/docs/api.md#browsercreateincognitobrowsercontext

@QuentinLB: Here is the mention of using await browser.newPage({ context: ..... I just generated a random number to ensure that each browser has its own context.

If you try running the project and make multiple calls simultaneously, you will see the events getting jumbled up.

@mhaseebkhan You're right, I wasn't aware of this option. Unfortunately I won't be of much help since I haven't tried the new BrowserContext. Good luck.

@mhaseebkhan the api ended up implemented in a different way; please, check out the released documentation here for puppeteer v1.5.0: https://pptr.dev/#?product=Puppeteer&version=v1.5.0&show=api-class-browsercontext
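
For readers landing here, a minimal sketch of the released API follows. It assumes `browser` is a Puppeteer Browser from v1.5.0 or later, where `createIncognitoBrowserContext()`, `browserContext.newPage()` and `browserContext.close()` are documented. The wrapper name `withIsolatedPage` is hypothetical.

```javascript
// Runs `task(page)` in a fresh incognito context and always closes the
// context afterwards, even when the task throws.
async function withIsolatedPage(browser, task) {
  const context = await browser.createIncognitoBrowserContext();
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    // Closing the context also closes the pages that belong to it.
    await context.close();
  }
}
```

With a launched browser this could be called as `await withIsolatedPage(browser, (page) => page.goto('https://example.com'))`. Each call gets its own cookies and storage, without the `{ context: ... }` option from the original proposal.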

@aslushnikov is it possible to have more than 2 total browser contexts (the default, and an incognito) at any time? The use case is trying to leverage a single browser instance to spin up as many isolated sessions (to isolate cookies for each page) in parallel as possible

is it possible to have more than 2 total browser contexts (the default, and an incognito) at any time?

@antonellil totally. You can have multiple isolated incognito contexts.

@aslushnikov
According to the docs there is browser.createIncognitoBrowserContext() but no browser.createBrowserContext(). So the strict answer is "You can have multiple isolated incognito contexts and only one default browser context". Correct?

@vitalets correct.
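
To illustrate "multiple isolated incognito contexts" on one browser, here is a minimal sketch that opens several at once. It assumes `browser` is a Puppeteer Browser from v1.5.0 or later. The helper names `openIsolatedSessions` and `closeSessions` are hypothetical.

```javascript
// Opens `count` incognito contexts on the same browser, one page in each.
// The default browser context is left untouched.
async function openIsolatedSessions(browser, count) {
  const sessions = [];
  for (let i = 0; i < count; i++) {
    const context = await browser.createIncognitoBrowserContext();
    const page = await context.newPage();
    sessions.push({context, page});
  }
  return sessions;
}

// Closes every context, which also closes the pages inside them.
async function closeSessions(sessions) {
  await Promise.all(sessions.map(({context}) => context.close()));
}
```

Each `page` in the returned list can then be driven as a separate simulated user, for example by logging in with different credentials.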

@aslushnikov interesting, I'm not seeing isolation from cookies from other pages in other incognito browser contexts created on the same browser. below is the gist of the code that is running with multiple outstanding incognito browser contexts from the same browser. it results in the first navigation being successful, and all later navigations result in 30s timeouts.

browser = await puppeteerBrowserPool.acquire();

browserContext = await browser.createIncognitoBrowserContext();

page = await browserContext.newPage();

puppeteerBrowserPool.release(browser);

if (userId) {
   await cookieHelper.attachUserInfoToPuppeteerPage({page, userId});
}

await page.goto(url, {
   waitUntil: 'networkidle0',
});

browserContext.close();

I can confirm that the below code works as expected

browser = await puppeteerBrowserPool.acquire();

browserContext = await browser.createIncognitoBrowserContext();

page = await browserContext.newPage();

if (userId) {
   await cookieHelper.attachUserInfoToPuppeteerPage({page, userId});
}

await page.goto(url, {
   waitUntil: 'networkidle0',
});

await browserContext.close();

puppeteerBrowserPool.release(browser);

Both snippets reuse the same browser instances, but the latter works as far as i can tell because only 1 incognito context is active on the browser at a time. In the first snippet, having multiple incognito contexts active only results in the first navigation being successful, and the later ones to fail, which leads me to believe that the incognito contexts are not isolated from each other.

Thanks for your time and taking a look at this
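
The two snippets above differ in when the browser goes back to the pool and in whether `browserContext.close()` is awaited. The sketch below makes that ordering explicit with try/finally. It does not establish the cause of the timeouts. The `pool` object is assumed to expose `acquire()` and `release(browser)` like the poster's `puppeteerBrowserPool`, while the function name and the `prepare` callback are hypothetical.

```javascript
// Acquires a browser, does all work inside one incognito context, and only
// releases the browser after the context has been fully closed.
async function renderWithPooledBrowser(pool, url, prepare) {
  const browser = await pool.acquire();
  try {
    const context = await browser.createIncognitoBrowserContext();
    try {
      const page = await context.newPage();
      if (prepare) {
        await prepare(page); // e.g. attach cookies for a given user
      }
      await page.goto(url, {waitUntil: 'networkidle0'});
      return await page.content();
    } finally {
      await context.close();
    }
  } finally {
    pool.release(browser);
  }
}
```

Whether several contexts may be active on one pooled browser at the same time is a separate question. This version releases the browser only once its context is gone, which matches the snippet reported as working.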

Ok, I got this working. Thanks for the tip, @ks07 .

const puppeteer = require('puppeteer');
const Page = require('puppeteer/lib/Page');

async function newPageWithNewContext(browser) {
  const {browserContextId} = await browser._connection.send('Target.createBrowserContext');
  const {targetId} = await browser._connection.send('Target.createTarget', {url: 'about:blank', browserContextId});
  const client = await browser._connection.createSession(targetId);
  const page = await Page.create(client, browser._ignoreHTTPSErrors, browser._screenshotTaskQueue);
  page.browserContextId = browserContextId;
  return page;
}

async function closePage(browser, page) {
  if (page.browserContextId != undefined) {
    await browser._connection.send('Target.disposeBrowserContext', {browserContextId: page.browserContextId});
  }
  await page.close();
}

(async () => {
  const browser = await puppeteer.launch();
  const page = await newPageWithNewContext(browser);
  await page.goto('https://example.com');
  console.log(await page.cookies());

  await closePage(browser, page);
  await browser.close();
})();

UPDATE (2017-11-04):
I've updated the example with a suggestion on how to close the page and its context.

The code reports this error. How can I fix it?

(node:65668) UnhandledPromiseRejectionWarning: Error: Protocol error (Target.attachToTarget): Invalid parameters targetId: string value expected
at Promise (/opt/web/alipay/sf/pup/node_modules/puppeteer/lib/Connection.js:86:56)
at new Promise (<anonymous>)
at Connection.send (/opt/web/alipay/sf/pup/node_modules/puppeteer/lib/Connection.js:85:12)
at Connection.createSession (/opt/web/alipay/sf/pup/node_modules/puppeteer/lib/Connection.js:157:36)
at newPageWithNewContext (/opt/web/alipay/sf/pup/examples/ctx.js:9:44)
at process._tickCallback (internal/process/next_tick.js:68:7)
(node:65668) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:65668) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Just in case someone comes across this, here is working code based on the previous one:

const puppeteer = require('puppeteer');

async function newPageWithNewContext(browser) {
  const {browserContextId} = await browser._connection.send('Target.createBrowserContext');
  const page = await browser._createPageInContext(browserContextId);
  page.browserContextId = browserContextId;
  return page;
}

async function closePage(browser, page) {
  if (page.browserContextId != undefined) {
    await browser._connection.send('Target.disposeBrowserContext', {browserContextId: page.browserContextId});
  } else {
    await page.close();
  }
}

(async () => {
  const browser = await puppeteer.launch();
  const page = await newPageWithNewContext(browser);
  await page.goto('https://example.com');
  console.log(await page.cookies());

  await closePage(browser, page);
  await browser.close();
})();

But please note that this is basically the same as the new Incognito API createIncognitoBrowserContext.

Browser contexts are not working on AWS Lambda. I am getting this error:
Protocol error (Target.createTarget): Target closed.
@acalatrava Neither your code nor the createIncognitoBrowserContext API works.
Is anyone else experiencing the same?
Here's my simple code:

const context = await browser.createIncognitoBrowserContext()
const page = await context.newPage()
await page.goto('https://google.com')
await context.close();

@ramesh82 On Lambda, Chrome runs in single-process mode (a platform limitation), and additional browser contexts require that Chrome not be in single-process mode:
https://github.com/adieuadieu/serverless-chrome/blob/886c2c4f5c5b2e8f9b79670e0935ee8afb970aab/packages/lambda/src/flags.js#L11

You could try removing the single-process flag. Tell me if it works.

@Bnaya Thanks for your response. Launching a newpage using the context (context.newPage()) is timing out after removing single process flag.

I have the same issue on Lambda. Does anybody know of a workaround?

I guess you can always spawn more lambdas ;)

:) no you cannot. There's no way of choosing how many instances are running. This is the point of lambda: the scale is handled by the system. I make N calls to the lambda but I cannot choose how many lambda handle them

read about reserved-concurrent-executions :)

Thx!

So has anyone got browser contexts working on AWS Lambda?

I have the same problem @patrickkh7788
My error:
(node:12662) UnhandledPromiseRejectionWarning: Error: Protocol error (Target.attachToTarget): Invalid parameters targetId: string value expected

Any solution? Thanks.

This works for me:

````javascript
const puppeteer = require('puppeteer');

// Disposes the page's browser context (if one was recorded), then closes the page.
async function closePage(browser, page) {
  if (page.browserContextId != undefined) {
    await browser._connection.send('Target.disposeBrowserContext', { browserContextId: page.browserContextId });
  }
  await page.close();
}

async function newPageWithNewContext(browser) {
  const { browserContextId } = await browser._connection.send('Target.createBrowserContext');
  const { targetId } = await browser._connection.send('Target.createTarget', { url: 'about:blank', browserContextId });
  const targetInfo = { targetId: targetId };
  const client = await browser._connection.createSession(targetInfo);
  // Note: the public browser.newPage() takes no parameters, so these arguments
  // are probably ignored and the page may open in the default context.
  const page = await browser.newPage({ context: 'another-context' }, client, browser._ignoreHTTPSErrors, browser._screenshotTaskQueue);
  page.browserContextId = browserContextId;
  return page;
}

(async function () {
  const browser = await puppeteer.launch({ headless: false });
  const page = await newPageWithNewContext(browser);

  await page.goto('your_page');
  const a = await page.cookies();
  console.log('------->: a', a[0].value);

  const page2 = await newPageWithNewContext(browser);
  await page2.goto('your_page');
  const b = await page2.cookies();
  console.log('------->: b', b[0].value);

  await closePage(browser, page);
  await browser.close();
})();
````
