Question 1

How does Ceven handle dynamic content that loads after the initial page load?

Accepted Answer

Ceven uses Browserless to launch a full headless Chrome instance rather than making a plain HTTP request. The browser executes the page's JavaScript and waits for the DOM to stabilize. You can specify wait conditions in your workflow, telling the agent to wait for a specific CSS selector to appear before it attempts to scrape the data. This ensures that you get the final rendered state of the page, including content fetched via AJAX or rendered client-side by frameworks such as React, which traditional HTTP-only scrapers usually miss entirely.
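
As a rough illustration of the wait-for-selector pattern described above, here is a minimal sketch using Puppeteer-style page methods (goto, waitForSelector, content). The WaitablePage interface, the function name scrapeRendered, and the 15 second default timeout are assumptions made for this example. How the script is wired into a Ceven workflow is not shown.

```typescript
// Minimal structural type covering only the Puppeteer-style page methods used
// here, so the sketch type-checks without importing puppeteer.
interface WaitablePage {
  goto(url: string, options?: { waitUntil?: string }): Promise<unknown>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
  content(): Promise<string>;
}

// Navigate, wait until the target element exists in the DOM, then return the
// rendered HTML. The promise rejects if the selector never appears in time.
async function scrapeRendered(
  page: WaitablePage,
  url: string,
  selector: string,
  timeoutMs = 15000, // assumed default, tune for the site being scraped
): Promise<string> {
  await page.goto(url, { waitUntil: "domcontentloaded" });
  await page.waitForSelector(selector, { timeout: timeoutMs });
  return page.content();
}
```

Waiting on a selector that only exists once the data has rendered is usually more reliable than a fixed sleep, because it returns as soon as the content is ready and fails loudly when it never arrives.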

Question 2

Can Browserless handle sites with aggressive bot detection?

Accepted Answer

Yes. Ceven uses the Browserless unblocker feature, which rotates proxies and mimics real human browser fingerprints. This includes rotating user agents and handling the TLS handshake in a way that looks like a standard consumer browser. When the agent detects a block or a CAPTCHA, it can automatically reroute the request through the unblocker endpoint. This significantly increases the success rate on sites that actively try to block headless browsers or automated scripts, though some very high-security sites may still require manual proxy configuration.
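
The reroute-on-block behavior can be pictured with a small sketch. Everything here is hypothetical: the Fetcher type, the looksBlocked heuristic (HTTP 403 or the word "captcha" in the body), and the two injected fetchers stand in for whatever Ceven and Browserless actually do internally, which the answer above does not specify.

```typescript
interface FetchResult {
  status: number;
  html: string;
}

// A fetcher is any function that loads a URL and reports status plus HTML.
type Fetcher = (url: string) => Promise<FetchResult>;

// Hypothetical heuristic: treat HTTP 403 or a CAPTCHA marker as a block.
function looksBlocked(result: FetchResult): boolean {
  return result.status === 403 || /captcha/i.test(result.html);
}

// Try the ordinary path first and fall back to the unblocker only when needed.
async function fetchWithUnblockFallback(
  url: string,
  direct: Fetcher,
  unblocked: Fetcher,
): Promise<{ result: FetchResult; usedUnblocker: boolean }> {
  const first = await direct(url);
  if (!looksBlocked(first)) {
    return { result: first, usedUnblocker: false };
  }
  return { result: await unblocked(url), usedUnblocker: true };
}
```

Trying the direct path first keeps the more expensive unblocker route for the requests that actually need it.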

Question 3

What are the limitations regarding session duration and timeouts?

Accepted Answer

Browserless imposes a maximum execution time on each session to prevent runaway scripts from consuming all resources. The exact timeout depends on your Browserless plan, and a script that exceeds it is terminated by the server. If you are running a very heavy Puppeteer script that performs dozens of page navigations, you might hit this limit. To avoid it, we recommend breaking long tasks into smaller, discrete workflow steps. The agent can then save state in a database between calls and start a fresh browser session for each subtask.
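
One way to picture the split-into-steps advice is a batch runner with a checkpoint. This is only a sketch: CheckpointStore is a hypothetical stand-in for the database mentioned above, and runBatch represents a step that opens and closes its own browser session.

```typescript
// Hypothetical persistence layer; in practice this would be a database row.
interface CheckpointStore {
  load(): Promise<number>; // index of the next unprocessed batch
  save(nextBatch: number): Promise<void>;
}

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Run each batch as its own short step and record progress after every one,
// so a killed or failed step can resume from the last checkpoint.
async function runInBatches<T>(
  items: T[],
  batchSize: number,
  runBatch: (batch: T[]) => Promise<void>,
  store: CheckpointStore,
): Promise<void> {
  const batches = chunk(items, batchSize);
  let next = await store.load();
  for (const batch of batches.slice(next)) {
    await runBatch(batch); // expected to use a fresh browser session
    next += 1;
    await store.save(next);
  }
}
```

If a step fails partway through, calling runInBatches again with the same store skips the batches that already completed.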

Question 4

How does the PDF generation handle page layouts and CSS?

Accepted Answer

The PDF tool in Browserless uses Chrome's Print to PDF functionality, so it renders the page as Chrome would if you pressed Ctrl+P. It supports custom page formats, margins, and the waitUntil parameter, which can delay PDF generation until the network is idle. If a site has a print stylesheet, Chrome honors it by default. You can use custom Puppeteer scripts via Ceven to hide specific elements such as nav bars or footers before triggering the PDF print to get a cleaner document.
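
A custom script for a cleaner PDF might look roughly like the following sketch. It relies on Puppeteer-style page methods (goto with waitUntil, addStyleTag, pdf). The PdfPage interface, the helper names, the A4 format, and the 20mm margins are illustrative assumptions, not Ceven defaults.

```typescript
// Structural type for the Puppeteer-style page methods this sketch relies on.
interface PdfPage {
  goto(url: string, options?: { waitUntil?: string }): Promise<unknown>;
  addStyleTag(options: { content: string }): Promise<unknown>;
  pdf(options?: {
    format?: string;
    printBackground?: boolean;
    margin?: { top?: string; right?: string; bottom?: string; left?: string };
  }): Promise<Uint8Array>;
}

// Build a print-only CSS rule that hides the given selectors.
function hidePrintCss(selectors: string[]): string {
  if (selectors.length === 0) return "";
  return `@media print { ${selectors.join(", ")} { display: none !important; } }`;
}

// Load the page, wait for the network to go idle, hide unwanted elements in
// print media, then render the PDF.
async function renderCleanPdf(
  page: PdfPage,
  url: string,
  hideSelectors: string[],
): Promise<Uint8Array> {
  await page.goto(url, { waitUntil: "networkidle0" });
  const css = hidePrintCss(hideSelectors);
  if (css !== "") {
    await page.addStyleTag({ content: css });
  }
  return page.pdf({
    format: "A4", // assumed page size
    printBackground: true,
    margin: { top: "20mm", right: "20mm", bottom: "20mm", left: "20mm" },
  });
}
```

Because the rule is scoped to print media, it changes only the PDF output and leaves the on-screen rendering untouched.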

Question 5

Is there a limit to how many pages I can scrape at once?

Accepted Answer

The limit is determined by your Browserless concurrency settings. If your plan allows five concurrent sessions and your Ceven workflow tries to launch ten browsers at the same moment, the Browserless API returns a 429 rate limit error for the excess requests. Ceven handles this with exponential backoff retry logic: it queues the failed requests and retries them after a delay that grows with each attempt. For very high volume needs, you should increase your concurrency limit in the Browserless dashboard to avoid the added latency.
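
Exponential backoff on a 429 can be sketched as follows. The 500 ms base delay, 30 second cap, and five-retry limit are assumed values chosen for illustration. Ceven's actual retry parameters are not documented here.

```typescript
interface ApiResponse<T> {
  status: number;
  body: T;
}

// Delay doubles with each attempt: 500, 1000, 2000, ... up to the cap.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

function defaultSleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Re-send the request while the server answers 429, waiting longer each time.
// Any other status is returned to the caller unchanged.
async function withBackoff<T>(
  send: () => Promise<ApiResponse<T>>,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<ApiResponse<T>> {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (response.status !== 429) return response;
    if (attempt >= maxRetries) {
      throw new Error(`still rate limited after ${maxRetries} retries`);
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

With these assumed defaults, a request that is rejected twice and then accepted waits 500 ms and then 1000 ms before succeeding. The sleep function is injectable so the logic can be exercised without real delays.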

Question 6

Can I interact with forms and click buttons on a page?

Accepted Answer

Absolutely. While the simple fetch and scrape tools are for read-only tasks, the Execute Custom Function tool allows you to send full Puppeteer scripts. This means the agent can click buttons, type text into input fields, select dropdown options, and navigate through a multi-step checkout or login flow. You can define the sequence of actions in the script and then have the agent return the final HTML or a screenshot of the result. This makes Browserless a powerful tool for end-to-end browser automation beyond simple scraping.
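
The kind of script this refers to might resemble the sketch below, which fills in a login form and returns the resulting HTML using Puppeteer-style page methods (type, click, waitForNavigation, content). The FormPage and LoginForm types and all selectors are hypothetical. The exact wrapper that the Execute Custom Function tool expects around such a script is not shown.

```typescript
// Structural type for the Puppeteer-style page methods used in this sketch.
interface FormPage {
  goto(url: string): Promise<unknown>;
  type(selector: string, text: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForNavigation(): Promise<unknown>;
  content(): Promise<string>;
}

// Hypothetical description of the form being automated.
interface LoginForm {
  url: string;
  userSelector: string;
  passwordSelector: string;
  submitSelector: string;
}

// Fill in the form, submit it, and return the HTML of the page that follows.
async function submitLogin(
  page: FormPage,
  form: LoginForm,
  username: string,
  password: string,
): Promise<string> {
  await page.goto(form.url);
  await page.type(form.userSelector, username);
  await page.type(form.passwordSelector, password);
  // Start waiting for the navigation before clicking so it is not missed.
  await Promise.all([page.waitForNavigation(), page.click(form.submitSelector)]);
  return page.content();
}
```

Starting waitForNavigation before the click avoids a race in which the page navigates before the script begins listening.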

Question 7

How are files handled when downloading via a headless browser?

Accepted Answer

When a script triggers a file download in Browserless, the file is temporarily stored in a fresh download directory on the Browserless server. Ceven uses the Download file tool to fetch that file from the server and bring it into your workflow context. Because these directories are ephemeral and deleted once the session ends, the agent must retrieve the file immediately after the download completes. If the session closes before the file is pulled, the data is lost and the script must be run again to regenerate the file.
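
The ordering constraint above (fetch the file before the session ends) can be expressed as a small sketch. The DownloadSession interface and its three methods are hypothetical placeholders for the script that triggers the download, the Download file tool, and session teardown. They are not real Browserless or Ceven APIs.

```typescript
// Hypothetical view of a session; the method names are illustrative only.
interface DownloadSession {
  // Resolves with the server-side file name once the download has completed.
  triggerDownload(): Promise<string>;
  fetchFile(name: string): Promise<Uint8Array>;
  close(): Promise<void>;
}

// Pull the file while the session is still open, then always close the session.
async function downloadThenClose(session: DownloadSession): Promise<Uint8Array> {
  try {
    const name = await session.triggerDownload();
    return await session.fetchFile(name);
  } finally {
    // Runs after the fetch has settled, so the ephemeral directory is
    // removed only once the file has been retrieved or the attempt failed.
    await session.close();
  }
}
```

Putting the close call in a finally block ensures that the session is released even when the download fails, while never closing it before the fetch has finished.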

Question 8

Does Browserless support authenticated sessions or cookies?

Accepted Answer

Yes, you can pass cookies or local storage state into your Browserless requests. When using custom scripts, you can use the Puppeteer API to set cookies before navigating to a page. This allows the agent to act on behalf of a logged-in user. However, be aware that some sites use short-lived session tokens or multi-factor authentication that can invalidate these cookies quickly. In those cases, it is often better to have the agent perform the login flow as part of the script, using stored credentials provided securely via Ceven environment variables.
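
Setting cookies before navigation could be sketched like this, using a Puppeteer-style page.setCookie call. The CookiePage and CookieInput types and the expiry filter are assumptions for illustration, and the cookie API available may differ between Puppeteer versions.

```typescript
// Subset of Puppeteer-style cookie fields; expires is Unix time in seconds.
interface CookieInput {
  name: string;
  value: string;
  domain: string;
  path?: string;
  expires?: number;
}

interface CookiePage {
  setCookie(...cookies: CookieInput[]): Promise<void>;
  goto(url: string): Promise<unknown>;
}

// Drop cookies that have already expired. Cookies with no expiry (or a
// negative one) are treated as session cookies and kept.
function unexpired(cookies: CookieInput[], nowSeconds: number): CookieInput[] {
  return cookies.filter(
    (c) => c.expires === undefined || c.expires < 0 || c.expires > nowSeconds,
  );
}

// Set the usable cookies, then navigate. Returns how many cookies were set so
// the caller can fall back to a scripted login when the result is zero.
async function openWithCookies(
  page: CookiePage,
  url: string,
  cookies: CookieInput[],
  nowSeconds: number = Date.now() / 1000,
): Promise<number> {
  const usable = unexpired(cookies, nowSeconds);
  if (usable.length > 0) {
    await page.setCookie(...usable);
  }
  await page.goto(url);
  return usable.length;
}
```

Checking expiry up front gives the workflow an early signal that the stored session is stale, which is the point at which a scripted login becomes the better option.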
