Still getting blocked by indeed #71

Closed
@MounirAia

I am using camoufox (with Crawlee and an Apify residential proxy), and I am still getting blocked by Indeed's Cloudflare protection.

Is there something I am missing? I am asking because the official camoufox Python repo says it can bypass Cloudflare's protection mechanism, and I don't know whether the camoufox-js package supports the same features as the official repo.

const maxRequestsPerCrawl = userInput.maxItems ? { maxRequestsPerCrawl: userInput.maxItems } : {};
const crawler = new PlaywrightCrawler({
  proxyConfiguration: proxyConfiguration,
  requestHandler: router,
  launchContext: {
    launcher: firefox,
    launchOptions: await camoufoxLaunchOptions({
      headless: false,
      proxy: await proxyConfiguration?.newUrl(),
      geoip: true,
      os: "windows",
    }),
  },
  maxConcurrency: 5,
  maxRequestRetries: 100,
  navigationTimeoutSecs: 10,
  // requestHandlerTimeoutSecs: 3,
  preNavigationHooks: [
    async ({ page }) => {
      // Enable request interception
      await page.route("**/*", (route) => {
        const request = route.request();
        const resourceType = request.resourceType();

        // Allow only 'document' (HTML pages), block everything else
        if (resourceType !== "document") {
          route.abort(); // Block request
        } else {
          route.continue(); // Allow request
        }
      });
    },
  ],
  ...maxRequestsPerCrawl,
});
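
For anyone reproducing this, the routing rule in the hook above can be pulled out into a small pure function so the allowlist is easy to change between runs. This is only a sketch: `decideRoute`, `ResourceDecision`, and the wider `relaxed` allowlist are hypothetical names introduced for illustration, not part of Crawlee, Playwright, or camoufox-js. Whether widening the allowlist has any effect on the Cloudflare block is an untested assumption, not something established in this issue.

```typescript
// Hypothetical helper (not part of any library): decides whether a request
// should pass, given its Playwright resource type such as "document",
// "script", "image", "xhr", or "fetch".
type ResourceDecision = "continue" | "abort";

function decideRoute(
  resourceType: string,
  allowed: ReadonlySet<string> = new Set(["document"]),
): ResourceDecision {
  return allowed.has(resourceType) ? "continue" : "abort";
}

// Default allowlist: same rule as the hook in the snippet above,
// where only HTML documents are let through.
const strict = decideRoute("script");

// A wider allowlist, assumed here purely for experimentation.
const relaxed: ReadonlySet<string> = new Set(["document", "script", "xhr", "fetch"]);
const relaxedDecision = decideRoute("script", relaxed);

console.log(strict, relaxedDecision); // prints "abort continue"
```

Inside the `page.route` callback, the result would be used to pick between `route.continue()` and `route.abort()`, passing `route.request().resourceType()` as the first argument.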

Labels: t-tooling (issues with this label are in the ownership of the tooling team)