docs: add superscraper blog #2828

Open
wants to merge 6 commits into
base: master

souravjain540
Collaborator

Blog post about SuperScraper, approved by Lukas, Rado, and marketing.

@souravjain540 souravjain540 requested a review from B4nan January 29, 2025 17:49
---
slug: superscraper-with-crawlee
title: 'Inside implementing Superscraper with Crawlee.'
description: 'This blog explains how SuperScraper works, highlights its implementation details, and provides code snippets to demonstrate its core functionality.'

Contributor

So far, the Crawlee blog has only been about Crawlee and we haven't mentioned Apify much. This is quite a change in direction. Are you sure we want to publish this here? Just to be sure.

Collaborator Author

I think it's good to have such pieces, too; after all, the best way to build Actors should be Crawlee. There will be more in the future.

Member

In the title it's "Superscraper", in the description it's "SuperScraper".

Also, it feels weird to say "this blog"; it's a blog post or article. The "blog" is the platform where we serve it, where all the articles are, no?

return crawler;
```

### Mapping standby HTTP requests to Crawlee requests

Contributor

@janbuchar, Jan 30, 2025

These are just isolated functions and you need to browse the repo (or use a lot of imagination) to see how it all fits together. I'd appreciate a more top-down approach - maybe start with a code snippet with the big picture and then show how the individual functions are implemented?

---
slug: superscraper-with-crawlee
title: 'Inside implementing Superscraper with Crawlee.'
description: 'This blog explains how SuperScraper works, highlights its implementation details, and provides code snippets to demonstrate its core functionality.'
authors: [SauravJ, RadoC]
---

[SuperScraper](https://github.com/apify/super-scraper) is an open-source Actor that combines features from various web scraping services, including [ScrapingBee](https://www.scrapingbee.com/), [ScrapingAnt](https://scrapingant.com/), and [ScraperAPI](https://www.scraperapi.com/).

Member

Maybe add a link on the Actor word to https://docs.apify.com/platform/actors, so people are not confused by that. We don't use that in the crawlee docs much, only in the platform and deployment guides most likely.

The following function stores a response object in the key-value map:

```js

Member

The code is TS, not JS, since you have type hints in there, and I can see the highlighting wouldn't be happy about this. Most (if not all) examples here should use `ts`.
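
For illustration only, here is a minimal sketch of the kind of function the excerpt describes: one that stores a response object in a key-value map so it can be answered later. This is not the code from the PR. `PendingResponse`, `responses`, `addResponse`, and `sendSuccessResponse` are hypothetical names, and the small interface stands in for whatever response type the Actor actually uses. Because of the type annotations, the fence is tagged `ts` rather than `js`.

```ts
// Hypothetical minimal shape of a pending HTTP response (an assumption;
// the real code may use a different response type).
interface PendingResponse {
    statusCode: number;
    end(body?: string): void;
}

// Key-value map from a request's unique key to the response waiting for it.
const responses = new Map<string, PendingResponse>();

// Store a response so a later handler can look it up by key.
function addResponse(key: string, response: PendingResponse): void {
    responses.set(key, response);
}

// Send the result to the stored response and remove the entry,
// so the map does not grow without bound. Returns false if the key is unknown.
function sendSuccessResponse(key: string, body: string): boolean {
    const response = responses.get(key);
    if (!response) return false;
    response.statusCode = 200;
    response.end(body);
    responses.delete(key);
    return true;
}
```

Deleting the entry after responding is a design choice in this sketch; it keeps a long-running server from holding on to finished responses.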

url: https://github.com/chudovskyr
image_url: https://ca.slack-edge.com/T0KRMEKK6-U04MGU11VUK-7f59c4a9343b-512
socials:
github: chudovskyr

Member

keep the line break at the end of the file
