Scrape Crunchbase company, people, investor, and acquisition data, including website URLs, social URLs, emails, phone numbers, employee counts, and funding information.

codercurious/crunchbase-scraper

Crunchbase Scraper

Interested in using this scraper? Get it here: Crunchbase Scraper

Demo video

Crunchbase is a platform where you can discover innovative companies, connect with the people behind them, and uncover new opportunities. It has become a prime source of business information for millions of users around the world.

Features

  • Scrape Crunchbase company pages
  • Scrape Crunchbase profile URLs
  • Find organizations by website in bulk
  • Scrape Crunchbase company search results
  • Scrape Crunchbase funding search results
  • Scrape Crunchbase contacts search results
  • Scrape Crunchbase investors search results
  • Scrape Crunchbase acquisitions search results
  • Scrape Crunchbase events search results
  • Scrape Crunchbase schools search results
  • Scrape Crunchbase hubs search results
  • Scrape Crunchbase people search results

You can extract the following data from Crunchbase using this scraper:

Company data fields

Identifier | Num Employees Enum | Categories
Location Identifiers | Short Description | Rank Org Company
Website | Twitter | Facebook
LinkedIn | Contact Email | Phone Number
Stock Symbol | Num Articles | Hub Tags
Description | Job Posting Link Source | Location Group Identifiers
Diversity Spotlights | Revenue Range | Operating Status
Exited On | Founded On | Closed On
Company Type | Investor Type | Investor Stage
Num Portfolio Organizations | Num Investments/Funding Rounds | Num Lead Investments
Num Diversity Spotlight Investments | Num Exits | Num Exits IPO
Program Type | Program Application Deadline | Program Duration
School Type | School Program | Num Enrollments
Num Founder Alumni | School Method | Num Alumni
Category Groups | Num Founders | Founder Identifiers
Num Funding Rounds | Funding Stage | Last Funding At
Last Funding Total | Last Funding Type | Last Equity Funding Total
Last Equity Funding Type | Equity Funding Total | Funding Total
Num Lead Investors | Num Investors | Investor Identifiers
Num Acquisitions | Acquisition Status | Acquisition Identifier
Acquisition Price | Acquisition Announced On | Acquirer Identifier
Acquisition Type | Acquisition Terms | IPO Status
Went Public On | Delisted On | IPO Valuation
IPO Amount Raised | Stock Exchange Symbol | Last Layoff Date
Last Key Employee Change Date | Num Event Appearances | Rank Org
Rank Org School | Rank Delta D7 | Rank Delta D30
Num Org Similarities | Contact Job Departments | Num Contacts
Num Private Contacts | SEMrush Visits Latest Month | SEMrush Visits MoM %
SEMrush Visits Latest 6 Months Avg | SEMrush Visit Duration | SEMrush Visit Pageviews
SEMrush Visit Duration MoM % | SEMrush Visit Pageview MoM % | SEMrush Bounce Rate
SEMrush Bounce Rate MoM % | SEMrush Global Rank | SEMrush Global Rank MoM
SEMrush Global Rank MoM % | Builtwith Num Technologies Used | Apptopia Total Apps
Apptopia Total Downloads | Siftery Num Products | IPqwery Num Patent Granted
IPqwery Num Trademark Registered | IPqwery Popular Patent Category | IPqwery Popular Trademark Class
Aberdeen Site IT Spend | PrivCo Valuation Range | PrivCo Valuation Timestamp
Num Private Notes | Private Tags

Person data fields

Identifier | Primary Job Title | Primary Organization
Location Identifiers | Rank Person | Facebook
LinkedIn | Twitter | Gender
First Name | Last Name | Description
Location Group Identifiers | Num Articles | Attended Schools
Num Founded Organizations | Current Organizations | Num Portfolio Organizations
Num Investments/Funding Rounds | Num Partner Investments | Num Lead Investments
Num Exits | Num Diversity Spotlight Investments | Num Exits IPO
Num Event Appearances | Rank Delta D7 | Rank Delta D30
Rank Delta D90

Crunchbase data API

The actor stores its results in a dataset, which you can export in formats such as CSV, JSON, and XLS. You can also run the scraper and access its data on demand through the API. For more information, see the Crunchbase scraper API integration page.
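
As a rough sketch of on-demand access: the Apify API serves dataset items from `https://api.apify.com/v2/datasets/{datasetId}/items`, with the export format chosen by a `format` query parameter. The helper below only builds that URL and makes no network call. The function name, the placeholder dataset ID, and the short list of formats are assumptions for illustration, not part of this scraper.

```python
from typing import Optional
from urllib.parse import urlencode

# Subset of export formats handled by this sketch (an assumption, not a full list).
SUPPORTED_FORMATS = {"json", "csv", "xlsx"}


def build_dataset_items_url(
    dataset_id: str, fmt: str = "json", token: Optional[str] = None
) -> str:
    """Build the download URL for a dataset's items (hypothetical helper)."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if token:
        # Private datasets need an API token; public ones can omit it.
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_DATASET_ID" is a placeholder; use the ID of a finished run's dataset.
    print(build_dataset_items_url("YOUR_DATASET_ID", "csv"))
```

The resulting URL can be fetched with any HTTP client once a run has finished and its dataset ID is known.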

Importance of Crunchbase Data

Data from Crunchbase is highly sought after. It can provide invaluable insights about startups, their funding rounds, key individuals involved, and much more. Therefore, scraping this data can equip businesses with information necessary for decision-making and strategy development.

Why Apify for Crunchbase Data Scraping?

Apify is a web scraping and automation platform. It allows you to extract data, automate workflows, and integrate with your existing software. It's flexible, easy to use, and scalable, making it a top choice for many businesses.

Crunchbase Data Scraper: The Apify Actor

Crunchbase Data Scraper is an Apify actor that retrieves data from Crunchbase.

How Does It Work?

This actor navigates Crunchbase's complex website structure, finds the relevant data, and extracts it in a structured, usable format.

Features of the Crunchbase Data Scraper

The Crunchbase Data Scraper actor can extract company profiles, key person profiles, funding rounds, acquisitions, and more. You can specify which data you want it to collect.

Use Cases

Crunchbase Data Scraper is useful for market researchers, sales teams, data analysts, and others. It helps streamline processes ranging from lead generation to industry analysis.

Sample output data for company search results

{
	"uuid": "e36f580e-6c0e-47de-accf-15de75f62cc9",
	"name": "Stability AI",
	"type": "organization",
	"imageUrl": "https://res.cloudinary.com/crunchbase-production/image/upload/c_lpad,h_25,w_25,f_auto,b_white,q_auto:eco,dpr_1/yngvetlwqatjdqwmxg9g",
	"link": "https://www.crunchbase.com/organization/stability-ai",
	"numberOfEmployees": [
		51,
		100
	],
	"website": {
		"value": "https://stability.ai"
	},
	"linkedin": {
		"value": "https://www.linkedin.com/company/stability-ai"
	},
	"short_description": "Stability AI is an artificial intelligence-driven visual art startup that designs and implements open AI tools.",
	"categories": [
		{
			"entity_def_id": "category",
			"permalink": "artificial-intelligence",
			"uuid": "c4d8caf3-5fe7-359b-f9f2-2d708378e4ee",
			"value": "Artificial Intelligence"
		},
		{
			"entity_def_id": "category",
			"permalink": "image-recognition",
			"uuid": "af9307c9-6413-72ae-aac7-4391df240dd2",
			"value": "Image Recognition"
		},
		{
			"entity_def_id": "category",
			"permalink": "information-technology-dbca",
			"uuid": "dbca89fa-f083-5438-b4ad-d3fdeceb78e7",
			"value": "Information Technology"
		},
		{
			"entity_def_id": "category",
			"permalink": "software",
			"uuid": "c08b5441-a05b-9777-b7a6-012728caddd9",
			"value": "Software"
		}
	],
	"location_identifiers": [
		{
			"permalink": "london-england",
			"uuid": "aad17950-576b-8c44-8fd4-f44dbeb59220",
			"location_type": "city",
			"entity_def_id": "location",
			"value": "London"
		},
		{
			"permalink": "england-united-kingdom",
			"uuid": "79eb923b-9e93-e0db-2fe0-75f0c430c2cb",
			"location_type": "region",
			"entity_def_id": "location",
			"value": "England"
		},
		{
			"permalink": "united-kingdom",
			"uuid": "a30e342c-1742-6b1c-66e9-461de680e54b",
			"location_type": "country",
			"entity_def_id": "location",
			"value": "United Kingdom"
		},
		{
			"permalink": "europe",
			"uuid": "6106f5dc-823e-5da8-40d7-51612c0b2c4e",
			"location_type": "continent",
			"entity_def_id": "location",
			"value": "Europe"
		}
	],
	"twitter": {
		"value": "https://www.twitter.com/stabilityai"
	},
	"contact_email": "[email protected]",
	"rank_org_company": 40
}

Documentation

This JSON object represents one company returned by a search of the Crunchbase database.

Here are the descriptions for each field in the JSON data:

  • uuid: The unique identifier for the company in the database. Each uuid is a string following the standard UUID format.
  • name: The official name of the company.
  • type: This field indicates the type of the entry. For this particular entry, the type is 'organization'.
  • imageUrl: URL of the company's logo or relevant image.
  • link: The direct link to the company's profile on Crunchbase.
  • numberOfEmployees: A two-element array giving the lower and upper bounds of the company's employee count, for example [51, 100].
  • website: An object whose value field holds the company's official website URL.
  • linkedin: An object whose value field holds the URL of the company's LinkedIn profile.
  • short_description: A brief description of the company and its primary functions or industry.
  • categories: An array of category objects that the company falls under. Each object in the array has the following fields:
    • entity_def_id: The identifier of the category entity.
    • permalink: A URL-friendly version of the category name.
    • uuid: The unique identifier for the category.
    • value: The actual name of the category.
  • location_identifiers: An array of location objects that correspond to the company's location. Each object in the array has the following fields:
    • permalink: A URL-friendly version of the location name.
    • uuid: The unique identifier for the location.
    • location_type: The type of the location. It could be 'city', 'region', 'country', or 'continent'.
    • entity_def_id: The identifier of the location entity.
    • value: The actual name of the location.
  • twitter: An object whose value field holds the URL of the company's Twitter profile.
  • contact_email: The contact email address for the company.
  • rank_org_company: The company's rank among other companies in the Crunchbase database.
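
The fields described above can be reduced to a flat row for spreadsheets or analysis. The following is a minimal sketch: `flatten_company` and its output column names are hypothetical, and it assumes that any field may be missing from a given record.

```python
from typing import Any, Dict, Optional


def flatten_company(record: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Reduce one company search result to a flat row (hypothetical helper)."""

    def link(field: str) -> Optional[str]:
        # website, linkedin, and twitter are objects with a single "value" key.
        return (record.get(field) or {}).get("value")

    # numberOfEmployees is a [min, max] pair; anything else is treated as unknown.
    employees = record.get("numberOfEmployees") or []
    emp_min, emp_max = employees if len(employees) == 2 else (None, None)

    # Index locations by location_type: city, region, country, continent.
    places = {
        loc.get("location_type"): loc.get("value")
        for loc in record.get("location_identifiers", [])
    }

    return {
        "uuid": record.get("uuid"),
        "name": record.get("name"),
        "website": link("website"),
        "linkedin": link("linkedin"),
        "twitter": link("twitter"),
        "employees_min": emp_min,
        "employees_max": emp_max,
        "city": places.get("city"),
        "country": places.get("country"),
        # Join category names into one cell so the row stays flat.
        "categories": "; ".join(
            c.get("value", "") for c in record.get("categories", [])
        ),
        "rank_org_company": record.get("rank_org_company"),
    }


if __name__ == "__main__":
    # Trimmed copy of the sample record shown earlier.
    sample = {
        "name": "Stability AI",
        "numberOfEmployees": [51, 100],
        "website": {"value": "https://stability.ai"},
        "categories": [{"value": "Artificial Intelligence"}, {"value": "Software"}],
        "location_identifiers": [{"location_type": "city", "value": "London"}],
    }
    print(flatten_company(sample))
```

Missing fields come back as None instead of raising an error, which keeps rows aligned when only some companies list a Twitter profile or an employee range.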

The structure of this JSON data makes it easy to parse and use in various applications, such as website scrapers, data analysis tools, and so on. Remember, however, to respect the data usage terms and conditions of Crunchbase when using their data.