This is a demo that showcases some of Typesense's features using 1 million commit messages from the Linux kernel repo.
View it live here: https://linux-commits-search.typesense.org/
This search experience is powered by Typesense, a fast, open source, typo-tolerant search engine. It is an open source alternative to Algolia and an easier-to-use alternative to Elasticsearch.
The dataset was extracted by running git log on the Linux kernel git repo.
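As a rough illustration of turning git log output into records, here is a minimal sketch. It assumes a hypothetical output format in which each commit is one line with fields separated by the ASCII unit separator, as produced by something like git log --pretty=format:'%H%x1f%an%x1f%at%x1f%s'. The field names and the function names parseGitLogLine and parseGitLog are illustrative assumptions, not the format or code used by this repo's scripts.

```javascript
// Assumed (hypothetical) format: one commit per line, fields separated by 0x1f:
//   <sha> \x1f <author name> \x1f <author unix timestamp> \x1f <subject>
const FIELD_SEPARATOR = "\x1f";

// Parse a single line of git log output into a plain record.
function parseGitLogLine(line) {
  const [sha, authorName, authorTimestamp, ...subjectParts] =
    line.split(FIELD_SEPARATOR);
  return {
    sha,
    author_name: authorName,
    author_timestamp: Number(authorTimestamp),
    // Re-join in case the subject itself contained the separator byte.
    subject: subjectParts.join(FIELD_SEPARATOR),
  };
}

// Parse the whole output, skipping blank lines.
function parseGitLog(output) {
  return output
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map(parseGitLogLine);
}
```

A separator byte that is unlikely to appear in commit subjects keeps the parsing simple; a real extraction script may well use a different format.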
The dataset is ~950MB on disk, with ~1 million records. It took 45 minutes to index this dataset on a 3-node Typesense cluster with 4 vCPUs per node, and the index was ~3GB in RAM.
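A dataset of this size is typically sent to Typesense in batches rather than one document at a time. The sketch below shows one way to split records into JSONL payloads, the format accepted by Typesense's document import endpoint. The helper name toJsonlBatches and the default batch size of 1000 are assumptions for illustration, not values taken from this repo's indexing script.

```javascript
// Split an array of records into JSONL strings of at most `batchSize`
// documents each (one JSON document per line).
function toJsonlBatches(records, batchSize = 1000) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < records.length; i += batchSize) {
    const jsonl = records
      .slice(i, i + batchSize)
      .map((record) => JSON.stringify(record))
      .join("\n");
    batches.push(jsonl);
  }
  return batches;
}
```

Each resulting string could then be passed to the import call of a Typesense client, for example client.collections(name).documents().import(jsonl, { action: "upsert" }) in typesense-js. The collection name and the choice of the upsert action are left to the caller.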
The app was built using the Typesense Adapter for InstantSearch.js and is hosted on S3, with CloudFront for a CDN.
The search backend is powered by a geo-distributed 3-node Typesense cluster running on Typesense Cloud, with nodes in Oregon, Frankfurt and Mumbai.
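To make the frontend and backend pieces above concrete, here is a minimal sketch of the options object that the Typesense Adapter for InstantSearch.js accepts, listing three cluster nodes. The hostnames, the API key placeholder, the query_by field "subject", and the helper name buildAdapterOptions are all hypothetical; the actual values for this demo live in its configuration and are not shown here.

```javascript
// Build the options object for the Typesense InstantSearch adapter.
// All concrete values passed in below are placeholders (assumptions).
function buildAdapterOptions({ apiKey, hosts, queryBy }) {
  return {
    server: {
      apiKey, // should be a search-only key, since it ships to the browser
      nodes: hosts.map((host) => ({ host, port: 443, protocol: "https" })),
    },
    // Search parameters applied to every query issued through the adapter.
    additionalSearchParameters: { query_by: queryBy },
  };
}

const options = buildAdapterOptions({
  apiKey: "search-only-api-key",
  hosts: ["node-1.example.com", "node-2.example.com", "node-3.example.com"],
  queryBy: "subject",
});

// In the app, this object would be handed to the adapter, roughly:
//   const adapter = new TypesenseInstantSearchAdapter(options);
//   instantsearch({ indexName: "<collection name>", searchClient: adapter.searchClient });
```

Listing every node lets the client fail over if one node is unreachable, which fits a geo-distributed cluster like the one described above.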
- src/ and index.html contain the frontend UI components, built with the Typesense Adapter for InstantSearch.js.
- scripts/ contains the scripts to extract, transform and index the git log data into Typesense.
- Create a .env file using .env.example as reference.
- Extract commit history
mkdir data/linux
cd data/linux
git clone https://github.com/torvalds/linux .
yarn extractCommitHistory:merges
yarn extractCommitHistory:nonMerges
- Transform and index the data
bundle install
gzip data/git-log-output
yarn transformDataset
yarn run typesenseServer
UPDATE_COLLECTION_ALIAS=true yarn index
- Install dependencies and run the local server:
yarn
yarn start
Open http://localhost:3000 to see the app.
The app is hosted on S3, with CloudFront for a CDN.
yarn build
yarn deploy