Data Integration for Production Data Stores. 💫
Conduit is a data streaming tool written in Go. It aims to provide the best user experience for building and running real-time data pipelines. Conduit comes with batteries included: it provides a UI, common connectors, processors, and observability data out of the box.
Conduit pipelines are built out of simple building blocks which run in their own goroutines and are connected using Go channels. This makes Conduit pipelines incredibly performant on multi-core machines. Conduit guarantees that the order of received records won't change. It also takes care of consistency by propagating acknowledgments to the start of the pipeline only when a record has been successfully processed on all destinations.
Conduit connectors are plugins that communicate with Conduit via a gRPC interface. This means that plugins can be written in any language as long as they conform to the required interface.
Conduit was created and open-sourced by Meroxa.
- Quick start
- Installation guide
- Connectors
- Processors
- API
- UI
- Documentation
- Known limitations
- Contributing
- Download and extract the latest release.
- Download the example pipeline and put it in a directory named pipelines in the same directory as the Conduit binary.
- Run conduit (./conduit). The example pipeline will start automatically.
- Write something to the file example.in in the same directory as the Conduit binary:
  $ echo "hello conduit" >> example.in
- Read the contents of example.out and notice an OpenCDC record:
  $ cat example.out
  {"position":"MTQ=","operation":"create","metadata":{"file.path":"./example.in","opencdc.readAt":"1663858188836816000","opencdc.version":"v1"},"key":"MQ==","payload":{"before":null,"after":"aGVsbG8gY29uZHVpdA=="}}
- The string hello conduit is stored base64-encoded in the field payload.after. Let's decode it:
  $ cat example.out | jq ".payload.after | @base64d"
  "hello conduit"
- Explore the UI by opening http://localhost:8080 and build your own pipeline!
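The decoding step from the quick start can also be done in Go. The sketch below is only an illustration: `rawRecord` and `decodeAfter` are hypothetical names that mirror the fields visible in the example output, not types from Conduit or its SDK. It relies on the fact that `encoding/json` decodes base64 strings into `[]byte` fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawRecord mirrors only the fields of the example record that this sketch
// needs. It is a hypothetical type, not the official OpenCDC record type.
type rawRecord struct {
	Operation string `json:"operation"`
	Key       []byte `json:"key"` // base64 in JSON, decoded by encoding/json
	Payload   struct {
		After []byte `json:"after"` // base64 in JSON, decoded by encoding/json
	} `json:"payload"`
}

// decodeAfter parses one line of example.out and returns payload.after as text.
func decodeAfter(line string) (string, error) {
	var r rawRecord
	if err := json.Unmarshal([]byte(line), &r); err != nil {
		return "", err
	}
	return string(r.Payload.After), nil
}

func main() {
	line := `{"position":"MTQ=","operation":"create","metadata":{"file.path":"./example.in","opencdc.readAt":"1663858188836816000","opencdc.version":"v1"},"key":"MQ==","payload":{"before":null,"after":"aGVsbG8gY29uZHVpdA=="}}`

	after, err := decodeAfter(line)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(after) // prints: hello conduit
}
```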
Download a pre-built binary from the latest release and simply run it!
./conduit
Once you see that the service is running, you may access a user-friendly web
interface at http://localhost:8080. You can also interact with
the Conduit API directly; we recommend navigating
to http://localhost:8080/openapi and exploring the HTTP API through Swagger
UI.
Conduit can be configured through command line parameters. To view the full list
of available options, run ./conduit --help.
Requirements:
git clone git@github.com:ConduitIO/conduit.git
cd conduit
make
./conduit
Note that you can also build Conduit with make build-server, which only
compiles the server and skips the UI. This command requires only Go and builds
the binary much faster. That makes it useful for development purposes or for
running Conduit as a simple backend service.
Our Docker images are hosted on GitHub's Container Registry. To run the latest Conduit version, use the following command:
docker run -p 8080:8080 ghcr.io/conduitio/conduit:latest
The Docker image includes the UI, which you can access by navigating
to http://localhost:8080.
For the full list of available connectors, see the Connector List. If there's a connector that you're looking for that isn't available in Conduit, please file an issue.
Conduit loads standalone connectors at startup. The connector binaries need to
be placed in the connectors directory relative to the Conduit binary so
Conduit can find them. Alternatively, the path to the standalone connectors can
be adjusted using the CLI flag -connectors.path.
Conduit ships with a number of built-in connectors:
- File connector provides a source/destination to read/write a local file (useful for quickly trying out Conduit without additional setup).
- Kafka connector provides a source/destination for Apache Kafka.
- Postgres connector provides a source/destination for PostgreSQL.
- S3 connector provides a source/destination for AWS S3.
- Generator connector provides a source which generates random data (useful for testing).
Additionally, we have prepared a Kafka Connect wrapper that allows you to run any Apache Kafka Connect connector as part of a Conduit pipeline.
If you are interested in writing a connector yourself, have a look at our Go Connector SDK. Since standalone connectors communicate with Conduit through gRPC, they can be written in virtually any programming language, as long as the connector follows the Conduit Connector Protocol.
A processor is a component that operates on a single record that flows through a pipeline. It can either change the record (i.e. transform it) or filter it out based on some criteria.
Conduit provides a number of built-in processors, which can be used to filter and replace fields, post payloads to HTTP endpoints, and more. Conduit also provides the ability to write custom processors in JavaScript.
More detailed information as well as examples can be found in the Processors documentation.
Conduit exposes a gRPC API and an HTTP API.
The gRPC API runs on port 8084 by default. You can define a custom address
using the CLI flag -grpc.address. To learn more about the gRPC API please have
a look at the protobuf file.
The HTTP API runs on port 8080 by default. You can define a custom address
using the CLI flag -http.address. It is generated
using gRPC gateway and thus
provides the same functionality as the gRPC API. To learn more about the HTTP
API please have a look at the API documentation or the
OpenAPI definition,
or run Conduit and navigate to http://localhost:8080/openapi to open
a Swagger UI which makes it easy to
try it out.
Conduit comes with a web UI that makes building data pipelines a breeze. You can
access it at http://localhost:8080. See
the installation guide for instructions on how to build
Conduit with the UI.
For more information about the UI refer to the Readme in /ui.
To learn more about how to use Conduit visit docs.conduit.io.
If you are interested in the internals of Conduit, we have prepared some technical documentation:
- Pipeline Semantics explains the internals of how a Conduit pipeline works.
- Pipeline Configuration Files explains how you can define pipelines using YAML files.
- Processors contains examples and more information about Conduit processors.
- Conduit Architecture will give you a high-level overview of Conduit.
- Conduit Metrics provides more information about how Conduit exposes metrics.
For a complete guide to contributing to Conduit, see the Contribution Guide.
We welcome you to join the community and contribute to Conduit to make it better! When something does not work as intended, please check whether there is already an issue that describes your problem; otherwise, please open an issue and let us know. When you are not sure how to do something, please open a discussion or hit us up on Discord.
We also value contributions in the form of pull requests. When opening a PR, please ensure:
- You have followed the Code Guidelines.
- There is no other pull request for the same update/change.
- You have written unit tests.
- You have made sure that the PR is of reasonable size and can be easily reviewed.
