Commit 5d2d2a5

chore: polish the arch and other chapters of readme (#2470)
Signed-off-by: daniel-y <[email protected]>
1 parent 8a3d19d commit 5d2d2a5

11 files changed, +20 -51 lines changed
README.md

+18 -33
@@ -1,4 +1,4 @@
-# AutoMQ: A stateless Kafka on S3, offering 10x cost savings and scaling in seconds.
+# AutoMQ: A stateless Kafka® on S3, offering 10x cost savings and scaling in seconds.
 
 <div align="center">
 <p align="center">
@@ -65,8 +65,8 @@ There are more deployment options available:
 - [Try AutoMQ on AWS Marketplace (Two Weeks Free Trial)](https://docs.automq.com/automq-cloud/getting-started/install-byoc-environment/aws/install-env-from-marketplace)
 - [Try AutoMQ on Alibaba Cloud Marketplace (Two Weeks Free Trial)](https://market.aliyun.com/products/55530001/cmgj00065841.html)
 
-## 🗞️ Newest Feature
-Table Topic feature for unified stream and data analysis, which now supports the S3 table feature announced at the 2024 re:Invent. [Learn more](https://www.automq.com/blog/automq-table-topic-seamless-integration-with-s3-tables-and-iceberg).
+## 🗞️ Newest Feature - Table Topic
+Table Topic is a new feature in AutoMQ that combines stream and table functionalities to unify streaming and data analysis. Currently, it supports Apache Iceberg and integrates with catalog services such as AWS Glue, HMS, and the Rest catalog. Additionally, it natively supports S3 tables, a new AWS product announced at the 2024 re:Invent. [Learn more](https://www.automq.com/blog/automq-table-topic-seamless-integration-with-s3-tables-and-iceberg).
 
 ![image](https://github.com/user-attachments/assets/6b2a514a-cc3e-442e-84f6-d953206865e0)
 
@@ -89,18 +89,17 @@ Here are some key highlights of AutoMQ that make it an ideal choice to replace y
 - **100% Kafka Compatible**: Fully compatible with Apache Kafka, offering all features with greater cost-effectiveness and operational efficiency.
 
 ## ✨Architecture
+AutoMQ is a fork of the open-source [Apache Kafka](https://github.com/apache/kafka). We've introduced a new storage engine based on object storage, transforming the classic shared-nothing architecture into a shared storage architecture.
 
 ![image](./docs/images/automq_simple_arch.png)
 
-AutoMQ's Shared Storage architecture revolutionizes the storage layer of Apache Kafka by offloading data to cloud storage, thereby rendering the Broker stateless. This architecture incorporates both WAL (Write-Ahead Logging) storage and object storage, storing all data in object storage in near real-time.
+Regarding the architecture of AutoMQ, it is fundamentally different from Kafka. The core difference lies in the storage layer of Apache Kafka and how we leverage object storage to achieve a stateless broker architecture. AutoMQ consists of below key components:
+- S3 Storage Adapter: an adapter layer that reimplements the UnifiedLog, LocalLog, and LogSegment classes to create logs on S3 instead of a local disk. Traditional local disk storage is still supported if desired.
+- S3Stream: a shared streaming storage library that encapsulates various storage modules, including WAL and object storage. WAL is a write-ahead log optimized for frequent writes and low IOPS to reduce S3 API costs. To boost read performance, we use LogCache and BlockCache for improved efficiency.
+- Auto Balancer: a component that automatically balances traffic and partitions between brokers, eliminating the need for manual reassignment. Unlike Kafka, this built-in feature removes the need for cruise control.
+- Rack-aware Router: Kafka has long faced cross-AZ traffic fees on AWS and GCP. Our shared storage architecture addresses this by using a rack-aware router to provide clients in different AZs with specific partition metadata, avoiding cross-AZ fees while exchanging data through object storage.
 
-In this setup:
-
-- Object storage is the primary data repository, providing a flexible, cost-effective, and scalable storage solution.
-- AutoMQ introduces a WAL storage layer to counter the high latency and low IOPS associated with Object storage, thereby improving data write efficiency and lowering IOPS usage.
-- The WAL storage layer is adaptable, allowing for the selection of various storage services across different cloud providers to cater to diverse durability and performance needs. Azure Zone-redundant Disk, GCP Regional Persistent Disk, and Alibaba Cloud Regional ESSD are ideal for ensuring multi-AZ durability. For cost-effective solutions on AWS with relaxed latency scenarios, S3 can serve as WAL. Additionally, AWS EFS/FSx can balance latency and cost for critical workloads when used as WAL.
-
-AutoMQ has developed a shared streaming storage library, S3Stream, which encapsulates these storage modules. By replacing the native Apache Kafka® Log storage with S3Stream, the entire Broker node becomes entirely stateless. This transformation significantly streamlines operations such as second-level partition reassignment, automatic scaling, and traffic self-balancing. To facilitate this, AutoMQ has integrated Controller components like Auto Scaling and Auto Balancing within its kernel, which oversee cluster scaling operations and traffic rebalancing, respectively. Please refer to [here](https://docs.automq.com/automq/architecture/overview) for more architecture details.
+For more on AutoMQ's architecture, visit [AutoMQ Architecture](https://docs.automq.com/automq/architecture/overview) or explore the source code directly.
 
 ## 💬 Community
 You can join the following groups or channels to discuss or ask questions about AutoMQ:
@@ -111,30 +110,16 @@ You can join the following groups or channels to discuss or ask questions about
 ## 👥 How to contribute
 If you've found a problem with AutoMQ, please open a [GitHub Issues](https://github.com/AutoMQ/automq/issues).
 To contribute to AutoMQ please see [Code of Conduct](CODE_OF_CONDUCT.md) and [Contributing Guide](CONTRIBUTING_GUIDE.md).
-We have a list of [good first issues](https://github.com/AutoMQ/automq/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) that help you to get started, gain experience, and get familiar with our contribution process. To claim one, simply reply with 'pick up' in the issue and the AutoMQ maintainers will assign the issue to you. If you have any questions about the 'good first issue' please feel free to ask. We will do our best to clarify any doubts you may have.
-
-## 👍 AutoMQ Business Edition
-The business edition of AutoMQ provides a powerful and easy-to-use control plane to help you manage clusters effortlessly. Meanwhile, the control plane is more powerful in terms of availability and observability compared to the community edition.
-
-> You can check the difference between the community and business editions [here](https://www.automq.com/product).
-
-
-<b>Watch the following video and refer to our [docs](https://docs.automq.com/automq-cloud/getting-started/install-byoc-environment/aws/install-env-via-terraform-module) to see how to deploy AutoMQ Business Edition with 2 weeks free license for PoC.</b>
-
-<b> ⬇️ ⬇️ ⬇️ </b>
-
-[![Deploy AutoMQ Business Edition with Terraform](https://img.youtube.com/vi/O40zp81x97w/0.jpg)](https://www.youtube.com/watch?v=O40zp81x97w)
-
-
+We have a list of [good first issues](https://github.com/AutoMQ/automq/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) that help you to get started, gain experience, and get familiar with our contribution process.
 
-### Free trial of AutoMQ Business Edition
-To allow users to experience the capabilities of the AutoMQ business edition without any barriers, click [here](https://www.automq.com/quick-start#Cloud?utm_source=github_automq_cloud) to apply for a no-obligation cluster trial, and note `AutoMQ Cloud Free Trial` in the message input box. We will immediately initialize an AutoMQ Cloud control panel for you soon in the cloud and give you the address of the control panel. Then, you can use the control panel to create a AutoMQ cluster or perform operations like scale in/out.
+## 👍 AutoMQ Enterprise Edition
+The enterprise edition of AutoMQ offers a robust, user-friendly control plane for seamless cluster management, with enhanced availability and observability over the open-source version. Additionally, we offer [Kafka Linking](https://www.automq.com/solutions/kafka-linking) for zero-downtime migration from any Kafka-compatible cluster to AutoMQ.
 
-No need to bind a credit card, no cost at all. We look forward to receiving valuable feedback from you to make our product better. If you want to proceed with a formal POC, you can also contact us through [Contact Us](https://www.automq.com/contact). We will further support your official POC.
+[Contact us](https://www.automq.com/contact) for more information about the AutoMQ enterprise edition, and we'll gladly assist with your free trial.
 
-## 🐱 The relationship with Apache Kafka
+## 📜 License
+AutoMQ is under the Apache 2.0 license. See the [LICENSE](https://github.com/AutoMQ/automq/blob/main/LICENSE) file for details.
 
-AutoMQ is a fork of the open-source [Apache Kafka](https://github.com/apache/kafka). Based on the Apache Kafka codebase, we found an aspect at the LogSegment level, and replaced Kafka's storage layer with our self-developed cloud-native stream storage engine, [S3Stream](https://github.com/AutoMQ/automq/tree/main/s3stream). This engine can provide customers with high-performance, low-cost, and unlimited stream storage capabilities based on cloud storage like EBS WAL and S3. As such, AutoMQ completely retains the code of Kafka's computing layer and is 100% fully compatible with Apache Kafka. We appreciate the work done by the Apache Kafka community and will continue to embrace the Kafka community.
+## 📝 Trademarks
+Apache®, Apache Kafka®, Kafka®, Apache Iceberg®, Iceberg® and associated open source project names are trademarks of the Apache Software Foundation
 
-## 🙋 Contact Us
-Want to learn more, [Talk with our product experts](https://www.automq.com/contact).

docs/images/automq-architecture.png

-2.05 MB
Binary file not shown.

docs/images/automq-kafka-compare.png

-228 KB
Binary file not shown.

docs/images/automq_dashboard.gif

-2.31 MB
Binary file not shown.

docs/images/automq_dashboard.jpeg

-694 KB
Binary file not shown.
-411 KB
Binary file not shown.

docs/images/automq_simple_arch.png

-13.6 KB

docs/images/automq_vs_kafka.gif

-1.07 MB
Binary file not shown.
-504 KB
Binary file not shown.

docs/images/banner-readme.jpeg

-4.81 MB
Binary file not shown.

s3stream/README.md

+2 -18
@@ -1,6 +1,6 @@
 ## S3Stream: A Shared Streaming Storage Library
-S3Stream is a shared streaming storage library that provides a unified interface for reading and writing streaming data to cloud object storage services like Amazon S3, Google Cloud Storage, and Azure Blob Storage. EBS is utilized here for its low-latency capabilities. It is designed to be used as the storage layer for distributed systems like Apache Kafka, Apache RocketMQ, etc. It provides the following features:
-* **High Reliability**: S3Stream leverages cloud storage services(EBS and S3) to achieve zero RPO, RTO in seconds and 99.999999999% durability.
+S3Stream is a shared streaming storage library offering a unified interface for reading and writing streaming data to cloud object storage services such as Amazon S3, Google Cloud Storage, Azure Blob Storage, and any S3-compatible storage like MinIO. It is designed to be used as the storage layer for distributed streaming storage systems like Apache Kafka. It provides the following features:
+* **High Reliability**: S3Stream leverages cloud storage services to achieve zero RPO, RTO in seconds and 99.999999999% durability.
 * **Cost Effective**: S3Stream is designed for optimal cost and efficiency on the cloud. It can cut Apache Kafka billing by 90% on the cloud.
 * **Unified Interface**: S3Stream provides a unified interface for reading and writing streaming data to cloud object storage services.
 * **High Performance**: S3Stream is optimized for high performance and low latency. It can handle high throughput and low latency workloads.
@@ -43,19 +43,3 @@ public interface Stream {
 }
```
 > Please refer to the [S3Stream API](src/main/java/com/automq/stream/api/Stream.java) for the newest API details.
-
-## S3Stream Architecture
-![image](../docs/images/automq_s3stream_architecture.gif)
-
-In S3Stream's core architecture, data is initially written to the Write-Ahead Log (WAL) persistently, then it's uploaded to S3 storage in a near real-time fashion. To efficiently support two reading paradigms—Tailing Read and Catch-up Read—S3Stream incorporates a built-in Message Cache to expedite reading operations.
-- **WAL Storage**: Opt for a storage medium with low latency; each WAL disk requires only a few GiB of space, with cloud storage like EBS typically being the choice.
-- **S3 Storage**: Select the cloud provider's largest object storage service to offer high-throughput, cost-effective primary data storage solutions.
-- **Message Cache**: Hot data and prefetched cold data are both stored in the cache to expedite reading. Simultaneously, they are efficiently evicted based on the consumer focus mechanism, thereby enhancing memory utilization efficiency.
-
-## Various WAL Storage Options
-![image](../docs/images/automq_wal_architecture.gif)
-S3Stream supports various WAL storage options, including EBS, Regional EBS, S3, and other cloud storage services.
-- **EBS WAL**: EBS is the default choice for WAL storage, offering low latency and high durability.
-- **Regional EBS WAL**: On Azure, GCP, and Alibaba Cloud, Regional EBS replicas span multiple AZs.
-- **S3 WAL**: Utilizing S3 as a WAL eliminates the need for EBS, streamlining the architecture to be fully S3-based, thus simplifying operations and maintenance. If your current setup is limited to MinIO, this is an excellent option.
-- **S3 Express WAL**: AWS provides S3 Express, a high-performance, low-latency object storage solution that is well-suited as a storage choice for the S3Stream WAL.
