
Architecture Overview

UENISHI Kota edited this page Feb 18, 2016 · 5 revisions

From 10,000m altitude to 1,000m

The term "Riak CS" has two meanings, depending on the context:

  1. The product name, and the whole system
  2. The process and package name

In the first sense, the picture below illustrates the system. Riak CS exposes the AWS S3 API (and a small portion of the Swift API) by splitting a large object into chunks and storing each chunk as a Riak key and value.

[Figure: 10,000m overview]
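
The chunking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dict stands in for Riak KV, and the key scheme, the manifest fields, and the 1 MiB default block size are hypothetical choices made for the example, not the formats Riak CS actually uses.

```python
from typing import Dict, Iterator, Tuple

BLOCK_SIZE = 1024 * 1024  # assumed block size, for illustration only


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (sequence number, chunk) pairs covering the whole object."""
    for seq, offset in enumerate(range(0, len(data), block_size)):
        yield seq, data[offset:offset + block_size]


def store_object(kv: Dict[str, object], bucket: str, key: str,
                 data: bytes, block_size: int = BLOCK_SIZE) -> None:
    """Store each chunk under its own key, then a manifest that describes them."""
    count = 0
    for seq, chunk in split_into_blocks(data, block_size):
        kv[f"block/{bucket}/{key}/{seq}"] = chunk  # hypothetical key scheme
        count += 1
    kv[f"manifest/{bucket}/{key}"] = {"block_count": count, "content_length": len(data)}


def read_object(kv: Dict[str, object], bucket: str, key: str) -> bytes:
    """Reassemble the object by reading the manifest and then every block in order."""
    manifest = kv[f"manifest/{bucket}/{key}"]
    return b"".join(kv[f"block/{bucket}/{key}/{seq}"]
                    for seq in range(manifest["block_count"]))
```

With a small block size, `store_object(kv, "photos", "cat.jpg", b"abcdefghij", block_size=4)` writes three block entries plus one manifest entry, and `read_object` returns the original bytes.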

As software, it consists of three packages: Riak KV, Riak CS (in the second sense), and Stanchion. Each runs as a standalone process, and the processes are loosely coupled through HTTP(S) or Protocol Buffers.

All data in Riak CS (users, CS buckets, object manifests, blocks, and usage stats) is stored in Riak KV. Riak CS nodes serve the HTTP(S) service, that is, the S3 API exposed to clients, by sending Protocol Buffers (PB) requests to Riak KV. Stanchion sits in the middle to serialize update requests to keys in Riak KV that need strong consistency (CS buckets).
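
To show why serializing updates matters, here is a minimal sketch of the general technique: funnel every check-then-write through a single point, so two clients cannot both create the same bucket. `BucketSerializer` and `race_to_create` are hypothetical names invented for this example. The sketch uses an in-process lock and a dict, whereas Stanchion is a separate process in front of Riak KV, so this shows the idea and not Stanchion's implementation.

```python
import threading
from typing import Dict, List, Optional


class BucketSerializer:
    """Hypothetical stand-in for a serializing component; not Stanchion's real API."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owners: Dict[str, str] = {}  # stand-in for bucket records in Riak KV

    def create_bucket(self, name: str, owner: str) -> bool:
        """Create the bucket only if it does not exist yet; True means this caller won."""
        with self._lock:  # only one check-then-write runs at a time
            if name in self._owners:
                return False
            self._owners[name] = owner
            return True

    def owner_of(self, name: str) -> Optional[str]:
        with self._lock:
            return self._owners.get(name)


def race_to_create(serializer: BucketSerializer, name: str, owners: List[str]) -> List[bool]:
    """Let several users try to create the same bucket concurrently."""
    results = [False] * len(owners)

    def attempt(i: int) -> None:
        results[i] = serializer.create_bucket(name, owners[i])

    threads = [threading.Thread(target=attempt, args=(i,)) for i in range(len(owners))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However the threads interleave, exactly one entry in the result of `race_to_create` is `True`, and the bucket ends up with a single owner.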

Riak CS S3 Storage API covers all supported API functionality and is a good starting point for learning how Riak CS can be used. To learn what is behind object-level operations, Object Chunking and Garbage Collection is the best document. To learn what is behind bucket-level operations, Buckets-and-Users is probably the best document.

There are also administrative APIs that are not part of S3. The access stats API reports incoming and outgoing traffic through CS nodes. The storage stats API queries storage statistics, which are a log of the internal storage calculation job. This lets operators see every user's storage usage and account for disk usage, including usage that is not otherwise visible. For their internals, see Logging-Access-Stats, Logging-Storage-Stats, Querying-Access-Stats, and Querying-Storage-Stats.

The HTTP Admin API is for managing users: creating them, disabling or enabling them, and updating their meta information. Its internals are partially described in Buckets-and-Users.

Native-API is a document that describes how S3 API URLs are rewritten to webmachine-defined resources. As described in CS#1040, the rewrite of object names in particular involves double URL encoding.

quote state     RAW    escaped(1)  escaped(2)    example

                 *                               'baz/foo bar'        'baz/foo+bar'
client app       +--------+
                          |
HTTP wire                 |                      'baz/foo%20bar'      'baz/foo%2Bbar'
                          |
webmachine                |
rewrite                   +------------+
                                       |         'baz%2Ffoo%2520bar'  'baz%2Ffoo%252Bbar'
extract_key      +---------------------+
                 |
                 v
                 *                               'baz/foo bar'        'baz/foo+bar'
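
The example columns in the diagram above can be reproduced with Python's standard `urllib.parse`. This is a sketch of the encoding round trip only: the function names are hypothetical stand-ins for the stages in the diagram, and the code is not the Erlang code Riak CS actually runs.

```python
from urllib.parse import quote, unquote


def on_the_wire(key: str) -> str:
    """escaped(1): what the client puts on the HTTP wire; '/' is left as-is."""
    return quote(key, safe="/")


def after_rewrite(wire_path: str) -> str:
    """escaped(2): the wire form escaped once more, this time including '/' and '%'."""
    return quote(wire_path, safe="")


def extract_key(rewritten: str) -> str:
    """Undo both layers of escaping to get back the raw object key."""
    return unquote(unquote(rewritten))


for key in ("baz/foo bar", "baz/foo+bar"):
    wire = on_the_wire(key)
    rewritten = after_rewrite(wire)
    print(f"{key!r} -> {wire!r} -> {rewritten!r} -> {extract_key(rewritten)!r}")
```

Note that `unquote` leaves `+` untouched, which is why `'baz/foo+bar'` survives the round trip instead of turning into `'baz/foo bar'`.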