
[RFC] Riak CS and Stanchion metrics 2.1

UENISHI Kota edited this page Jul 27, 2015 · 9 revisions

Developer docs;

Summary: Since version 2.1, Riak CS has a new metrics system for monitoring the system more effectively and for diagnosing issues. Stanchion has also gained a metrics system analogous to that of Riak CS, including the stanchion-admin command and a /stats HTTP endpoint.

Status of this document: Request for comments. Although most of the implementation is finished, it is still easy to add or remove items to cover minor improvements that reflect real operational needs. Suggestions for renaming items into clearer English would also be appreciated.

The new metrics system exposes more items than the previous system, covering:

  • Statistics of API requests - count, latency, success and errors
  • Statistics of Riak PB API performance - count, latency, success and errors
  • Statistics of accessing Stanchion
  • Waiting time and service time of Stanchion serialization queue
  • System metrics such as OTP and memory
  • Mochiweb metrics

This document describes the basic ideas behind the new metrics system and aims to help operators maintain Riak CS and Stanchion 2.1 systems.
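As a rough illustration of how the /stats HTTP endpoint mentioned in the summary might be consumed, the sketch below fetches a stats document and filters it by name prefix. It assumes the endpoint returns a flat JSON object mapping metric names to values and is reachable without authentication; neither assumption is confirmed by this page. The function names, the sample metric names, and the URL in the usage note are all hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_stats(url, timeout=5.0):
    """Fetch a stats document over HTTP and decode it as JSON.

    Assumes the response body is a JSON object (an assumption, not
    something this page specifies).
    """
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def select_metrics(stats, prefix):
    """Return the metrics whose names start with `prefix`, sorted by name."""
    return {name: stats[name] for name in sorted(stats) if name.startswith(prefix)}


# Hypothetical sample payload; the real metric names may differ.
SAMPLE_STATS = {
    "example_api_count": 12,
    "example_api_latency_mean": 340,
    "example_memory_total": 1048576,
}
```

Calling `select_metrics(SAMPLE_STATS, "example_api_")` keeps only the two API entries. Against a live node, something like `fetch_stats("http://127.0.0.1:8085/stats")` could supply the input, where the host and port are placeholders to adjust for the actual deployment.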

Terminology and categories

  • in, out
  • error
  • counts

Note: some items may spill over beyond these categories

  • S3 API stats
  • riakc

Riak CS

Stanchion

  • [http://docs.basho.com/riakcs/latest/cookbooks/Monitoring-and-Metrics/]