svaroqui edited this page Jan 31, 2013 · 7 revisions

ScrambleDB should be a framework that hides the complexity of deploying complex database architectures.

Simplicity

The stack should be controlled by a very limited set of parameters:

  • mem_pct: memory dedicated to the service per node
  • io_pct: trigger when an instance reaches this percentage of its I/O capacity
  • cpu_pct: trigger when an instance reaches this percentage of its CPU capacity
  • price
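
As an illustration only, the four parameters could be grouped into one validated settings object. The class name `StackParams`, the accepted ranges, and the reading of `price` as a non-negative amount are assumptions made for this sketch, not part of the design above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackParams:
    """Hypothetical container for the stack's control parameters."""
    mem_pct: float  # memory dedicated to the service per node, in percent
    io_pct: float   # I/O usage threshold that triggers scaling, in percent
    cpu_pct: float  # CPU usage threshold that triggers scaling, in percent
    price: float    # assumed here to be a non-negative amount

    def __post_init__(self):
        # Percentages are assumed to be valid only in the range (0, 100].
        for name in ("mem_pct", "io_pct", "cpu_pct"):
            value = getattr(self, name)
            if not 0 < value <= 100:
                raise ValueError(f"{name} must be in (0, 100], got {value}")
        if self.price < 0:
            raise ValueError("price must be non-negative")


params = StackParams(mem_pct=60, io_pct=80, cpu_pct=75, price=500.0)
```

Rejecting out-of-range values at construction time keeps every later scaling decision free of defensive checks.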

Availability

  • Election for a single resource in the cluster in case of failure (Memcache, Tarantool)
  • Automatic master failover (cf. MHA)
  • Upgrade parameters with rolling restart and switchover
  • Upgrade software with rolling restart and switchover
  • Upgrade VM instances with rolling restart and switchover

Flexibility

  • Can force per-query routing decisions to slaves
  • Can force per-query routing decisions to NoSQL slaves
  • Can force per-query routing decisions to analytic slaves
  • Can force per-table-group routing decisions
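
One possible way to express these forced routing decisions is a hint comment inside the query, with a table-group mapping as fallback. The hint syntax `/* route:<pool> */`, the pool names, and the example table groups below are all hypothetical; the page does not define how routing is forced.

```python
import re

# Hypothetical hint syntax: /* route:<pool> */ anywhere in the query text.
ROUTE_HINT = re.compile(r"/\*\s*route:(\w+)\s*\*/")

# Hypothetical pool names matching the routing targets listed above.
POOLS = {"master", "slave", "nosql", "analytic"}

# Hypothetical table-group configuration: table -> group, group -> pool.
TABLE_GROUPS = {"sales_facts": "reporting"}
TABLE_GROUP_ROUTES = {"reporting": "analytic"}


def route(query, tables=()):
    """Return the pool a query should go to.

    Precedence: explicit per-query hint, then per-table-group rule,
    then the master as the default.
    """
    match = ROUTE_HINT.search(query)
    if match:
        pool = match.group(1).lower()
        if pool not in POOLS:
            raise ValueError(f"unknown pool in routing hint: {pool}")
        return pool
    for table in tables:
        group = TABLE_GROUPS.get(table)
        if group in TABLE_GROUP_ROUTES:
            return TABLE_GROUP_ROUTES[group]
    return "master"
```

For example, `route("SELECT /* route:slave */ name FROM users")` selects the slave pool, while a query passed with `tables=["sales_facts"]` and no hint falls through to the table-group rule.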

Auto scale up

Rules:

  • Increase memory capacity until we reach the instance limit
  • Monitor I/O until we reach the instance limit
  • Increase memory by provisioning better instances
  • Increase I/O capacity
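
Read as a decision procedure, the rules above might look like the following sketch. The `NodeStats` fields, the `mem_pressure` signal, and the action names are assumptions introduced for illustration; the page does not specify how memory pressure or instance limits are measured.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical snapshot of one database node."""
    mem_alloc_gb: float  # memory currently given to the service
    mem_limit_gb: float  # most memory the current instance can give
    mem_pressure: bool   # assumed signal: the working set no longer fits
    io_used_pct: float   # share of the instance's I/O capacity in use


def scale_up_action(node, io_pct):
    """Pick the next scale-up step for a node.

    Memory is grown inside the current instance first; a bigger
    instance is provisioned only once the instance limit is reached.
    I/O capacity is increased when usage crosses the io_pct threshold.
    """
    if node.mem_pressure:
        if node.mem_alloc_gb < node.mem_limit_gb:
            return "increase_memory"
        return "provision_bigger_instance"
    if node.io_used_pct >= io_pct:
        return "increase_io_capacity"
    return "none"
```

Checking memory before I/O is a design choice of this sketch, on the assumption that a working set that fits in memory also reduces disk reads.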

Actions:

  • Present a set of tables and queries that can be replicated to a dedicated pool of slaves. Set replication filters to shard the data per table set or per database.
  • Present a set of tables and queries that are candidates for horizontal sharding, partitioned by range or hash. Set up replication with the Spider storage engine per table and route the queries to that pool of slaves.

Auto scale out

Rules:

  • Provision more slaves when cpu_pct is reached
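
A minimal sketch of this rule follows. Requiring several consecutive samples above the threshold is an assumption added here to avoid reacting to a single spike; the function name and the `sustained` parameter are hypothetical.

```python
def should_scale_out(cpu_samples, cpu_pct, sustained=3):
    """Return True when the last `sustained` CPU samples all reach cpu_pct.

    cpu_samples: CPU usage percentages for the slave pool, oldest first.
    cpu_pct:     the threshold parameter from the stack configuration.
    """
    recent = cpu_samples[-sustained:]
    # Not enough history yet: do not provision on partial data.
    if len(recent) < sustained:
        return False
    return all(sample >= cpu_pct for sample in recent)
```

A caller would provision one more slave whenever this returns True, then wait for fresh samples before evaluating again.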

Actions:

  • Present queries that can be routed to an in-memory store (Memcache route)
  • Present queries that can be routed to a persistent store (Tarantool route)
  • Present queries that can be routed to materialized views to limit joins (cf. Sphinx RT indexes): provision replication to Sphinx and rewrite the queries
  • Present queries that can be routed to a column-store database: provision replication via delta load (cf. Tungsten Replicator)

Security

  • Public cloud identity should be kept secret, and cloud provisioning should be engaged only under your own control

Billing metrics

ScrambleDB should bring innovation to database invoicing.

What makes a fair price should be monitored first: SQL is a powerful language, and a query of a few words can parse billions of records.

The number of queries does not matter. Metrics that do make sense:

  • Record size × records read, as it relates to CPU usage and good modeling
  • Number of disk IOPS, as it is directly related to database size
  • Bandwidth sent/received
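
To make the three metrics concrete, here is a sketch of how they could be combined into an invoice amount. The `Usage` and `Rates` types, the linear pricing model, and the per-unit rates are all assumptions for illustration; the page names the metrics but not a formula.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabyte, assumed as the billing unit


@dataclass(frozen=True)
class Usage:
    """Hypothetical usage counters for one billing period."""
    avg_record_bytes: int  # average record size
    records_read: int      # number of records read by queries
    disk_io_ops: int       # number of disk I/O operations
    bytes_sent: int        # network bytes sent
    bytes_received: int    # network bytes received


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices."""
    per_gb_read: float
    per_million_io_ops: float
    per_gb_transferred: float


def invoice(usage, rates):
    """Price a period from data volume read, disk I/O, and bandwidth.

    The number of queries is deliberately not an input.
    """
    read_gb = usage.avg_record_bytes * usage.records_read / GB
    transfer_gb = (usage.bytes_sent + usage.bytes_received) / GB
    return (
        read_gb * rates.per_gb_read
        + usage.disk_io_ops / 1_000_000 * rates.per_million_io_ops
        + transfer_gb * rates.per_gb_transferred
    )
```

Under this model, one query scanning a billion 100-byte records costs the same as a million queries reading a thousand such records each, which matches the idea that query count alone says little about load.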