quarks

A modern, off-the-shelf C++ server framework for storing, retrieving, and processing data with high scalability and pluggable business logic.

Quarks provides a highly scalable and distributable open-source system based on the actor model which can be easily deployed in closed networks. The ultimate aim is to come up with open-source solutions to well-known problems like chatting, image/video processing, transcoding, voice recognition etc., thus reducing dependencies on cloud platforms like AWS and GCP. Standardized chat and feed systems would eliminate the need to make private data available to public social networks, helping users safeguard their own valuable data. Adding a new functionality or solution should be as easy as spinning up a new Quarks node and integrating it into the system following a few guidelines.

The core implementation concept, guidance and inspiration behind Quarks can be found at this link: quarks philosophy

Thanks to Arthur de Araújo Farias for providing a good example of using Crow with OpenCV to use as a template. [arthurafarias/microservice-opencv-filter-gausian]

The current codebase uses compiled versions of RocksDB, the Chrome V8 engine and ZeroMQ. It requires the following packages:

  • Crow Library v0.1
  • GCC with C++17 support
  • CMake 1.13
  • Boost::System
  • RocksDB
  • V8 JavaScript Engine
  • ZeroMQ
  • OpenCV 4.0.0 (optional)
  • Curl (experimental, optional)

How to build

mkdir build
cd build
cmake .. -G Ninja
ninja

Thanks to Tareq Ahmed Siraj (https://github.com/tareqsiraj) for introducing Ninja; it made life way easier.

Run

./ocv_microservice_crow

Testing

After running the executable, perform GET and POST requests as follows:

GET REQUESTS

Description

a) Put key vs value

 http://0.0.0.0:18080/put?body={"key":"g1_u1","value":{"msg":"m1"}}

It is recommended to URI-encode the body parameters. Example JS code:

 jsonobj = {"key":"g1_u1", "value":{"msg":"m1"}}
 var url = "put?body=" + encodeURIComponent(JSON.stringify(jsonobj));

 $.get(url, function( data ) {
     $( ".result" ).html( data );
 });

*If the request is successful, the key is returned as the result

However, GET requests have a limitation on parameter length, so the body parameter cannot be too big. In those cases you have to use the methods in the POST section (putjson, postjson etc.)

b) Get value against key

 http://0.0.0.0:18080/get?key=g1_u1

c) List values by wildcard search with keys (you can optionally specify skip and limit)

 http://0.0.0.0:18080/getall?keys=g1_u*&skip=5&limit=10

d) List sorted values by wildcard search with keys (you can optionally specify skip and limit)

 http://0.0.0.0:18080/getsorted?keys=g1_u*&sortby=msg&skip=5&limit=10

You can reverse the order by specifying des=true

 http://0.0.0.0:18080/getsorted?keys=g1_u*&sortby=msg&des=true&skip=5&limit=10

Apply an equal-to filter on a value (using eq):

http://0.0.0.0:18080/getsorted?keys=g1_u*&skip=0&limit=10&filter={"where":{"messageTo":{"eq":"u2"}}}

Apply an equal-to filter on a value performing multiple comparisons (using eq_any):

http://0.0.0.0:18080/getsorted?keys=g1_u*&filter={"where":{"messageTo":{"eq_any":["u2","u4"]}}}
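Since the filter is itself a JSON object, it is safest to URI-encode it just like the body parameter recommended earlier. A minimal browser-side sketch (the key pattern and filter values are placeholders):

 var filter = {"where":{"messageTo":{"eq_any":["u2","u4"]}}};
 var url = "http://0.0.0.0:18080/getsorted?keys=" + encodeURIComponent("g1_u*")
         + "&sortby=msg&skip=0&limit=10"
         + "&filter=" + encodeURIComponent(JSON.stringify(filter));

 fetch(url)
     .then(function(res) { return res.json(); })
     .then(function(data) { console.log(data); }); // expected: array of matching values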

e) List keys vs values by wildcard search with keys (you can optionally specify skip and limit)

 http://0.0.0.0:18080/getkeys?keys=g1_u*&skip=5&limit=10

To get the keys in reverse order, run:

http://0.0.0.0:18080/getkeys?keys=g1_u*&skip=5&limit=10&reverse=true

f) Get count of keys matched by wildcard search

http://0.0.0.0:18080/getcount?keys=g1*

g) Remove a key

http://0.0.0.0:18080/remove?key=g1_u1

The number of keys successfully deleted is returned

h) Remove keys by wildcard search

http://0.0.0.0:18080/removeall?keys=g1_u*

The number of keys successfully deleted is returned

i) Check if a key already exists

http://0.0.0.0:18080/exists?key=g1_u1

j) Get a list of key-value pairs given a list of keys

http://0.0.0.0:18080/getlist?body=["g1_u1", "g2_u2"]

(You can specify skip and limit for this as well, but should not need to)

k) Increment a value saved as an integer by a specified amount

http://0.0.0.0:18080/incr?body={"key":"somecounter","step":5}

Note: the value to increment must have been saved as an integer with a previous call to put -
http://0.0.0.0:18080/put?body={"key":"somecounter", "value":1}

The more advanced version is incrval, where you can specify the particular attribute (which must be an integer) to increment:

http://0.0.0.0:18080/incrval?body={"key":"feed_user_johnwick", "value":{"points":3}}

In the above example, if points was previously set as 7, it becomes 10 after the API call. Both incr and incrval work with POST methods as well.
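As a quick way to exercise the counter flow from JS, reusing the URI-encoding pattern shown earlier (what the calls return is not specified above, so the final log is illustrative):

 // seed "somecounter" with an integer, then bump it by 5 (1 -> 6)
 var putUrl  = "http://0.0.0.0:18080/put?body="
             + encodeURIComponent(JSON.stringify({key: "somecounter", value: 1}));
 var incrUrl = "http://0.0.0.0:18080/incr?body="
             + encodeURIComponent(JSON.stringify({key: "somecounter", step: 5}));

 fetch(putUrl)
     .then(function() { return fetch(incrUrl); })
     .then(function(res) { return res.text(); })
     .then(function(body) { console.log(body); });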

l) Execute Atoms: Atoms are sets of put and remove operations which can be executed in a single API call

To run a set of put operations together, run:

POST: http://0.0.0.0:18080/put/atom?body=
[
{"key":"g1_u2", "value":{"msg":"m1"}},
{"key":"g2_u2", "value":{"msg":"m2"}},
{"key":"g3_u3", "value":{"msg":"m3"}}
]

To run a set of remove operations together, run:

POST: http://0.0.0.0:18080/remove/atom?body=
["g1_u1","g1_u2", "g3_u3"]

To run a set of remove operations followed by a set of put operations, run:

POST: http://0.0.0.0:18080/atom?body=
{
"put":[
{"key":"g1_u2", "value":{"msg":"m1"}},
{"key":"g2_u2", "value":{"msg":"m2"}},
{"key":"g3_u3", "value":{"msg":"m3"}}
],
"remove":["g1_u1","g1_u2", "g3_u3"]
}

  • Notes about Atoms:
    1. "Remove" operations are always executed before "Put" in the ../atom call
    2. Atoms should be used sparingly - if you have only a single put/remove operation, use the specific put/remove APIs, not the atomic ones
    3. If you have a number of put operations and no removes, use ../put/atom (not ../atom)
    4. If you have a number of remove operations and no puts, use ../remove/atom (not ../atom)

m) Autogenerate a key with the prefix and value provided

 http://0.0.0.0:18080/make?body={"prefix":"dev_","value":"101"}

  • Returns the key-value pair as a JSON object; if "key" is specified along with the prefix, the key is formed as prefix+key and no key generation occurs

n) Provide a prefix, key pair; all keys (along with values) greater than the passed key and starting with the prefix are returned

http://0.0.0.0:18080/getkeysafter?body=["key_prefix", "comparekey"]

Multiple prefix, key pairs can be provided like the following:

http://0.0.0.0:18080/getkeysafter?body=["key_pre1", "key1", "key_pre2", "key2", ... "key_preN", "keyN"]

o) Provide a prefix, key pair; the highest key (along with value and index) greater than the passed key and starting with the prefix is returned

http://0.0.0.0:18080/getkeyslast?body=["key_prefix", "comparekey"]

Multiple prefix, key pairs can be provided like the following:

http://0.0.0.0:18080/getkeyslast?body=["key_pre1", "key1", "key_pre2", "key2", ... "key_preN", "keyN"]

POST REQUESTS

Description

i) Put a JSON object against a key: POST: http://0.0.0.0:18080/putjson

BODY:
{"key":"g3_u3", "value":{ "msg":"m3"}}

Note: the JSON body must have a "key" attribute and a "value" attribute. The JSON object {"msg":"m3"} under the "value" attribute is saved against the key "g3_u3" in persistent storage.

If the intention is to insert only if the key doesn't exist, use the following API:

POST: http://0.0.0.0:18080/postjson

BODY:
{"key":"g3_u3", "value":{ "msg":"m3"}}

If the key already exists, the call fails. This is more useful than calling the "exists" API to check if the key exists and then calling putjson, since it saves an extra API call.
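A minimal fetch sketch of the insert-if-absent pattern (how the failure is reported in the response is not specified above, so the logging is illustrative):

 fetch("http://0.0.0.0:18080/postjson", {
     method: "POST",
     body: JSON.stringify({key: "g3_u3", value: {msg: "m3"}})
 })
 .then(function(res) { return res.text(); })
 .then(function(result) {
     // fails if "g3_u3" already exists; otherwise the value is stored
     console.log(result);
 });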

ii) Retrieve the JSON object by key: POST: http://0.0.0.0:18080/getjson

BODY: {"key":"g3_u3"}

iii) Retrieve an array of JSON objects by wildcard matching of keys: POST: http://0.0.0.0:18080/iterjson

BODY: {"keys":"g3_u*"}

To test this API, you could post a few values against keys with putjson, for example:

POST: http://0.0.0.0:18080/putjson
BODY:
{"key":"g1_u2", "value":{"msg":"m1"}}

POST: http://0.0.0.0:18080/putjson
BODY:
{"key":"g2_u2", "value":{"msg":"m2"}}

POST: http://0.0.0.0:18080/putjson
BODY:
{"key":"g3_u3", "value":{"msg":"m3"}}

and then check the results by

POST: http://0.0.0.0:18080/iterjson
BODY: {"keys":"g3_u*"}

iv) Get a list of key-value pairs given a list of keys

POST: http://0.0.0.0:18080/getlist
BODY: ["g1_u1", "g2_u2"]

(You can specify skip and limit as query parameters, but should not need to)

v) Execute Atoms: Atoms are sets of put and remove operations which can be executed in a single API call

To run a set of put operations together, run:

POST: http://0.0.0.0:18080/put/atom

BODY:
[
    {"key":"g1_u2", "value":{"msg":"m1"}},
    {"key":"g2_u2", "value":{"msg":"m2"}},
    {"key":"g3_u3", "value":{"msg":"m3"}}
]

To run a set of remove operations together, run:

POST: http://0.0.0.0:18080/remove/atom

BODY:
["g1_u1","g1_u2", "g3_u3"]

To run a set of remove operations followed by a set of put operations, run:

POST: http://0.0.0.0:18080/atom

BODY:
{
"put":[
    {"key":"g1_u2", "value":{"msg":"m1"}},
    {"key":"g2_u2", "value":{"msg":"m2"}},
    {"key":"g3_u3", "value":{"msg":"m3"}}
],
"remove":["g1_u1","g1_u2", "g3_u3"]
}

  • Notes about Atoms:
  1. "Remove" operations are always executed before "Put" in the ../atom call
  2. Atoms should be used sparingly - if you have only a single put/remove operation, use the specific put/remove APIs, not the atomic ones
  3. If you have a number of put operations and no removes, use ../put/atom (not ../atom)
  4. If you have a number of remove operations and no puts, use ../remove/atom (not ../atom)
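The combined ../atom call can be issued from JS as in the sketch below (per note 1 above, the removes run before the puts regardless of their order in the body):

 var atom = {
     put: [
         {key: "g1_u2", value: {msg: "m1"}},
         {key: "g2_u2", value: {msg: "m2"}}
     ],
     remove: ["g1_u1", "g3_u3"]
 };

 fetch("http://0.0.0.0:18080/atom", {
     method: "POST",
     body: JSON.stringify(atom)
 })
 .then(function(res) { return res.text(); })
 .then(function(result) { console.log(result); });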

vi) Autogenerate a key and make a key-value pair given a key prefix and value

POST: http://0.0.0.0:18080/make
BODY:
{"prefix":"MSGID_","value":"101"}

  • Returns the key-value pair as a JSON object; if "key" is specified along with the prefix, the key is formed as prefix+key and no key generation occurs

vii) Provide a prefix, key pair; all keys (along with values) greater than the passed key and starting with the prefix are returned

POST: http://0.0.0.0:18080/getkeysafter
BODY:
["key_prefix", "key"]

Multiple prefix, key pairs can be provided like the following:

POST: http://0.0.0.0:18080/getkeysafter
BODY:
["key_pre1", "key1", "key_pre2", "key2", ... "key_preN", "keyN"]

viii) Provide a prefix, key pair; the highest key (along with value and index) greater than the passed key and starting with the prefix is returned

POST: http://0.0.0.0:18080/getkeyslast
BODY:
["key_prefix", "key"]

Multiple prefix, key pairs can be provided like the following:

POST: http://0.0.0.0:18080/getkeyslast
BODY:
["key_pre1", "key1", "key_pre2", "key2", ... "key_preN", "keyN"]

Joins and Filters

In practical situations, the need arose to incorporate the getjoinedmap API, which joins multiple result sets in a single query. This API takes a wildcard argument to iterate a range of keys (main keys). It then finds a "subkey" inside each main key by splitting the key with a delimiter (splitby) and selecting one of the split tokens (selindex). Next it adds a prefix and suffix to the subkey and finds the values mapped against the newly formed key. An array of prefix, suffix pairs is supplied to come up with the relevant values. This provides a way to obtain multiple results from a well-formed main key item.

The first item in the result set is the array of "subkeys". The second item in the result set is a JSON object whose attributes are the "formed keys" (from prefix, suffix), with the relevant values placed against each attribute.

You can specify skip and limit in this query as well.

It's preferable to use the POST method in this case.

GET:
http://localhost:18080/getjoinedmap?body=
{ 	"keys":"roomkeys_*","splitby":"_","selindex":5,
	"join":[{"prefix":"usercount_","suffix":""},
		    {"prefix":"messagecount_","suffix":""},
		    {"prefix":"notificationcount_","suffix":"user"}
		   ]
}&skip=2&limit=3

POST: http://localhost:18080/getjoinedmap?skip=2&limit=3
BODY:
{ 	"keys":"roomkeys_*","splitby":"_","selindex":5,
	"join":[{"prefix":"usercount_","suffix":""},
		    {"prefix":"messagecount_","suffix":""},
		    {"prefix":"notificationcount_","suffix":"user"}
		   ]
}

  • All attributes (keys, splitby, selindex, join) mentioned above are mandatory, but values can be left as empty strings. For example, if no prefix or suffix joining is needed, prefix and suffix can be kept as empty strings. A sketch of the key-forming logic follows.
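To make the key-forming concrete, here is a minimal JS sketch of what conceptually happens to each main key (the sample key is hypothetical, and zero-based indexing for selindex is an assumption):

 // matched by "keys":"roomkeys_*"; split by "_" and select token 5
 var mainKey = "roomkeys_a_b_c_d_room42";
 var subkey  = mainKey.split("_")[5];              // -> "room42"

 // each prefix/suffix pair forms a lookup key around the subkey
 var join = [{prefix: "usercount_",    suffix: ""},
             {prefix: "messagecount_", suffix: ""}];
 var formedKeys = join.map(function(j) {
     return j.prefix + subkey + j.suffix;          // e.g. "usercount_room42"
 });
 // the values stored against these formed keys come back in the second result item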

There is also provision to run ORM-style queries with searchjson, applying filters.

POST: http://0.0.0.0:18080/searchjson

Sample query format for "querying items which are up for sale with keys like item* (i.e. item1, item2 etc.), then finding the sellers of such items (items have a seller_id field that contains the user_id of the seller)":

{
    "keys":"item*",
    "include":{
        "map": {"field":"seller_id", "as":"seller"},
        "module":"main",
        "filter":"jsFilter",
        "params":"{\"approved\":1}"
    }

}


To test it out, first insert some users:

POST: http://0.0.0.0:18080/putjson BODY:

{"key":"user1", "value":{"name":"u1", "age":34}}
{"key":"user2", "value":{"name":"u2", "age":43}}

then insert some items: POST: http://0.0.0.0:18080/putjson BODY:

{
"key":"item1",
"value":{
"id": "item1",
"seller_id": "user1",
"rating": 4,
"approved": "1"
}
}

{
"key":"item2",
"value":{
"id": "item2",
"seller_id": "user2",
"rating": 3,
"approved": "1"
}
}

Finally, check the results by POST: http://0.0.0.0:18080/searchjson

So we are able to iterate items (by "keys":"item*") and then run a join operation with the filter attribute ("filter":...) through the keyword map ({"map": {"field":"seller_id", "as":"seller"}}).
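Wired together with fetch, the final check looks like this (the body is the same sample query shown above):

 fetch("http://0.0.0.0:18080/searchjson", {
     method: "POST",
     body: JSON.stringify({
         keys: "item*",
         include: {
             map: {field: "seller_id", as: "seller"},
             module: "main",
             filter: "jsFilter",
             params: "{\"approved\":1}"
         }
     })
 })
 .then(function(res) { return res.json(); })
 .then(function(items) { console.log(items); }); // items joined with their sellers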

The V8 engine has been integrated to support server-side scripting, to further filter/sort queried results.

Now the POST body looks like the following with the JS-based extended filtering:

{
    "keys":"item*",
    "include":{
        "map": {"field":"seller_id", "as":"seller"},
        "module":"main",
        "filter":"jsFilter",
        "params":"{\"approved\":1}"
    }
}

And the server-side JS looks like this:

function jsFilter() {
    // arguments[0] holds the current element, arguments[1] the params string
    var elem = JSON.parse(arguments[0]);
    var args = JSON.parse(arguments[1]);
    // return 1 if the element matches the filter params, 0 otherwise
    var match = 0;
    if(elem.approved == args.approved) {
        match = 1;
    }

    return match;
}

Here, module "main" refers to the main.js file residing on the server in the same path as the executable; "filter" gives the name of the JS function which we will use to further filter the data.

The idea is that the mentioned script main.js will have a filter function with the predefined form filter(elem, params), or a sort function with the predefined form sort(elem1, elem2, params), to further filter/sort the data.

'elem' is an individual item (one of many) found by the Quarks lookup through "keys":"item*". We invoke the JS module and function while finding and iterating the matching items in C++.

It is up to the user to interpret the params on the server side and write the script code accordingly.

In our example, we named the function "jsFilter" in main.js.

Quarks allows minimal usage of scripting to ensure the server-side code remains highly optimized.
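A main.js sort function following the predefined form sort(elem1, elem2, params) could look like the sketch below, mirroring the argument-passing convention of jsFilter above (the "rating" field and the comparator-style return value are assumptions for illustration):

 function jsSort() {
     var elem1  = JSON.parse(arguments[0]);
     var elem2  = JSON.parse(arguments[1]);
     var params = JSON.parse(arguments[2]); // interpret as needed, like in jsFilter

     // comparator-style result: positive places elem2 first (rating descending)
     return elem2.rating - elem1.rating;
 }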

Backup and Restore

For backing up the database, try:
http://0.0.0.0:18080/backup?body={"path":"quarks_backup_1"}

To restore, simply run Quarks next time with the -store command-line parameter:
 ./ocv_microservice_crow -store quarks_backup_1

-store followed by a path denotes the RocksDB directory path to use when starting Quarks

BENCHMARKING

https://github.com/kaisarh/quarks/tree/dev/benchmark/results

Thanks Kaisar Haq (https://github.com/kaisarh) :)

After V8 engine integration and scripting support, the next target was to allow listener support through ZeroMQ to communicate with other processes and services, and to create the Quarks Cloud, which is partially done.

Quarks Cloud

Quarks Cloud provides the functionalities for scaling and replicating nodes (through extensive use of ZeroMQ).

Generally, each Quarks server is called a core. When we are using the cloud features, the cores are called nodes.

There are three types of nodes:

  1. Broker Node
  2. Writer Node
  3. Reader Node

Broker nodes are used to publish data across multiple nodes. All writes through API calls are written to a writer node. The writer node sends the message to the broker node, which publishes it to multiple reader nodes. Reader nodes are dedicated to data-reading API calls only. This helps serve a huge number of requests, because the readers are plain replicas of the writer node.

Conceptual flow:

user->write apis-> [writer] -> [broker] -> [reader] <-read apis<-user

("/put" is an example of write api and "/get" is read api example)

The following are the commands to start the broker, writer and readers:

Start broker node:

 ./ocv_microservice_crow -port 18081 -broker tcp://*:5555

  • Opens a socket on port 5555 to accept writer requests
  • Opens a publisher on port 5556 for subscribers (i.e. readers) to listen to

Start writer node:

./ocv_microservice_crow -port 18082 -writer tcp://localhost:5555

  • Connects to broker at port 5555

Start reader node:

./ocv_microservice_crow -port 18083 -reader tcp://localhost:5556

  • Listens to the broker at port 5556
  • There can be multiple readers started on different ports.

LOGGER / REPLICATION

Quarks can send all put and remove requests made on its core DB to a logger.

To specify the address of the logger, start Quarks with the -log parameter:

./ocv_microservice_crow -port 18080 -log http://localhost:18081

This means a logger has been started at port 18081, listening for http://localhost:18081/putjson and http://localhost:18081/remove API calls. These APIs get invoked whenever a put or remove operation is made on the core DB.

If you start another Quarks server on port 18081 specifying a new database, it simply becomes a replica node:

./ocv_microservice_crow -store replica -port 18081

Instead of a Quarks server, you can start any server which implements and handles the http://localhost:18081/putjson and http://localhost:18081/remove API calls.
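For example, a minimal stand-in logger in Node.js (a sketch using only the built-in http module; it just records whatever Quarks sends to /putjson and /remove, and the "ok" response body is an assumption):

 var http = require("http");

 http.createServer(function(req, res) {
     var body = "";
     req.on("data", function(chunk) { body += chunk; });
     req.on("end", function() {
         var path = req.url.split("?")[0];
         if (path === "/putjson" || path === "/remove") {
             console.log(path, req.url, body); // the replicated put/remove operation
         }
         res.end("ok");
     });
 }).listen(18081); // matches -log http://localhost:18081 above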

WEBSOCKETS

WebSocket support has been added (still not optimized).

#Initiate a socket:

var sock = new WebSocket("ws://0.0.0.0:18080/ws?_id=" + userId );

*Here userId is the ID used to uniquely identify a user; without it, socket chat fails. Usually this ID would be used by the other party (i.e. a message sender) to send messages to this user. By default, all users are auto-joined to a room named "default".

#Room join:

sock.onopen = ()=>{
	console.log('open');
	// join room
	sock.send('{"join":"testroom", "notifyjoin":true, "notifyleave":true}');

}

#Error Handling and closing

sock.onerror = (e)=>{
	console.log('error',e);
}
			
sock.onclose = ()=>{
    console.log('close');
}

#Message Sending:

	var msg = {};
	msg.room = "testroom";
	msg.send = usrmsg;
	
	// to send to a specific user use the following:
	//msg.to = "useridxxx"; // specifying room is optional in this case
	
	var m = JSON.stringify(msg);
	sock.send(m);

#Message Handling:

sock.onmessage = (e)=>{

	let msg = JSON.parse(e.data);
	console.log(msg);

	var room = "";
	var from = "";
	var data = "";

	if(msg["joined"]){
		room = msg.joined;
		from = msg.from;
		data = " I am online!";

	}else if(msg["left"]){
		room = msg.left;
		from = msg.from;
		data = " I went offline!";

	}else if(msg["data"]){
		room = msg.room;
		from = msg.from;
		data = msg.data;

	}else if(msg["userlist"]){
		room = msg.room;
		data = msg.userlist;
		from = "system";
	}
}

#List Users in a Room:

	var msg = {};
	msg.list = "testroom";
	msg.skip = 0;
	msg.limit = -1;
	
	var m = JSON.stringify(msg);
	sock.send(m); // check the message handling (sock.onmessage) section to see how to receive the list

Quarks has plans for plugin integration.

PLUGINS

Currently, only OpenCV is provided as a plugin (code commented out).

For those interested in testing OpenCV as a plugin (after uncommenting the relevant code), submit a POST request to http://localhost:18080/filters/gausian. The body of this request should be your binary PNG image. The response should be a Gaussian-filtered version of the submitted image.

OpenCV, however, is a plugin (an additional feature) and not the main purpose behind Quarks. Currently it is turned off using #ifdef _USE_PLUGIN in the code and if (_USE_PLUGINS) in CMakeLists.txt.

EDITOR

A browser-based editor is provided to run Quarks queries and to visualize and update data in a JSON editor (thanks to https://github.com/json-editor/json-editor). To view the editor at work, copy the "templates" folder from "/examples" into the "build" folder, then open the following in a browser: http://localhost:18080/home

Quarks must be running to view the editor.

EXAMPLES

A guideline is provided for a basic Twitter-like feed and chatrooms.

Copy the "templates" folder inside "/examples" in the "build" folder and then hit the following in browser: http://localhost:18080/feed for feed example http://localhost:18080/chat for chat example

Quarks must be running to view the examples.

Quick Start: Dependency installation for Ubuntu 18.04

Environment setup (assuming CMake is already installed):

$ sudo apt-get update -y

$ sudo apt-get install build-essential

$ sudo apt-get install ninja-build

Main dependency libraries installation:

$ sudo apt-get install libboost-system-dev

$ sudo apt-get install libv8-dev

$ sudo apt-get install librocksdb-dev

$ sudo apt-get install libzmq3-dev

Build and Run: check the "How to build" section for compilation and binary creation, and the "Run" section for how to run.

Docker setup:

To build the docker image:

docker build -t quarks:ubuntu-21.04 .

To run the docker image:

docker run -it -v $PWD:/quarks -p 18080:18080 --cap-add sys_ptrace quarks:ubuntu-21.04
