HiddenArtsADM_Project_2018-2019

Contributors

Hidden Arts has been developed by Francesco Musso (@frmusso) and Davide Ponassi (@ponassi).

The project

Domain

The domain is street art. The project is based on a mobile application that will provide detailed information about street art around the world. Users share street art by taking a photo and filling in an information form, which is then sent to the database. The application will support Facebook and Google login and will allow only one session per device.

System specifications

Our dataset is read-intensive rather than write-intensive, since street art will not be shared very frequently. Shared items also go into a moderation queue before becoming visible to users, so reads can be eventually consistent.

The dataset will also grow over time and with the number of active users, so the system should be built on technologies that provide partitioning and replication.

We rely on Cassandra for storage and on CQL for the workload.

Main dataset (published.csv)

The chosen dataset is a custom dataset composed of a manually generated part (225 real data entries) and a pseudo-randomly generated part (24775 entries). The randomly generated part was produced with the following procedure:

It generates a random latitude/longitude point on Earth (excluding Antarctica).
It evaluates whether the point is on land or on water.
It keeps generating random points until one is on land.
Once it has a land point, it generates a random art title and picks a random real author.
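
The generation steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the project's actual generator: the function names, the `is_on_land` callback (a stand-in for the land/water check), the 60°S cutoff used to exclude Antarctica, the two-word title scheme, and the dictionary keys are all hypothetical.

```python
import random

# Assumption: Antarctica is excluded by keeping latitudes north of 60°S.
MIN_LATITUDE = -60.0
MAX_LATITUDE = 90.0


def random_point(rng):
    """Return a random (latitude, longitude) pair in degrees."""
    latitude = rng.uniform(MIN_LATITUDE, MAX_LATITUDE)
    longitude = rng.uniform(-180.0, 180.0)
    return latitude, longitude


def random_land_point(rng, is_on_land, max_attempts=10_000):
    """Draw points until is_on_land(lat, lon) is true, then return that point."""
    for _ in range(max_attempts):
        latitude, longitude = random_point(rng)
        if is_on_land(latitude, longitude):
            return latitude, longitude
    raise RuntimeError("no land point found within max_attempts")


def generate_entry(rng, is_on_land, title_words, authors):
    """Build one pseudo-random street art entry (hypothetical field names)."""
    latitude, longitude = random_land_point(rng, is_on_land)
    # Hypothetical title scheme: two distinct words joined by a space.
    title = " ".join(rng.sample(title_words, 2))
    return {
        "title": title,
        "author": rng.choice(authors),  # picked from a list of real authors
        "latitude": latitude,
        "longitude": longitude,
    }
```

A caller would pass a seeded `random.Random` instance for reproducibility and supply its own `is_on_land` function; the retry cap simply guards against a predicate that never returns true.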

The users table is also initially randomly generated. This allows us to generate pseudo-real data with a total size of about 2.3 MB.

Other csv datasets

unpublished.csv: same structure as published.csv, but it holds the unpublished street art entries, i.e. those still in the moderation queue (5000 rows).
users.csv: contains the randomly generated users dataset (500 rows).
devices.csv: contains the registered mobile devices dataset (269 rows).
authors.csv: contains the randomly generated authors dataset (50 rows).
reports.csv: contains randomly generated reports from users with randomized statuses (100 rows).
authorsPublished.csv, usersPublished.csv, reportsPublished.csv: these are the tables needed to create the joins that our workload relies on.
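
CQL has no JOIN clause, so a relationship such as "all published works by an author" is usually served by a separate table keyed on the lookup column. The sketch below shows that idea in plain Python: it groups published entries by author, which is roughly the shape a table like authorsPublished would need. The column names (`id`, `title`, `author`) and the sample rows are hypothetical placeholders, not the real schema.

```python
import csv
import io
from collections import defaultdict


def build_author_lookup(published_rows):
    """Group published entries by author so one key lookup replaces a join."""
    by_author = defaultdict(list)
    for row in published_rows:
        by_author[row["author"]].append(
            {"art_id": row["id"], "title": row["title"]}
        )
    return dict(by_author)


# Tiny in-memory stand-in for published.csv (hypothetical columns and values).
sample_csv = io.StringIO(
    "id,title,author\n"
    "1,Sample Title A,Author One\n"
    "2,Sample Title B,Author Two\n"
    "3,Sample Title C,Author One\n"
)
lookup = build_author_lookup(csv.DictReader(sample_csv))
print(lookup["Author One"])
```

Each key of `lookup` plays the role of a partition key: all works by one author are stored together and can be read with a single lookup.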

How we evaluate land and water points

You can check the details in the IsOnWater_CSharp repository.

Project file structure

datasets: contains the csv data used to populate the database
src: contains cql schema and population
~~docs~~: contains documentation (soon to be added)
~~workload~~: contains cql workload (soon to be added)

