[DOCS] Solutions' cost information (#109)
* DQ-172

* Adding pricing calculation
egorodov authored Sep 1, 2023
1 parent c015173 commit fae6067
Showing 1 changed file with 82 additions and 0 deletions.
82 changes: 82 additions & 0 deletions README.md
@@ -64,6 +64,88 @@ The tool can be used as a standard Terraform module, with deployment examples pr

See the [functions](https://github.com/provectus/data-quality-gate/tree/main/functions) for further details.

## Pricing

The solution itself is open source and free to use. However, running it in a live/production environment incurs costs, because it is deployed on AWS. These costs fall into two parts: the required infrastructure (which you may already have in place, such as VPCs and subnets) and the AWS services needed for the data quality implementation.

*Note: All the information provided below has been calculated using the maximum score strategy.*

#### Pricing for required infrastructure

| AWS Service | Approximate monthly cost | Description |
| ------------- | ------------- | ------------- |
| AWS S3 and DynamoDB endpoints | - | There is no extra charge for gateway-type endpoints. You only pay for the usage of S3 and DynamoDB themselves. |
| AWS Interface VPC endpoints (Secrets Manager, monitoring, SNS) | 3 endpoints * (30 days * 24 hours * 0.01 USD per hour) = 21.6 USD | Interface endpoints are charged by the hour: 1 hour = 0.01 USD. |
| AWS ECRs (allure, data_test, reports, notifications) | 7 versions * (865 MB + 432 MB + 380 MB) => 11.3 GB * 0.1 USD per GB-month = 1.13 USD | allure image size = 865 MB, data_test image size = 432 MB, reports image size = 380 MB, notifications image size = 160 MB. For the purpose of our calculations, let's assume we are storing 7 versions of each image. |
| AWS QuickSight | approx. 7.3 USD per user * 5 users = 36.4 USD | Let's assume a team of 5 people uses the QuickSight data quality dashboard and checks it for changes 2-3 times per day. |

<u>Monthly total: 59.13 US$</u>
___
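
The required-infrastructure total can be reproduced with a short Python sketch. The interface endpoint cost is computed from the hourly rate given in the table, while the ECR and QuickSight figures are taken from the table as-is. All function and variable names are illustrative and are not part of the module.

```python
# Rates and figures come from the required-infrastructure table; names are illustrative.
HOURS_PER_MONTH = 30 * 24
INTERFACE_ENDPOINT_RATE_USD_PER_HOUR = 0.01


def interface_endpoints_cost(endpoints: int) -> float:
    """Monthly cost of interface VPC endpoints that are billed per hour."""
    return endpoints * HOURS_PER_MONTH * INTERFACE_ENDPOINT_RATE_USD_PER_HOUR


required_infrastructure_usd = {
    "gateway_endpoints": 0.0,  # S3 and DynamoDB gateway endpoints carry no extra charge
    "interface_vpc_endpoints": interface_endpoints_cost(3),
    "ecr_storage": 1.13,  # figure taken from the table
    "quicksight": 36.4,  # figure taken from the table
}

total = round(sum(required_infrastructure_usd.values()), 2)
print(f"Monthly total: {total} US$")
```

Adjusting the number of endpoints or replacing the table figures with your own gives an estimate for a different setup.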

#### Pricing for data quality specific infrastructure
AWS offers a free tier for most of the services used by Data Quality, and beyond it the per-use costs of these services are typically a fraction of a cent. Below are a basic cost formula and a few usage examples with cost estimates.

The estimates count:
- the number of AWS Lambda runs
- the number of AWS Step Functions transitions
- the running hours of the AWS EC2 instance that serves web reports (720 hrs per month)

| Description | Formula |
| ------------- | ------------- |
| Cost of AWS Lambda runs (for each Lambda) | (number of data sources * share of data sources changed per day * work_days_month) * Lambda-specific rate (depends on Lambda duration and memory used) |
| Number of AWS Step Functions transitions | number of Lambda runs * 2 |
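
The formulas in the table above can be sketched in Python as follows. The per-GB-second and per-request prices are assumptions based on the public AWS Lambda list prices for x86 in us-east-1 and should be checked against the AWS pricing page for your region. The function names and the 10 s / 1024 MB profile are hypothetical, and the free tier is ignored.

```python
# Assumed AWS Lambda list prices (x86, us-east-1); verify for your region.
PRICE_PER_GB_SECOND_USD = 0.0000166667
PRICE_PER_REQUEST_USD = 0.20 / 1_000_000


def monthly_lambda_runs(data_sources: int, changed_share: float, days: int = 30) -> int:
    """Runs per month for one Lambda: sources * share changed per day * days."""
    return round(data_sources * changed_share * days)


def lambda_rate_per_run(duration_seconds: float, memory_mb: int) -> float:
    """Lambda-specific rate: compute charge (GB-seconds) plus the per-request charge."""
    gb_seconds = duration_seconds * (memory_mb / 1024)
    return gb_seconds * PRICE_PER_GB_SECOND_USD + PRICE_PER_REQUEST_USD


def monthly_lambda_cost(runs: int, rate_per_run_usd: float) -> float:
    """Monthly cost of one Lambda, ignoring the free tier."""
    return runs * rate_per_run_usd


def monthly_step_function_transitions(lambda_runs: int) -> int:
    """Two state transitions are counted per Lambda run."""
    return lambda_runs * 2


runs = monthly_lambda_runs(data_sources=1000, changed_share=0.5)
print(runs)  # 15000
print(monthly_step_function_transitions(runs))  # 30000

# Hypothetical profile (10 s at 1024 MB); not a measured value for any Data Quality Lambda.
print(round(monthly_lambda_cost(runs, lambda_rate_per_run(10, 1024)), 2))
```

Because the rate depends on duration and memory, measured values for each Lambda should replace the hypothetical profile before the result is used for budgeting.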

##### Small

Let's say we have 1000 data sources and half of them change every day. The number of runs for each Lambda is **(1000 data sources * 0.5 changed * 30 days) = 15000**.

| AWS Service | Number of runs | Price |
| ------------ | -------------- | ------ |
| AWS Lambda AllureReport | 15000 | $8.33 |
| AWS Lambda DataTest | 15000 | $67.28 |
| AWS Lambda Reports | 15000 | $2.08 |
| AWS StepFunctions | 15000 | $0.65 |
| AWS EC2 Reports S3 Gateway | 720 hrs | $7.25 |

<u>Monthly total: 85.59 US$</u>
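
As a quick check, the Small total can be recomputed from the table rows, which also shows which service dominates the bill. This is a minimal sketch: the prices are copied from the table above and the variable names are illustrative.

```python
# Prices in US$ copied from the Small scenario table.
small_scenario_usd = {
    "AWS Lambda AllureReport": 8.33,
    "AWS Lambda DataTest": 67.28,
    "AWS Lambda Reports": 2.08,
    "AWS StepFunctions": 0.65,
    "AWS EC2 Reports S3 Gateway": 7.25,
}

total = round(sum(small_scenario_usd.values()), 2)
print(f"Monthly total: {total} US$")

# Share of each service in the monthly bill, largest first.
for service, price in sorted(small_scenario_usd.items(), key=lambda item: item[1], reverse=True):
    print(f"{service}: {price / total:.1%}")

largest = max(small_scenario_usd, key=small_scenario_usd.get)
```

With these figures the DataTest Lambda accounts for most of the cost, so its duration and memory settings are the first place to look when tuning spend.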

___

##### Medium

Let's say we have 10000 data sources and 70% of them change every day.
The number of runs for each Lambda is **(10000 data sources * 0.7 changed * 30 days) = 210000**.

| AWS Service | Number of runs | Price |
| ------------ | -------------- | ------ |
| AWS Lambda AllureReport | 210k | $203.33 |
| AWS Lambda DataTest | 210k | $1028.57 |
| AWS Lambda Reports | 210k | $115.83 |
| AWS StepFunctions | 210k | $10.40 |
| AWS EC2 Reports S3 Gateway | 720 hrs | $7.25 |

<u>Monthly total: 1 365.38 US$</u>

___

##### Large

Let's say we have 30000 data sources and all of them change every day.
The number of runs for each Lambda is **(30000 data sources * 1 changed * 30 days) = 900000**.

| AWS Service | Number of runs | Price |
| ------------ | -------------- | ------ |
| AWS Lambda AllureReport | 900k | $893.34 |
| AWS Lambda DataTest | 900k | $4430.06 |
| AWS Lambda Reports | 900k | $518.33 |
| AWS StepFunctions | 900k | $44.90 |
| AWS EC2 Reports S3 Gateway | 720 hrs | $7.25 |

<u>Monthly total: 5 893.88 US$</u>
___

**Price per changed data source: approximately 0.006 US$**
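
The per-change price can be derived by dividing each scenario's monthly total by its number of runs, where one run corresponds to one changed data source on one day. The sketch below uses the totals and run counts from the three scenarios above; the variable names are illustrative.

```python
# (runs per month, monthly total in US$) for each scenario above.
scenarios = {
    "Small": (15_000, 85.59),
    "Medium": (210_000, 1365.38),
    "Large": (900_000, 5893.88),
}

price_per_change = {
    name: total / runs for name, (runs, total) in scenarios.items()
}

for name, price in price_per_change.items():
    print(f"{name}: {price:.4f} US$ per changed data source")
```

The result stays in the same range across all three scenarios, which is why a single per-change figure is a reasonable rule of thumb for rough budgeting.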


## License

Apache 2 Licensed. See [LICENSE](https://github.com/provectus/data-quality-gate/tree/main/LICENSE) for full details.
