
Commit 1f801ff

Update README.md
Updated calculation including RDS replica, and highlighted code and files.
1 parent cd6f494 commit 1f801ff

File tree

1 file changed: +28 −28 lines changed

guidance/README.md

## Overview

This guidance was created to help customers with database workloads that have high read:write (70:30) ratios and who want to boost application performance while reducing overall cost. Qualifying database workloads see an increase in the number of transactions, a reduction in response time, and an overall reduction in cost. Two services working together are expected to complete a task faster; however, when [Amazon ElastiCache](https://aws.amazon.com/elasticache/) is paired with a qualifying database workload, not only does performance increase, but the total cost of the two services is lower than the cost of scaling the database alone to deliver similar performance.

#### Architecture overview ####

You are responsible for the cost of the AWS services used while running this guidance.

The cost of running this guidance depends on the infrastructure used. Existing infrastructure may be used at no additional cost, or individual services may be configured as below. All services are assumed to be in the US East (N. Virginia) Region with the on-demand pricing option: an [Amazon Elastic Compute Cloud](https://aws.amazon.com/ec2/) (Amazon EC2) `t4g.micro` instance with 8 GB of [Amazon Elastic Block Store](https://aws.amazon.com/ebs/) (EBS) storage to run the simulated application workload; provisioned Amazon ElastiCache instances of type `cache.t2.micro` with 1 primary and 1 read replica; and an [Amazon Relational Database Service](https://aws.amazon.com/rds/) (RDS) for MySQL database using instance type `db.t3.micro` with 30 GB of gp2 storage. For any service, the cost depends heavily on the instance type and, for RDS, the licensing model selected. Reserved pricing will greatly reduce cost for EC2, RDS, and ElastiCache. Amazon ElastiCache is also available as a serverless offering with a pay-per-consumption cost model.

| Service | Assumptions | Estimated Cost Per Month |
| --------------------- | ------------------------------------------------- | ------------- |
| Amazon EC2 | 1 instance (`t4g.micro`) used for 730 hours | $20.13 |
| Amazon ElastiCache | 2 instances (`cache.t2.micro`) used for 730 hours | $24.82 |
| Amazon RDS MySQL | 2 instances (`db.t3.micro`) used for 730 hours | $31.72 |
| Total | | $76.67 |

We recommend creating a [Budget](https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html) through [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/) to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this Guidance.

## Prerequisites

This guidance is targeted at readers familiar with Amazon RDS. Users are expected to have a basic understanding of the RDS service and of database access patterns. It shows how to use ElastiCache alongside an existing relational database, effectively pairing your database with a caching service. It should be run in the US East (N. Virginia) Region. This guidance is not intended for production workloads; for example, recommended security configurations are not included. For production systems, it is strongly recommended that data be encrypted both in transit and at rest.


### Operating System

This guidance runs in the AWS Cloud on an EC2 compute instance based on the Amazon Linux 2023 AMI, with network access to both the RDS database and ElastiCache (both are required). In addition, the EC2 instance requires public access on port 8888. The included CloudFormation template can be used to create such an EC2 instance. The sample code also includes two Jupyter notebooks to analyze and visually plot the performance results. Note that public access to the EC2 host on port 8888 should be enabled from your computer only, not from all end-user computers.
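
One way to restrict port 8888 to your own computer is to add a security-group ingress rule scoped to your public IP. A hypothetical sketch using the AWS CLI (the security group ID below is a placeholder; substitute the group attached to your EC2 instance):

```shell
# Allow inbound TCP 8888 only from this machine's public IP (/32).
# sg-0123456789abcdef0 is a placeholder security group ID.
MY_IP="$(curl -s https://checkip.amazonaws.com)"
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 8888 \
  --cidr "${MY_IP}/32"
```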

### Services

This guidance depends on the RDS for MySQL and ElastiCache services. Creating those services is beyond the scope of this guidance; please refer to [RDS](https://aws.amazon.com/rds/) and [ElastiCache](https://aws.amazon.com/elasticache/).

### Client Software dependencies

Install dependencies by executing the `setup_host.sh` script. This script installs `gcc` and `python3-devel` at the host level. In addition to these two packages, a Python virtual environment is created, with dependent modules installed from the `requirements.txt` file. The Python modules are committed only to the virtual environment, not to the host. The included commands are optimized for the EC2 instance created by the included CloudFormation template and are specific to the Amazon Linux 2023 AMI `al2023-ami-2023.4.20240319.1-kernel-6.1-arm64`. This image is specific to the `us-east-1` Region. Other OS or AMI configurations may require additional steps.
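
Based on the description above, the script's effect can be sketched roughly as follows (a hypothetical outline only; the actual `setup_host.sh` in the repository is authoritative):

```shell
# Install build dependencies at the host level (Amazon Linux 2023 uses dnf)
sudo dnf install -y gcc python3-devel

# Create a Python virtual environment so modules stay off the host Python
python3 -m venv .venv
source .venv/bin/activate

# Install dependent modules into the virtual environment only
pip install -r requirements.txt
```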

### Third-party tools

**Example resources:**
- RDS MySQL database with the English version of the flughafendb data loaded from the third-party location mentioned above.
- ElastiCache
- VPC
- SSH key in your Region of choice

### Supported Regions

All Regions where RDS for MySQL and ElastiCache are offered.

## How to load the seed data

The seed data may be loaded from [www.flughafendb.cc](http://www.flughafendb.cc). Follow the "Import using mysqldump" steps in that project's README for the English version of the data, but name the database `airportdb`. The steps below are a slight modification of the steps suggested in that README, using password authentication and `airportdb` as the name for the target database.

### Change to the directory with the zipped dump
```bash
cd english
```

### Concatenate the whole data set in one gzipped file
```bash
cat flughafendb_large.sql.gz.part-* > airportdb.sql.gz
```

### Create a new database in your MySQL instance
```bash
mysql -h <your-host> -u admin -p -e "CREATE DATABASE airportdb;"
Enter password:
```

### Import the dataset
```bash
zcat airportdb.sql.gz | mysql -h <your-host> -u admin -p airportdb
Enter password:
```

## Deployment Steps

1. Use an existing EC2 instance with at least 1 GB of memory running the Amazon Linux 2023 image, with a network configuration that allows it to connect to both your RDBMS and ElastiCache services, and with SSH connectivity. Alternatively, for your convenience, the repository includes a CloudFormation template called `guidance-ec2.yaml`; use AWS CloudFormation with this template to create an EC2 instance. If you use the CloudFormation template, specify parameters valid for your AWS VPC (Virtual Private Cloud), such as the SSH key to use, the AMI image ID, the security group name, and the subnet group name.
2. Log in to your instance from the AWS console via Session Manager or via SSH.
3. Switch to the ec2-user: ```sudo su - ec2-user```, then install git: ```sudo dnf install -y git```
4. Clone the repository by executing ```git clone <this repo name>```
5. Change directory to the guidance directory: ```cd amazon-elasticache-caching-for-amazon-rds/guidance```
6. Execute the `setup_host.sh` script: ```./setup_host.sh```
7. Log in to the same instance from a separate session, navigate to the same directory, and execute the ```./setup_jupyter.sh``` script. Enter the initial password, and commit it to memory, as you will have to enter it once the notebook is running. Note: this Jupyter configuration is not meant for a production environment because it uses a self-signed certificate. For a production environment, follow your company standards to acquire a certificate from a known Certificate Authority. The Jupyter server used in this guide uses a self-signed certificate that your web browser will probably not trust. You can accept the certificate, or follow internal standards and replace the Jupyter server key and certificate in the `~/.jupyter/jupyter_lab_config.py` file. More documentation is available [here](https://jupyter-notebook.readthedocs.io/en/6.2.0/public_server.html#running-a-public-notebook-server).
8. In your computer's browser, enter your EC2 instance's public IP address and port, for example: ```https://1.2.3.4:8888```. Unless you configured Jupyter with a certificate other than the self-signed one, accept the warning and continue.
9. Enter the password for your Jupyter notebook (the password entered in step 7).
10. In your first session, edit the `.env` file and update it with your database and ElastiCache information.
11. Source the `.env` file to export the parameters: ```source .env```
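
The exact variable names are defined in the repository's `.env` file; as an illustration only, the file might contain entries of this shape (all names, endpoints, and credentials below are placeholders):

```shell
# Illustrative placeholders only -- the real variable names come from
# the repository's .env file; substitute your own endpoints and credentials.
export DB_HOST="mydb.abc123.us-east-1.rds.amazonaws.com"
export DB_USER="admin"
export DB_PASSWORD="change-me"
export DB_NAME="airportdb"
export CACHE_HOST="mycache.abc123.use1.cache.amazonaws.com"
export CACHE_PORT="6379"
```

Because the file is sourced with `source .env`, the `export` keyword makes each value visible to the Python scripts as environment variables.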

## Deployment Validation

## Running the Guidance

* Execute the `scenario01.py` script. This workload accesses the database only and captures command-level performance data in a logfile. In the directory where you executed `setup_host.sh`, with the Python virtual environment activated, in the first connection execute: ```python scenario01.py --users 10 --queries 1000 --read_rate 80```
* If the deployment was correct, you should see a response similar to the following (a small sample execution).

Sample execution:

```bash
(.venv) [ec2-user]$ python scenario01.py --users 1 --queries 10 --read_rate 80
Reads: 8
Writes: 2
Logfile located here: logs/scenario01_139007_mwae8c4k.json
```

* Open the Jupyter notebook `plot_results_db_only.ipynb` file and update the logfile name in the second cell. For example: ```log_pattern = 'scenario01_139007_mwae8c4k.json'```

* From the Run menu, select "Run All Cells". The output of the last cell shows both the number of executions per second and the average response time.

* To compare the performance boost provided by ElastiCache, repeat the above steps using the `scenario02.py` script. For example, execute ```python scenario02.py --users 1 --queries 10 --read_rate 80```. The output should be similar.

Sample execution result:

```bash
(.venv) [ec2-user]$ python scenario02.py --users 1 --queries 10 --read_rate 80
Connected to Database
Connected to ElastiCache
...
Cache misses: 0
Logfile located here: logs/scenario02_176908_0y2qr55f.json
```

* Open the Jupyter notebook `plot_results_db_and_cache.ipynb` file and update the logfile name in the second cell. For example: ```log_pattern = 'scenario02_176908_0y2qr55f.json'```

Then select "Run All Cells" to plot the performance of the second scenario. Note that a small execution may not be sufficient to demonstrate the performance advantage of adding a cache.