guidance/README.md
## Overview

This guidance was created to help customers with database workloads that have high read:write ratios (around 70:30) and who are looking to boost application performance while reducing overall cost. Qualifying database workloads will see an increase in the number of transactions, a reduction in response time, and an overall reduction in cost. It is expected that two services together can perform a task faster; what is notable is that when [Amazon ElastiCache](https://aws.amazon.com/elasticache/) is paired with a qualifying database workload, not only does performance increase, but the total cost of the two services is lower than the cost of scaling the database alone to deliver similar performance.

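At its core, the approach this guidance demonstrates is the cache-aside pattern: reads check the cache first and only query the database on a miss, so a high read:write ratio lets most reads skip the database entirely. A minimal sketch, using an in-memory dict as a stand-in for a Redis/ElastiCache client (the function names and TTL below are illustrative, not taken from this guidance's code):

```python
import time

cache = {}            # stand-in for an ElastiCache (Redis) client
CACHE_TTL = 60        # time-to-live for cached entries, in seconds

def query_database(key):
    """Simulate a slow relational-database lookup."""
    time.sleep(0.01)  # stand-in for network and query latency
    return f"row-for-{key}"

def get_with_cache(key):
    """Cache-aside read: try the cache first, fall back to the database."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                        # cache hit: no database work
    value = query_database(key)                # cache miss: query the database
    cache[key] = (value, time.time() + CACHE_TTL)
    return value
```

In the guidance itself, the `scenario02.py` workload exercises the database together with a real ElastiCache endpoint in a similar fashion.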
#### Architecture overview ####
You are responsible for the cost of the AWS services used while running this guidance.

The cost for running this guidance depends on the infrastructure used. Existing infrastructure may be used at no additional cost, or individual services may be configured as below. All services are assumed to be in the US East (N. Virginia) Region with the on-demand pricing option: an [Amazon Elastic Compute Cloud](https://aws.amazon.com/ec2/) (Amazon EC2) instance of type `t4g.micro` with 8GB of [Amazon Elastic Block Store](https://aws.amazon.com/ebs/) (EBS) storage to run the simulated application workload; an Amazon ElastiCache provisioned instance type `cache.t2.micro` with 1 primary and 1 read replica; and an [Amazon Relational Database Service](https://aws.amazon.com/rds/) (RDS) MySQL database using instance type `db.t3.micro` and 30GB of gp2 storage. For any service, the cost will greatly depend on the instance type and the RDS licensing model selected. Reserved pricing will greatly reduce costs for EC2, RDS, and ElastiCache. Amazon ElastiCache is also available in a serverless offering with a pay-per-consumption cost model.

| Service | Assumptions | Estimated Cost Per Month |
| --- | --- | --- |
| Amazon EC2 | 1 instance (`t4g.micro`) used for 730 hours | $20.13 |
| Amazon ElastiCache | 2 instances (`cache.t2.micro`) used for 730 hours | $24.82 |
| Amazon RDS MySQL | 2 instances (`db.t3.micro`) used for 730 hours | $31.72 |
| Total | | $76.67 |

We recommend creating a [Budget](https://docs.aws.amazon.com/cost-management/latest/userguide/budgets-managing-costs.html) through [AWS Cost Explorer](https://aws.amazon.com/aws-cost-management/aws-cost-explorer/) to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this Guidance.
## Prerequisites

This guidance is targeted at users familiar with the RDS service who have a basic understanding of RDS and database access patterns. It guides users on how to utilize ElastiCache in addition to their existing relational database, effectively pairing the database with a caching service. It should be run in the US East (N. Virginia) Region. This guidance is not intended for production workloads; for example, recommended security configurations are not included. For production systems, it is strongly recommended that data be encrypted both in transit and at rest.

### Operating System

This guidance runs in the AWS Cloud on an EC2 compute instance based on the Amazon Linux 2023 AMI, with network access to both the RDS database service and ElastiCache (both are required). In addition, the EC2 instance requires public access on port 8888. The included CloudFormation template can be used to create such an EC2 instance. The sample code also includes two Jupyter notebooks to analyze and visually plot the performance results. Note that public access to the EC2 host on port 8888 should be enabled from your computer only, not all end-user computers.

### Services

This guidance depends on the RDS MySQL and ElastiCache services. Creating those services is beyond the scope of this guidance; please refer to [RDS](https://aws.amazon.com/rds/) and [ElastiCache](https://aws.amazon.com/elasticache/).

### Client Software dependencies

Install dependencies by executing the `setup_host.sh` script. This script installs `gcc` and `python3-devel` at the host level. In addition to these two packages, a Python virtual environment is created with dependent modules installed from the `requirements.txt` file. The Python modules are committed only to the virtual environment, not the host. The included commands are optimized for the EC2 instance created by the included CloudFormation template and are specific to the Amazon Linux 2023 AMI `al2023-ami-2023.4.20240319.1-kernel-6.1-arm64`. This image is specific to the `us-east-1` region. Other OS or AMI configurations may require additional steps.

### Third-party tools

Ability to create an EC2 instance and networking configuration to permit access

**Example resources:**
- RDS MySQL Database with the English version of flughafendb data loaded from the third party location mentioned above.
- ElastiCache
- VPC
- SSH key in your region of choice
### Supported Regions

All regions where RDS MySQL and ElastiCache are offered.

## How to load the seed data
The seed data may be loaded from www.flughafendb.cc. Follow the "Import using mysqldump" steps in that project's README document for the English version of the data, but name the database `airportdb`. The steps below are a slight modification of the steps suggested in that README file, using password authentication and `airportdb` as the name of the target database.

### Change to the directory with the zipped dump

```bash
cd english
```

### Concatenate the whole data set in one gzipped file

### Create the database

```bash
mysql -h <your-host> -u admin -p -e "CREATE DATABASE airportdb;"
Enter password:
```

### Import the dataset

```bash
zcat airportdb.sql.gz | mysql -h <your-host> -u admin -p airportdb
Enter password:
```

## Deployment Steps

1. You may use an existing EC2 instance with at least 1GB of memory running the Amazon Linux 2023 image, with a network configuration that allows it to connect to both your RDBMS and ElastiCache services, and with SSH connectivity. Alternatively, for your convenience, the repository includes a CloudFormation template called `guidance-ec2.yaml`; use AWS CloudFormation with this template to create an EC2 instance. If you decide to use the CloudFormation template, please specify all parameters that are valid for your AWS VPC (Virtual Private Cloud), such as the SSH key to use, the AMI image ID, the security group name, and the subnet group name.
2. Log in to your instance from the AWS console via Session Manager or via SSH.
3. Switch to the ec2-user ```sudo su - ec2-user```, then install git ```sudo dnf install -y git```
4. Clone the repository by executing ```git clone <this repo name>```
5. Change directory to the guidance directory ```cd amazon-elasticache-caching-for-amazon-rds/guidance```
6. Execute the setup_host script ```./setup_host.sh```
7. Log in to the same instance from a separate session, navigate to the same directory, and execute the ```./setup_jupyter.sh``` script. Enter the initial password and commit it to memory, as you will have to enter it again once the notebook is running. Note: this Jupyter configuration is not meant for a production environment because it uses a self-signed certificate. For a proper production environment, follow your company standards to acquire a certificate from a known Certificate Authority. Your web browser will probably not trust the self-signed certificate; you can accept it or, following internal standards, replace the Jupyter server key and certificate in the `~/.jupyter/jupyter_lab_config.py` file. More documentation is available [here](https://jupyter-notebook.readthedocs.io/en/6.2.0/public_server.html#running-a-public-notebook-server)
8. In your computer's browser, enter your EC2 instance's public IP address and port, for example ```https://1.2.3.4:8888```. Unless you configured Jupyter with a certificate other than the self-signed one, accept the warning and continue.
9. Enter the password for your Jupyter notebook (the password you set in step 7).
10. In your first session, edit the `.env` file and update it with your database and ElastiCache related information.
11. Source the `.env` file ```source .env``` to export the parameters.
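
The `.env` file holds the endpoints the scripts connect to. The actual variable names are defined by this repository's own `.env`, so the keys below are purely hypothetical placeholders illustrating the kind of values needed:

```bash
# Hypothetical .env sketch -- the key names here are illustrative only;
# use the keys already present in the repository's .env file.
DB_HOST=mydb.xxxxxxxx.us-east-1.rds.amazonaws.com
DB_USER=admin
DB_PASSWORD=change-me
CACHE_HOST=mycache.xxxxxx.use1.cache.amazonaws.com
CACHE_PORT=6379
```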
## Deployment Validation

It is not part of this guidance to install and configure client applications for

## Running the Guidance

* Execute the `scenario01.py` script. This workload accesses the database only and captures command-level performance data in a logfile. In the directory where you executed `setup_host.sh`, with the Python virtual environment activated, in your first connection execute: ```python scenario01.py --users 10 --queries 1000 --read_rate 80```
* If deployment was correct, you should see a response similar to this (small sample execution):
```
Logfile located here: logs/scenario01_139007_mwae8c4k.json
```

* Open the Jupyter notebook `plot_results_db_only.ipynb` file and update the logfile name in the second cell. For example ```log_pattern = 'scenario01_139007_mwae8c4k.json'```

* From the Run menu, select Run All Cells. The output of the last cell will show both the number of executions per second and the average response time.
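
What the notebooks compute is essentially an aggregation of the per-command timings in the JSON logfile into a throughput figure and an average response time. A rough sketch of that aggregation, assuming a hypothetical one-JSON-object-per-line format with an `elapsed` field in seconds (the real schema is whatever the scenario scripts write):

```python
import json

def summarize(log_lines):
    """Aggregate per-query timings into throughput and average latency.

    Assumes each line is a JSON object with an "elapsed" field in
    seconds -- a hypothetical stand-in for the scenario log schema.
    """
    timings = [json.loads(line)["elapsed"] for line in log_lines]
    total = sum(timings)
    return {
        "queries": len(timings),
        "avg_response_s": total / len(timings),
        "throughput_qps": len(timings) / total,
    }
```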

* To compare the performance boost provided by ElastiCache, repeat the above steps using the `scenario02.py` script. For example, execute ```python scenario02.py --users 1 --queries 10 --read_rate 80``` The output should be similar:
```
Logfile located here: logs/scenario02_176908_0y2qr55f.json
```

* Open the Jupyter notebook `plot_results_db_and_cache.ipynb` file and update the logfile name in the second cell. For example ```log_pattern = 'scenario02_176908_0y2qr55f.json'```

Then select run all cells to plot the performance of the second scenario. Note that a small execution may not be sufficient to demonstrate the performance advantage of adding a cache.