Vault Cluster

Deploy a Vault instance following HashiCorp's best practices. Complete these steps in order:

  1. Server Certificates: Prepare the certificates first. You can provide your own or follow the guide: Public Key Infrastructure (PKI): Requirements.

  2. Vault Instance Setup: Start your Vault instance. See Getting Started for instructions.

  3. Configure Vault: After setting up the cluster, configure it. Switch to the management directory for PKI, roles, etc.

💪 High availability

⚠️ You can choose between two modes when creating a Vault instance: dev and ha (default: dev). Here are the differences between these modes:

| | Dev | HA |
|---|---|---|
| Number of nodes | 1 | 5 |
| Disk type | hdd | ssd |
| Vault storage type | file | raft |
| Instance type(s) | t3.micro | mixed (lower-price) |
| Capacity type | on-demand | spot |

In designing a production environment for HashiCorp Vault, I opted for a balance between performance and reliability. Key architectural decisions include:

  1. Raft Protocol for Cluster Reliability: Vault's integrated Raft storage replicates data across all nodes, so the cluster keeps serving requests as long as a quorum of nodes is available.

  2. Five-Node Cluster Configuration: A five-node Raft cluster has a quorum of three and therefore tolerates the loss of two nodes. This follows best practices for fault tolerance and availability and is a recommended choice when using the Raft protocol.

  3. Ephemeral Node Strategy with Spot Instances: This approach provides operational flexibility and cost efficiency. The Auto Scaling group draws on multiple Spot Instance pools, so when AWS interrupts a Spot Instance, Auto Scaling replaces it with an available instance from a different pool, keeping the cluster running while optimizing costs.

  4. Data Storage on RAID0 Array: RAID0 striping prioritizes fast data access over redundancy. Raft replication across nodes and a robust backup/restore strategy mitigate the lack of redundancy on any single node.

  5. Vault Auto-Unseal Feature: Nodes unseal automatically through AWS KMS, which accommodates their ephemeral nature and keeps downtime and manual intervention to a minimum.
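
The Spot strategy in item 3 can be sketched with a plain `aws_autoscaling_group` resource. This is an illustration only: the module itself wraps `terraform-aws-modules/autoscaling/aws`, and the group name, instance types, and allocation strategy below are assumptions, not values taken from this repository. The fragment references `aws_launch_template.ha` and `data.aws_subnets.private` from the Resources list.

```hcl
# Illustrative sketch: a five-node, Spot-only group spread over several
# instance pools. Name, instance types, and strategy are hypothetical.
resource "aws_autoscaling_group" "vault_ha_example" {
  name                = "vault-ha-example"
  min_size            = 5
  max_size            = 5
  desired_capacity    = 5
  vpc_zone_identifier = data.aws_subnets.private.ids

  mixed_instances_policy {
    instances_distribution {
      # No on-demand capacity: every node is a Spot Instance.
      on_demand_base_capacity                  = 0
      on_demand_percentage_above_base_capacity = 0
      spot_allocation_strategy                 = "price-capacity-optimized"
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.ha.id
        version            = "$Latest"
      }

      # Each override adds another Spot pool the group can draw from
      # when an instance is interrupted.
      override {
        instance_type = "m5.large"
      }
      override {
        instance_type = "m5a.large"
      }
      override {
        instance_type = "m6i.large"
      }
    }
  }
}
```

Listing several instance types across several subnets multiplies the number of Spot pools, which lowers the chance that all of them are unavailable at once.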

This architecture balances performance, cost-efficiency, and resilience, embracing the dynamic nature of cloud resources for operational flexibility.
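
For orientation, the Raft and auto-unseal decisions might translate into a Vault server configuration along these lines. This is a minimal sketch rather than the configuration this module renders: the hostnames, IP address, file paths, node ID, tag values, and KMS key ID are all placeholders. Only the data path (`/opt/vault/data`) and region (`eu-west-3`) mirror the documented defaults.

```hcl
# Hypothetical Vault server configuration for one HA node.
api_addr     = "https://vault.example.com:8200"
cluster_addr = "https://10.0.1.10:8201"

listener "tcp" {
  address         = "0.0.0.0:8200" # client API
  cluster_address = "0.0.0.0:8201" # Raft and request forwarding
  tls_cert_file   = "/opt/vault/tls/vault-cert.pem"
  tls_key_file    = "/opt/vault/tls/vault-key.pem"
}

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-node-1"

  # Discover peers through EC2 tags so replacement nodes can rejoin
  # without a static list of addresses.
  retry_join {
    auto_join               = "provider=aws region=eu-west-3 tag_key=Name tag_value=vault"
    auto_join_scheme        = "https"
    leader_tls_servername   = "vault.example.com"
    leader_ca_cert_file     = "/opt/vault/tls/ca.pem"
    leader_client_cert_file = "/opt/vault/tls/vault-cert.pem"
    leader_client_key_file  = "/opt/vault/tls/vault-key.pem"
  }
}

# Auto-unseal with AWS KMS; the key ID is a placeholder.
seal "awskms" {
  region     = "eu-west-3"
  kms_key_id = "00000000-0000-0000-0000-000000000000"
}
```

Tag-based `auto_join` requires the instance role to describe EC2 instances, and the `leader_tls_servername` value must match a SAN present in every node's certificate.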

🔒 Security Considerations

  • Keep the Root CA offline.
  • Use hardened AMIs, such as those built with this project from @konstruktoid. An Ubuntu AMI from Canonical is used by default.
  • Disable SSM once the cluster is operational and an identity provider is configured.
  • Implement MFA for authentication.

Requirements

| Name | Version |
|---|---|
| terraform | ~> 1.4 |
| aws | ~> 5.0 |
| cloudinit | ~> 2.3 |
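
In a root configuration, these constraints would typically be declared as follows. The provider source addresses are the standard HashiCorp registry namespaces, which is an assumption about how this module pins them.

```hcl
terraform {
  required_version = "~> 1.4"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    cloudinit = {
      source  = "hashicorp/cloudinit"
      version = "~> 2.3"
    }
  }
}
```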

Providers

| Name | Version |
|---|---|
| aws | ~> 5.0 |
| cloudinit | ~> 2.3 |

Modules

| Name | Source | Version |
|---|---|---|
| vault_asg | terraform-aws-modules/autoscaling/aws | ~> 8.0 |

Resources

| Name | Type |
|---|---|
| aws_iam_instance_profile.this | resource |
| aws_iam_role.this | resource |
| aws_iam_role_policy.vault-kms-unseal | resource |
| aws_iam_role_policy_attachment.ec2_read_only | resource |
| aws_iam_role_policy_attachment.ssm | resource |
| aws_kms_key.vault | resource |
| aws_launch_template.dev | resource |
| aws_launch_template.ha | resource |
| aws_lb.this | resource |
| aws_lb_listener.this | resource |
| aws_lb_target_group.this | resource |
| aws_route53_record.nlb | resource |
| aws_security_group.nlb | resource |
| aws_security_group.vault | resource |
| aws_security_group_rule.allow_8200 | resource |
| aws_security_group_rule.vault_internal_api | resource |
| aws_security_group_rule.vault_internal_raft | resource |
| aws_security_group_rule.vault_network_ingress | resource |
| aws_security_group_rule.vault_node_exporter | resource |
| aws_security_group_rule.vault_outbound | resource |
| aws_ami.this | data source |
| aws_ecr_authorization_token.token | data source |
| aws_iam_policy_document.vault-kms-unseal | data source |
| aws_route53_zone.this | data source |
| aws_security_group.tailscale | data source |
| aws_subnets.private | data source |
| aws_vpc.selected | data source |
| cloudinit_config.vault_cloud_init | data source |
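
As an illustration of how the KMS-related resources fit together, an auto-unseal policy usually grants the three KMS actions that Vault's `awskms` seal needs. The fragment below is a sketch with hypothetical resource names (suffixed `_example`). The module's actual `vault-kms-unseal` policy may differ.

```hcl
# Hypothetical policy document: allow the Vault nodes to use the unseal key.
data "aws_iam_policy_document" "vault_kms_unseal_example" {
  statement {
    sid    = "VaultKMSUnseal"
    effect = "Allow"

    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:DescribeKey",
    ]

    # Scope the permissions to the single key created by this module.
    resources = [aws_kms_key.vault.arn]
  }
}

# Attach the policy inline to the instance role.
resource "aws_iam_role_policy" "vault_kms_unseal_example" {
  name   = "vault-kms-unseal-example"
  role   = aws_iam_role.this.id
  policy = data.aws_iam_policy_document.vault_kms_unseal_example.json
}
```

Restricting `resources` to the key's ARN, rather than `*`, keeps the instance role from using any other KMS key in the account.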

Inputs

| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| ami_filter | List of maps used to create the AMI filter for the action runner AMI. | map(list(string)) | {"name": ["ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*"]} | no |
| ami_owner | Owner ID of the AMI | string | "099720109477" | no |
| domain_name | The domain name for which the certificate should be issued | string | n/a | yes |
| env | The environment of the Vault cluster | string | n/a | yes |
| leader_tls_servername | One of the shared DNS SANs used to create the certs used for mTLS | string | n/a | yes |
| mode | Vault cluster mode (default dev, meaning a single node) | string | "dev" | no |
| name | Name of the resources created for this Vault cluster | string | "vault" | no |
| prometheus_node_exporter_enabled | If set to true, install and start a Prometheus node exporter | bool | false | no |
| region | AWS Region | string | "eu-west-3" | no |
| ssm_enabled | If true, allow connecting to the instances using AWS Systems Manager | bool | false | no |
| tags | A map of tags to add to all resources | map(string) | {} | no |
| vault_data_path | Directory where Vault's data will be stored in an EC2 instance | string | "/opt/vault/data" | no |

Outputs

| Name | Description |
|---|---|
| autoscaling_group_id | n/a |
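
A call to this module in HA mode might look like the sketch below. The `source` path, domain names, and tag values are placeholders, and the variable names come from the Inputs table.

```hcl
module "vault" {
  # Hypothetical path: point this at wherever the module lives.
  source = "./modules/vault-cluster"

  # Required inputs
  domain_name           = "example.com"
  env                   = "prod"
  leader_tls_servername = "vault.example.com"

  # Optional inputs (defaults: mode = "dev", ssm_enabled = false)
  mode        = "ha"
  ssm_enabled = true

  tags = {
    project = "vault"
  }
}

# Re-export the only documented output of the module.
output "vault_autoscaling_group_id" {
  value = module.vault.autoscaling_group_id
}
```

Leaving `mode` unset deploys the single-node dev variant, so HA has to be requested explicitly.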