Linux EC2 Best Practices: AWS Cloud Optimization Guide
Knowledge Overview
Prerequisites
- Basic Linux command line proficiency and system administration concepts
- Fundamental understanding of AWS services (EC2, VPC, IAM, EBS)
- SSH connectivity and key pair management experience
- Basic networking concepts (subnets, security groups, firewalls)
Time Investment
22 minutes reading time
44-66 minutes hands-on practice
Guide Content
What are the essential Linux EC2 best practices for AWS cloud optimization?
Linux EC2 best practices include using separate EBS volumes for OS and data, implementing proper security groups with least privilege access, enabling encryption for EBS volumes, configuring automated backups, optimizing instance types for workloads, and establishing comprehensive monitoring with CloudWatch. Additionally, follow proper SSH key management, use IAM roles instead of access keys, implement regular security patching, and configure proper network segmentation with VPCs.
Table of Contents
- How to Choose the Right EC2 Instance Type for Linux?
- What are Essential Linux EC2 Security Best Practices?
- How to Optimize Linux EC2 Storage Configuration?
- Why Should You Implement Proper Network Security?
- How to Configure Automated Linux EC2 Monitoring?
- What are Linux EC2 Backup and Recovery Strategies?
- How to Optimize Linux EC2 Performance?
- Why Use Infrastructure as Code for Linux EC2?
How to Choose the Right EC2 Instance Type for Linux?
Selecting the appropriate EC2 instance type fundamentally impacts your Linux system's performance, cost-effectiveness, and scalability. Understanding the different instance families and their optimal use cases therefore ensures you make informed decisions for your infrastructure.
Understanding EC2 Instance Families
General Purpose Instances (t3, t4g, m5, m6i) These instances provide balanced CPU, memory, and networking resources, making them ideal for most Linux workloads.
# Check current instance type (IMDSv2; many instances now require the token)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type
# View instance specifications
aws ec2 describe-instance-types --instance-types m5.large --query 'InstanceTypes[0].[VCpuInfo.DefaultVCpus,MemoryInfo.SizeInMiB,NetworkInfo.NetworkPerformance]' --output table
Compute Optimized Instances (c5, c6i, c7g) These instances excel at CPU-intensive applications such as web servers, scientific computing, and batch processing.
# Monitor CPU utilization for compute-intensive workloads
top -bn1 | grep "Cpu(s)" | awk '{print $2 + $4}'
# Check if your workload needs compute optimization
sar -u 1 10 | tail -1 | awk '{print "CPU Usage: " 100-$8 "%"}'
Memory Optimized Instances (r5, r6i, x1e) These instances are designed for memory-intensive applications such as in-memory databases and real-time analytics.
# Monitor memory usage patterns (plain free, since -h output breaks the arithmetic)
free | awk 'NR==2{printf "Memory Usage: %.2f%%\n", $3*100/$2}'
# Check memory pressure indicators (si+so swap columns, KB/s)
vmstat 1 5 | tail -1 | awk '{print "Swap Activity: " $7+$8 " KB/s"}'
Instance Type Selection Criteria
Workload Analysis Commands
#!/bin/bash
# Comprehensive system resource analysis
echo "=== System Resource Analysis ==="
echo "CPU Cores: $(nproc)"
echo "Memory: $(free -h | awk 'NR==2{print $2}')"
# Note: r/s and w/s column positions differ between sysstat versions; verify with iostat -x
echo "Storage IOPS: $(iostat -x 1 2 | tail -n +4 | awk '{iops+=$4+$5} END {print int(iops)}')"
echo "Network Usage: $(awk 'NR>2{rx+=$2; tx+=$10} END {print "RX: " rx/1024/1024 "MB, TX: " tx/1024/1024 "MB"}' /proc/net/dev)"
Cost Optimization Script
# Calculate potential savings with different instance types
aws ec2 describe-spot-price-history \
--instance-types m5.large m5.xlarge c5.large \
--product-descriptions "Linux/UNIX" \
--max-items 3 \
--query 'SpotPriceHistory[*].[InstanceType,SpotPrice,Timestamp]' \
--output table
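Before committing to a cheaper instance type, confirm the current one is actually underutilized. A sketch that pulls two weeks of CPU statistics from CloudWatch (the instance ID is a placeholder):
# Average and peak CPU utilization over the last 14 days
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
--start-time $(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 86400 \
--statistics Average Maximum \
--output table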
What are Essential Linux EC2 Security Best Practices?
Security forms the foundation of any robust Linux EC2 deployment. Consequently, implementing comprehensive security measures protects your infrastructure from threats while maintaining operational efficiency.
SSH Key Management and Access Control
Secure SSH Configuration
# Create dedicated SSH key for EC2 access
ssh-keygen -t rsa -b 4096 -f ~/.ssh/ec2-linux-key -C "ec2-access-$(date +%Y%m%d)"
# Configure SSH client for secure connections
cat >> ~/.ssh/config << 'EOF'
Host ec2-*
    User ec2-user
    IdentitiesOnly yes
    StrictHostKeyChecking yes
    UserKnownHostsFile ~/.ssh/known_hosts
    ServerAliveInterval 60
    ServerAliveCountMax 3
EOF
Advanced SSH Hardening
# Implement SSH security configuration on EC2 instance
sudo tee /etc/ssh/sshd_config.d/99-security.conf << 'EOF'
# Disable root login
PermitRootLogin no
# Limit authentication methods
PasswordAuthentication no
ChallengeResponseAuthentication no
UsePAM yes
# Connection limits
MaxAuthTries 3
MaxSessions 2
LoginGraceTime 60
# Note: the Protocol directive is obsolete; OpenSSH 7.4+ only speaks protocol 2
AllowUsers ec2-user admin
EOF
# Validate the configuration, then restart SSH
sudo sshd -t
sudo systemctl restart sshd
sudo systemctl status sshd
IAM Roles and Instance Profiles
Creating Secure IAM Roles
# Create IAM role for EC2 instance
aws iam create-role \
--role-name EC2-Linux-Role \
--assume-role-policy-document '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}'
# Create instance profile
aws iam create-instance-profile --instance-profile-name EC2-Linux-Profile
# Attach role to instance profile
aws iam add-role-to-instance-profile \
--instance-profile-name EC2-Linux-Profile \
--role-name EC2-Linux-Role
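With the role and profile in place, attach the profile to an instance and confirm that credentials are served through the metadata service (the instance ID is a placeholder):
# Attach the instance profile to an existing instance
aws ec2 associate-iam-instance-profile \
--instance-id i-abcdef123456789 \
--iam-instance-profile Name=EC2-Linux-Profile
# On the instance, verify credentials are available via IMDSv2
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/iam/security-credentials/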
Implementing Least Privilege Access
# Create custom policy for specific S3 access
aws iam create-policy \
--policy-name S3-ReadOnly-Specific \
--policy-document '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket",
        "arn:aws:s3:::your-bucket/*"
      ]
    }
  ]
}'
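A policy has no effect until it is attached to the role; a short sketch using a placeholder account ID:
# Attach the custom policy to the EC2 role
aws iam attach-role-policy \
--role-name EC2-Linux-Role \
--policy-arn arn:aws:iam::123456789012:policy/S3-ReadOnly-Specific
# Confirm the attachment
aws iam list-attached-role-policies --role-name EC2-Linux-Role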
Security Group Configuration
Implementing Network Security Rules
# Create security group with restrictive rules
aws ec2 create-security-group \
--group-name linux-web-secure \
--description "Secure Linux web server group" \
--vpc-id vpc-12345678
# Add SSH access from specific IP
aws ec2 authorize-security-group-ingress \
--group-id sg-12345678 \
--protocol tcp \
--port 22 \
--cidr 203.0.113.0/24
# Add HTTP/HTTPS access
aws ec2 authorize-security-group-ingress \
--group-id sg-12345678 \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id sg-12345678 \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0
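To catch rules that drift away from least privilege, periodically audit for security groups that expose SSH to the internet, for example:
# List security groups allowing SSH from anywhere
aws ec2 describe-security-groups \
--filters Name=ip-permission.from-port,Values=22 Name=ip-permission.cidr,Values=0.0.0.0/0 \
--query 'SecurityGroups[*].[GroupId,GroupName]' \
--output table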
How to Optimize Linux EC2 Storage Configuration?
Proper storage configuration significantly affects performance, durability, and cost-efficiency of your Linux EC2 instances. Therefore, understanding EBS volume types and optimization techniques is crucial for optimal system performance.
EBS Volume Types and Use Cases
General Purpose SSD (gp3) Configuration
# Create optimized gp3 volume
aws ec2 create-volume \
--size 100 \
--volume-type gp3 \
--iops 3000 \
--throughput 250 \
--availability-zone us-east-1a \
--encrypted \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=WebServer-Data},{Key=Environment,Value=Production}]'
# Attach volume to instance
aws ec2 attach-volume \
--volume-id vol-12345678 \
--instance-id i-abcdef123456789 \
--device /dev/xvdf
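On Nitro-based instances the attached volume surfaces as an NVMe device (for example /dev/nvme1n1) rather than /dev/xvdf. A quick way to map devices back to volume IDs, assuming the nvme-cli package is available for your distribution:
# List block devices; EBS volumes appear as nvmeXnY on Nitro instances
lsblk
# Map an NVMe device to its EBS volume ID (requires nvme-cli)
sudo dnf install -y nvme-cli
sudo nvme id-ctrl -v /dev/nvme1n1 | grep -i "vol"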
High-Performance Storage Setup
# Format and mount new EBS volume
sudo mkfs.ext4 -F -E lazy_itable_init=0,lazy_journal_init=0 /dev/xvdf
# Create mount point and configure
sudo mkdir -p /opt/data
sudo mount /dev/xvdf /opt/data
# Add to fstab for persistent mounting
echo "$(sudo blkid -s UUID -o value /dev/xvdf) /opt/data ext4 defaults,noatime,nofail 0 2" | sudo tee -a /etc/fstab
# Validate the fstab entry and verify the mount
sudo mount -a
df -h /opt/data
Storage Performance Optimization
File System Tuning
# Optimize ext4 journaling (unmount first; journal options cannot change on a mounted fs)
sudo umount /opt/data
# WARNING: writeback journaling trades crash consistency for speed
sudo tune2fs -o journal_data_writeback /dev/xvdf
# Optionally rebuild the journal: remove it, check the fs, then re-add it
sudo tune2fs -O ^has_journal /dev/xvdf
sudo e2fsck -f /dev/xvdf
sudo tune2fs -O has_journal /dev/xvdf
# Remount with performance-oriented options
sudo mount -o defaults,noatime,nodiratime,data=writeback /dev/xvdf /opt/data
RAID Configuration for Enhanced Performance
# Create RAID 0 for improved I/O performance
# WARNING: RAID 0 offers no redundancy; losing either volume destroys the array
sudo mdadm --create /dev/md0 \
--level=0 \
--raid-devices=2 \
/dev/xvdf /dev/xvdg
# Format RAID array
sudo mkfs.ext4 -F /dev/md0
# Configure persistent RAID
echo 'DEVICE partitions' | sudo tee /etc/mdadm.conf
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
# Mount RAID array
sudo mkdir -p /opt/raid-data
sudo mount /dev/md0 /opt/raid-data
echo "/dev/md0 /opt/raid-data ext4 defaults,noatime,nofail 0 2" | sudo tee -a /etc/fstab
EBS Snapshot Management
Automated Backup Strategy
# Create snapshot with proper tagging
aws ec2 create-snapshot \
--volume-id vol-12345678 \
--description "Production data backup $(date +%Y%m%d_%H%M%S)" \
--tag-specifications 'ResourceType=snapshot,Tags=[{Key=Name,Value=Production-Backup},{Key=Environment,Value=Production},{Key=Retention,Value=30days}]'
#!/bin/bash
# Automated snapshot cleanup script
RETENTION_DAYS=30
CUTOFF_DATE=$(date -d "$RETENTION_DAYS days ago" +%Y-%m-%d)
# --output text lists IDs tab-separated on one line, so split them before looping
aws ec2 describe-snapshots \
--owner-ids self \
--query "Snapshots[?StartTime<='$CUTOFF_DATE'].SnapshotId" \
--output text | tr '\t' '\n' | \
while read -r snapshot_id; do
    if [ -n "$snapshot_id" ]; then
        echo "Deleting snapshot: $snapshot_id"
        aws ec2 delete-snapshot --snapshot-id "$snapshot_id"
    fi
done
Why Should You Implement Proper Network Security?
Network security provides multiple layers of protection for your Linux EC2 infrastructure. Moreover, implementing proper network segmentation and monitoring ensures comprehensive defense against potential threats.
VPC Configuration and Subnets
Secure VPC Architecture
# Create production VPC
aws ec2 create-vpc \
--cidr-block 10.0.0.0/16 \
--tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=Production-VPC}]'
# Create public subnet for web tier
aws ec2 create-subnet \
--vpc-id vpc-12345678 \
--cidr-block 10.0.1.0/24 \
--availability-zone us-east-1a \
--tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=Public-Web-Subnet}]'
# Create private subnet for application tier
aws ec2 create-subnet \
--vpc-id vpc-12345678 \
--cidr-block 10.0.2.0/24 \
--availability-zone us-east-1a \
--tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=Private-App-Subnet}]'
Network ACL Configuration
# Create restrictive network ACL
aws ec2 create-network-acl \
--vpc-id vpc-12345678 \
--tag-specifications 'ResourceType=network-acl,Tags=[{Key=Name,Value=Production-NACL}]'
# Configure inbound rules
aws ec2 create-network-acl-entry \
--network-acl-id acl-12345678 \
--rule-number 100 \
--protocol tcp \
--port-range From=443,To=443 \
--cidr-block 0.0.0.0/0 \
--rule-action allow
aws ec2 create-network-acl-entry \
--network-acl-id acl-12345678 \
--rule-number 110 \
--protocol tcp \
--port-range From=22,To=22 \
--cidr-block 203.0.113.0/24 \
--rule-action allow
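Because network ACLs are stateless, return traffic must be allowed explicitly; without outbound rules and an inbound ephemeral-port rule, responses are silently dropped. For example:
# Allow inbound return traffic on ephemeral ports
aws ec2 create-network-acl-entry \
--network-acl-id acl-12345678 \
--rule-number 120 \
--protocol tcp \
--port-range From=1024,To=65535 \
--cidr-block 0.0.0.0/0 \
--rule-action allow
# Allow outbound TCP (note the --egress flag)
aws ec2 create-network-acl-entry \
--network-acl-id acl-12345678 \
--rule-number 100 \
--protocol tcp \
--port-range From=0,To=65535 \
--cidr-block 0.0.0.0/0 \
--rule-action allow \
--egress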
Firewall Configuration with iptables
Advanced iptables Rules
# Clear existing rules and set default policies
sudo iptables -F
sudo iptables -P INPUT DROP
sudo iptables -P FORWARD DROP
sudo iptables -P OUTPUT ACCEPT
# Allow loopback traffic
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A OUTPUT -o lo -j ACCEPT
# Allow established and related connections
sudo iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Allow SSH from specific networks
sudo iptables -A INPUT -p tcp --dport 22 -s 203.0.113.0/24 -m state --state NEW -j ACCEPT
# Allow HTTP and HTTPS
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT
# Rate limiting for SSH
sudo iptables -A INPUT -p tcp --dport 22 -m recent --name ssh --set
sudo iptables -A INPUT -p tcp --dport 22 -m recent --name ssh --rcheck --seconds 60 --hitcount 4 -j DROP
# Save rules
sudo iptables-save | sudo tee /etc/iptables/rules.v4
Network Monitoring and Intrusion Detection
Fail2ban Configuration
# Install and configure fail2ban (requires EPEL on RHEL-family distros;
# package availability varies by distribution, e.g. Amazon Linux 2 uses amazon-linux-extras)
sudo dnf install -y epel-release
sudo dnf install -y fail2ban
# Create custom jail configuration
sudo tee /etc/fail2ban/jail.d/sshd.conf << 'EOF'
[sshd]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/secure
maxretry = 3
bantime = 3600
findtime = 600
EOF
# Start and enable fail2ban
sudo systemctl start fail2ban
sudo systemctl enable fail2ban
# Check banned IPs
sudo fail2ban-client status sshd
How to Configure Automated Linux EC2 Monitoring?
Comprehensive monitoring enables proactive system management and rapid issue detection. Furthermore, automated monitoring reduces administrative overhead while maintaining system reliability and performance visibility.
CloudWatch Agent Installation and Configuration
Installing CloudWatch Agent
# Download and install CloudWatch agent
wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
sudo rpm -U amazon-cloudwatch-agent.rpm
# Create IAM policy for CloudWatch agent
aws iam attach-role-policy \
--role-name EC2-CloudWatch-Role \
--policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
Advanced CloudWatch Configuration
Save the following as /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json, the path referenced when starting the agent below. Note that reading /var/log/secure typically requires running the agent as root rather than cwagent.
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle",
          "cpu_usage_iowait",
          "cpu_usage_user",
          "cpu_usage_system"
        ],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "diskio": {
        "measurement": [
          "io_time",
          "read_bytes",
          "write_bytes",
          "reads",
          "writes"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      },
      "netstat": {
        "measurement": [
          "tcp_established",
          "tcp_time_wait"
        ],
        "metrics_collection_interval": 60
      },
      "swap": {
        "measurement": [
          "swap_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/messages",
            "log_group_name": "/aws/ec2/system",
            "log_stream_name": "{instance_id}-messages"
          },
          {
            "file_path": "/var/log/secure",
            "log_group_name": "/aws/ec2/security",
            "log_stream_name": "{instance_id}-secure"
          }
        ]
      }
    }
  }
}
Starting CloudWatch Agent
# Save configuration and start agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json \
-s
# Verify agent status
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-m ec2 \
-a status
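Once metrics arrive in the CWAgent namespace, wire them to alarms. A sketch that alerts on sustained high memory usage; the SNS topic ARN is a placeholder, and the InstanceId dimension assumes the append_dimensions setting shown in the configuration above:
# Alarm when memory usage stays above 85% for two 5-minute periods
aws cloudwatch put-metric-alarm \
--alarm-name high-memory-usage \
--namespace CWAgent \
--metric-name mem_used_percent \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
--statistic Average \
--period 300 \
--evaluation-periods 2 \
--threshold 85 \
--comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts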
Custom Monitoring Scripts
System Health Monitoring Script
#!/bin/bash
# Comprehensive system health monitor
LOG_FILE="/var/log/system-health.log"
# Function to log messages (timestamp computed per call, not once at startup)
log_message() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}
# Check CPU usage (bc is required for the floating-point comparison)
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2 + $4}')
if (( $(echo "$CPU_USAGE > 80" | bc -l) )); then
log_message "WARNING: High CPU usage: ${CPU_USAGE}%"
fi
# Check memory usage
MEM_USAGE=$(free | awk 'NR==2{printf "%.0f", $3*100/$2}')
if [ $MEM_USAGE -gt 85 ]; then
log_message "WARNING: High memory usage: ${MEM_USAGE}%"
fi
# Check disk space
while IFS= read -r line; do
USAGE=$(echo $line | awk '{print $5}' | sed 's/%//')
PARTITION=$(echo $line | awk '{print $6}')
if [ $USAGE -gt 90 ]; then
log_message "CRITICAL: Disk space critical on $PARTITION: ${USAGE}%"
elif [ $USAGE -gt 80 ]; then
log_message "WARNING: High disk usage on $PARTITION: ${USAGE}%"
fi
done < <(df -h | grep -E '^/dev/')
# Check load average
LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | sed 's/,//')
CPU_CORES=$(nproc)
LOAD_THRESHOLD=$(echo "$CPU_CORES * 0.8" | bc -l)
if (( $(echo "$LOAD_AVG > $LOAD_THRESHOLD" | bc -l) )); then
log_message "WARNING: High load average: $LOAD_AVG (threshold: $LOAD_THRESHOLD)"
fi
log_message "System health check completed"
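To run the check continuously, install the script and schedule it with cron; the path and file name below are assumptions:
# Install the script and run it every five minutes
sudo install -m 755 system-health.sh /usr/local/bin/system-health.sh
echo "*/5 * * * * root /usr/local/bin/system-health.sh" | sudo tee /etc/cron.d/system-health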
Performance Baseline Script
#!/bin/bash
# Performance baseline collection
BASELINE_DIR="/opt/monitoring/baselines"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p $BASELINE_DIR
# Collect system information
{
echo "=== System Information ==="
uname -a
cat /proc/version
echo
echo "=== CPU Information ==="
grep "model name" /proc/cpuinfo | head -1
echo "CPU Cores: $(nproc)"
cat /proc/loadavg
echo
echo "=== Memory Information ==="
free -h
grep -E "(MemTotal|MemFree|Buffers|Cached|SwapTotal|SwapFree)" /proc/meminfo
echo
echo "=== Disk Information ==="
df -h
iostat -x 1 1
echo
echo "=== Network Information ==="
ip addr show
ss -tuln
} > "$BASELINE_DIR/baseline_${DATE}.txt"
echo "Baseline saved to $BASELINE_DIR/baseline_${DATE}.txt"
What are Linux EC2 Backup and Recovery Strategies?
Comprehensive backup and recovery strategies protect against data loss and ensure business continuity. Additionally, automated backup solutions reduce administrative burden while maintaining recovery point objectives.
EBS Snapshot Automation
AWS Data Lifecycle Manager Setup
# Create IAM role for DLM
aws iam create-role \
--role-name DLMRole \
--assume-role-policy-document '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "dlm.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}'
# Attach DLM service policy
aws iam attach-role-policy \
--role-name DLMRole \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSDataLifecycleManagerServiceRole
# Create lifecycle policy
aws dlm create-lifecycle-policy \
--description "Daily EBS snapshots" \
--state ENABLED \
--execution-role-arn arn:aws:iam::123456789012:role/DLMRole \
--policy-details '{
  "ResourceTypes": ["VOLUME"],
  "TargetTags": [{"Key": "Environment", "Value": "Production"}],
  "Schedules": [{
    "Name": "DailySnapshot",
    "CreateRule": {
      "Interval": 24,
      "IntervalUnit": "HOURS",
      "Times": ["03:00"]
    },
    "RetainRule": {
      "Count": 7
    },
    "TagsToAdd": [
      {"Key": "SnapshotType", "Value": "Automated"},
      {"Key": "CreatedBy", "Value": "DLM"}
    ],
    "CopyTags": true
  }]
}'
Application-Level Backup Strategies
Database Backup Automation
#!/bin/bash
# MySQL backup with point-in-time recovery
DB_NAME="production_db"
BACKUP_DIR="/opt/backups/mysql"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p $BACKUP_DIR
# Create full backup (on MySQL 8.0.26+ use --source-data instead of --master-data)
mysqldump --single-transaction \
--routines \
--triggers \
--events \
--master-data=2 \
$DB_NAME | gzip > "$BACKUP_DIR/${DB_NAME}_full_${DATE}.sql.gz"
# Binary logging cannot be enabled at runtime: set log_bin in my.cnf and restart mysqld.
# Log retention, however, is a dynamic setting:
mysql -e "SET GLOBAL expire_logs_days = $RETENTION_DAYS;"
# Upload to S3
aws s3 cp "$BACKUP_DIR/${DB_NAME}_full_${DATE}.sql.gz" \
s3://your-backup-bucket/mysql/
# Cleanup old local backups
find $BACKUP_DIR -name "${DB_NAME}_full_*.sql.gz" -mtime +$RETENTION_DAYS -delete
echo "Backup completed: ${DB_NAME}_full_${DATE}.sql.gz"
File System Backup with rsync
#!/bin/bash
# Incremental backup with rsync
SOURCE_DIR="/opt/application"
BACKUP_DIR="/opt/backups/files"
REMOTE_HOST="backup-server.example.com"
DATE=$(date +%Y%m%d)
# Create local incremental backup
rsync -avz \
--delete \
--backup \
--backup-dir="$BACKUP_DIR/incremental_$DATE" \
$SOURCE_DIR/ \
$BACKUP_DIR/current/
# Sync to remote backup server (prefer a dedicated, key-restricted backup user over root)
rsync -avz \
--delete \
$BACKUP_DIR/ \
root@$REMOTE_HOST:/backups/$(hostname)/
# Create snapshot for long-term retention
if [ $(date +%d) = "01" ]; then
cp -al "$BACKUP_DIR/current" "$BACKUP_DIR/monthly_$(date +%Y%m)"
fi
echo "Backup completed to $BACKUP_DIR and $REMOTE_HOST"
Disaster Recovery Planning
Automated Recovery Testing
#!/bin/bash
# Disaster recovery test automation
TEST_INSTANCE_TYPE="t3.micro"
TEST_VPC_ID="vpc-test123"
TEST_SUBNET_ID="subnet-test123"
SNAPSHOT_ID="snap-12345678"
echo "Starting disaster recovery test..."
# Create test volume from snapshot
TEST_VOLUME_ID=$(aws ec2 create-volume \
--snapshot-id $SNAPSHOT_ID \
--volume-type gp3 \
--availability-zone us-east-1a \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=DR-Test}]' \
--query 'VolumeId' \
--output text)
# Launch test instance
TEST_INSTANCE_ID=$(aws ec2 run-instances \
--image-id ami-0abcdef1234567890 \
--instance-type $TEST_INSTANCE_TYPE \
--subnet-id $TEST_SUBNET_ID \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=DR-Test}]' \
--query 'Instances[0].InstanceId' \
--output text)
# Wait for instance to be running
aws ec2 wait instance-running --instance-ids $TEST_INSTANCE_ID
# Attach test volume
aws ec2 attach-volume \
--volume-id $TEST_VOLUME_ID \
--instance-id $TEST_INSTANCE_ID \
--device /dev/xvdf
echo "DR test environment created:"
echo "Instance ID: $TEST_INSTANCE_ID"
echo "Volume ID: $TEST_VOLUME_ID"
echo "Test your recovery procedures, then run cleanup script"
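The matching cleanup, sketched here, reuses the IDs captured above:
#!/bin/bash
# Tear down the DR test environment once validation is complete
aws ec2 detach-volume --volume-id $TEST_VOLUME_ID
aws ec2 wait volume-available --volume-id $TEST_VOLUME_ID
aws ec2 delete-volume --volume-id $TEST_VOLUME_ID
aws ec2 terminate-instances --instance-ids $TEST_INSTANCE_ID
aws ec2 wait instance-terminated --instance-ids $TEST_INSTANCE_ID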
How to Optimize Linux EC2 Performance?
Performance optimization ensures efficient resource utilization and optimal application responsiveness. Moreover, systematic performance tuning reduces costs while improving user experience and system reliability.
CPU and Memory Optimization
CPU Governor Configuration
# Check available CPU governors (cpufreq is not exposed on all EC2 instance types)
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
# Set performance governor for high-performance workloads
echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# Create persistent configuration
sudo tee /etc/systemd/system/cpu-performance.service << 'EOF'
[Unit]
Description=Set CPU governor to performance
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > "$g"; done'
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl enable cpu-performance.service
Memory Optimization Configuration
# Configure swap behavior
echo "vm.swappiness=10" | sudo tee -a /etc/sysctl.conf
echo "vm.dirty_ratio=15" | sudo tee -a /etc/sysctl.conf
echo "vm.dirty_background_ratio=5" | sudo tee -a /etc/sysctl.conf
# Configure huge pages for database workloads
echo "vm.nr_hugepages=1024" | sudo tee -a /etc/sysctl.conf
# Disabling THP this way is not persistent; add it to a boot script or the kernel cmdline
echo "never" | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
# Apply changes
sudo sysctl -p
Storage Performance Tuning
I/O Scheduler Optimization
# Check current I/O scheduler (device names are nvme* on Nitro-based instances)
cat /sys/block/xvda/queue/scheduler
# Set optimal scheduler for SSD storage
echo "none" | sudo tee /sys/block/xvd*/queue/scheduler
# Make changes persistent
sudo tee /etc/udev/rules.d/60-ssd-scheduler.rules << 'EOF'
# Use the "none" scheduler for SSD-backed devices
KERNEL=="xvd*", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
EOF
File System Performance Tuning
# Optimize ext4 file system (the data= mode cannot be changed via remount; unmount first)
sudo tune2fs -o journal_data_writeback /dev/xvdf
sudo umount /opt/data && sudo mount -o noatime,nodiratime,data=writeback /dev/xvdf /opt/data
# Configure read-ahead for better sequential performance
sudo blockdev --setra 4096 /dev/xvdf
# Enable file system-level read-ahead
echo "4096" | sudo tee /sys/block/xvdf/queue/read_ahead_kb
Network Performance Optimization
Network Interface Optimization
# Enable SR-IOV enhanced networking (instance must be stopped; this attribute applies
# to older ixgbevf-based instance types, while current Nitro types use ENA)
aws ec2 modify-instance-attribute \
--instance-id i-1234567890abcdef0 \
--sriov-net-support simple
# Configure NIC offloads (the interface may be ens5 on Nitro instances;
# the ENA driver rejects unsupported offloads such as LRO)
sudo ethtool -K eth0 rx on tx on gso on tso on gro on
# Optimize network buffer sizes
echo "net.core.rmem_default = 262144" | sudo tee -a /etc/sysctl.conf
echo "net.core.rmem_max = 16777216" | sudo tee -a /etc/sysctl.conf
echo "net.core.wmem_default = 262144" | sudo tee -a /etc/sysctl.conf
echo "net.core.wmem_max = 16777216" | sudo tee -a /etc/sysctl.conf
# TCP optimization
echo "net.ipv4.tcp_window_scaling = 1" | sudo tee -a /etc/sysctl.conf
echo "net.ipv4.tcp_rmem = 4096 65536 16777216" | sudo tee -a /etc/sysctl.conf
echo "net.ipv4.tcp_wmem = 4096 65536 16777216" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
Performance Monitoring and Benchmarking
Comprehensive Performance Test
#!/bin/bash
# System performance benchmark
echo "=== CPU Performance Test ==="
time openssl speed -multi $(nproc) rsa2048
echo -e "\n=== Memory Performance Test ==="
sudo dnf install -y sysbench
sysbench memory --threads=$(nproc) --memory-total-size=1G run
echo -e "\n=== Disk Performance Test ==="
# Test write performance
dd if=/dev/zero of=/tmp/testfile bs=1M count=1024 conv=fdatasync
# Test read performance
dd if=/tmp/testfile of=/dev/null bs=1M count=1024
echo -e "\n=== Network Performance Test ==="
# Install iperf3 if not available
which iperf3 || sudo dnf install -y iperf3
# Test bandwidth against a peer instance running "iperf3 -s" (placeholder private IP;
# 169.254.169.254 is the metadata service, not a valid iperf3 target)
iperf3 -c 10.0.1.100 -t 10
rm -f /tmp/testfile
echo -e "\nPerformance test completed"
Why Use Infrastructure as Code for Linux EC2?
Infrastructure as Code (IaC) provides consistent, repeatable, and version-controlled infrastructure deployments. Furthermore, IaC reduces human error, improves collaboration, and enables rapid scaling of Linux EC2 environments.
CloudFormation Templates for Linux EC2
Complete Linux EC2 Stack Template
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Optimized Linux EC2 infrastructure with best practices'

Parameters:
  InstanceType:
    Type: String
    Default: t3.medium
    AllowedValues: [t3.micro, t3.small, t3.medium, t3.large]
    Description: EC2 instance type
  KeyPairName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: EC2 Key Pair for SSH access
  VpcCIDR:
    Type: String
    Default: 10.0.0.0/16
    Description: CIDR block for VPC

Resources:
  # VPC Configuration
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCIDR
      EnableDnsHostnames: true
      EnableDnsSupport: true
      Tags:
        - Key: Name
          Value: Linux-Production-VPC

  # Internet Gateway
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: Linux-Production-IGW

  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway

  # Public Subnet
  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.1.0/24
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: Linux-Public-Subnet

  # Route Table
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Linux-Public-RouteTable

  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: AttachGateway
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  PublicSubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet
      RouteTableId: !Ref PublicRouteTable

  # Security Group
  LinuxSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for Linux EC2 instances
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
          Description: SSH access (restrict to a trusted CIDR in production)
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
          Description: HTTP access
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0
          Description: HTTPS access
      Tags:
        - Key: Name
          Value: Linux-Security-Group

  # IAM Role for EC2
  EC2InstanceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
      Tags:
        - Key: Name
          Value: Linux-EC2-Role

  EC2InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles:
        - !Ref EC2InstanceRole

  # EBS Volume
  DataVolume:
    Type: AWS::EC2::Volume
    Properties:
      AvailabilityZone: !Select [0, !GetAZs '']
      Size: 100
      VolumeType: gp3
      Encrypted: true
      Iops: 3000
      Throughput: 250
      Tags:
        - Key: Name
          Value: Linux-Data-Volume

  # EC2 Instance
  LinuxInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0abcdef1234567890 # Placeholder: use the current Amazon Linux AMI for your region
      InstanceType: !Ref InstanceType
      KeyName: !Ref KeyPairName
      IamInstanceProfile: !Ref EC2InstanceProfile
      SecurityGroupIds:
        - !Ref LinuxSecurityGroup
      SubnetId: !Ref PublicSubnet
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum update -y
          yum install -y amazon-cloudwatch-agent
          # Configure CloudWatch agent
          /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
            -a fetch-config \
            -m ec2 \
            -c ssm:AmazonCloudWatch-linux \
            -s
          # Install security tools (fail2ban requires EPEL on Amazon Linux 2)
          amazon-linux-extras install epel -y
          yum install -y fail2ban
          systemctl enable fail2ban
          systemctl start fail2ban
          # Configure automatic updates
          yum install -y yum-cron
          systemctl enable yum-cron
          systemctl start yum-cron
      Tags:
        - Key: Name
          Value: Linux-Production-Server
        - Key: Environment
          Value: Production
        - Key: Backup
          Value: Required

  # Attach the data volume (without this the volume is created but never attached)
  DataVolumeAttachment:
    Type: AWS::EC2::VolumeAttachment
    Properties:
      InstanceId: !Ref LinuxInstance
      VolumeId: !Ref DataVolume
      Device: /dev/xvdf

Outputs:
  InstanceId:
    Description: Instance ID of the Linux server
    Value: !Ref LinuxInstance
    Export:
      Name: !Sub ${AWS::StackName}-InstanceId
  PublicIP:
    Description: Public IP address
    Value: !GetAtt LinuxInstance.PublicIp
    Export:
      Name: !Sub ${AWS::StackName}-PublicIP
  VolumeId:
    Description: EBS Volume ID
    Value: !Ref DataVolume
    Export:
      Name: !Sub ${AWS::StackName}-VolumeId
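Assuming the template is saved as linux-ec2-stack.yaml, validating and deploying it looks like this (the key pair name is a placeholder):
# Validate, then deploy the stack
aws cloudformation validate-template --template-body file://linux-ec2-stack.yaml
aws cloudformation deploy \
--template-file linux-ec2-stack.yaml \
--stack-name linux-production \
--parameter-overrides KeyPairName=my-key InstanceType=t3.medium \
--capabilities CAPABILITY_IAM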
Terraform Configuration
Modular Terraform Setup
# variables.tf
variable "region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.medium"
}

variable "key_pair_name" {
  description = "EC2 Key Pair name"
  type        = string
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "production"
}

# main.tf
terraform {
  required_version = ">= 1.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.region
}

# Data sources
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}

# VPC Module
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.environment}-linux-vpc"
  cidr = "10.0.0.0/16"

  azs             = slice(data.aws_availability_zones.available.names, 0, 2)
  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]

  enable_nat_gateway = true
  enable_vpn_gateway = false

  tags = {
    Environment = var.environment
    Project     = "linux-ec2-best-practices"
  }
}

# Security Group
resource "aws_security_group" "linux_sg" {
  name_prefix = "${var.environment}-linux-"
  vpc_id      = module.vpc.vpc_id

  # Restrict SSH to a trusted CIDR in production
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "SSH"
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "HTTP"
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "HTTPS"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name        = "${var.environment}-linux-sg"
    Environment = var.environment
  }
}

# IAM Role
resource "aws_iam_role" "ec2_role" {
  name = "${var.environment}-ec2-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })

  tags = {
    Environment = var.environment
  }
}

resource "aws_iam_role_policy_attachment" "cloudwatch_agent" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
}

resource "aws_iam_instance_profile" "ec2_profile" {
  name = "${var.environment}-ec2-profile"
  role = aws_iam_role.ec2_role.name
}

# EC2 Instance
resource "aws_instance" "linux_server" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = var.instance_type
  key_name               = var.key_pair_name
  vpc_security_group_ids = [aws_security_group.linux_sg.id]
  subnet_id              = module.vpc.public_subnets[0]
  iam_instance_profile   = aws_iam_instance_profile.ec2_profile.name

  # user_data.sh is a bootstrap script expected in the module directory; pass it
  # as plain text, since the provider base64-encodes user_data itself (use
  # user_data_base64 only for pre-encoded content)
  user_data = templatefile("${path.module}/user_data.sh", {
    environment = var.environment
  })

  root_block_device {
    volume_type           = "gp3"
    volume_size           = 20
    encrypted             = true
    delete_on_termination = true
  }

  tags = {
    Name        = "${var.environment}-linux-server"
    Environment = var.environment
    Backup      = "required"
  }
}

# EBS Volume
resource "aws_ebs_volume" "data_volume" {
  availability_zone = aws_instance.linux_server.availability_zone
  size              = 100
  type              = "gp3"
  iops              = 3000
  throughput        = 250
  encrypted         = true

  tags = {
    Name        = "${var.environment}-data-volume"
    Environment = var.environment
  }
}

resource "aws_volume_attachment" "data_attachment" {
  device_name = "/dev/xvdf"
  volume_id   = aws_ebs_volume.data_volume.id
  instance_id = aws_instance.linux_server.id
}

# outputs.tf
output "instance_id" {
  description = "ID of the EC2 instance"
  value       = aws_instance.linux_server.id
}

output "public_ip" {
  description = "Public IP address of the instance"
  value       = aws_instance.linux_server.public_ip
}

output "private_ip" {
  description = "Private IP address of the instance"
  value       = aws_instance.linux_server.private_ip
}

output "volume_id" {
  description = "ID of the EBS data volume"
  value       = aws_ebs_volume.data_volume.id
}
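The standard workflow then provisions the stack (the key pair name is a placeholder):
# Initialize providers, preview the plan, then apply
terraform init
terraform plan -var="key_pair_name=my-key"
terraform apply -var="key_pair_name=my-key"
terraform output public_ip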
Frequently Asked Questions (FAQ)
Q: What's the difference between instance store and EBS volumes for Linux EC2? A: Instance store provides temporary, high-performance storage that's physically attached to the host server, while EBS volumes offer persistent, network-attached storage that survives instance stops and starts. Use instance store for temporary data and caches, and EBS for operating systems and persistent application data.
Q: How do I choose between gp3, io1, and io2 EBS volume types? A: Choose gp3 for most general-purpose workloads requiring up to 16,000 IOPS, io1 for consistent high IOPS requirements up to 64,000, and io2 for mission-critical applications needing up to 256,000 IOPS with better durability (99.999% vs 99.9%).
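Existing gp2 volumes can usually be converted to gp3 in place, with no downtime; a sketch with a placeholder volume ID:
# Convert a gp2 volume to gp3 with explicit performance settings
aws ec2 modify-volume \
--volume-id vol-12345678 \
--volume-type gp3 \
--iops 3000 \
--throughput 250
# Track modification progress
aws ec2 describe-volumes-modifications --volume-ids vol-12345678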
Q: Should I use Elastic IP addresses for Linux EC2 instances? A: Use Elastic IP addresses only when you need a static public IP that persists across instance stops/starts. For most applications, use Application Load Balancers or Route 53 for high availability instead of relying on specific IP addresses.
Q: How can I reduce Linux EC2 costs without affecting performance? A: Consider Reserved Instances for predictable workloads, Spot Instances for fault-tolerant applications, right-sizing instances based on actual usage, using gp3 instead of gp2 volumes, and implementing automatic start/stop scheduling for development environments.
Q: What's the best practice for SSH key management in EC2? A: Use separate SSH keys for different environments, implement key rotation every 90 days, store private keys securely with proper file permissions (600), consider using AWS Systems Manager Session Manager for SSH-less access, and implement multi-factor authentication where possible.
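As an illustration of the Session Manager option (it assumes the SSM agent is running, the instance profile includes AmazonSSMManagedInstanceCore, and the Session Manager plugin is installed locally):
# Open a shell without exposing port 22
aws ssm start-session --target i-1234567890abcdef0
# For traditional SSH, keep private keys locked down
chmod 600 ~/.ssh/ec2-linux-key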
Q: How do I migrate data from an existing Linux server to EC2? A: Use AWS Server Migration Service (SMS) for live server migration, create AMIs from existing instances, or manually migrate using rsync for file-level transfers. For databases, use AWS Database Migration Service (DMS) or database-specific tools like mysqldump.
Q: What monitoring metrics should I track for Linux EC2 instances? A: Monitor CPU utilization, memory usage, disk I/O and space, network throughput, system load average, failed login attempts, disk health metrics, and application-specific metrics. Set up CloudWatch alarms for thresholds that indicate potential issues.
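For system-level failures, a status-check alarm can trigger automatic recovery (supported on many, but not all, instance types; the IDs are placeholders):
# Recover the instance when the system status check fails
aws cloudwatch put-metric-alarm \
--alarm-name auto-recover-instance \
--namespace AWS/EC2 \
--metric-name StatusCheckFailed_System \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
--statistic Maximum \
--period 60 \
--evaluation-periods 2 \
--threshold 1 \
--comparison-operator GreaterThanOrEqualToThreshold \
--alarm-actions arn:aws:automate:us-east-1:ec2:recover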
Q: How do I implement high availability for Linux EC2 applications? A: Deploy instances across multiple Availability Zones, use Auto Scaling Groups for automatic scaling and replacement, implement Application Load Balancers for traffic distribution, use Route 53 health checks, and design applications to be stateless where possible.
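A minimal sketch of the Auto Scaling piece (the launch template ID and subnet IDs are placeholders, and ELB health checks assume a target group is attached):
# Create an Auto Scaling group spanning two Availability Zones
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name linux-web-asg \
--launch-template LaunchTemplateId=lt-0123456789abcdef0,Version='$Latest' \
--min-size 2 --max-size 6 --desired-capacity 2 \
--vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222" \
--health-check-type ELB \
--health-check-grace-period 300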
Troubleshooting Common Issues
Instance Launch and Connectivity Problems
Issue: Cannot connect to EC2 instance via SSH
# Check security group rules
aws ec2 describe-security-groups --group-ids sg-12345678 \
--query 'SecurityGroups[0].IpPermissions[?FromPort==`22`]'
# Verify SSH key permissions
chmod 600 ~/.ssh/your-key.pem
ls -la ~/.ssh/your-key.pem
# Test connectivity with verbose output
ssh -v -i ~/.ssh/your-key.pem ec2-user@your-instance-ip
# Check instance system log for boot issues
aws ec2 get-console-output --instance-id i-1234567890abcdef0
Issue: High CPU usage unexpectedly
# Identify CPU-intensive processes
top -bn1 | head -20
# Check for runaway processes
ps aux --sort=-%cpu | head -10
# Monitor CPU usage over time
sar -u 1 60
# Check for system services consuming CPU
systemctl list-units --type=service --state=running
Storage and File System Issues
Issue: Running out of disk space
# Identify large files and directories
du -sh /* | sort -rh | head -10
find / -type f -size +100M 2>/dev/null | head -20
# Clean up the package manager cache (dnf on newer distros, yum on older ones)
sudo dnf clean all || sudo yum clean all
# Remove old log files
sudo journalctl --vacuum-time=7d
sudo find /var/log -name "*.log" -type f -mtime +30 -exec rm -f {} \;
# Clean temporary files (use caution on a live system; services may hold files in /tmp)
sudo rm -rf /tmp/*
sudo rm -rf /var/tmp/*
Issue: EBS volume not mounting
# Check if volume is attached
lsblk
# Verify file system
sudo file -s /dev/xvdf
# Check for file system errors
sudo fsck -n /dev/xvdf
# Mount with verbose output
sudo mount -v -t ext4 /dev/xvdf /mnt/data
# Check system messages for errors
dmesg | tail -20
Network and Security Issues
Issue: Network connectivity problems
# Test network connectivity
ping -c 4 8.8.8.8
dig google.com
# Check network interface configuration
ip addr show
ip route show
# Verify DNS resolution
nslookup google.com
cat /etc/resolv.conf
# Check for network drops
netstat -i
cat /proc/net/dev
Issue: Security group or firewall blocking connections
# Check local firewall rules
sudo iptables -L -n
sudo firewall-cmd --list-all
# Test specific port connectivity
telnet your-instance-ip 80
nc -zv your-instance-ip 443
# Check if service is listening
sudo netstat -tlnp | grep :80
sudo ss -tlnp | grep :443
Additional Resources
External Documentation and References
- AWS EC2 User Guide
- Linux Foundation Documentation
- Red Hat Enterprise Linux Documentation
- Ubuntu Server Guide
- AWS Well-Architected Framework
Security and Compliance Resources
- CIS Amazon Web Services Foundations Benchmark
- AWS Security Best Practices
- NIST Cybersecurity Framework
Related LinuxTips.pro Articles
- Post #71: AWS CLI: Managing AWS Resources from Linux - Essential command-line tools for AWS management
- Post #46: Setting up Prometheus and Grafana on Linux - Advanced monitoring implementation
- Post #40: Backup Strategies: rsync, tar, and Cloud Solutions - Comprehensive backup methodologies
- Post #25: Linux Security Hardening: Complete Protection Guide - Security fundamentals and advanced protection
- Post #15: System Services with systemd - Service management and optimization
This comprehensive guide covered Linux EC2 best practices from infrastructure planning to performance optimization. By implementing these strategies systematically, you'll build secure, scalable, and efficient cloud infrastructure that follows AWS Well-Architected principles while maintaining cost-effectiveness and operational excellence.