6. AWS Infrastructure Monitoring and Logging (Terraform Hands On Project)

Introduction:

I will show you how to set up monitoring and logging for EC2 instances and applications running on AWS using Terraform. We will configure CloudWatch alarms to monitor CPU utilization and CloudWatch Logs to collect application logs.

Objective: Deploy an EC2 instance with CloudWatch monitoring and logging enabled, including the IAM role and policy, security group, and CloudWatch alarm it depends on. By the end, the setup will monitor the CPU utilization of the instance, collect and store its logs in CloudWatch Logs, and trigger a CloudWatch alarm when CPU utilization crosses a threshold.

Resources Involved:

a) EC2 instances

b) CloudWatch Alarms

c) CloudWatch Logs

d) IAM Roles and Policies

e) Security Groups

1. provider.tf

Purpose: Specifies the AWS provider and region for the Terraform configuration.

provider "aws" {
  region = "us-west-2"
}
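
The provider block above does not pin any versions. A minimal sketch of a version constraint that could sit alongside it is shown here; the version numbers are assumptions, not something this project was tested against, so adjust them to whatever you actually run.

```hcl
terraform {
  required_version = ">= 1.0" # assumed minimum Terraform version

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed; pin to the provider version you tested with
    }
  }
}
```

Pinning the provider keeps a later terraform init from silently pulling a newer major version with changed behavior.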

2. iam.tf

Purpose: Defines IAM roles and policies required for the EC2 instance to send logs and metrics to CloudWatch.

resource "aws_iam_role" "ec2_cloudwatch_role_new" {
  name = "ec2_cloudwatch_role_new"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Action    = "sts:AssumeRole",
      Effect    = "Allow",
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy" "ec2_cloudwatch_policy" {
  name   = "ec2_cloudwatch_policy"
  role   = aws_iam_role.ec2_cloudwatch_role_new.id
  policy = file("ec2_cloudwatch_policy.json")
}

resource "aws_iam_instance_profile" "ec2_cloudwatch_profile" {
  name = "ec2_cloudwatch_profile"
  role = aws_iam_role.ec2_cloudwatch_role_new.name
}
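
The role policy reads ec2_cloudwatch_policy.json, which is not listed in this walkthrough. As a rough sketch of what that file probably needs to allow, here is an equivalent policy written inline with jsonencode. The local name ec2_cloudwatch_policy_json and the exact action list are my assumptions, chosen to cover writing log events and custom metrics; tighten Resource for real use.

```hcl
# Sketch only: an assumed set of permissions for the CloudWatch agent.
locals {
  ec2_cloudwatch_policy_json = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents",
          "logs:DescribeLogStreams",
          "cloudwatch:PutMetricData",
          "ec2:DescribeTags"
        ]
        Resource = "*"
      }
    ]
  })
}

# Possible use in aws_iam_role_policy, in place of file(...):
#   policy = local.ec2_cloudwatch_policy_json
```

AWS also publishes a managed policy named CloudWatchAgentServerPolicy, which could be attached with aws_iam_role_policy_attachment instead of maintaining a custom document.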

3. ec2.tf

Purpose: Configures the EC2 instance, attaches the IAM role, and sets up user data for CloudWatch Agent installation and configuration.

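# Look up the latest Amazon Linux 2 AMI. Without a name filter, most_recent
# could return any Amazon-owned image, and the user data below relies on
# yum and amazon-linux-extras.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}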

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"

  iam_instance_profile   = aws_iam_instance_profile.ec2_cloudwatch_profile.name
  vpc_security_group_ids = [aws_security_group.ec2_sg.id]

  tags = {
    Name = "WebServer"
  }

  user_data = <<-EOF
                #!/bin/bash
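                yum update -y
                # The CloudWatch agent ships both logs and metrics, so the legacy awslogs package is not needed.
                yum install -y amazon-cloudwatch-agent
                amazon-linux-extras install epel -y
                yum install -y stress
                # Write (rather than append) the agent configuration; quoting EOT stops the shell from expanding anything inside.
                cat <<'EOT' > /opt/aws/amazon-cloudwatch-agent/bin/config.json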
                {
                  "agent": {
                    "metrics_collection_interval": 60,
                    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
                  },
                  "logs": {
                    "logs_collected": {
                      "files": {
                        "collect_list": [
                          {
                            "file_path": "/var/log/messages",
                            "log_group_name": "/aws/ec2/app_logs",
                            "log_stream_name": "{instance_id}-messages",
                            "timestamp_format": "%Y-%m-%d %H:%M:%S"
                          }
                        ]
                      }
                    }
                  },
                  "metrics": {
                    "metrics_collected": {
                      "cpu": {
                        "measurement": [
                          {"name": "cpu_usage_idle", "rename": "CPUIdle", "unit": "Percent"},
                          {"name": "cpu_usage_nice", "unit": "Percent"},
                          {"name": "cpu_usage_system", "unit": "Percent"},
                          {"name": "cpu_usage_user", "unit": "Percent"}
                        ],
                        "totalcpu": true,
                        "metrics_collection_interval": 60
                      },
                      "disk": {
                        "measurement": [
                          {"name": "disk_free", "rename": "FreeDiskSpace", "unit": "Gigabytes"},
                          {"name": "disk_used", "rename": "UsedDiskSpace", "unit": "Gigabytes"}
                        ],
                        "metrics_collection_interval": 60,
                        "resources": ["*"]
                      }
                    }
                  }
                }
                EOT
                /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
                systemctl start amazon-cloudwatch-agent
                systemctl enable amazon-cloudwatch-agent
                EOF
}

4. security_group.tf

Purpose: Defines a security group to control inbound and outbound traffic for the EC2 instance.

resource "aws_security_group" "ec2_sg" {
  name        = "ec2_security_group"
  description = "Allow inbound traffic to EC2 instances"

  # SSH is open to any address for this demo; restrict cidr_blocks to your own IP in real use.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
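
The SSH rule above accepts connections from any address. One way to narrow it is to take the allowed range from a variable, sketched below. The variable name allowed_ssh_cidr is hypothetical and not part of the original configuration.

```hcl
# Hypothetical variable; the name is a placeholder chosen for this sketch.
variable "allowed_ssh_cidr" {
  description = "CIDR block allowed to reach port 22, for example your own IP as x.x.x.x/32"
  type        = string
}

# The port 22 ingress rule in ec2_sg would then use:
#   cidr_blocks = [var.allowed_ssh_cidr]
```

The value can be supplied at apply time, for example terraform apply -var="allowed_ssh_cidr=203.0.113.10/32", where the address is a documentation placeholder.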

5. cloudwatch.tf

Purpose: Configures CloudWatch log groups and alarms for monitoring EC2 instance performance.

resource "aws_cloudwatch_log_group" "app_logs" {
  name = "/aws/ec2/app_logs"
}

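resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "High_CPU_Usage"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  # EC2 publishes CPUUtilization every 5 minutes unless detailed monitoring
  # is enabled on the instance, so a 60-second period may show gaps.
  period    = 60
  statistic = "Average"
  threshold = 80

  alarm_description = "Alarm when CPU exceeds 80%"

  dimensions = {
    InstanceId = aws_instance.web.id
  }

  # No actions are attached, so the alarm only changes state in the console.
  actions_enabled = true
  alarm_actions   = []
}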
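
Because alarm_actions is an empty list, nobody is notified when the alarm fires. A minimal sketch of wiring it to an SNS email notification follows. The resource names, topic name, and email address are placeholders I made up for illustration.

```hcl
# Hypothetical names: "cpu_alerts" and the email address are placeholders.
resource "aws_sns_topic" "cpu_alerts" {
  name = "cpu-alerts"
}

resource "aws_sns_topic_subscription" "cpu_alerts_email" {
  topic_arn = aws_sns_topic.cpu_alerts.arn
  protocol  = "email"
  endpoint  = "you@example.com" # placeholder; the recipient must confirm the subscription by email
}

# In the cpu_high alarm, the empty list would then become:
#   alarm_actions = [aws_sns_topic.cpu_alerts.arn]
```

Email subscriptions stay pending until the recipient clicks the confirmation link, so no notifications arrive before that.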

Steps to Apply the Terraform Configuration

Initialize Terraform:

Open your terminal and run the following command:

terraform init

Plan the Changes:

terraform plan

Apply the Changes:

terraform apply

Confirm with yes when prompted.

Verify Resources:

EC2 Instance: Check the EC2 Dashboard for the created instance.

IAM Role: Ensure the IAM role and instance profile are correctly attached.

CloudWatch Logs: Go to the CloudWatch Dashboard to view log groups and streams.

CloudWatch Alarms: Check the CloudWatch Alarms section for active alarms.

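Connect to your EC2 instance. Note that the aws_instance.web resource above does not set key_name, so plain SSH requires adding a key pair to the instance first; otherwise, EC2 Instance Connect from the AWS console is an option on Amazon Linux 2.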
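
To find the instance's ID and address without opening the console, outputs along these lines could be added. This is a sketch; the file name outputs.tf and the output names are my own choices, not part of the original project.

```hcl
# outputs.tf (hypothetical file name)
output "web_instance_id" {
  description = "ID of the WebServer instance"
  value       = aws_instance.web.id
}

output "web_public_ip" {
  description = "Public IP address of the WebServer instance"
  value       = aws_instance.web.public_ip
}
```

After terraform apply, running terraform output web_public_ip prints the address to connect to.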

Generate CPU Load:

To test the CloudWatch alarm, you can generate CPU load on the instance to trigger the alarm.

Run the following command to create CPU load:

sudo amazon-linux-extras install epel -y
sudo yum install -y stress
stress --cpu 2 --timeout 300

Verify CloudWatch Logs:

Check the CloudWatch console under Logs > Log Groups > /aws/ec2/app_logs to see whether logs from your EC2 instance are being collected.

Verify CloudWatch Alarms:

Check the CloudWatch console under Alarms to see whether the High_CPU_Usage alarm moves to the In alarm state once CPU utilization exceeds 80%.

Conclusion:

Using Terraform, we have successfully set up monitoring and logging for EC2 instances and applications on AWS. This setup ensures that we can monitor CPU utilization and collect logs for better observability and troubleshooting.