Developers guide to using Amazon EFS with Amazon ECS and AWS Fargate – Part 3

Welcome to Part 3 of this blog post series on how to use Amazon EFS with Amazon ECS and AWS Fargate. For reference, these are the blog posts in this series:

  • Part 1: Provides the background on the need for this integration and its scope, and gives a high-level view of the use cases and scenarios this feature unlocks for our customers
  • Part 2: A deep dive on how EFS security works in container deployments based on ECS and Fargate, with some high-level considerations and best practices around regional ECS and EFS deployments
  • Part 3: [this blog post] A practical example, including reusable code and commands, of a containerized application deployed on ECS tasks that use EFS

In this post, we are going to put into practice what we learned in Parts 1 and 2 through working code examples. We will segment this blog post into two main blocks (with two separate examples):

  • Stateful standalone tasks to run applications that require file system persistency
  • Multiple tasks that access a shared file system in parallel

If you want to read more about the theory behind these, please refer to Part 1. We are now going to dive deep into the code examples.

In these examples, the ECS tasks run on Fargate, but the exact same workflow applies if you run the same tasks on EC2 instances (using the EC2 launch type).

Prerequisites and considerations for running the examples

The examples below assume you have a VPC available with at least two public and two private subnets, one pair per Availability Zone. There are many ways to create such a VPC: from the AWS CLI, via CloudFormation, all the way to the AWS CDK. If you don’t have a standard way to create a temporary VPC for this exercise, any of these options would work well.
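
If you already have a VPC and just need to find the subnet IDs to plug into the variables below, a lookup along these lines can help (an optional aside; replace the VPC ID placeholder with your own):

# optional: list the subnets of an existing VPC (the AZ and the auto-assign public IP flag help tell public from private)
aws ec2 describe-subnets --filters Name=vpc-id,Values=<vpc-xxxx> --query 'Subnets[].{SubnetId:SubnetId,AZ:AvailabilityZone,CIDR:CidrBlock,PublicIPOnLaunch:MapPublicIpOnLaunch}' --output table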

The examples and commands also assume you have an AWS CLI v2 environment at the latest version (previous versions may not support the new integration). In addition to the CLI, your environment needs the ability to build container images (using Docker) and needs the jq and curl utilities installed. I used eksutils in an AWS Cloud9 environment, but you can use any setup that satisfies these prerequisites.

While a higher degree of automation could be achieved in the code examples, we tried to create a fair level of interaction so that you understand, at every step, what’s being done. This is primarily a learning exercise. It is not how we would recommend building a production-ready CI/CD pipeline.

Each section may use variables populated in a previous section, so it’s important you keep the same shell context. For convenience, the scripts and commands outlined below echo the content of those variables to the terminal and also “tee” them to a log file (ecs-efs-variables.log) in case you need to recreate the context at any point.
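
If you do lose your shell context, a one-liner along these lines can rebuild a variable from that log file (just an illustration of the idea; swap in whichever variable you need):

# example: re-export a variable from the log; the log lines have the form "The NAME is: value"
export STANDALONE_APP_SG_ID=$(grep "STANDALONE_APP_SG_ID is:" ecs-efs-variables.log | tail -1 | awk '{print $NF}')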

From your terminal, let’s start laying out the plumbing with the variables that represent your environment and the initialization of the log file. Failure to set these environment variables may lead to failures in the examples provided.

export AWS_ACCESS_KEY_ID=<xxxxxxxxxxxxxx>
export AWS_SECRET_ACCESS_KEY=<xxxxxxxxxxxxxxxxxxxxx>
export AWS_DEFAULT_OUTPUT="json"

export AWS_REGION=<xxxxxxx>
export VPC_ID=<vpc-xxxx>
export PUBLIC_SUBNET1=<subnet-xxxxx>
export PUBLIC_SUBNET2=<subnet-xxxxx>
export PRIVATE_SUBNET1=<subnet-xxxxx>
export PRIVATE_SUBNET2=<subnet-xxxxx>
date | tee ecs-efs-variables.log 
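
Before moving on, an optional sanity check confirms that the credentials work and that the VPC ID is valid in the chosen Region:

# optional sanity checks
aws sts get-caller-identity
aws ec2 describe-vpcs --vpc-ids $VPC_ID --region $AWS_REGION --query 'Vpcs[0].CidrBlock' --output text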

Stateful standalone tasks to run applications that require file system persistency

This example mimics an existing application that requires configurations to persist across restarts. The example is based on NGINX and is fairly basic, but it is intended to be representative of more complex scenarios our customers have that require the features we are going to leverage.

This custom application can only run, by design, as a standalone instance; it is effectively a singleton. It stores important configuration information in a file called /server/config.json. The application can only store this information on the file system. No changes can be made to its code, and we need to work within the boundaries of the application architecture characteristics.

The information in the configuration file is generated when the application is installed and starts for the first time, but it needs to persist across task restarts. At first startup, the application generates a RANDOM_ID and saves it into the critically important /server/config.json file. The unique ID is then embedded into the home page of the web server. When the application restarts, it checks whether the file is there: if it doesn’t exist, it assumes this is the first launch and creates it; if it exists, it skips its recreation.

This is how this logic is implemented in the startup script (startup.sh) of this application:

#!/bin/bash
apt-get update
apt-get install -y curl jq
CONFIG_FILE="/server/config.json"

# this grabs the private IP of the container
CONTAINER_METADATA=$(curl ${ECS_CONTAINER_METADATA_URI_V4}/task)
PRIVATE_IP=$(echo $CONTAINER_METADATA | jq --raw-output .Containers[0].Networks[0].IPv4Addresses[0])
AZ=$(echo $CONTAINER_METADATA | jq --raw-output .AvailabilityZone)
echo $CONTAINER_METADATA
echo $PRIVATE_IP
echo $AZ

# this generates a unique ID
RANDOM_ID=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)

# if this is the first time the server starts the config file is populated
mkdir -p /server
if [ ! -f "$CONFIG_FILE" ]; then echo $RANDOM_ID > /server/config.json; fi

# the index.html file is generated with the private ip and the unique id
echo -n "Unique configuration ID       : " > /usr/share/nginx/html/index.html
echo $(cat $CONFIG_FILE) >> /usr/share/nginx/html/index.html
echo -n "Origin - Private IP            : " >> /usr/share/nginx/html/index.html
echo $PRIVATE_IP >> /usr/share/nginx/html/index.html
echo -n "Origin - Availability Zone     : " >> /usr/share/nginx/html/index.html
echo $AZ >> /usr/share/nginx/html/index.html
# this starts the nginx service
nginx -g "daemon off;"

This application is only used during standard working hours in a given timezone, so we would like to create a workflow that starts the app at 7 AM and shuts it down at 7 PM. This would cut the bill for the application roughly in half.
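
As an aside, such a schedule could be driven by something as simple as cron on an administrative host (or by Amazon EventBridge scheduled rules), wrapping the same run-task and stop-task CLI calls we use later in this post. A minimal illustration, assuming hypothetical wrapper scripts:

# hypothetical crontab entries: start the task at 7 AM and stop it at 7 PM on weekdays
0 7 * * 1-5  /usr/local/bin/start-standalone-app.sh   # wraps "aws ecs run-task ..."
0 19 * * 1-5 /usr/local/bin/stop-standalone-app.sh    # wraps "aws ecs stop-task ..."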

Before the EFS integration, if you were to launch this application on an ECS task, upon a restart the RANDOM_ID in the /server/config.json file would be lost. The script would re-generate the file with a new id and this would cause the application to break.

We decide to package this application in a container and to do so we author the following Dockerfile in the same directory where we created the startup.sh file.

FROM nginx:1.11.5
LABEL maintainer="massimo@it20.info"
COPY startup.sh .
RUN chmod +x startup.sh
CMD ["./startup.sh"]

We are now ready to:

  • create an ECR repo called standalone-app
  • build the image
  • log in to ECR
  • push the container image to the ECR repo

ECR_STANDALONE_APP_REPO=$(aws ecr create-repository --repository-name standalone-app --region $AWS_REGION)
ECR_STANDALONE_APP_REPO_URI=$(echo $ECR_STANDALONE_APP_REPO | jq --raw-output .repository.repositoryUri)
echo The ECR_STANDALONE_APP_REPO_URI is: $ECR_STANDALONE_APP_REPO_URI | tee -a ecs-efs-variables.log

docker build -t $ECR_STANDALONE_APP_REPO_URI:1.0 . 

aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $ECR_STANDALONE_APP_REPO_URI

docker push $ECR_STANDALONE_APP_REPO_URI:1.0 
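
If you want to confirm the push succeeded, listing the images in the repository is a quick optional check:

# optional: confirm the 1.0 tag landed in the ECR repository
aws ecr describe-images --repository-name standalone-app --region $AWS_REGION --query 'imageDetails[].imageTags'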

At this point, we are ready to create a basic ECS task without EFS support to demonstrate the limitations of an ephemeral deployment. Before we do that, we need to create an IAM trust policy document that allows ECS tasks to assume an execution role. Create a policy document called ecs-tasks-trust-policy.json and add the following content:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "ecs-tasks.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Now we can create a task definition called standalone-app.json with the following content. Make sure you replace the image placeholder with the value of the $ECR_STANDALONE_APP_REPO_URI variable, and fill in your account ID and Region (a substitution sketch follows the task definition below).

{"family": "standalone-app",
    "networkMode": "awsvpc",
    "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/task-exec-role",
    "containerDefinitions": [
        {"name": "standalone-app",
            "image": "<xxxxxxxxxxxxxxxxxxxxx>:1.0",
            "portMappings": [
                {
                    "containerPort": 80
                }
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                    "options": {
                       "awslogs-group": "/aws/ecs/standalone-app",
                       "awslogs-region": "REGION",
                       "awslogs-stream-prefix": "standalone-app"
                       }
            }
        }
    ],
    "requiresCompatibilities": [
        "FARGATE",
        "EC2"
    ],
    "cpu": "256",
    "memory": "512"
}
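
If you prefer not to edit the placeholders by hand, a substitution along these lines should work (a sketch that assumes the variables set earlier and GNU sed; the placeholder strings are the ones used in the skeleton above):

# replace the image, account ID, and Region placeholders in the task definition
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
sed -i "s|<xxxxxxxxxxxxxxxxxxxxx>|$ECR_STANDALONE_APP_REPO_URI|g; s|ACCOUNT_ID|$ACCOUNT_ID|g; s|REGION|$AWS_REGION|g" standalone-app.json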

In the next batch of commands, we:

  • create an ECS cluster to hold our tasks
  • create a task execution role (task-exec-role)
  • assign the AWS managed AmazonECSTaskExecutionRolePolicy policy to the role
  • register the task definition above
  • create a log group
  • create and configure a security group (standalone-app-SG) to allow access to port 80

aws ecs create-cluster --cluster-name app-cluster --region $AWS_REGION

aws iam create-role --role-name task-exec-role --assume-role-policy-document file://ecs-tasks-trust-policy.json --region $AWS_REGION

aws iam attach-role-policy --role-name task-exec-role  --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy" 

aws ecs register-task-definition --cli-input-json file://standalone-app.json --region $AWS_REGION

aws logs create-log-group --log-group-name /aws/ecs/standalone-app --region $AWS_REGION

STANDALONE_APP_SG=$(aws ec2 create-security-group --group-name standalone-app-SG --description "Standalone app SG" --vpc-id $VPC_ID --region $AWS_REGION) 
STANDALONE_APP_SG_ID=$(echo $STANDALONE_APP_SG | jq --raw-output .GroupId)
aws ec2 authorize-security-group-ingress --group-id $STANDALONE_APP_SG_ID --protocol tcp --port 80 --cidr 0.0.0.0/0 --region $AWS_REGION 
export STANDALONE_APP_SG_ID # the STANDALONE_APP_SG_ID variable needs to be exported to be able to use it from scripts
echo The STANDALONE_APP_SG_ID is: $STANDALONE_APP_SG_ID  | tee -a ecs-efs-variables.log

This is the setup we built with the ephemeral task representing our application being recycled:

We are now going to demonstrate what happens when you start and stop this task. To do so, we will create a script that starts and stops the task five times in a loop, roughly every two minutes, and queries the application while the task is running.

This is the standalone-loop-check.sh script:

#!/bin/bash 
COUNTER=0
echo ------------
while [  $COUNTER -lt 5 ]; do
    TASK=$(aws ecs run-task --cluster app-cluster --task-definition standalone-app --count 1 --launch-type FARGATE --platform-version 1.4.0 --network-configuration "awsvpcConfiguration={subnets=[$PUBLIC_SUBNET1, $PUBLIC_SUBNET2],securityGroups=[$STANDALONE_APP_SG_ID],assignPublicIp=ENABLED}" --region $AWS_REGION)
    TASK_ARN=$(echo $TASK | jq --raw-output .tasks[0].taskArn) 
    sleep 20
    TASK=$(aws ecs describe-tasks --cluster app-cluster --tasks $TASK_ARN --region $AWS_REGION)
    TASK_ENI=$(echo $TASK | jq --raw-output '.tasks[0].attachments[0].details[] | select(.name=="networkInterfaceId") | .value')
    ENI=$(aws ec2 describe-network-interfaces --network-interface-ids $TASK_ENI --region $AWS_REGION)
    PUBLIC_IP=$(echo $ENI | jq --raw-output .NetworkInterfaces[0].Association.PublicIp)
    sleep 100
    curl $PUBLIC_IP
    echo ------------
    aws ecs stop-task --cluster app-cluster --task $TASK_ARN --region $AWS_REGION > /dev/null
    let COUNTER=COUNTER+1 
done

Add the execute flag to the script (chmod +x standalone-loop-check.sh) and launch it. The output should be similar to this:

sh-4.2# ./standalone-loop-check.sh 
------------
Unique configuration ID       : FZaeWiQfCFRxy4Kb3VrQGCOtGXH1AZGL
Origin - Private IP            : 10.0.42.251
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : 0zPEpXGxGvwNxHlcpb2s2tV85VjDYsyK
Origin - Private IP            : 10.0.2.50
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : CwjjgmNB8TQSc9UYkj2V2Z3cbQ25STZh
Origin - Private IP            : 10.0.76.91
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : KB3FtvDoAAChOxwe993MzyPKX37t3Vgy
Origin - Private IP            : 10.0.26.76
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : ngdUNWomeqSm1uOUAaYG84JULVm5dBAV
Origin - Private IP            : 10.0.42.181
Origin - Availability Zone     : us-west-2a
------------
sh-4.2# 

As you can see, the RANDOM_ID changes at every restart and this will break the application. We need to find a way to persist the /server/config.json file across restarts. Enter EFS.

We will shortly configure an EFS file system and make it accessible to the ECS tasks. Before we dive into the AWS CLI commands that make it happen, we need to create a policy document called efs-policy.json. This policy, which we will apply with the CLI, contains a single statement that denies any traffic that isn’t encrypted in transit. The policy does not explicitly grant anyone the ability to mount the file system:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": {
                "AWS": "*"
            },
            "Action": "*",
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}

We are now ready to configure the EFS service. In the next batch of commands we are going to:

  • create an EFS file system
  • set a default policy that enforces in-transit encryption for all clients
  • create and configure a security group (efs-SG) that allows inbound access on port 2049 (the NFS port) from standalone-app-SG
  • create two mount targets in the two private subnets
  • create an EFS Access Point called standalone-app-EFS-AP that maps to the directory /server

We are now ready to launch the AWS CLI commands that will create the setup mentioned above:

EFS=$(aws efs create-file-system --region $AWS_REGION)
FILESYSTEM_ID=$(echo $EFS | jq --raw-output .FileSystemId)
echo The FILESYSTEM_ID is: $FILESYSTEM_ID | tee -a ecs-efs-variables.log 
sleep 10

aws efs put-file-system-policy --file-system-id $FILESYSTEM_ID --policy file://efs-policy.json --region $AWS_REGION

EFS_SG=$(aws ec2 create-security-group --group-name efs-SG --description "EFS SG" --vpc-id $VPC_ID --region $AWS_REGION) 
EFS_SG_ID=$(echo $EFS_SG | jq --raw-output .GroupId)
aws ec2 authorize-security-group-ingress --group-id $EFS_SG_ID --protocol tcp --port 2049 --source-group $STANDALONE_APP_SG_ID --region $AWS_REGION
echo The EFS_SG_ID is: $EFS_SG_ID | tee -a ecs-efs-variables.log

EFS_MOUNT_TARGET_1=$(aws efs create-mount-target --file-system-id $FILESYSTEM_ID --subnet-id $PRIVATE_SUBNET1 --security-groups $EFS_SG_ID --region $AWS_REGION)
EFS_MOUNT_TARGET_2=$(aws efs create-mount-target --file-system-id $FILESYSTEM_ID --subnet-id $PRIVATE_SUBNET2 --security-groups $EFS_SG_ID --region $AWS_REGION)
EFS_MOUNT_TARGET_1_ID=$(echo $EFS_MOUNT_TARGET_1 | jq --raw-output .MountTargetId)
EFS_MOUNT_TARGET_2_ID=$(echo $EFS_MOUNT_TARGET_2 | jq --raw-output .MountTargetId)
echo The EFS_MOUNT_TARGET_1_ID is: $EFS_MOUNT_TARGET_1_ID | tee -a ecs-efs-variables.log
echo The EFS_MOUNT_TARGET_2_ID is: $EFS_MOUNT_TARGET_2_ID | tee -a ecs-efs-variables.log

EFS_ACCESSPOINT=$(aws efs create-access-point --file-system-id $FILESYSTEM_ID --posix-user "Uid=1000,Gid=1000" --root-directory "Path=/server,CreationInfo={OwnerUid=1000,OwnerGid=1000,Permissions=755}" --region $AWS_REGION)
EFS_ACCESSPOINT_ID=$(echo $EFS_ACCESSPOINT | jq --raw-output .AccessPointId)
EFS_ACCESSPOINT_ARN=$(echo $EFS_ACCESSPOINT | jq --raw-output .AccessPointArn)
echo The EFS_ACCESSPOINT_ID is: $EFS_ACCESSPOINT_ID | tee -a ecs-efs-variables.log
echo The EFS_ACCESSPOINT_ARN is: $EFS_ACCESSPOINT_ARN | tee -a ecs-efs-variables.log
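
Mount targets take a minute or two to become available. Before launching tasks that mount the file system, you may want to check their state (an optional check):

# optional: both mount targets should report "available" before tasks attempt to mount the file system
aws efs describe-mount-targets --file-system-id $FILESYSTEM_ID --region $AWS_REGION --query 'MountTargets[].LifeCycleState'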

If you want to know more about why we opted to create an EFS Access Point to mount the /server directory on the file system, please refer to Part 2 where we talk about the advantage of using access points.

Now that we have our EFS file system properly configured, we need to make our application aware of it. To do so, we are going to:

  • create an IAM role (standalone-app-role) that grants permissions to map the EFS Access Point
  • tweak the task definition (standalone-app.json) to:
    • add a task role that grants permissions to map the EFS Access Point
    • add the directives to connect to the EFS Access Point we created above

Create a policy called standalone-app-task-role-policy.json and add the following, making sure you properly configure your EFS file system ARN and your EFS Access Point ARN. This information was printed on your screen when we echoed the variables above, or you can retrieve it from the ecs-efs-variables.log file. This policy grants access to the specific access point we created.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "elasticfilesystem:ClientMount",
                "elasticfilesystem:ClientWrite"
            ],
            "Resource": "arn:aws:elasticfilesystem:REGION:ACCOUNT_ID:file-system/fs-xxxxxx",
            "Condition": {
                "StringEquals": {
                    "elasticfilesystem:AccessPointArn": "arn:aws:elasticfilesystem:REGION:ACCOUNT_ID:access-point/fsap-xxxxxxxxxxxxx"
                }
            }
        }
    ]
}

Open the standalone-app.json task definition and add the taskRoleArn, the mountPoints section, and the volumes section. You can either recreate the file from the following skeleton (and re-customize it) or add the new directives to the standalone-app.json task definition you have already customized.

{"family": "standalone-app",
    "networkMode": "awsvpc",
    "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/task-exec-role",
    "taskRoleArn": "arn:aws:iam::ACCOUNT_ID:role/standalone-app-role",
    "containerDefinitions": [
        {"name": "standalone-app",
            "image": "<xxxxxxxxxxxxxxxxxxxxx>:1.0",
            "logConfiguration": {
                "logDriver": "awslogs",
                    "options": {
                       "awslogs-group": "/aws/ecs/standalone-app",
                       "awslogs-region": "REGION",
                       "awslogs-stream-prefix": "standalone-app"
                       }
                },
            "mountPoints": [
                {"containerPath": "/server",
                    "sourceVolume": "efs-server-AP"
                }
            ]
        }
    ],
    "requiresCompatibilities": [
        "FARGATE",
        "EC2"
    ],
    "volumes": [
        {"name": "efs-server-AP",
            "efsVolumeConfiguration": {"fileSystemId": "fs-xxxxxx",
                "transitEncryption": "ENABLED",
                "authorizationConfig": {
                    "accessPointId": "fsap-xxxxxxxxxxxxxxxx",
                    "iam": "ENABLED"
             }
            }
        }
    ],
    "cpu": "256",
    "memory": "512"
}

We are now ready to launch the batch of commands that will implement the integration between ECS and EFS:

aws iam create-role --role-name standalone-app-role --assume-role-policy-document file://ecs-tasks-trust-policy.json --region $AWS_REGION
aws iam put-role-policy --role-name standalone-app-role --policy-name efs-ap-rw --policy-document file://standalone-app-task-role-policy.json --region $AWS_REGION

aws ecs register-task-definition --cli-input-json file://standalone-app.json --region $AWS_REGION
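
As an optional check, you can confirm that the new revision of the task definition carries the EFS volume configuration:

# optional: inspect the volumes section of the latest standalone-app revision
aws ecs describe-task-definition --task-definition standalone-app --region $AWS_REGION --query 'taskDefinition.volumes'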

With this, we have decoupled the lifecycle of the task from “the data.” In our case, the data is just a configuration file but it could be anything. This is a visual representation of what we have configured:

Let’s see what happens if we launch the very same script that we used before. As a reminder, this script starts and stops the task five times, roughly every two minutes:

sh-4.2# ./standalone-loop-check.sh 
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.110.38
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.83.106
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.10.159
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.126.224
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.89.26
Origin - Availability Zone     : us-west-2b
------------
sh-4.2# 

Do you spot anything different from the previous run? Now the RANDOM_ID persists across application restarts because the configuration (in our example, the /server/config.json file) has been moved out to the EFS file system. Also note how the standalone tasks are started in different Availability Zones and yet they all reach the same file system. Mission accomplished!

Multiple tasks that access a shared file system in parallel

In this section, we will build on what we have seen so far and demonstrate how tasks working in parallel can access a common shared file system. We will keep using our application as a proxy for the many possibilities this pattern allows customers to achieve (whether it’s deploying a scale-out WordPress workload or running a parallel machine learning job).

Our (fictitious) application is now serving a broader and more distributed community. We can no longer afford to turn it off during non-working hours because it’s now serving users 24/7. Not only that, changes have been introduced to the architecture so that the application can now scale out. This is a welcome enhancement given the load it needs to support. There is, however, still the requirement of persisting the /server/config.json file, and we now need to solve for how multiple ECS tasks can access the same file in parallel. We will elect the task we defined in the previous section to be the “master” of this application, with read/write permissions to the EFS /server folder. In this section, we are going to leverage the same EFS Access Point pointing to the /server directory and provide read-only access to a set of four ECS tasks that serve the load behind a load balancer.

This approach shows how you can bypass POSIX permissions and use IAM policies to delegate various degrees of access to the EFS file system. Refer to Part 2 of this blog series if you want to read more about this.

We create a new policy document called scale-out-app-task-role-policy.json. Note that this policy grants read-only access to the access point. Make sure you properly configure your EFS file system ARN and your EFS Access Point ARN.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "elasticfilesystem:ClientMount"
            ],
            "Resource": "arn:aws:elasticfilesystem:REGION:ACCOUNT_ID:file-system/fs-xxxxxx",
            "Condition": {
                "StringEquals": {
                    "elasticfilesystem:AccessPointArn": "arn:aws:elasticfilesystem:REGION:ACCOUNT_ID:access-point/fsap-xxxxxxxxxxxxx"
                }
            }
        }
    ]
}

We can now create the new task role and attach the policy document we have just created.

aws iam create-role --role-name scale-out-app-role --assume-role-policy-document file://ecs-tasks-trust-policy.json --region $AWS_REGION
aws iam put-role-policy --role-name scale-out-app-role --policy-name efs-ap-r --policy-document file://scale-out-app-task-role-policy.json --region $AWS_REGION

Next, we create a new task definition called scale-out-app.json. This file is similar to the standalone-app.json task definition we used in the previous section, with some notable differences:

  • the family
  • the containerDefinitions/name
  • the awslogs-group and awslogs-stream-prefix
  • the taskRoleArn (the one we created in this section)
  • the portMappings section (needed here to register the tasks with the load balancer)

{"family": "scale-out-app",
    "networkMode": "awsvpc",
    "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/task-exec-role",
    "taskRoleArn": "arn:aws:iam::ACCOUNT_ID:role/scale-out-app-role",
    "containerDefinitions": [
        {"name": "scale-out-app",
            "image": "<xxxxxxxxxxxxxxxxxxxxx>:1.0",
            "portMappings": [
                {
                    "containerPort": 80
                }
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                    "options": {
                       "awslogs-group": "/aws/ecs/scale-out-app",
                       "awslogs-region": "REGION",
                       "awslogs-stream-prefix": "scale-out-app"
                       }
                },
            "mountPoints": [
                {"containerPath": "/server",
                    "sourceVolume": "efs-server-AP"
                }
            ]
        }
    ],
    "requiresCompatibilities": [
        "FARGATE",
        "EC2"
    ],
    "volumes": [
        {"name": "efs-server-AP",
            "efsVolumeConfiguration": {"fileSystemId": "fs-xxxxxx",
                "transitEncryption": "ENABLED",
                "authorizationConfig": {
                    "accessPointId": "fsap-xxxxxxxxxxxxxxxx",
                    "iam": "ENABLED"
             }
            }
        }
    ],
    "cpu": "256",
    "memory": "512"
}

Now we can register this new task definition:

aws ecs register-task-definition --cli-input-json file://scale-out-app.json --region $AWS_REGION

And we are now ready to launch the last batch of commands to deploy the scale-out version of the application. These commands do the following:

  • create and configure a security group for the scale-out application (scale-out-app-SG)
  • add the scale-out-app-SG to the efs-SG to allow the scale-out app to talk to the EFS mount targets
  • create and configure an ALB to balance traffic across the four tasks
  • create a dedicated log group (/aws/ecs/scale-out-app) to collect the logs
  • create an ECS service that starts the four Fargate tasks

SCALE_OUT_APP_SG=$(aws ec2 create-security-group --group-name scale-out-app-SG --description "Scale-out app SG" --vpc-id $VPC_ID --region $AWS_REGION) 
SCALE_OUT_APP_SG_ID=$(echo $SCALE_OUT_APP_SG | jq --raw-output .GroupId)
aws ec2 authorize-security-group-ingress --group-id $SCALE_OUT_APP_SG_ID --protocol tcp --port 80 --cidr 0.0.0.0/0 --region $AWS_REGION
echo The SCALE_OUT_APP_SG_ID is: $SCALE_OUT_APP_SG_ID | tee -a ecs-efs-variables.log

aws ec2 authorize-security-group-ingress --group-id $EFS_SG_ID --protocol tcp --port 2049 --source-group $SCALE_OUT_APP_SG_ID --region $AWS_REGION

LOAD_BALANCER=$(aws elbv2 create-load-balancer --name scale-out-app-LB --subnets $PUBLIC_SUBNET1 $PUBLIC_SUBNET2 --security-groups $SCALE_OUT_APP_SG_ID --region $AWS_REGION)
LOAD_BALANCER_ARN=$(echo $LOAD_BALANCER | jq --raw-output .LoadBalancers[0].LoadBalancerArn)
LOAD_BALANCER_DNSNAME=$(echo $LOAD_BALANCER | jq --raw-output .LoadBalancers[0].DNSName)
export LOAD_BALANCER_DNSNAME # the LOAD_BALANCER_DNSNAME variable needs to be exported to be able to use it from scripts
echo The LOAD_BALANCER_ARN is: $LOAD_BALANCER_ARN | tee -a ecs-efs-variables.log
TARGET_GROUP=$(aws elbv2 create-target-group --name scale-out-app-TG --protocol HTTP --port 80 --target-type ip --vpc-id $VPC_ID --region $AWS_REGION)
TARGET_GROUP_ARN=$(echo $TARGET_GROUP | jq --raw-output .TargetGroups[0].TargetGroupArn)
echo The TARGET_GROUP_ARN is: $TARGET_GROUP_ARN | tee -a ecs-efs-variables.log
LB_LISTENER=$(aws elbv2 create-listener --load-balancer-arn $LOAD_BALANCER_ARN --protocol HTTP --port 80 --default-actions Type=forward,TargetGroupArn=$TARGET_GROUP_ARN --region $AWS_REGION)
LB_LISTENER_ARN=$(echo $LB_LISTENER | jq --raw-output .Listeners[0].ListenerArn)
echo The LB_LISTENER_ARN is: $LB_LISTENER_ARN | tee -a ecs-efs-variables.log

aws logs create-log-group --log-group-name /aws/ecs/scale-out-app --region $AWS_REGION

aws ecs create-service --service-name scale-out-app --cluster app-cluster --load-balancers "targetGroupArn=$TARGET_GROUP_ARN,containerName=scale-out-app,containerPort=80" --task-definition scale-out-app --desired-count 4 --launch-type FARGATE --platform-version 1.4.0 --network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNET1, $PRIVATE_SUBNET2],securityGroups=[$SCALE_OUT_APP_SG_ID],assignPublicIp=DISABLED}" --region $AWS_REGION
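
Before querying the load balancer, it may be worth waiting for the service to stabilize and for the four targets to pass their health checks (optional, but it avoids confusing errors from the ALB while the tasks come up):

# optional: wait for the four tasks to be running and registered as healthy targets
aws ecs wait services-stable --cluster app-cluster --services scale-out-app --region $AWS_REGION
aws elbv2 describe-target-health --target-group-arn $TARGET_GROUP_ARN --region $AWS_REGION --query 'TargetHealthDescriptions[].TargetHealth.State'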

The following diagram shows what we have created:

Let’s see how the application behaves in action now. To do this, we will run a loop where curl hits the load balancer public DNS name to query the home page. This is the scale-out-loop-check.sh script:

#!/bin/bash
COUNTER=0
echo ------------
while [  $COUNTER -lt 10 ]; do
    curl $LOAD_BALANCER_DNSNAME
    echo ------------
    let COUNTER=COUNTER+1
done

Add the execute flag to the script (chmod +x scale-out-loop-check.sh) and launch it. The output should be similar to this:

sh-4.2# ./scale-out-loop-check.sh
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.228.203
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.224.117
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.228.203
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.224.117
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.168.140
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.145.194
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.168.140
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.145.194
Origin - Availability Zone     : us-west-2a
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.228.203
Origin - Availability Zone     : us-west-2b
------------
Unique configuration ID       : S5yvUI12p1SqwyMqunJweD57gofJG6ho
Origin - Private IP            : 10.0.224.117
Origin - Availability Zone     : us-west-2b
------------
sh-4.2# 

As you can see, all four tasks are being balanced by the ALB and each of them responds with the same RANDOM_ID coming from the now shared /server/config.json file. Again, these tasks are by design deployed across the Availability Zones we have configured, and yet they all have access to the same data. Mission accomplished!

Tearing down the environment

It is now time to tear down the environment you have created. This is the list of commands to delete the resources we created in this blog post:

aws ecs update-service --service scale-out-app --desired-count 0 --cluster app-cluster --region $AWS_REGION 
sleep 10
aws ecs delete-service --service scale-out-app --cluster app-cluster --region $AWS_REGION

aws efs delete-mount-target --mount-target-id $EFS_MOUNT_TARGET_1_ID --region $AWS_REGION
aws efs delete-mount-target --mount-target-id $EFS_MOUNT_TARGET_2_ID --region $AWS_REGION
aws efs delete-access-point --access-point-id $EFS_ACCESSPOINT_ID --region $AWS_REGION
sleep 10
aws efs delete-file-system --file-system-id $FILESYSTEM_ID --region $AWS_REGION

aws elbv2 delete-listener --listener-arn $LB_LISTENER_ARN 
aws elbv2 delete-target-group --target-group-arn $TARGET_GROUP_ARN  
aws elbv2 delete-load-balancer --load-balancer-arn $LOAD_BALANCER_ARN

aws ec2 delete-security-group --group-id $EFS_SG_ID --region $AWS_REGION
aws ec2 delete-security-group --group-id $STANDALONE_APP_SG_ID --region $AWS_REGION
sleep 20
aws ec2 delete-security-group --group-id $SCALE_OUT_APP_SG_ID --region $AWS_REGION

aws logs delete-log-group --log-group-name /aws/ecs/standalone-app --region $AWS_REGION
aws logs delete-log-group --log-group-name /aws/ecs/scale-out-app --region $AWS_REGION

aws ecr delete-repository --repository-name standalone-app --force --region $AWS_REGION

aws ecs delete-cluster --cluster app-cluster --region $AWS_REGION

aws iam detach-role-policy --role-name task-exec-role --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
aws iam delete-role --role-name task-exec-role --region $AWS_REGION
aws iam delete-role-policy --role-name standalone-app-role --policy-name efs-ap-rw
aws iam delete-role --role-name standalone-app-role --region $AWS_REGION
aws iam delete-role-policy --role-name scale-out-app-role --policy-name efs-ap-r
aws iam delete-role --role-name scale-out-app-role --region $AWS_REGION

Remember to delete the VPC if you created it for the purpose of this exercise.

Conclusions

This concludes our blog post series. In Part 1, we explored the basics and the context of the ECS and EFS integration. In Part 2, we explored some of the technical details and architectural considerations, with a focus on how to secure access to EFS. In this last part, we tied it all together and showed examples of how you could implement what we covered in the previous posts. By now you should have the basis to understand the applicability of this integration and the knowledge to build something specific to your needs with it.

Massimo Re Ferre

Massimo is a Senior Principal Technologist at AWS. He has been working on containers since 2014 and is now part of the DECS (Developers, Events, Containers, Serverless) organization at AWS. Massimo has a blog at https://it20.info and his Twitter handle is @mreferre.