App-FargateStack
share/README.md view on Meta::CPAN
### Step 3: Apply the Plan
Once you have reviewed the plan and are satisfied with the proposed
changes, run the `apply` command. This will execute the plan and
create all the necessary AWS resources.

    app-FargateStack apply

### Step 4: Deploy and Start the Service
The `apply` command creates all the necessary **infrastructure**, but
it does not start your service. This separation allows you to manage
your infrastructure and your application's runtime state
independently.
To create the ECS service and start your container, use the
`deploy-service` command.

    app-FargateStack deploy-service my-stack-daemon

By default, this will start one instance of your task. To check its
status, use the `status` command:

    app-FargateStack status my-stack-daemon

And to stop the service, simply run:

    app-FargateStack stop-service my-stack-daemon

To restart a stopped service, run:

    app-FargateStack start-service my-stack-daemon

## VPC AND SUBNET DISCOVERY
If you do not specify a `vpc_id` in your configuration, the framework will attempt
to locate a usable VPC automatically.
A VPC is considered usable if it meets the following criteria:
- It is attached to an Internet Gateway (IGW)
- It has at least one available NAT Gateway
If no eligible VPCs are found, the process will fail with an error. If multiple
eligible VPCs are found, the framework will abort and list the candidate VPC IDs.
You must then explicitly set the `vpc_id:` in your configuration to resolve
the ambiguity.
If exactly one eligible VPC is found, it will be used automatically,
and a warning will be logged to indicate that the selection was
inferred.
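
The selection rules above can be sketched as a small function. This is an illustration only, not the framework's code (which is written in Perl): the record shape (`id`, `has_igw`, `nat_gateways`) and the name `select_vpc` are assumptions made for the example.

```python
import logging

logger = logging.getLogger("vpc_discovery")


def select_vpc(vpcs):
    """Pick a VPC id from candidate records.

    Each record is a hypothetical dict:
    {"id": str, "has_igw": bool, "nat_gateways": int}
    """
    # A VPC is usable if it has an IGW and at least one NAT Gateway.
    eligible = [
        v["id"] for v in vpcs
        if v["has_igw"] and v["nat_gateways"] >= 1
    ]
    if not eligible:
        raise RuntimeError("no eligible VPC found; set vpc_id in the configuration")
    if len(eligible) > 1:
        # Ambiguous: list the candidates and make the user choose.
        raise RuntimeError(
            "multiple eligible VPCs found, set vpc_id explicitly: "
            + ", ".join(eligible)
        )
    # Exactly one candidate: use it, but warn that it was inferred.
    logger.warning("vpc_id not configured; inferred %s", eligible[0])
    return eligible[0]
```

Raising on ambiguity rather than picking the first match mirrors the behaviour described above: the framework lists the candidates and leaves the choice to you.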
## SUBNET SELECTION
If no subnets are specified in the configuration, the framework will query all
subnets in the selected VPC and categorize them as either public or private.
The task will be placed in a private subnet by default. For this to succeed,
your VPC must have at least one private subnet with a route to a NAT Gateway,
or have appropriate VPC endpoints configured for ECR, S3, STS, CloudWatch Logs,
and any other services your task needs.
If subnets are explicitly specified in your configuration, the
framework will validate them and warn if they are not reachable or are
not usable for Fargate tasks.
### Task placement and Availability Zones
The framework places each task's ENI into exactly one subnet, which fixes
that task in a single AZ. A service can span multiple AZs by listing
subnets from at least two AZs.
What the framework does:
- Prefers private subnets

    If private subnets are defined in the configuration, tasks are placed
    there. If no private subnets are defined, the framework falls back to
    public subnets.

- Aligns ALB AZs with task placement

    When a load balancer is used, the framework enables the ALB in the same
    AZ set it selects for tasks (best practice). This is for resilience and
    to avoid unnecessary cross-AZ hops; it is not a hard technical requirement.

- Requires two subnets

    The configuration must specify at least two subnets in different AZs.
    If subnets are not specified, the framework attempts to discover them,
    but still requires at least two usable subnets (either both private or
    both public). If fewer than two are available, it errors with guidance.

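
As a rough sketch of the placement rules above (private first, public as fallback, at least two AZs, no mixing of private and public), consider the following. The function name and the `(subnet_id, az)` tuple shape are assumptions for illustration; the framework's actual discovery logic may differ in detail.

```python
def choose_subnets(private, public):
    """Choose the subnet pool for task placement.

    private, public: lists of (subnet_id, availability_zone) tuples
    (a hypothetical shape for this example).
    Returns (kind, sorted_subnet_ids).
    """
    # Private subnets are preferred; public subnets are the fallback.
    for kind, pool in (("private", private), ("public", public)):
        zones = {az for _, az in pool}
        # The pool is usable only if it spans at least two AZs.
        if len(zones) >= 2:
            return kind, sorted(subnet_id for subnet_id, _ in pool)
    raise ValueError(
        "need at least two subnets in different AZs, "
        "either all private or all public"
    )
```

For example, one private subnet plus two public subnets in different AZs yields the public pool, because the single private subnet cannot satisfy the two-AZ requirement on its own.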
Notes on internet access and ALBs:
- Internet-facing ALB

    An internet-facing ALB must be created in public subnets. Tasks may (and
    usually should) remain in private subnets behind it.

- Egress from private subnets

    For image pulls and outbound calls, use either a NAT Gateway in each AZ
    or VPC endpoints for ECR (api and dkr) and S3.

- Egress from public subnets

    If tasks are placed in public subnets without endpoints or NAT, they
    require `assignPublicIp=ENABLED` to reach ECR/S3.

## REQUIRED SECTIONS
At minimum, your configuration must include the following:

    app:
      name: my-stack
    tasks:
      my-task:
        image: my-image
        type: daemon | task | http | https

For task types `http` or `https`, you must also specify a domain name:

    domain: example.com

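
A minimal pre-flight check for these requirements might look like the sketch below. It operates on the configuration after it has been loaded into a dictionary. The function name and error strings are hypothetical, and treating `image` and `type` as mandatory is an assumption based on the minimal example above, not on the framework's own validation.

```python
VALID_TYPES = {"daemon", "task", "http", "https"}


def validate_config(config):
    """Return a list of problems; an empty list means the minimum is present."""
    errors = []
    if not (config.get("app") or {}).get("name"):
        errors.append("app.name is required")
    tasks = config.get("tasks") or {}
    if not tasks:
        errors.append("at least one task is required")
    for name, task in tasks.items():
        task = task or {}
        if not task.get("image"):
            errors.append(f"tasks.{name}.image is required")
        task_type = task.get("type")
        if task_type not in VALID_TYPES:
            errors.append(f"tasks.{name}.type must be one of {sorted(VALID_TYPES)}")
        # http/https tasks need a top-level domain.
        if task_type in ("http", "https") and not config.get("domain"):
            errors.append(f"tasks.{name} is {task_type}; domain is required")
    return errors
```

Returning a list instead of raising on the first problem lets you report every missing key in one pass.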
## FULL SCHEMA OVERVIEW
The framework will expand and update your configuration file with default values as needed.
Here is the full schema outline. All keys are optional unless otherwise noted:

    ---
    account:
    alb:
      arn:
      name:
      port:
      type:
    app:
      name: # required
      version:
    certificate_arn:
    cluster:
      arn:
      name:
    default_log_group:
    domain: # required for http/https tasks
    id:
    last_updated:
    region:
    role:
      arn:
      name:
      policy_name:
    route53:
      profile:
      zone_id:
    security_groups:
      alb:
        group_id:
        group_name:
      fargate:
        group_id:
        group_name:
    subnets:
      private:
      public:
    tasks:

- Stacks may contain multiple daemon services, but only one task
may be exposed as an HTTP/HTTPS service via an ALB.
- Limited configuration options for some resources such as
advanced load balancer listener rules, custom CloudWatch metrics, or
task health check tuning.
- Some out-of-band infrastructure changes may break the ability
to re-run `app-FargateStack` without manually updating the
configuration.
- Only one EFS filesystem per task is supported.
- This framework assumes that the
[operatingSystemFamily](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters_ec2.html#runtime-platform_ec2)
is "LINUX" and the `cpuArchitecture` is "X86\_64". This is
unlikely to change.
[Back to Table of Contents](#table-of-contents)
# TROUBLESHOOTING
## Warning: task placed in a public subnet
When running a task you may see:

    [2025/08/05 03:40:58] run-task: subnet-id: [subnet-7c160c37] is in a public subnet...consider running your jobs in a private subnet

This means the task is being scheduled in a subnet that has a
0.0.0.0/0 route to an Internet Gateway (a public subnet).
While not fatal, placing tasks in public subnets is discouraged unless
you have a specific need.
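
The "public subnet" test described above can be expressed over a subnet's route entries. The sketch below assumes the routes are dictionaries shaped like the `Routes` entries returned by the EC2 `DescribeRouteTables` API (`DestinationCidrBlock`, `GatewayId`, `NatGatewayId`). The function name is hypothetical and this is not the framework's own code.

```python
def is_public_subnet(routes):
    """True if any route sends 0.0.0.0/0 to an Internet Gateway."""
    return any(
        route.get("DestinationCidrBlock") == "0.0.0.0/0"
        and (route.get("GatewayId") or "").startswith("igw-")
        for route in routes
    )
```

A default route that points at a NAT Gateway (`NatGatewayId` set, no `igw-` gateway) does not make the subnet public under this definition.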
### Why this matters
Running tasks in public subnets can introduce risk and operational
surprises:
- Accidental exposure

    If the task is assigned a public IP and the security group allows
    inbound access, it may be reachable from the internet.

- Unintended dependency

    Public-subnet egress typically relies on a public IP and the Internet
    Gateway. That can bypass intended egress controls, logging, or central
    inspection.

- Narrow security margin

    Safety depends entirely on security groups and NACLs. A small
    misconfiguration can expose services or data.

### Recommended pattern
Use private subnets for most Fargate workloads. Private subnets do not
route directly to the internet.
If the task needs outbound access (for example, to pull images from
ECR or call external APIs), use one of:
- A NAT Gateway (private subnet egress to the internet)
- VPC interface endpoints for ECR (ecr.api and ecr.dkr) and a
gateway endpoint for S3, so image pulls stay inside the VPC with no
public IPs
For public-facing applications, the common pattern is: tasks in
private subnets, fronted by a public Application Load Balancer in
public subnets.
### When is a public subnet acceptable?
Use a public subnet only when the task itself must have a public IP
and terminate client connections directly (uncommon). If you do:
- Set assignPublicIp=ENABLED so the task can reach the internet
via the Internet Gateway
- Keep security groups locked down and monitor egress on TCP 443
### Note on image pulls
To pull from ECR, the task needs a path to ECR API, ECR DKR, and S3:
- Public subnet: requires a public IP (assignPublicIp=ENABLED),
unless you provision VPC endpoints
- Private subnet: works via a NAT Gateway, or entirely private
via VPC endpoints (no public IPs)
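
These rules reduce to a small decision function. The sketch below models only the conditions listed above; it ignores security groups, NACLs, and DNS settings, and its name and parameters are assumptions made for illustration.

```python
def can_pull_from_ecr(subnet_is_public, assign_public_ip=False,
                      has_nat_route=False, has_vpc_endpoints=False):
    """Decide whether a task has a network path to ECR and S3."""
    # VPC endpoints work from any subnet and need no public IP.
    if has_vpc_endpoints:
        return True
    # Public subnet: the Internet Gateway route is usable only with a public IP.
    if subnet_is_public:
        return assign_public_ip
    # Private subnet: egress depends on a route to a NAT Gateway.
    return has_nat_route
```

Note that a public subnet without a public IP returns `False` here, which is the failure mode behind the `GetAuthorizationToken` timeout described in the next section.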
## My task fails with this message:

    ResourceInitializationError: unable to pull secrets or registry auth:
    The task cannot pull registry auth from Amazon ECR: There is a
    connection issue between the task and Amazon ECR. Check your task
    network configuration. operation error ECR: GetAuthorizationToken,
    exceeded maximum number of attempts, 3, https response error
    StatusCode: 0, RequestID: , request send failed, Post
    "https://api.ecr.us-east-1.amazonaws.com/": dial tcp 44.213.79.10:443:
    i/o timeout

This error usually occurs when your task is launched in a subnet that
does not have outbound access to the internet. Internet access - or a
properly configured VPC endpoint - is required for Fargate to
authenticate with ECR and pull your container image.
### Common causes
- The task was placed in a public subnet but was not assigned a
public IP.
- The task was placed in a private subnet without access to a
NAT gateway or VPC endpoints.
Even though the subnet may have a route to an Internet Gateway (i.e.,
it is technically a "public" subnet), if the task does not receive a
public IP, it cannot use that route to reach external services like
ECR or Secrets Manager.
### How to fix it
- If using public subnets, ensure the task is assigned a public
IP.
- If using private subnets, ensure a NAT gateway is available
and the subnet has a route to it.
- Alternatively, configure VPC endpoints for ECR, Secrets
Manager, and related services to avoid needing internet access
altogether.
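
If you take the VPC endpoint route, the sketch below builds the endpoint service names you would typically need for a given region. The `com.amazonaws.<region>.<service>` naming is the standard AWS convention. The function itself is hypothetical, and the choice of services is an assumption: it covers ECR, S3 (as a gateway endpoint), and CloudWatch Logs, plus Secrets Manager on request. Your task may need others, such as STS.

```python
def endpoint_services(region, uses_secrets_manager=False):
    """Return (endpoint_type, service_name) pairs for private image pulls."""
    services = [
        ("interface", f"com.amazonaws.{region}.ecr.api"),
        ("interface", f"com.amazonaws.{region}.ecr.dkr"),
        # ECR stores image layers in S3; a gateway endpoint is sufficient.
        ("gateway", f"com.amazonaws.{region}.s3"),
        # Needed when the task writes logs to CloudWatch Logs.
        ("interface", f"com.amazonaws.{region}.logs"),
    ]
    if uses_secrets_manager:
        services.append(("interface", f"com.amazonaws.{region}.secretsmanager"))
    return services
```

For example, `endpoint_services("us-east-1")` returns four entries, one of which is the S3 gateway endpoint `com.amazonaws.us-east-1.s3`.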
### Note on Subnet Selection
`App::FargateStack` attempts to prevent this situation by analyzing
your VPC configuration during planning. It categorizes subnets as
private or public and evaluates whether they provide the necessary
network access to launch a Fargate task successfully. The framework
warns if you attempt to use a subnet that lacks internet or endpoint
access.
## My task failed to start and the reason is unclear
This is one of the most common and frustrating scenarios when working
with Fargate. You run `start-service` or `run-task`, the command
seems to succeed, but then the task quickly stops. The `status`
command shows the desired count is 1 but the running count is 0, and
the logs are empty.
This often happens due to a **resource initialization error**. The
problem isn't with your container image itself, but with the
infrastructure Fargate is trying to set up for it.
Common causes include:
- **Networking Issues**: The task is in a subnet that can't pull the
image from ECR (e.g., no NAT Gateway or VPC endpoints).
- **Permissions Errors**: The task's IAM role is missing a required
permission.
- **EFS Mount Failures**: The task cannot mount an EFS volume, often due
to a misconfigured security group or incorrectly specified path.
These errors are opaque because they happen deep inside the
AWS-managed environment. The high-level ECS API only reports a generic
failure, and since it's not an API call error, it won't appear in
CloudTrail.
### The Solution: Finding the `stoppedReason`
To solve this, `App-FargateStack` provides an optional argument to
the `list-tasks` command. By default, this command only shows
`RUNNING` tasks. However, if you add the `stopped` argument, it will
show recently stopped tasks and, most importantly, the reason they
stopped.
**The Command:**

    app-FargateStack list-tasks stopped

This will display a table of stopped tasks, including a `Stopped
Reason` column. This column often contains the detailed, multi-line
error message from the underlying AWS service that caused the failure,
giving you the exact information you need to debug the problem.
For example, if an EFS mount failed, the `stoppedReason` might
contain:

    ResourceInitializationError: failed to invoke EFS utils
    commands... mount.nfs4: mounting failed, reason given by server: No
    such file or directory

This tells you immediately that the problem is the EFS path, which a
generic "task failed" message would never reveal.
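
If you are working with raw ECS output instead, the same information lives in the `stoppedReason` field of each task returned by the ECS `DescribeTasks` API. The sketch below pulls it out of an already-fetched response dictionary. It is illustrative only and is not how `list-tasks` is implemented; the function name is an assumption.

```python
def stopped_reasons(response):
    """Return (task_id, stoppedReason) pairs for stopped tasks.

    response: a dict shaped like the ECS DescribeTasks result.
    """
    rows = []
    for task in response.get("tasks", []):
        if task.get("lastStatus") != "STOPPED":
            continue
        # The task id is the last path segment of the task ARN.
        task_id = task["taskArn"].rsplit("/", 1)[-1]
        rows.append((task_id, task.get("stoppedReason", "")))
    return rows
```

Running tasks are skipped, so the result contains only the tasks whose failure you are trying to explain.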
## Why is my task or service still using an old image?
This is one of the most common points of confusion when working with
ECS and Fargate.
You may have just built and pushed a new image to ECR using the same
tag (e.g. `latest`), but when you launch a task or deploy a service,
ECS appears to continue using the old image. Here's why.
### One-off tasks: `run-task` uses a fixed image digest
When you run a task using:

    app-FargateStack run-task my-task

ECS uses the exact task definition revision as registered. If the
image was specified using a tag like `:latest`, ECS resolves that tag
once -- at the time the task starts -- and stores the resolved digest
(e.g. `sha256:...`).
This means:
- Tasks launched this way will continue to run the old image, even if