App-FargateStack


premium features.

=head2 Roadmap for HTTP Services

=over 4

=item * path based routing on ALB listeners

=back

=head1 AUTOSCALING

=head2 Overview

For services that experience variable load, such as HTTP applications or
background job processors, C<App::FargateStack> can automate the process of
scaling the number of running tasks up or down to meet demand. This ensures
high availability during traffic spikes and saves costs during quiet periods.

The framework integrates with AWS Application Auto Scaling to provide target
tracking scaling policies. This allows you to define a target metric - such as
average CPU utilization or the number of requests per minute - and the framework
will automatically manage the number of Fargate tasks to keep that metric at
your desired level.
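Target tracking scaling is proportional: Application Auto Scaling proposes a new task count of roughly C<current_tasks * (current_metric / target)>, rounded up and clamped to your configured capacity bounds. A minimal Python sketch of that arithmetic (an illustration of the behavior, not the framework's code):

```python
import math

def desired_tasks(current_tasks, current_metric, target, min_cap, max_cap):
    """Approximate target tracking: the proposed task count scales
    proportionally with (metric / target), then is clamped to the
    configured min/max capacity bounds."""
    if current_metric <= 0:
        return min_cap
    proposed = math.ceil(current_tasks * current_metric / target)
    return max(min_cap, min(max_cap, proposed))

# Average CPU at 90% with a 60% target: scale out from 2 to 3 tasks.
print(desired_tasks(2, 90, 60, 1, 10))  # -> 3
```

Note how C<min_capacity> and C<max_capacity> bound the result regardless of how far the metric drifts from its target.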

=head2 Enabling Autoscaling

To enable autoscaling for a service, add an C<autoscaling> block to its task
configuration in your .yml configuration file.

 tasks:
   my-service:
     # ... other task settings ...
     autoscaling:
       min_capacity: 1
       max_capacity: 10
       cpu: 60

=head2 Configuration Parameters

The C<autoscaling> block accepts the following keys:

=over

=item * B<min_capacity> (Required)

The minimum number of tasks to keep running at all times. The service will
never scale in below this number.

=item * B<max_capacity> (Required)

The maximum number of tasks that the service can scale out to. This acts as
a safeguard to control costs.

=item * B<cpu> OR B<requests> (Required, mutually exclusive)

You must specify exactly one scaling metric.

=over

=item * C<cpu>: The target average CPU utilization percentage across all tasks in
the service. Valid values are between 1 and 100.

=item * C<requests>: The target number of requests per minute for each task. This
is only valid for tasks of type C<http> or C<https> that are behind an
Application Load Balancer.

=back

=item * B<scale_in_cooldown> (Optional)

The amount of time, in seconds, to wait after a scale-in activity before
another scale-in activity can start. This prevents the service from scaling
in too aggressively.

Default: C<300>

=item * B<scale_out_cooldown> (Optional)

The amount of time, in seconds, to wait after a scale-out activity before
another scale-out activity can start. This allows new tasks time to warm up
and start accepting traffic before the service decides to scale out again.

Default: C<60>

=item * B<policy_name> (Managed by C<App::FargateStack>)

This is a unique name for the scaling policy generated by the framework. It
is written to your configuration file and used to detect drift between your
configuration and the live environment in AWS. You should not modify this
value.

=back
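The constraints above can be checked mechanically before a deploy. A short Python sketch of the validation rules as described in this section (illustrative only; C<validate_autoscaling> is a hypothetical helper, not part of the module):

```python
def validate_autoscaling(block):
    """Validate an autoscaling block against the documented rules:
    min/max capacity are required, exactly one metric (cpu OR requests)
    must be set, and cpu must be a percentage between 1 and 100."""
    errors = []
    for key in ("min_capacity", "max_capacity"):
        if key not in block:
            errors.append(f"{key} is required")
    metrics = [k for k in ("cpu", "requests") if k in block]
    if len(metrics) != 1:
        errors.append("specify exactly one of 'cpu' or 'requests'")
    if "cpu" in block and not 1 <= block["cpu"] <= 100:
        errors.append("cpu must be between 1 and 100")
    if ("min_capacity" in block and "max_capacity" in block
            and block["min_capacity"] > block["max_capacity"]):
        errors.append("min_capacity cannot exceed max_capacity")
    return errors

print(validate_autoscaling({"min_capacity": 1, "max_capacity": 10, "cpu": 60}))  # -> []
```

An empty list means the block satisfies every rule; otherwise each string describes one violation.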

=head2 Example: Scaling on CPU Utilization

This configuration will maintain at least 1 task, scale up to a maximum of 5
tasks, and will add or remove tasks to keep the average CPU utilization at or
near 60%.

 tasks:
   my-cpu-intensive-worker:
     type: daemon
     image: my-worker:latest
     autoscaling:
       min_capacity: 1
       max_capacity: 5
       cpu: 60

=head2 Example: Scaling on ALB Requests

This configuration will maintain at least 2 tasks, scale up to a maximum of 20
tasks, and will add or remove tasks to keep the number of requests per minute
for each task at or near 1000. It also specifies custom cooldown periods.

 tasks:
   my-website:
     type: https
     image: my-website:latest
     autoscaling:
       min_capacity: 2
       max_capacity: 20
       requests: 1000
       scale_in_cooldown: 600
       scale_out_cooldown: 120


