Managing Load with Throttles

Implementing services requires setting limits on both inbound requests and responses to prevent overloading downstream systems within the architecture. This is crucial for maintaining the availability and performance of GLU.Engines, especially under high workloads.

TPS (Transactions per Second) is the measure commonly used to express these limits, and managing throttles effectively is necessary to ensure optimal performance.

GLU.Ware offers two types of load control throttle mechanisms:

Throttle Type 1: Requests per Time Period
Throttle Type 2: Concurrent Requests using Thread Pools

Throttle Type 1: Requests per Time Period

The Throttle feature in GLU provides the capability to regulate the workload of specific endpoints and prevent overloading. It also enables the enforcement of user quota limits, adherence to external service SLAs, and management of applied SLAs on the API.

In the Request Manager Control Panel, Throttles on Inbound Requests can be defined by enabling the ‘Add Throttle’ option. This opens the Throttle tool where a new Throttle can be created by clicking the green ‘plus’ sign, as illustrated in the screenshot below:

Base Configuration Options:

Time Period (milliseconds): This sets the duration for which the maximum number of requests will apply.
Max Requests per Time Period: This sets the maximum number of requests to allow within the set Time Period. It accepts a numeric value or a parameter/variable, e.g. 73 or ${maxRequests}.

Throttle Type 1 – Example

Time Period = 1000ms (1 second)
Max Requests per Time Period = 50

When your GLU.Engine receives a 51st Request within the same 1000ms (1 second) Time Period, that Request can be handled in one of two ways:

Configuration Options:

Queue excess calls: in the above example, the 51st call will be held in memory in a queue in the GLU.Engine until the Time Period completes. This is the ‘default’ option.
Reject excess calls: checking the ‘Reject Excess Requests’ box presents the configuration options for the alternative behaviour, which is to reject excess calls rather than queue them. In this scenario, you can define the HTTP Status Code, the Response Content-Type, and the Response Template to use.

A rejection takes place before anything else occurs in the API. It is the first thing to happen, and the reject Response Template supersedes any other templates that the API may generate.
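The two options can be illustrated with a minimal Python sketch. It is only a model of the behaviour described above, not the GLU.Engine implementation: the class name FixedWindowThrottle, its fields, and the use of fixed (rather than sliding) time windows are all assumptions made for illustration. Queued requests are simply reported with the time at which the current Time Period completes, and the reject response (status code, content type, template) is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FixedWindowThrottle:
    """Illustrative model only; not the GLU.Engine implementation."""

    time_period_ms: int          # Time Period (milliseconds)
    max_requests: int            # Max Requests per Time Period
    reject_excess: bool = False  # False models the default "queue" option
    window_start_ms: int = 0     # start of the current window (assumed fixed windows)
    count: int = 0               # requests admitted in the current window

    def admit(self, now_ms: int) -> Tuple[str, int]:
        """Return (decision, time_ms) for a request arriving at now_ms."""
        # Open a new window once the current Time Period has completed.
        if now_ms - self.window_start_ms >= self.time_period_ms:
            self.window_start_ms = now_ms
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return ("process", now_ms)
        if self.reject_excess:
            return ("reject", now_ms)
        # Queue: the request is held until the current Time Period completes.
        return ("queue", self.window_start_ms + self.time_period_ms)


# 51 requests arriving 10ms apart, with Time Period = 1000 and Max Requests = 50.
queueing = FixedWindowThrottle(time_period_ms=1000, max_requests=50)
print([queueing.admit(10 * i) for i in range(51)][-1])   # ('queue', 1000)

rejecting = FixedWindowThrottle(1000, 50, reject_excess=True)
print([rejecting.admit(10 * i) for i in range(51)][-1])  # ('reject', 500)
```

In both runs the first 50 requests are processed immediately. Only the 51st is affected: it is either held until the 1000ms mark or rejected on arrival.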

Throttle Type 1 Usage Scenarios

The Throttle Type 1 setting can be adjusted dynamically by setting the Max Requests per Time Period parameter/variable through an API call.
For example, a query parameter ${throttleSetting} can be set up to control the maximum number of requests.

In this way, you could control the number of requests based on some external influence, such as the number of credits a client has: more credits, more throughput.
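As a rough illustration of the credits scenario, the Python sketch below derives a throttle value from a client's credit balance and formats it as a query string. The policy (5 requests per credit, a floor of 1 and a ceiling of 200) and both function names are hypothetical; only the parameter name throttleSetting comes from the text above.

```python
from urllib.parse import urlencode


def max_requests_for_credits(credits: int,
                             requests_per_credit: int = 5,
                             floor: int = 1,
                             ceiling: int = 200) -> int:
    """Hypothetical policy: more credits, more throughput, within bounds."""
    return max(floor, min(credits * requests_per_credit, ceiling))


def throttle_query(credits: int) -> str:
    """Build the query string that carries the throttle value."""
    return urlencode({"throttleSetting": max_requests_for_credits(credits)})


print(throttle_query(10))   # throttleSetting=50
print(throttle_query(100))  # throttleSetting=200
```

The calling system would append this query string to whichever API call is configured to receive the parameter. The endpoint is deliberately not shown, since it depends on your own configuration.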

Throttle Type 2: Concurrent Requests using Thread Pools

The Thread Pool Throttle feature allows you to:

Ensure that a specific API does not get overloaded, or
Ensure that you do not exceed an agreed SLA with an external service.

Thread Pool Throttles are defined (configured) on the Inbound Request or Outbound Request Handlers.

The example below shows the settings configured for a call to an orchestration connector named “Purchase”.

To explain how the POOL SIZE, MAX POOL SIZE and MAX QUEUE SIZE settings affect the parallel execution of downstream threads to the “Purchase” connector, the example below uses a real-world analogy.

The Passport Office Analogy

This analogy describes a passport office staffed by clerks who deal with customers’ passport renewals one-on-one.

People arrive at the office with passport in hand. They queue up outside the main passport processing room and are only let in to have their passports processed when a clerk is available to deal with them.

In the floor plan view above, 2 clerks are processing customers at Desks 1 & 2, and 2 off-duty clerks will rush out to serve new customers at Desks 3 & 4.

The customers represent the Tasks, i.e. the API calls being received or generated.

Max Queue Size: reflects the number of customers in the queue and in the passport processing room being processed. The floor plan view above currently shows 10 customers.
Max Queue Size set to zero: the passport office can also set the queue size to zero. In this case, when a customer arrives at the passport office and all the clerks are busy, the customer will be told to go away.
Max Pool Size: reflects the number of clerks the passport office has available to process customers’ passports; in this case, it is reflected by the desks available. Clerks retire to the back room when they are not needed, i.e. when no customers need processing.
Pool Size: reflects the minimum number of clerks who are always at the desks to serve customers.

Select the Type as either ‘None’ or ‘Parameter’ based. After defining the Action you want to perform, you are presented with the option to ‘Enable Thread Pool’. Setting this to ‘TRUE’ allows you to set the ‘Pool Size’, the ‘Max Pool Size’ and the ‘Max Queue Size’.

Configuration Options:

Pool Size – This specifies the number of threads to keep in the pool, even if they’re idle. The thread pool will always contain at least the number of threads specified. The default Pool Size is 10, which is applied if the field is left empty.
Max Pool Size – This specifies the maximum number of threads to keep in the pool. The thread pool can grow up to at most the number of threads specified but will release excess threads (above the Pool Size) if not needed. The default Max Pool Size is 20, which is applied if the field is left empty.
Max Queue Size – This specifies the maximum size of the queue that holds waiting tasks before they’re executed. Tasks queued beyond the set size will be rejected. The default Max Queue Size is 1000 Tasks, which is applied if the field is left empty.
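To see how the three settings interact, the Python sketch below models what happens to a burst of tasks that all arrive before any of them completes. It is a simplified model under stated assumptions, not the GLU.Engine implementation: the class ThreadPoolThrottle and its method admit_burst are hypothetical names, Max Queue Size is taken to count waiting tasks only (as in the Configuration Options above), and the pool is assumed to grow to Max Pool Size before tasks are queued, as in the passport office analogy. A real thread pool may queue tasks before it grows, which changes the split between running and queued tasks but not the number rejected.

```python
from dataclasses import dataclass


@dataclass
class ThreadPoolThrottle:
    """Illustrative model only; defaults match those documented above."""

    pool_size: int = 10         # threads kept alive even when idle
    max_pool_size: int = 20     # upper bound on threads (clerks at desks)
    max_queue_size: int = 1000  # upper bound on waiting tasks

    def admit_burst(self, tasks: int) -> dict:
        """Split a burst of simultaneous tasks into running/queued/rejected."""
        # Assumption: threads are added up to Max Pool Size before queueing.
        running = min(tasks, self.max_pool_size)
        # Tasks that find every thread busy wait in the queue, if there is room.
        queued = min(tasks - running, self.max_queue_size)
        # Anything beyond threads plus queue capacity is rejected.
        rejected = tasks - running - queued
        return {"running": running, "queued": queued, "rejected": rejected}


# Default settings: 20 threads plus 1000 queue slots absorb 1020 tasks.
print(ThreadPoolThrottle().admit_burst(1025))
# {'running': 20, 'queued': 1000, 'rejected': 5}

# Max Queue Size of zero: once all threads are busy, tasks are turned away.
print(ThreadPoolThrottle(max_queue_size=0).admit_burst(21))
# {'running': 20, 'queued': 0, 'rejected': 1}
```

Pool Size does not affect how many tasks are accepted in this model. It only determines how many threads remain available once the burst has been processed.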

Throttle Type 2 enables you to limit the number of transactions (threads) that any Inbound or Outbound Connector can process concurrently. These Throttles are set as Handlers on the Request (Inbound or Outbound).

WARNING

Note: Do not enable SEDA for any orchestration connectors when you use Throttle type 2. Otherwise, the throttle settings will not work and the throughput will be limited to 162 TPS.

See below for a view of how this should be set up.