Probable SSM ThrottlingException: Rate exceeded with standard queues from burst-y traffic #566

Closed
@tkovis

Describe the issue
When publishing a large number of payloads concurrently, without waiting for the first publish to finish, calls can get throttled unexpectedly. This happens even when a plugin is used to change the default FIFO queues into standard queues, where throttling should not happen.

  await Promise.all(
    largeArray.map(async (payload) =>
      arc.queues.publish({
        name: "queue-name",
        payload,
      })
    )
  );

{
    "errorType": "ThrottlingException",
    "errorMessage": "Rate exceeded",
    "name": "ThrottlingException",
    "$fault": "client",
    "$metadata": {
        "httpStatusCode": 400,
        "requestId": "...",
        "attempts": 3,
        "totalRetryDelay": 101
    },
    "__type": "ThrottlingException",
    "message": "Rate exceeded",
    "stack": [
        "ThrottlingException: Rate exceeded",
        "    at throwDefaultError (/var/task/index.js:6319:24)",
        "    at /var/task/index.js:6328:9",
        "    at de_CommandError (/var/task/index.js:25195:18)",
        "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
        "    at async /var/task/index.js:756:24",
        "    at async /var/task/index.js:872:22",
        "    at async /var/task/index.js:10952:42",
        "    at async /var/task/index.js:494:26"
    ]
}

This seems to be caused by service discovery caching its result only after a successful fetch. Until that first fetch completes, every call to publish launches its own concurrent service discovery fetch, which eventually causes a throttling error.
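
To illustrate the suspected race, here is a minimal simulation. It does not use the actual library code: `fetchServiceMap` and `discover` are hypothetical stand-ins, and the pattern of writing the cache only after the fetch resolves is an assumption about how service discovery behaves.

```javascript
// Simulates a burst of concurrent callers against a cache that is only
// written after the fetch resolves (assumed pattern, not the library's code).
const simulateBurst = async (size) => {
  let fetchCount = 0;
  let cachedServices = null;

  // Hypothetical stand-in for the service discovery request.
  const fetchServiceMap = async () => {
    fetchCount += 1;
    await new Promise((resolve) => setTimeout(resolve, 10)); // simulated latency
    return { queues: { "queue-name": "placeholder" } };
  };

  const discover = async () => {
    if (cachedServices) return cachedServices;
    // Every caller that arrives before the first fetch resolves ends up here.
    const services = await fetchServiceMap();
    cachedServices = services;
    return services;
  };

  await Promise.all(Array.from({ length: size }, () => discover()));
  return fetchCount;
};

simulateBurst(100).then((count) => console.log(`discovery fetches: ${count}`));
// prints "discovery fetches: 100"
```

In this simulation every caller sees an empty cache, so the number of fetches equals the number of concurrent calls.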

Steps to reproduce
Steps to reproduce the behavior:

  1. Use a plugin to change the queue to a standard queue
  2. Try to publish a large number of payloads concurrently

Expected behavior
If this is caused by service discovery not caching the initial promise, the expected behavior is that the first call to publish triggers a fetch for service discovery, and every other call to publish waits for that fetch to resolve instead of fetching on its own.

const store = { cache: null };

const fetchOrUseCached = () => {
  if (store.cache) return store.cache;
  // immediately store the promise for next synchronous call so the same promise is used for every request
  store.cache = fetchSomething();
  return store.cache;
};

I am unsure what the actual cause is; this is just a guess. I am also unsure whether this would be the correct solution.
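
A slightly fuller sketch of the same idea, still only a guess at a fix: it also clears the cached promise when the fetch fails, because a cached rejected promise would otherwise be returned to every later caller. `createCachedLoader` is a hypothetical helper name, and the fetch function passed to it is a placeholder for whatever service discovery actually calls.

```javascript
// Wraps an async fetch function so concurrent callers share one in-flight promise.
const createCachedLoader = (fetchFn) => {
  let pending = null;
  return () => {
    if (pending) return pending;
    // Store the promise synchronously so the next call reuses it.
    pending = Promise.resolve()
      .then(() => fetchFn())
      .catch((err) => {
        pending = null; // do not cache failures; the next call retries
        throw err;
      });
    return pending;
  };
};

// Usage: 100 concurrent callers, one underlying fetch.
let calls = 0;
const load = createCachedLoader(async () => {
  calls += 1;
  return { ok: true };
});
Promise.all(Array.from({ length: 100 }, () => load())).then(() =>
  console.log(`fetches: ${calls}`)
);
// prints "fetches: 1"
```

Whether failures should be retried this way, or with a backoff, depends on how service discovery is expected to behave, which I have not verified.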

Labels: needs info (need further details / info / repro case)
