How to Get Past the 15 Minute Delay Limit in Amazon SQS

Building a task scheduler is a common problem. Some use cases from my work include:

  • Notifications - Scheduling emails, push, and SMS notifications to send at particular times.
  • E-Commerce Price Changes - Scheduling price changes to happen at a particular date for a flash sale.
  • Retries - If a job fails, you may need to schedule a retry for later.

Typically, engineers implement task processing using a message queue, with Amazon SQS as one of the most popular options. However, if you want to delay posting a message to simulate a scheduled task, SQS only supports delays of up to 15 minutes!

This means SQS works well for tasks that need immediate or near-immediate processing, but anything beyond 15 minutes requires you to build your own service to handle the delays.
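
The 15 minute ceiling corresponds to the `DelaySeconds` field of the SQS `SendMessage` call, which accepts values from 0 to 900. As a minimal sketch, the helper below (the name `buildDelayedSendParams` and its error handling are our own illustration, not part of any SDK) builds the request parameters and fails fast when the requested delivery time is out of range:

```javascript
// SQS caps the per-message delay at 900 seconds (15 minutes).
const SQS_MAX_DELAY_SECONDS = 900;

// Builds the parameter object for an SQS SendMessage call.
// QueueUrl, MessageBody and DelaySeconds are SendMessage fields;
// the function itself is a hypothetical helper.
function buildDelayedSendParams(queueUrl, body, deliverAt, now = new Date()) {
  const delaySeconds = Math.max(
    0,
    Math.ceil((deliverAt.getTime() - now.getTime()) / 1000)
  );
  if (delaySeconds > SQS_MAX_DELAY_SECONDS) {
    throw new RangeError(
      `Delay of ${delaySeconds}s exceeds the SQS limit of ${SQS_MAX_DELAY_SECONDS}s`
    );
  }
  return { QueueUrl: queueUrl, MessageBody: body, DelaySeconds: delaySeconds };
}
```

A delivery time 10 minutes out produces `DelaySeconds: 600`, while one 20 minutes out throws, which is the point where a separate scheduling service becomes necessary.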

When we researched potential solutions, we found none that worked out of the box while also supporting long delays, offering good documentation, and providing production-level SLAs.

Potential Solutions

Some alternatives we looked at, but eventually abandoned, include:

1. DynamoDB TTL

Description: This solution involves storing queue messages in a DynamoDB table with a TTL set on each entry. When an item expires, a Lambda function executes and puts the scheduled message on a queue.

Problem: According to the official documentation, DynamoDB typically deletes expired items within 48 hours of expiration, so the Lambda function can be triggered up to 48 hours late. This margin of error is too large.
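
To make the margin of error concrete, here is a minimal sketch of what such an entry looks like. DynamoDB TTL expects a Number attribute holding a Unix epoch time in seconds; the attribute name `expiresAt` and both function names are hypothetical choices for illustration:

```javascript
// Builds an item in DynamoDB's low-level attribute-value format.
// "expiresAt" is a hypothetical attribute name; the table's TTL setting
// would need to point at it.
function buildTtlItem(id, messageBody, deliverAt) {
  return {
    id: { S: id },
    body: { S: messageBody },
    // TTL values are epoch seconds, stored as a Number attribute.
    expiresAt: { N: String(Math.floor(deliverAt.getTime() / 1000)) },
  };
}

// Latest time the expiry event might fire, given the 48 hour window cited above.
function latestDelivery(deliverAt, windowHours = 48) {
  return new Date(deliverAt.getTime() + windowHours * 60 * 60 * 1000);
}
```

A message scheduled for 20:13 on August 24 could, in the worst case, reach the queue at 20:13 on August 26.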

2. AWS Step Functions

Description: Step Functions are state machines with a visual workflow. Their “Wait” state lets you delay publishing a message to a queue for a set amount of time.
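
As a rough sketch of this approach, the function below builds an Amazon States Language definition that waits until a timestamp and then sends a message to SQS. The state names (`WaitUntilDue`, `SendToQueue`) and the input field names (`deliverAt`, `queueUrl`, `body`) are assumptions made for the example:

```javascript
// Builds a state machine definition: wait until a timestamp, then publish to SQS.
// State names and input field names are hypothetical.
function buildDelayStateMachine() {
  return {
    StartAt: 'WaitUntilDue',
    States: {
      WaitUntilDue: {
        Type: 'Wait',
        // ISO 8601 timestamp read from the execution input.
        TimestampPath: '$.deliverAt',
        Next: 'SendToQueue',
      },
      SendToQueue: {
        Type: 'Task',
        // Step Functions service integration for SQS SendMessage.
        Resource: 'arn:aws:states:::sqs:sendMessage',
        Parameters: {
          'QueueUrl.$': '$.queueUrl',
          'MessageBody.$': '$.body',
        },
        End: true,
      },
    },
  };
}
```

Each scheduled task would then be one execution of this state machine, started with its own `deliverAt`, `queueUrl`, and `body` as input.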

Problems:

  • Scaling limits. Step Functions have a limit of 1 million waiting tasks.
  • Cost. A simple state machine for a scheduled task requires 3 state transitions, so a million scheduled tasks would cost $75 with Step Functions. That excludes the cost of Lambda and peripheral services like CloudWatch Logs.
  • Testing. Each development cycle takes about 30 seconds to package, upload, and deploy a new stack, followed by a number of clicks in the UI to find and debug your latest execution. Many of the issues you face also come from the state machine’s configuration settings, which can’t be tested locally.
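
The $75 figure above works out to $0.000025 per state transition ($75 divided by 3 million transitions). A small sketch of the arithmetic, with the price held in millionths of a dollar to avoid floating point drift (the function name is ours, and actual pricing may vary by region and over time):

```javascript
// Price per state transition implied by the figures above, in millionths of a dollar.
const PRICE_PER_TRANSITION_MICRO_USD = 25;

// Estimated Step Functions cost in USD for a number of scheduled tasks.
function stepFunctionsCostUsd(taskCount, transitionsPerTask = 3) {
  return (taskCount * transitionsPerTask * PRICE_PER_TRANSITION_MICRO_USD) / 1000000;
}

console.log(stepFunctionsCostUsd(1000000)); // 75
```

At 10 million scheduled tasks a month, the same calculation gives $750 before Lambda and logging costs.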

3. Cron Polling

Description: Create a new cron job for every task you want to delay.
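
As an illustration of what "one cron job per task" means in practice, the hypothetical helper below turns a delivery time into a standard five-field cron expression (minute, hour, day of month, month, day of week), interpreted in UTC:

```javascript
// Converts a delivery time into a five-field cron expression, using UTC fields.
function cronExpressionFor(deliverAt) {
  return [
    deliverAt.getUTCMinutes(),
    deliverAt.getUTCHours(),
    deliverAt.getUTCDate(),
    deliverAt.getUTCMonth() + 1, // cron months are 1-12, JavaScript months are 0-11
    '*',
  ].join(' ');
}

console.log(cronExpressionFor(new Date('2020-08-24T20:13:00Z'))); // 13 20 24 8 *
```

Standard cron has no year field, so an entry like this would fire again every year unless something removes it after the first run.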

Problems:

  • Scalability: Cron works well if you only need to automate a few scripts. But if you’re dealing with millions of requests, all with different scheduled times, you can’t spin up millions of cron jobs.
  • Integration: Most tasks take the form of messages on a queue like Amazon SQS or Kafka. Cron doesn’t integrate with message queues. Furthermore, cron jobs aren’t really a service and don’t fit well within a microservice architecture.

4. Open Source

Description: There are a number of open source task schedulers, such as Celery, Quartz, and BigBen.

Problems:

  • Difficult integration. These solutions require you to spin up your own servers to run the open source software. We wanted something that works out of the box: no cryptic errors, no servers to manage.
  • Lack of support. Documentation can be spotty and outdated. If you get anything wrong in your configuration settings, there may be little to no feedback about what the problem is. When we posted an issue on GitHub about an error we had spent weeks debugging, no one could help.

Introducing Scheduler API

Dissatisfied with the alternatives above, we set out to build the cloud task scheduler we wished we had: Scheduler API. Scheduler API is a message queue scheduler. It consists of four API calls:

  • schedule() — schedules the queue message (task)
  • cancel() — cancels a previously scheduled queue message
  • update() — changes the scheduled time of a previously scheduled queue message
  • status() — checks the status of a scheduled message
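
To make the semantics of the four operations concrete, here is a small in-memory stand-in. It is not the SDK: the method names mirror the list above, while the class name, status strings, return shapes, and the `deliverDue` helper are all hypothetical.

```javascript
// In-memory illustration of schedule/cancel/update/status. Not the real SDK.
class InMemoryScheduler {
  constructor() {
    this.tasks = new Map();
    this.nextId = 1;
  }
  // Registers a message for delivery at `when` and returns its id.
  schedule(when, body) {
    const id = String(this.nextId++);
    this.tasks.set(id, { when, body, status: 'scheduled' });
    return id;
  }
  cancel(id) {
    this.get(id).status = 'cancelled';
  }
  update(id, when) {
    this.get(id).when = when;
  }
  status(id) {
    return this.get(id).status;
  }
  // Returns the bodies that are due at `now` and marks them delivered.
  deliverDue(now) {
    const due = [];
    for (const task of this.tasks.values()) {
      if (task.status === 'scheduled' && task.when <= now) {
        task.status = 'delivered';
        due.push(task.body);
      }
    }
    return due;
  }
  get(id) {
    const task = this.tasks.get(id);
    if (!task) throw new Error(`Unknown task id: ${id}`);
    return task;
  }
}
```

The key property is that a message stays cancellable and reschedulable until its delivery time passes, which a plain delayed queue message does not offer.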

We aimed to solve a few problems:

  • Serverless - We didn’t want to manage our own brokers or do our own scheduler deployments. We wanted a clean API that just worked. Make the call, and don’t worry about the rest.
  • High Precision - Messages have to be delivered with precision. The 48 hour margin of error with DynamoDB is too much.
  • Production Level SLAs - The API should have no scalability limits. It should handle high throughput traffic with no additional work.
  • Fast integration - Getting the first task scheduled should not take weeks of endless configuration and debugging. The API should "just work" out of the box.
  • High level of support - We are engineers too. We get the frustration of integrating software with poor documentation. We aimed to make sure that developers always get the support they need: well-maintained API docs and fast response times to inquiries.

Getting Started with Scheduler API

We’ve bundled the API in an easy-to-use SDK. In this section, we will show you how to get a test call up and running by scheduling delayed tasks to Amazon SQS in Node. Using the API involves 3 steps:

  1. Create a Scheduler API account
  2. Grant permissions
  3. Import the SDK and test the call

1) Creating a Scheduler API Account

First you have to create a Scheduler API account to get API keys. You can do this by signing up on the main website: www.schedulerapi.com and clicking “Sign Up”.

Or you can go straight to account creation at app.schedulerapi.com.

Once your account is created, the admin console will have a button to create your first API key. Click the button to create the API key, as you will need this in the test call later.

2) Granting Permissions

The next step is to create an IAM role that represents Scheduler API and has “write” permissions to the queue you want to schedule messages to. Without it, the API can’t publish your scheduled messages. We walk you through this in our blog post here.
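
For orientation, the permissions side of that role can be as narrow as `sqs:SendMessage` on a single queue. The sketch below derives the queue ARN from a standard queue URL of the form `https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>` and builds a matching policy document. The function names are our own, and the trust policy (who may assume the role) is not shown because it depends on details covered in the walkthrough:

```javascript
// Converts a standard SQS queue URL into the queue's ARN.
function queueArnFromUrl(queueUrl) {
  const url = new URL(queueUrl);
  const hostMatch = url.hostname.match(/^sqs\.([a-z0-9-]+)\.amazonaws\.com$/);
  const [accountId, queueName] = url.pathname.split('/').filter(Boolean);
  if (!hostMatch || !accountId || !queueName) {
    throw new Error(`Unrecognized SQS queue URL: ${queueUrl}`);
  }
  return `arn:aws:sqs:${hostMatch[1]}:${accountId}:${queueName}`;
}

// Builds an IAM policy document that only allows sending to that one queue.
function buildSendOnlyPolicy(queueUrl) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: 'sqs:SendMessage',
        Resource: queueArnFromUrl(queueUrl),
      },
    ],
  };
}
```

Calling `JSON.stringify(buildSendOnlyPolicy(url), null, 2)` yields a document you can paste into the IAM console when attaching permissions to the role.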

3) Importing the Scheduler SDK

Now that we’ve gotten the configuration finished, we can get coding!

The recommended way to install the Scheduler API SDK is through npm or Yarn.

npm:

    npm install schedulerapi-js

yarn:

    yarn add schedulerapi-js

At this point, you can make a call by adding this code where you want the task scheduled:

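    // Scheduler is exported by the schedulerapi-js package; see the package README for the import.
    // Run this inside an async function so that await is allowed.
    const s = new Scheduler({ key: SCHEDULER_API_KEY });
    const results = await s.scheduleSqs({
      when: new Date('2020-08-24 20:13:00'),
      url: YOUR_SQS_QUEUE_URL,
      body: 'THE_BODY_OF_YOUR_SQS_MESSAGE',
    });
    console.log(results);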

A sample response looks like this:

    {
      "id": "cLzxqmLKAEc2Tf2YzKRZW",
      "when": "2020-08-24 20:13:00",
      "now": "2020-08-24 20:11:35",
      "user": "CkM2xwzjvxjGhWeiMFWy9s"
    }
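
The `when` and `now` fields in the sample response make it easy to sanity-check a test call. The sketch below computes how many seconds remain until delivery. It assumes both timestamps use the `YYYY-MM-DD HH:mm:ss` format shown in the sample and parses them as UTC; the time zone is an assumption, but it cancels out in the difference. Both function names are hypothetical:

```javascript
// Parses "YYYY-MM-DD HH:mm:ss" as UTC (the time zone is an assumption).
function parseTimestamp(value) {
  const date = new Date(value.replace(' ', 'T') + 'Z');
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid timestamp: ${value}`);
  }
  return date;
}

// Seconds between the response's "now" and "when" fields.
function secondsUntilDelivery(response) {
  return (parseTimestamp(response.when) - parseTimestamp(response.now)) / 1000;
}

console.log(
  secondsUntilDelivery({ when: '2020-08-24 20:13:00', now: '2020-08-24 20:11:35' })
); // 85
```

For the sample values, the message is due 85 seconds after the call was made.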

The calls to update(), cancel(), and status() are similar and documented in the API docs below.

Documentation

You can pull in and test a full working example from this GitHub repo: https://github.com/schedulerapi/schedulerapi-cra-typescript-example

You can also see the full API docs and the NPM and PHP packages here:

NPM package: https://www.npmjs.com/package/schedulerapi-js

PHP SDK: https://packagist.org/packages/schedulerapi/schedulerapi-php

API Documentation: https://apidocs.schedulerapi.com/#schedulerapi

What’s next?

Our next areas of focus involve:

  • Building Scheduler API SDKs for languages beyond JavaScript and PHP.
  • Supporting other message queues beyond Amazon SQS (Kafka, ActiveMQ, RabbitMQ, and others).

Contact

If you need Scheduler API for any other use cases, we’d love to hear about it! 

Contact us at info@schedulerapi.com for questions and requests.

Follow us on Twitter: https://twitter.com/SchedulerAPI

We guarantee a response to all inquiries within 24 hours. Happy hacking!