Running cron jobs on AWS Lambda with Scheduled Events

A look into the minimal set of AWS resources to get scheduled Lambda functions running and how to set things up with Node.js and Serverless.

Recently I wanted to create a cron job for a side project but couldn’t, because I didn’t have a server I could use or a computer running at home 24/7. I decided to finally dive into AWS Lambda and use a scheduled Lambda function instead. Although I had read a lot about Lambda and cloud functions in general, I had never gotten to work with them. I had heard of many tools and frameworks that make it easy to set up and deploy cloud functions, like Serverless Framework, AWS SAM, Apex or Universal Now. But instead of learning an abstraction, I wanted to get a deeper understanding of AWS first. In this post, I’ll document what I learned and give you an overview of all the AWS resources you’d need to set up such a Lambda function manually. We’ll go through the basic concepts first and then look into how to get things running with one of those abstractions, the Serverless Framework. If you are interested in a comparison of different tooling in this area, let me know, and I’ll put it on my to-do list for upcoming blog posts.

The Basics: Getting an architectural overview

To get this deeper understanding of AWS, I decided to create a CloudFormation template and just see where that would get me. To be honest, at first, I was overwhelmed by the number of services I would need to configure and the amount of documentation I had to go through. At times I thought “all I want to do is deploy a Lambda function, why do I need to care and know about S3 bucket deletion policies now?”. If you are a product engineer, maybe you shouldn’t need to worry about S3 deletion policies. And there are solutions that abstract things like that away. But if you want to know what’s behind the abstraction, with CloudFormation you can get into a lot of details about your infrastructure. It can be a bit tiring but also a good learning exercise. Once you work with it, you start to understand the need for simpler solutions. Nevertheless, let’s look into the minimal set of resources for our infrastructure.

S3 Bucket

Every Lambda function needs to be prepared as a “deployment package”. The deployment package is a .zip file consisting of code and any dependencies. That .zip file can be uploaded via the web console or located in an S3 bucket. If you’re using CloudFormation, you’d need to upload your deployment package to an S3 Bucket and later tell Lambda where to find it.

IAM Role

As with almost everything on AWS, we need to manage permissions for our Lambda function with IAM. At the very least it should be able to write logs, so it needs access to CloudWatch Logs. The AWS Lambda Developer Guide gives a nice introduction to this topic in the section AWS Lambda Permissions Model. In simpler terms, what we want to tell our function is “you are allowed to access CloudWatch Logs, and only to write logs”.
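To make that a bit more concrete, here is a rough sketch of the two policy documents such a role is usually built from, written as plain JavaScript objects. The helper name buildLoggingRole is made up for this post, and the wildcard log resource is an assumption that you would probably want to narrow down in a real setup.

```javascript
// Hypothetical helper (not part of any AWS SDK): returns the two IAM
// policy documents a logging-only Lambda role is typically built from.
function buildLoggingRole() {
  return {
    // Trust policy: allows the Lambda service to assume this role.
    assumeRolePolicy: {
      Version: "2012-10-17",
      Statement: [
        {
          Effect: "Allow",
          Principal: { Service: "lambda.amazonaws.com" },
          Action: "sts:AssumeRole",
        },
      ],
    },
    // Permissions policy: write-only access to CloudWatch Logs.
    // Assumption: a wildcard resource, kept broad for brevity.
    permissionsPolicy: {
      Version: "2012-10-17",
      Statement: [
        {
          Effect: "Allow",
          Action: [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
          ],
          Resource: "arn:aws:logs:*:*:*",
        },
      ],
    },
  };
}

console.log(JSON.stringify(buildLoggingRole(), null, 2));
```

Note that there is no read action in the list, which is what “only to write logs” boils down to.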

Lambda Function

Of course, we also need to create the actual Lambda function. In addition to the deployment package and the IAM Role, a Lambda function needs various other settings, such as the function name, the path to a handler function, the runtime to use, how much memory the function needs, etc. What we want to tell AWS is “take this source code, this IAM Role, a couple of configs and give me a Lambda function that I can invoke”.
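As a sketch of how those pieces fit together, the object below collects the settings mentioned above using the parameter names of Lambda’s CreateFunction API. The helper buildFunctionConfig, the bucket, key and role ARN are all hypothetical placeholders, and the memory and timeout values are only illustrative.

```javascript
// Hypothetical helper: gathers the settings a Lambda function is created from.
// The keys follow the parameter names of Lambda's CreateFunction API.
function buildFunctionConfig({ bucket, key, roleArn }) {
  return {
    FunctionName: "hello",
    Runtime: "nodejs6.10",
    Handler: "handler.hello", // <file name>.<exported function>
    Role: roleArn, // the IAM Role the function runs with
    Code: { S3Bucket: bucket, S3Key: key }, // where the deployment package lives
    MemorySize: 1024, // in MB, illustrative value
    Timeout: 6, // in seconds, illustrative value
  };
}

// All three values below are made-up placeholders.
const config = buildFunctionConfig({
  bucket: "my-deployment-bucket",
  key: "my-service.zip",
  roleArn: "arn:aws:iam::123456789012:role/my-lambda-role",
});
console.log(config);
```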

CloudWatch Events Rule

If we want to trigger our Lambda function periodically, we’ll need CloudWatch and a CloudWatch Events Rule. CloudWatch Events supports cron-like expressions which we can use to define how often an event is created. We would also need to make sure that we add our Lambda function as a target for those events. Again, in simpler terms, what we say is “create a new event every x and invoke this Lambda function with it”.
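“Cron-like” is worth a closer look, because the expressions differ from classic cron: they have six fields (minutes, hours, day of month, month, day of week, year), and the two day fields cannot both be set, so one of them has to be a question mark. The small parser below is only a sketch to illustrate that layout. parseAwsCron is a name I made up, and it checks the shape of an expression, not every rule AWS enforces.

```javascript
// Field order of a CloudWatch Events cron expression.
const FIELD_NAMES = ["minutes", "hours", "dayOfMonth", "month", "dayOfWeek", "year"];

// Hypothetical helper: splits "cron(...)" into named fields and applies
// two basic shape checks. It is not a full validator.
function parseAwsCron(expression) {
  const match = /^cron\((.+)\)$/.exec(expression.trim());
  if (!match) {
    throw new Error("Expected an expression of the form cron(...)");
  }
  const parts = match[1].trim().split(/\s+/);
  if (parts.length !== FIELD_NAMES.length) {
    throw new Error(`Expected 6 fields, got ${parts.length}`);
  }
  const fields = {};
  FIELD_NAMES.forEach((name, index) => {
    fields[name] = parts[index];
  });
  // Day of month and day of week cannot both carry a value or "*".
  if (fields.dayOfMonth !== "?" && fields.dayOfWeek !== "?") {
    throw new Error('Either day of month or day of week must be "?"');
  }
  return fields;
}

console.log(parseAwsCron("cron(*/5 * * * ? *)"));
```

For the expression above, the minutes field is */5 and the day of week field is the question mark, which reads as “every 5 minutes, on any day”.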

Lambda Permission

Unfortunately, creating the events and targeting the Lambda function isn’t enough. We still need to make sure that the events are allowed to trigger (invoke) our Lambda function. Anything that wants to invoke a Lambda function needs explicit permission to do so. We’re telling our function “allow events from CloudWatch Events to invoke you”.
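In API terms, that sentence maps onto a handful of fields. The sketch below builds them as a plain object, using the parameter names of Lambda’s AddPermission API. The helper buildInvokePermission, the statement id and the rule ARN are hypothetical and only there for illustration.

```javascript
// Hypothetical helper: describes "allow this CloudWatch Events rule to
// invoke this function" using the parameter names of Lambda's AddPermission API.
function buildInvokePermission(functionName, ruleArn) {
  return {
    FunctionName: functionName,
    StatementId: `${functionName}-allow-scheduled-events`, // made-up id
    Action: "lambda:InvokeFunction",
    Principal: "events.amazonaws.com", // the CloudWatch Events service
    SourceArn: ruleArn, // restricts the permission to one specific rule
  };
}

// The ARN below is a made-up placeholder.
const permission = buildInvokePermission(
  "hello",
  "arn:aws:events:eu-central-1:123456789012:rule/every-five-minutes"
);
console.log(permission);
```

Restricting the permission with SourceArn means that only this one rule can invoke the function, not every rule in the account.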

Setting things up

Now that we know the theory, let’s get into practice! To keep things short and sweet, we’ll create a function that does nothing other than log Hello, world! every 5 minutes. We’ll look into how we can create such a function in Node.js and deploy it with Serverless.

To get started, follow the Quick Start guide by Serverless to install the Serverless CLI and configure your AWS credentials. After that, we can create a new Node.js service. A service in Serverless is conceptually something like a project, where you can define multiple Lambda functions, event triggers, and any other AWS resources.

serverless create --template "aws-nodejs" --path my-service  

This will create a serverless.yml config and a handler.js file where we’ll define the handler for our Lambda function. Let’s replace the contents of handler.js with the following:

module.exports.hello = (event, context, callback) => {  
  console.log("Hello, world!");
  callback(null);
};
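If you want to try the handler before deploying anything, you can call it yourself with a fake event. The snippet below is a minimal sketch of that idea: it repeats the handler so that it runs on its own, invokeLocally is a helper name I made up, and the sample event only mimics a few fields of what CloudWatch Events sends for a scheduled rule.

```javascript
// Same handler as above, repeated here so the snippet runs on its own.
const hello = (event, context, callback) => {
  console.log("Hello, world!");
  callback(null);
};

// Hypothetical helper: wraps a callback-style handler in a Promise.
function invokeLocally(handler, event) {
  return new Promise((resolve, reject) => {
    handler(event, {}, (err, result) => (err ? reject(err) : resolve(result)));
  });
}

// A trimmed-down stand-in for a scheduled event.
const sampleEvent = {
  source: "aws.events",
  "detail-type": "Scheduled Event",
  detail: {},
};

invokeLocally(hello, sampleEvent).then(() => console.log("handler finished"));
```

Our handler ignores the event, so the fake one is only there to show what the function would receive when the schedule fires.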

With our Lambda handler function ready, the only thing left is to prepare the serverless.yml file:

service: my-service  
provider:  
  name: aws
  region: eu-central-1
  runtime: nodejs6.10
functions:  
  hello:
    handler: handler.hello
    events:
    - schedule:
        rate: cron(*/5 * * * ? *)
        enabled: true

As you can see, the configs are pretty self-explanatory once you know what’s going on behind the scenes. Compare this to the equivalent of these two CloudFormation templates (#1, #2), and you’ll get a feeling of what I was talking about earlier.

What’s nice about Serverless compared to CloudFormation is that it manages permissions, so you don’t need to configure IAM roles or Lambda permissions. In addition to that, it creates the deployment package for you, uploads it to S3, and deploys everything else with one command: serverless deploy.

Once your function is deployed, you can tail the logs with serverless logs --function hello --tail and check if you see a Hello, world! logged every 5 minutes. If everything went well, you should see output similar to this:

START RequestId: 1b2cc533-a86d-11e7-a3f3-5ba627dcc6d6 Version: $LATEST  
2017-10-03 21:00:22.173 (+02:00)        1b2cc533-a86d-11e7-a3f3-5ba627dcc6d6    Hello, world!  
END RequestId: 1b2cc533-a86d-11e7-a3f3-5ba627dcc6d6  
REPORT RequestId: 1b2cc533-a86d-11e7-a3f3-5ba627dcc6d6  Duration: 2.50 ms       Billed Duration: 100 ms         Memory Size: 1024 MB    Max Memory Used: 20 MB  

That’s it! Our Lambda function is running on a schedule, just like a cron job. The difference now is that our job is running in the cloud, and most likely for free, since the Lambda free tier includes 1M free requests per month. Pretty awesome, right? If you want to shut down your Lambda function, you can disable the event rule by switching the enabled flag to false and re-running serverless deploy. Or, if you want to take down your whole stack, you can use serverless remove.
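To back up the “most likely for free” claim, here is a quick back-of-the-envelope calculation. It only looks at the request quota mentioned above and assumes a 31-day month; the function name invocationsPerMonth is just for illustration.

```javascript
const MINUTES_PER_DAY = 24 * 60;
const FREE_REQUESTS_PER_MONTH = 1000000; // the 1M requests mentioned above

// Hypothetical helper: number of scheduled invocations in one month.
function invocationsPerMonth(intervalMinutes, daysInMonth) {
  return Math.floor(MINUTES_PER_DAY / intervalMinutes) * daysInMonth;
}

const perMonth = invocationsPerMonth(5, 31);
console.log(perMonth); // 8928
console.log(perMonth < FREE_REQUESTS_PER_MONTH); // true
```

Running every 5 minutes uses less than 1% of the free requests, so there is plenty of room for more scheduled functions.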
