Aki Rautio

Generate a PDF in AWS Lambda with NodeJS and Puppeteer

July 22, 2019

Recently I needed to solve a problem that involved generating a PDF file based on database content. Since these PDFs are not generated very often, it doesn't make sense to keep a service running 24/7. Luckily, both Google (Cloud Functions) and AWS (Lambda) offer event-driven services that only run on request.

Originally I was planning to use Python and ReportLab for this project, but the connection to a PostgreSQL database ended up being too complex to configure. With NodeJS I had already done a small project with a database connection, so I knew it would work.

For NodeJS I still needed a package to generate the PDF, and I found the following options:

I ended up choosing Puppeteer for this project. It's a bit overkill for the current use case, but at the same time it is more future-proof because the layout is built on plain HTML and CSS.

To make my life easier, I'm using the serverless package to handle deployment to AWS Lambda and chrome-aws-lambda to help with deploying Puppeteer to AWS Lambda. The full list of required dependencies is the following:

Aside from the main requirements, I'm using knex, pg, and pg-hstore to handle the database connection and pug as a template engine. For local testing I'm using serverless-offline, and to add binary support to the Lambda deployment I'm using serverless-apigw-binary.
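To give a rough idea of how knex and pug could fit together, here is a minimal sketch. It is not taken from the project: the `reports` table, its `name` and `total` columns, the `DATABASE_URL` environment variable and the inline template are all hypothetical and only there for illustration.

```javascript
// Sketch only: table, columns, env variable and template are hypothetical.

// Inline pug template; a real project would more likely keep this in a .pug file.
const REPORT_TEMPLATE = [
  'html',
  '  body',
  '    h1= title',
  '    ul',
  '      each row in rows',
  '        li #{row.name}: #{row.total}',
].join('\n');

// Create a knex instance for PostgreSQL. The require is deferred so this
// module can be loaded without knex or pg being installed.
function createDb(connectionString = process.env.DATABASE_URL) {
  const knex = require('knex');
  return knex({ client: 'pg', connection: connectionString });
}

// Fetch the rows for the report. `db` is a knex instance (or a test double).
async function fetchReportRows(db) {
  return db('reports').select('name', 'total');
}

// Combine the template and the data into an HTML string.
// `render` defaults to pug.render, loaded lazily for the same reason as knex.
function renderReportHtml(
  rows,
  render = (source, locals) => require('pug').render(source, locals)
) {
  return render(REPORT_TEMPLATE, { title: 'Report', rows });
}
```

Passing `db` and `render` in as arguments keeps both functions easy to try out locally without a real database.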

## Creating a Lambda function

The process of creating a PDF goes as follows:

  1. Fetch the data which we will use to create the report (in my case from the database with knex).
  2. Create an HTML template which will be combined with the data (I'm using pug here).
  3. Load Puppeteer and open the HTML file with Puppeteer.
  4. Generate a PDF page with Puppeteer.
  5. Return the PDF as a base64 string.
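Steps 3 to 5 could be sketched roughly as follows. This is an illustration under a few assumptions rather than the exact code of the project: the function names (`launchLambdaBrowser`, `htmlToPdf`, `createPdfHandler`) are made up, the HTML is handed to Puppeteer as a string with `page.setContent` instead of being opened from a file, and the A4 format is an arbitrary choice. The `getHtml` argument stands in for steps 1 and 2.

```javascript
// Sketch only: function names and PDF options are illustrative.

// Launch headless Chrome through chrome-aws-lambda. The require is deferred
// so this file can be loaded without the Chromium binary being present.
async function launchLambdaBrowser() {
  const chromium = require('chrome-aws-lambda');
  return chromium.puppeteer.launch({
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    executablePath: await chromium.executablePath,
    headless: chromium.headless,
  });
}

// Steps 3 and 4: load the HTML into a page and print it to a PDF buffer.
async function htmlToPdf(html, launchBrowser = launchLambdaBrowser) {
  const browser = await launchBrowser();
  try {
    const page = await browser.newPage();
    await page.setContent(html);
    return await page.pdf({ format: 'A4', printBackground: true });
  } finally {
    // Always close the browser, even if rendering fails.
    await browser.close();
  }
}

// Step 5: wrap everything in a Lambda handler that returns base64.
function createPdfHandler(getHtml, launchBrowser = launchLambdaBrowser) {
  return async (event) => {
    const html = await getHtml(event);
    const pdf = await htmlToPdf(html, launchBrowser);
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'application/pdf' },
      body: Buffer.from(pdf).toString('base64'),
      isBase64Encoded: true,
    };
  };
}
```

A handler could then be exported with something like `module.exports.generatePdf = createPdfHandler(async () => '<h1>Hello</h1>')`, where the export name is again only an example.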

## Deployment to AWS Lambda

As mentioned earlier, we are using Serverless for deployment so that the configuration stays light.

The key point here is that we enable `*/*` for apigwBinary so that the PDF goes through in the correct format.
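As a rough sketch, a Serverless configuration along these lines could look like the following. It is written as a `serverless.js` file to stay in JavaScript, which may differ from the format used in the project. The service name, function name, handler path, HTTP path and timeout are assumptions; only the two plugins, the 1024 MB memory size and the `*/*` binary type come from this post. The runtime is left out on purpose, so pick one that your chrome-aws-lambda version supports.

```javascript
// serverless.js (sketch only: names, paths and timeout are hypothetical)
const serverlessConfig = {
  service: 'pdf-generator', // hypothetical service name
  provider: {
    name: 'aws',
    memorySize: 1024, // the memory size used for the timing in this post
    timeout: 30, // assumption: leave headless Chrome some room to start
  },
  plugins: ['serverless-offline', 'serverless-apigw-binary'],
  custom: {
    apigwBinary: {
      // Let API Gateway treat every content type as binary so that the
      // base64-encoded PDF is decoded before it reaches the client.
      types: ['*/*'],
    },
  },
  functions: {
    generatePdf: {
      handler: 'handler.generatePdf', // hypothetical file and export name
      events: [{ http: { path: 'pdf', method: 'get' } }],
    },
  },
};

module.exports = serverlessConfig;
```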

And with that we have everything needed to generate a PDF in AWS Lambda. In my experience, generating the PDF with 1024 MB of memory took around 4000 ms, which would put the total price close to 1 euro per 20,000 PDF generations after the free tier.
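The estimate can be reproduced with a small back-of-the-envelope calculation. The prices below are assumptions: they are the commonly published AWS Lambda list prices in USD (per GB-second and per request), they vary by region and over time, and no currency conversion or free tier is applied.

```javascript
// Assumed AWS Lambda list prices in USD; check the current pricing page.
const PRICE_PER_GB_SECOND = 0.0000166667;
const PRICE_PER_REQUEST = 0.2 / 1e6; // 0.20 USD per million requests

// Cost of a number of invocations, ignoring the free tier.
function lambdaCostUsd({ memoryMb, durationMs, invocations }) {
  const gbSeconds = (memoryMb / 1024) * (durationMs / 1000) * invocations;
  return gbSeconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST;
}

const cost = lambdaCostUsd({ memoryMb: 1024, durationMs: 4000, invocations: 20000 });
console.log(cost.toFixed(2)); // prints "1.34"
```

Under these assumptions 20,000 generations cost about 1.34 USD, which is in the same ballpark as the figure above.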

If you want to try it out yourself, I have created a repository on GitHub.

Comment or ask about the post on Dev.to