Custom PDF Rendering in JavaScript with Mozilla’s PDF.js

When it comes to the Web, almost every modern browser supports viewing PDF documents natively. But that native component is outside of the developer’s control. Imagine that, because of some business rule in your web app, you wanted to disable the Print button, or display only a few pages while the others require a paid membership. You can use the browser’s native PDF rendering capability via the embed tag, but since you don’t have programmatic access, you can’t control the rendering phase to suit your needs.

Luckily, there is a tool that gives you this control: PDF.js, created by Mozilla Labs, which can render PDF documents in your browser. Most importantly, you as a developer have full control over rendering the PDF document’s pages as per your requirements. Isn’t this cool? Yes, it is!

Let’s see what PDF.js actually is.

What Is PDF.js

PDF.js is a Portable Document Format (PDF) viewer built around HTML5-based technologies, which means it can be used in modern browsers without installing any third-party plugins.

PDF.js is already in use in many different places, including online file sharing services like Dropbox, CloudUp, and Jumpshare, which use it to let users view PDF documents online without relying on the browser’s native PDF rendering capability.

PDF.js is without any doubt an awesome and essential tool to have in your web app, but integrating it isn’t as straightforward as it might seem. There is little to no documentation available on how to integrate certain features, like rendering text-layers or annotations (external/internal links), or supporting password-protected files.

In this article, we will be exploring PDF.js, and looking at how we can integrate different features. Some of the topics which we will cover are:

  • Basic Integration
  • Rendering Using SVG
  • Rendering Text-Layers
  • Zooming In/Out

Basic Integration

Downloading the Necessary Files

PDF.js, as its name suggests, is a JavaScript library which can be used in the browser to render PDF documents. The first step is to fetch the JavaScript files that PDF.js needs to work properly. These are the two main files it requires:

  • pdf.js
  • pdf.worker.js

If you are a Node.js user, you can fetch these files by following the steps described in the GitHub repo. After you have run the gulp generic command, you will have the necessary files.

If, like me, you don’t feel comfortable with Node.js, there is an easier way. You can use the following URLs to download the necessary files:

The above-mentioned URLs point to Mozilla’s live demo of PDF.js. By downloading the files this way, you will always have the latest version of the library.

Web Workers and PDF.js

The two files you downloaded contain methods to fetch, parse and render a PDF document. pdf.js is the main library, which essentially has methods to fetch a PDF document from some URL. But parsing and rendering a PDF is not a simple task. In fact, depending on the nature of the PDF, the parsing and rendering phases can take long enough to block other JavaScript functions on the page.

HTML5 introduced Web Workers, which are used to run code in a thread separate from the browser’s main JavaScript thread. PDF.js relies heavily on Web Workers to provide a performance boost by moving CPU-heavy operations, like parsing and rendering, off the main thread. Running processing-intensive code in Web Workers is the default in PDF.js, but it can be turned off if necessary.
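As a rough sketch of what turning the worker off might look like: we are assuming here that the build you downloaded exposes the global PDFJS object and honors a disableWorker flag, as the 1.x releases did. The helper name configureWorker is our own and not part of PDF.js, so check the flag against your build before relying on it.

```javascript
// Hypothetical helper (not part of PDF.js): applies the worker setting to
// whatever library object is passed in.
function configureWorker(pdfjs, useWorker) {
  // Assumption: legacy 1.x builds read this flag before loading a document.
  // When it is true, parsing and rendering run on the main thread.
  pdfjs.disableWorker = !useWorker;
  return pdfjs;
}

// Only touch the real library when pdf.js has been loaded on the page.
if (typeof PDFJS !== 'undefined') {
  configureWorker(PDFJS, false); // e.g. while debugging a parsing problem
}
```

Disabling the worker is mostly useful for debugging, since it brings back the main-thread blocking that Web Workers are there to avoid.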

Promises in PDF.js

The JavaScript API of PDF.js is quite elegant and easy to use and is heavily based on Promises. Every call to the API returns a Promise, which allows asynchronous operations to be handled cleanly.
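To give a feel for this style, here is a minimal sketch of a Promise chain that loads a document and then asks for its first page. It assumes the legacy API used in this article, where getDocument() returns something that can be treated as a Promise. The helper name loadFirstPage and its parameters are ours, for illustration only.

```javascript
// Hypothetical helper: `pdfjs` is the library object (the PDFJS global in
// the browser) and `url` points at any PDF the page is allowed to fetch.
function loadFirstPage(pdfjs, url) {
  // Promise.resolve() normalizes whatever thenable getDocument() returns.
  return Promise.resolve(pdfjs.getDocument(url))
    .then(function (pdf) {
      // numPages is available once the document has been loaded.
      console.log('Pages in document: ' + pdf.numPages);
      return pdf.getPage(1); // pages are numbered from 1
    })
    .catch(function (err) {
      console.error('Could not load PDF: ' + err.message);
      throw err; // let callers handle the failure too
    });
}
```

Each step returns a Promise, so the next then() only runs once the previous operation has finished, and a single catch() covers a failure at any step.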

Hello World!

Let’s integrate a simple ‘Hello World!’ PDF document. The document which we are using in this example can be found at http://mozilla.github.io/pdf.js/examples/learning/helloworld.pdf.

Create a project under your local web-server such that it can be accessed using http://localhost/pdfjs_learning/index.html. PDF.js makes Ajax calls to fetch documents in chunks, so for those calls to work locally, the PDF.js files need to be served by a local web-server. After creating the pdfjs_learning folder on your local web-server, place the files you downloaded above (pdf.js and pdf.worker.js) in it. Then place the following code in index.html:

<!DOCTYPE html>
<html>
  <head>
    <title>PDF.js Learning</title>
  </head>
  <body>
    <script type="text/javascript" src="pdf.js"></script>
  </body>
</html>

As you can see, we’ve included a link to the main library file, pdf.js. PDF.js automatically detects whether your browser supports Web Workers, and if it does, it will attempt to load pdf.worker.js from the same location as pdf.js. If the file is in another location, you can configure it using the PDFJS.workerSrc property right after including the main library:

<script type="text/javascript" src="pdf.js"></script>
<script type="text/javascript">
    PDFJS.workerSrc = "/path/to/pdf.worker.js";
</script>

If your browser doesn’t support Web Workers, there’s no need to worry: pdf.js contains all the code necessary to parse and render PDF documents without them. Depending on your PDF documents, though, it might block your main JavaScript execution thread.

Let’s write some code to render the ‘Hello World!’ PDF document. Place the following code in a script tag, below the pdf.js tag.
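The listing itself is not reproduced in this excerpt, so what follows is only a minimal sketch of how such code could look, not the original. It assumes the legacy API (the global PDFJS object, with getViewport() taking a numeric scale) and a copy of helloworld.pdf saved next to index.html. The helper name renderFirstPage is ours.

```javascript
// Hypothetical helper: renders page 1 of the PDF at `url` into `canvas`.
// `pdfjs` is the library object (the PDFJS global when pdf.js is loaded).
function renderFirstPage(pdfjs, url, canvas) {
  return Promise.resolve(pdfjs.getDocument(url))
    .then(function (pdf) {
      return pdf.getPage(1);
    })
    .then(function (page) {
      // Assumption: legacy signature, where getViewport() takes the scale
      // as a plain number (1.0 means 100%).
      var viewport = page.getViewport(1.0);
      canvas.width = viewport.width;
      canvas.height = viewport.height;
      var task = page.render({
        canvasContext: canvas.getContext('2d'),
        viewport: viewport
      });
      // Some builds expose the result as task.promise, others as a thenable.
      return Promise.resolve(task && task.promise ? task.promise : task);
    });
}

// Run only in a page where pdf.js has been loaded.
if (typeof PDFJS !== 'undefined' && typeof document !== 'undefined') {
  var helloCanvas = document.createElement('canvas');
  document.body.appendChild(helloCanvas);
  // Assumption: helloworld.pdf was saved into the pdfjs_learning folder.
  renderFirstPage(PDFJS, 'helloworld.pdf', helloCanvas);
}
```

Sizing the canvas from the viewport before rendering keeps the page from being cropped or stretched.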


Source: Sitepoint