Using Background Processing to Speed Up Page Load Times

This article is part of a series on building a sample application — a multi-image gallery blog — for performance benchmarking and optimizations. (View the repo here.)


In a previous article, we’ve added on-demand image resizing. Images are resized on the first request and cached for later use. By doing this, we’ve added some overhead to the first load; the system has to render thumbnails on the fly and is “blocking” the first user’s page render until image rendering is done.

The optimized approach would be to render thumbnails after a gallery is created. You may be thinking, “Okay, but we’ll then block the user who is creating the gallery?” Not only would it be a bad user experience, but it also isn’t a scalable solution. The user would get confused about long loading times or, even worse, encounter timeouts and/or errors if images are too heavy to be processed. The best solution is to move these heavy tasks into the background.

Background Jobs

Background jobs are the best way of doing any heavy processing. We can immediately notify our user that we’ve received their request and scheduled it for processing. The same way as YouTube does with uploaded videos: they aren’t accessible after the upload. The user needs to wait until the video is processed completely to preview or share it.

Processing or generating files, sending emails or any other non-critical tasks should be done in the background.

How Does Background Processing Work?

There are two key components in the background processing approach: job queue and worker(s). The application creates jobs that should be handled while workers are waiting and taking from the queue one job at a time.

Background jobs

You can create multiple worker instances (processes) to speed up processing, chop a big job up into smaller chunks and process them simultaneously. It’s up to you how you want to organize and manage background processing, but note that parallel processing isn’t a trivial task: you should take care of potential race conditions and handle failed tasks gracefully.

Our tech stack

We’re using the Beanstalkd job queue to store jobs, the Symfony Console component to implement workers as console commands and Supervisor to take care of worker processes.

If you’re using Homestead Improved, Beanstalkd and Supervisor are already installed so you can skip the installation instructions below.

Installing Beanstalkd

Beanstalkd is

a fast work queue with a generic interface originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously.

There are many client libraries available that you can use. In our project, we’re using Pheanstalk.

To install Beanstalkd on your Ubuntu or Debian server, simply run sudo apt-get install beanstalkd. Take a look at the official download page to learn how to install Beanstalkd on other OSes.

Once installed, Beanstalkd is started as a daemon, waiting for clients to connect and create (or process) jobs:

/etc/init.d/beanstalkd
Usage: /etc/init.d/beanstalkd {start|stop|force-stop|restart|force-reload|status}

Install Pheanstalk as a dependency by running composer require pda/pheanstalk.

The queue will be used for both creating and fetching jobs, so we’ll centralize queue creation in a factory service JobQueueFactory:

<?php

namespace AppService;

use PheanstalkPheanstalk;

class JobQueueFactory
{
    private $host = 'localhost';
    private $port = '11300';

    const QUEUE_IMAGE_RESIZE = 'resize';

    public function createQueue(): Pheanstalk
    {
        return new Pheanstalk($this->host, $this->port);
    }
}

Now we can inject the factory service wherever we need to interact with Beanstalkd queues. We are defining the queue name as a constant and referring to it when putting the job into the queue or watching the queue in workers.

Installing Supervisor

According to the official page, Supervisor is a

client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.

We’ll be using it to start, restart, scale and monitor worker processes.

Install Supervisor on your Ubuntu/Debian server by running
sudo apt-get install supervisor. Once installed, Supervisor will be running in the background as a daemon. Use supervisorctl to control supervisor processes:

$ sudo supervisorctl help

default commands (type help <topic>):
=====================================
add    exit      open  reload  restart   start   tail
avail  fg        pid   remove  shutdown  status  update
clear  maintail  quit  reread  signal    stop    version

To control processes with Supervisor, we first have to write a configuration file and describe how we want our processes to be controlled. Configurations are stored in /etc/supervisor/conf.d/. A simple Supervisor configuration for resize workers would look like this:

[program:resize-worker]
process_name=%(program_name)s_%(process_num)02d
command=php PATH-TO-YOUR-APP/bin/console app:resize-image-worker
autostart=true
autorestart=true
numprocs=5
stderr_logfile = PATH-TO-YOUR-APP/var/log/resize-worker-stderr.log
stdout_logfile = PATH-TO-YOUR-APP/var/log/resize-worker-stdout.log

We’re telling Supervisor how to name spawned processes, the path to the command that should be run, to automatically start and restart the processes, how many processes we want to have and where to log output. Learn more about Supervisor configurations here.

Resizing images in the background

Once we have our infrastructure set up (i.e., Beanstalkd and Supervisor installed), we can modify our app to resize images in the background after the gallery is created. To do so, we need to:

  • update image serving logic in the ImageController
  • implement resize workers as console commands
  • create Supervisor configuration for our workers
  • update fixtures and resize images in the fixture class.

The post Using Background Processing to Speed Up Page Load Times appeared first on SitePoint.


Source: Sitepoint