This article was sponsored by Monitis. Thank you for supporting the sponsors who make SitePoint possible.
For better or for worse, our jobs as developers don’t end with the last line of code, the final commit or the press of the “deploy” button.
Even the best-engineered web application isn’t bulletproof, and even the most expensive hosting environments aren’t one hundred percent reliable. Ultimately, there’s always something that can go wrong.
We can plan for failure, put processes in place for when a problem arises and even prepare contingencies for a genuine disaster. In between, though, we can monitor.
Monitoring allows us to be reactive, taking action in the event of a problem, as well as proactive, taking preventative action before an issue arises.
In this article we’re going to take a look at monitoring, in the context of a website or a web application. Along the way we’ll be taking a detailed look at Monitis, an all-in-one monitoring platform that is one of the leaders in its field, and how it can help make sure that once you’ve launched your app, it remains running and keeps performing.
What can go wrong?
To know how and what we should monitor, it helps to understand what could potentially go wrong. The short answer is probably “well, a lot”, which makes it a really tough question to answer definitively. At the same time, though, there is a range of things that we can anticipate going wrong.
Broadly speaking, we can divide these issues into a number of categories:
- Hitting “hard” limits: for example, a full disk, a physical memory limit or the maximum number of processes
- Network issues: for example, a site becoming unreachable, high packet loss, server connections going down or DNS failures
- Component or service failures: perhaps your database server has gone down
- Problems with third-party services: your S3 bucket is unreachable, your mail provider is experiencing issues or your CDN has gone down
- Problems with your applications: errors and exceptions, inconsistencies in your data or bugs in your code
- Out-of-date third-party code or operating system components: in particular, missing security patches or service packs
- Human error, which, alas, does still happen: forgetting to renew an SSL certificate, for example
Once you have an idea of what can go wrong, you start to get a feel for what to monitor.
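To make a couple of these categories concrete, here is a minimal Python sketch of two checks taken from the list above: how full a disk is, and how many days remain before an SSL certificate expires. It uses only the standard library. The function names, and any thresholds you compare the results against, are our own illustrative assumptions rather than anything prescribed by a particular monitoring tool.

```python
import shutil
import socket
import ssl
import time


def disk_usage_percent(path="/"):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def days_until_expiry(not_after, now=None):
    """Return the days left before a certificate's notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT". A negative result means
    the certificate has already expired.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400.0


def fetch_not_after(hostname, port=443, timeout=5.0):
    """Connect to `hostname` over TLS and return its certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job might call `days_until_expiry(fetch_not_after("example.com"))` and raise an alert when the result drops below, say, 14 days, or when `disk_usage_percent("/")` climbs above 90. Both numbers are arbitrary examples; sensible thresholds depend on your own environment.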