Drupal Performance Optimization:
How to Get Your Site to Perform its Best
by Dana Winslow
January 2, 2012
Drupal Is Slow, Isn't It?
This question gets brought up all the time, whether it's by developers, site owners, or bloggers. It seems as though everyone has heard that Drupal is slow. In fact, this tidbit has become so widespread that many blogs and articles that compare Drupal to other content management systems often use this as one of Drupal's biggest "cons."
The truth is, Drupal itself isn't really all that slow. At least, it's not any slower than any other content management system. It's just like any other piece of equipment or technology. Imagine, if you will, having that shiny new 4G smartphone. Then, imagine loading it up with every app, song and game available – and then running it on a 3G network. Will you be happy with your new shiny 4G phone? Probably not. Will it be all the phone's fault?
Similarly, if you don't take care of your Drupal site and you let it get all clogged up with junk, it's not going to perform at its best. All you'll be left with is a powerful web site with memory the size of a peanut and the speed of a brick wall. And that won't do anyone any good.
So, we're going to tackle these slow loading times.
Why Does Drupal Slow Down?
Chances are, when you first installed Drupal and started building your site, you may have even thought, "Well, this isn't slow at all," or, "Whoopee, look how fast this is!" Either way, if you're reading this article now, then your site has probably slowed down a lot. Page loads seem to take forever, if they load at all. You're probably frustrated and trying to figure out why your site is running so slowly.
So, how is it that your once-fast site has slowed down enough to force you into searching out guides, tutorials and answers online? While many things can contribute to a slowed-down site, there are some common issues that affect many customized Drupal setups. These common issues are the ones we'll cover today.
Drupal is a big boy, and big boys need lots of energy to continue running strong. And let's face it, a lot of the functionality and power that makes Drupal so cool can also spell out a performance nightmare if you're not prepared to feed it.
In my experience, meeting the minimum requirements for Drupal is just not enough. Drupal 7 requires PHP 5.2.5, but runs much better on PHP 5.3 or higher - which is why PHP 5.2.5 is listed as a requirement but PHP 5.3 is recommended. Drupal 7 Core requires that the PHP memory limit is at least 32MB, but if you're planning on using all the glorious ooey-gooey goodness that makes Drupal so wonderful (like, say, Views?) you'll need to set that memory limit to at least 64MB. Want media, pictures, thumbnails and more? Then it needs to go higher.
If you don't want to run into issues later on down the road, then you need to make sure that your hosting company can handle everything you're about to throw at it. And that means more than just the number of visitors and files you want to cover. If they don't have the PHP version that you need, then cut your losses and find someone who does. If you can't get them to raise your memory limit above the minimum 32MB, then get away from them and move to another hosting company.
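The thresholds above lend themselves to a quick pre-flight check when you are comparing hosts. What follows is only a minimal Python sketch: the function names are hypothetical, and the numbers are simply the ones quoted in this article (PHP 5.2.5 required, 5.3 recommended, 32 MB minimum memory, 64 MB once Views and friends are involved). It does not handle every value php.ini accepts, such as -1 for "unlimited."

```python
import re

# Thresholds quoted in this article for Drupal 7.
MIN_PHP = (5, 2, 5)
RECOMMENDED_PHP = (5, 3, 0)
MIN_MEMORY_MB = 32
COMFORTABLE_MEMORY_MB = 64  # suggested once Views and similar modules are in use


def parse_version(text):
    """Turn '5.3.10' into (5, 3, 10); missing parts count as 0."""
    parts = [int(p) for p in text.split(".")[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def parse_memory_mb(text):
    """Convert a php.ini style size such as '64M', '1G' or '65536K' to megabytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?)\s*", text.upper())
    if not match:
        raise ValueError("unrecognised size: %r" % text)
    value, unit = int(match.group(1)), match.group(2)
    # A bare number is a byte count.
    factor = {"": 1 / (1024 * 1024), "K": 1 / 1024, "M": 1, "G": 1024}[unit]
    return value * factor


def check_host(php_version, memory_limit):
    """Return a list of warnings; an empty list means no concerns were found."""
    warnings = []
    version = parse_version(php_version)
    memory = parse_memory_mb(memory_limit)
    if version < MIN_PHP:
        warnings.append("PHP is older than the 5.2.5 minimum")
    elif version < RECOMMENDED_PHP:
        warnings.append("PHP meets the minimum but is older than the recommended 5.3")
    if memory < MIN_MEMORY_MB:
        warnings.append("memory_limit is below the 32 MB minimum")
    elif memory < COMFORTABLE_MEMORY_MB:
        warnings.append("memory_limit is below 64 MB; Views-heavy sites may struggle")
    return warnings


for problem in check_host("5.2.17", "32M"):
    print(problem)
```

A host that passes with an empty list is not automatically a good host, of course; this only covers the two settings that have caused me the most trouble.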
User Types, Browsing Habits and Connections
Of course, knowing how many visitors go to your site on a regular basis will help you determine how much planning you need to do to head off crashes, long load times, and other potential problems. But understanding those visitors is a lot more important.
Do your visitors log in, or do they browse anonymously? Is your site primarily a "read-only" site or is there interaction with your visitors?
On a normal, out-of-the-box Drupal site, Drupal would need to make several database queries regardless of whether your visitors are primarily anonymous browsers or logged-in authenticated members. However, caching (once enabled) is easier with anonymous visitors. This is because with anonymous visitors, Drupal does not have to worry about personalizing anything on the site. Virtually all of the information on any given page can be saved and stored, allowing for load times to be cut significantly.
However, with authenticated users - members who have registered to use your site - there are generally a lot more settings and queries that cannot be saved through cache. Things like personal settings, private messages, comment forms, access permissions, and dashboards are all examples of personal things that may differ from one registered member to another. And because of these differences, the pages cannot be cached in the same way. Some modules that treat anonymous users as if they are authenticated users can also disable the caching.
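To make the difference concrete, here is a toy page cache written in Python. It is a sketch of the general idea only, not Drupal's actual implementation, and the class and function names are made up for illustration. Anonymous visitors share one stored copy of each page, while every request from a logged-in user is built from scratch.

```python
class PageCache:
    """Toy page cache: stores rendered pages for anonymous visitors only."""

    def __init__(self, render):
        self._render = render      # callable(path, user) -> HTML string
        self._pages = {}           # path -> cached HTML
        self.render_calls = 0      # how often the expensive path was taken

    def get(self, path, user=None):
        # Logged-in users get a freshly built page, since it may be personalised.
        if user is not None:
            self.render_calls += 1
            return self._render(path, user)
        # Anonymous visitors all see the same page, so one stored copy serves everyone.
        if path not in self._pages:
            self.render_calls += 1
            self._pages[path] = self._render(path, None)
        return self._pages[path]


def render(path, user):
    """Stand-in for the expensive work of querying the database and theming a page."""
    greeting = "Hello, %s" % user if user else "Welcome, guest"
    return "<h1>%s</h1><p>%s</p>" % (path, greeting)


cache = PageCache(render)
for _ in range(3):
    cache.get("/blog/1")            # anonymous: rendered once, then served from the cache
for _ in range(3):
    cache.get("/blog/1", "alice")   # authenticated: rendered on every request
print(cache.render_calls)
```

Six requests lead to four renders here: one for all of the anonymous traffic, and one for each authenticated request.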
Another aspect to be mindful of is what your visitors are doing when they browse around your site. How many database queries are being called? What about PHP connections? How many modules are running for each page? Are they duplicating any of the queries or functions already being called?
Using External Resources
There are a lot of resources available for you, and depending on your site, it may be very tempting to include many of them. Facebook feeds and likes, Twitter feeds, RSS feeds... some people have an Amazon.com store that they like to use. Countdowns, movies, videos and songs... the list keeps going.
These things can add a lot of flair to your site, and if done well can make your site look great. However, they can also slow down your site to a crawl. The more your site has to retrieve RSS feeds, videos, tweets, and other scripts from remote servers, the longer it is going to take your page to load.
A lot of hosting companies keep their site servers separate from their database server. This helps them to save money on hardware upgrades which, in turn, helps them save you a little bit of money as well. Unfortunately, Drupal relies heavily on contact with its database. And when the database is on a separate server, that means that every time your Drupal site has to run a query, that query has to travel through a network before the database can respond.
And, of course, the results of that query also have to travel back through the network to the site, which then allows your site to complete its loading process for that page.
By having your database on a separate server, it's essentially been turned into the largest external resource that your site depends on. With only one or two queries, you may not even notice an increase in load times. But for some of the more complicated, feature-rich sites, the lag can be detrimental.
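A little arithmetic shows why the number of queries matters so much here. The Python sketch below assumes that queries run one after another and that each one costs a fixed network round trip; the 0.5 ms figure and the query counts are hypothetical, chosen only to show the shape of the problem.

```python
def network_overhead_ms(queries_per_page, round_trip_ms):
    """Time spent purely on network round trips, assuming queries run one after another."""
    return queries_per_page * round_trip_ms


# Hypothetical figures: 0.5 ms per round trip to a database on another machine.
for label, queries in [("simple page", 2), ("feature-rich page", 300)]:
    print("%s: %d queries -> %.1f ms of network wait"
          % (label, queries, network_overhead_ms(queries, 0.5)))
```

With these assumed numbers, the simple page waits 1 ms on the network while the feature-rich page waits 150 ms, before the database has done any real work at all.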
A lot of sites mishandle their images. Most people understand that larger images take longer to load. So a lot of sites use thumbnails to display several images, and allow their larger images to load on their own separate page. While this is good practice, it can also slow down your site if it's not done correctly.
One tactic that has become increasingly popular is to upload an image once at its largest size, and then use the site to resize and display that same image as a thumbnail. This isn't necessarily a bad thing, especially if there's only one image on the page (like in a simple blog entry). But problems arise if your server is generating these thumbnails on every request instead of generating them only once and then storing them for later use ("caching" them). When you have multiple images displayed in this way and don't have caching properly configured, it can cause noticeable performance issues.
Know Your Cache
Drupal has a lot of performance-boosting projects to help improve your site's effectiveness. The built-in caching tools alone can speed up your site tremendously. But there are also a lot of things that can affect how well your site will cache everything.
For example, the Captcha module. Captcha is great for stopping spam, among other annoyances such as fake accounts. But Captcha does not always play well with some of the caching modules available for Drupal. In fact, the module disables page caching on all pages where a captcha puzzle appears!
There are other features that may disable caching. For example, statistics and some RSS feed displays will disable the caching of the pages where they are enabled. Knowing where to cache and what may interfere with your caching can make the difference between a load time of 3000 ms and 50 ms.
The Pareto Principle
The truth is that there are probably several reasons why your particular Drupal site might seem to be running a little slowly. And if you go searching around for all the reasons that your site might be slowing down, you'll likely find half a dozen different causes.
Oh, wait - look at that. I just gave you a half a dozen possible causes.
But I'm sure you get my point. These are really just the most common causes. There could be any number of other things going wrong that might be causing your site to perform poorly. The bad news is, you may not be able to find and address every single one.
The good news is, a lot of Drupal's performance issues can be fixed by applying the Pareto Principle, otherwise known as the 80/20 Rule.
What is the 80/20 Rule?
The 80/20 Rule, also called the Law of the Vital Few, is actually a term that is used most in business. According to the Pareto Principle, in business 80% of sales come from 20% of the clients. When you relate this to Drupal performance terms, this principle dictates that when you have a problem, 80% of the symptoms come from 20% of the causes.
Therefore, by the same token, addressing 20% of the causes will remedy or fix 80% of the symptoms. So, you may not even need to know the exact cause of your slow loading times or why your site can't handle a sudden increase in visitors. With Drupal optimization, addressing the few key aspects of your site's performance will alleviate most - if not all - of the problems you're seeing.
Consider the list that I gave earlier the vital few. If you're experiencing performance problems in Drupal, there is a chance that they are not the cause. Perhaps they are merely contributing factors to another, well-hidden cause. Or perhaps your cause is something entirely different that should be discussed with your hosting company. Regardless of whether your exact cause is just one or a combination of several of the issues listed above, addressing these vital areas is guaranteed to help alleviate the symptoms of a slow loading, poorly performing site.
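If you have rough timings for each suspected cause, picking out the vital few is easy to automate. The Python sketch below is only an illustration: the function name is made up, and the measurements are invented numbers, not results from a real site. It sorts the causes from most to least expensive and keeps adding them until they account for 80% of the total.

```python
def vital_few(time_by_cause, target=0.8):
    """Return the fewest causes, largest first, that together cover `target` of the total."""
    total = sum(time_by_cause.values())
    chosen, covered = [], 0.0
    ranked = sorted(time_by_cause.items(), key=lambda item: item[1], reverse=True)
    for cause, cost in ranked:
        if covered >= target * total:
            break
        chosen.append(cause)
        covered += cost
    return chosen


# Invented measurements, in milliseconds of page load time attributed to each cause.
measurements = {
    "uncached pages": 1800,
    "remote database": 700,
    "external feeds": 300,
    "unoptimised images": 150,
    "toolbar module": 50,
}
print(vital_few(measurements))
```

In this made-up example, two of the five causes account for 2500 of the 3000 ms, so those two are where the effort should go first.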
1. Addressing Your Server Configuration
As I said earlier, Drupal is a big boy. And there are several hosting companies that run into problems trying to give Drupal the resources it needs.
I have had issues with more than a few hosting companies in which they would not allow the PHP memory limit to be raised above 64 MB. I've run into other problems with hosts as well - such as only allowing one cron command in a day, having the wrong version of PHP available, and restricting the ability to add or edit a php.ini file. So, before you settle on a hosting company, make sure that they can handle you and your site.
Recommended vs. Required
Drupal has a lot of requirements. But requirements are meant to give you just the starting tools that you need to get Drupal running out of the box. The list of requirements is not necessarily what you'll need to get your site running.
In truth, there is a very high chance that your site will need more than the listed requirements to run and perform at its best.
Disk space is a variable requirement, meaning that the exact amount you need will vary greatly depending on exactly what you're trying to do with your site. The 15 MB listed as required is only for a base installation of Drupal, and does not include database content, pictures or other media, backups or other files that may take up disk space. Contributed modules will also require more disk space - and the more modules you use, the more this requirement will grow.
However, disk space is very rarely an issue. It seems that nearly every hosting company out there that can host a Drupal site offers massive (or even "unlimited," whatever that means) disk space.
The rest of the requirements are relatively static and not wholly dependent on your needs. The web server, database server and PHP versions are requirements that must be chosen correctly to begin with.
Web Server - Although Drupal will work on both Apache and Microsoft IIS, Apache is recommended because of the level of testing already done. Most of Drupal has been developed on Apache, meaning that there is more experienced help available should a server-related problem come up.
Database Server - There are several database applications that will work with Drupal, but again, MySQL is recommended. For Drupal 7, you'll need MySQL 5.0.15 or higher. Just as with the web server issue, more testing has been done with MySQL, leading to more experienced users who will be available to help should a problem with the database server arise.
PHP - The PHP requirement, in my experience, is the requirement that should be looked at and addressed the most. Most hosting companies can handle the database server, web server, and disk space requirements with little or no issue. Meeting the PHP requirement is usually where there's a problem.
Drupal 7 requires PHP 5.2.5 or higher - but 5.3 is recommended. This is a recommendation that is not made simply because more people use it, or because more help is available. It is recommended because Drupal 7 works better with PHP 5.3. I have seen several sites' performance receive a boost simply from switching from the required PHP 5.2.5 over to the recommended PHP 5.3. This is a place not to skimp just because you'll save a buck or two each month.
Don't Forget About Your Database Server
Even if your database server is the recommended MySQL 5.0.15 or higher, it can still be slowing down your site's performance. A lot of a site's responsiveness, especially in a content management system, depends on the response time from the database.
I mentioned earlier that a lot of hosting companies try to save money on hardware upgrades and the like by hosting your database on a separate server from your web site. On smaller sites, the difference may be negligible. But for more complicated sites, the load times can be excruciating. And one of the first things anyone learns about surfing the Internet is that a slow-loading web site is a quickly-forgotten web site.
Having your database hosted on a separate server from your web site is fairly common, and does have the advantage of redundancy. Nevertheless, this is a challenge to us developers who are concerned about performance, especially given the popularity of database-driven content management systems. But there are a few hosts that understand the advantages of having the database and the site hosted on the same server.
Really, it all boils down to doing your homework. As you're researching for the perfect hosting company, think about the type of site you're building and how the database configuration might affect the speed and performance of your site. If you believe your site may have a stronger pull on the database - such as with membership sites, ecommerce sites, multi-blogging sites and social networking sites - then make sure you find a host that will accommodate this added pull.
2. Caching: Consider Your Users
And Caching Is...?
As you surf around the web, you're not actually going anywhere. Your computer and your network don't actually move. Instead, your computer and Internet browser send out a request. The server on the other end of the network responds to this request by sending out instructions on how to load required resources and assemble the web page on your computer.
If your Internet browser is set to allow caching (all standard browsers are set to cache by default) then much of the information shown on that web site, such as images, CSS files and scripts will be saved in a temporary folder. This way, the next time you wish to load up that web site, your browser will already have access to much of the information you're about to receive from that other end of the network. Instead of making several new requests and trying to re-load everything to build the same site, your browser and computer can instead use the files that have been temporarily stored on your computer. This allows the browser to build the site much faster.
Caching works in much the same way for Drupal. When you turn on Drupal's cache feature, your site will begin storing the results of expensive work, such as database queries and rendered pages, in a set of dedicated cache tables in its database (the default location). In turn, your site will use these stored results to build your pages faster whenever a visitor tries to load one of them.
Dynamic or Static? Authenticated or Anonymous?
Most sites operate with significant differences between the anonymous visitors and authenticated visitors to their site. Either the authenticated visitors have access to different areas, or are allowed some level of personalization such as a buddy list, shopping cart and / or wish list, saved searches, favorite posts or comments, etc. Anonymous visitors have no need for these things; consequently, you have no need to ensure that these things are updated for them. Many times, authenticated users will specifically look for the most up-to-date content on your site while anonymous visitors may do well with cached content that is aged a few minutes or hours.
If the majority of your site is meant to be "read-only" - meaning that most or all of the site will be used for informational purposes with very little or no interaction from your visitors - then there is no reason for you to not be caching. Essentially, the more of your site's visitors who are anonymous, the more effective your site's cache will be. If you do not plan on having a lot of updates or new content posted every day, there is no reason why you should not be caching your site.
Once authenticated users log into their accounts on your site, the number of database queries increases significantly. If your visitors are primarily authenticated users who are logged into your site, a simple cache is going to be less effective and you'll need to develop another plan.
3. Reduce and Combine External Resources
There are dozens upon dozens of tidbits, extensions, and add-ons that you can use for most any site. Weather listings, news feeds, YouTube videos, Flickr photos, Facebook boxes, count downs, site statistics, Tweets, store fronts, and (of course) advertising. Many of these things are offered off-site - just copy their little bit of code onto your page and, voila, whatever news story, status update, product sale or picture they are showing on their site can now be seen on your site instead.
The problem with so many external resources, however, is the same as the database problem when it's being hosted on a different server. As your page loads, it uses that code to send out a request and return that news story, photo, movie or status update so that it can be displayed. I'm sure you can figure out - the more requests that your site has to make through the network, the longer things will take to load.
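One common way to soften this is to fetch the remote content once, keep a local copy for a few minutes, and serve that copy in the meantime. The Python sketch below illustrates the idea under stated assumptions: the class name is hypothetical, the `fetch` callable stands in for whatever slow remote request your page makes, and the five-minute lifetime is an arbitrary choice.

```python
import time


class TimedCache:
    """Keep a fetched value for `ttl` seconds before fetching it again."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch        # callable() -> value; stands in for a slow remote request
        self._ttl = ttl            # lifetime of a stored copy, in seconds
        self._clock = clock        # injectable so the behaviour can be checked without sleeping
        self._value = None
        self._expires = None       # None means nothing has been fetched yet
        self.fetch_calls = 0

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._fetch()
            self._expires = now + self._ttl
            self.fetch_calls += 1
        return self._value


# A fake clock lets us "advance time" instantly for the demonstration.
fake_now = [0.0]
feed = TimedCache(lambda: "latest tweets", ttl=300, clock=lambda: fake_now[0])

feed.get()
feed.get()              # still fresh: served from the stored copy
fake_now[0] = 301.0     # more than five minutes later
feed.get()              # stale: fetched again
print(feed.fetch_calls)
```

Three page loads cost only two remote requests here, and on a busy site the savings grow with every visitor who arrives inside the five-minute window. The trade-off is that visitors may see content that is a few minutes old.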
4. Handle Those Images
Make sure that the images you use on your site are optimized. By optimized, I mean make sure they have file names that make sense and are easily readable, and are compressed to a size that promotes easy loading. You'll want to compress as much as possible while maintaining acceptable quality.
If you are using large images, make sure that you also have smaller copies of the photos to use as thumbnails. And make sure that the picture you upload is already the size that you want to display. This means less data will have to be transferred, and resources won't have to be used to perform resizing. Resizing takes time, regardless of whether it happens on the server or in the browser through CSS or HTML.
Drupal 7 includes a built-in Image module that provides "image styles," the functionality formerly offered by the contributed ImageCache module. When you use a field to upload a picture, this module will create a thumbnail and save that thumbnail in a different folder from the original. Of course, image styles do a lot more than merely create thumbnails - there are all sorts of cool little tricks this module can do to your images. If everything on your site's server is configured correctly (e.g. your PHP memory_limit is set high enough), then image styles can be a blessing.
5. Cache Handling
If you are going to be using Drupal's cache features, make sure you understand how they work and if they are working effectively for you.
I've often heard clients complain that Drupal's cache system just doesn't work for them; yet when I go to take a look, they've sabotaged themselves. They've had modules and other features enabled that would only work by first disabling cache.
How much will your "caching" be worth then? Not much of anything, because it's not being used.
So, to get the best use of your cache, make sure you know what is being cached and what might affect your site's ability to be cached. One of the most common mistakes I've seen is people having their site set to cache every page and then allow comments on their pages and implement a captcha on their comments to help prevent spam.
But, as I said earlier, the captcha module forcibly disables the cache. So let's say you have a blog, and on that blog you allow comments. Many blog sites have the comments and the comment form right there at the bottom of the blog entry. If you have yours set the same way, and the captcha module is enabled for the comment form, then none - not one - of your blog entries will get cached. Have a store in which you use the comments as reviews? Same thing. And with this set up, a huge chunk of your site will end up not getting cached, increasing load times and SQL queries in a big spiral.
So, what are you supposed to do? Subject yourself to spam just to save a few seconds on loading time? No, of course not. But be mindful.
Set comments up so that the comment form appears on a different page, or is loaded into the main page via AJAX. You can set Drupal to separate the comments from their page, or you can use the Talk module as well. Use whichever method works best for you. Then, with your comment form on a separate page, you can feel free to captcha to your heart's content because the blog post itself will still be cached.
Don't instantly run out and remove all of the cache-debilitating features that you want and need; just use them wisely. Some statistics and feeds disable caching on the page where they appear, just like the captcha module does. So, when using these features, be mindful of where you place them. If you have a news feed posted in a block, and that block appears on every page of your site, of course your site's cache is not going to work and its performance is going to suffer.
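A simple audit of what appears where can reveal how much of a site is quietly opting out of the cache. The Python sketch below is purely illustrative: the component names and page layout are hypothetical, and the set of "cache breakers" is just the three features named in this article, not an exhaustive list of what disables caching in Drupal.

```python
# Features that, per this article, prevent caching of the page they appear on.
CACHE_BREAKERS = {"captcha", "statistics", "rss_feed"}


def uncacheable_pages(layout):
    """Map each uncacheable page to the sorted list of components that break its cache."""
    report = {}
    for page, components in layout.items():
        offenders = sorted(CACHE_BREAKERS & set(components))
        if offenders:
            report[page] = offenders
    return report


# Hypothetical layout: the comment form (with its captcha) lives on its own page.
layout = {
    "/blog/my-post": ["body", "comments"],
    "/blog/my-post/comment": ["comment_form", "captcha"],
    "/news": ["body", "rss_feed"],
    "/about": ["body"],
}
for page, offenders in sorted(uncacheable_pages(layout).items()):
    print(page, "is not cached because of:", ", ".join(offenders))
```

Notice that the blog post itself stays cacheable because the captcha has been moved to the separate comment page. Put the same feed or captcha in a block that shows on every page, and every entry in the layout would end up in the report.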
Helpful Drupal Modules
Once you've completed the simple "do it yourself" steps, there are some modules offered by Drupal's community to help boost performance. Be sure to read through each module's documentation carefully, though - these are not simple plug-n-play modules. They require some configuration to get installed correctly.
1. Memcache API and Integration
The Memcache API and Integration module serves as, you guessed it, an integration between Drupal and Memcached.
Memcached works to improve dynamic web applications by alleviating database load. It helps to optimize your site's use of available memory for caching. The installation looks scary, so be sure to read through everything carefully to make sure that you've got it set correctly. However, once set, you'll see that it's totally worth the effort. Make sure your web host supports Memcached before trying this module.
2. APC - Alternative PHP Cache
The APC - Alternative PHP Cache module works to provide a reliable framework for caching and optimizing PHP intermediate code. It is best suited for caches which are not expected to change often or grow beyond a certain size.
3. Boost
The Boost module is another module that requires a little extra effort to get installed and working correctly. But, once it's up and running, you'll be glad you put in that extra effort.
Unlike some other caching modules, Boost has been tested and works well on all sorts of server types - Shared, VPS, Dedicated. This module provides static page caching, which will have significant impact on your site's performance if your site's visitors are primarily anonymous visitors. If your site's traffic is mostly logged in / authenticated users, then Boost won't help you all that much.
4. Varnish HTTP Accelerator Integration
The Varnish HTTP Accelerator Integration module connects Drupal to Varnish, a caching layer that sits in front of Apache, helping to serve up cached files and page views faster for anonymous visitors.
And My Personal Favorite Is...
Personally, I use either Boost or Memcache depending on a particular site's needs. Boost if the traffic is mostly anonymous, and Memcache if the traffic is mostly registered members. Seems fairly cut and dry, right?
But, in recent years, I have been working with more and more clients where it hasn't been quite so cut and dry. Their traffic has been almost equal between anonymous and logged in / authenticated users. So, you may like to know that these two modules play nicely together when installed on the same site - without any more work or configuration beyond their normal installation.
Other Hints and Tricks
There are, of course, several other things that you can do to get your performance at its best. For example, disable any core modules that eat up a lot of resources but aren't necessary.
For example, I don't think I have ever used the color module. It's a nice feature to have, in theory, and there are a couple of modules that require it for a particular effect. But for the most part, the color module is for themes - and since I build themes for my clients, there is no reason to use the color module. So, this module never gets installed on any of my sites.
The overlay module might also be one that you don't need. And the toolbar module, though it's nice, can also slow down some sites. Personally, I like having both of these modules enabled. But, as a last resort when optimizing, there is a noticeable performance boost when these modules are disabled. And the Administration Menu module does work faster than the toolbar module. If you've tried everything else and still need to shave a few more milliseconds off your site's load times, disabling these two modules can help.
The Update Manager module can also cause a small strain on your site's performance. Again, this is a module that is nice to have working, but in situations where every millisecond counts, disabling this module can help. Just make sure that you remain vigilant in checking for updates on your own if you disable this module.
If you used the Devel module during your site's development, make sure to disable it once your site is live. I've found that the Devel module can be a useful tool in troubleshooting some performance issues, such as when you need to know how many database connections are happening, but overall this module can kill the performance of a live site.
YSlow, a browser add-on from Yahoo!, analyzes web pages and returns a list of possible performance problems on each page.
When you run the YSlow test, it will report on a number of performance elements and provide tips to help alleviate any problems. It then categorizes these components and gives each category a grade - the better the grade, the better your performance. Near the top is an overall grade for your page. One of my clients' sites, for example, received an overall grade of 89 (a B).
How does this compare to other sites? Well, I ran the same YSlow test on some of the web's most popular sites (based on the list available from Alexa) and some of the most popular CMS sites. Here are the scores that came up:
- Drupal.org - 90 (B)
- Wordpress.org - 84 (B)
- PostNuke.com - 81 (B)
- YouTube.com (not logged in) - 81 (B)
- Amazon.com (logged in) - 78 (C)
- Joomla.org - 78 (C)
- Google.com (logged in) - 77 (C)
- Facebook.com (logged in) - 77 (C)
- Wordpress.com (not logged in) - 77 (C)
- Plone.org - 76 (C)
So, as you can see, a score of 89 is pretty good overall in comparison with this list. And while I'm happy and my client has no issues with the overall performance of this site, there is still a list of things I can do to improve the performance even further.
Interestingly, the YSlow site scored an 82 (B) on their own test...
And It All Adds Up To...
There's a good chance that you won't have to go through most of these steps. Following the Pareto Principle, fixing just one or two of these things will get rid of most of your performance problems. Heck, even just making sure that the cache is not accidentally being disabled can help tremendously. My advice? Don't try to do all of these fixes all at once. If you start changing, installing and disabling everything all at once, you'll have a harder time trying to figure out exactly what helped that particular site. You won't learn as much, and you might accidentally break something.
And, of course, every site is different - so every site is going to require a different solution.
Start with the server configuration, then address your user types and behavior, then implement the correct caching techniques needed to optimize your site's performance. Then start reducing external resources (internalizing and combining where possible) and optimizing your images. Use YSlow to help diagnose any other potential performance problems and work to correct the ones you can (not all of them will be within your power to correct).
Most of all, be patient and mindful of what you're doing. Watch for modules that are not compatible with your cache and find ways to make them work. Make sure you disable and uninstall a module before removing it from your modules directory. Use only the modules you need and reduce the number of SQL queries wherever possible.
Drupal is big - and it can be hungry. But if you feed and care for it properly, it is highly effective and provides performance that will compete with any other CMS.