WordCamp US Talk: Scaling Dynamic WordPress Sites
Our CTO Aaron Edwards spoke at WordCamp US last weekend about his experiences scaling dynamic websites, specifically Edublogs, CampusPress and WPMU DEV. This is an edited transcript of his presentation. If you have any questions about scaling, or even Aaron’s role as CTO, leave a comment for him at the end of the article. Scroll to the bottom for the slides from the presentation.
Normally, when you go to performance talks at events like WordCamps, speakers talk more about the front-end of your site, speeding that up, caching plugins, configurations, all of that kind of stuff, and the big focus is on full-page caching.
The problem is, if you have a dynamic website and run bbPress, eCommerce, WordPress Multisite, BuddyPress, or even a membership site – pretty much any site where a lot of your visitors are going to be logged in – users are going to skip full-page cache almost all the time, which doesn’t help you out at all.
So today, I’m going to talk about fixing page generation time as part of your stack.
If you look at a normal request to your website when you’re logged in, it might look something like this:
You’ll see this big bar where it’s waiting for your page to be generated and sent to you. Only after that can your browser download all the assets and things. So just focusing on this page generation time and how we can improve it can make a huge impact on your site, especially when it’s a dynamic site.
Google recommends a maximum of 200 milliseconds for your page generation time, but really you want to get it as far below that as you can. When you have a lot of large plugins and themes running code all the time, though, it can be a big challenge to meet that mark.
When I first started working on this presentation, I thought I’d talk about scaling onto multiple servers and different architectures. But I realized from my own experience that doing that doesn’t actually solve performance issues. All it’s going to do is multiply them if you don’t fix the underlying performance issues first.
To fix the underlying issues, start by dividing normal page generation time into a few chunks, as you can see in the image above. The biggest chunk is usually your database. It’s the biggest bottleneck you’ll run into on your site. To fix the database problem you want to look at your queries and optimize things as much as possible.
There’s an awesome plugin called Query Monitor that allows you to analyze any queries your plugins are making and identify problem areas where you can add an index or just get rid of a plugin completely.
Optimizing your MySQL configuration is important. Query cache is a big thing. It’s surprising how many people don’t have the query cache enabled in MySQL since it can be a big help if you have a lot of read-heavy tables.
Also, convert your tables to InnoDB, but only the high-write tables and not all of them. If you’re running a large WordPress Multisite install, like we are for Edublogs, converting all of your tables to InnoDB can cause some major headaches. Trust me, we’ve experienced it. So we tend to focus on the tables that have a lot of write requests going to them, which means the global tables on a Multisite install.
MariaDB is an alternative to MySQL. It’s actually a fork of MySQL with newer code. You can get a 10%-20% performance boost by switching to MariaDB. And if you’re lucky enough to be hosting on Amazon, Aurora is a new service – an alternative to MySQL – that has 2-3 times better benchmarks and speeds, which is pretty amazing.
WordPress Object Cache
Ultimately, the best way to optimize your time spent in the database is never letting the queries get there in the first place, and that’s where the WordPress object cache comes in.
Normally, PHP talks to MySQL directly, which is slow, but if you have object cache configured you can cache a lot of those queries and requests in memory, which is so much faster.
Two object caching options are Memcache and Redis. Those are the ones I recommend because they’re memory-based, and if you scale out to multiple servers they can all share the same cache. If you try file-based caching or APC (Alternative PHP Cache) instead, file-based is sometimes slower than not using an object cache at all, so I highly recommend against that.
PHP Optimization: Code Profiling
The other thing you should look at is the PHP chunk of the page load generation time on your site. One of the most important things you can do is optimize your code.
For beginners, there’s a cool plugin called P3 (Plugin Performance Profiler) that lets you ask, “Is this plugin using this amount of resources?” or decide, “This plugin is a resource hog, I don’t really need it, I’ll get rid of it.” It’s very easy for beginners to use.
For more advanced users, there’s Xdebug, which allows you to profile your code and see which functions are bad, which loops are running too many times, etc. And on a production site where you’re getting a lot of traffic, New Relic is an awesome paid service that does wonders in helping you analyze and profile your PHP code.
PHP Optimization: Worst Offenders
These are some of the issues I’ve run into when analyzing PHP code:
Unnecessary and Unoptimized Queries
You can cache a lot of queries to the object cache, saving you a lot of time. On the WPMU DEV website, we run a lot of our own plugins on our own high-scale sites. We spend a lot of time trying to optimize plugins with the object cache. Also, look out for stats plugins and plugins that do redirection logging.
Oftentimes, they try to write to the database on every page load, which is a big no-no since it will slow your site down considerably if you have a high-traffic site.
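One common alternative to writing on every page load is to buffer the hits and write them in batches. The sketch below is an illustration only, not how any particular stats plugin works: `BufferedHitLogger` and the `write_batch` callable are hypothetical names, and the example is written in Python for brevity. In PHP, where each request starts with fresh memory, the buffer would have to live somewhere shared, such as the object cache.

```python
from collections import Counter


class BufferedHitLogger:
    """Collects page hits in memory and writes them out in one batch."""

    def __init__(self, write_batch, flush_every=100):
        self._write_batch = write_batch  # callable doing ONE bulk database write
        self._flush_every = flush_every  # number of hits to collect per write
        self._pending = Counter()
        self._since_flush = 0

    def record(self, path):
        self._pending[path] += 1
        self._since_flush += 1
        if self._since_flush >= self._flush_every:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        self._write_batch(dict(self._pending))
        self._pending.clear()
        self._since_flush = 0


# Usage: a list stands in for the database so the writes are easy to see.
writes = []
logger = BufferedHitLogger(writes.append, flush_every=3)
for path in ["/", "/about", "/"]:
    logger.record(path)
print(writes)  # prints [{'/': 2, '/about': 1}]: three hits, one write
```

The trade-off is that hits still in the buffer are lost if the process dies before a flush, which is usually acceptable for statistics.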
Watching Out for Remote Requests
When PHP has to call an external API, like Google or Facebook, it has to wait for a response from those services before it can finish generating your page. And if that API service is running slow, it’s going to slow your site down, too, and even crash it if that third-party service goes down. So you want to make sure you use low timeouts and you cache them as long as possible.
Don’t cache them in transients even though the Codex says that’s what they’re there for, because if that transient expires and the API service goes down, it’s going to make an external call every single time the page loads, and that will crash your site like it’s done to Edublogs before. So watch out for that.
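One way to avoid the failure mode described above is to keep the last good response around and back off after a failed refresh, instead of letting the cached value simply expire. The following is a minimal sketch of that idea in Python, not the approach Edublogs actually uses. `StaleOnErrorCache`, `fetch_json`, `flaky_api`, and the timing values are all assumptions made for the example.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout_seconds=2):
    """Remote call with a low timeout so a slow API cannot stall the page."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)


class StaleOnErrorCache:
    """Keeps the last good API response and serves it while the API is failing."""

    def __init__(self, fetch, fresh_for=300, retry_after=60):
        self._fetch = fetch              # callable that performs the remote request
        self._fresh_for = fresh_for      # seconds a good response counts as fresh
        self._retry_after = retry_after  # seconds to wait after a failed refresh
        self._value = None
        self._has_value = False
        self._next_refresh = float("-inf")  # refresh on the first call

    def get(self, default=None):
        now = time.monotonic()
        if now >= self._next_refresh:
            try:
                self._value = self._fetch()
                self._has_value = True
                self._next_refresh = now + self._fresh_for
            except Exception:
                # API is down or slow: back off rather than retry on every page load.
                self._next_refresh = now + self._retry_after
        return self._value if self._has_value else default


# Usage: a fake API that works once and then goes down.
calls = {"n": 0}


def flaky_api():
    calls["n"] += 1
    if calls["n"] > 1:
        raise TimeoutError("API down")
    return {"followers": 1200}


feed = StaleOnErrorCache(flaky_api, fresh_for=0, retry_after=60)
first = feed.get()   # real call, succeeds
second = feed.get()  # refresh fails, the stale value is served
third = feed.get()   # still backing off, no remote call at all
print(calls["n"])    # prints 2
```

In a real deployment the fetch callable would be something like `lambda: fetch_json(url)`, and the stored value would live in a persistent shared cache rather than in process memory.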
Flushing Rewrite Rules Poorly
Some plugins will try to flush the rewrite rules every single page load. It’s still a common problem out there.
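The usual fix is to flush only when the rules have actually changed, for example on plugin activation or after a settings change. Here is a minimal sketch of that check, written in Python as an illustration. `flush_if_rules_changed`, the `options` dictionary (standing in for a persistent options store), and the sample rule are hypothetical. In WordPress the expensive call being guarded is `flush_rewrite_rules()`.

```python
import hashlib
import json


def rules_fingerprint(rules):
    """Stable hash of the rule set, so a change can be detected cheaply."""
    encoded = json.dumps(rules, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def flush_if_rules_changed(options, rules, flush):
    """Run the expensive flush only when the rules differ from the stored ones."""
    fingerprint = rules_fingerprint(rules)
    if options.get("rewrite_rules_fingerprint") == fingerprint:
        return False  # nothing changed: skip the flush on this page load
    flush()
    options["rewrite_rules_fingerprint"] = fingerprint
    return True


# Usage: three page loads with identical rules trigger a single flush.
options = {}
flush_log = []
rules = {"^shop/([^/]+)/?$": "index.php?product=$matches[1]"}
for _ in range(3):
    flush_if_rules_changed(options, rules, lambda: flush_log.append("flush"))
print(len(flush_log))  # prints 1
```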
Direct Filesystem Access
Anything that tries to write to your file system directly will slow down your site, too.
Speeding Up PHP
If you’re running Multisite, make sure you’re using a CDN or Varnish cache in front of your media files. Many people don’t know this, but in Multisite your uploads are actually rewritten through PHP. It’s a weird thing they do, but you’ll get a lot of headroom if you can keep those media requests from hitting PHP.
Also, make sure you’re using the latest versions of PHP, like 5.5 or 5.6, because you can get a 20% increase in speed compared to some older versions, like 5.2.
And finally, if you’re not using the OPcache, you’re crazy because it’s built into PHP. You just need to enable it and it will make your requests twice as fast.
Switch to HHVM or PHP7
You might want to think about switching to PHP7, which was released only recently. I’m very excited to start rolling it out on Edublogs and WPMU DEV and some of the sites we host soon. It’s showing a 2-3 times higher speed than the previous version of PHP, which is a huge thing. So think about switching your server to using the latest versions of PHP if possible.
App Monitoring at Scale
Another thing you should look at, once you have a live site up and running at scale and handling a lot of requests, is how to monitor it.
For our sites, we use StatsD, an awesome open source project by the team at Etsy. We use it with a custom WordPress plugin I’ve written called StatsD WordPress Client. It allows you to analyze your code and what’s going on in your WordPress site without causing any latency. Developers can just put a one-line piece of code anywhere in our stack and see how long a particular query takes or how many times a query fires. All that kind of stuff helps them optimize their code.
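To show why this adds so little latency, here is a minimal sketch of a StatsD client in Python. It is not the StatsD WordPress Client plugin or its API. `StatsdClient`, the `wp` prefix, and the metric names are made up for the example. What it relies on is the StatsD wire format, `name:value|type` sent over UDP (port 8125 by default), where `c` is a counter and `ms` is a timer. UDP is fire-and-forget, so the page never waits for a reply.

```python
import socket
import time
from contextlib import contextmanager


class StatsdClient:
    """Tiny fire-and-forget StatsD client (illustrative, not production code)."""

    def __init__(self, host="127.0.0.1", port=8125, prefix="wp"):
        self._addr = (host, port)
        self._prefix = prefix
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._sock.setblocking(False)  # never block the page on metrics

    def _format(self, name, value, metric_type):
        return f"{self._prefix}.{name}:{value}|{metric_type}"

    def _send(self, payload):
        try:
            self._sock.sendto(payload.encode("ascii"), self._addr)
        except OSError:
            pass  # a lost metric must never break or slow down a request

    def increment(self, name, count=1):
        self._send(self._format(name, count, "c"))

    def timing(self, name, milliseconds):
        self._send(self._format(name, int(milliseconds), "ms"))

    @contextmanager
    def timer(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timing(name, (time.perf_counter() - start) * 1000)


# Usage: count a query and time a block of code, one line each.
client = StatsdClient()
client.increment("db.queries")
with client.timer("db.query_time"):
    sum(range(1000))  # stand-in for the code being measured
```

If no StatsD daemon is listening, the packets are simply dropped, which is the behavior you want from instrumentation on a busy site.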
I highly recommend looking into StatsD if you’re able to roll it out on your own. If not, New Relic is a great paid service, but if you have multiple servers it can start getting expensive really fast.
Too Technical for You?
If all this sounds too technical for you, and you’re not able to hire an experienced sysadmin to take care of this for you, you might want to look into managed WordPress hosting.
Many managed hosts implement the suggestions I’ve talked about here, like object cache and using NGINX. Most also support HHVM on certain plans and many I’ve talked to recently are getting ready to release PHP7 support. So managed hosting may be an option for you.