Scalable Web Applications – Programming the new world

Purpose of the entry

On Saturday June 13th 2009 I attended a talk by Eli White on Scalable web applications. Eli White previously worked at digg.com and now holds the position PHP Community Manager & DevZone Editor-in-Chief at Zend Technologies. When you hear him talk you immediately notice he has a high knowledge on programming, good practices and he is eager to distribute his knowledge and does it with passion. That explains his position at Zend. Eli gave for what I am concerned the best talk at the conference. Of course I didn’t attend all talks but I can hardly imagine a talk more interesting than this. He gave valuable tips on how to implement new projects with small tips. He showed you how you could already take some small steps that would lead to great benefit if your application would turn out to be mass consumed. The following entry is my interpretation of the talk itself.

Overview

What is scalable application design
Tip 1: load balancing the webserver
Tip 2: scaling from a single DB server to a Master-Slave setup
Tip 3: Partitioning, Vertical DB Scaling
Tip 4: Partitioning, horizontal DB Scaling
Tip 5: Application Level Partitioning
Tip 6: Caching to get around your database
Resources
Closing notes

What is scalable application design

Scalable application design first means, there is a design for an application. So there is a start on the application itself. It is no use designing and thinking on how to create your killer application by trying to fit in as many good practices as possible, so that in the end you feel discouraged in even beginning to program. First step in creating an application is actually beginning. Do not overkill the project with great ideas which will take forever to implement or require you to study till you drop. Does this mean you do not have to study or apply good design principles? Of course not, you should on a permanent basis. It should become a habit, a second nature. It just means do not overdo it in the way you will never begin the project at all. To make it easy for you Eli gave some practical tips which you can apply immediately without any problems and which doesn’t require any real study or understanding of design principles.

Tip 1: load balancing the webserver

Load balancing is the act of dispatching a user his request towards one server that is part of a server farm. This way the load (generated by user requests) is distributed amongst several servers and your application will be able to respond quicker. There are some caveats to keep in mind when coding for a system that will eventually have load balancing.
Do not program depending on a cache that is written to the filesystem, do not rely on the filesystem at all. The reason why is straight forward. Because the requests are distributed amongst the servers the user his second request might end up on a different server where the cache is not available. So if you are using local caching (APC / Zend Server) avoid assuming exclusive/single cache. Most of us start with only having one server, so it might be a good idea to encapsulate the sessions, cache reliabilities. So if you switch to load balancing it is only a matter of changing the code in the encapsulation.

Tip 2: scaling from a single DB server to a Master-Slave setup

class DB {
    private $master;
    private $slave;
    private static $cfg = array(
        'write' => array('mysql:dbname=MyDB;host=213.136.52.29'),
        'read' =>  array('mysql:dbname=MyDB;host=213.136.52.30',
                         'mysql:dbname=MyDB;host=213.136.52.31',
                         'mysql:dbname=MyDB;host=213.136.52.32');
        );

    public function __construct()
    {
        $this->master = $this->_getConnection('write');
        $this->slave = $this->_getConnection('read');
    }

    public function query($query, $pool)
    {
        if('master' == $pool) {
            return $this->master->query($query);
        } else if('slave' == $pool) {
            return $this->slave->query($query);
        }
    }

    private function _getConnection($pool) {
        $max = count(self::$cfg[$pool]) - 1;
        $dsn = self::$cfg[$pool][mt_rand(0, $max)];
        return new PDO($dsn, USER, PASS);
    }
}

Adopting a Master-Slave setup is not available for the start-up developer. It is a costly thing, and most of us have a shared hosting account or one dedicated server. Does that mean you should not already implement it? Of Course not. You should already accommodate your code for this setup. And it can be done with little extra coding. Create an extra layer which can instantiate connections to different DB servers. As many as you want for Master and as many as you want for Slaves. Implement your layer that you have a query method which accepts an extra parameter. Everytime you query the database for a write action (insert, delete or update) you pass the extra param as ‘master’ and for all reads (select) ‘slave’. Then in your layer delegate the query to the specific server: master or slave. I only have one dedicated server so In my code I make sure both ‘master’ and ‘slave’ point to the same database connection. But as soon as my website gets heavy traffic I can add another DB server, and all I have to do is change one method and that is it. All my write queries go to the master which is replicated to the slave and all my select queries are done on the slave. You just made your application more scalable.
Is it really that simple? In fact it is.

$db->query('update article set comments += 1', 'master');
$db->query('select comments from articles', 'slave');

Of course you have to take several other issues into consideration. One of those is slave-lag.
Slave-lag is the time needed between an Insert, Delete, Update query on the master and the replication towards the slave. We are speaking miliseconds here, but it could be enough for you to mismatch the latest ID from your select statement. Write your code that it is not dependent on lag. Lag from any matter because if you really start to scale bigtime, you will scale over several data centers around the planet and then replicating one farm to another can take seconds, minutes…

Tip 3: Partitioning, Vertical DB Scaling

Sometimes tables can become to big when you talk about the number of columns they have. That will mean that with every action on the database the query will experience overhead. When a table of 15 columns has to be searched on one field and only 2 columns should be returned almost every time then it can make sense to split up the data in multiple tables. This will decrease the size of the tables and thus also the time and memory needed to query them. You can vertically partition your tables when one of the following situations presents itself:

Columns that are rarely used can move to their own table
Columns that are often empty
And columns that are not used in the where clause

Tip 4: Partitioning, horizontal DB Scaling

A table can have too much columns, but a table can also reach a point that it can have to much rows. In this case it might be wise to split the table in rowsets.
The splitting can be done on several ways. Here are the more common ones.

Range Based
Date Based
Interlaced
User Based

Depending on your application you can use any of the methologies described above. For a news site you might want to give users access to the archive with articles of last year. Mostly we only access the ones from this year. Date based would be a good way to go for such an application. Range based is when you have for instance 8 million users in your database and want to split it up per one million rowsets based on the userId or any other primary key. Interlaced means every row is switched between tables. First row goes to the first table, second to the second table, third to the third, fourth back to the first table and so on. And last but not least is user based. You can split upon username alphabet or any other characteristic, but remember all this horizontal scaling must be retrieved from your application code. So think hard on how you want to split. It is to no use if you split it up and you still have to query each and every table to find the data you are looking for. So put some thought in it, you need to be able to map it later on from your code.

Tip 5: Application Level Partitioning

Application level partitioning involves moving various tables of your DB onto different servers. These can be single tables or related tables so you still are able to join them in your query.

Tip 6: Caching to get around your database

Your performance can get a real boost with the right DB result caching. Always try to Write through cache in as close to the final processed form possible but choose small, discrete, reusable data units. Of course you should not be storing data you can’t recreate. Take special note on the discrete, reusable data units. These are units that can be re-used by other parts of your application or by the same part but on different requests. This way caching can be shared and no (extra) calls need to be made to the database.
For example Facebook has millions of users and those millions of users can keep track of hundreds of friends. If every status update would have to be queried from a database the application would be painfully slow. Facebook solved this issue with reusable data units. When you have 100 friends in your list facebook creates 100 small cached units for every friend individually. If one of your friends also has 100 frends and shares 50, Facebook can reuse 50 units and get the rest from the database. Another user might have 75 friends, 40 the same as you and 25 the same as your friend. 65 data units can be reused and only 10 have to be extracted from the database. AS you can see this way you can avoid a lot of calls. What happens when a user makes a status update? Well the status update is written towards the master Db and to the cached data unit of that specific user. No query is needed to retrieve the data. All application parts that use that unit are updated automatically. Only when the cache would turn out invalid a select query is done. I hope you see the real power in this kind of caching.

Resources

Closing notes

Most bottlenecks in the next hype is at database level. Distributing your data in databases accross multiple database servers is often the best and only solution. The problem is if you have to rewrite all your code to accommodate these changes across multiple servers it might take a lot of budget. Which can be easily prevented.
Of course you do not yet have these scaling problems. Hopefully you eventually will and that means you should already encapsulate your code that will handle the queries. By encapsulating you have a single spot where you will have to make the necessary changes (if any, because you could start developing with all these tips and tricks already in mind and implement them). For small applications the overhead that you create with for instance vertical partitioning is small. But the benefit is huge when your application is the next big thing. The caching is something you could already implement to the full extend.

Now it is up to you!
Have fun.
Nick Belhomme

June 25, 2009 at 7:36 pm

EliW says:

Hey Nick! Thanks for this writeup, it’s a great summary of my talk. A few things I want to clarify:

The references to ‘Facebook’ in this talk about caching discrete units of data. My example there was actually Digg.com (I used Facebook in an example about partitioning). In this case, it’s using memcached, and simply instead of caching large sets of data that will have duplication in them. (Such as caching all the stories dugg by each of a user’s 100 friends as a whole unit). It’s about breaking that down into reusable pieces. So in this case, separately putting in memcached the ‘stories dugg’ by each user. Then when someone logs in, you just request each of their friend’s information. Yes, you end up making 100 memcached queries for example in that case. But that’s still better than making 1 DB query. And by making the units small enough, you’ve gained other efficiencies as discussed in this article (when the second user logs on, you get to read even less from the database)

On a second point. The example code you give above nick for doing master/slave writing. It’s some good basic example code. But it should be clarified that it’s not a complete solution. Part of that solution would involve handling failover for example. (If one slave fails to connect, just try another one, gaining you not only scalability, but uptime)

Also, as your code there is written, it always makes a connection to both the master and a slave. It would be preferable to make those connections on the fly. Afterall that way if you have a page that never needs a ‘master’ connection in the first place, then you don’t waste resources connecting to it.

And as response to claylenhart: You are correct. For ‘small to medium to large-ish’ tables, just making indices is perfectly valid and was already assumed. What Nick’s writeup here doesn’t quite convey, was that these were ‘steps’ I was going through, in order. Doing Vertical and Horizontal partitioning are things that should only be breached well beyond the realm of issues you are talking about. Where you are talking about billions of rows (and potentially wide rows), and where you are literally seeing performance issues of your indices not keeping up. These are extreme measures at the outer limits of scalability.

Thanks again Nick!

7 comments

June 24, 2009 at 5:39 pm

zerone says:

Hey, nice article!

This facebook data units are files? Plain files? Or a bunch of serialized data?

Or a file per user with the top online friends (for the chat, for example). Can you elaborate a bit more. Thanks
June 24, 2009 at 9:52 pm

claylenhart says:

I want to discourage the two partitioning tips, 3 and 4.

For vertical partitioning, a good alternative is covering indexes that would include the frequently used columns such as id, nickname, password and firstname. The database engine will just read the columns from the index and will not read the table, effectively giving vertical partitioning without the effort.

For horizontal partitioning, indexes are not as bad as you might think. For 8 million rows, the DB engine will only need to read about 4 pages of data to find a singe record (it depends on the width of the clustered index). Splitting the table into multiple 1 million row tables will, at best, reduce this by one to 3 pages, but still could be 4 pages. In the best case when you only need to examine one table, it will be better, but when you need to examine multiple tables, you’ll likely join these partitioned tables and the query will likely look for each record in all the partitioned tables making the query much slower.
June 25, 2009 at 8:25 am

Amit says:

Your self examples and good closing notes made it a good read even though its mostly summary of “Habits of Highly Scalable Web Applications” slide By Eli White. Thanks for sharing your thoughts.
Pingback: » Zend Framework 1.8 Workshop at the Dutch PHP Conference 09 Programming the new world: Programming your life and the net, one day at a time
June 25, 2009 at 7:36 pm

EliW says:

Hey Nick! Thanks for this writeup, it’s a great summary of my talk. A few things I want to clarify:

The references to ‘Facebook’ in this talk about caching discrete units of data. My example there was actually Digg.com (I used Facebook in an example about partitioning). In this case, it’s using memcached, and simply instead of caching large sets of data that will have duplication in them. (Such as caching all the stories dugg by each of a user’s 100 friends as a whole unit). It’s about breaking that down into reusable pieces. So in this case, separately putting in memcached the ‘stories dugg’ by each user. Then when someone logs in, you just request each of their friend’s information. Yes, you end up making 100 memcached queries for example in that case. But that’s still better than making 1 DB query. And by making the units small enough, you’ve gained other efficiencies as discussed in this article (when the second user logs on, you get to read even less from the database)

On a second point. The example code you give above nick for doing master/slave writing. It’s some good basic example code. But it should be clarified that it’s not a complete solution. Part of that solution would involve handling failover for example. (If one slave fails to connect, just try another one, gaining you not only scalability, but uptime)

Also, as your code there is written, it always makes a connection to both the master and a slave. It would be preferable to make those connections on the fly. Afterall that way if you have a page that never needs a ‘master’ connection in the first place, then you don’t waste resources connecting to it.

And as response to claylenhart: You are correct. For ‘small to medium to large-ish’ tables, just making indices is perfectly valid and was already assumed. What Nick’s writeup here doesn’t quite convey, was that these were ‘steps’ I was going through, in order. Doing Vertical and Horizontal partitioning are things that should only be breached well beyond the realm of issues you are talking about. Where you are talking about billions of rows (and potentially wide rows), and where you are literally seeing performance issues of your indices not keeping up. These are extreme measures at the outer limits of scalability.

Thanks again Nick!
June 25, 2009 at 7:42 pm

Nick says:

Well there you have it, from the main man himself. Thank you so much for the clarifications Eli!
December 1, 2009 at 5:17 am

Rizky Rukmana says:

Hi,

This is interested post.
Anyway what is Application Level Partitioning ? is it database Sharding?

tip 7: Shared Storage System like NFS/SAN.
tip 8: Separate your static files or CDNed.
tip 9: Do database archiving
tip 10: Separate OLTP database from OLAP usage