Imagine a site that keeps track of gamers' scores in computer games and displays gamers in "leaderboards" ordered by decreasing score. The site is written in PHP and the backend is a MySQL 5 database server. Because the data changes frequently, the server uses the InnoDB storage engine.
This scenario occurs often in applications that need to rank and paginate data by some criterion such as popularity, score, or recency. Forums are a good example, as are hot lists on social bookmarking sites.
Because the example gaming site has millions of users, the leaderboards use pagination, which requires a combination of ranking and limits with offsets. This is a common requirement, and it's hard to optimize. The obvious queries probably won't perform well under heavy load or high concurrency, and they won't scale as the data grows. The ranking information will either be expensive to generate or expensive to maintain.
I've found three possible solutions. The obvious design performs badly, but there is a better way.
I think these designs are appropriate for a mid-sized site, which is big enough that you can't use a single MySQL server without optimization, but not so big that it requires massive clustering or horizontal scaling. My goal is to show how you can design ranking applications to get more out of your existing systems.
Before I start, I need to explain the site in more detail. It keeps scores for 10,000 gamers. The gamers collectively play 10 games in 5 countries. Pretend every gamer plays every game, so the site keeps 100,000 score records. I'm intentionally making this small so you can run the queries on an average computer.
The site's most important pages are the leaderboards. They display gamers in descending order of rank by various criteria such as country, game, and overall. For example, gamer 1010 lives in country 2 and plays game 1. She is ranked 51st in game 1 in her country, 290th in game 1 in the world, and 1,867th overall in the world.
Each leaderboard shows twenty gamers per page. Suppose you click on the link that leads from gamer 1010's profile to her leaderboard page in game 1. There are 50 gamers in her country with an equal or higher score in game 1, so you see page 3 in the leaderboard, and she is 11th on the page.
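The page arithmetic above is simple enough to sketch in a few lines. Here it is in Python (standing in for the site's PHP; the function names are mine, not from the site's code):

```python
import math

PER_PAGE = 20

def page_for_place(place, per_page=PER_PAGE):
    """1-based leaderboard page containing the given 1-based place."""
    return math.ceil(place / per_page)

def position_on_page(place, per_page=PER_PAGE):
    """1-based position of that place within its page."""
    return (place - 1) % per_page + 1

# Gamer 1010 is 51st in game 1 in her country:
# ceil(51 / 20) = page 3, and (51 - 1) % 20 + 1 = 11th on the page.
```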
Here's a script that will create tables and populate them with quasi-random data for the rest of this article. I've given a seed to all calls to RAND(), so if you run this script, you will get the same data I am using. This script generates 100,000 rows and takes several minutes to run on my desktop machine.
set @num_gamers := 10000, @num_countries := 5, @num_games := 10;

drop table if exists gamer;
drop table if exists game;
drop table if exists country;
drop table if exists score;
drop table if exists semaphore;

create table semaphore(i int not null primary key);
insert into semaphore(i) values (0);

create table gamer(
   gamer   int not null,
   country int not null,
   name    varchar(20) not null,
   primary key(gamer)
) engine=InnoDB;

create table game(
   game int not null,
   name varchar(20) not null,
   primary key(game)
) engine=InnoDB;

create table score(
   gamer int not null,
   game  int not null,
   score int not null,
   primary key(gamer, game),
   index(game, score),
   index(score)
) engine=InnoDB;

create table country(
   country int not null,
   name    varchar(20) not null,
   primary key(country)
) engine=InnoDB;

-- I use the integers table to generate large result sets.
drop table if exists integers;
create table integers(i int not null primary key);
insert into integers(i) values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

insert into country(country, name)
select t.i * 10 + u.i, concat('country', t.i * 10 + u.i)
from integers as u
   cross join integers as t
where t.i * 10 + u.i < @num_countries;

insert into game(game, name)
select t.i * 10 + u.i, concat('game', t.i * 10 + u.i)
from integers as u
   cross join integers as t
where t.i * 10 + u.i < @num_games;

insert into gamer(gamer, name, country)
select th.i * 1000 + h.i * 100 + t.i * 10 + u.i,
   concat('gamer', th.i * 1000 + h.i * 100 + t.i * 10 + u.i),
   floor(rand(0) * @num_countries)
from integers as u
   cross join integers as t
   cross join integers as h
   cross join integers as th;

insert into score(gamer, game, score)
select gamer.gamer, game.game, floor(rand(0) * @num_gamers * 10)
from gamer
   cross join game;
Take the page showing gamer 1010's position in game 1 as an example. The simplest design for a leaderboard page is to find on which page gamer 1010 falls, then select 20 gamers in descending order of game 1 score, discarding the data for pages 1 through 19. This is design #1. In SQL:
-- Find gamer 1010's score in game 1. Result: 97101
select score from score where gamer = 1010 and game = 1;

-- At which place does gamer 1010 appear? Result: 309.
-- Ties are resolved by gamer ID.
select count(*) from score
where score.game = 1 and score.score >= 97101;

-- Select the leaderboard page that shows gamer 1010.
select gamer.gamer, gamer.name, score.score
from gamer
   inner join score on gamer.gamer = score.gamer
where score.game = 1
order by score.score desc
limit 300, 20;

-- Find the rank for the first row on that page. Result: 284.
select count(distinct score.score) from score
where score.game = 1 and score.score >= 97173;
You need the last query because the others don't give the PHP code all the data it needs. The leaderboard query selects the rows, but not their ranks, and the leaderboard page needs to display ranks. It's up to the PHP code to add the rank; every time it sees a row with a different score than the last row, it has to increment the rank. Still, it needs to know the first row's rank, which is what that last query finds.
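That rank-filling loop can be sketched like so (Python as a stand-in for the PHP; `add_ranks` and its arguments are hypothetical names):

```python
def add_ranks(rows, first_rank):
    """Attach a display rank to each (gamer, score) row of one page.

    Rows arrive in descending score order. The rank increases only
    when the score changes, so tied scores share a rank, matching
    the COUNT(DISTINCT score) query that finds the first row's rank.
    """
    ranked = []
    rank = first_rank
    prev_score = None
    for gamer, score in rows:
        if prev_score is not None and score != prev_score:
            rank += 1
        ranked.append((rank, gamer, score))
        prev_score = score
    return ranked
```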
The previous queries scan and discard many rows just to retrieve 20. Is there a better way to get a row's rank than counting, on average, half the rows in the current game?
The queries need to find ranks and offsets; that's unavoidable. Unfortunately, InnoDB doesn't perform well for COUNT() queries: it has no shortcut for them, so it must scan a range of index rows to produce each count. The other problem is how to optimize a query that does a LIMIT with an offset. LIMIT 20 is not a problem, but LIMIT 2000, 20 is, because it generates and discards 99 percent of the rows. I need to get rid of the offset and make sure the ORDER BY uses an index.
To accomplish this, I'll denormalize slightly by adding indexed rank columns to the tables. Ignoring the maintenance cost for the moment, this is design #2:
-- Find score and rank as above.
select score, rank_in_game from score_ranked
where gamer = 1010 and game = 1;

-- Find position in leaderboard again. Result: 309
select count(*) from score_ranked
where game = 1 and score >= 97101;

select gamer.gamer, gamer.name, u.score, u.rank_in_game
from gamer
   inner join (
      (
         -- Fetch start (first 9 rows) of leaderboard.
         select gamer, score, rank_in_game
         from score_ranked
         where game = 1
            and rank_in_game <= 290
            and gamer <= 1010
         order by rank_in_game desc, gamer desc
         limit 9
      )
      union all
      (
         -- Fetch remaining 11 rows.
         select gamer, score, rank_in_game
         from score_ranked
         where game = 1
            and ((gamer > 1010 and rank_in_game = 290) or rank_in_game > 290)
         order by rank_in_game, gamer
         limit 11
      )
      order by rank_in_game, gamer
   ) as u on gamer.gamer = u.gamer;
That's actually an improvement, even though it looks complicated. I've eliminated one COUNT() query and gotten rid of the offsets on the limits. The queries now return all the necessary data for the leaderboard, including the rank.
There is still one COUNT() query to align the page perfectly modulo 20, but if I don't care about that, I can remove it too. This is design #3:
-- Find score and rank as above.
select score, rank_in_game from score_ranked
where gamer = 1010 and game = 1;

-- Select the leaderboard page that shows gamer 1010.
select gamer.gamer, gamer.name, u.score, u.rank_in_game
from gamer
   inner join score_ranked as u on gamer.gamer = u.gamer
where game = 1
   and rank_in_game >= 290
   and u.gamer >= 1010
order by rank_in_game, gamer
limit 20;
This is even better. The query looks like it's only going to read a few rows. Other types of leaderboards (rank in country, rank overall, etc.) can be optimized similarly.
If you must align the pagination along multiples of 20, you can do an extra query the first time a random record is accessed, use it to align the page, and then carry the positioning along to further requests.
The rank column is basically a cache. This seems like a better design than calculating rank with every query, except for one thing: maintaining the column. If there isn't a cheap way to do this, I've just made the table and indexes larger, and moved the work from the SELECT queries somewhere else. In some cases this might be a big win, for example if the data changes infrequently. But as I said, this site has frequent updates, so the design is only justifiable if there's a cheap way to maintain the rank column.
Fortunately, there is a way. It relies on two things. The first is simulating a write cursor with MySQL's user variables, so you can do one-pass rank calculations. The second is smart update batching; you can calculate how expensive an update will be, and apply it immediately only if it's cheap.
Here's how to add the rank_in_game column I used in the previous queries:
drop table if exists score_ranked;
create table score_ranked like score;
alter table score_ranked
   add rank_in_game int not null default 0,
   add key(game, rank_in_game);
I'm creating a new table, rather than modifying the existing one. I'm going to benchmark later, so I want to keep the original table as it is.
Now I'm ready to populate the table. This is where I have to get tricky with user variables. Here's the query to insert the existing data while simultaneously calculating rank:
set @rank := 0, @game := null, @score := null;

insert into score_ranked(gamer, game, score, rank_in_game)
select gamer, game, score,
   greatest(
      @rank := if(@game = game and @score = score, @rank,
                  if(@game <> game, 1, @rank + 1)),
      least(0, @score := score),
      least(0, @game  := game)) as rank
from score
order by game desc, score desc;
Don't worry, I'll explain how that works!
MySQL lets you read and assign a user variable in the same statement, which enables some really neat tricks. I've written several articles about advanced user-variable techniques on my own website, but I'll give a summary here. First, analyze the query to see how it works:
select gamer, game, score,
   greatest(
      -- Calculate current row's rank
      @rank := if(@game = game and @score = score, @rank,
                  if(@game <> game, 1, @rank + 1)),
      -- Save score and game for next row
      least(0, @score := score),
      least(0, @game  := game)) as rank
from score
order by game desc, score desc;
Ignore the GREATEST() function for a moment, and read the query from the inside out. The first thing to know about a user-variable expression is that it executes once for every row in the result set, as the row is generated. In other words, MySQL processes the rows iteratively, not all at once as a set.
The second is that the query reads every row from the score table in rank order; the technique only works because the rows arrive in that order.
The query works because you can find a gamer's rank by starting at 1 and increasing the rank every time you see a lower score. Row by row, the query checks whether the current game and score are the same as the last values seen. If both are the same, @rank stays unchanged; if the game changes, @rank resets to 1; and if the game is the same but the score differs, @rank increments.
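The same one-pass logic, written out in ordinary code to make the user-variable dance easier to follow (a Python sketch; the real work happens inside MySQL):

```python
def one_pass_ranks(rows):
    """Dense ranks for (game, score) rows sorted by game desc, score desc,
    mimicking what @rank, @game, and @score do inside MySQL."""
    ranks = []
    prev_game = prev_score = None
    rank = 0
    for game, score in rows:
        if game != prev_game:
            rank = 1            # new game: ranking restarts at 1
        elif score != prev_score:
            rank += 1           # same game, lower score: next rank
        # same game and same score: tie, rank unchanged
        ranks.append(rank)
        prev_game, prev_score = game, score
    return ranks
```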
After finding the current row's rank, the query needs to save the current game and score for the next row, but you can't simply put these assignments into columns of their own in the SELECT list: there are no corresponding columns in the destination table to accept them, so the spurious columns have to be removed somehow. That's why I wrapped the assignments inside LEAST(0, ...) and then wrapped the whole thing inside GREATEST(). This ensures the assignments are executed but their values are discarded: the LEAST(0, ...) expressions can never exceed zero, while the rank is always at least 1, so GREATEST() returns the newly computed rank.
So far I've shown you how to initialize the rank column, but not how to maintain it when scores change. The good news is, the user-variable trick works in UPDATE statements too. You can't normally place user-variable assignments in a SET clause, but you can if they're inside a function:
set @rank := 0, @game := null, @score := null;

update score_ranked
set rank_in_game = greatest(
      @rank := if(@game = game and @score = score, @rank,
                  if(@game <> game, 1, @rank + 1)),
      least(0, @score := score),
      least(0, @game  := game))
order by game desc, score desc;
This will update the entire table in one pass. If you need to update every row, this is as good as it gets.
This query performs the kind of logic you can do in other database products with cursors. MySQL 5 has server-side cursors, but they're read-only. This query simulates a write cursor.
You don't always have to update the whole table. I can think of at least four major scenarios, each of which has a separate optimization.
Scenario 1, the best case, is when the change doesn't affect any existing data. For example, suppose I want to delete gamer 1010's score for game 1. There is another gamer with the same score, so the rankings do not change. Inserting a new record for game 1 with the same score won't change them either.
Scenario 2 is where only a single row needs an update. For example, suppose gamer 1010's score for game 1 decreases to 97084. There's an existing row with that same score, so gamer 1010's score and rank will change, but nothing else will. Another single-row scenario is when a row's score goes up or down, but not enough to reach its nearest "neighbor." For example, look at game 1 scores for ranks 291 through 294. If gamer 2805's score increases to 97070, no other rows are affected; gamer 2805 is still at rank 294:
+-------+------+-------+--------------+
| gamer | game | score | rank_in_game |
+-------+------+-------+--------------+
|  9094 |    1 | 97084 |          291 |
|  7462 |    1 | 97076 |          292 |
|  4839 |    1 | 97075 |          293 |
|  2805 |    1 | 97062 |          294 |
+-------+------+-------+--------------+
Scenario 3 is when an internal range of rows needs updates. For example, suppose gamer 4839's score increases to 97080. This moves gamer 4839 up to rank 292 and pushes gamer 7462 down one rank to 293, but it affects only those two rows. If gamer 2805's score had increased to 97078, several rows would have moved down in the ranks, but the change would still be localized.
Finally, an update could affect every row with a lower score for that game "from here to the end of the rank." For example, if gamer 1010's score increased to 97102, she would still be in position 290, but every gamer with a lower score would decrease by 1. This is what I call an "expensive" update. In the worst case, all records for a game would have to be updated.
It's easy to detect each of these scenarios. You just need a few simple queries for each kind of operation: inserts, deletes, and updates. The first query determines if the operation will be expensive, and other queries handle whatever work you decide to do as a result. For example, this query will determine if it will be expensive to insert a new gamer with score 97101 for game 1:
select count(*) as cheap,
   coalesce(
      avg(rank_in_game),
      (
         select rank_in_game from score_ranked use index(score)
         where game = 1 and score < 97101
         order by score desc
         limit 1
      )
   ) as rank
from score_ranked
where game = 1 and score = 97101;
That query returns a non-zero value for cheap, so this will be a cheap operation. If there had been zero rows, the subquery would tell where the beginning of the open-ended range is, and it would be up to you to decide (hopefully with some statistics) whether it's an expensive range to update. If the new score were very low, which is likely because presumably a lot of INSERTs come from new gamers, it would probably go near the end of the ranking, so only a few rows would need updating.
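The application-side decision this query supports might look something like the following sketch (Python; the threshold and function names are assumptions, not from the article's code):

```python
EXPENSIVE_THRESHOLD = 1000  # assumed cutoff; tune it from your own statistics

def plan_insert(count_at_score, rows_below_score):
    """Decide how to apply an insert, given the two counts the
    cheapness query returns.

    count_at_score:   rows that already have this exact score
    rows_below_score: size of the open-ended range that would shift
                      if the new score is distinct
    """
    if count_at_score > 0:
        return "cheap"         # the score already exists; no ranks move
    if rows_below_score <= EXPENSIVE_THRESHOLD:
        return "update-range"  # shift the small range immediately
    return "defer"             # queue it for the one-pass batch update
```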
You can update scores and ranks at the same time with standard SQL queries that have no user variables. For example, if you're only increasing gamer 2805's score to 97085, and your initial query determined it'll "bump" three other scores out of position, you can write the update like this:
update score_ranked
set score        = if(gamer = 2805, 97085, score),
    rank_in_game = if(gamer = 2805, 291, rank_in_game + 1)
where game = 1 and rank_in_game between 291 and 294;
Similar queries are easy to write for other cases.
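To see exactly what that UPDATE does to the four rows shown earlier, here is a Python simulation of the same "bump" (the function is hypothetical; the row values come from the example):

```python
def bump_update(rows, target_gamer, new_score, new_rank, lo, hi):
    """Mirror the SQL: within ranks lo..hi, the target gamer takes the
    new score and rank, and every other gamer moves down one rank."""
    out = []
    for gamer, score, rank in rows:
        if not (lo <= rank <= hi):
            out.append((gamer, score, rank))         # outside the range
        elif gamer == target_gamer:
            out.append((gamer, new_score, new_rank))
        else:
            out.append((gamer, score, rank + 1))     # bumped down one rank
    return out

# Rows for game 1, ranks 291-294, before gamer 2805's score rises to 97085:
before = [(9094, 97084, 291), (7462, 97076, 292),
          (4839, 97075, 293), (2805, 97062, 294)]
after = bump_update(before, 2805, 97085, 291, 291, 294)
```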
These optimizations work well if you do single-row updates, but some changes are expensive and require you to update more of the table, perhaps all of it at once. It's probably better to postpone these and apply them in a batch. When you do, the user-variable query saves the day: it makes the whole batch about as efficient as a single "expensive" query. Without user variables, you would have to do at least some updates with slower methods.
The strategy I just showed you takes one pass through the table. That is, it's an O(n) algorithm in the size of the table (as I showed, the best case is O(1), but the worst case is bounded above by n). This is as good as you can get when you have to calculate a rank for each row.
The alternatives I know of are quadratic: O(n²) in the size of the table. For example:
-- This query is illegal in MySQL, but even if it weren't,
-- it would take a very long time to run.
update score_ranked as o
set rank_in_game = (
   select count(distinct score) from score_ranked as i
   where i.score >= o.score and i.game = o.game);

-- This is legal, but creates an un-indexed temp table (even worse!)
-- inside the subquery. Don't try this at home!
update score_ranked as o
set rank_in_game = (
   select count(distinct score)
   from (select * from score_ranked) as i
   where i.score >= o.score and i.game = o.game);

-- This is still a cross join because of the <=
update score_ranked as o
   inner join (
      select l.game, l.score, count(distinct r.score) as rank
      from score_ranked as l
         left outer join score_ranked as r
            on l.game = r.game and l.score <= r.score
      group by l.game, l.score
   ) as i on o.game = i.game and o.score = i.score
set o.rank_in_game = i.rank;
If that were the best you could do in SQL, what are your other options? One is to simulate cursors in a programming language like PHP. This means selecting the entire table and looping through the rows, calculating each row's rank and sending an UPDATE back to the MySQL server. I advise against this. It's not quadratic, but it's death by 100,000 cuts.
Failing that, you have to abandon rank columns and settle for all the COUNT() queries and limits with offsets. In my experience as a consultant, this is the path most people take, and it doesn't scale. If it did, they wouldn't have called me for help.
Think about it this way: the leaderboard reads in design #1 are O(n) in the size of the table, and writes are O(1). If I eliminate the COUNT() and OFFSET queries, reads become O(1) and writes O(n). If there are many more reads than writes, I've probably made the right decision.
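That trade-off is easy to put numbers on. Here's a back-of-envelope model (Python; the read/write mix is illustrative, not measured):

```python
def total_cost(reads, writes, read_cost, write_cost):
    """Total abstract work units for a given mix of operations."""
    return reads * read_cost + writes * write_cost

n = 100_000  # rows in the score table, as in the example data

# Design #1: reads scan up to n rows (offsets); writes touch one row.
design1 = total_cost(reads=1000, writes=10, read_cost=n, write_cost=1)

# Design #3: reads are constant-time; a write may re-rank up to n rows.
design3 = total_cost(reads=1000, writes=10, read_cost=1, write_cost=n)

# With 100x more reads than writes, the rank column comes out far ahead.
assert design3 < design1
```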
It's not enough to just rely on gut feeling, look at query plans, or run queries and see how long they take. Profiling and benchmarking are essential to understanding whether "optimizations" actually help. The first thing I always do is profile the queries on an otherwise quiet machine so I can measure how much work they really do. This is why I wrote the MySQL Query Profiler.
I profiled the three sets of queries to test the leaderboards. Because gamer 1010 is fairly high in the rankings in those leaderboards, I also profiled gamer 755, who is very low in the rankings, to get a sense of the two extremes. Here is a sampling of the most relevant results:
Each cell shows the counter for gamer 1010 / gamer 755:

                        Design #1       Design #2       Design #3
  Index range scans     2 / 2           1 / 1           0 / 1
  Full scans initiated  1 / 1           0 / 0           0 /
  Next in index         610 / 19975     20309 / 20000   10000 / 8
  Temp table inserts    10000 / 10000   40 / 18         8727 / 8
It looks like some of the queries might not be using indexes the same way on the two gamers, but I don't want to get into early optimization. I just want to look for a general improvement with the ranking columns. The profiling data shows why it's not a good idea to rely only on query plans and execution times; the query times aren't that different, but the other numbers show how different the workload is.
The profiling shows the naïve strategy works well only if the leaderboard page being generated is close to the front--that is, if people are interested in the highest-ranking gamers. If your application follows that pattern, you should optimize for it, but it's not always the case. Assume the gaming site doesn't follow this pattern, because all the low-ranked gamers are vain and want to check their rankings all the time. Now you have to optimize for the pages buried low in the ranks.
In any case, the profiling results are encouraging that I'm going in a good direction with the ranking columns, but benchmarking is crucial.
I ran benchmarks to show how the queries perform. The benchmarks randomly choose and request leaderboard pages with the same queries I've shown you. I'm including the benchmarking scripts in the code you can download from this article.
I benchmarked against the data I gave you, with 100,000 rows in the score tables. This load is entirely CPU-bound on my single-disk, single-CPU home computer.
I used innotop to watch which queries ran slowly. It was clearly the COUNT() queries and the limits with large offsets that dominated.
To simulate concurrent updates, I added threads that choose a random score and change it. In designs #2 and #3, this means not only changing the score, but updating the rank column, too. I did not batch updates to ranks; I simply updated the affected open-ended range every time, so there's still a lot of room for optimization here.
I averaged across five runs for each level of concurrency, from one to twenty concurrent threads. Each thread generated 100 leaderboards. Here are the benchmark results, in leaderboards per second by number of concurrent requests:
[Table: leaderboards per second for designs #1, #2, and #3, each measured with reads only, 1 writer, and 5 writers, at 1 to 20 concurrent read threads.]
Look how much better design #3 performs! This vindicates my hunch that the key to good performance is getting rid of queries that scan lots of rows.
Beyond that, my main concern is how the updates perform on the rank column, because if this isn't efficient, the rank column isn't a good strategy. As you can see, the update queries cause some performance degradation. This reinforces how important it is to optimize the updates.
Updates to the rank_in_game column caused an irregular workload, which the averages smooth out somewhat. As I explained earlier, some updates are cheap, but others may cause large parts of the rank_in_game index to be updated, and this happens randomly in the queries my benchmark script issues. That explains the irregularities.
While it looks like designs #2 and #3 suffer about the same or slightly greater performance penalty from the updates as design #1, I don't think that's really the case. The problem is that my benchmarks aren't realistic. The updates to tables with ranks aren't optimized in my benchmarks, and the updates to the tables without ranks are trivially cheap. In real life I think design #1 would take a greater hit, and smarter updates to the ranked tables would help designs #2 and #3 significantly. However, I cannot prove this without redesigning my benchmarks and running them again.
I also ran some benchmarks against a table with a million rows, but I don't show the results because they took too long to run for design #1. I didn't get a full set of results. However, the results I got show design #1 performs badly as the table grows. At 5 concurrent readers, it could only generate 0.1 leaderboards per second, whereas design #2 achieved about 2.5 and design #3 produced 420.
If you have questions or suggestions, please contact me.
The benchmarks show the rank column is a significant improvement over calculating ranks and offsets every time a leaderboard is generated. This naturally suggests that you can do more of the same and get further improvement. Though I haven't designed and run benchmarks for any of the following, here are some ideas.
The next logical step is to add a pos_in_game column. This makes it more efficient to align random leaderboards on boundaries of rank modulo 20. The downside is that it's harder to maintain: there are fewer "cheap update" cases, and most updates cause more work. While this might be a good idea for some workloads, I think it's reaching the point of diminishing returns.
A better strategy is to keep the score tables small and lean, and create separate rank tables. This should result in less data overall, partly because InnoDB appends the primary key columns to every secondary index on the score table. While that's sometimes desirable, in this case it makes the indexes larger than I want. Adding the rank_in_game column increased the data and index size from about 78MiB to around 102MiB.
The rank table should have a primary key on score. Without the redundant data, it will be smaller than the index on the ranking column in the examples above. This lets you update less data. It also makes it easier to do some calculations; you don't need to code special cases for rows that have the same score.
You could also maintain a count column that records how many times a particular score is duplicated in the score table. This column can help you avoid COUNT() queries on the score and rank tables; they become SUM() queries on the rank table instead. This might be a significant saving if you have many duplicated scores. In fact, you can do even better: you can maintain those running sums in the table too, so for any given score you can see its rank, how many times it occurs in the score table, and how many scores come before it.
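The bookkeeping such a rank table needs, dense rank plus duplicate counts plus running totals, can be sketched as follows (Python; the exact table layout is my assumption based on the description above):

```python
from collections import Counter

def build_rank_table(scores):
    """Build rank-table rows for one game's raw scores.

    Returns (score, rank, count, num_higher) tuples in descending
    score order: dense rank, how many rows share the score, and how
    many rows score strictly higher (a running sum of the counts).
    """
    counts = Counter(scores)
    rows = []
    num_higher = 0
    for rank, score in enumerate(sorted(counts, reverse=True), start=1):
        rows.append((score, rank, counts[score], num_higher))
        num_higher += counts[score]
    return rows
```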
I would optimize queries and maintenance on this table with many of the same techniques I explained.
Smart caching can solve many problems, but MySQL's query cache won't help much for this workload. Updating a row invalidates every cached query that uses the table. This is likely to cause more work than it saves, and use memory that could otherwise cache real data. I have seen a significant performance increase from disabling the query cache in such situations.
In this article I've shown some ways to optimize ranking in MySQL so you can efficiently generate ranked, paginated views of a dataset. This is generally a very hard problem to solve. The denormalizations I showed have some maintenance overhead, but if you have a high ratio of reads to writes, they may be worth it. Indeed, my benchmarks showed that even simple designs produce constant-time queries that outperform traditional linear-time queries even with concurrent updates, and the benefits grow as the data size increases. The alternative strategies I mentioned will probably be even more efficient than the designs I benchmarked.
This approach might help a lot on some workloads, perhaps not at all on others, but I hope I've given you some good ideas on how to optimize rank data.
Please leave your thoughts in the comments. I'm looking forward to learning how you handle this kind of problem yourself. You can also download the example code for this article.
Baron Schwartz is a software engineer who lives in Charlottesville, Virginia and goes by the online handle of "Xaprb," which is his first name typed in QWERTY on a Dvorak keyboard. He blogs about software engineering at http://www.xaprb.com/blog/.
Copyright © 2009 O'Reilly Media, Inc.