Emulating Analytic (AKA Ranking) Functions with MySQL
by Stephane Faroult03/29/2007
One of the most hailed extensions brought to SQL in recent years has been these functions that Oracle calls analytic functions, DB2 calls OLAP functions, and SQL Server 2005 calls ranking functions--but which MySQL, so far, still lacks. The good news is that they can be (relatively) easily and efficiently emulated.
Oracle introduced these functions early (with version 8.1.6, back in 1999 or about) and still offers a greater number of such functions than the other major players (who usually provide the most basic functions, from which the others can be derived.) Therefore, I will take my references from Oracle, and will show how you can, in most cases, obtain the same result with MySQL.
What Are Analytic Functions?
Anyone who has dabbled with SQL is familiar with aggregate functions such as SUM() or COUNT(). But as their name implies, when you use these functions and perform a group by, the detail is lost in the aggregate. Very simply, analytic functions allow you to return the detail (simple, unaggregated rows) and at the same time, on the very same row, a result that is computed from operations on several rows that are related in one way or another to the current one. Let me give an example. Suppose we have the unavoidable Oracle EMP table:
SQL> desc emp
Name Null? Type
----------------------------------------- -------- ----------------
EMPNO NOT NULL NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)
We can easily compute with a group by the total salary amount, department by department:
SQL> select deptno, sum(sal)
2 from emp
3 group by deptno;
DEPTNO SUM(SAL)
---------- ----------
30 9400
20 10875
10 8750
By doing so, we lose the detail. However, if we use SUM() in an analytical way, we can return each row, and compute on the same row the total salary amount, specifying in the over clause that the SUM() function actually applies to all rows that refer to the same department number as the current one:
SQL> select deptno, ename, sal, sum(sal) over (partition by deptno)
2 from emp
3 order by 1, 3
4 /
DEPTNO ENAME SAL SUM(SAL)OVER(PARTITIONBYDEPTNO)
---------- ---------- ---------- -------------------------------
10 MILLER 1300 8750
10 CLARK 2450 8750
10 KING 5000 8750
20 SMITH 800 10875
20 ADAMS 1100 10875
20 JONES 2975 10875
20 SCOTT 3000 10875
20 FORD 3000 10875
30 JAMES 950 9400
30 MARTIN 1250 9400
30 WARD 1250 9400
30 TURNER 1500 9400
30 ALLEN 1600 9400
30 BLAKE 2850 9400
14 rows selected.
SQL>
Such a result can be very useful in computing, for instance, what percentage of the salary mass of a department does the Pointy Haired Boss's salary represent? Needless to say, all we need to do with MySQL is to join a regular select to the aggregate to obtain a similar result:
mysql> select a.DEPTNO, a.ENAME, a.SAL, b.TOTSAL
-> from EMP as a
-> inner join (select DEPTNO, sum(SAL) TOTSAL
-> from EMP
-> group by DEPTNO) as b
-> on a.DEPTNO = b.DEPTNO
-> order by 1, 3;
+--------+--------+------+--------+
| DEPTNO | ENAME | SAL | TOTSAL |
+--------+--------+------+--------+
| 10 | MILLER | 1300 | 8750 |
| 10 | CLARK | 2450 | 8750 |
| 10 | KING | 5000 | 8750 |
| 20 | SMITH | 800 | 10875 |
| 20 | ADAMS | 1100 | 10875 |
| 20 | JONES | 2975 | 10875 |
| 20 | SCOTT | 3000 | 10875 |
| 20 | FORD | 3000 | 10875 |
| 30 | JAMES | 950 | 9400 |
| 30 | WARD | 1250 | 9400 |
| 30 | MARTIN | 1250 | 9400 |
| 30 | TURNER | 1500 | 9400 |
| 30 | ALLEN | 1600 | 9400 |
| 30 | BLAKE | 2850 | 9400 |
+--------+--------+------+--------+
14 rows in set (0.00 sec)
mysql>






