To ensure your test cases call efficient MySQL

  def test_my_case
    assert_efficient_sql do

      # just wrap them in this block!

    end
  end

The assertion intercepts and copies out your MySQL SELECT statements, then calls EXPLAIN on each one, and inspects the results for common problems.

The goal is test cases that resist database pessimization, even as you change your data relations, to add new features. If you run your tests after every few changes, you can easily detect which change broke your database’s indices and relations.

This article is a reference for this assertion’s options. The techniques should be ported to any database with an EXPLAIN or similar system.

Credits to Jay Pipe for his excellent little survey of ways to detect pessimizations!

Install

piston is the best way to install this assertion:

  piston import svn://rubyforge.org/var/svn/efficient-sql/ \
                     vendor/plugins/efficient-sql

The documented Rails way is the worst:

  script/plugin install -x svn://rubyforge.org/var/svn/efficient-sql

The -x option bonds your Subversion repository to rubyforge.org. Each time you svn update your own code, you will also automatically get the latest version of the plugin, whether you need it or not. piston avoids these fun issues. You can update on command, with piston update

Diagnose

If your SQL is already efficient, use :verbose to diagnose why it‘s efficient:

  def test_verbose
    assert_stdout /select_type/ do
      assert_efficient_sql :verbose do
        Foo.find_by_id(42)
      end
    end
  end

It prints a table like this to STDOUT:

 query for SELECT * FROM foos WHERE (foos.`id` = 42)  LIMIT 1

 Foo Load
    select_type | key_len | type  | id | Extra | possible_keys | table | rows | ref   | key
    -------------------------------------------------------------------------------------------
    SIMPLE      | 4       | const | 1  |       | PRIMARY       | foos  | 1    | const | PRIMARY

Reading a table‘s primary key is naturally efficient, so the type field is a healthy const.

See the MySQL document Optimizing Queries with EXPLAIN to learn what those results mean.

Optimize

This test shows assert_efficient_sql failing:

  def test_assert_inefficient_sql
    assert_flunked /Pessimistic.*
                    full.table.scan.*
                    Foo.Load/mx do
      assert_efficient_sql do
        Foo.find_by_sql('select * from foos a')
      end
    end
  end

Its failure would look like this:

    1) Failure:
  test_assert_inefficient_sql(AssertEfficientSqlTest)
      [./lib/assert_efficient_sql.rb:220:in `analyze_efficiency'
       ./test/assert_efficient_sql_test.rb:67:in `test_assert_inefficient_sql']:
  Pessimistic
  query for select * from foos a
  full table scan
  Foo Load
     select_type | key_len | type | id | Extra | possible_keys | table | rows | ref | key
     ------------------------------------------------------------------------------------
     SIMPLE      |         | ALL  | 1  |       |               | a     | 43   |     |    .

  7 tests, 9 assertions, 1 failures, 0 errors

Most queries should not blindly read ALL rows in a database table.

If that query used a WHERE condition that selects fewer records, but if EXPLAIN produced a query type of ALL, this would indicate MySQL might read the entire database table just to apply the WHERE. The best fix, in that situation, is to add an index (a MySQL "KEY") to the table, and put the fields from that WHERE condition into it.

That technique allows MySQL to read the index first, then chop directly thru the table to the records you need.

Pessimize

Sometimes you need an ALL, even while other assert_efficient_sql checks must pass. To positively declare we like ALL, pass in :ALL => true.

  def test_assert_all
    assert_efficient_sql :ALL => true do
      Foo.find(:all)
    end
  end

Assertions use positive reinforcement; they state conditions we like. They don‘t try to deny conditions we don‘t like. If :ALL => false meant "false to warn about ALL", the assertion would confuse developers. So we use :ALL => true to allow queries to scan all rows.

Options

If assert_efficient_sql (generally) dislikes your arguments, it will print out its default options, each with an explanation.

  def test_help
    assert_stdout /invalid.*argument.*
                   verbose.*=\>.*false/mx do
      assert_efficient_sql(:help){}
    end  
  end

Nesting

You can also nest the assertion, to provide different options for different blocks. The assertion allows this because your test might also have some other reason to use blocks.

  def test_nest
    outer_result = assert_efficient_sql do
      inner_result = assert_efficient_sql :ALL => true do
        Foo.find(:all)
      end
      
      assert_no_match /where/i, inner_result[0][0]
      Foo.find(42)
    end
    assert_match /where/i, outer_result[0][0]
  end

Using filesort

If your WHERE and ORDER clauses are too complex, MySQL might need to write a file (or worse), just to satisfy a query. assert_efficient_sql detects this pernicious situation:

  def test_prevent_filesorts
    _exec %[ CREATE TABLE `t1` (
               `a` int(11) NOT NULL DEFAULT '0',
               `b` blob NOT NULL,
               `c` text NOT NULL,
               PRIMARY KEY (`a`,`b`(255),`c`(255)),
               KEY `t1ba` (`b`(10),`a`)
             ) ENGINE=InnoDB ]
    
    assert_flunked /Using.filesort/ do
      assert_efficient_sql do
        Foo.find_by_sql('SELECT a FROM t1 ORDER BY b')
      end
    end
  ensure
    _exec 'drop table t1'
  end

If you want to use the assertion without faulting on filesorts, pass the option :Using_filesort => true.

Throttle

One common pessimization is a query that reads thousands of rows just to return a few. assert_efficient_sql counts the rows hit in each phase of an SQL SELECT, and faults if any row count exceeds 1,000.

Adjust this count with :throttle => 42.

  def test_throttle
    101.times{|x|  Foo.create :name => "foo_#{ x }"  }

    assert_flunked /Pessimistic.*
                    more.than.*100.*
                    Foo.Load/mx do
      assert_efficient_sql :throttle => 100, :ALL => true do
        Foo.find(:all)
      end
    end
  end

SHOW SESSION STATUS

assert_efficient_sql calls SHOW SESSION STATUS before and after its sampled block. If you are seeking an advanced pessimization, such as Created_tmp_disk_tables, pass :Created_tmp_disk_tables => 0. The assertion will compare difference in STATUS before and after calling its block. A difference greater than the allowed difference will trigger a fault.

To test this, we simply detect a STATUS variable which is not a warning.

  def test_declare_futile_war_on_Innodb_rows_read
    assert_flunked /just.for.test.*
                    Innodb_rows_read/mx do
      assert_efficient_sql :diagnostic => 'just for test!',
                           :Innodb_rows_read => 0 do
        Foo.find(:all)
      end
    end
  end

Inconclusion

assert_inefficient_sql is available at http://efficient-sql.rubyforge.org/. Watch that space (and this) for more tuning and pessimization detections.