Which matters more: the number of queries, or the quality of queries? Would running 10,000 small queries, each grabbing one row, be faster than running one query to grab 10,000 rows, and then selecting the rows when they are needed?
Two Approaches
I have developed an easy way to “cache” the result of MySQL queries in PHP. Here is the code:
class method_one {
var $cache;
function retrieve( $id ) {
if( !isset( $this->cache ) ) {
$this->cache = array();
$query = "SELECT * FROM speed_test";
$result = mysql_query( $query );
while( $row = mysql_fetch_array( $result ) ) {
$this->cache[ $row['id'] ] = $row;
}
}
return $this->cache[ $id ][ 'rand_value' ];
}
}
As you can see, the class contains the method retrieve();. The method retrieve(); creates a cache array by grabbing all of the rows in the database and sticking them in an array. This is the big query that I spoke of earlier. It then returns the part of the array that is being asked for. This is fairly limited in the fact that you can only select by one key, which in this case is id. This method also requires more memory than the second. Since PHP is holding the result in an array, it uses RAM.
Now, we will go over a method with similar results but a totally different approach.
class method_two {
function retrieve( $id ) {
$query = "SELECT * FROM table WHERE id = {$id} LIMIT 1";
$result = mysql_query( $query );
$row = mysql_fetch_array( $result );
return $row[ 'value' ];
}
}
This class has a method, also named retrieve();, that simply queries the database for the row. Although the method that I wrote is very simple, the approach is very dynamic. You could write a method similar to this that can run any query on the database. This approach is the small query approach.
Which is faster?
I will begin testing on a database with 5000 rows. Each approach will be asked for a specific number of rows. These rows will not be displayed in any way. The code for testing is as follows:
$number = 0 // this is the number of rows that the method is being asked for
$start_time = microtime_float();
for( $i = 0; $i <= $number; $i++ ) {
$class->retrieve( rand( 0, 5000 ) );
}
$stop_time = microtime_float();
Test Results
- 5000 results
method one: 0.0269620418549 seconds
method two: 0.245247840881 seconds
winner: first method
The first method is the clear winner here. It went up to 10x faster than the second method, and made 1/5000th of the amount of queries that the second method made. - 1000 results
method one: 0.0193378925323 seconds
method two: 0.0494480133057 seconds
winner: first method
Once again, the first method is the winner. This time the second method performed considerably faster than in the first trial run, but the first method was still 2x faster than the second. - 500 results
method one: 0.0175378322601 seconds
method two: 0.0243937969208 seconds
winner: first method
The first method wins again, but this time by only a little. In a more realistic situation, where the database does not contain 5000 keys, the first method would have done a little better. - 100 results
method one: 0.016597032547 seconds
method two: 0.0049729347229 seconds
winner: second method
The second method was faster this time. It went 3x as fast as the first method did. - 10 results
method one: 0.0167009830475 seconds
method two: 0.000600099563599 seconds
winner: second method
The second method is significantly faster than the first this time. This is probably the most realistic test of them all – usually you aren’t pulling thousands and thousands of row out of the database. A number under 100 is more accurate to the average number of rows on a normal pageload.
Looking Back
The tests that I have done were in a very controlled environment. There were 5000 rows in the table that we were drawing from, and this never changed. In a real situation the number of rows would, in most cases, grow with time. In conclusion, the first method is a lot faster if you are drawing lots and lots of rows from the database. As you draw less and less rows, the times become closer, until the second method is faster. The point where the second method becomes faster all depends on the number of rows in the database. The second method would be faster using a database with a lot of rows, the first better with a database without many rows. It all depends on the situation. To be honest, it doesn’t really matter what method you use. Switching methods could add, or shave off, a fraction of a second from your script execution time. In the scheme of things, this isn’t much at all.
Questions or comments? Please leave a comment on my post. I am looking forward to reading your comments and helping you.