Performance analysis fetching data with PDO and PHP.


Fetching data from databases is a common operation in our work as developers. There are many drivers (normally I use PDO), but the usage of all of them are similar and switch from one to another is not difficult (they almost share the same interface). In this post I will focus on fetching data. Basically we’ve got two functions: fetch and fetchAll. I’ve created two examples. One with fetch and another one with fetchAll:

// Example with fetch
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'username', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('SELECT * FROM tableName limit 10000');
$stmt->execute();

$i=0;
while ($row = $stmt->fetch()) {
	$i++;
}
echo '
<h1>fetch()</h1>
';
echo '
<strong>{$i} </strong>

';
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));
// Example with fetchAll
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'username', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('SELECT * FROM tableName limit 10000');
$stmt->execute();

$i=0;
$data = $stmt->fetchAll();
foreach ($data as $row) {
	$i++;
}

echo '
<h1>fetchAll()</h1>
';
echo '
<strong>{$i}</strong>

';
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

if we execute the test we obtain:

fetchAll: [memory] => 31.305999755859
fetch: [memory] => 0.002532958984375

OK. It’s obvious. If we approach to the data extraction with fetchAll method we will use more memory. That’s because we’re mapping the whole recorded to a variable ($data) at once. With the fetch loop we are mapping only on row per iteration. By the way if we change the fetch loop to:

$data = array();
while ($row = $stmt->fetch()) {
	$i++;
	$data[] = $row;
}

We will use almost the same amount of memory than the fetchAll method
[memory] => 31.267543792725

Conclusion:
Is it better fetch than fetchAll? The answer is simple: No. We only need to take care what are we doing and use the best solution that fix to our need. If we’re handling small recordset, they’re similar, but if we work with big ones we need to realize that the memory usage we are using changes drastically if we use one method or another.

17 thoughts on “Performance analysis fetching data with PDO and PHP.

  1. Hi Gonzalo,
    First of all, nice post. Although for advanced developers it’s obvious, for others less-advanced it might not be. It is however entirely logical.

    Still, I would love to see an article about performance comparisons between the use of objects and arrays (associative or not) for data manipulation.
    IMHO, OOP is excellent because allows us to develop (web) applications more maintainable, portable and easily re-usable. However, performance it’s not exactly one of it’s benefits.

    Arrays are, in a more natural and native way, a more faster and less resource-needed way to store data in memory and to manipulate it. Not so easy as the object-oriented way.

    Hugz.

    1. txs. In fact I wrote this article because I’ve seen a lot of times fetchAll instead of fetch within projects without any reason. If we need to use it, it’s perfect but we always must bear in mind the significant memory usage differences. Probably the outcome is too obvious. BTW, I write your suggestion 😉

    1. I’ve omitted speed because it’s almost the same. The main important difference is the memory usage, and I’ve focussed on it

    1. The difference between both methods is only visible with big datasets, indeed. fetchAll is more comfortable (at least for me) but we need to take into account the big memory usage with big datasets.

      Do you flush output to the browser? Another possible problem is the Select statement. (if you use sub-queries within each the performance will be penalized). Another thing may be the latency between web-server and database server.

  2. Hi Gonzalo,

    I found that fetch() performs faster that fetchAll() with large set of data.With 1K of data fetch() took .1sec to create form while fetchAll took .8 sec.Can you please let me know the reason.

    1. I wrote this post to explain this issue. Read the second paragraph. It’s problably because the memory that PHP need to create the variable.

  3. Hi, nice shot, but I want to implement it on PDOStatement::execute(). How we can implement this “telemetry” on that? I tried to use __call(), but it don’t works with public methods.

    So, any ideas?

    Hugs

      1. My idea is implement your techniques to monitor every single call for PDOStatement::execute();

        This way, we can monitor EVERY single call to this method automaticaly. So, we don’t need to write your code for every call to PDOStatement::execute(). Get it?

        My initial idea was to use “magic methods”, but unfortunately PHP don’t have one magic method what intercepts every single call.

        http://www.php.net/manual/es/language.oop5.magic.php

        What I wanna do is something like:
        info = array(‘memory’ => (memory_get_usage() – $mem) / (1024 * 1024), ‘seconds’ => microtime(TRUE) – $time);

        } else {

        // execute \PDOStatement::__call() for another methods
        parent::__call($name, $arguments);

        }
        }
        }

        // with this class implementation, it would be possible:

        $pdo = new \PDO($dsn, $user, $password);

        // it need to return MyPDOStatement instead PDOStatement
        $sth = $pdo->prepare(‘my big heavy SQL here’);
        $sth->execute(array(‘param’ => ‘value’));

        // here we get our memory and time info
        echo ‘

         . 'print_r($sth->info, true) . '

        ;

        ?>

        tks.

  4. I don’t understand the implications of this post. It seems kind of obvious, and its not the fact that fetchAll uses more memory that is the key thing here.

    What fetchAll is doing is loading all of the values (10,000) to a single array in memory.

    (total variables = 10,000)

    With the regular fetch() statement, your loading each fetch into the same variable. At the end of the application, you are left with only the one variable, the last fetch statement.

    (total variables = 1)

    Basically this entire article can be summed up to “fetch only loads one variable, and fetchAll() loads all the data. More Data = More Memory.”

    1. Yes. When I wrote this post I wanted to measure it. I wanted to show with numbers the big difference between fetch and fetchAll (especially with big datasets)

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.