Speed up PHP scripts with asynchronous database queries

October 11, 2010October 12, 2010 ~ Gonzalo Ayuso ~ 32 Comments

That’s the situation. A web application with 4 heavy queries. Yes I know you can use a cache or things like that but imagine you must perform the four queries yes or yes. As I say before the four database queries are heavy ones. 2 seconds per one. Do it sequentially, that’s means our script will use at least 8 seconds (without adding the extra PHP time). Eight seconds are a huge time in a page load.

So here I will show up a technique to speed up our website using asynchronous calls.

One possible solution is fork each query into a separate thread but, does PHP support threads? The short is answer No. OK that’s not 100% true. We can use PCNTL functions but it’s a bit tricky for me to use them. I prefer to use this another solution to create fork and perform asynchronous operations.

The basic idea is isolate each query into a separate script.

// fork.php
$dbh = new PDO('pgsql:dbname=pg1;host=localhost', 'nov', 'nov');
$dbh-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$t = filter_input(INPUT_GET, 't', FILTER_SANITIZE_STRING);
switch ($t) {
    case 0;
    case 1;
    case 2;
       $sql = &quot;select * from test.tbl1&quot;;
       break;
}
$stmt = $dbh-&gt;prepare($sql);
$stmt-&gt;execute();
$data = $stmt-&gt;fetchAll();
echo json_encode($data);

And now we are going to use the curl‘s multi functions to execute the database queries at the same time:

// index.php
class Fork
{
    private $_handles = array();
    private $_mh      = array();

    function __construct()
    {
        $this-&gt;_mh = curl_multi_init();
    }

    function add($url)
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_multi_add_handle($this-&gt;_mh, $ch);
        $this-&gt;_handles[] = $ch;
        return $this;
    }

    function run()
    {
        $running=null;
        do {
            curl_multi_exec($this-&gt;_mh, $running);
            usleep (250000);
        } while ($running &gt; 0);
        for($i=0; $i &lt; count($this-&gt;_handles); $i++) {
            $out = curl_multi_getcontent($this-&gt;_handles[$i]);
            $data[$i] = json_decode($out);
            curl_multi_remove_handle($this-&gt;_mh, $this-&gt;_handles[$i]);
        }
        curl_multi_close($this-&gt;_mh);
        return $data;
    }
}

$fork = new Fork;

$output = $fork-&gt;add(&quot;http://localhost/fork.php?t=0&quot;)
    -&gt;add(&quot;http://localhost/fork.php?t=1&quot;)
    -&gt;add(&quot;http://localhost/fork.php?t=2&quot;)
    -&gt;run();

echo &quot;&lt;pre&gt;&quot;;
print_r($output);
echo &quot;&lt;/pre&gt;&quot;;

And that’s all. Four queries will be executed at the same time and our scrip will use 2 seconds (the slowest query) instead of 8 seconds. But there’s a little inconvenient within this solution. We need to open 4 connections instead of one because of the lack of connection pooling in PHP. Another problem with this technique appears if we need to debug our script. The debugger won’t debug our “forks” at the same time (we will need to debug then separately). Because of that we need to balance the usage of this technique or not, but it’s another weapon we have to improve our scripts.

Performance analysis using bind parameters with PDO and PHP.

October 5, 2010 ~ Gonzalo Ayuso ~ 17 Comments

Some months ago a work mate asked me for the differences between using bind variables versus executing the SQL statement directly as a string throughout a PDO connection.

Basically the work-flow of almost all database drivers is the same: Prepare statement, execute and fetch results. We have the following small example with a simple update

$dbh = new PDO('pgsql:dbname=pg1;host=localhost', 'user', 'pass');
$dbh-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh-&gt;beginTransaction();
$stmt = $dbh-&gt;prepare(&quot;UPDATE test.tbl1 set field1=:F1 where id=1&quot;);
$stmt-&gt;execute(array('F1' =&gt; $field1));
$dbh-&gt;commit();

And we also can get the same result with the following code:

$dbh = new PDO('pgsql:dbname=pg1;host=localhost', 'user', 'pass');
$dbh-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh-&gt;beginTransaction();
$dbh-&gt;prepare(&quot;UPDATE test.tbl1 set field1='{$field1}' where id=1&quot;)-&gt;execute();
$dbh-&gt;commit();

What’s the best one? Both method work properly. The difference is how databases manage the operation internally. When we prepare a statement we are compiling the string into the database within the current connection. After that we can execute the statement and if our statement is ready to receive parameters we can bind those parameters with PHP values or variables. With this idea in mind we can guess that if we need to perform several executions of the same prepared statement is better to use bind parameters instead of compile and execute directly the string.
b
I have created a simple benchmark to show it.

&lt;?php
error_reporting(-1);
function microtime_float()
{
   list($usec, $sec) = explode(&quot; &quot;, microtime());
   return ((float)$usec + (float)$sec);
}

$dbh = new PDO('pgsql:dbname=pg1;host=localhost', 'user', 'pass');
$dbh-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$time_start = microtime_float();
$dbh-&gt;beginTransaction();
$field1 = uniqid();
$stmt = $dbh-&gt;prepare(&quot;UPDATE test.tbl1 set field1=:F1 where id=1&quot;);
$stmt-&gt;execute(array('F1' =&gt; $field1));
$dbh-&gt;commit();
$time_end = microtime_float();
$time = round($time_end - $time_start, 4);
echo &quot;&lt;p&gt;Single UPDATE with bind parameters: $time seconds&lt;/p&gt;&quot;;

$time_start = microtime_float();
$dbh-&gt;beginTransaction();
$field1 = uniqid();
$stmt = $dbh-&gt;prepare(&quot;UPDATE test.tbl1 set field1='{$field1}' where id=1&quot;);
$stmt-&gt;execute();
$dbh-&gt;commit();
$time_end = microtime_float();
$time = round($time_end - $time_start, 4);
echo &quot;&lt;p&gt;Single UPDATE without bind parameters: $time seconds&lt;/p&gt;&quot;;

$time_start = microtime_float();
$dbh-&gt;beginTransaction();
$stmt = $dbh-&gt;prepare('UPDATE test.tbl1 set field1=:F1 where id=1');
foreach (range(1, 5000, 1) as $i) {
   $field1 = $i;
   $stmt-&gt;execute(array('F1' =&gt; $field1));
}
$dbh-&gt;commit();
$time_end = microtime_float();
$time = round($time_end - $time_start, 4);
echo &quot;&lt;p&gt;Multiple UPDATE with bind parameters: $time seconds&lt;/p&gt;&quot;;

$time_start = microtime_float();
$dbh-&gt;beginTransaction();
foreach (range(1, 5000, 1) as $i) {
	$stmt = $dbh-&gt;prepare(&quot;UPDATE test.tbl1 set field1='{$field1}' where id=1&quot;);
   $field1 = $i;
   $stmt-&gt;execute();
}
$dbh-&gt;commit();
$time_end = microtime_float();
$time = round($time_end - $time_start, 4);
echo &quot;&lt;p&gt;Multiple UPDATE without bind parameters: $time seconds&lt;/p&gt;&quot;;

The output of this benchmark is the following one:

Single UPDATE with bind parameters: 0.2623 seconds
Single UPDATE without bind parameters: 0.0195 seconds
Multiple UPDATE with bind parameters: 4.1123 seconds
Multiple UPDATE without bind parameters: 8.1732 seconds

As we can see in the output of the benchmark a single update is slightly faster without bind any parameter but if we need to execute the same update with different parameters the bind parameters way is significantly faster than parse+execute again and again.

There’s another benefit of using bind parameters. Databases normally have internally a cache system for our prepared statement. In theory they reuse the statements. The main problem for us (in our PHP world) is that normally we create a new connection to the database at the beginning of the script (or lazy connection) and close it at the end. We don’t have natively a connection pooling like J2EE. So I’m not 100% sure if using bind parameters helps to the database to reuse statements.

There’s another issue we must take into account. Without prepared statements the multiple update example may throw a database exception (depend on our database configuration). Too many open cursors in active transaction. That error appear because for our database each update is a different statement (5000 in the example) and with the another method there’s only one.

Live video streaming with PHP

September 20, 2010November 29, 2010 ~ Gonzalo Ayuso ~ 49 Comments

Picture this. We want to stream live video. In fact don’t need PHP. We only need a flash player (or HTML5) and our live feed. The problem appear when we need to offer some kind of security. For example we want to show videos only to registered users based on our authentication system. Imagine we’re using sessions for validate users. That’s means we cannot put the media in a public folder and point our media player to those files. We can obfuscate the file name but it’ll remain public. In this small tutorial We’re going to see how to implement it with PHP. Let’s start.

First we are going to forget for a while video files. We are going to think in pictures.

If we don’t need to protect the file we can place the image in public folder:

&lt;image href=’/path/to/image.png’ alt=’image’ width=’10’ height=’10’/&gt;

But if we want to protect the picture we cannot place it on a public folder. We need to place it on a server folder and flush it with a code like this:

// here you can chech your sessions
header(&quot;Content-Type: image/jpeg&quot;);
readfile(‘/path/of/image’);

That’s easy. Isn’t it?.
Video files are similar than pictures. We cannot use image tag but the behaviour is the same. With HTML5 we can show them with HTML tags and without HTML5 we need to use a client viewer (generally a flash viewer).

We can use the same technique than the previous example with video files but the problem now is the size of the file. Live video feeds are never ends. If we use the previous code our server must read the live feed forever, and our browser will be waiting forever too. We can easily change this behaviour. Instead to read the entire video file and flush it to the browser we can read small pieces of the file and flush them without waiting until the end of the file.

&lt;?php
// here you can chech your sessions

function flush_buffers(){
    ob_end_flush();
    ob_flush();
    flush();
    ob_start();
}

$filename = &quot;/path/of/fideofeed.flv&quot;;
if (is_file($filename)) {
    header('Content-Type: video/x-flv');
    header(&quot;Content-Disposition: attachment; filename=video.flv&quot;);
    $fd = fopen($filename, &quot;r&quot;);
    while(!feof($fd)) {
        echo fread($fd, 1024 * 5);
        flush_buffers();
        }
    fclose ($fd);
    exit();
}

If we want to stream static video files we also need to take into account the start of the video. When someone want to move across the video going forward and backward within the video client, our server script will retrieve a variable with the start time. That’s means we need to use this start time to open the file from this byte instead from the beggining.