Speed up PHP scripts with asynchronous database queries


Here’s the situation: a web application with four heavy queries. Yes, I know you can use a cache or something similar, but imagine the four queries really must be executed. As I said, they are heavy ones, about 2 seconds each. Run sequentially, that means our script will take at least 8 seconds (before adding any extra PHP time). Eight seconds is a huge amount of time for a page load.

So here I will show a technique to speed up our website using asynchronous calls.

One possible solution is to fork each query into a separate thread, but does PHP support threads? The short answer is no. OK, that’s not 100% true: we can use the PCNTL functions, but I find them a bit tricky to work with. I prefer another way of forking work and performing asynchronous operations.
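For reference, a minimal sketch of that PCNTL approach could look like the following (just a sketch, assuming the pcntl extension is available and the script runs from the CLI; runQuery() is a hypothetical helper that opens its own connection and executes one of the heavy queries):

// pcntl-fork.php (CLI only; requires the pcntl extension)
$pids = array();
for ($i = 0; $i < 3; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die('could not fork');
    } elseif ($pid == 0) {
        // Child process: run one query and exit.
        runQuery($i); // hypothetical helper, one connection per child
        exit(0);
    }
    // Parent process: remember the child's pid and keep forking.
    $pids[] = $pid;
}
// Wait for every child to finish.
foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);
}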

The basic idea is to isolate each query into a separate script:

// fork.php
$dbh = new PDO('pgsql:dbname=pg1;host=localhost', 'nov', 'nov');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Each value of t selects one of the heavy queries
// (they all run the same SQL here just to keep the example short).
$t = filter_input(INPUT_GET, 't', FILTER_VALIDATE_INT);
switch ($t) {
    case 0:
    case 1:
    case 2:
        $sql = "select * from test.tbl1";
        break;
    default:
        exit(json_encode(array()));
}
$stmt = $dbh->prepare($sql);
$stmt->execute();
$data = $stmt->fetchAll();
echo json_encode($data);

And now we are going to use curl's multi functions to execute the database queries at the same time:

// index.php
class Fork
{
    private $_handles = array();
    private $_mh;

    function __construct()
    {
        $this->_mh = curl_multi_init();
    }

    function add($url)
    {
        // One curl handle per fork script.
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_multi_add_handle($this->_mh, $ch);
        $this->_handles[] = $ch;
        return $this;
    }

    function run()
    {
        // Execute all the handles in parallel and wait until they finish.
        $running = null;
        do {
            curl_multi_exec($this->_mh, $running);
            usleep(250000);
        } while ($running > 0);

        // Collect and decode the output of every fork.
        $data = array();
        for ($i = 0; $i < count($this->_handles); $i++) {
            $out = curl_multi_getcontent($this->_handles[$i]);
            $data[$i] = json_decode($out);
            curl_multi_remove_handle($this->_mh, $this->_handles[$i]);
        }
        curl_multi_close($this->_mh);
        return $data;
    }
}

$fork = new Fork;

$output = $fork->add("http://localhost/fork.php?t=0")
    ->add("http://localhost/fork.php?t=1")
    ->add("http://localhost/fork.php?t=2")
    ->run();

echo "<pre>";
print_r($output);
echo "</pre>";

And that’s all. The queries will be executed at the same time, so our script will take about 2 seconds (the slowest query) instead of 8. But there is a small inconvenience with this solution: we need to open one database connection per fork instead of a single one, because of the lack of connection pooling in PHP. Another problem appears when we need to debug the script: the debugger won’t step through our “forks” at the same time (we will need to debug them separately). Because of that we need to weigh whether this technique is worth using in each case, but it’s another weapon we have for improving our scripts.


About Gonzalo Ayuso

Web Architect specialized in Open Source technologies. PHP, Python, JQuery, Dojo, PostgreSQL, CouchDB and node.js but always learning.

Posted on October 11, 2010, in databases, php, PostgreSQL, Technology, tips.

  1. Very clever move. IMHO it sounds like a good approach for database tasks/jobs (e.g. backups, purging) instead of simple transactions.

    Regards,
    George

    • Yes, but if those jobs/tasks are not 100% necessary within the execution of the script (from the user's point of view), it's better to move them out of it. I have a new post (still only in my mind :) ) about that kind of thing.

  2. Nice trick!

    However there’s an error in the code on line 31.

  3. It feels like you should optimize your queries; two seconds is a long time for a single query. Another thing I'm wondering about: do you really need the return value of those scripts? If you don't, you could write the queries to a file and execute them later and/or asynchronously in some other way.

    • 1) It depends on the database and the query you are executing. Some data-mining queries can take seconds, even optimized ones. It also depends on the database location: retrieving data from localhost is not the same as doing it from a database in another location (we need to consider network speed). Of course this method is not good for fast queries (it can even be worse).

      2) I return data in those scripts because I need the information online. Imagine I want to populate a datagrid with this information when the user clicks a button. We can also use another communication technique instead of JSON encode/decode, for example memcached (write in the fork and read in the main script). If I do what you said and execute the queries offline (save them and run them later), I need an event or something like that to populate the datagrid, and then we are speaking about Comet or web workers. That's another approach, but definitely a more complex solution.
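      Just to illustrate, a rough sketch of the memcached variant (assuming the pecl/memcached extension and a memcached server on localhost; the key name is invented for the example):

      // In fork.php, instead of echoing JSON:
      $m = new Memcached();
      $m->addServer('localhost', 11211);
      $m->set('fork_result_' . $t, $data, 60); // keep the result for 60 seconds

      // In the main script, once the forks have finished:
      $m = new Memcached();
      $m->addServer('localhost', 11211);
      $result = $m->get('fork_result_0');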

  4. Well,

    That was a smart solution. I see the problem of not being able to use a connection pool to manage them. But my concern is about security.

    Do you use some kind of hash through the GET parameter or some other kind of security measure?

    Cheers,
    Lucas

    • The lack of a real pool in PHP is a problem indeed. You can use SQL Relay but I don’t really like it.

      BTW your security point is a good one. To solve the issue I like to use a separate virtualhost for the “forks”, only available from localhost. That means only the server itself can reach it. With this setup I trust the localhost connections (the server acting as a client via curl) and browser clients can never reach it. I didn't add this to the post because my aim was to show only the idea of threads with curl.
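      If a separate virtualhost is not an option, a cruder check inside fork.php itself could look something like this (just a sketch; it assumes the forks are always requested from the same machine and that no proxy rewrites REMOTE_ADDR):

      // At the top of fork.php: only accept requests from the local machine.
      $allowed = array('127.0.0.1', '::1');
      if (!in_array($_SERVER['REMOTE_ADDR'], $allowed, true)) {
          header('HTTP/1.1 403 Forbidden');
          exit;
      }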

      • That's a good solution, mate.

        I would also use, just as a precaution, a hash or a number combination (with a check digit) that both sides knew, so an attacker would need access to both sources to be a threat.
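        Something along these lines could do it (only a sketch; the shared secret and the sig parameter are invented for the example):

        // index.php: sign each request before handing it to the Fork class.
        $secret = 'shared-secret-known-by-both-scripts';
        $t      = 0;
        $sig    = hash_hmac('sha256', (string) $t, $secret);
        $url    = "http://localhost/fork.php?t=$t&sig=$sig";

        // fork.php: reject requests whose signature does not match.
        $secret = 'shared-secret-known-by-both-scripts';
        $t      = filter_input(INPUT_GET, 't', FILTER_VALIDATE_INT);
        $sig    = (string) filter_input(INPUT_GET, 'sig');
        if ($sig !== hash_hmac('sha256', (string) $t, $secret)) {
            header('HTTP/1.1 403 Forbidden');
            exit;
        }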

  5. I am curious as to why you created your own approach here. Things like Gearman can handle this kind of optimization far better.

    For instance, if you are looking just to offload different types of SQL selects or anything else, have a worker pool that does them and send them through asynchronously.

    This way you are not using cURL and adding an extra HTTP request, which could eventually become a bottleneck on a large enough site.

    • Is it possible to do that with Gearman? As far as I know, with Gearman I can enqueue jobs. I use Gearman when I want to execute jobs offline (like mailing, logs…). But can I execute several jobs at the same time (not through a queue), as we do with the curl multi functions? If that's possible it would be great for me.

  6. Very clever, but hard to implement and debug, especially if you use MVC or other patterns. In the enterprise world, to speed up the DB work you can use a technique called CQRS. The idea is to denormalize your DB so you never need to run 3 complicated, long-lasting queries.
    Back to your article: as I said, it is a very interesting idea.

    • Yes, probably the main problem is debugging. I solve it with a simple trick: I use a flag on the development server to run the queries inline instead of going through curl. It's not the same, but it allows me to debug them.
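      A rough sketch of that flag (the DEV_MODE constant and the runQueries() helper are invented for the example; the real code depends on how the forks are wired up):

      // index.php: in development, skip curl and run the queries inline
      // so the debugger can step through them in the same process.
      define('DEV_MODE', true); // invented flag for the example

      if (DEV_MODE) {
          $output = array();
          foreach (array(0, 1, 2) as $t) {
              $output[$t] = runQueries($t); // sequential, but debuggable
          }
      } else {
          $fork   = new Fork;
          $output = $fork->add("http://localhost/fork.php?t=0")
              ->add("http://localhost/fork.php?t=1")
              ->add("http://localhost/fork.php?t=2")
              ->run();
      }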

  7. Thank you for the information provided….

  8. If you are willing to drop PDO support, PHP's PostgreSQL library has supported asynchronous queries since at least 2002:

    http://www.php.net/manual/en/function.pg-send-query.php
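    For completeness, a minimal sketch of that approach (one dedicated connection per in-flight query, forced with PGSQL_CONNECT_FORCE_NEW; the connection details are the ones from the post's example):

    // async-pg.php: fire the queries and collect the results afterwards.
    $queries = array("select * from test.tbl1", "select * from test.tbl1", "select * from test.tbl1");
    $conns   = array();

    foreach ($queries as $i => $sql) {
        $conns[$i] = pg_connect("host=localhost dbname=pg1 user=nov password=nov", PGSQL_CONNECT_FORCE_NEW);
        pg_send_query($conns[$i], $sql);   // returns immediately, the query runs server-side
    }

    $data = array();
    foreach ($conns as $i => $conn) {
        $result   = pg_get_result($conn);  // blocks until this query has finished
        $data[$i] = pg_fetch_all($result);
        pg_close($conn);
    }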

  9. Async is great, but cURL is still overhead.

    If you want native async calls to MySQL with PHP and PDO you should instead look into the new native MySQL client for PHP 5.3: mysqlnd.

    http://blog.ulf-wendel.de/?p=201
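    As far as I know the asynchronous API that mysqlnd enables is exposed through mysqli rather than PDO, so a rough sketch (connection details invented for the example) could look like this:

    // async-mysqli.php: requires PHP 5.3 with mysqlnd.
    $queries = array("select * from tbl1", "select * from tbl2", "select * from tbl3");
    $links   = array();

    foreach ($queries as $i => $sql) {
        $links[$i] = mysqli_connect('localhost', 'user', 'pass', 'db');
        $links[$i]->query($sql, MYSQLI_ASYNC);            // returns immediately
    }

    $data    = array();
    $pending = $links;
    while ($pending) {
        $read = $error = $reject = $pending;
        if (!mysqli_poll($read, $error, $reject, 1)) {
            continue;                                     // nothing ready yet
        }
        foreach ($read as $link) {
            if ($result = $link->reap_async_query()) {
                $data[] = $result->fetch_all(MYSQLI_ASSOC);
                $result->free();
            }
            unset($pending[array_search($link, $pending, true)]);
        }
    }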

  10. Very good code. Performance is always a great concern for the majority of our projects, and asynchronous calls are one good way to speed things up. Thanks for the post.

  11. Here is another solution for parallel processing without redundant HTTP requests:

    http://php.net/manual/en/function.proc-open.php

    see the comments on that page for code samples

    • Gonzalo Ayuso

      Yes. In fact the motivation for the example in the post was to replace an implementation I have in one project that uses proc_open. It works, but it's a mess; at least for me the curl solution is simpler and cleaner. With curl we can also use different servers and scale out, but as you said, proc_open is another choice.
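      For the record, a bare-bones proc_open version might look roughly like this (fork-cli.php is a hypothetical command-line variant of fork.php that takes the query number as an argument and prints JSON; it assumes the php binary is on the PATH):

      // proc-fork.php: launch each query script as a separate CLI process.
      $procs = $pipes = array();
      foreach (array(0, 1, 2) as $t) {
          $spec = array(1 => array('pipe', 'w'));           // we only need stdout
          $procs[$t] = proc_open("php fork-cli.php $t", $spec, $pipes[$t]);
      }

      // All processes are already running, so reading them one by one is fine.
      $data = array();
      foreach ($procs as $t => $proc) {
          $data[$t] = json_decode(stream_get_contents($pipes[$t][1]));
          fclose($pipes[$t][1]);
          proc_close($proc);
      }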

  1. Pingback: Parallel processing in PHP | davblog: webdev and stuff
