
How to run a Web Server from a PHP application

Normally we deploy our PHP applications on a web server (such as Apache or nginx). I used to keep an Apache web server on my personal computer to play with my applications, but lately I prefer to use PHP's built-in web server for my experiments. It's really simple. Just run:

php -S 0.0.0.0:8080 

and we've got a PHP web server serving our current directory. In other languages (such as node.js or Python) we can start a web server from within our application. For example, with node.js:

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(8080, '0.0.0.0');
console.log('Server running at http://0.0.0.0:8080');

With PHP we cannot do that. Are we sure? That assertion isn't really true. We can do it. I've just created a small library that does it in two different ways: first by running the built-in web server, and also by running a React web server.

I want both servers to share the same interface to start. In this implementation we will register one callback to handle incoming requests. This callback accepts a Symfony\Component\HttpFoundation\Request and returns a Symfony\Component\HttpFoundation\Response. Then we start our server listening on a port and we run our callback per Request (a simple implementation of the reactor pattern).

We will create a static factory to create the servers:

namespace G\HttpServer;
use React;

class Builder
{
    public static function createBuiltInServer($requestHandler)
    {
        $server = new BuiltInServer();
        $server->registerHandler($requestHandler);

        return $server;
    }

    public static function createReactServer($requestHandler)
    {
        $loop   = React\EventLoop\Factory::create();
        $socket = new React\Socket\Server($loop);

        $server = new ReactServer($loop, $socket);
        $server->registerHandler($requestHandler);

        return $server;
    }
}

Each server (BuiltIn and React) has its own implementation.
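The shared contract between the two servers could be sketched as a small interface like this (the names here are hypothetical; the real library's interface may differ):

```php
<?php
// A sketch of the contract both servers share (hypothetical names;
// the actual library may differ).
interface HttpServerInterface
{
    // Register the callback that maps a Request to a Response (or a string).
    public function registerHandler($requestHandler);

    // Start listening on the given port and dispatch every incoming
    // request to the registered handler.
    public function listen($port);
}
```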

And basically that's all. We can run a simple web server with the built-in server:

use G\HttpServer\Builder;
use Symfony\Component\HttpFoundation\Request;

Builder::createBuiltInServer(function (Request $request) {
        return "Hello " . $request->get('name');
    })->listen(1337);

Or the same thing with React:

use G\HttpServer\Builder;
use Symfony\Component\HttpFoundation\Request;

Builder::createReactServer(function (Request $request) {
        return "Hello " . $request->get('name');
    })->listen(1337);

As you can see, our callback handles a Request and returns a Response (the typical HttpKernel interface). Because of that we can also run a Silex application.
With the built-in server:

use G\HttpServer\Builder;
use Symfony\Component\HttpFoundation\Request;

$app = new Silex\Application();

$app->get('/', function () {
        return 'Hello';
    });

$app->get('/hello/{name}', function ($name) {
        return 'Hello ' . $name;
    });

Builder::createBuiltInServer(function (Request $request) use ($app) {
        return $app->handle($request);
    })->listen(1337);

And the same with React:

use G\HttpServer\Builder;
use Symfony\Component\HttpFoundation\Request;

$app = new Silex\Application();

$app->get('/', function () {
        return 'Hello';
    });

$app->get('/hello/{name}', function ($name) {
        return 'Hello ' . $name;
    });

Builder::createReactServer(function (Request $request) use ($app) {
        return $app->handle($request);
    })->listen(1337);

As an exercise I have also created a small benchmark (of both implementations) with Apache ab, running 100 requests with 10 concurrent requests. Here you can see the outcomes.

Simple response (ab -n 100 -c 10 http://localhost:1337/)

                                              built-in            React
Time taken for tests                          0.878 seconds       0.101 seconds
Requests per second (mean)                    113.91 [#/sec]      989.33 [#/sec]
Time per request (mean)                       87.791 [ms]         10.108 [ms]
Time per request (mean, all concurrent)       8.779 [ms]          1.011 [ms]
Transfer rate                                 21.02 [Kbytes/sec]  112.07 [Kbytes/sec]

Silex application (ab -n 100 -c 10 http://localhost:1337/)

                                              built-in            React
Time taken for tests                          2.241 seconds       0.247 seconds
Requests per second (mean)                    44.62 [#/sec]       405.29 [#/sec]
Time per request (mean)                       224.119 [ms]        24.674 [ms]
Time per request (mean, all concurrent)       22.412 [ms]         2.467 [ms]
Transfer rate                                 10.89 [Kbytes/sec]  75.60 [Kbytes/sec]

Silex application (ab -n 100 -c 10 http://localhost:1337/hello/gonzalo)

                                              built-in            React
Time taken for tests                          2.183 seconds       0.271 seconds
Requests per second (mean)                    45.81 [#/sec]       369.67 [#/sec]
Time per request (mean)                       218.290 [ms]        27.051 [ms]
Time per request (mean, all concurrent)       21.829 [ms]         2.705 [ms]
Transfer rate                                 11.54 [Kbytes/sec]  71.84 [Kbytes/sec]

The built-in web server is not suitable for production environments, but React could be a useful tool in some cases (maybe not good enough to run Facebook, but good enough for specific situations).

The library is available on GitHub, and you can also install it with Composer.

Deploying tips for Web Applications

I've seen the same error in too many projects. When we start a project we normally start with paper, a white board or something similar. After the first drafts we start coding. In the early stages of the project we want to build a working prototype. It's natural. It's important to have a working prototype as fast as we can. Things are different in a browser: everything works on a white board, but only with a working alpha release will we feel "real" sensations.

Now the project is growing up. Maybe we are still several weeks away from going live. Maybe we haven't even decided on the hosting, but there is something we need to take into account even in the early stages of the project: we need to build an automated deploy system, the way we're going to put our code on the production server. It's mandatory to have an automated way to deploy our application. Deploying code to production must be really trivial, done with a few clicks. A hard deploy means we are not agile, and that is not cool.

If the project is a "professional" one (someone pays/will pay for it), problems in the deploy mean downtime. Downtime is not good. Our clients don't pay us for those kinds of problems. If the project is a personal one, a hard deploy system means we're going to be too lazy to improve the project. Deploying by hand is a good idea only if we never forget anything and we're perfect. If not, it's always better to have a build script.

It's important to define different environments within our application. Modern frameworks such as Symfony2 have a great way to define environments, and it's important to take advantage of that. Our code must be exactly the same in our development environment and in production. Exactly the same means exactly the same: if we need to change the code before deploying it to the production server, we've got a problem. A simple trick to define environments is to create two ini files, one with development data (database DSN, URLs, paths) and another one for production. We can also use environment variables, but always keeping the source code identical.
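A minimal sketch of that trick (file contents are inlined here so the example is self-contained, and all names and values are made up; normally each payload would live in its own .ini file):

```php
<?php
// One ini payload per environment: the code stays identical everywhere,
// only the configuration changes.
$iniFiles = array(
    'development' => "dsn = \"pgsql:dbname=mydb;host=localhost\"\nbase_url = \"http://localhost:8080\"",
    'production'  => "dsn = \"pgsql:dbname=mydb;host=db.example.com\"\nbase_url = \"https://example.com\"",
);

// In real life $env would come from outside the code (e.g. getenv('APP_ENV'));
// hard-coded here so the example is deterministic.
$env = 'development';

$config = parse_ini_string($iniFiles[$env]);

echo $config['base_url'];
```

Note that values containing `;` or `//` must be quoted in the ini file, otherwise they are treated as comments or truncated.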

So we need at least a build script for the source code, and we must remember that we also need to deploy database changes. Deploying database changes is hard work, but deploying source code can be trivial if we take into account a few details:

  • Source code must be the same in all environments. Differences must be placed in configuration files.
  • Never perform file-system operations directly on the console. Create scripts and execute those scripts to perform file-system operations (folder creation, write permissions for logs and caches, …).

If we follow those simple rules we can create a very simple build script with our SCM (git, mercurial).

The idea is very simple: one mercurial repository on the development server, another one on the production server.

// .hg/hgrc
[paths]
prod = ssh://user@host//path/to/app

[hooks]
changegroup = hg update

Now we can easily clone the development repository. A simple "hg push prod" will push the code to the production server and update its working copy. If you don't have SSH access to the server, maybe you need to build a custom script. Please do it. "Waste" your time creating your build script; it must work like a charm. Your life will be better. Other tools that will help us build deploy scripts:

http://capifony.org/
https://github.com/capistrano/capistrano
http://www.phing.info/trac/

And that’s all. Regards, Gonzalo

Populating datagrid techniques with PHP

Today I want to talk about techniques for populating datagrids with PHP. In my daily work, datagrids and tabular data are very common, so I want to show two different techniques for populating datagrids with data from our database. Maybe it's obvious, but I want to show the differences. Let's start.

Imagine we need to fetch data from our database and show it in a datagrid. Let's do it the traditional way. I haven't used any framework for this example. Just old-school spaghetti code.

<?php
$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('SELECT * FROM test.tbl1 limit 10');
$stmt->setFetchMode(PDO::FETCH_ASSOC);
$stmt->execute();

$data = $stmt->fetchAll();

$table = "";
$table .= "<table>";
foreach ($data as $row) {
    $table .= "<tr>";
    foreach ($row as $item) {
        $table .= "<td>" . htmlspecialchars($item) . "</td>";
    }
    $table .= "</tr>";
}
$table .= "</table>";
?>
<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <title>inline grid</title>
    </head>
    <h1>inline grid</h1>
    <body>
        <?php echo $table; ?>
        <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js"></script>
    </body>
</html>

And that's all. The code works, and we've got our ugly datagrid with the data from our DB. Where's the problem? If our SELECT statement is fast enough and our connection to the database is good, the page load will be good indeed. But what happens if our query is slow (or we even have more than one)? The whole page load will be penalized by the slow query. The user won't see anything until the server finishes all the work. That means a bad user experience. The alternative is to load the page first (without the populated datagrid, of course) and, when it's ready, load the data from the server with Ajax (as JSON) and populate the datagrid with JavaScript.

Page without the populated datagrid:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <title>grid ajax</title>
    </head>
    <h1>grid ajax</h1>
    <body>
        <table id='grid'></table>

        <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.2/jquery.min.js"></script>
        <script type="text/javascript">
        $(function(){
            $.getJSON("json.php", function(json){
                        for (var i=0;i<json.length;i++) {
                            $('#grid').append("<tr><td>" + json[i].id + "</td><td>" + json[i].field1 + "</td></tr>")
                        }
                    });
        });
        </script>
    </body>
</html>

JSON data from the server:

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('SELECT * FROM test.tbl1 limit 10');
$stmt->setFetchMode(PDO::FETCH_ASSOC);
$stmt->execute();

$data = $stmt->fetchAll();

header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Content-type: application/json');

echo json_encode($data);

The outcome of this second technique is the same as the first one, but now the user sees the page faster and the data loads later. Probably the total time to finish all the work is better with the classical approach, but the UX is better with the second one. Here you can see the times taken from Chrome's network window:

Even though the total time is better in the inline grid example (156 ms vs. 248 ms, 1 HTTP request vs. 3 HTTP requests), the user will see the page (without data) faster with the Ajax grid example.

Which is better? As always, it depends on our needs. We need to balance and choose the one that fits our requirements.

Probably the JavaScript code I use to populate the datagrid in the second technique could be written in a more efficient way (there are also plugins to do it). I only wanted to show the indirect way of creating HTML to improve the user experience of our applications.
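As a sketch of one such improvement: build the whole set of rows as a single string and touch the DOM once, instead of one append per row (the jQuery call at the end is only illustrative):

```javascript
// Build all the <tr> markup in one pass; string concatenation per row is
// much cheaper than a DOM append per row.
function rowsHtml(rows) {
    var html = '';
    for (var i = 0; i < rows.length; i++) {
        html += '<tr><td>' + rows[i].id + '</td><td>' + rows[i].field1 + '</td></tr>';
    }
    return html;
}

// usage (jQuery assumed, as in the example above):
// $.getJSON("json.php", function (json) { $('#grid').append(rowsHtml(json)); });
```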

Flushing files with PHP

Sometimes we need to serve files, such as PDFs, from PHP. That's pretty straightforward.

$buffer = '';
$f = fopen($filePath, "rb");
if ($f) {
    $buffer .= fread($f, filesize($filePath));
}
fclose($f);

And now we flush the buffer with the right headers:

$filename = basename($filePath);
$type = 'attachment'; // can be 'inline'

// get mime type. finfo must be installed. PHP >= 5.3.0, PECL fileinfo >= 0.1.0
$finfo = finfo_open(FILEINFO_MIME_TYPE);
$mimeType = finfo_file($finfo, $filePath);

header("Expires: Wed, 20 Sep 1977 16:10:00 GMT");
header("Cache-Control: no-cache");
header('Cache-Control: maxage=3600');
header('Pragma: public');
//header("Content-Length: " . filesize($filePath));
header("Content-Disposition: {$type}; filename={$filename}");
header("Content-Transfer-Encoding: binary");
header("Content-Type: " . $mimeType);

echo $buffer;

Apparently the order of the headers is irrelevant, but if you need to work with IE (poor guy), use those headers in exactly this order. I don't know the exact reason, but this combination always works for me, and if I overlook it and change the order I'm likely to have problems with IE.

Another trick is the commented-out line:

//header("Content-Length: " . filesize($filePath));

According to the standards we should set the length of the file with the Content-Length header, but I noticed that if I don't set this header, the browser opens the associated application (Acrobat Reader, e.g., with PDF files) earlier than if I set it. This behaviour is visible with big files. With the Content-Length header the browser opens the associated application only when the file is fully downloaded; without it, the browser doesn't wait for the download to finish before opening the file. That means a better user experience.
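Related to big files: it may be preferable not to build the whole buffer in a PHP variable at all. readfile() streams the file straight to the output in chunks (a sketch, with the headers omitted; the temporary file here exists only to keep the snippet self-contained):

```php
<?php
// Stream a file to the output without loading it into a PHP string first.
$filePath = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($filePath, '%PDF-fake-content');

ob_start();
readfile($filePath);          // reads and echoes the file in chunks
$sent = ob_get_clean();

unlink($filePath);
echo $sent;
```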

My website is slow. What can I do?

You are working on a website. The website works. Everything is perfect, but your clients tell you it's very slow. You must face the problem and improve the behaviour of the site. But remember: the site works properly; there aren't any errors. The common way of resolving problems (understand the problem, reproduce it in a test environment, solve it) doesn't fit this scenario. What can we do? In this post I want to give some recommendations for improving performance problems. Let's go.

Don’t assume anything

It's very typical in our work to assume what the problem is when someone reports an issue. Normally the user is not a technician. He suffers a problem and tells us the symptoms. For example, some people from the procurement office call you saying the application doesn't work. You assume the application they use every day is broken. You go and check the server: it's OK. Server logs: OK. What's happening? Finally you discover there is a network problem, not a problem in the application; but the effect on the users is the same.

When you face a performance issue, don't assume anything. First of all you must debrief the user to take a picture of the problem. Forget the solution for now; in this phase you only need to collect the information, and perform the analysis later. If possible, go to the user's office and see the problem with them. I remember a performance problem some time ago. The user had serious problems with the application. I tested the application and I didn't find anything wrong, but the problem persisted and she was the main user of the application. I went to her office and I discovered the real problem: the application was slow, but the screen-saver was slow too. The spreadsheet was slow. In fact, everything was slow there. The problem was the PC's RAM. More RAM and magically all applications became faster.

Measure the problem

If something is slow you must check times. But be careful: wrong measurements can make you waste your time solving the wrong problem. A typical mistake is to check times only at the server side. For example, you start a timer when the script starts and stop it when it ends. Imagine you get 1 second. It can be improved, but is the end-user complaining about a performance issue with 1 second of response time? Probably not. You start improving your server-side code, spend some time coding and go from 1 second to 0.1 seconds. You are very proud of your improvements and you tell the user the problem is solved. But the user doesn't agree: the problem persists. You've been working on a different problem. A real problem, indeed, but a different one than the user reported. Why? Because you assumed the problem was in the server code, and your measurements were taken in the wrong scenario.

It's quite probable the problem is on the client side. If you take a look at Firebug's Net tab, you can see that the server-side part (e.g. the PHP one) normally comes first, but it is not the only one, and it may not even be the longest one. It can be a small percentage of the full page load and render time. If you want to achieve significant improvements, you must attack the main bottleneck directly (and detect it first, of course).

If you want to learn a lot about client-side performance, please pick up Steve Souders's "High Performance Web Sites" book. You can also read Steve's other book, "Even Faster Web Sites", but the first one is definitely a must-read for people working in this area. You can also watch many of Steve Souders's talks on YouTube. Do it. He is a great guru in this area and also a good speaker; after watching his talks you will want to have a beer with him. Probably by working through "High Performance Web Sites"'s recommendations you will achieve significant results, following really simple rules.

Cache it

I know I'm not very original giving recommendations about caching in this post, but caching is very important. There is a lot of theory about caching. You must cache everything you can, but don't do it like mad: you will get a cache nightmare if you don't have a good caching plan. You must define the storage, the TTL (time to live), and what is going to be cached and what is not. A wrong caching policy can jeopardize a project, but a good one will improve its performance.
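As a toy illustration of such a plan, where the storage (temp dir), the TTL, and the cached value are all explicit decisions (the function names here are made up):

```php
<?php
// A minimal file-based cache: storage is the temp dir, the TTL is passed
// explicitly, and anything serializable can be cached.
function cache_set($key, $value)
{
    file_put_contents(sys_get_temp_dir() . '/cache_' . md5($key), serialize($value));
}

function cache_get($key, $ttl)
{
    $file = sys_get_temp_dir() . '/cache_' . md5($key);
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return unserialize(file_get_contents($file));
    }
    return false; // miss: expired or never cached
}

cache_set('user:1', array('name' => 'Gonzalo'));
$hit  = cache_get('user:1', 3600); // fresh: returns the array
$miss = cache_get('user:2', 3600); // never cached: returns false
```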

Do it offline

Doing everything online is cool. The user gets fresh results when he clicks a button, but what happens if the action takes too much time? Imagine you have a button that sends ten emails every time the user clicks on it. In normal situations the operation is fast enough, but what happens if the mail server is under heavy load, or even down? Your application will freeze and your user will become angry. Think about moving operations to the background. There are great tools, like Gearman, to perform those kinds of jobs. Transform your button from: user clicks, mail 1 is sent, mail 2, …, mail 10 is sent, OK; into: user clicks, a new task is queued in our job server, OK. Now OK doesn't mean the ten emails have been sent; it means they will be sent. Weigh the implications of this new behaviour. I realize sometimes it isn't possible, but it is viable in other cases.

Imagine you have an important report that uses a complex SQL query to extract information from the database. This SQL joins several tables to meet the user's expectations. Is it mandatory to always run the query to get the results? Think about the possibility of creating some statistics tables to collect old information (unchanging data, such as old months or years). Take an offline snapshot of your real-time data and run queries over those snapshots instead of the live information. Sometimes this technique is not possible, but when it is available you can achieve important time savings in your database queries.

Database connections

When working with relational databases, creating a database connection is a slow operation. Be careful with it. It's good practice to put a counter in your script showing how many connections you create, how many queries you run, and how many results your queries return to your application. You can spot unpleasant problems with this simple log. If your application performs more than one connection to the same database within the same script execution, it's very likely you are doing something wrong. Always use lazy connections to the database. I have also seen scripts that connect to the database without performing any operation. That means you are wasting time connecting to the database. Connect only when you are really going to use the connection, not always at the beginning of the script as a general rule.

Check the SQL you are using. If, for example, you are running the same query on every click on the site to check some kind of user information, a red light must appear in your mind with a flashing box saying: cache it!
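A lazy connection can be sketched like this: the real PDO object is only built the first time someone actually uses it (the class and names are illustrative, not from any particular library):

```php
<?php
// The factory callable builds the real connection; it is invoked at most
// once, and only when get() is first called.
class LazyConnection
{
    private $factory;
    private $connection = null;

    public function __construct($factory)
    {
        $this->factory = $factory;
    }

    public function get()
    {
        if ($this->connection === null) {
            $this->connection = call_user_func($this->factory);
        }
        return $this->connection;
    }
}

// usage: $db = new LazyConnection(function () { return new PDO($dsn, $user, $pass); });
// Nothing connects until the first $db->get().
```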

Trace long queries and analyze them in the database. Check indexes and execution plans. This is usually a great bottleneck in web applications.

Debug flags

Use debug flags to measure the problem. Firebug in combination with FirePHP is a great team to help us in our work. But don't forget to turn those flags off on the production server. Too many active debug information collectors on our production servers will slow down our application with unnecessary actions.

Detecting errors and bottlenecks in PHP

In this post I want to make a short list of different ways to detect errors and bottlenecks in our PHP projects. It's an informal list of recommendations and best practices, useful at least for me.

Indent code

Python has a great characteristic: code blocks are defined by indentation, not by any special character like in other programming languages. That means that if you don't want syntax errors and you want to execute your program, your script must be properly indented. That is a great practice, even in programming languages not as restrictive as Python about indentation. For example, when I need to check a piece of code with any kind of error, if it isn't correctly indented I start by indenting it. I know it isn't necessary, but at least it helps me understand the code (if I'm not its original developer) and quickly find errors like wrong if clauses and other things like that.

Analyze code and check syntax

You must use a modern text editor or IDE with at least syntax checking and code analysis tools. That's mandatory. Choose whichever you want, but use one. Zend Studio is a great tool, but you can also do it with other editors, even with vi. You can program PHP with Notepad, but it's definitely a waste of time; you will spend significantly less time programming with a modern IDE. I like Zend Studio's code analyzer. It helps you with errors that are easy to make and difficult to find, like:

// error 1: assignment instead of comparison
if ($a = 1) {

// error 2: $Variable (capital V) is a different variable than $variable
$variable = 2;
if ($Variable == 2) {
...
}

I like to fix every warning, even when those warnings are only recommendations and not in fact a problem.

Check the indentation level

Even with perfect indentation, a deep indentation level is a clear signal of "code smell". Take care of it. If you see a deep indentation level, you must ask yourself: is this really the way to do it?

Loops

If you want to check performance issues and you don't have much time, focus on finding loops and analyzing them. If there is a performance problem, it's very likely you will discover it inside a loop. One slow operation executed once is a problem; but if that operation is executed 1000 times inside a loop, it's definitely a big problem. OK, it's the same problem, but the impact is 1000 times bigger.

Copy-paste code

Avoid it like the plague. It can be helpful and attractive, but using the same piece of code distributed among a set of source files will definitely give you a headache in the future. Be sure about it. We have a lot of techniques to avoid copy-and-paste code.

Sessions, comet and PHP

I faced this problem when I was developing a comet server, but it can happen with any script that takes too much time. I had built a comet process. The process is very simple: basically it is a PHP script that watches the modification date of a file. When it changes, the script ends; if nothing happens, the script ends after 30 seconds and starts again (with a JavaScript loop). The script worked perfectly in the sandbox. In production it also worked (brilliant, isn't it?). But a problem appeared when I opened other tabs in the browser: the application became slow. Very slow. Every click, even for really simple operations, became unusable. I realized this behaviour appeared only when comet was enabled.

A small skeleton of the comet server:

for ($i=0; $i<10; $i++) {
    if (checkSomething()) {
        echo getData();
        flush();
    }
    sleep(1);
}

The problem was in the authentication. The comet server uses the session for authentication, and sessions are stored as files. The system worked perfectly, but I realized I wasn't using session_write_close. That means the server opens the session file and frees it only when the script ends; while it is open, the session file is locked and every other request that needs the same session has to wait. Normally a script takes one second or less, but the comet server may take 20-30 seconds.

auth();
for ($i=0; $i<10; $i++) {
    if (checkSomething()) {
        echo getData();
        flush();
    }
    sleep(1);
}

In this case the solution was easy. The auth process happens only at the beginning of the script, so I only needed to call session_write_close after the authentication. With this simple command the server doesn't lock the user's session and I can open as many tabs as I need.

auth();
session_write_close(); // this call lives inside the auth function, but I put it here for legibility

for ($i=0; $i<10; $i++) {
    if (checkSomething()) {
        echo getData();
        flush();
    }
    sleep(1);
}

There are other storages for sessions besides the filesystem (the default one): relational databases, non-relational databases, and even memory with mm.

Building a simple template engine in PHP

Yes, this is yet another template engine in PHP. I've been using Smarty for years. It's easy to install and easy to use, but there is something I don't like about Smarty: I need to learn another language (the Smarty markup) to create my templates. Normally my templates are not very complex. I only use them to move HTML code outside PHP. Basically my templates are HTML plus some variables passed from PHP to the tpl. Sometimes I need to do a loop. I know Smarty has a lot of helpers, but I never use them.

Now I'm on a project and I must choose a template engine. This project doesn't have any dependencies on other libraries, so I don't want to include the full Smarty library for my simple templates. A quick search on Google gives us a list of template engines in PHP. Almost all of them use PHP as the template language. That's a good decision. In fact PHP is a template language, so why do we need another one? I think the main problem with using PHP as a template language is the temptation to put logic in the template and end up with nice spaghetti code.

So I am going to build a simple template engine. Let's start.
As always, I like to start from the interface. When I start a library I like to imagine the library is finished (before starting to code; cool, isn't it?) and write the code that uses it. When I like the interface, I start coding the library.

I want this template:

<h1>Example of tpl</h1>
var1 = <?php echo $this->_('var1') ?>
var2 = <?php echo $this->_('var2') ?>
<?php
foreach ($this->_('var3') as $item) {
    echo $this->clean($item);
}
?>

And I want to call it with something like this:

echo Tpl::singleton()->init('demo1.phtml')->render(array(
    'var1' => 1,
    'var2' => 2,
    'var3' => array(1, 2, 3, 4, 5, 6, 7)
    ));

Basically the Tpl class will be a container to collect the configuration, and the init function will be a factory of another class (Tpl_Instance) that does the templating itself.

The render function is the main one. Basically it calls PHP's include function with the selected tpl:

if (!is_file($_tplFile)) {
    throw new Exception('Template file not found');
}

ini_set('implicit_flush', false);
ob_start();
include $_tplFile;

$out = ob_get_contents();
ob_end_clean();
ini_set('implicit_flush', true);

return $out;

As we can see, the tpl file is a simple PHP file included in our script from Tpl_Instance::render. So $this in our tpl's PHP code is the Tpl_Instance instance. That means we can use protected and even private methods of Tpl_Instance.
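To make that concrete, the two helpers the template relies on could look something like this (a sketch; the real Tpl_Instance in the library may differ, and the methods are public here only so the snippet is easy to run on its own):

```php
<?php
class Tpl_Instance_Sketch
{
    private $params = array();

    public function addParam($name, $value)
    {
        $this->params[$name] = $value;
    }

    // _() returns a parameter assigned from PHP to the template
    public function _($name)
    {
        return isset($this->params[$name]) ? $this->params[$name] : null;
    }

    // clean() escapes a value before echoing it in the template
    public function clean($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }
}
```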

Now I'm going to show different usages of the library:

// Sets the path of the templates. If null, it assumes the file path is absolute
Tpl::singleton()->setConf(Tpl::TPL_DIR, realpath(dirname(__FILE__)));
echo Tpl::singleton()->init('demo1.phtml')->render(array(
    'var1' => 1,
    'var2' => 2,
    'var3' => array(1, 2, 3, 4, 5, 6, 7)
    ));
// The same instance, a different template, and params added in a different way
$tpl = Tpl::singleton()->init('demo2.phtml');
$tpl->addParam('header', 'header');
$tpl->addParam('footer', 'footer');
echo $tpl->render();
// Disable exceptions if we don't assign a variable
Tpl::singleton()->setConf(Tpl::THROW_EXCEPTION_WITH_PARAMS, false);
$tpl = Tpl::singleton()->init('demo1.phtml');
$tpl->addParam('var1', 'aaaa');
$tpl->addParam('var3', array(1, 2, 3, 4, 5, 6, 7));
echo $tpl->render();
// Using factory
$objTpl = Tpl::factory();
$objTpl->setConf(Tpl::THROW_EXCEPTION_WITH_PARAMS, true);
try {
    $tpl = $objTpl->init('demo1.phtml');
    $tpl->addParam('var1', 'aaaa');
    $tpl->addParam('var3', array(1, 2, 3, 4, 5, 6, 7));
    echo $tpl->render();
} catch (Exception $e) {
    echo $e->getMessage();
}

And, as always, the full source code is available on Google Code.

Building a REST client with asynchronous calls using PHP and curl

One month ago I posted an article called "Building a simple HTTP client with PHP. A REST client". In that post I tried to create a simple fluent interface to call REST web services in PHP. Some days ago I read an article and a light switched on in my mind: I can use curl's "multi" functions to improve my library and perform simultaneous calls very easily.

I've got a project that needs to call different web services. Those web services are sometimes slow (2-3 seconds). If I need to call, for example, three web services, my script will take the sum of every single call's time. With this improvement to the library it will only take the time of the slowest web service: 2 seconds instead of 2+2+2 seconds. Great.

For the example I've created a really complex PHP script that sleeps x seconds depending on an input param:

sleep((integer) $_REQUEST['sleep']);
echo $_REQUEST['sleep'];

With synchronous calls:

echo Http::connect('localhost', 8082)
    ->doGet('/tests/gam_http/sleep.php', array('sleep' => 3));
echo Http::connect('localhost', 8082)
    ->doPost('/tests/gam_http/sleep.php', array('sleep' => 2));
echo Http::connect('localhost', 8082)
    ->doGet('/tests/gam_http/sleep.php', array('sleep' => 1));

This script takes more or less 6 seconds (3+2+1).

But if I switch it to:

$out = Http::connect('localhost', 8082)
    ->get('/tests/gam_http/sleep.php', array('sleep' => 3))
    ->post('/tests/gam_http/sleep.php', array('sleep' => 2))
    ->get('/tests/gam_http/sleep.php', array('sleep' => 1))
    ->run();
print_r($out);

the script takes only 3 seconds (the slowest call).

I've got a project that uses it, but I had a problem: I have web services on different hosts, so I've made a small change to the library:

$out = Http::multiConnect()
    ->add(Http::connect('localhost', 8082)->get('/tests/gam_http/sleep.php', array('sleep' => 3)))
    ->add(Http::connect('localhost', 8082)->post('/tests/gam_http/sleep.php', array('sleep' => 2)))
    ->add(Http::connect('localhost', 8082)->get('/tests/gam_http/sleep.php', array('sleep' => 1)))
    ->run();

With a single connection, exceptions are easy to implement: if curl_getinfo() returns an error status I throw an exception. But with the multiple interface, how can I do it? Should I throw an exception if one call fails, or not? I have decided not to use exceptions in the multiple interface. I always return an array with the output of every webservice call, and if something goes wrong I return an instance of the Http_Multiple_Error class instead of the output. Why do I use a class instead of an error message? The answer is easy: if I want to check all the answers, I can test whether any of them is an instanceof Http_Multiple_Error. And if I don’t want to check anything, I have added a silentMode() function to switch off all error messages.
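Checking the results then becomes a simple instanceof test. A minimal sketch, with Http_Multiple_Error reduced to a stub and the run() output simulated:

```php
<?php
// Stub of the library's error class, reduced to what the check needs.
class Http_Multiple_Error
{
    public $status;

    public function __construct($status)
    {
        $this->status = $status;
    }
}

// Simulated output of run(): two successful responses, one failed call.
$out = array('3', new Http_Multiple_Error(500), '1');

// Because errors are objects rather than strings, instanceof separates
// them cleanly from the real webservice responses.
$errors = array_filter($out, function ($r) {
    return $r instanceof Http_Multiple_Error;
});

echo count($errors) . " call(s) failed\n"; // prints "1 call(s) failed"
```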

$out = Http::multiConnect()
    ->silentMode()
    ->add(Http::connect('localhost', 8082)->get('/tests/gam_http/sleep.php', array('sleep' => 3)))
    ->add(Http::connect('localhost', 8082)->post('/tests/gam_http/sleep.php', array('sleep' => 2)))
    ->add(Http::connect('localhost', 8082)->get('/tests/gam_http/sleep.php', array('sleep' => 1)))
    ->run();

The full code is available on Google Code, but the main function is the following one:

    ...
    private function _run()
    {
        $headers = $this->_headers;
        $curly = $result = array();

        $mh = curl_multi_init();
        foreach ($this->_requests as $id => $reg) {
            $curly[$id] = curl_init();

            $type   = $reg[0];
            $url    = $reg[1];
            $params = $reg[2];

            if(!is_null($this->_user)){
               curl_setopt($curly[$id], CURLOPT_USERPWD, $this->_user.':'.$this->_pass);
            }

            switch ($type) {
                case self::DELETE:
                    curl_setopt($curly[$id], CURLOPT_URL, $url . '?' . http_build_query($params));
                    curl_setopt($curly[$id], CURLOPT_CUSTOMREQUEST, self::DELETE);
                    break;
                case self::POST:
                    curl_setopt($curly[$id], CURLOPT_URL, $url);
                    curl_setopt($curly[$id], CURLOPT_POST, true);
                    curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $params);
                    break;
                case self::GET:
                    curl_setopt($curly[$id], CURLOPT_URL, $url . '?' . http_build_query($params));
                    break;
            }
            curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, true);
            curl_setopt($curly[$id], CURLOPT_HTTPHEADER, $headers);

            curl_multi_add_handle($mh, $curly[$id]);
        }

        $running = null;
        do {
            curl_multi_exec($mh, $running);
            // sleep() only accepts whole seconds; usleep() gives us 0.2s
            usleep(200000);
        } while ($running > 0);

        foreach($curly as $id => $c) {
            $status = curl_getinfo($c, CURLINFO_HTTP_CODE);
            switch ($status) {
                case self::HTTP_OK:
                case self::HTTP_CREATED:
                case self::HTTP_ACEPTED:
                    $result[$id] = curl_multi_getcontent($c);
                    break;
                default:
                    if (!$this->_silentMode) {
                        // re-read this request's data; $type/$url/$params
                        // above still hold the first loop's last iteration
                        list($type, $url, $params) = $this->_requests[$id];
                        $result[$id] = new Http_Multiple_Error($status, $type, $url, $params);
                    }
            }
            curl_multi_remove_handle($mh, $c);
        }

        curl_multi_close($mh);
        return $result;
    }

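One possible refinement of the wait loop in _run() (not part of the library as published): instead of sleeping a fixed interval on every pass, curl_multi_select() blocks until one of the transfers actually has activity.

```php
<?php
// Sketch: drive a curl multi handle without a fixed sleep.
// Returns the final number of still-running transfers (0 when done).
function run_multi($mh)
{
    $running = null;
    do {
        curl_multi_exec($mh, $running);
        if ($running > 0) {
            // Block (up to 1 second) until any handle has activity,
            // instead of polling with sleep().
            curl_multi_select($mh, 1.0);
        }
    } while ($running > 0);

    return $running;
}
```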
Clean way to call multiple xmlrpc remote servers

The problem:

I’ve got a class library in PHP. This class library is distributed across several servers and I want to call them synchronously. The class library has an XMLRPC interface built with Zend Framework’s XMLRPC server.
Here is an example class:

class Gam_Dummy
{
    /**
     * foo
     *
     * @param integer $arg1
     * @param integer $arg2
     * @return integer
     */
    function foo($arg1, $arg2)
    {
        return $arg1 + $arg2;
    }
}

and the xmlrpc server:

$class = (string) $_GET['class'];
$server = new Zend_XmlRpc_Server();
$server->setClass($class);
echo $server->handle();

First solution

An easy and fast solution for calling remote interfaces is:

$class = "Gam_Dummy";
$client = new Zend_XmlRpc_Client("http://location/of/xmlrpc/server?class={$class}");
echo $client->call('foo', array($arg1, $arg2));

and if we have several remote servers:

$servers = array(
    'server1' => 'http://location/of/xmlrpc/server1',
    'server2' => 'http://location/of/xmlrpc/server2',
    'server3' => 'http://location/of/xmlrpc/server3'
    );

$class = "Gam_Dummy";
foreach (array_values($servers) as $_server) {
    $server = "{_server}?class={$class}";
    $client = new Zend_XmlRpc_Client($server);
    echo $client->call('foo', array($arg1, $arg2));
}

Second solution (one remote server):

I want to use the following interface to call my remote class:

$class = "Gam_Dummy";
Gam_Dummy::remote("Gam_Dummy", 'server1')->foo($arg1, $arg2);

Why? Because I like coding with the help of the IDE. If I use the first solution I must remember that the Gam_Dummy class has a foo function with two parameters. With the second solution, if I write the PHPDoc correctly, my IDE will help me by showing the function list of the Gam_Dummy class, and even when I type Gam_ the IDE will show me all the classes of my repository starting with Gam_. That may sound irrelevant to a lot of people, but for me it is really useful.

To get this interface I will change my Gam_Dummy class to:

class Gam_Dummy
{
    /**
     * foo
     *
     * @param integer $arg1
     * @param integer $arg2
     * @return integer
     */
    function foo($arg1, $arg2)
    {
        return $arg1 + $arg2;
    }

    /**
     * Remote interface
     *
     * @param string|array $server
     * @return Gam_Dummy
     */
    static function remote($server)
    {
        return new Remote(get_called_class(), $server);
    }
}

And of course Remote class:

class Remote
{
    private $_class  = null;
    private $_server = null;

    function __construct($class, $server)
    {
        $this->_class  = $class;
        $this->_server = $server;
    }

    function __call($method, $arguments)
    {
        if (class_exists($this->_class)) {
            $server = "{$this->_server}?class={$this->_class}";
            $client = new Zend_XmlRpc_Client($server);
            return $client->call($method, $arguments);
        }
    }
}
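The trick that makes Remote work is PHP’s __call magic method: any call to a method that is not defined on the class lands there with the method name and the argument list. A standalone sketch (Recorder is a hypothetical class, not part of the library):

```php
<?php
// __call intercepts calls to methods that do not exist on the class.
class Recorder
{
    public $calls = array();

    public function __call($method, $arguments)
    {
        // Remote forwards $method/$arguments to the XML-RPC client;
        // here we just record them.
        $this->calls[] = array($method, $arguments);
        return 'proxied:' . $method;
    }
}

$r = new Recorder();
echo $r->foo(1, 2), "\n"; // prints "proxied:foo"
```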

Cool, isn’t it? But there is a problem: if I want to work with two or more remote servers I must write one line of code for each server:

Gam_Dummy::remote('http://location/of/xmlrpc/server1')->foo($arg1, $arg2);
Gam_Dummy::remote('http://location/of/xmlrpc/server2')->foo($arg1, $arg2);
Gam_Dummy::remote('http://location/of/xmlrpc/server3')->foo($arg1, $arg2);

or maybe better, with the $servers array:

foreach (array_values($servers) as $server) {
    Gam_Dummy::remote($server)->foo($arg1, $arg2);
}

Third solution for multiple remote servers:

For multiple servers, I would like to use this interface instead of solution two with a foreach:

$servers = array(
    'server1' => 'http://location/of/xmlrpc/server1',
    'server2' => 'http://location/of/xmlrpc/server2',
    'server3' => 'http://location/of/xmlrpc/server3'
    );

Gam_Dummy::remote($servers)->foo($arg1, $arg2);

so I change Remote class to:

class Remote
{
    private $_class  = null;
    private $_server = null;

    function __construct($class, $server)
    {
        $this->_class  = $class;
        $this->_server = $server;
    }

    function __call($method, $arguments)
    {
        $out = array();
        if (is_array($this->_server)) {
            foreach ($this->_server as $key => $_server) {
                $server = "{$_server}?class={$this->_class}";
                $client = new Zend_XmlRpc_Client($server);
                $out[$key] = $client->call($method, $arguments);
            }
        } else {
            $server = "{$this->_server}?class={$this->_class}";
            $client = new Zend_XmlRpc_Client($server);
            $out = $client->call($method, $arguments);
        }
        return $out;
    }
}
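The fan-out in __call can be exercised offline by swapping Zend_XmlRpc_Client for a stub. A hypothetical sketch of the same pattern (FanOut is not part of the library):

```php
<?php
// Same fan-out pattern as Remote::__call, with the XML-RPC call
// replaced by a local stub so it runs without any server.
class FanOut
{
    private $_servers;

    public function __construct(array $servers)
    {
        $this->_servers = $servers;
    }

    public function __call($method, $arguments)
    {
        $out = array();
        foreach ($this->_servers as $key => $server) {
            // Stub: a real implementation would call the remote server here.
            $out[$key] = "{$server}::{$method}(" . implode(',', $arguments) . ")";
        }
        return $out;
    }
}

$servers = array('server1' => 'http://host1', 'server2' => 'http://host2');
$fan = new FanOut($servers);
$result = $fan->foo(1, 2);
// $result is keyed by server name, one entry per server
```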