Monthly Archives: August 2010

Using CouchDb as filesystem with PHP

One of the problems I need to solve in my clustered PHP applications is where to store files. When I say files I’m not speaking about source code. I’m speaking about additional data files, such as downloadable PDFs, logs, etc. Those files must be on every node of the cluster. One possible approach is to use a distributed filesystem, rsync, or a file server mounted on every node. Another solution may be CouchDb. CouchDb has two great features that meet our requirements for this problem. It allows us to store files as attachments, and it also makes it very easy to set up multi-master replication.
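To make the multi-master idea a bit more concrete: CouchDb exposes a `_replicate` endpoint that accepts a JSON body with a `source` and a `target`. The sketch below only builds those requests, one per direction between every pair of nodes. The node addresses and the helper name `replicationRequests` are assumptions for illustration, not part of any library.

```php
<?php
// Hypothetical helper: build one replication request per ordered pair of
// nodes, so that every node pushes its changes to every other node.
function replicationRequests(array $nodes, $db)
{
    $requests = array();
    foreach ($nodes as $source) {
        foreach ($nodes as $target) {
            if ($source === $target) {
                continue;
            }
            $requests[] = array(
                // The request is sent to the source node's _replicate endpoint
                'url'  => "{$source}/_replicate",
                'body' => json_encode(array(
                    'source'     => $db,
                    'target'     => "{$target}/{$db}",
                    'continuous' => true,
                )),
            );
        }
    }
    return $requests;
}

// Assumed node addresses, for illustration only
$requests = replicationRequests(array('http://node1:5984', 'http://node2:5984'), 'fs');
foreach ($requests as $request) {
    echo "POST {$request['url']} {$request['body']}\n";
}
```

Each request would then be sent as an HTTP POST with a JSON content type. With two nodes that means two requests, one in each direction.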

Using CouchDB is pretty straightforward. It implements a RESTful interface for every operation, so the only thing we need is a REST client. Zend Framework has a great one, but we don’t really need a library. We can easily perform REST requests with PHP’s curl extension. I’ve created two libraries for working with CouchDb: one is a low-level HTTP client (built on curl) and the other is a higher-level one (it uses the HTTP client) for CouchDB operations. You can read two posts about those libraries here and here.
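As a minimal sketch of what a REST request with the curl extension can look like (this is not the code of the libraries mentioned above, and the function names `couchDocUrl` and `couchRequest` are made up for the example):

```php
<?php
// Build the URL of a document. The id is encoded because it may contain slashes.
function couchDocUrl($host, $port, $db, $id)
{
    return sprintf('http://%s:%d/%s/%s', $host, $port, rawurlencode($db), rawurlencode($id));
}

// Perform a plain HTTP request with the curl extension and decode the JSON reply.
function couchRequest($method, $url, $body = null)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    if ($body !== null) {
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($body));
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    }
    $response = curl_exec($ch);
    $status   = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($response === false) {
        throw new RuntimeException("Request to {$url} failed");
    }
    return array('status' => $status, 'data' => json_decode($response, true));
}

$url = couchDocUrl('localhost', 5984, 'users', 'gonzalo');
echo $url . "\n";
// With a CouchDb server running locally (an assumption), the document could be read with:
// $result = couchRequest('GET', $url);
```

Reading, creating and deleting documents differ only in the HTTP method and the body, which is why a thin wrapper like this is enough as the low-level layer.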

Now I want to extend the features of my library. I want to use CouchDB as file storage in PHP. Instead of using the file functions (fopen, fwrite, fread, …) I want to use a different set of functions that store and read files in CouchDB. To do this I’ve refactored those two libraries into another one called Nov. I have also embraced namespaces, so I will use them in the library. This means it requires PHP 5.3.

Here is a summary of the library. It’s not a complete UML diagram, only a diagram of the main features, for educational purposes.

[Diagram: summary of the library]

The best way to show the library is with an example:

First I’m going to start with the basic usage of the Nov\CouchDb library:

// Starting up the loader
require_once("Nov/Loader.php");
Nov\Loader::init();

use Nov\CouchDb;
$cdb = new CouchDb('localhost', 5984);
$cdb->db('users');
$nombre = $cdb->db('ris_users')->select('gonzalo')->asObject()->name;
$apellido = $cdb->db('ris_users')->select('gonzalo')->asObject()->surname;
echo "Hello {$nombre} {$apellido}.\n";

To be able to use different CouchDb databases and to keep the database configuration in one file, I use the following configuration class:

class NovConf
{
    const CDB1 = 'CDB1';
    const PG1  = 'PG1';

    public static $_dbs = array(
        self::PG1  => array(
            'driver'   => 'pgsql',
            'dsn'      => 'pgsql:dbname=pg1;host=localhost',
            'username' => null,
            'password' => null,
        ),
        self::CDB1  => array(
            'driver'   => 'couchdb',
            'host'     => 'localhost',
            'port'     => 5984,
            'protocol' => 'http',
            'username' => null,
            'password' => null,
        ),
    );
}

As you can see I use the same configuration file for my PDO drivers and CouchDb.

Now I can use:

require_once("Nov/Loader.php");
Nov\Loader::init();

use Nov\CouchDb;
$cdb = CouchDb::factory(NovConf::CDB1)->db('users');

try {
    $cdb->insert('xxx', array('name' => 'xxx'));
} catch (CouchDb\Exception\DupValOnIndex $e) {
    echo "Already created\n";
}

$data = $cdb->select('xxx')->asObject();
$cdb->update('xxx', array('name' => 'xxx1'));
$cdb->delete('xxx')->asObject();

And now finally the file storage part:

For storing the files I’ve made one design decision. Every file is stored in a separate CouchDb document. That means one file, one document. There’s another possible approach: one CouchDb document could be one folder, with every file stored as an attachment of that folder’s document. But I prefer not to track folders, only files. So each CouchDb document will have only one attachment.

Here is an example of one document in CouchDb:

{
   "_id": "/home/gonzalo/aasa.txt",
   "_rev": "2-48b501a81c38fd84a3e0351917e64135",
   "path": "/home/gonzalo",
   "_attachments": {
       "aasa.txt": {
           "stub": true,
           "content_type": "application/octet-stream",
           "length": 12,
           "revpos": 2
       }
   }
}
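To illustrate the "one file, one document" layout at the HTTP level: CouchDb accepts attachments with a PUT to `/{db}/{docid}/{attachment}`, with the current revision passed as `rev` when the document already exists. The sketch below only builds that URL. The helper name `attachmentUrl` is an assumption for illustration and is not taken from the Nov library.

```php
<?php
// Hypothetical helper: URL for uploading a file's content as the single
// attachment of its own document. The document id is the full path and the
// attachment name is the file name, as in the JSON example above.
function attachmentUrl($base, $db, $path, $rev = null)
{
    $url = sprintf(
        '%s/%s/%s/%s',
        rtrim($base, '/'),
        $db,
        rawurlencode($path),           // slashes in the id must be encoded
        rawurlencode(basename($path))
    );
    if ($rev !== null) {
        $url .= '?rev=' . rawurlencode($rev);
    }
    return $url;
}

echo attachmentUrl('http://localhost:5984', 'fs', '/home/gonzalo/aasa.txt') . "\n";
```

The request body would be the raw file content, and the Content-Type header would carry the MIME type that ends up in `content_type`.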

Here is another usage script, where we can see all the features together. We create, update and delete files. Internally Nov\CouchDb\Fs uses a predefined CouchDb database called fs.

use Nov\CouchDb\Fs;
use Nov\CouchDb\Fs\Exception;
require_once ("Nov/Loader.php");
Nov\Loader::init();

echo "<pre>";
// create an instance from a factory method
$fs = Fs::factory(NovConf::CDB1);
// Now we're going to delete a file. If it doesn't exist, a FileNotFound exception is thrown
try {
    $fs->delete("/home/gonzalo/aaa.txt");
} catch (Exception\FileNotFound  $e) {
    echo $e->getMessage() . "\n";
}
// Now we are going to create a file.
// The second parameter 'true' means the file will be created if it doesn't exist, roughly like fopen's 'c+' mode
try {
    $fs->open("/home/gonzalo/aaa.txt", true)
        ->write("asasasasasas", "application/octet-stream");
} catch (Exception\FileNotFound $e) {
    echo $e->getMessage() . "\n";
} catch (Exception\WriteError $e) {
    echo $e->getMessage() . "\n";
} catch (Exception $e) {
    echo $e->getMessage() . "\n";
}
// We open the file
$res = $fs->open("/home/gonzalo/aaa.txt");

// we can get the length and the content type
echo $res->getLenght() . "\n";
echo $res->getContentType(). "\n";
// We move it to another location
$to = "/another/location";
$res->move($to);

$res = $fs->open($to);
// we flush the file to the browser
echo $res->raw();

// finally we delete it
$res->delete();
echo "</pre>";

I’ve also created an extra class to dump files from the filesystem to CouchDb and vice versa.

require_once ("Nov/Loader.php");
Nov\Loader::init();
echo "<pre>";
// from filesystem to couchdb
\Nov\CouchDb\Fs\Utils::fs2cdb("/path/from/", NovConf::CDB1);
// from couchdb to filesystem
\Nov\CouchDb\Fs\Utils::cdb2fs(NovConf::CDB1, "/path/to/");
echo "</pre>";
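A dump from the filesystem to CouchDb has to walk a directory tree and decide which document id each file gets. The following is only a sketch of that first step, under the assumption that the id is the file’s path relative to the dumped folder. The function name `filesToDocIds` is hypothetical and this is not the code of Nov\CouchDb\Fs\Utils.

```php
<?php
// Hypothetical helper: map every file below $root to the document id it
// would get in CouchDb (its path relative to $root, with a leading slash).
function filesToDocIds($root)
{
    $root = rtrim($root, '/');
    $ids  = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile()) {
            // key: real path on disk, value: id of the CouchDb document
            $ids[$file->getPathname()] = substr($file->getPathname(), strlen($root));
        }
    }
    ksort($ids);
    return $ids;
}

// Usage (the path is an assumption):
// foreach (filesToDocIds('/path/from/') as $file => $id) {
//     echo "{$file} -> {$id}\n";
// }
```

Each pair could then be stored as one document with one attachment, following the design described earlier.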

And that’s all. You can download the source code with the examples here. The examples are under the document_root/tests/couchdb/ folder. Remember you will need PHP 5.3.

Looking for the perfect PHP IDE

I’ve got a problem. I haven’t found the perfect IDE for me. Yet. I’ve got problems with every piece of software I’ve tried. Now I will try to explain those problems, and maybe someone will show me the light and help me discover the perfect editor/IDE.

First, the important thing. I’m a Linux user, so Windows-only software is out. I know Mac users really love TextMate, but buying a Mac is not in my plans. Macs are good hardware, reliable and of course cool. I know nowadays there are a lot of web developers working with Macs, but I’m not yet convinced to switch from a PC with Linux to a Mac. Probably I will ask myself the question when my current machine breaks down. Don’t ask me why, but my last three PCs died of a failure in the core hardware (after years of usage). So the baseline is clear. Software must work with Linux and it must work natively. Not with Wine or things like that.

My requirement list is the following:

  • Code auto-completion.
  • Syntax highlight.
  • Debugger.

Easy, isn’t it? But even with those simple requirements I haven’t yet found my perfect IDE. So I’m going to list the problems I’ve got with each one.

Zend Studio 7

It’s not free. That’s not the main problem, although we can find similar ones for free. The debugger works fine. Code auto-completion is really good. Syntax highlighting is perfect. It has auto-indentation and one of my favourite features: code analysis. Really useful. It detects problems such as:

$var = 0;
if ($Var == 0) { // $Var and $var are different variables. It's probably a typo, and very hard to spot at a glance
    ...
}
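To give an idea of how this kind of check can work, here is a small sketch built on PHP’s tokenizer extension. It reports variables whose names differ only by case. It is an illustration written for this post, not how Zend Studio implements its analysis, and the function name `findCaseVariants` is made up.

```php
<?php
// Hypothetical check: report variables whose names differ only by case.
function findCaseVariants($source)
{
    $seen = array();
    foreach (token_get_all($source) as $token) {
        // Tokens are either single characters or arrays of (id, text, line)
        if (is_array($token) && $token[0] === T_VARIABLE) {
            $seen[strtolower($token[1])][$token[1]] = true;
        }
    }
    $variants = array();
    foreach ($seen as $names) {
        if (count($names) > 1) {
            $variants[] = array_keys($names);
        }
    }
    return $variants;
}

$code = '<?php $var = 0; if ($Var == 0) { echo "zero"; }';
print_r(findCaseVariants($code));
```

For the snippet above the function groups `$var` and `$Var` together, which is exactly the kind of mistake that is hard to see by eye.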

It looks like the perfect IDE but it’s maddeningly slow. If you are working with a few files it’s perfect, but if your project is big (hundreds of files) sometimes you are forced to take too many coffee breaks while it goes crazy (building workspace…). Maybe if you’ve got a super PC with infinite RAM and a hundred cores it works fine but, at least for me, it’s irritating.

Eclipse PDT

Zend Studio 7 is based on Eclipse. PDT is almost the same as Zend Studio 7 (without its great integration with the Zend services, which I don’t use). It has exactly the same problem (very slow) and it lacks some extra features, such as code analysis (really useful for me). On the other hand it’s free (as in free beer).

Netbeans

Similar features to Eclipse PDT, and lighter. Also free. The problem here is the debugger. It only works with Xdebug. That’s not a real problem for me. I don’t really mind which debugger I use. Zend Debugger and Xdebug both meet my requirements. But the way to debug is a bit strange. I don’t like it. In early versions it created the URL when you started to debug, and that URL couldn’t be changed properly. That means it didn’t work for me. Now it’s better but still not good. It looks like the PHP plugin of Netbeans isn’t as good as the Java one. The future of Netbeans isn’t clear after the acquisition by Oracle. They said Netbeans isn’t a strategic project (that means they cut some funds).

VIM

It looks like the perfect editor but it’s hard to learn, at least in the beginning. Syntax highlighting is perfect. Really light. It works perfectly everywhere, even on remote hosts via ssh. Code auto-completion works, but not as well as in Eclipse and Netbeans (though it’s fair enough). I say it’s not as good because, for example, it doesn’t hint the variables of a function and it ignores PHPDoc in the auto-completion pop-up, and I really appreciate those. The main problem I have is with the debugger. It’s so strange for me. I think I need a couple of hours working on it, because nowadays debugging with VIM is something like a miracle for me. Sometimes it works, but it’s too much effort. Yet. I hope so. If you want to know more about PHP and VIM, you must take a look at this link.

Emacs

I must admit that I am too lazy to learn to use it. I have invested time in VIM, I feel I still need more, and I don’t want to start from scratch with another editor like Emacs. Maybe I’m wrong and it’s the perfect IDE, but it’s not on my mind now.

Zend Studio 5.5

That “was” the perfect IDE. It wasn’t free but it was really good. Fast, even with big projects (sometimes it went crazy and crashed, I know). Debugging was perfect. Only with Zend Debugger, but really easy to use. You didn’t need a great PC. 2 GB of RAM was enough. But there’s a problem. This software is obsolete. Zend changed to an Eclipse-based one. We can still use it but there aren’t any updates. The main problem is that we cannot use any of the new PHP 5.3 features, such as namespaces. OK, we can use them if our server has PHP 5.3, but the IDE marks those new keywords as syntax errors. So if we work with this software on a PHP 5.3 project we need to assume that the red warnings of our editor (syntax errors) are not always errors. It’d be great news for me if the Zend people released Zend Studio 5.5 as open source and someone continued the project, adding PHP 5.3 support.

As you can see I’m not 100% happy with any IDE. Do you have a perfect IDE? I’d really like to be wrong in my assumptions. So I’ll keep looking.

My development tips

Another unsorted list of ideas, this time about coding tips.

code = mass

Source code isn’t an abstract element. OK, it’s not as touchable as apples or bricks, but it’s somehow subject to physical laws too. You must write as little code as you can to meet your requirements. Big code means big mass, and if you have a big mass you will need more energy to change whatever you need. I can stop a toy car with my hand but I need something more to stop a real train at the same speed. Less code means fewer failure points and it’s easier to manage. Some people proudly claim they have written a library with thousands of lines of code. Nobody pays us for writing lines of code. They pay us for solutions. The lines of code that we write are our problem. So be as minimalistic as you can. The problem is that writing less code is more complicated than writing more. We need to think more to write less code.

DRY (Don’t repeat yourself)

Original, isn’t it? Copy & paste is evil. Never use it in your projects. Spreading the same code among different files means that you will need to remember where those pieces of code are when a bug appears (they appear, believe me). There are a lot of techniques to avoid copying and pasting source code. Use them. There isn’t any excuse not to.

Coding standards

The source code is one of your deliverables as a developer. You must care about it. Adopt a coding standard. Choose one and use it. If you work with a team, take this decision with your team, but don’t try to create a standard of your own. It’s a big job. If you use one of the existing coding standards (let’s say Zend Framework’s) you will find tools to validate your code against it, and the most common IDEs and text editors will be able to use it too. Otherwise, if you create a new one you will need to develop those plug-ins and tools, and you will even need to document it if you want to show it to another developer. That’s too much work that nobody will pay for, only because you dismissed an existing one.

Revision control

Mandatory. Even for small and personal projects. I like Mercurial. It’s easy to use and easy to install. Anyway, there are similar ones like Git or Bazaar. Or, if you are old school, CVS and Subversion. When I start a project, after creating an empty folder with the name of the project, I always execute “hg init” to create my repository. Commit your code often. Every time you do something important, commit your work. Don’t wait until your project is finished. Revision control is a great backup system too. When something goes wrong, like accidentally deleting a folder, you can recover it easily. Also, with a simple hg push (or the equivalent command in other software) you will save your project in another place. If you don’t know how to use any revision control system, set aside time for it. Believe me. You will not need much time to learn how to use it. At least the basic usage.

Text editor

It’s your main tool as a developer. It’s like the hammer for a carpenter. You must train yourself in its use. Good skill with a text editor or IDE will give great benefits. If you think in terms of ROI, the time you invest in your IDE will pay off sooner or later. Some of them are not trivial. For example, Eclipse takes time to learn how to configure. You must become skilled with your IDE if you don’t want to lose time when developing (remember time = money). I’ve seen people programming PHP with Notepad. You need at least syntax highlighting, syntax error detection, auto-completion, auto-completion with code introspection, and debugging. Don’t waste your time.

Digressing about working with old code, refactoring and scope creep.

Maybe there are people who write code without bugs and never need to change one line of their source code. That’s not me. I have tried to improve my coding skills over the years. That means that sometimes I need to face my old code, and if I need to rewrite it I will probably do it in a different way. I always feel the desire to refactor it. Refactoring code is a good thing but it’s not always viable.

Let me show an example. Nowadays I’ve embraced the Zend coding standard. For instance, a variable that is not camel-cased creeps me out. But in my early days as a PHP developer (almost ten years ago) I didn’t care about any coding standards. If the code worked, it was OK. Now I realize that’s not enough. Code must work. Indeed! But it must also be ready to be changed, adapted to new needs and new requirements, and another developer must be able to understand it.

In the last days before the holidays I’ve been adapting a project to PHP 5.3. It’s not a hard job. Some functions have been deprecated or have disappeared, we are left with only one kind of regular expressions (PCRE), and a couple of things more. I need to check a lot of lines of code. Sometimes I blush when I see the code I wrote five or even eight years ago. I want to change a lot of things. Refactor again and again. But I must remember that’s not in the scope of the project. The objective of the project is clear: the application must work on a server with PHP 5.3. If I start to refactor code, the project deviates from its main scope. That’s a clear sign of scope creep.

According to the theory, the main signs of scope creep are:

  • More work has been done than required
  • Product features don’t match the specifications.

Scope creep is one of the most important problems in real projects. It’s very difficult to explain to someone that you will delay your project because you have decided to refactor some pieces of code (code that worked). That means your project will be delayed because you’ve decided to improve something that nobody asked for. If you are on schedule maybe you can take on some refactoring, but this kind of work is only viable if the triple constraint (scope, budget and schedule) stays unaltered. You must assess whether those unsolicited changes have any impact on the project. Projects must end at some point. At some point you will need to submit the project’s deliverables, put your feet up and relax. But the temptation is big. I know.

First impressions about namespaces in PHP

I’ve been working on my first project using namespaces in PHP. As a PHP developer I have suffered from the lack of namespaces in the language for years. But now it’s over. We finally have namespaces in PHP, since PHP 5.3. My early impressions weren’t good. I had a problem. I really liked the PEAR naming conventions. I know they lead to those ugly long class names, but I felt very comfortable with them. I even started to write a post for my blog called “Things I don’t like in PHP. Namespaces”. But I decided not to publish it until I had worked a bit more seriously with them. Now I have used them and my impressions have changed. I’ve embraced them and they are not so bad. They have the same advantages as the PEAR naming conventions and they give us some extra benefits. Here we have a great article about namespaces in PHP 5.3.

The only doubt I have is: are they really better than classical naming conventions?
I ask myself the question because, whereas with classical naming conventions the classes are fully defined, with namespaces we must take care of aliases, ‘use’ statements and the scope. When I say ‘fully defined’ I mean that with a glance at the class name we know exactly the location of the class in the filesystem. We pay a tax for it: our class names become bigger. Namespaces come to help us but they add a bit of complexity to our code. I feel really comfortable with classical naming conventions, but namespaces are cool. Are they some kind of hype? I don’t think so, but I am not 100% convinced. Yet. What do you think?
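To make the ‘fully defined’ point concrete, here is a small sketch of how both conventions map a class name to a file. The PEAR-style name `Nov_CouchDb_Fs_Utils` is a hypothetical counterpart of the namespaced class used earlier, and both function names are made up for the example.

```php
<?php
// PEAR-style name: underscores map to directory separators.
function pearClassToPath($class)
{
    return str_replace('_', '/', $class) . '.php';
}

// Namespaced name: backslashes map to directory separators.
function namespacedClassToPath($class)
{
    return str_replace('\\', '/', ltrim($class, '\\')) . '.php';
}

echo pearClassToPath('Nov_CouchDb_Fs_Utils') . "\n";
echo namespacedClassToPath('Nov\\CouchDb\\Fs\\Utils') . "\n";
```

Both names lead to the same file. The difference is that with namespaces the short name `Utils` in the code only tells you the location once you also read the ‘use’ statements at the top of the file.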
