Monthly Archives: August 2010
Using CouchDb as filesystem with PHP
One of the problems I need to solve in my clustered PHP applications is where to store files. When I say files I’m not speaking about source code. I’m speaking about additional data files, such as downloadable PDFs, logs, etc. Those files must be available on every node of the cluster. One possible approach is to use a distributed filesystem, rsync, or a file server mounted on every node. Another solution is CouchDb. CouchDb has two great features that meet our requirements for this problem: it allows us to store files as attachments, and it makes multi-master replication very easy to set up.
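As a rough sketch of what multi-master replication means in practice: every node replicates the database to every other node. CouchDB exposes replication through a POST to `/_replicate` with a JSON body naming a source and a target. The helper below only builds those request descriptions; the function name and the node URLs are made up for illustration and are not part of any library.

```php
<?php
// Hypothetical helper, for illustration only.
// For every ordered pair of nodes, describe one replication request:
// POST {source node}/_replicate with a JSON body naming source and target.
function replication_requests(array $nodes, $db)
{
    $requests = array();
    foreach ($nodes as $source) {
        foreach ($nodes as $target) {
            if ($source === $target) {
                continue; // a node does not replicate to itself
            }
            $requests[] = array(
                'post_to' => $source . '/_replicate',
                'body'    => array(
                    'source'     => $source . '/' . $db,
                    'target'     => $target . '/' . $db,
                    'continuous' => true, // keep replicating as new changes arrive
                ),
            );
        }
    }
    return $requests;
}

// Assumed node URLs; replace them with the real cluster nodes.
$requests = replication_requests(array('http://node1:5984', 'http://node2:5984'), 'fs');
```

With two nodes this yields two requests, one in each direction, which is what makes the setup multi-master.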
Using CouchDB is pretty straightforward. It implements a RESTful interface for every operation, so the only thing we need is a REST client. Zend Framework has a great one, but we don’t really need a library: we can easily perform REST requests with PHP’s curl extension. I’ve created two libraries for working with CouchDb: one is a low-level HTTP client (built on curl) and the other is a higher-level one (it uses the HTTP client) for CouchDB operations. You can read two posts about those libraries here and here.
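To give an idea of how little is needed, here is a minimal sketch of a REST call with the curl extension. The two function names are hypothetical and this is not the code of my libraries; it only assumes a CouchDB server that speaks JSON over HTTP.

```php
<?php
// Hypothetical helpers for illustration; they are not part of the Nov library.

// Build the URL of a document. The id is URL-encoded because ids may contain slashes.
function couch_doc_url($host, $port, $db, $id)
{
    return sprintf('http://%s:%d/%s/%s', $host, $port, rawurlencode($db), rawurlencode($id));
}

// Send one request (GET, PUT, DELETE, ...) and return the decoded JSON response.
function couch_request($method, $url, $data = null)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    if ($data !== null) {
        // Documents travel as JSON in the request body
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return json_decode($body, true);
}
```

With a CouchDB server running locally, couch_request('GET', couch_doc_url('localhost', 5984, 'users', 'gonzalo')) would return the document as an array.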
Now I want to extend the features of my library. I want to use CouchDB as file storage in PHP. Instead of using the file functions (fopen, fwrite, fread, …) I want to use other ones that store and read files in CouchDB. To do this I’ve refactored those two libraries into a new one called Nov. I have also embraced namespaces and use them in the library, which means it only works with PHP 5.3 or later.
Here is a summary of the library. It is not a complete UML diagram, only a sketch of the main features for educational purposes.
The best way to show the library is with an example:
First I’m going to start with the basic usage of the Nov\CouchDb library:
// Starting up the loader
require_once("Nov/Loader.php");
Nov\Loader::init();

use Nov\CouchDb;

$cdb = new CouchDb('localhost', 5984);
$cdb->db('users');
$nombre = $cdb->db('ris_users')->select('gonzalo')->asObject()->name;
$apellido = $cdb->db('ris_users')->select('gonzalo')->asObject()->surname;
echo "Hello {$nombre} {$apellido}. ";
To allow me to use different CouchDb databases and to keep the database configuration in one file, I use the following configuration class:
class NovConf
{
    const CDB1 = 'CDB1';
    const PG1 = 'PG1';

    public static $_dbs = array(
        self::PG1 => array(
            'driver' => 'pgsql',
            'dsn' => 'pgsql:dbname=pg1;host=localhost',
            'username' => null,
            'password' => null,
        ),
        self::CDB1 => array(
            'driver' => 'couchdb',
            'host' => 'localhost',
            'port' => 5984,
            'protocol' => 'http',
            'username' => null,
            'password' => null,
        ),
    );
}
As you can see I use the same configuration file for my PDO drivers and CouchDb.
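To illustrate what a factory can do with one of those entries, here is a small sketch that turns a CouchDb configuration array into the base URL of the server. The function couch_base_url() is hypothetical; the real factory in the Nov library may work differently.

```php
<?php
// Hypothetical helper, for illustration only: it is not the Nov factory.
// Turns one CouchDb configuration entry into the base URL of the server.
function couch_base_url(array $conf)
{
    $auth = '';
    if ($conf['username'] !== null) {
        // Credentials, when present, go in the URL as user:password@
        $auth = rawurlencode($conf['username']) . ':' . rawurlencode((string) $conf['password']) . '@';
    }
    return sprintf('%s://%s%s:%d', $conf['protocol'], $auth, $conf['host'], $conf['port']);
}

// Same shape as the CDB1 entry of the configuration class
$cdb1 = array(
    'driver'   => 'couchdb',
    'host'     => 'localhost',
    'port'     => 5984,
    'protocol' => 'http',
    'username' => null,
    'password' => null,
);

$baseUrl = couch_base_url($cdb1);
```

Keeping the configuration as plain arrays means the same lookup by key (PG1, CDB1) can serve both the PDO connections and the CouchDb ones.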
Now I can use:
require_once("Nov/Loader.php");
Nov\Loader::init();

use Nov\CouchDb;

$cdb = CouchDb::factory(NovConf::CDB1)->db('users');
try {
    $cdb->insert('xxx', array('name' => 'xxx'));
} catch (CouchDb\Exception\DupValOnIndex $e) {
    echo "Already created\n";
}
$data = $cdb->select('xxx')->asObject();
$cdb->update('xxx', array('name' => 'xxx1'));
$cdb->delete('xxx')->asObject();
And now finally the file storage part:
For storing the files I’ve made one design decision: every file is stored in a separate CouchDb document. That means one file, one document. There’s another possible approach: one CouchDb document could represent a folder and store every file in it as an attachment of that document. But I prefer not to track folders, only files. So each CouchDb document will have only one attachment.
Here is an example of one document in CouchDb:
{
    "_id": "/home/gonzalo/aasa.txt",
    "_rev": "2-48b501a81c38fd84a3e0351917e64135",
    "path": "/home/gonzalo",
    "_attachments": {
        "aasa.txt": {
            "stub": true,
            "content_type": "application/octet-stream",
            "length": 12,
            "revpos": 2
        }
    }
}
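The mapping from a file to a document like this one can be sketched in a few lines. The document id is the full path, the path field is the directory, and the attachment is named after the file. The function fs_build_document() is hypothetical, and it uses CouchDB’s inline attachments (base64 data inside _attachments), which is only one way to upload the content and not necessarily what Nov\CouchDb\Fs does internally.

```php
<?php
// Hypothetical helper, for illustration only (not the Nov\CouchDb\Fs code).
// One file, one document: the full path is the id and the file is the only attachment.
function fs_build_document($path, $content, $contentType = 'application/octet-stream')
{
    return array(
        '_id'          => $path,
        'path'         => dirname($path),
        '_attachments' => array(
            basename($path) => array(
                'content_type' => $contentType,
                // Inline attachments are sent base64-encoded
                'data'         => base64_encode($content),
            ),
        ),
    );
}

$doc = fs_build_document('/home/gonzalo/aasa.txt', 'asasasasasas');
echo json_encode($doc);
```

Once stored, CouchDB returns the attachment as a stub with its length and content type, as in the document shown here.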
Here is another usage script, where we can see all the features together. We create, update and delete files. Internally Nov\CouchDb\Fs uses a predefined CouchDb database called fs.
use Nov\CouchDb\Fs;
use Nov\CouchDb\Fs\Exception;

require_once ("Nov/Loader.php");
Nov\Loader::init();

echo "<pre>";
// create an instance from a factory method
$fs = Fs::factory(NovConf::CDB1);

// Now we're going to delete a file. If it doesn't exist, a FileNotFound exception is thrown
try {
    $fs->delete("/home/gonzalo/aaa.txt");
} catch (Exception\FileNotFound $e) {
    echo $e->getMessage() . "\n";
}

// Now we are going to create a file.
// The second parameter 'true' means the file will be created if it doesn't exist
// (similar to fopen()'s 'w+' mode)
try {
    $fs->open("/home/gonzalo/aaa.txt", true)
       ->write("asasasasasas", "application/octet-stream");
} catch (Exception\FileNotFound $e) {
    echo $e->getMessage() . "\n";
} catch (Exception\WriteError $e) {
    echo $e->getMessage() . "\n";
} catch (Exception $e) {
    echo $e->getMessage() . "\n";
}

// We open the file
$res = $fs->open("/home/gonzalo/aaa.txt");
// we can get the length and the content type
echo $res->getLenght() . "\n";
echo $res->getContentType(). "\n";

// We move it to another location
$to = "/another/location";
$res->move($to);
$res = $fs->open($to);

// we flush the file to the browser
echo $res->raw();

// finally we delete it
$res->delete();
echo "</pre>";
I’ve also created an extra class to dump files from the filesystem to CouchDb and vice versa.
require_once ("Nov/Loader.php");
Nov\Loader::init();

echo "<pre>";
// from filesystem to couchdb
\Nov\CouchDb\Fs\Utils::fs2cdb("/path/from/", NovConf::CDB1);
// from couchdb to filesystem
\Nov\CouchDb\Fs\Utils::cdb2fs(NovConf::CDB1, "/path/to/");
echo "</pre>";
And that’s all. You can download the source code with the examples here. The examples are under the document_root/tests/couchdb/ folder. Remember that you will need PHP 5.3.
Looking for the perfect PHP IDE
I’ve got a problem: I haven’t found the perfect IDE for me. Yet. I have problems with every piece of software I try. Here I will try to explain those problems, and maybe someone will show me the light and help me discover the perfect editor/IDE.
First, the important thing: I’m a Linux user, so Windows-only software is out. I know Mac users really love TextMate, but buying a Mac is not in my plans. Macs are good hardware, reliable and of course cool. I know that nowadays a lot of web developers work on Macs, but I’m not yet convinced to switch from a PC with Linux to a Mac. I will probably ask myself the question again when the core breaks down. Don’t ask me why, but my last three PCs broke down with a core failure (after years of use). So the baseline is clear: the software must work on Linux, and it must work natively, not with Wine or things like that.
My requirements list is the following:
- Code auto-completion.
- Syntax highlighting.
- Debugger.
Easy, isn’t it? But even with those simple requirements I haven’t yet found my perfect IDE. So I’m going to enumerate the problems I’ve got with each one.
Zend Studio 7
It’s not free. That’s not the main problem, but we can find similar tools for free. The debugger works fine. Code auto-completion is really good. Syntax highlighting is perfect. It has auto-indentation and one of my favourite features, code analysis, which is really useful. It detects problems such as:
$var = 0;
if ($Var == 0) {
    // $Var and $var are different variables. It's probably a typo, and very difficult to detect at a glance
    ...
}
It looks like the perfect IDE but it’s maddeningly slow. If you are working with a few files it’s perfect, but if your project is big (hundreds of files) you are sometimes forced to take too many coffee breaks while it goes crazy (building workspace…). Maybe it works fine if you’ve got a super PC with infinite RAM and a hundred cores but, at least for me, it’s irritating.
Eclipse PDT
Zend Studio 7 is based on Eclipse. PDT is almost the same as Zend Studio 7 (without its great integration with the Zend services, which I don’t use). It has exactly the same problem (it’s very slow) and lacks some extra features, such as code analysis (really useful for me). On the other hand it’s free (as in free beer).
Netbeans
Similar features to Eclipse PDT, and lighter. Also free. The problem here is the debugger. It only works with xdebug, which is not a real problem for me: I don’t really mind which debugger I use, and both Zend debugger and xdebug meet my requirements. But the way you debug is a bit strange and I don’t like it. In early versions it generated the URL when you started to debug, and that URL could not be changed properly, which means it didn’t work for me. Now it’s better but still not good. It looks like the PHP plugin for Netbeans isn’t as good as the Java one. The future of Netbeans isn’t clear after Oracle’s acquisition of Sun. They said Netbeans isn’t a strategic project (which means they cut some funding).
VIM
It looks like the perfect editor but it’s hard to learn, at least in the beginning. Syntax highlighting is perfect. It’s really light. It works perfectly everywhere, even on remote hosts via ssh. Code auto-completion works, but not as well as in Eclipse and Netbeans (it’s fair enough, though). I say that because, for example, it doesn’t hint the variables of the function and it ignores PHPDoc in the auto-completion pop-up, and I really appreciate those hints. The main problem I have is with the debugger. It’s so strange for me. I think I need to spend a couple of hours working on it, because nowadays debugging with VIM is something like a miracle for me. Sometimes it works, but it takes too much effort for now. I hope that will change. If you want to know more about PHP and VIM, you should take a look at this link.
Emacs
I must admit that I am too lazy to learn it. I invested time in VIM and I feel that I still need more time with it, so I don’t want to start from scratch with another editor like Emacs. Maybe I’m wrong and it is the perfect IDE, but it’s not on my mind right now.
Zend Studio 5.5
That “was” the perfect IDE. It was not free but it was really good. Fast, even with big projects (sometimes it went crazy and crashed, I know). Debugging was perfect: only with Zend debugger, but really easy to use. You didn’t need a great PC; 2 GB of RAM was enough. But there’s a problem: this software is obsolete. Zend switched to an Eclipse-based product. We can still use it, but there are no updates. The main problem is that we cannot use any new PHP 5.3 features, such as namespaces. OK, we can use them if our server has PHP 5.3, but the IDE marks those new keywords as syntax errors. So if we work on a PHP 5.3 project with this software, we need to accept that the red warnings in the editor (syntax errors) are not always errors. It would be great news for me if the Zend people released Zend Studio 5.5 as open source and someone continued the project, adding PHP 5.3 support.
As you can see I’m not 100% happy with any IDE. Do you have a perfect IDE? I’d really like to be wrong in my assumptions. So I’ll keep looking.
My development tips
code = mass
DRY (Don’t repeat yourself)
Coding standards
Revision control.
Text editor
Digressing about working with old code, refactoring and scope creep.
According to the theory, the main signs of scope creep are:
- More work has been done than required
- Product features don’t match the specifications.
Scope creep is one of the most important problems in real projects. It’s very difficult to explain to someone that you will delay your project because you have decided to refactor some pieces of code (code that worked). That means your project will be delayed because you’ve decided to improve something that nobody asked for. If you are on schedule maybe you can afford some refactoring, but bear in mind that this kind of work is only viable if the triple constraint (scope, budget and schedule) remains unaltered. You must assess whether those unsolicited changes have any impact on the project. Projects must end at some point. At some point you will need to submit the project’s deliverables, put your feet up and relax. But the temptation is big. I know.
First impressions about namespaces in PHP
The only doubt I have is: Are they really better than classical naming conventions?
I ask myself the question because, whereas with the classical naming convention class names are fully defined, with namespaces we must take care of aliases, ‘use’ statements and the scope. When I say ‘fully defined’ I mean that a glance at the class name tells us exactly where the class lives in the filesystem. We pay a tax for it: our class names become longer. Namespaces come to help us, but they add a bit of complexity to our code. I feel really comfortable with classical naming conventions, but namespaces are cool. Are they some kind of hype? I don’t think so, but I am not 100% convinced. Yet. What do you think?
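To make the comparison concrete, here is a small sketch of how each style maps a class name to a file. Both functions are hypothetical, and Nov_CouchDb_Fs is an invented classical-style name used only for the comparison (the library itself uses the namespaced Nov\CouchDb\Fs).

```php
<?php
// Hypothetical helpers, for illustration only.

// Classical convention: underscores in the class name are directory separators.
function classical_class_to_path($class)
{
    return str_replace('_', '/', $class) . '.php';
}

// Namespaces: backslashes in the fully qualified name are directory separators.
function namespaced_class_to_path($class)
{
    return str_replace('\\', '/', ltrim($class, '\\')) . '.php';
}

echo classical_class_to_path('Nov_CouchDb_Fs') . "\n";
echo namespaced_class_to_path('Nov\\CouchDb\\Fs') . "\n";
```

Both calls resolve to the same file. The difference is in the calling code: with the classical name the full path is visible every time the class is used, while with namespaces a plain Fs only tells us where it lives after we check the ‘use’ statements at the top of the file.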