Blog Archives

Yet another Database Abstraction layer with PHP and DBAL

I’m not a big fan of ORMs. I feel very confortable working with raw SQLs and because of that I normally use DBAL (or PDO in old projects). I’ve got one small library to handle my dayly operations with databases and today I’ve written this library

First of all imagine one DBAL connection. I’m using a sqlite in-memomy database in this example but we can use any database supported by DBAL (aka “almost all”):

use Doctrine\DBAL\DriverManager;

$conn = DriverManager::getConnection([
    'driver' => 'pdo_sqlite',
    'memory' => true
]);

We can also create one DBAL connection from a PDO connection. It’s usefull to use DBAL within legacy applications instead of creating a new connection (remember that DBAL works over PDO)

use Doctrine\DBAL\DriverManager;

$conn = DriverManager::getConnection(['pdo' => new PDO('sqlite::memory:')]);

Now we set up the database for the example

$conn->exec("CREATE TABLE users (
            userid VARCHAR PRIMARY KEY  NOT NULL ,
            password VARCHAR NOT NULL ,
            name VARCHAR,
            surname VARCHAR
            );");
$conn->exec("INSERT INTO users VALUES('user','pass','Name','Surname');");
$conn->exec("INSERT INTO users VALUES('user2','pass2','Name2','Surname2');");

Our table “users” has two records. Now we can start to use our library.

First we create a new instance of our library:

use G\Db;

$db = new Db($conn);

Now a simple query from a string:

$data = $db->select("select * from users");

Sometimes I’m lazy and I don’t want to write the whole SQL string and I want to perform a select * from table:

use G\Sql;
$data = $db->select(SQL::createFromTable("users"));

Probably we need to filter our Select statement with a WHERE clause:

$data = $db->select(SQL::createFromTable("users", ['userid' => 'user2']));

And now something very intersting (at least for me). I want to iterate over the recordset and maybe change it. Of course I can use “foreach” over $data and do whatever I need, but I preffer to use the following sintax:

$data = $db->select(SQL::createFromTable("users"), function (&$row) {
    $row['name'] = strtoupper($row['name']);
});

For me it’s more readable. I iterate over the recordset and change the row ‘name’ to uppercase. Here you can see what is doing my “select” function:

/**
* @param Sql|string $sql
* @param \Closure $callback
* @return array
*/
public function select($sql, \Closure $callback = null)
{
    if ($sql instanceof Sql) {
        $sqlString = $sql->getString();
        $parameters = $sql->getParameters();
        $types = $sql->getTypes();
    } else {
        $sqlString = $sql;
        $parameters = [];
        $types = [];
    }

    $statement = $this->conn->executeQuery($sqlString, $parameters, $types);
    $data = $statement->fetchAll();
    if (!is_null($callback) && count($data) > 0) {
        $out = [];
        foreach ($data as $row) {
            if (call_user_func_array($callback, [&$row]) !== false) {
                $out[] = $row;
            }
        }
        $data = $out;
   }

   return $data;
}

And finally transactions (I normally never use autocommit and I like to handle transactions by my own)

$db->transactional(function (Db $db) {
    $userId = 'temporal';

    $db->insert('users', [
        'USERID'   => $userId,
        'PASSWORD' => uniqid(),
        'NAME'     => 'name3',
        'SURNAME'  => 'name3'
    ]);

    $db->update('users', ['NAME' => 'updatedName'], ['USERID' => $userId]);
    $db->delete('users', ['USERID' => $userId]);
});

The “transactional” function it’s very simmilar than DBAL’s transactional function

public function transactional(\Closure $callback)
{
    $out = null;
    $this->conn->beginTransaction();
    try {
        $out = $callback($this);
        $this->conn->commit();
    } catch (\Exception $e) {
        $this->conn->rollback();
        throw $e;
    }

    return $out;
}

I change a little bit because I like to return a value within the closure and allow to do things like that:

$status = $db->transactional(function (Db $db) {
    $userId = 'temporal';

    $db->insert('users', [
        'USERID'   => $userId,
        'PASSWORD' => uniqid(),
        'NAME'     => 'name3',
        'SURNAME'  => 'name3'
    ]);

    $db->update('users', ['NAME' => 'updatedName'], ['USERID' => $userId]);
    $db->delete('users', ['USERID' => $userId]);

    return "OK"
});

The other functions (insert, update, delete) only bypass the calls to DBAL’s funcitons:

private $conn;

public function __construct(Doctrine\DBAL\Connection $conn)
{
    $this->conn = $conn;
}

public function insert($tableName, array $values = [], array $types = [])
{
    $this->conn->insert($tableName, $values, $types);
}

public function delete($tableName, array $where = [], array $types = [])
{
    $this->conn->delete($tableName, $where, $types);
}

public function update($tableName, array $data, array $where = [], array $types = [])
{
    $this->conn->update($tableName, $data, $where, $types);
}

And that’s all. You can use the library with composer and download at github.

BTW I’ve test the new Sensiolabs product (SensioLabs Insight) to analyze the code and verify good practices and I’ve got the Platinum medal #yeah!

Handling several DBAL Database connections in Symfony2 through the Dependency Injection Container with PHP

(This post is the second part of my previous post: Handling several PDO Database connections in Symfony2 through the Dependency Injection Container with PHP. You can read it here)

OK. We can handle PDOs connections inside a Symfony2 application, but what happens if we prefer DBAL. As we know DBAL is built over PDO and adds a set of “extra” features to our database connection. It’s something like PDO with steroids.

If we read the documentation, we will see how to use DBAL:

<?php
$config = new \Doctrine\DBAL\Configuration();
//..
$connectionParams = array(
    'dbname' => 'mydb',
    'user' => 'user',
    'password' => 'secret',
    'host' => 'localhost',
    'driver' => 'pdo_mysql',
);
$conn = DriverManager::getConnection($connectionParams, $config);

As we can see to obtain a DBAL connection we use a factory method in DriverManager class. We can easily implements it in our service container:

# databases.yml
parameters:
  doctrine.dbal.configuration: Doctrine\DBAL\Configuration
  doctrine.dbal.drivermanager: Doctrine\DBAL\DriverManager

  database.db1:
    driver: pdo_sqlite
    memory: true
  database.db2:
    driver: pdo_pgsql
    dbname: testdb
    user: username
    password: password
    host: 127.0.0.1

services:
  dbal_configuartion:
    class: %doctrine.dbal.configuration%
  db1:
    factory_class: %doctrine.dbal.drivermanager%
    factory_method: getConnection
    arguments: [%database.db1%]
  db2:
      factory_class: %doctrine.dbal.drivermanager%
      factory_method: getConnection
      arguments: [%database.db2%]

But if we run again our example Symfony will throws us one error:

RuntimeException: Please add the class to service “db1″ even if it is constructed by a factory since we might need to add method calls based on compile-time checks.

If we use this service container configuration outside Symfony2 application it works (remember we can use Symfony’s Dependency Injection Container outside Symfony application as a component. Example here). But if we want to use it with Symfony2 we need to set the “class”, even here when we only need the static constructor, so we change it to:

# databases.yml
parameters:
  doctrine.dbal.configuration: Doctrine\DBAL\Configuration
  doctrine.dbal.drivermanager: Doctrine\DBAL\DriverManager

  database.db1:
    driver: pdo_sqlite
    memory: true
  database.db2:
    driver: pdo_pgsql
    dbname: testdb
    user: username
    password: password
    host: 127.0.0.1

services:
  dbal_configuartion:
    class: %doctrine.dbal.configuration%
  db1:
    class: %doctrine.dbal.drivermanager%
    factory_class: %doctrine.dbal.drivermanager%
    factory_method: getConnection
    arguments: [%database.db1%]
  db2:
    class: %doctrine.dbal.drivermanager%
    factory_class: %doctrine.dbal.drivermanager%
    factory_method: getConnection
    arguments: [%database.db2%]

And that’s all. We can use DBAL instead of PDO in our database connections.

UPDATE:

After publishing this post someone comment me Doctrine allows us to do it “out of the box” within Symfony with its DoctrineBundle:

https://github.com/doctrine/DoctrineBundle/blob/master/Resources/doc/configuration.rst#doctrine-dbal-configuration

Handling several PDO Database connections in Symfony2 through the Dependency Injection Container with PHP

I’m not a big fan of ORMs, especially in PHP world when all dies at the end of each request. Plain SQL is easy to understand and very powerful. Anyway in PHP we have Doctrine. Doctrine is a amazing project, probably (with permission of Symfony2) the most advanced PHP project, but I normally prefer to work with SQL instead of Doctrine.

Symfony framework is closely coupled to Doctrine and it’s very easy to use the ORM from our applications. But as I said before I prefer not to use it. By the other hand I have another problem. Due to my daily work I need to connect to different databases (not only one) in my applications. In Symfony2 we normally configure the default database in our parameters.yml file:

# parameters.yml
parameters:
    database_driver: pdo_pgsql
    database_host: localhost
    database_port: 5432
    database_name: symfony
    database_user: username
    database_password: password

Ok. If we want to use PDO objects with different databases, we can use something like that:

# parameters.yml
parameters:
  database.db1.dsn: sqlite::memory:
  database.db1.username: username
  database.db1.password: password

  database.db2.dsn: pgsql:host=127.0.0.1;port=5432;dbname=testdb
  database.db2.username: username
  database.db2.password: password

And now create the PDO objects within our code with new \PDO():

$dsn      = $this->container->getParameter('database.db1.dsn');
$username = $this->container->getParameter('database.db1.username');
$password = $this->container->getParameter('database.db1.password')

$pdo = new \PDO($dsn, $username, $password);

It works, but it’s awful. We store the database credentials in the service container but we aren’t using the service container properly. So we can do one small improvement. We will create a new configuration file called databases.yml and we will include this new file within the services.yml:

# services.yml
imports:
- { resource: databases.yml }

And create our databases.yml:

# databases.yml
parameters:
  db.class: Gonzalo123\AppBundle\Db\Db

  database.db1.dsn: sqlite::memory:
  database.db1.username: username
  database.db1.password: password

  database.db2.dsn: pgsql:host=127.0.0.1;port=5432;dbname=testdb
  database.db2.username: username
  database.db2.password: password

services:
  db1:
    class: %db.class%
    calls:
      - [setDsn, [%database.db1.dsn%]]
      - [setUsername, [%database.db1.username%]]
      - [setPassword, [%database.db1.password%]]
  db2:
    class: %db.class%
    calls:
      - [setDsn, [%database.db2.dsn%]]
      - [setUsername, [%database.db2.username%]]
      - [setPassword, [%database.db2.password%]]

As we can see we have created two new services in the dependency injection container called db1 (sqlite in memory) and db2 (one postgreSql database) that use the same class (in this case ‘Gonzalo123\AppBundle\Db\Db’). So we need to create our Db class:

<?php

namespace Gonzalo123\AppBundle\Db;

class Db
{
    private $dsn;
    private $username;
    private $password;

    public function setDsn($dsn)
    {
        $this->dsn = $dsn;
    }

    public function setPassword($password)
    {
        $this->password = $password;
    }

    public function setUsername($username)
    {
        $this->username = $username;
    }

    /** @return \PDO */
    public function getPDO()
    {
        $options = array(\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION);
        return new \PDO($this->dsn, $this->username, $this->password, $options);
    }
}

And that’s all. Now we can get a new PDO object from our service container with:

$this->container->get('db1')->getPDO();

Better, isn’t it? But it’s still ugly. We need one extra class (Gonzalo123\AppBundle\Db\Db) and this class creates a new instance of PDO object (with getPDO()). Do we really need this class? the answer is no. We can change our service container to:

# databases.yml
parameters:
  pdo.class: PDO
  pdo.attr_errmode: 3
  pdo.erromode_exception: 2
  pdo.options:
    %pdo.attr_errmode%: %pdo.erromode_exception%

  database.db1.dsn: sqlite::memory:
  database.db1.username: username
  database.db1.password: password

  database.db2.dsn: pgsql:host=127.0.0.1;port=5432;dbname=testdb
  database.db2.username: username
  database.db2.password: password

services:
  db1:
    class: %pdo.class%
    arguments:
      - %database.db1.dsn%
      - %database.db1.username%
      - %database.db1.password%
      - %pdo.options%
  db2:
    class: %pdo.class%
    arguments:
      - %database.db2.dsn%
      - %database.db2.username%
      - %database.db2.password%
      - %pdo.options%

Now we don’t need getPDO() and we can get the PDO object directly from service container with:

$this->container->get('db1');

And we can use something like this within our controllers (or maybe better in models):

<?php

namespace Gonzalo123\AppBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;

class DefaultController extends Controller
{
    public function indexAction($name)
    {
        // this code should be out from controller, in a model object.
        // It is only an example
        $pdo = $this->container->get('db1');
        $pdo->exec("CREATE TABLE IF NOT EXISTS messages (id INTEGER PRIMARY KEY, title TEXT, message TEXT)");
        $pdo->exec("INSERT INTO messages(id, title, message) VALUES (1, 'title', 'message')");
        $data = $pdo->query("SELECT * FROM messages")->fetchAll();
        //

        return $this->render('AppBundle:Default:index.html.twig', array('usuario' => $data));
    }
}

Inject dependencies via PhpDoc

Last month I attended to Codemotion conference. I was listening to a talk about Java and I saw the “@inject” decorator. I must admit I switched off my mind from the conference and I started to take notes in my notebook. The idea is to implement something similar in PHP. It’s a pity we don’t have real decorators in PHP. I really miss them. We need to use PhpDoc. It’s not the same than real decorators in other programming languages. That’s my prototype. Let’s go.

Imagine this simple class:

class User
{
    private $db;

    public function getInfo($uid)
    {
        $sql = "select * from users where uid=:UID";
        $stmp = $this->db->prepare($sql);
        $stmp->execute(array('UID' => $uid));
        return $stmp->fetchAll();
    }

    public function getDb()
    {
        return $this->db;
    }
}

It doesn’t work. Private property $db must be an instance of PDO object. We can solve it with dependency injection:

class User
{
    private $db;
    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    public function getInfo($uid)
    {
        $sql = "select * from users where uid=:UID";
        $stmp = $this->db->prepare($sql);
        $stmp->execute(array('UID' => $uid));
        return $stmp->fetchAll();
    }
}

Now it works but we are going to create a simple PDO Wrapper to obtain the PDO connection.

class DbConf
{
    const DB1 = 'db1';

    private static $conf = array(
        self::DB1 => array(
            'dsn'  => 'pgsql:dbname=db;host=localhost',
            'user' => 'gonzalo',
            'pass' => 'password',
        )
    );

    public static function getConf($key)
    {
        return self::$conf[$key];
    }
}

class Db extends PDO
{
    private static $dbInstances = array();

    /**
     * @static
     * @param string $key
     * @return PDO
     */
    static function getDb($key)
    {
        if (!isset($dbInstances[$key])) {
            $dbConf = DbConf::getConf($key);
            $dbInstances[$key] = new PDO($dbConf['dsn'], $dbConf['user'], $dbConf['pass']);
        }
        return $dbInstances[$key];
    }
}

I like to use this kind of classes because I normally work with different databases and I need to use different connection depending on the SQL. It helps me to mock the database connection within the different environments (development, production, QA). Now We can use our simple class:

$user = new User(Db::getDb(DbConf::DB1));
print_r($user->getInfo(4));

The idea is to change the class into something like this:

class User extends DocInject
{
    /**
     * @inject Db::getDb(DbConf::DB1)
     * @var PDO
     */
    private $db;

    public function getInfo($uid)
    {
        $sql = "select * from users where uid=:UID";
        $stmp = $this->db->prepare($sql);
        $stmp->execute(array('UID' => $uid));
        return $stmp->fetchAll();
    }
}

Now we are going to inject the PDO connection to $db private property in the constructor:

class DocInject
{
    public function __construct()
    {
        $reflection = new ReflectionClass($this);
        foreach ($reflection->getProperties() as $property) {
            /** @var ReflectionProperty $property */
            $docComment = $property->getDocComment();
            $docComment = preg_replace('#[ \t]*(?:\/\*\*|\*\/|\*)?[ ]{0,1}(.*)?#', '$1', $docComment);
            $docComment = trim(str_replace('*/', null, $docComment));
            foreach (explode("\n", $docComment) as $item) {
                if (strpos($item, '@inject') !== false) {
                    $inject = trim(str_replace('@inject', null, $item));
                    $value = null;
                    eval("\$value = {$inject};"); // yes, eval. uggly, isnt't?
                    $property->setAccessible(true);
                    $property->setValue($this, $value);
                }
            }
        }
    }
}

If you have read “Clean Code” (if not, you must do it) you noticed that uncle Bob doesn’t like this class. The method is too long, so we are going to refactor a little bit.

class DocInject
{
    public function __construct()
    {
        $reflection = new ReflectionClass($this);
        foreach ($reflection->getProperties() as $property) {
            $this->processProperty($property);
        }
    }

    private function processProperty(ReflectionProperty $property)
    {
        $docComment = $this->cleanPhpDoc($property->getDocComment());
        foreach (explode("\n", $docComment) as $item) {
            if ($this->existsInjectDecorator($item)) {
                $this->performDependencyInjection($property, $item);
            }
        }
    }

    private function cleanPhpDoc($docComment)
    {
        $docComment = preg_replace('#[ \t]*(?:\/\*\*|\*\/|\*)?[ ]{0,1}(.*)?#', '$1', $docComment);
        $docComment = trim(str_replace('*/', null, $docComment));
        return $docComment;
    }

    private function existsInjectDecorator($item)
    {
        return strpos($item, '@inject') !== false;
    }

    private function performDependencyInjection(ReflectionProperty $property, $item)
    {
        $injectString = $this->removeDecoratorFromPhpDoc($item);
        $value = $this->compileInjectString($injectString);
        $this->injectValueIntoProperty($property, $value);
    }

    private function removeDecoratorFromPhpDoc($item)
    {
        return trim(str_replace('@inject', null, $item));
    }

    private function compileInjectString($injectString)
    {
        $value = null;
        eval("\$value = {$injectString};"); // yes, eval. uggly, isnt't?
        return $value;
    }

    private function injectValueIntoProperty(ReflectionProperty $property, $value)
    {
        $property->setAccessible(true);
        $property->setValue($this, $value);
    }
}

So now we don’t need to pass the new instance of PDO connection in the constructor with DI:

$user = new User();
print_r($user->getInfo(4));

It works but there’s something that I don’t like. We need to extend our User class with DocInject. I like plain classes. Because of that we are going to use the new feature of PHP5.4: traits

Instead of extend our class with DocInject we are going to create:

trait DocInject
{
    public function parseDocInject()
    {
        $reflection = new ReflectionClass($this);
        foreach ($reflection->getProperties() as $property) {
            $this->processProperty($property);
        }
    }

    private function processProperty(ReflectionProperty $property)
    {
        $docComment = $this->cleanPhpDoc($property->getDocComment());
        foreach (explode("\n", $docComment) as $item) {
            if ($this->existsInjectDecorator($item)) {
                $this->performDependencyInjection($property, $item);
            }
        }
    }

    private function cleanPhpDoc($docComment)
    {
        $docComment = preg_replace('#[ \t]*(?:\/\*\*|\*\/|\*)?[ ]{0,1}(.*)?#', '$1', $docComment);
        $docComment = trim(str_replace('*/', null, $docComment));
        return $docComment;
    }

    private function existsInjectDecorator($item)
    {
        return strpos($item, '@inject') !== false;
    }

    private function performDependencyInjection(ReflectionProperty $property, $item)
    {
        $injectString = $this->removeDecoratorFromPhpDoc($item);
        $value = $this->compileInjectString($injectString);
        $this->injectValueIntoProperty($property, $value);
    }

    private function removeDecoratorFromPhpDoc($item)
    {
        return trim(str_replace('@inject', null, $item));
    }

    private function compileInjectString($injectString)
    {
        $value = null;
        eval("\$value = {$injectString};"); // yes, eval. uggly, isnt't?
        return $value;
    }

    private function injectValueIntoProperty(ReflectionProperty $property, $value)
    {
        $property->setAccessible(true);
        $property->setValue($this, $value);
    }
}

And now:

class User
{
    use DocInject;

    public function __construct()
    {
        $this->parseDocInject();
    }

    /**
     * @inject Db::getDb(DbConf::DB1)
     * @var PDO
     */
    private $db;

    public function getInfo($uid)
    {
        $sql = "select * from users where uid=:UID";
        $stmp = $this->db->prepare($sql);
        $stmp->execute(array('UID' => $uid));
        return $stmp->fetchAll();
    }
}

This implementation has a little problem. If our class User needs a constructor we have a problem. As far as I know we cannot use parent::__construct() with a trait. We can solve this problem changing the code a little bit:

class User
{
    use DocInject {parseDocInject as __construct;}

    /**
     * @inject Db::getDb(DbConf::DB1)
     * @var PDO
     */
    private $db;

    public function getInfo($uid)
    {
        $sql = "select * from users where uid=:UID";
        $stmp = $this->db->prepare($sql);
        $stmp->execute(array('UID' => $uid));
        return $stmp->fetchAll();
    }
}

A simple unit test

    public function testSimple()
    {
        $user = new User();
        $this->assertTrue(count($user->getInfo(4)) > 0);
    }

If we use different DbConf file for each environment we can easily use one database or another without changing any line of code.

And that’s all. What do you think?

(Files available as gist here and here)

How to protect from SQL Injection with PHP

Security is a part of our work as developers. We need to ensure our applications against malicious attacks. SQL Injection is one of the most common possible attacks. Basically SQL Injection is one kind of attack that happens when someone injects SQL statements in our application. You can find a lot of info about SQL Injection attack. Basically you need to follow the security golden rule

Filter input
Escape output

If you work with PHP problably you work with PDO Database abstraction layer.
Let’s prepare our database for the examples (I work with PostgreSQL):

CREATE TABLE users
(
  uid integer NOT NULL,
  name character varying(50),
  surname character varying(50),
  CONSTRAINT pk_users PRIMARY KEY (uid)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE users OWNER TO gonzalo;

INSERT INTO users(uid, name, surname) VALUES (0, 'Gonzalo', 'Ayuso');
INSERT INTO users(uid, name, surname) VALUES (1, 'Hans', 'Solo');
INSERT INTO users(uid, name, surname) VALUES (2, 'Luke', 'Skywalker');

OK our database is ready. Now let create a simple query

$dbh = new PDO('pgsql:dbname=db;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('select uid, name, surname from users where uid=:ID');
$stmt->execute(array('ID' => 0));
$data = $stmt->fetchAll();

It works. We are using bind parameters, so we need to prepare one statement, execute and fetch the recordset. The use of prepared statements is strongly recommended. We can also use query() function:

$dbh = new PDO('pgsql:dbname=db;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$uid = 0;
$stmt = $dbh->query("select uid, name, surname from users where uid={$uid}");
$data = $stmt->fetchAll();

But what happens if $id came from the request and it’s not propertly escaped

$dbh = new PDO('pgsql:dbname=db;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$uid = "0; drop table users;";
$stmt = $dbh->query("select uid, name, surname from users where uid={$uid}");
$data = $stmt->fetchAll();

basically nothing: SQLSTATE[42601]: Syntax error. That’s because is not allowed to use two prepared statements in a single statement.

If we use an insert:

$dbh = new PDO('pgsql:dbname=db;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$uid     = 20;
$name    = 'Gonzalo';
$surname = "Ayuso'); drop table users; select 1 where ('1' = '1";

$count = $dbh->exec("INSERT INTO users(uid, name, surname) VALUES ({$uid}, '{$name}', '{$surname}')");

Now we have a problem. Our user table will be deleted. Why? That’s because of the user we are using to connect to the database. It’s important especially at production servers.
It’s very important not to use a database superuser in production. Superusers are very comfortable in our development servers, because you don’t need to grant privileges to every tables but if you forget this issue in production you could have Sql-Injection problems. The solution is very simple:

GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE users TO gonzalo2;

And now

$dbh = new PDO('pgsql:dbname=db;host=localhost', 'gonzalo2', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$uid     = 20;
$name    = 'Gonzalo';
$surname = "Ayuso'); drop table users; select 1 where ('1' = '1";

$count = $dbh->exec("INSERT INTO users(uid, name, surname) VALUES ({$uid}, '{$name}', '{$surname}')");

Now we are safe, at least with this possible attack.
Sumarizing:

  • Filter input
  • Escape output
  • Take care about the database users. Don’t use one user that it allowed to perform “not-allowed” operations within our application. It sounds like a pun but is important.

Database abstraction layers in PHP. PDO versus DBAL

I normally use PDO in my PHP projects. I like it because it’s a PHP extension easy to use and shares the same interface between all databases. Normally I use PostgreSQL but if I change to mySql or Oracle I don’t need to use different functions to handle the database connections.

PHP has a great project called Doctrine2. Doctrine2 is a ORM and it uses its own database abstraction layer called DBAL. In fact DBAL isn’t a pure database abstraction layer. It’s built over PDO. It’s a set of PHP classes we can use that gives us features not available with ‘pure’ PDO. If we use Doctrine2 we’re using DBAL behind the scene, but we don’t need to use Doctrine2 to use DBAL. We can use DBAL as a database abstraction layer without any ORM. Obiously this extra PHP layer over our PDO extension needs to pay a fee. I will have a look to this fee in this post. I will take one of my old post about PDO and I will do the same with DBAL to see the performance differences. Let’s start:

The PDO version:

error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$dbh->beginTransaction();

$smtp = $dbh->prepare('INSERT INTO test.tbl1 (id, field1) values (:ID, :FIELD1)');

for ($i=0; $i<1000; $i++) {
    $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
}

$dbh->commit();

$stmt = $dbh->prepare('SELECT * FROM test.tbl1 limit 10000');
$stmt->execute();

$i=0;
while ($row = $stmt->fetch()) {
	$i++;
}
echo '<h1>PDO</h1>';
echo "<strong>{$i} </strong>";

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

$dbh->beginTransaction();
$smtp = $dbh->prepare('delete from test.tbl1');
$smtp->execute();
$dbh->commit();

The DBAL version:

error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

use Doctrine\DBAL\DriverManager;

$connectionParams = array(
    'dbname'   => 'mydb',
    'user'     => 'gonzalo',
    'password' => 'password',
    'host'     => 'localhost',
    'driver'   => 'pdo_pgsql',
    );

$dbh = DriverManager::getConnection($connectionParams);

$dbh->beginTransaction();

$smtp = $dbh->prepare('INSERT INTO test.tbl1 (id, field1) values (:ID, :FIELD1)');

for ($i=0; $i<1000; $i++) {
    $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
}

$dbh->commit();

$stmt = $dbh->prepare('SELECT * FROM test.tbl1 limit 10000');
$stmt->execute();

$i=0;
while ($row = $stmt->fetch()) {
	$i++;
}
echo '<h1>DBAL</h1>';
echo "<strong>{$i} </strong>";

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

As we can see DBAL is slower than pure PDO (obiously). Anyway the most of the extra time of DBAL is the time we need to include php classes (remember PDO is a PHP extension and we dont need to include any file). If we take times excluding the include time, the memory usage is almost the same and the execution time a little slower.

Autoload for DBAL version:

spl_autoload_register(function ($class) {
        $class = str_replace('\\', '/', $class) . '.php';
        require_once($class);
    }
);

or hardcoded includes for this example

require_once('Doctrine/DBAL/Driver.php');
require_once('Doctrine/DBAL/Driver/Connection.php');
require_once('Doctrine/DBAL/Platforms/AbstractPlatform.php');
require_once('Doctrine/DBAL/Driver/Statement.php');

require_once('Doctrine/DBAL/DriverManager.php');
require_once('Doctrine/DBAL/Configuration.php');
require_once('Doctrine/Common/EventManager.php');
require_once('Doctrine/DBAL/Driver/PDOPgSql/Driver.php');
require_once('Doctrine/DBAL/Driver.php');
require_once('Doctrine/DBAL/Connection.php');
require_once('Doctrine/DBAL/Driver/Connection.php');
require_once('Doctrine/DBAL/Query/Expression/ExpressionBuilder.php');
require_once('Doctrine/DBAL/Platforms/PostgreSqlPlatform.php');
require_once('Doctrine/DBAL/Platforms/AbstractPlatform.php');
require_once('Doctrine/DBAL/Driver/PDOConnection.php');
require_once('Doctrine/DBAL/Driver/PDOStatement.php');
require_once('Doctrine/DBAL/Driver/Statement.php');
require_once('Doctrine/DBAL/Events.php');
require_once('Doctrine/DBAL/Statement.php');

Outcomes of the tests:

With pure PDO:

  • memory: 0.0044288635253906
  • seconds: 0.24748301506042

With DBAL and autoload:

  • memory: 0.97610473632812
  • seconds: 0.29042816162109

With DBAL and hardcoded requires:

  • memory: 0.97521591186523
  • seconds: 0.31192088127136

With DBAL bypassing the include part:

  • memory: 0.0099525451660156
  • seconds: 0.30333304405212

The fee we paid for using DBAL gives us some extra features. OK we don’t need DBAL to get those features. If we code a bit we can get them (remember DBAL is nothing but a PHP extra layer). But DBAL has a great interface a well documented. Now I’m going to list a few extra features from DBAL very interesting, at least for me:

Transactional mode

I really like it. It allows us to create scripts like that:

$dbh->transactional(function($conn) {
    $smtp = $conn->prepare('INSERT INTO wf.tbl1 (id, field1) values (:ID, :FIELD1)');

    for ($i=0; $i<1000; $i++) {
        $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
    }
});

A simple closure will make the code more concise and it will commit/rollback our transaction for us. In fact I borrowed this function in my PDO projects to use this interface. I love Open source.

Snippet from DBAL library:

    /**
     * Executes a function in a transaction.
     *
     * The function gets passed this Connection instance as an (optional) parameter.
     *
     * If an exception occurs during execution of the function or transaction commit,
     * the transaction is rolled back and the exception re-thrown.
     *
     * @param Closure $func The function to execute transactionally.
     */
    public function transactional(Closure $func)
    {
        $this->beginTransaction();
        try {
            $func($this);
            $this->commit();
        } catch (Exception $e) {
            $this->rollback();
            throw $e;
        }
    }

Types conversion

Really useful, at least for when I work with dates:

$date = new \DateTime("2011-03-05 14:00:21");
$stmt = $conn->prepare("SELECT * FROM articles WHERE publish_date > ?");
$stmt->bindValue(1, $date, "datetime");
$stmt->execute();

List of Parameters Conversion

It’s a cool feature too available in DBAL since Doctrine 2.1

$dbh->executeQuery('SELECT * FROM wf.tbl1 WHERE id IN (?)',
    array(array(1, 2, 3, 4, 5, 6)),
    array(\Doctrine\DBAL\Connection::PARAM_INT_ARRAY));

Bind parameters with IN clause with PDO is a bit ugly. We need to create a series of bind parameters depending on our list to map them within the SQL. It’s possible but DBAL interface is smarter.

Transaction Nesting

Another cool feature:

$dbh->beginTransaction();
try {
    $dbh->beginTransaction();
    try {
        $smtp = $dbh->prepare('INSERT INTO wf.tbl1 (id, field1) values (:ID, :FIELD1)');

        for ($i=0; $i<1000; $i++) {
            $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
        }

        } catch (Exception $e) {
            $dbh->rollback(); //transaction marked for rollback only
            throw $e;
        }
    $smtp = $dbh->prepare('INSERT INTO wf.tbl1 (id, field1) values (:ID, :FIELD1)');

    for ($i=0; $i<1000; $i++) {
        $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
    }

    $dbh->commit(); // real transaction committed
} catch (Exception $e) {
    $dbh->rollback(); // transaction rollback
    throw $e;
}

This piece of code with PDO will throw the following error:
There is already an active transaction
but it works with DBAL. If we need to do this kind of things with PDO we need to use savepoints and things like that. DBAL does the ugly part for us.

Performance analysis of Stored Procedures with PDO and PHP

Last week I had an interesting conversation on twitter about the usage of stored procedures in databases. Someone told stored procedure are evil. I’m not agree with that. Stored procedures are a great place to store business logic. In this post I’m going to test the performance of a small piece of code using stored procedures and using only PHP code.

Without stored procedures

// Without stored procedures
$time = microtime(TRUE);
$mem = memory_get_usage();

$dsn = 'pgsql:host=localhost;dbname=gonzalo;port=5432';
$user = 'user';
$password = 'password';
$conn = new PDO($dsn, $user, $password);
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$conn->beginTransaction();
$stmt = $conn->prepare('delete from web.tbltest');
$stmt->execute();

$stmt = $conn->prepare('INSERT INTO web.tbltest (field1) values (?)');
foreach (range(0,1000) as $i) {
    $stmt->execute(array($i));
}
$conn->commit();

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

With stored procedures:

// With stored procedures:
/*
CREATE OR REPLACE FUNCTION web.method1()
  RETURNS numeric AS
$BODY$
BEGIN
   DELETE FROM web.tbltest;
   FOR i IN 0..1000 LOOP
     INSERT INTO web.tbltest (field1) values (i);
   END LOOP;
   RETURN 1;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
*/
$time = microtime(TRUE);
$mem = memory_get_usage();

$dsn = 'pgsql:host=localhost;dbname=gonzalo;port=5432';
$user = 'user';
$password = 'password';
$conn = new PDO($dsn, $user, $password);
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$conn->beginTransaction();
$stmt = $conn->prepare('SELECT web.method1()');
$stmt->execute();
$stmt->setFetchMode(PDO::FETCH_ASSOC);
$out = $stmt->fetchAll();
$conn->commit();

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));
without stored procedures with stored procedures
memory: 0.0023880004882812
seconds: 0.31109309196472
memory: 0.0020713806152344
seconds: 0.065021991729736

So my conclusion: Stored procedures are not evil. The performance is really good. I know maybe it can be a bit mess if we place business logic within database and outside database at the same time, but with a good design and architecture this problem is easy to solve. What do you think?

Performance analysis fetching data with PDO and PHP.

Fetching data from databases is a common operation in our work as developers. There are many drivers (normally I use PDO), but the usage of all of them are similar and switch from one to another is not difficult (they almost share the same interface). In this post I will focus on fetching data. Basically we’ve got two functions: fetch and fetchAll. I’ve created two examples. One with fetch and another one with fetchAll:

// Example with fetch
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'username', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('SELECT * FROM tableName limit 10000');
$stmt->execute();

$i=0;
while ($row = $stmt->fetch()) {
	$i++;
}
echo '
<h1>fetch()</h1>
';
echo '
<strong>{$i} </strong>

';
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));
// Example with fetchAll
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'username', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $dbh->prepare('SELECT * FROM tableName limit 10000');
$stmt->execute();

$i=0;
$data = $stmt->fetchAll();
foreach ($data as $row) {
	$i++;
}

echo '
<h1>fetchAll()</h1>
';
echo '
<strong>{$i}</strong>

';
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

if we execute the test we obtain:

fetchAll: [memory] => 31.305999755859
fetch: [memory] => 0.002532958984375

OK. It’s obvious. If we approach to the data extraction with fetchAll method we will use more memory. That’s because we’re mapping the whole recorded to a variable ($data) at once. With the fetch loop we are mapping only on row per iteration. By the way if we change the fetch loop to:

$data = array();
while ($row = $stmt->fetch()) {
	$i++;
	$data[] = $row;
}

We will use almost the same amount of memory than the fetchAll method
[memory] => 31.267543792725

Conclusion:
Is it better fetch than fetchAll? The answer is simple: No. We only need to take care what are we doing and use the best solution that fix to our need. If we’re handling small recordset, they’re similar, but if we work with big ones we need to realize that the memory usage we are using changes drastically if we use one method or another.

Database connection pooling with PHP and gearman

Handling Database connections with PHP is pretty straightforward. Just open database connection, perform the actions we need, and close it. There’s a little problem. We cannot create a pull of database connections. We need to create the connection in every request. Create and destroy, again and again. There are some third-party solutions like SQL relay or pgpool2 (if you use PostgreSQL like me). In this post I’m going to try to explain a personal experiment to create a connection pooling using gearman. Another purpose of this experiment is create a prepared statements’ cache not only for the current request but also for all ones. Let’s start.

This is an example of SELECT statement with PDO and PHP

$dbh = new PDO('pgsql:dbname=pg1;host=localhost', 'user', 'pass');
$stmt = $dbh->prepare($sql);
$stmt->execute();
$data = $stmt->fetchAll();

Basically the idea is to use a gearman worker to perform every database operations. As far as I known we cannot pass PDO instances from gearman worker to gearman client. Even with object serialization (PDO objects cannot be serialized). That’s a problem.

My idea is use the same interface than using PDO but let the database work to the worker and obtaining a connection id instead of a real PDO connection.

That’s is the configuration class. We can see It’s defined one database connection and two gearman servers at the same host:

class PoolConf
{
    const PG1 = 'PG1';
    static $DB = array(
        self::PG1 => array(
            'dsn'      => "pgsql:dbname=gonzalo;host=localhost",
            'username' => 'user',
            'password' => 'pass',
            'options'  => null),
    );

    static $SERVERS = array(
        array('127.0.0.1', 4730),
        array('127.0.0.1', 4731),
    );
}

We start the workers:

gearmand -d --log-file=/var/log/gearman --user=gonzalo -p=4730
gearmand -d --log-file=/var/log/gearman --user=gonzalo -p=4731

How many workers we need to start? Depends on your needs. We must realize gearman is not a pool. It’s a queue. But we can start as many servers as we want  (OK it’s depends on our RAM) and create a poll of queues. We need to remember that if we start only one gearman server we only can handle only one database operation each time (it’s a queue) and it will be a huge bottleneck if the application scales. So you need to assess your site and evaluate how many concurrent operations you normally have and start as many gearman server as you need.

Maybe is difficult to explain but the final outcome will be something like that:

use Pool\Client;
$conn = Client::singleton()->getConnection(PoolConf::PG1);

$sql = "SELECT * FROM TEST.TBL1";
$stmt = $conn->prepare($sql);

$stmt->execute();
$data = $stmt->fetchall();
echo "<p>count: " . count($data) . "</p>";

We must take into account that $stmt is not a “real” PHP statement. The real PHP statement is stored into a static variable within the worker. Our $stmt is an instance of Pool\Server\Stmt class. This class has some public methods with the same name than the real statement (because of that it behaves as a real statement), and internally those methods are calls to gearman worker. The same occurs with $conn variable. It’s not a real PDO connection. It’s am instance of Pool\Server\Connection Class.

// Pool/Server/Stmt.php
namespace Pool\Server;

class Stmt
{
    private $_smtpId = null;
    private $_cid    = null;
    private $_client = null;

    function __construct($stmtId, $cid, $client)
    {
        $this->_stmt = $stmtId;
        $this->_cid = $cid;
        $this->_client = $client;
    }

    public function execute($parameters=array())
    {
        $out = $this->_client->do('execute', serialize(array(
            'parameters' => $parameters,
            'stmt'       => $this->_stmt,
            )));
        $error = unserialize($out);
        if (is_a($error, '\Pool\Exception')) {
            throw $error;
        }
        $this->_stmt = $out;
        return $this;
    }

    public function fetchAll()
    {
        $data = $this->_client->do('fetchAll', serialize(array(
            'stmt' => $this->_stmt,
            )));
        return unserialize($data);
    }
}

The worker. The following code is an extract. You can see the full code here

# Create our worker object.
$worker= new GearmanWorker();
foreach (PoolConf::$SERVERS as $server) {
    $worker->addServer($server[0], $server[1]);
}

\Pool\Server::init();

$worker->addFunction('getConnection', 'getConnection');
$worker->addFunction('prepare', 'prepare');
$worker->addFunction('execute', 'execute');
$worker->addFunction('fetchAll', 'fetchAll');
$worker->addFunction('info', 'info');
$worker->addFunction('release', 'release');
$worker->addFunction('beginTransaction', 'beginTransaction');
$worker->addFunction('commit', 'commit');
$worker->addFunction('rollback', 'rollback');

while (1) {
    try {
        $ret = $worker->work();
        if ($worker->returnCode() != GEARMAN_SUCCESS) {
            break;
        }
    } catch (Exception $e) {
        echo $e->getMessage();
    }
}

function fetchAll($job)
{
    echo __function__."\n";
    $params = unserialize($job->workload());
    $stmtId = $params['stmt'];
    return serialize(\Pool\Server::fetchAll($stmtId));
}

function execute($job)
{
    echo __function__."\n";
    $params = unserialize($job->workload());
    $stmtId = $params['stmt'];
    $parameters = $params['parameters'];
    return \Pool\Server::execute($stmtId, $parameters);
}
...

The heart of the worker is \Pool\Server class. This class performs every real PDO operations and stores statements and connections into static private variables.

And now we can use the database pool reusing connections and prepared statements. You can see here a small performance test of reusing prepared statements in an older post.

I’ve also implemented a small error handling. Errors in the worker are serialized and thrown on the client simulating the normal operation of standard PDO usage.

Now a set of examples:

Simple queries. And a simple error handling:

include('../conf/PoolConf.php');
include('../lib/Pool/Client.php');
include('../lib/Pool/Server.php');
include('../lib/Pool/Exception.php');
include('../lib/Pool/Server/Connection.php');
include('../lib/Pool/Server/Stmt.php');

use Pool\Client;
$conn = Client::singleton()->getConnection(PoolConf::PG1);

$sql = "SELECT * FROM TEST.TBL1";
$stmt = $conn->prepare($sql);

$stmt->execute();
$data = $stmt->fetchall();
echo "<p>count: " . count($data) . "</p>";

try {
    $sql = "SELECT * TEST.NON_EXISTENT_TABLE";
    $stmt = $conn->prepare($sql);

    $stmt->execute();
    $data = $stmt->fetchall();
    echo "<p>count: " . count($data) . "</p>";

} catch (Exception $e) {
    echo "ERROR: " . $e->getMessage();
}

print_r(Client::singleton()->info(PoolConf::PG1));

Now with bind parameters:

<?php
include('../conf/PoolConf.php');
include('../lib/Pool/Client.php');
include('../lib/Pool/Server.php');
include('../lib/Pool/Exception.php');
include('../lib/Pool/Server/Connection.php');
include('../lib/Pool/Server/Stmt.php');

use Pool\Client;
$conn = Client::singleton()->getConnection(PoolConf::PG1);

$data = $conn->prepare("SELECT * FROM TEST.TBL1 WHERE SELECCION=:S")->execute(array('S' => 1))->fetchall();

echo count($data);

print_r(Client::singleton()->info(PoolConf::PG1));

And now a transaction:

<?php
include('../conf/PoolConf.php');
include('../lib/Pool/Client.php');
include('../lib/Pool/Server.php');
include('../lib/Pool/Exception.php');
include('../lib/Pool/Server/Connection.php');
include('../lib/Pool/Server/Stmt.php');

use Pool\Client;
$conn = Client::singleton()->getConnection(PoolConf::PG1);

$conn->beginTransaction();
$data = $conn->prepare("SELECT * FROM TEST.TBL1 WHERE SELECCION=:S")->execute(array('S' => 1))->fetchall();
$conn->rollback();

print_r(Client::singleton()->info(PoolConf::PG1));

Conclusion.

That’s a personal experiment. It works, indeed, but probably it’s crowed by bugs. You can use it in production if you are a brave developer :).

Source code.
The full sourcecode is available here at google code

Follow

Get every new post delivered to your Inbox.

Join 869 other followers