Performing UPSERT (Update or Insert) with PostgreSQL and PHP

That’s a typical situation. Imagine you’ve got one table

CREATE TABLE PUBLIC.TBUPSERTEXAMPLE
(
  KEY1 CHARACTER VARYING(10) NOT NULL,
  KEY2 CHARACTER VARYING(14) NOT NULL,
  KEY3 CHARACTER VARYING(14) NOT NULL,
  KEY4 CHARACTER VARYING(14) NOT NULL,

  VALUE1 CHARACTER VARYING(20),
  VALUE2 CHARACTER VARYING(20) NOT NULL,
  VALUE3 CHARACTER VARYING(100),
  VALUE4 CHARACTER VARYING(400),
  VALUE5 CHARACTER VARYING(20),

  CONSTRAINT TBUPSERTEXAMPLE_PKEY PRIMARY KEY (KEY1, KEY2, KEY3, KEY4)
)

And you need to update one record. You can perform a simple UPDATE statement but what happens the first time?

You cannot update the record basically because the record doesn’t exists. You need to create an INSERT statement instead. We can do it following different ways. You can create first a SELECT statement and, if the record exists, perform an UPDATE. If it doesn’t exists you perform an INSERT. We also can perform an UPDATE and see how many records are affected. If no records are affected then we perform an INSERT. Finally we can perform one INSERT and it it throws an error then perform an UPDATE.

All of these techniques works in one way or another but PostgreSQL gives us one cool way of doing this operation with one SQL sentence. We can use CTE (Common Table Expression) and execute something like this:

WITH upsert AS (
    UPDATE PUBLIC.TBUPSERTEXAMPLE
    SET
        VALUE1 = :VALUE1,
        VALUE2 = :VALUE2,
        VALUE3 = :VALUE3,
        VALUE4 = :VALUE4,
        VALUE5 = :VALUE5
    WHERE
        KEY1 = :KEY1 AND
        KEY2 = :KEY2 AND
        KEY3 = :KEY3 AND
        KEY4 = :KEY4
    RETURNING *
)
INSERT INTO PUBLIC.TBUPSERTEXAMPLE(KEY1, KEY2, KEY3, KEY4, VALUE1, VALUE2, VALUE3, VALUE4, VALUE5)
SELECT
    :KEY1, :KEY2, :KEY3, :KEY4, :VALUE1, :VALUE2, :VALUE3, :VALUE4, :VALUE5
WHERE
    NOT EXISTS (SELECT 1 FROM upsert);

Since PostgreSQL 9.5 we also can do another technique to do this UPSERT operations. We can do something like this:

INSERT INTO PUBLIC.TBUPSERTEXAMPLE (key1, key2, key3, key4, value1, value2, value3, value4, value5)
  VALUES ('key2', 'key2', 'key3', 'key4', 'value1',  'value2',  'value3',  'value4',  'value5')
ON CONFLICT (key1, key2, key3, key4)
DO UPDATE SET 
  value1 = 'value1', 
  value2 = 'value2', 
  value3 = 'value3', 
  value4 = 'value4', 
  value5 = 'value5'
WHERE 
  TBUPSERTEXAMPLE.key1 = 'key2' AND 
  TBUPSERTEXAMPLE.key2 = 'key2' AND 
  TBUPSERTEXAMPLE.key3 = 'key3' AND 
  TBUPSERTEXAMPLE.key4 = 'key4';

To help me writing this sentence I’ve created a simple PHP wrapper:

Here one example using PDO

use G\SqlUtils\Upsert;

$conn = new PDO('pgsql:dbname=gonzalo;host=localhost', 'username', 'password');
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$conn->beginTransaction();
try {
    Upsert::createFromPDO($conn)->exec('PUBLIC.TBUPSERTEXAMPLE', [
        'KEY1' => 'key1',
        'KEY2' => 'key2',
        'KEY3' => 'key3',
        'KEY4' => 'key4',
    ], [
        'VALUE1' => 'value1',
        'VALUE2' => 'value2',
        'VALUE3' => 'value3',
        'VALUE4' => 'value4',
        'VALUE5' => 'value5',
    ]);
    $conn->commit();
} catch (Exception $e) {
    $conn->rollback();
    throw $e;
}

And another one using DBAL

use Doctrine\DBAL\DriverManager;
use G\SqlUtils\Upsert;

$connectionParams = [
    'dbname'   => 'gonzalo',
    'user'     => 'username',
    'password' => 'password',
    'host'     => 'localhost',
    'driver'   => 'pdo_pgsql',
];

$dbh = DriverManager::getConnection($connectionParams);
$dbh->transactional(function ($conn) {
    Upsert::createFromDBAL($conn)->exec('PUBLIC.TBUPSERTEXAMPLE', [
        'KEY1' => 'key1',
        'KEY2' => 'key2',
        'KEY3' => 'key3',
        'KEY4' => 'key4',
    ], [
        'VALUE1' => 'value1',
        'VALUE2' => 'value2',
        'VALUE3' => 'value3',
        'VALUE4' => null,
        'VALUE5' => 'value5',
    ]);
});

And that’s all. Library is available in my github and it’s also at packagist.

Advertisement

Yet another Database Abstraction layer with PHP and DBAL

I’m not a big fan of ORMs. I feel very confortable working with raw SQLs and because of that I normally use DBAL (or PDO in old projects). I’ve got one small library to handle my dayly operations with databases and today I’ve written this library

First of all imagine one DBAL connection. I’m using a sqlite in-memomy database in this example but we can use any database supported by DBAL (aka “almost all”):

use Doctrine\DBAL\DriverManager;

$conn = DriverManager::getConnection([
    'driver' => 'pdo_sqlite',
    'memory' => true
]);

We can also create one DBAL connection from a PDO connection. It’s usefull to use DBAL within legacy applications instead of creating a new connection (remember that DBAL works over PDO)

use Doctrine\DBAL\DriverManager;

$conn = DriverManager::getConnection(['pdo' => new PDO('sqlite::memory:')]);

Now we set up the database for the example

$conn->exec("CREATE TABLE users (
            userid VARCHAR PRIMARY KEY  NOT NULL ,
            password VARCHAR NOT NULL ,
            name VARCHAR,
            surname VARCHAR
            );");
$conn->exec("INSERT INTO users VALUES('user','pass','Name','Surname');");
$conn->exec("INSERT INTO users VALUES('user2','pass2','Name2','Surname2');");

Our table “users” has two records. Now we can start to use our library.

First we create a new instance of our library:

use G\Db;

$db = new Db($conn);

Now a simple query from a string:

$data = $db->select("select * from users");

Sometimes I’m lazy and I don’t want to write the whole SQL string and I want to perform a select * from table:

use G\Sql;
$data = $db->select(SQL::createFromTable("users"));

Probably we need to filter our Select statement with a WHERE clause:

$data = $db->select(SQL::createFromTable("users", ['userid' => 'user2']));

And now something very intersting (at least for me). I want to iterate over the recordset and maybe change it. Of course I can use “foreach” over $data and do whatever I need, but I preffer to use the following sintax:

$data = $db->select(SQL::createFromTable("users"), function (&$row) {
    $row['name'] = strtoupper($row['name']);
});

For me it’s more readable. I iterate over the recordset and change the row ‘name’ to uppercase. Here you can see what is doing my “select” function:

/**
* @param Sql|string $sql
* @param \Closure $callback
* @return array
*/
public function select($sql, \Closure $callback = null)
{
    if ($sql instanceof Sql) {
        $sqlString = $sql->getString();
        $parameters = $sql->getParameters();
        $types = $sql->getTypes();
    } else {
        $sqlString = $sql;
        $parameters = [];
        $types = [];
    }

    $statement = $this->conn->executeQuery($sqlString, $parameters, $types);
    $data = $statement->fetchAll();
    if (!is_null($callback) && count($data) > 0) {
        $out = [];
        foreach ($data as $row) {
            if (call_user_func_array($callback, [&$row]) !== false) {
                $out[] = $row;
            }
        }
        $data = $out;
   }

   return $data;
}

And finally transactions (I normally never use autocommit and I like to handle transactions by my own)

$db->transactional(function (Db $db) {
    $userId = 'temporal';

    $db->insert('users', [
        'USERID'   => $userId,
        'PASSWORD' => uniqid(),
        'NAME'     => 'name3',
        'SURNAME'  => 'name3'
    ]);

    $db->update('users', ['NAME' => 'updatedName'], ['USERID' => $userId]);
    $db->delete('users', ['USERID' => $userId]);
});

The “transactional” function it’s very simmilar than DBAL’s transactional function

public function transactional(\Closure $callback)
{
    $out = null;
    $this->conn->beginTransaction();
    try {
        $out = $callback($this);
        $this->conn->commit();
    } catch (\Exception $e) {
        $this->conn->rollback();
        throw $e;
    }

    return $out;
}

I change a little bit because I like to return a value within the closure and allow to do things like that:

$status = $db->transactional(function (Db $db) {
    $userId = 'temporal';

    $db->insert('users', [
        'USERID'   => $userId,
        'PASSWORD' => uniqid(),
        'NAME'     => 'name3',
        'SURNAME'  => 'name3'
    ]);

    $db->update('users', ['NAME' => 'updatedName'], ['USERID' => $userId]);
    $db->delete('users', ['USERID' => $userId]);

    return "OK"
});

The other functions (insert, update, delete) only bypass the calls to DBAL’s funcitons:

private $conn;

public function __construct(Doctrine\DBAL\Connection $conn)
{
    $this->conn = $conn;
}

public function insert($tableName, array $values = [], array $types = [])
{
    $this->conn->insert($tableName, $values, $types);
}

public function delete($tableName, array $where = [], array $types = [])
{
    $this->conn->delete($tableName, $where, $types);
}

public function update($tableName, array $data, array $where = [], array $types = [])
{
    $this->conn->update($tableName, $data, $where, $types);
}

And that’s all. You can use the library with composer and download at github.

BTW I’ve test the new Sensiolabs product (SensioLabs Insight) to analyze the code and verify good practices and I’ve got the Platinum medal #yeah!

Handling several DBAL Database connections in Symfony2 through the Dependency Injection Container with PHP

(This post is the second part of my previous post: Handling several PDO Database connections in Symfony2 through the Dependency Injection Container with PHP. You can read it here)

OK. We can handle PDOs connections inside a Symfony2 application, but what happens if we prefer DBAL. As we know DBAL is built over PDO and adds a set of “extra” features to our database connection. It’s something like PDO with steroids.

If we read the documentation, we will see how to use DBAL:

<?php
$config = new \Doctrine\DBAL\Configuration();
//..
$connectionParams = array(
    'dbname' => 'mydb',
    'user' => 'user',
    'password' => 'secret',
    'host' => 'localhost',
    'driver' => 'pdo_mysql',
);
$conn = DriverManager::getConnection($connectionParams, $config);

As we can see to obtain a DBAL connection we use a factory method in DriverManager class. We can easily implements it in our service container:

# databases.yml
parameters:
  doctrine.dbal.configuration: Doctrine\DBAL\Configuration
  doctrine.dbal.drivermanager: Doctrine\DBAL\DriverManager

  database.db1:
    driver: pdo_sqlite
    memory: true
  database.db2:
    driver: pdo_pgsql
    dbname: testdb
    user: username
    password: password
    host: 127.0.0.1

services:
  dbal_configuartion:
    class: %doctrine.dbal.configuration%
  db1:
    factory_class: %doctrine.dbal.drivermanager%
    factory_method: getConnection
    arguments: [%database.db1%]
  db2:
      factory_class: %doctrine.dbal.drivermanager%
      factory_method: getConnection
      arguments: [%database.db2%]

But if we run again our example Symfony will throws us one error:

RuntimeException: Please add the class to service “db1” even if it is constructed by a factory since we might need to add method calls based on compile-time checks.

If we use this service container configuration outside Symfony2 application it works (remember we can use Symfony’s Dependency Injection Container outside Symfony application as a component. Example here). But if we want to use it with Symfony2 we need to set the “class”, even here when we only need the static constructor, so we change it to:

# databases.yml
parameters:
  doctrine.dbal.configuration: Doctrine\DBAL\Configuration
  doctrine.dbal.drivermanager: Doctrine\DBAL\DriverManager

  database.db1:
    driver: pdo_sqlite
    memory: true
  database.db2:
    driver: pdo_pgsql
    dbname: testdb
    user: username
    password: password
    host: 127.0.0.1

services:
  dbal_configuartion:
    class: %doctrine.dbal.configuration%
  db1:
    class: %doctrine.dbal.drivermanager%
    factory_class: %doctrine.dbal.drivermanager%
    factory_method: getConnection
    arguments: [%database.db1%]
  db2:
    class: %doctrine.dbal.drivermanager%
    factory_class: %doctrine.dbal.drivermanager%
    factory_method: getConnection
    arguments: [%database.db2%]

And that’s all. We can use DBAL instead of PDO in our database connections.

UPDATE:

After publishing this post someone comment me Doctrine allows us to do it “out of the box” within Symfony with its DoctrineBundle:

https://github.com/doctrine/DoctrineBundle/blob/master/Resources/doc/configuration.rst#doctrine-dbal-configuration

Database abstraction layers in PHP. PDO versus DBAL

I normally use PDO in my PHP projects. I like it because it’s a PHP extension easy to use and shares the same interface between all databases. Normally I use PostgreSQL but if I change to mySql or Oracle I don’t need to use different functions to handle the database connections.

PHP has a great project called Doctrine2. Doctrine2 is a ORM and it uses its own database abstraction layer called DBAL. In fact DBAL isn’t a pure database abstraction layer. It’s built over PDO. It’s a set of PHP classes we can use that gives us features not available with ‘pure’ PDO. If we use Doctrine2 we’re using DBAL behind the scene, but we don’t need to use Doctrine2 to use DBAL. We can use DBAL as a database abstraction layer without any ORM. Obiously this extra PHP layer over our PDO extension needs to pay a fee. I will have a look to this fee in this post. I will take one of my old post about PDO and I will do the same with DBAL to see the performance differences. Let’s start:

The PDO version:

error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

$dbh = new PDO('pgsql:dbname=mydb;host=localhost', 'gonzalo', 'password');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$dbh->beginTransaction();

$smtp = $dbh->prepare('INSERT INTO test.tbl1 (id, field1) values (:ID, :FIELD1)');

for ($i=0; $i<1000; $i++) {
    $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
}

$dbh->commit();

$stmt = $dbh->prepare('SELECT * FROM test.tbl1 limit 10000');
$stmt->execute();

$i=0;
while ($row = $stmt->fetch()) {
	$i++;
}
echo '<h1>PDO</h1>';
echo "<strong>{$i} </strong>";

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

$dbh->beginTransaction();
$smtp = $dbh->prepare('delete from test.tbl1');
$smtp->execute();
$dbh->commit();

The DBAL version:

error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();

use Doctrine\DBAL\DriverManager;

$connectionParams = array(
    'dbname'   => 'mydb',
    'user'     => 'gonzalo',
    'password' => 'password',
    'host'     => 'localhost',
    'driver'   => 'pdo_pgsql',
    );

$dbh = DriverManager::getConnection($connectionParams);

$dbh->beginTransaction();

$smtp = $dbh->prepare('INSERT INTO test.tbl1 (id, field1) values (:ID, :FIELD1)');

for ($i=0; $i<1000; $i++) {
    $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
}

$dbh->commit();

$stmt = $dbh->prepare('SELECT * FROM test.tbl1 limit 10000');
$stmt->execute();

$i=0;
while ($row = $stmt->fetch()) {
	$i++;
}
echo '<h1>DBAL</h1>';
echo "<strong>{$i} </strong>";

print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'seconds' => microtime(TRUE) - $time));

As we can see DBAL is slower than pure PDO (obiously). Anyway the most of the extra time of DBAL is the time we need to include php classes (remember PDO is a PHP extension and we dont need to include any file). If we take times excluding the include time, the memory usage is almost the same and the execution time a little slower.

Autoload for DBAL version:

spl_autoload_register(function ($class) {
        $class = str_replace('\\', '/', $class) . '.php';
        require_once($class);
    }
);

or hardcoded includes for this example

require_once('Doctrine/DBAL/Driver.php');
require_once('Doctrine/DBAL/Driver/Connection.php');
require_once('Doctrine/DBAL/Platforms/AbstractPlatform.php');
require_once('Doctrine/DBAL/Driver/Statement.php');

require_once('Doctrine/DBAL/DriverManager.php');
require_once('Doctrine/DBAL/Configuration.php');
require_once('Doctrine/Common/EventManager.php');
require_once('Doctrine/DBAL/Driver/PDOPgSql/Driver.php');
require_once('Doctrine/DBAL/Driver.php');
require_once('Doctrine/DBAL/Connection.php');
require_once('Doctrine/DBAL/Driver/Connection.php');
require_once('Doctrine/DBAL/Query/Expression/ExpressionBuilder.php');
require_once('Doctrine/DBAL/Platforms/PostgreSqlPlatform.php');
require_once('Doctrine/DBAL/Platforms/AbstractPlatform.php');
require_once('Doctrine/DBAL/Driver/PDOConnection.php');
require_once('Doctrine/DBAL/Driver/PDOStatement.php');
require_once('Doctrine/DBAL/Driver/Statement.php');
require_once('Doctrine/DBAL/Events.php');
require_once('Doctrine/DBAL/Statement.php');

Outcomes of the tests:

With pure PDO:

  • memory: 0.0044288635253906
  • seconds: 0.24748301506042

With DBAL and autoload:

  • memory: 0.97610473632812
  • seconds: 0.29042816162109

With DBAL and hardcoded requires:

  • memory: 0.97521591186523
  • seconds: 0.31192088127136

With DBAL bypassing the include part:

  • memory: 0.0099525451660156
  • seconds: 0.30333304405212

The fee we paid for using DBAL gives us some extra features. OK we don’t need DBAL to get those features. If we code a bit we can get them (remember DBAL is nothing but a PHP extra layer). But DBAL has a great interface a well documented. Now I’m going to list a few extra features from DBAL very interesting, at least for me:

Transactional mode

I really like it. It allows us to create scripts like that:

$dbh->transactional(function($conn) {
    $smtp = $conn->prepare('INSERT INTO wf.tbl1 (id, field1) values (:ID, :FIELD1)');

    for ($i=0; $i<1000; $i++) {
        $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
    }
});

A simple closure will make the code more concise and it will commit/rollback our transaction for us. In fact I borrowed this function in my PDO projects to use this interface. I love Open source.

Snippet from DBAL library:

    /**
     * Executes a function in a transaction.
     *
     * The function gets passed this Connection instance as an (optional) parameter.
     *
     * If an exception occurs during execution of the function or transaction commit,
     * the transaction is rolled back and the exception re-thrown.
     *
     * @param Closure $func The function to execute transactionally.
     */
    public function transactional(Closure $func)
    {
        $this->beginTransaction();
        try {
            $func($this);
            $this->commit();
        } catch (Exception $e) {
            $this->rollback();
            throw $e;
        }
    }

Types conversion

Really useful, at least for when I work with dates:

$date = new \DateTime("2011-03-05 14:00:21");
$stmt = $conn->prepare("SELECT * FROM articles WHERE publish_date > ?");
$stmt->bindValue(1, $date, "datetime");
$stmt->execute();

List of Parameters Conversion

It’s a cool feature too available in DBAL since Doctrine 2.1

$dbh->executeQuery('SELECT * FROM wf.tbl1 WHERE id IN (?)',
    array(array(1, 2, 3, 4, 5, 6)),
    array(\Doctrine\DBAL\Connection::PARAM_INT_ARRAY));

Bind parameters with IN clause with PDO is a bit ugly. We need to create a series of bind parameters depending on our list to map them within the SQL. It’s possible but DBAL interface is smarter.

Transaction Nesting

Another cool feature:

$dbh->beginTransaction();
try {
    $dbh->beginTransaction();
    try {
        $smtp = $dbh->prepare('INSERT INTO wf.tbl1 (id, field1) values (:ID, :FIELD1)');

        for ($i=0; $i<1000; $i++) {
            $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
        }

        } catch (Exception $e) {
            $dbh->rollback(); //transaction marked for rollback only
            throw $e;
        }
    $smtp = $dbh->prepare('INSERT INTO wf.tbl1 (id, field1) values (:ID, :FIELD1)');

    for ($i=0; $i<1000; $i++) {
        $smtp->execute(array('ID' => $i, 'FIELD1' => "field {$i}"));
    }

    $dbh->commit(); // real transaction committed
} catch (Exception $e) {
    $dbh->rollback(); // transaction rollback
    throw $e;
}

This piece of code with PDO will throw the following error:
There is already an active transaction
but it works with DBAL. If we need to do this kind of things with PDO we need to use savepoints and things like that. DBAL does the ugly part for us.