How To Export a Large Dataset With Fat-Free Framework Without Running Out of Memory, Using the Underlying PDO Methods

The task is to export 50,000+ records from a MySQL database into a sitemap.xml.
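The examples below use an $smb sitemap builder. Its implementation isn't the point of this article, but for context, a minimal hypothetical sketch matching the add(url, changefreq, priority, timestamp) calls used later (where 'D' is assumed to mean a daily change frequency) could look like this:

```php
<?php
// Hypothetical minimal sitemap builder matching the $smb->add() calls below.
class SitemapBuilder
{
    private $entries = [];
    private $freqs = ['D' => 'daily', 'W' => 'weekly', 'M' => 'monthly'];

    public function add($url, $freq, $priority, $timestamp)
    {
        $this->entries[] = sprintf(
            '<url><loc>%s</loc><changefreq>%s</changefreq>'.
            '<priority>%.1f</priority><lastmod>%s</lastmod></url>',
            htmlspecialchars($url, ENT_XML1),
            isset($this->freqs[$freq]) ? $this->freqs[$freq] : 'daily',
            $priority,
            date('c', $timestamp)
        );
    }

    public function xml()
    {
        return '<?xml version="1.0" encoding="UTF-8"?>'
            .'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
            .implode('', $this->entries)
            .'</urlset>';
    }
}

$smb = new SitemapBuilder();
```

Note that the builder only appends strings, so its memory footprint grows with the number of entries; for very large sitemaps, writing each entry straight to a file would be the next step.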

There are several ways to work with a database in Fat-Free Framework.

ORM

The most convenient way is to use the built-in ORM.

$f3->set('DB', new DB\SQL('mysql:host=localhost;port=3306;dbname=myDatabase','user','password'));
$f3->get('DB')->log(FALSE); // disable log if you deal with large datasets.

$pages = new \DB\SQL\Mapper($f3->get('DB'), 'pages');

// this loads ALL records if there is no filter specified like `$user->load(['userID=?','tarzan'])`
$pages->load();

while(!$pages->dry()){
  $smb->add($pages->url,'D',0.5,strtotime($pages->stamp));
  $pages->next();
}

This will most likely fail with a Fatal error: Allowed memory size of 8388608 bytes exhausted (tried to allocate 4096 bytes), because all 50,000+ database entries are loaded into the mapper and, depending on the data, PHP runs out of memory.

One way to solve this is to paginate the data and retrieve LIMITed sets of records. But that gets ugly quickly, unless you rely on the F3 mapper's paging magic.
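For completeness, a sketch of that paging approach using the mapper's paginate() method, which fetches one LIMITed chunk per call (chunk size of 1000 chosen arbitrarily here):

```php
<?php
// Paging sketch: fetch records in chunks of 1000 via the mapper's
// paginate() method instead of loading everything at once.
$pages = new \DB\SQL\Mapper($f3->get('DB'), 'pages');

$pos = 0;
do {
    // returns an array with 'subset', 'total', 'limit', 'count', 'pos'
    $result = $pages->paginate($pos, 1000);
    foreach ($result['subset'] as $page) {
        $smb->add($page->url, 'D', 0.5, strtotime($page->stamp));
    }
    $pos++;
} while ($pos < $result['count']); // 'count' is the number of pages
```

Each chunk is still buffered in memory, but only 1000 records at a time, which keeps the peak memory usage low at the cost of issuing multiple queries.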

Plain SQL the Fat-Free Way

In F3 it's also possible to use plain SQL with exec.


$pages = $f3->get('DB')->exec('SELECT url,stamp FROM pages');

foreach($pages as $page){
  $smb->add($page['url'],'D',0.5,strtotime($page['stamp']));
}

Same problem here: PHP runs out of memory, because all 50,000+ records are loaded into memory at once.

Plain Unbuffered Queries PDO Style

F3's SQL class is a superset of PHP's PDO class, so there is access to all the native PDO functions and settings.

By default, queries are buffered, which means PHP loads the whole result set into memory to work with it. The PHP manual explains:

Queries are using the buffered mode by default. This means that query results are immediately transferred from the MySQL Server to PHP and then are kept in the memory of the PHP process. This allows additional operations like counting the number of rows, and moving (seeking) the current result pointer. It also allows issuing further queries on the same connection while working on the result set. The downside of the buffered mode is that larger result sets might require quite a lot memory. The memory will be kept occupied till all references to the result set are unset or the result set was explicitly freed, which will automatically happen during request end the latest.

https://www.php.net/manual/en/mysqlinfo.concepts.buffering.php

To get around the memory limitation, unbuffered queries are an option.

Unbuffered MySQL queries execute the query and then return a resource while the data is still waiting on the MySQL server to be fetched. This uses less memory on the PHP side, but can increase the load on the server. Unless the full result set has been fetched from the server, no further queries can be sent over the same connection.

The solution is to use native PDO methods and settings in Fat-Free Framework via $f3->get('DB')->pdo(), to retrieve all records with minimal memory consumption.

TL;DR

// Set the connection to unbuffered queries
// https://www.php.net/manual/en/mysqlinfo.concepts.buffering.php
$f3->get('DB')->pdo()->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$uresult = $f3->get('DB')->pdo()->query('SELECT url,stamp FROM pages');

if ($uresult) {
  while ($row = $uresult->fetch(\PDO::FETCH_ASSOC)) {
    $smb->add($row['url'],'D',0.5,strtotime($row['stamp']));
  }
}

I tested this by setting the memory limit artificially low with ini_set("memory_limit","8M"); and it just worked.

Potential Pitfalls

Using unbuffered queries comes with downsides, and I found myself on Stack Overflow as soon as I tried to reuse the same database connection.

This means only one unbuffered query can be active per connection: it's not possible to loop through the result and write something back over the same connection, because the connection stays blocked until the full result set has been fetched. One solution is to set up two database connections, one for unbuffered and one for buffered queries. There are pros and cons either way, and it always depends on the use case.
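A sketch of the two-connection setup (the `exported` column and the DSN details are assumptions for illustration):

```php
<?php
// Sketch: two separate connections, one unbuffered for streaming reads,
// one buffered for writes issued while looping over the result.
$dsn = 'mysql:host=localhost;port=3306;dbname=myDatabase';

$read  = new DB\SQL($dsn, 'user', 'password');
$write = new DB\SQL($dsn, 'user', 'password');

// only the read connection switches to unbuffered mode
$read->pdo()->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$uresult = $read->pdo()->query('SELECT id,url,stamp FROM pages');
while ($row = $uresult->fetch(\PDO::FETCH_ASSOC)) {
    // writing back via the second (buffered) connection works,
    // while the first connection keeps streaming the result set
    $write->exec('UPDATE pages SET exported=1 WHERE id=?', $row['id']);
}
```

The price is a second connection to the server, which is usually acceptable for a one-off export job.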