I recently found myself in a task where I had to process 500.000+ rows from a database, each with a while set of columns. The first thing I tried to do was fetch it all in a single query and have PDO sent me a result. My server didn’t like this very much and gave me a subtle Killed -message, halting further execution of my script. Nothing more, nothing less…
Some little research told me that this most likely cause of this error message was the Linux-Out-Of-Memory (OOM) error. Needless to say, the large query was the source of this error. I changed my script a bit to work around this problem, so if you encounter a similar issue, this article might be of some help for you.
Identify the bottleneck
In my case the bottleneck was a SQL query that tried to fetch 500.000+ rows from a database and return it in a set:
// Fetch data from a huge table:
$result = $pdoInstance->query('SELECT * FROM some_huge_table;');
// This function will throw the 'Killed' -error:
$rows = $result->fetchAll(PDO::FETCH_ASSOC);
What the fetch() -method does is return all the rows in the $rows -array. This would cause PHP to use a lot of memory: each row is stored in the array and when the dataset is big enough this will clog up your memory.
Now we’ve identified the problem: the $rows -array simply gets too big for PHP. The solution is we have to make it smaller, so it consumes less memory. How can we do that? Well that’s easy: by limiting our SQL-query. But we still want to parse all of our records! So how do we do this?
Well, the answer is easy: instead of creating one big query, we separate our logic in multiple queries that are limited.
Step 1: Count it all
The first thing we need to do is get a count of all the rows we want to parse. This can be simply done with a count-query:
// Get a count of the huge table:
$countResult = $pdoInstance->query('SELECT COUNT(*) AS `count` FROM some_huge_table;');
$row = $countResult->fetch(PDO::FETCH_ASSOC);
$totalCount = $row['count'];
Step 2: Fetch the data in chunks
Now we know the total count of rows in our table, we can fetch our data in smaller chunks:
$chunkSize = 10000;
$from = 0;
// Execute a loop until we've reached our total count
while($from < $totalCount)
// Execute a limited query:
$query = $pdoInstance->query('SELECT * FROM some_huge_table LIMIT ' . $from . ', ' . $chunkSize . ';');
$rows = $query->fetchAll(PDO::FETCH_ASSOC);
foreach($rows as $row)
// Do your awesome magical stuff here
// Increase $from:
$from += $chunkSize;
You might have to fiddle with the $chunkSize -parameter to check what works best for you, but this simple approach will give your PHP script a drastic drop in memory consumption.
Visitors give this article an average rating of 2.8 out of 5.
How would you rate this article?
★ ★ ★ ★ ★