I recently found myself with a task where I had to process 500,000+ rows from a database, each with a whole set of columns. The first thing I tried was to fetch it all in a single query and have PDO send me the result. My server didn’t like this very much and gave me a subtle ‘Killed’ message, halting further execution of my script. Nothing more, nothing less…
A little research told me that the most likely cause of this message was the Linux Out-Of-Memory (OOM) killer. Needless to say, the huge query was the source of the problem. I changed my script a bit to work around it, so if you encounter a similar issue, this article might be of some help to you.
Identify the bottleneck
In my case the bottleneck was a SQL query that tried to fetch 500,000+ rows from a database and return them in a single result set:
// Fetch data from a huge table:
$result = $pdoInstance->query('SELECT * FROM some_huge_table;');

// This function will throw the 'Killed' error:
$rows = $result->fetchAll(PDO::FETCH_ASSOC);
What the fetchAll() method does is return all the rows in the $rows array. This causes PHP to use a lot of memory: every row is stored in the array, and when the dataset is big enough, this will clog up your memory.
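If you want to see this happening for yourself, PHP’s built-in memory_get_usage() function makes it easy to measure (a minimal sketch, assuming the same $pdoInstance and some_huge_table as above):

// Report PHP's allocated memory before and after loading the full result set:
echo 'Before: ' . memory_get_usage(true) . " bytes\n";

$result = $pdoInstance->query('SELECT * FROM some_huge_table;');
$rows = $result->fetchAll(PDO::FETCH_ASSOC);

echo 'After: ' . memory_get_usage(true) . " bytes\n";

On a table this size the difference is dramatic, and it may well exceed your memory_limit, or what the OS is willing to give the process.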
Optimizing it
Now we’ve identified the problem: the $rows array simply gets too big for PHP. The solution is to make it smaller, so it consumes less memory. How can we do that? Well, that’s easy: by limiting our SQL query. But we still want to parse all of our records! So how do we do this?
The answer is simple: instead of creating one big query, we split our logic into multiple limited queries.
Step 1: Count it all
The first thing we need to do is get a count of all the rows we want to parse. This can simply be done with a COUNT query:
// Get a count of the huge table:
$countResult = $pdoInstance->query('SELECT COUNT(*) AS `count` FROM some_huge_table;');
$row = $countResult->fetch(PDO::FETCH_ASSOC);
$totalCount = $row['count'];
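One small caveat, depending on your driver and configuration (so take this as an assumption, not a given): PDO may hand you the count as a string rather than an integer. An explicit cast keeps the loop condition in the next step unambiguous:

// Some PDO drivers return COUNT(*) as a string; cast it to be safe:
$totalCount = (int) $row['count'];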
Step 2: Fetch the data in chunks
Now that we know the total number of rows in our table, we can fetch our data in smaller chunks:
$chunkSize = 10000;
$from = 0;

// Execute a loop until we've reached our total count
while ($from < $totalCount) {
    // Execute a limited query:
    $query = $pdoInstance->query('SELECT * FROM some_huge_table LIMIT ' . $from . ', ' . $chunkSize . ';');
    $rows = $query->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        // Do your awesome magical stuff here
    }

    // Increase $from:
    $from += $chunkSize;
}
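If you need this pattern in more than one place, the same idea can be wrapped in a PHP generator, so the calling code never sees the chunking. A sketch (the fetchInChunks name is mine, not part of the original script):

// Yield rows one by one, fetching them from the database in chunks:
function fetchInChunks(PDO $pdo, int $totalCount, int $chunkSize = 10000): Generator
{
    for ($from = 0; $from < $totalCount; $from += $chunkSize) {
        $query = $pdo->query('SELECT * FROM some_huge_table LIMIT ' . $from . ', ' . $chunkSize . ';');
        foreach ($query->fetchAll(PDO::FETCH_ASSOC) as $row) {
            yield $row;
        }
    }
}

foreach (fetchInChunks($pdoInstance, $totalCount) as $row) {
    // Do your awesome magical stuff here
}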
You might have to fiddle with the $chunkSize parameter to see what works best for you, but this simple approach will give your PHP script a drastic drop in memory consumption.
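One thing to watch out for, as a side note the approach above doesn’t cover: LIMIT with a growing offset tends to slow down on very large tables, because the database still walks past every skipped row. If your table has an indexed, auto-incrementing primary key (assumed here to be called id), keyset pagination avoids that cost. A minimal sketch:

// Prepare a keyset-paginated query once, then page by remembering the last seen id:
$stmt = $pdoInstance->prepare(
    'SELECT * FROM some_huge_table WHERE id > :lastId ORDER BY id LIMIT :chunkSize;'
);

$lastId = 0;
do {
    $stmt->bindValue(':lastId', (int) $lastId, PDO::PARAM_INT);
    $stmt->bindValue(':chunkSize', (int) $chunkSize, PDO::PARAM_INT);
    $stmt->execute();
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        // Do your awesome magical stuff here
        $lastId = $row['id'];
    }
} while (count($rows) === $chunkSize);

As a bonus, this variant doesn’t need the COUNT query from step 1 at all: it simply stops when a chunk comes back smaller than $chunkSize.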