How to iterate over a big dataset in Laravel without exhausting memory
Imagine that you have to iterate over a big table in Laravel. By “big table” I mean a table with not just thousands of rows, but millions.
Intuitively, we might use the all()
or get()
method of the model to retrieve all the rows and then iterate over them.
But memory is limited, and using those methods on a huge table can cause a memory-exhausted error.
Here, I will give you three ways to do this.
1. Chunk and ChunkById
The chunk method fetches a given number of rows at a time, so even if your table has millions of rows, you only load a small slice of the data on each iteration. It’s a good way to avoid memory exhaustion.
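A minimal sketch of how chunk is typically used, assuming a hypothetical User Eloquent model:

```php
<?php

use App\Models\User;

// Process the table 1,000 rows at a time; only one chunk of
// 1,000 models is held in memory at any moment.
User::orderBy('id')->chunk(1000, function ($users) {
    foreach ($users as $user) {
        // process $user...
    }
});
```

Behind the scenes, chunk runs a series of offset/limit queries and passes each page of results to the callback.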
But suppose your query filters by a column and you also update that same column inside the loop.
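A sketch of that pattern, assuming a hypothetical User model with an active column:

```php
<?php

use App\Models\User;

// Filter on `active` and update the same column inside the loop.
// With chunk(), this can silently skip rows: the offset-based
// pagination shifts as updated rows stop matching the WHERE clause.
User::where('active', true)->chunk(1000, function ($users) {
    foreach ($users as $user) {
        $user->update(['active' => false]);
    }
});
```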
In this scenario, chunk can produce unexpected and inconsistent results: it paginates with offset and limit, so as updated rows stop matching the WHERE clause, later pages shift and some rows are skipped.
Instead, Laravel provides a chunkById
method, which is similar to chunk
but paginates by the primary key, so the result set stays stable even while you update the filtered column.
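The same update rewritten with chunkById (again assuming a hypothetical User model):

```php
<?php

use App\Models\User;

// chunkById() paginates by primary key (roughly: WHERE id > ? LIMIT 1000),
// so updating the column used in the filter does not shift later pages.
User::where('active', true)->chunkById(1000, function ($users) {
    foreach ($users as $user) {
        $user->update(['active' => false]);
    }
});
```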
2. lazy and lazyById
The lazy and lazyById methods are very similar to chunk and chunkById. But instead of passing each chunk to a callback, lazy and lazyById return a LazyCollection backed by a generator, so you can iterate over the dataset directly.
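A sketch of both, assuming the same hypothetical User model:

```php
<?php

use App\Models\User;

// lazy() still fetches rows in chunks behind the scenes,
// but exposes them as a flat, memory-friendly iterable.
foreach (User::lazy(1000) as $user) {
    // process $user...
}

// lazyById() paginates by primary key, so it is safe to
// update the column you are filtering on while iterating.
foreach (User::where('active', true)->lazyById(1000) as $user) {
    $user->update(['active' => false]);
}
```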
The problem with chunk and lazy
Because the chunk and lazy methods load multiple rows at a time (1,000 in our previous examples), they cannot guarantee that every item you iterate over still matches the latest state of the database: some rows may have been updated by another process or request after the chunk was fetched, especially in a high-concurrency environment.
So if concurrency is a real concern in your system, here is another option.
3. Cursor
The cursor
method is similar to lazy; both use a generator
. The difference is that cursor
fetches only one row at a time as you iterate, so each item is read as late as possible and is as up to date as it can be.
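A sketch of cursor, assuming the same hypothetical User model:

```php
<?php

use App\Models\User;

// cursor() executes a single query and hydrates only one
// Eloquent model at a time while you iterate.
foreach (User::where('active', true)->cursor() as $user) {
    // process $user...
}
```

Note that cursor holds only one model in memory at a time, which also makes it the most memory-frugal of the three approaches.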
Conclusion
Here I introduced three ways to iterate over a big dataset in Laravel. (All of them also appear in the official Laravel documentation, but here in my own words.)
If you are not worried about concurrent modification of the data, you can use chunk
or lazy
to fetch multiple rows at a time.
Otherwise, you should use cursor
to fetch only one row at a time.
Author JeelyWu
LastMod 2023-04-13