Requests in WordPress (and PHP in general) have a limited time to run. I recently had a problem processing a huge file: what I wanted to do didn't fit in a single request, so I did a little research.
At first I thought about cron jobs, and they turned out to be the right solution, but they need a little tweaking.
The Problem
I had a CPT (custom post type) and a CSV file containing the information for every post. The file had about 1,000 rows and 40 columns, so it was far too big for a single request.
Note that one column was a list of remote images that had to be uploaded to my website. When I tested a plain import, only 3 posts were imported before the request timed out, which was not the result I expected.
Solution 1: Background Processing
deliciousbrains / wp-background-processing
WordPress background processing class
This plugin can run your tasks in the background. The full tutorial for using its classes is on their GitHub page.
What this plugin does is simple: it saves all the data your requests need in the database, one row per request, and then runs a cron job every 5 minutes that grabs the latest row and runs the corresponding request. It sounds very simple, yet it's powerful and comes with a lot of features.
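To give you a feel for the API, here is a minimal sketch based on the library's README: you extend WP_Background_Process, implement task() for one item, and queue items with push_to_queue(). The class name, action name, and the import_single_post() helper are my illustrative placeholders, not part of the library.

```php
<?php
// A minimal sketch based on the wp-background-processing README.
// The class/action names and import_single_post() are placeholders.
class CSV_Import_Process extends WP_Background_Process {

	// Unique action name, used for the queue and cron hook names.
	protected $action = 'csv_import_process';

	// Handles a single queued item (here: one CSV row).
	// Return false to remove the item from the queue, or return
	// the item itself to re-queue it and retry later.
	protected function task( $item ) {
		import_single_post( $item ); // hypothetical helper that creates the post
		return false;
	}

	// Called once the whole queue has been processed.
	protected function complete() {
		parent::complete();
		error_log( 'CSV import finished.' );
	}
}

// Queue one background task per CSV row, then kick off processing.
$process = new CSV_Import_Process();
foreach ( $csv_rows as $row ) {
	$process->push_to_queue( $row );
}
$process->save()->dispatch();
```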
Solution 2: Batch Processing
gdarko / wp-batch-processing
Easily process large batches of data in WordPress. Provide the data, setup the processing procedure, run the batch processor from the admin dashboard. Profit.
This plugin helps you run your large jobs too. It also saves the data in the database, but it has a user interface! You can go to its admin page, run the batch, and see how many of the requests are done.
But the actual difference is that this plugin does not use a cron job. Instead, it runs an AJAX call that processes the next row in the database; when that call finishes, it triggers another AJAX call, and the success or failure of the previous call is shown in the interface. One other thing to mention: you can't leave the page, or the process will be paused.
My Code
So enough theory, let's see some code. I went with the batch processing plugin.
As the tutorial on the GitHub page says, you should create an instance of the class; my point here is to show you how I handled my file.
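Since the exact code depends on your data, here is only a minimal sketch following the conventions in the plugin's README. The class name, CSV path, CPT slug, and column names are illustrative assumptions; WP_Batch, WP_Batch_Item, and the registration hook come from the plugin, but verify them against the current docs.

```php
<?php
// A minimal sketch following the wp-batch-processing README conventions.
// The class name, CSV path, CPT slug, and column names are assumptions.
if ( class_exists( 'WP_Batch' ) ) {

	class CSV_Posts_Batch extends WP_Batch {

		public $id    = 'csv_posts_import';
		public $title = 'CSV Posts Import';

		// Runs once when the batch is set up: read the CSV and
		// push one item per row into the queue.
		public function setup() {
			$handle = fopen( WP_CONTENT_DIR . '/uploads/posts.csv', 'r' );
			$header = fgetcsv( $handle );
			$index  = 1;
			while ( ( $row = fgetcsv( $handle ) ) !== false ) {
				$data = array_combine( $header, $row );
				$this->push( new WP_Batch_Item( $index++, $data ) );
			}
			fclose( $handle );
		}

		// Runs once per item, each in its own AJAX request.
		public function process( $item ) {
			$post_id = wp_insert_post( array(
				'post_type'   => 'my_cpt',
				'post_title'  => $item->get_value( 'title' ),
				'post_status' => 'publish',
			), true );
			if ( is_wp_error( $post_id ) ) {
				return $post_id; // shown as a failed item in the UI
			}
			// ... sideload the remote images for this row here ...
			return true; // mark the item as processed
		}
	}

	// Register the batch so it shows up in the admin dashboard.
	add_action( 'wp_batch_processing_init', function () {
		WP_Batch_Processor::get_instance()->register( new CSV_Posts_Batch() );
	}, 10, 1 );
}
```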
Further Problem: One of the major problems I faced was that my file was so large that even the batch processor couldn't handle the first step of the work: saving the items in the database. If you've faced the same problem, how did you overcome it?
Top comments (1)
Thanks for this post, it's helpful. I'm in a position where I'm building a plugin and it requires the ability to process potentially thousands of posts.
I've actually used the same batch processing plugin, but I also encountered the same problem as you did, where the initial setup method caused my environment to time out. I was pulling in 50K IDs and ordering them by date, but it was just too much for my test site to handle.
One way around this I can see is reducing the initial setup to query only a chunk of the items you want to process, by checking for the absence of some post meta. The post meta would only be added to processed items, so when the query ran a second time, it would ignore the already processed items (roughly sketched below).
It has its downsides, such as the need to keep re-running the batch, but it might serve as a workaround.
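Roughly what I mean, as a sketch; the CPT slug, meta key, and chunk size here are placeholders I made up, and WP_Batch/WP_Batch_Item come from the wp-batch-processing plugin:

```php
<?php
// Sketch of the chunked-setup idea (CPT slug, meta key, and chunk
// size are made-up placeholders).
class Chunked_Posts_Batch extends WP_Batch {

	public $id    = 'chunked_posts_batch';
	public $title = 'Chunked Posts Batch';

	// Only queue posts that don't carry the "done" marker yet,
	// and cap the chunk so setup() stays inside the time limit.
	public function setup() {
		$query = new WP_Query( array(
			'post_type'      => 'my_cpt',
			'posts_per_page' => 500,
			'fields'         => 'ids',
			'meta_query'     => array(
				array(
					'key'     => '_my_batch_done',
					'compare' => 'NOT EXISTS',
				),
			),
		) );
		foreach ( $query->posts as $post_id ) {
			$this->push( new WP_Batch_Item( $post_id, array( 'post_id' => $post_id ) ) );
		}
	}

	// Mark each processed post so the next run skips it.
	public function process( $item ) {
		$post_id = $item->get_value( 'post_id' );
		// ... do the actual work for this post here ...
		update_post_meta( $post_id, '_my_batch_done', 1 );
		return true;
	}
}
```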
I also found this option actionscheduler.org/ but I've not really figured out how to implement it yet.
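From what I can tell so far, its basic pattern is to register a callback on a hook and enqueue one action per unit of work. The hook name and callback below are placeholders; as_enqueue_async_action() is part of its public API:

```php
<?php
// A rough sketch of the Action Scheduler pattern (hook/callback
// names are made up; as_enqueue_async_action() is its public API).

// The worker: processes one post per scheduled action.
add_action( 'my_plugin_process_post', function ( $post_id ) {
	// ... do the heavy work for this single post ...
} );

// Enqueue one async action per post; Action Scheduler runs them
// in the background, with retries and logging built in.
foreach ( $post_ids as $post_id ) {
	as_enqueue_async_action( 'my_plugin_process_post', array( $post_id ) );
}
```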