I'm working on a Ruby CLI gem that will automate adding WordPress blog data to AirTable. It uses the WordPress API to collect each blog post's title, publication date, ID, and URL, formats the data, and then uses the AirTable API to create new rows in a specified table.
Last time I discussed my new workflow. Instead of adding all the data received from WordPress to AirTable every time the sync method is called, the gem will compare AirTable's ID column to the ids of the WordPress posts, and only add new posts.
collect_row_data()
I shared my collect_row_data() method in my last post. This method calls call_at(), and collects the HTTP response data in a usable format for the rest of the gem.
I've revised it so that it no longer requests data it doesn't need, which speeds up the process. This is accomplished by passing in the fields I want (ID and Last Modified) as query parameters. The responses are also collected into a hash instead of an array. Each key is a post's ID, and each value is an array holding the AirTable record id and the row's Last Modified date. I made this change so that later, when comparing our two datasets, I can easily get the ID column data by calling .keys on row_data.
def collect_row_data
  row_data = {}
  offset = ""
  loop do
    # Request only the ID and Last Modified fields
    at_response = call_at("fields%5B%5D=ID&fields%5B%5D=Last+Modified", offset)
    # Key each row by its ID column value; store the AirTable record id and Last Modified date
    at_response.parsed_response["records"].each do |post|
      row_data[post["fields"][@current_settings.headers[:id]]] = [post["id"], post["fields"]["Last Modified"]]
    end
    # Keep requesting pages until the response no longer includes an offset
    offset = at_response.parsed_response["offset"]
    break unless offset
  end
  row_data
end
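To make the shape of row_data concrete, here is a minimal sketch of the same loop running against canned data instead of a live HTTP call. The fake_pages and collect_fake_rows names are hypothetical, the record values are made up, and the ID column is assumed to be named "ID".

```ruby
# Hypothetical stand-in for two pages of AirTable responses. The record
# shape mirrors what collect_row_data reads; the values are invented.
def fake_pages
  {
    "" => {
      "records" => [
        { "id" => "recA", "fields" => { "ID" => 101, "Last Modified" => "2020-01-01" } },
        { "id" => "recB", "fields" => { "ID" => 102, "Last Modified" => "2020-01-02" } }
      ],
      "offset" => "page2"
    },
    "page2" => {
      "records" => [
        { "id" => "recC", "fields" => { "ID" => 103, "Last Modified" => "2020-01-03" } }
      ]
      # No "offset" key here, so the loop stops after this page
    }
  }
end

# Same loop shape as collect_row_data, reading from a hash instead of HTTP.
def collect_fake_rows(pages)
  row_data = {}
  offset = ""
  loop do
    page = pages.fetch(offset)
    page["records"].each do |record|
      row_data[record["fields"]["ID"]] = [record["id"], record["fields"]["Last Modified"]]
    end
    offset = page["offset"]
    break unless offset
  end
  row_data
end

rows = collect_fake_rows(fake_pages)
p rows.keys  # prints [101, 102, 103]
p rows[102]  # prints ["recB", "2020-01-02"]
```

Calling .keys on the result gives the ID column as an array, which is the form the comparison step needs.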
sync()
When a user wants to update the WordPress data in AirTable, they call the sync method.
In its latest form, sync pings WordPress to get the total number of result pages and to ensure the user has entered the blog's URL correctly.
It then gathers the WordPress data (collect_post_data) and AirTable data (collect_row_data) into two hashes. The datasets are compared by passing arrays of their ids into compare_datasets(), which returns a hash. The all_data hash will always have a key of :current, and sometimes a key of :new. If the hash has a key of :new, we filter the WordPress data, keeping only the posts whose ids appear in all_data[:new]. That data is added to AirTable.
def sync
  if ping_wp
    post_data = collect_post_data
    rows = collect_row_data
    # Compare the AirTable ids with the WordPress ids
    all_data = compare_datasets(rows.keys, post_data[:ids])
    if all_data[:new]
      # Keep only the posts that are not yet in AirTable
      new_posts = post_data[:posts].select { |post| all_data[:new].include?(post["id"]) }
      data = prep_data(new_posts)
      add_to_at(data, @@at_api)
    else
      puts "All data up-to-date"
    end
  else
    puts "There was an issue. Try correcting your blog's URL"
  end
end
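The filtering step inside sync can be tried on its own. This is a minimal sketch with a hypothetical helper name (select_new_posts) and made-up posts. It uses select rather than keep_if so the original array is left untouched, and a Set so each membership check does not scan the whole id list.

```ruby
require "set"

# Hypothetical helper: returns the posts whose "id" appears in new_ids,
# without modifying the posts array that was passed in.
def select_new_posts(posts, new_ids)
  wanted = new_ids.to_set
  posts.select { |post| wanted.include?(post["id"]) }
end

# Made-up sample data in the same shape sync filters on
posts = [
  { "id" => 1, "title" => "First" },
  { "id" => 2, "title" => "Second" },
  { "id" => 3, "title" => "Third" }
]

new_posts = select_new_posts(posts, [2, 3])
p new_posts.map { |post| post["id"] }  # prints [2, 3]
p posts.length                         # prints 3
```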
compare_datasets()
This method takes our two arrays of ids as arguments. We get all the new posts by subtracting the AirTable id array from the WordPress id array. We are left with any values that appear only in the WordPress array. These are the ids of our new posts to be added to AirTable!
def compare_datasets(at_arr, wp_arr)
  # Ids present in WordPress but not in AirTable
  new_posts = wp_arr - at_arr
  post_data = {:current => at_arr}
  # Only add the :new key when there is something to add
  post_data[:new] = new_posts unless new_posts.empty?
  post_data
end
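Here is a quick usage sketch with made-up ids. The method body is repeated so the snippet runs on its own.

```ruby
# Same method as above, repeated so this snippet is self-contained
def compare_datasets(at_arr, wp_arr)
  new_posts = wp_arr - at_arr
  post_data = {:current => at_arr}
  if new_posts.count > 0
    post_data[:new] = new_posts
  end
  post_data
end

# AirTable already has posts 1-3; WordPress has 1-5
result = compare_datasets([1, 2, 3], [1, 2, 3, 4, 5])
p result[:current]  # prints [1, 2, 3]
p result[:new]      # prints [4, 5]

# Nothing new: the :new key is absent, so sync takes its "up-to-date" branch
p compare_datasets([1, 2], [1, 2])[:new]  # prints nil

# Array subtraction treats 1 and "1" as different values
p compare_datasets(["1", "2"], [1, 2])[:new]  # prints [1, 2]
```

The last call shows one thing to watch for: if the AirTable column returns ids as strings while WordPress returns integers, every post would look new, so both arrays need to hold the same type before they are compared.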
Next Steps
I now have all of the post data in data structures that make it easy to compare and change. The next step will be altering the WordPress API call so that we get the last modified date for each post. We'll compare these to the last modified date of each AirTable row. If a post has been modified in WordPress more recently than its AirTable row, we'll update AirTable with the WordPress info.
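As a rough sketch of what that comparison might look like, and not the final implementation: the stale_ids helper and the sample timestamps below are hypothetical, and both sides are assumed to provide timestamp strings that include a time zone, so that Time.parse reads them consistently.

```ruby
require "time"

# Hypothetical helper. wp_modified maps post id => WordPress modified time.
# rows has the shape collect_row_data returns: id => [record id, last modified].
# Returns the ids whose WordPress timestamp is later than the AirTable one.
def stale_ids(wp_modified, rows)
  stale = wp_modified.select do |id, wp_time|
    row = rows[id]
    # Skip ids that have no AirTable row yet; those are new, not stale
    row && Time.parse(wp_time) > Time.parse(row[1])
  end
  stale.keys
end

# Made-up sample data
wp_modified = {
  101 => "2020-01-05T00:00:00Z",  # edited after the AirTable row
  102 => "2020-01-01T00:00:00Z",  # older than the AirTable row
  104 => "2020-01-06T00:00:00Z"   # not in AirTable at all
}
rows = {
  101 => ["recA", "2020-01-02T00:00:00Z"],
  102 => ["recB", "2020-01-02T00:00:00Z"]
}

p stale_ids(wp_modified, rows)  # prints [101]
```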