TL;DR
I'm building a widget to provide fun quizzes, polls and much more within Blog posts on the major platforms. In previous parts we've covered building out a router for the client side and a data model for the content and reporting.
In this part we will look at the API that the widget supports and how it's put together with Firebase Functions. To keep this from getting overlong, we will first look at view tracking and recommendation; the next part will cover responses.
Motivation
I'm building an interactive widget as a way of making posts more interesting for all of us.
Requirements
I wanted to build a straightforward API for the widget that would do a number of useful things for content creators, like recommending articles that fit with the one they are writing (so theirs will also receive recommendations), providing a mechanism to robustly respond to quizzes and polls and a way of creating some basic gamification with points and achievements.
Thanks to comments on previous posts I will probably do another version of this in the future using Cloud Run so we can all see the pros and cons.
Here's what the API is aiming to support:
- Register a view of an article
- Get a list of recommended articles that match the current one and promote recent content that is popular
- Flag that a recommended article was clicked
- Register a response for a quiz, poll or something a plugin developer decides they want
- Add points and achievements
The way we implement these functions will have a significant impact on the cost of running the system, because the database charges for every "read", including the reads of items in lists, so it can get expensive quickly.
The API
First we need to create a file to contain our functions. Because this file uses Firestore collections, we also initialize the Admin SDK and make a global reference to the db that we can use in our functions:
const functions = require("firebase-functions")
const admin = require("firebase-admin")
admin.initializeApp()
const db = admin.firestore()
view
Let's start off with the principle of view. We want to be able to record that an article has been seen, and to know both the number of unique user views and the total number of views. For the sake of making recommendations later we also record two other factors, the first and the last day on which the article had a new unique viewer, so we can use these to sort.
Let's look at that a moment: my current choice of algorithm is to use recency of publishing, recency of a new unique visitor, popularity overall and then a match of the tags in the recommendation versus the tags in the current article.
We'll see the algorithm in detail next, but in view we need to create data that helps with this. I decided that the first and last dates should be rounded into UTC days to provide a level of stability and fairness, so that calculation is a key part of working out view.
Ok, so here is the view function:
exports.view = functions.https.onCall(async ({ articleId }, context) => {
We declare an API function in Firebase Functions like this: we export it with a name and say that it is an https.onCall. We then get the parameters passed to the call in an object, plus a context that contains information about the caller and other things we might have set.
I use App Check to ensure that the calls are only coming from valid locations (the website) to avoid someone hacking and sending random data. This also runs reCAPTCHA v3 (the one you can't see) and scores each call. If the call passes, the context has an app property. I check that and refuse calls it has rejected.
if (context.app === undefined) {
  console.error("Not validated")
  throw new functions.https.HttpsError(
    "failed-precondition",
    "The function must be called from an App Check verified app."
  )
}
I also ensure that we have a user:
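// context.auth is undefined for callers who are not signed in,
// so check it before reading uid
if (!context.auth || !context.auth.uid) {
  console.error("No user")
  return null
}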
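Both checks recur in the functions in this file, so they could be pulled into one helper. The following is a minimal sketch under that assumption: the name getVerifiedUid is hypothetical and not part of the original code, and the error class is passed in as a parameter (in the real file it would be functions.https.HttpsError) so the snippet runs on its own.

```javascript
// Hypothetical helper, not from the original code: combines the App Check
// test and the signed-in-user test used in `view`.
function getVerifiedUid(context, HttpsError) {
  // App Check failed or was never attempted: reject the call outright.
  if (!context || context.app === undefined) {
    throw new HttpsError(
      "failed-precondition",
      "The function must be called from an App Check verified app."
    )
  }
  // context.auth is undefined for unauthenticated callers, so guard it
  // before reading uid. Returns null when there is no user.
  return context.auth && context.auth.uid ? context.auth.uid : null
}

// Stand-in error class so the example is self-contained.
class FakeHttpsError extends Error {
  constructor(code, message) {
    super(message)
    this.code = code
  }
}

console.log(getVerifiedUid({ app: {}, auth: { uid: "u1" } }, FakeHttpsError)) // u1
console.log(getVerifiedUid({ app: {} }, FakeHttpsError)) // null
```

A caller would return early when the helper gives back null, exactly as view does.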
Last time I mentioned that Firestore has some serious limits on record updates (1 per second) and that this means you need to "shard" counters in case you have a bunch happening at once. I create 20 shards and update counts in these, choosing the shard at random:
const shard = `__all__${Math.floor(Math.random() * 20)}`
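Reading a total back means adding the shards together. Here is a minimal sketch of both sides, assuming the shard records have already been loaded into plain objects; the names pickShard and sumShards are illustrative and do not appear in the original code.

```javascript
const SHARD_COUNT = 20 // matches the 20 shards used in `view`

// Pick the document id of a random shard, as `view` does.
// `random` is injectable only to make the function easy to test.
function pickShard(random = Math.random) {
  return `__all__${Math.floor(random() * SHARD_COUNT)}`
}

// Sum one named counter across shard records. `shards` is assumed to be an
// array of plain objects (for example doc.data() for each shard document).
// A shard that has never been written, or lacks the field, counts as 0.
function sumShards(shards, field) {
  return shards.reduce((total, shard) => total + (shard[field] || 0), 0)
}

const example = [
  { tag: "__all__3", visits: 10, uniqueVisits: 4 },
  { tag: "__all__17", visits: 5 },
]
console.log(sumShards(example, "visits")) // 15
console.log(sumShards(example, "uniqueVisits")) // 4
```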
The next job is to get the "article" (see the previous part for more information on the data model) and the "counts" record for the article.
const article =
  (await db.collection("articles").doc(articleId).get()).data() || {}
const countRef = db.collection("counts").doc(articleId)
const doc = await countRef.get()
const data = doc.exists ? doc.data() : {}
Now we have the existing counts or an empty object. We want to track unique users, so the "counts" record has a map of user.uid to the date that they were new; we initialise that.
const users = (data.users = data.users || {})
We also work out a value for the current UTC day that we will use for tracking first and last unique user day.
const day = Math.floor(Date.now() / (1000 * 60 * 60 * 24))
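To see why this rounding gives stable values, here is a small sketch of the same calculation wrapped in a function, plus the reverse conversion. The names utcDay and dayToDate are assumptions for illustration only.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24

// Whole days since the Unix epoch (1970-01-01T00:00:00Z).
function utcDay(ms = Date.now()) {
  return Math.floor(ms / MS_PER_DAY)
}

// Convert a day number back to the Date at which that UTC day starts.
function dayToDate(day) {
  return new Date(day * MS_PER_DAY)
}

const morning = Date.UTC(2021, 5, 1, 0, 5) // 1 June 2021, 00:05 UTC
const evening = Date.UTC(2021, 5, 1, 23, 55) // same UTC day, 23:55 UTC
console.log(utcDay(morning) === utcDay(evening)) // true
console.log(dayToDate(utcDay(evening)).toISOString()) // 2021-06-01T00:00:00.000Z
```

Every visit within the same UTC day produces the same number, regardless of the visitor's time zone, which is what makes it usable as a sort key.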
With this in hand, we check if we've ever seen this user before and, if we haven't, we start to award points. First, if the visitor is not the author, we give the author some points and a "New Unique Reader" achievement:
if (!users[context.auth.uid]) {
  if (article.author !== context.auth.uid) {
    await awardPoints(article.author, 20, "New Unique Reader")
  }
Next we give the reader a bonus set of 50 points if this is a new article for them, and an extra 100 points if this is their first article.
  await awardPoints(
    context.auth.uid,
    50,
    "Read New Article",
    ({ achievements }) => {
      if (!achievements["Read New Article"]) {
        return [100, "Read First Article"]
      }
    }
  )
Having awarded points, we update the unique user map so we don't do it again for this article, and then update the unique counts for both the article and the article's tags. Note how we use the "shard" we created earlier here: it updates one of 20 possible counters that we will add together when we want to report on the total number of unique visits to the widget:
  users[context.auth.uid] = Date.now()
  data.uniqueVisits = (data.uniqueVisits || 0) + 1
  data.lastUniqueVisit = Date.now()
  data.lastUniqueDay = day
  data.firstUniqueDay = data.firstUniqueDay || day
  for (let tag of article.processedTags || []) {
    await incrementTag(tag, "uniqueVisits")
  }
  await incrementTag(shard, "uniqueVisits")
}
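The unique-visit bookkeeping can be isolated from Firestore to make its behaviour easier to check. This is a sketch only: recordUniqueVisit is a hypothetical name, it leaves out the points and tag counters, and it works on a plain object standing in for the "counts" record.

```javascript
// Pure version of the unique-visit update. Mutates `data` in place and
// returns true when the visit was new for this user, false otherwise.
function recordUniqueVisit(data, uid, now = Date.now()) {
  const users = (data.users = data.users || {})
  if (users[uid]) return false // already counted for this article
  const day = Math.floor(now / (1000 * 60 * 60 * 24))
  users[uid] = now
  data.uniqueVisits = (data.uniqueVisits || 0) + 1
  data.lastUniqueVisit = now
  data.lastUniqueDay = day
  data.firstUniqueDay = data.firstUniqueDay || day // set once, never moved
  return true
}

const counts = {}
const dayMs = 1000 * 60 * 60 * 24
recordUniqueVisit(counts, "alice", 10 * dayMs)
recordUniqueVisit(counts, "alice", 11 * dayMs) // repeat visit, ignored
recordUniqueVisit(counts, "bob", 12 * dayMs)
console.log(counts.uniqueVisits, counts.firstUniqueDay, counts.lastUniqueDay) // 2 10 12
```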
Now that we've exited the code specific to unique visits, we update the other counters and award 1 point for viewing an article. Note the use of "shard" again:
data.visits = (data.visits || 0) + 1
data.responses = data.responses || {}
await countRef.set(data) // Save the counts
for (let tag of article.processedTags || []) {
  await incrementTag(tag, "visits")
}
await incrementTag(shard, "visits")
await awardPoints(context.auth.uid, 1, "Viewed an article")
return null
})
incrementTag
I'm going to leave awardPoints until next time as it has to deal with cheating, but let's look at the incrementTag that was used frequently in the view code. The idea of this is to make a simple-to-increment counter with a name.
async function incrementTag(tag, value, amount = 1, options = {}) {
  const tagRef = db.collection("tags").doc(tag)
  const tagDoc = await tagRef.get()
  const tagData = tagDoc.exists
    ? tagDoc.data()
    : {
        ...options,
        tag,
        special: tag.startsWith("__"),
        event: tag.startsWith("__event_")
      }
  tagData[value] = (tagData[value] || 0) + amount
  await tagRef.set(tagData)
}
It uses the "tags" collection and sets up a couple of useful booleans, special and event, which help with finding the right records for reporting. Otherwise it's pretty simple: we get a record with the tag name and increment a named value by a specified amount.
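Because incrementTag reads the record and then writes it back, two overlapping calls can lose an update. As an alternative sketch, and not what the code in this article does, Firestore's FieldValue.increment combined with a merging set applies the change on the server. Here db and FieldValue are passed in as parameters, an assumption made so the sketch can run against a small in-memory fake; in the real file they would be admin.firestore() and admin.firestore.FieldValue. One difference to be aware of: options and the two flags are rewritten on every call rather than only on creation. This does not lift the per-document write rate limit either, so the shards are still needed.

```javascript
// Alternative sketch: server-side increment instead of read-then-write.
async function incrementTagAtomic(db, FieldValue, tag, value, amount = 1, options = {}) {
  await db.collection("tags").doc(tag).set(
    {
      ...options,
      tag,
      special: tag.startsWith("__"),
      event: tag.startsWith("__event_"),
      [value]: FieldValue.increment(amount),
    },
    { merge: true } // create the document if missing, otherwise merge fields
  )
}

// Tiny in-memory stand-in for the two Firestore calls used above, so the
// example runs without the Admin SDK. It is not a faithful emulator.
function makeFakeFirestore() {
  const store = {}
  const FieldValue = { increment: (n) => ({ __increment: n }) }
  const db = {
    collection: (name) => ({
      doc: (id) => ({
        async set(fields, { merge } = {}) {
          const key = `${name}/${id}`
          const current = merge ? store[key] || {} : {}
          for (const [k, v] of Object.entries(fields)) {
            current[k] =
              v && v.__increment !== undefined
                ? (current[k] || 0) + v.__increment
                : v
          }
          store[key] = current
        },
      }),
    }),
  }
  return { db, FieldValue, store }
}

async function demo() {
  const { db, FieldValue, store } = makeFakeFirestore()
  await incrementTagAtomic(db, FieldValue, "__all__3", "visits")
  await incrementTagAtomic(db, FieldValue, "__all__3", "visits", 2)
  return store["tags/__all__3"]
}

demo().then((record) => console.log(record.visits, record.special, record.event)) // 3 true false
```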
recommend
The recommend function produces a list of articles that should be shown in the widget. As previously mentioned, the algorithm favours newly published content that is recently popular and matches the tags of the current article (in that order).
To do this we want to perform as few queries as possible to save cost. For this reason (and as mentioned in the previous article) we copy data from the article to the "counts" collection records so we don't have to read both the "counts" and the "articles" for each recommendation to do this step.
exports.recommend = functions.https.onCall(
async ({ articleId, number = 10 }, context) => {
First we have our parameters: an articleId for the current article and a number of recommendations to make.
Next we check that we should be allowing this call:
if (context.app === undefined) {
  throw new functions.https.HttpsError(
    "failed-precondition",
    "The function must be called from an App Check verified app."
  )
}
Next we look up the current article so we can get its current tags. The user enters tags as a comma-separated string, but there is a trigger which converts them into a unique array of strings, in lowercase, for this function. We turn the tags into a Set:
const articleSnap = await db.collection("articles").doc(articleId).get()
const tags = articleSnap.exists
  ? new Set(articleSnap.data().processedTags)
  : new Set()
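The trigger that produces processedTags is not shown in this part. As a rough idea of the conversion it performs, here is a minimal sketch; processTags is a hypothetical name, and the trimming and empty-entry handling are assumptions beyond the "unique, lowercase" behaviour described above.

```javascript
// Hypothetical stand-in for the trigger's conversion step: split on commas,
// trim, lowercase, drop empty entries, and de-duplicate while keeping
// first-seen order.
function processTags(raw = "") {
  const cleaned = raw
    .split(",")
    .map((tag) => tag.trim().toLowerCase())
    .filter(Boolean)
  return [...new Set(cleaned)]
}

console.log(processTags("JavaScript, firebase,  React, javascript,").join(", "))
// javascript, firebase, react
```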
Next comes the expensive bit. We run a compound query on the "counts" collection for enabled articles that are not comment type, sort it by the unique days and the number of visits, and select double the number we will return (so we can post-process with tags).
const rows = []
const rowSnap = await db
  .collection("counts")
  .where("enabled", "==", true)
  .where("comment", "!=", true)
  .orderBy("comment", "desc")
  .orderBy("firstUniqueDay", "desc")
  .orderBy("lastUniqueDay", "desc")
  .orderBy("visits", "desc")
  .limit(number * 2)
  .get()
Firestore has all kinds of rules. Firstly, we are going to need an index for a query with a compound sort. Next, and importantly, if we use a != we must include that field in the index and the sort!
The easiest way to deploy Firebase stuff is with the CLI. It has a firebase.json file that tells it where to find things; mine has a reference to a file containing my Firestore indexes. Here are the contents of that file, which enables the above query:
{
  "indexes": [{
    "collectionGroup": "counts",
    "queryScope": "COLLECTION",
    "fields": [
      { "fieldPath": "enabled", "order": "DESCENDING" },
      { "fieldPath": "comment", "order": "DESCENDING" },
      { "fieldPath": "firstUniqueDay", "order": "DESCENDING" },
      { "fieldPath": "lastUniqueDay", "order": "DESCENDING" },
      { "fieldPath": "visits", "order": "DESCENDING" }
    ]
  }],
  "fieldOverrides": []
}
This says make an index on the specified fields for the "counts" collection.
With that index and the query above we now have rowSnap as a collection of records that matched. We use that to add a score for each tag the candidate article shares with the one that is being viewed. We sort by this score and then return the requested number of article ids that will be rendered as recommendations in the widget.
rowSnap.forEach((row) => {
  const record = row.data()
  if (row.id === articleId) return // never recommend the current article
  // Guard against records that have no tags copied onto them yet
  const score = (record.processedTags || []).reduce(
    (a, c) => (tags.has(c) ? a + 1 : a),
    0
  )
  rows.push({ id: row.id, score })
})
rows.sort((a, b) => b.score - a.score)
return rows.slice(0, number).map((r) => r.id)
}
)
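The post-processing step is easy to try out on its own. The sketch below is a pure version of it under a few assumptions: rankByTags is a hypothetical name, and candidates stands in for the query result flattened to { id, processedTags } objects in the order Firestore returned them.

```javascript
// Pure version of the tag scoring and sorting in `recommend`.
function rankByTags(candidates, currentId, currentTags, number = 10) {
  const tags = new Set(currentTags)
  return candidates
    .filter((c) => c.id !== currentId) // never recommend the current article
    .map((c) => ({
      id: c.id,
      // One point per tag shared with the current article
      score: (c.processedTags || []).filter((t) => tags.has(t)).length,
    }))
    .sort((a, b) => b.score - a.score) // stable sort: ties keep query order
    .slice(0, number)
    .map((c) => c.id)
}

const candidates = [
  { id: "a", processedTags: ["css"] },
  { id: "b", processedTags: ["javascript", "firebase"] },
  { id: "current", processedTags: ["javascript", "firebase"] },
  { id: "c", processedTags: ["firebase"] },
  { id: "d" }, // no tags copied onto the record yet
]
console.log(rankByTags(candidates, "current", ["javascript", "firebase"], 3).join(", "))
// b, c, a
```

Because Array.prototype.sort is stable in current Node.js versions, articles with the same tag score stay in the order the query produced, so recency and popularity still decide between them.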
wasClicked
If an article is clicked in the widget we just record that fact in the "counts" collection for the article.
exports.wasClicked = functions.https.onCall(async ({ articleId }, context) => {
  if (context.app === undefined) {
    throw new functions.https.HttpsError(
      "failed-precondition",
      "The function must be called from an App Check verified app."
    )
  }
  const countRef = db.collection("counts").doc(articleId)
  const doc = await countRef.get()
  const data = doc.exists ? doc.data() : {}
  data.clicks = (data.clicks || 0) + 1
  await countRef.set(data)
})
Deploying
Once we've built this file, we use the Firebase CLI: you just type firebase deploy and it sends the whole lot to the cloud. You can adjust "where" functions will live; by default it is "us-central1" and I've left mine there.
Conclusion
In this part we've seen how to make sharded counters and API calls using Firebase Functions. We've also covered the principles of article recommendations and the need for indexes in Firestore if you use more complicated queries. Next time we'll cover scoring and achievements.
Top comments (2)
Good job my friend. The only thing I would change is to use Cloud Run instead of Cloud Functions because pricing is going to get real high (Cloud Functions doesn't support simultaneous calls in a single instance)
Yes I'm going to do a version with Cloud Run next and then compare.