Page view counting is used throughout the web as a metric to gauge the popularity of posts. Depending on how accurate it needs to be and how much traffic a page gets, it can be as easy as +1'ing a record with SQL or grabbing it through Google Analytics.
For higher-traffic sites, constantly writing to and reading from the database on every view can cause issues. Additionally, where accuracy is crucial, incrementing a column every time someone refreshes a page will inflate the count and leave the stats far from accurate.
One solution is to use browser fingerprinting with Redis. In this post, I want to run through an example of how to do this with fingerprintjs, Redis, and Ruby on Rails.
Wikipedia describes browser fingerprinting as information collected about a remote computing device for the purpose of identification. Browser fingerprinting looks at a large number of factors and settings on a browser, including user agent, language, screen resolution, and installed fonts and plugins, to calculate a fingerprint for a user. It can be very accurate for determining uniqueness.
A popular way to calculate a visitor's browser fingerprint is with fingerprintjs. Fingerprintjs currently uses 24 unique characteristics with additional sources planned in the future.
Recently I needed to implement view count tracking and I figured I'd share how we decided to do it. Accurate view count tracking is important to us for two main reasons:
- Users watch the view count on their posts like hawks because it is an important way to gauge how popular their posts are with both internal users and external traffic from social media sites and search results.
- The site shares ad revenue with users based on views, so it's important to have accurate view tracking to do this correctly.
Our stack uses Ruby on Rails, so I'll use it for the examples in this post, but this implementation can easily be adapted for other setups.
We will also use jquery-cookie to set and read cookies, so include that library as well.
This is straightforward: it grabs the fingerprint and sets it as a cookie that we can then read server-side. When we initialize our JS, we run App.fingerprint.setFingerprintCookie() and the fingerprint is set.
Now that we have a fingerprint set as a cookie, we can read it server-side with Rails.
Let's jump to our posts controller, where we want to track the unique number of visits for a post. Here's an example of a basic setup for a posts controller.
Next we will use the after_action callback available through ApplicationController to run a method that will track the page view.
Note that there are two methods used here:
- fingerprint_cookie_value reads the fingerprint value that we set in the browser.
- track_page_view is where we insert the browser fingerprint into Redis. This uses Redis's SADD command.
According to Redis's docs, SADD will "add the specified members to the set stored at key. Specified members that are already a member of this set are ignored. If key does not exist, a new set is created before adding the specified members."
This is perfect for us: if the set doesn't exist yet, we want to create it and then add the fingerprint. If the set does exist, we want to add the fingerprint to it. If the fingerprint is already in the set, we want to ignore it so that every member stays unique.
The set can be named anything. For our purposes, the key starts with posts, is followed by the post id, and ends with uniques to signify what the set is used for. A colon separates the different parts of the key, so the set for post 42 would live at posts:42:uniques.
We will also use SADD to insert the post id into a set to keep track of which posts are currently being tracked.
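A minimal sketch of this tracking step, written as plain Ruby so it can run outside Rails, might look like the following. The class name PageViewTracker and the posts:tracked set name are assumptions made for illustration; the only thing required of `redis` is that it responds to `sadd`, as a client from the redis gem does.

```ruby
# Minimal sketch, not a definitive implementation.
# `redis` is any client that responds to `sadd` (for example, one from the redis gem).
class PageViewTracker
  # Hypothetical name for the set of post ids that currently hold fingerprints.
  TRACKED_POSTS_KEY = "posts:tracked"

  def initialize(redis)
    @redis = redis
  end

  # Key layout described in the post: "posts", the post id, then "uniques".
  def self.uniques_key(post_id)
    "posts:#{post_id}:uniques"
  end

  # Record one view. SADD ignores a member that is already in the set,
  # so repeat visits from the same browser do not change the count.
  def track(post_id, fingerprint)
    # Skip requests that arrive without a fingerprint cookie.
    return if fingerprint.nil? || fingerprint.to_s.strip.empty?

    @redis.sadd(self.class.uniques_key(post_id), fingerprint.to_s)
    # Remember which posts have pending fingerprints.
    @redis.sadd(TRACKED_POSTS_KEY, post_id.to_s)
    nil
  end
end
```

In a controller, the after_action method would call something like `track` with the post id and the value read from the fingerprint cookie.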
Now that we store a fingerprint in Redis whenever a new visitor views a page, we can read directly from the set to get a unique view count for a post using SCARD.
SCARD returns the set cardinality (number of elements) of the set stored at key, which, for our purposes, is the number of fingerprints in the set, or in other words the number of unique page views.
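Reading the count can be sketched in a few lines. The helper name unique_views is an assumption for illustration, and `redis` is any client that responds to `scard`, as a client from the redis gem does.

```ruby
# Minimal sketch: return the number of unique fingerprints recorded for a post.
# `redis` is any client that responds to `scard`.
def unique_views(redis, post_id)
  # SCARD returns 0 for a key that does not exist, so a post with no
  # recorded fingerprints simply reports zero views.
  redis.scard("posts:#{post_id}:uniques")
end
```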
If we weren't concerned about an expanding memory size in Redis due to large sets of fingerprints for each post, we could call it a day and always read directly from the set, which would give the most accurate and timely results. Depending on the amount of traffic to your site and how much you want to pay for memory, this could be a feasible option.
On our site, we need to track the views on millions of posts, so the sets could quickly become portly. We also have a views_count column on post records in our Postgres database that we read from, so there isn't a need to keep view counts in memory long-term.
For our purposes, we decided to run a rake task every 10 minutes that loops through new posts (posts within the last week) and increments the views_count column. For older posts, we do this nightly because real-time stats aren't as important. Additionally, every 24 hours we loop through the posts and empty the set.
Here is an example rake task:
When looping through the posts, each one is handed to a separate class called IncreaseViews, which is pushed into a background job.
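One possible shape for such a class is sketched below as plain Ruby. It is an illustration under stated assumptions rather than the code from our app: because the set is only emptied every 24 hours while the column is updated more often, this sketch stores how many fingerprints have already been counted under a hypothetical posts:<id>:counted key and adds only the difference. The `post` object is assumed to respond to `id`, `views_count` and `views_count=`, and `redis` to `scard`, `get`, `set` and `del`.

```ruby
# Minimal sketch, not a definitive implementation.
class IncreaseViews
  def initialize(redis, post)
    @redis = redis
    @post = post
  end

  # Add only the fingerprints that arrived since the previous run.
  # Returns the number of views added.
  def call
    total = @redis.scard(uniques_key)
    # GET returns nil for a missing key; nil.to_i is 0.
    delta = total - @redis.get(counted_key).to_i
    return 0 unless delta.positive?

    # With ActiveRecord this could instead be
    # Post.update_counters(@post.id, views_count: delta).
    @post.views_count += delta
    @redis.set(counted_key, total)
    delta
  end

  # Daily cleanup: count anything still pending, then drop the set and
  # the marker so memory is released.
  def reset
    call
    @redis.del(uniques_key, counted_key)
  end

  private

  def uniques_key
    "posts:#{@post.id}:uniques"
  end

  # Hypothetical key holding how many fingerprints were already counted.
  def counted_key
    "posts:#{@post.id}:counted"
  end
end
```

The rake task would then create one of these per post and call `call` on the 10-minute and nightly runs, and `reset` on the 24-hour run. Note that once a set is emptied, a returning visitor is counted again the next day.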
And that's an example of using browser fingerprinting with Redis and Rails for accurate and scalable view counting! If you have any questions, feel free to hit me up on Twitter.