How to build your own analytics (part one)

May 09, 2020

I have a blog (you’re reading it). It’s just a personal site I occasionally share some stuff on. That’s all it is, and all it ever should be. Naturally, like most humans on the internet, I do get a little positive sensation when people visit my site. But that’s really just a bonus. It’s not a reason to post more stuff or to post things I normally wouldn’t care so much about.

That is my official stance.

Knowing myself, I sometimes tend to deviate from my official stance; especially when I share content that people actually consume and really seem to appreciate. So I need to protect myself against these urges to play to an (imagined) audience.

I found analytics plays a key part in this process (negatively!). It’s a nice little feeling to know someone from another part of the world has visited your website. It’s a great feeling to know that a hundred people read your stuff while you were sleeping. But I definitely get the “I NEED TO MAKE MORE OF THIS”-jitters whenever I find out article X was read a lot by people who came from site Y and used search term Z and now probably want more.

In other words, when you get all Google Analytics about your website, it changes your mindset, and potentially your approach to what you were doing. And that’s not necessarily bad. However, when you just want to keep your thing your thing, I do experience it as detrimental.

So, what’s next? No more analytics? It’s a valid option, but also boring.

It led me to think about my early days on the web, when I still had websites with a little visible counter. Or, at a more advanced stage, a little Nedstat button that people could click to publicly view my stats.

Well, you can guess what happened next. Nostalgia took over. I decided to make my own analytics. How hard could it be, right? Just add another entry in the database whenever someone hits your site. And as an added bonus I also donate a little less data into the arms of the tech giants (or less directly).

So let’s make a little recipe for this analytics app. What do we need to get this thing running? I made a pact with myself though. This couldn’t suck up all of my time or stress me out. So if a thing seems too difficult or convoluted, I just leave it out. The bare minimum is just to count visitors, anything extra is a bonus.

The recipe

a server to handle business logic
I'm familiar enough with JavaScript, so a Node.js server seems appropriate.
a database to save data
I'm not really a DB-guy, but I recently installed PostgreSQL on my PC for other reasons. Might as well use it.
a way to log data on my blog and send it to the server
My blog uses Gatsby, so that will probably mean a thing or two.

Let’s log some stuff

If we can’t log a visitor, we don’t even have to start setting up the other stuff. So let’s start by looking at what we can log using the browser. Now as I said, my blog is made with Gatsby, a static site generator that uses React. If you don’t use Gatsby, that isn’t much of a problem. I’ll point out where things deviate.

The important thing is to log someone as soon as they enter your site. After that we need to log when they click on a link to another page. So if your site consists of five separate pages, each page would need a script that runs when the page has loaded.

Modern websites and apps however do not necessarily have different pages in the traditional sense. They often ‘fake’ other pages by changing the url, and then have the script on the single page show different content programmatically.

Gatsby uses a system like that. So we need to have a way to access that system so we can plant our little loggers. Thankfully Gatsby provides a file called gatsby-browser.js, in which you can export functions that Gatsby calls at certain moments. Two of those functions are of use to us:

onClientEntry
onPreRouteUpdate

The first function runs only when you initially open the website, while the latter runs on each subsequent navigation to another page on the website. Both are thus very usable for us. Let’s start with onClientEntry. What kind of information would we be able to gather at this point?

The browser provides us with a Window object, which in turn gives us access to other objects which all have tiny pieces of information that might interest us.

const language = navigator.language
const browser = getBrowser()
const page = window.location.pathname
const referrer = document.referrer
const dimensions = window.screen.width + " x " + window.screen.height

Let’s look at the code. There is a navigator object that provides us with the browser language, there is a location object that gives us a pathname, and there is also a document object which can give us a referring website. Lastly there is a screen object that provides us with the width and height of the visitor’s screen (not of the browser window).

And then there is the browser itself, which we might want some information about. This is however always a bit murky, since things change often in browserland, which makes any code that tries to identify browsers unreliable. Yet, you can always make an attempt, and I would suggest doing a Google search for the latest logic and/or a library that can help you with this.
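Pulled together, these lookups could live in one small helper. This is only a sketch, not the exact code my blog runs: the name collectVisitInfo is made up for illustration, and the browser objects are passed in as a parameter so you can also try the function outside a browser. The browser name is left out here, since that one needs its own logic.

```javascript
// Hypothetical helper (name invented for this example): gathers the values
// discussed above into one object.
// `env` defaults to the real browser globals; pass a stand-in object to experiment elsewhere.
function collectVisitInfo(env = globalThis) {
  const { navigator, location, document, screen } = env
  return {
    language: navigator.language,
    page: location.pathname,
    // document.referrer is an empty string when there is no referring page
    referrer: document.referrer,
    dimensions: screen.width + " x " + screen.height,
  }
}
```

In the browser a plain collectVisitInfo() should be enough, because globalThis is the Window object there.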
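To give an idea of what such an attempt could look like, here is a rough getBrowser sketch based on the user agent string. Treat it as an assumption-heavy illustration rather than the logic I actually use: the patterns only cover a handful of well-known browsers, and they will age badly, which is exactly why a maintained library is probably the safer choice.

```javascript
// Rough, hypothetical browser guess based on the user agent string.
// Order matters: Edge and Opera user agents also contain "Chrome",
// and Chrome user agents also contain "Safari".
function getBrowser(userAgent = navigator.userAgent) {
  if (/Edg(e|A|iOS)?\//.test(userAgent)) return "Edge"
  if (/OPR\/|Opera/.test(userAgent)) return "Opera"
  if (/Firefox\/|FxiOS\//.test(userAgent)) return "Firefox"
  if (/Chrome\/|CriOS\//.test(userAgent)) return "Chrome"
  if (/Safari\//.test(userAgent)) return "Safari"
  return "Unknown"
}
```

Anything the patterns don’t recognise simply ends up as "Unknown", which fits the pact: if it gets too convoluted, leave it out.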

With all this information we now have a faint clue about our visitor. We know their language, whether they were referred from another website and we can estimate whether they were on mobile, tablet or desktop by combining the browser and dimensions information.
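As an illustration of that last estimate, a device guess could be as simple as the sketch below. It only looks at the screen width from the dimensions string and ignores the browser, and both the function name and the 768 and 1024 cut-offs are my own assumptions, not any kind of standard.

```javascript
// Hypothetical device guess from a dimensions string such as "375 x 812".
// The 768 and 1024 pixel cut-offs are assumptions chosen for this example.
function guessDevice(dimensions) {
  // The width is the part before " x "
  const width = Number(dimensions.split(" x ")[0])
  if (Number.isNaN(width)) return "unknown"
  if (width < 768) return "mobile"
  if (width < 1024) return "tablet"
  return "desktop"
}
```

It will guess wrong now and then (small laptops, big tablets), but for a faint clue about the visitor that might be good enough.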

So just to be clear, all this information can also be gathered on non-Gatsby websites. Every browser provides those objects, whatever the website is built with. So for a regular multi-page website, you can make one small script that you run on each separate page. Back to Gatsby.

For onPreRouteUpdate things are not that different, except that this function also provides us with a prevLocation object. That is useful as a fallback when document.referrer is empty. Or in other words: which page was my visitor on when they clicked the link that brought them here?

const referrer = document.referrer
    ? document.referrer
    : prevLocation
    ? prevLocation.href
    : "Unknown"
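If the nested ternary is hard to read, the same fallback order can be written with early returns. This is just a sketch with a made-up function name (resolveReferrer); it takes its inputs as parameters so it can be tried without a browser.

```javascript
// Hypothetical helper with the same fallback order as the snippet above.
function resolveReferrer(documentReferrer, prevLocation) {
  // 1. An external (or previous-document) referrer reported by the browser
  if (documentReferrer) return documentReferrer
  // 2. Otherwise the previous page within the site, as passed in by Gatsby
  if (prevLocation) return prevLocation.href
  // 3. Direct visit, bookmark, or the browser withheld the referrer
  return "Unknown"
}
```

Inside onPreRouteUpdate this would be called as resolveReferrer(document.referrer, prevLocation). One thing to keep in mind: document.referrer is set when the document loads and is not updated by client-side navigation, so with this order an external referrer keeps winning on later route changes too.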

Now that we have gathered most of the information we want, we have to send it to a server that processes it and saves it to a database. We can create an addVisit function for that.

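async function addVisit({ page, browser, language, referrer, dimensions }) {
    try {
        // `url` is the address of the stats server (defined separately)
        const response = await fetch(url, {
            method: "POST",
            headers: {
                Accept: "application/json",
                "Content-Type": "application/json",
            },
            body: JSON.stringify({ page, browser, language, referrer, dimensions }),
        })
        // fetch only rejects on network errors, so check the HTTP status as well
        if (!response.ok) {
            console.log("Logging the visit failed with status " + response.status)
        }
    } catch (error) {
        console.log(error)
    }
}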

As you can see it’s quite a straightforward fetch that sends a JSON package to a server we have yet to build, which is referred to by the url variable. To finish up our frontend work, it’s a good idea to use an environment variable right away, so that our local visits are not logged to the future production server.

const url =
    process.env.NODE_ENV === "development"
        ? "http://localhost:8002/stats"
        : "https://serverweneedtobuild.com/stats"

Now, is there anything we missed or any information we still might want to know? Well, obviously we want to have the date and time of the visit, and other than that I always like to know which country the visit is from. We’re going to take care of both of these on the server, however. So that’s exactly what the next article is going to be about: the server.

We’ll be setting up a Node.js server that takes our information package, processes it, adds some extra info and then saves it to a database. And then when that is done, we’re going to make a nice little public stats dashboard so you can see how many people visited my blog (yikes!). But that’s for part three.