📑 Coffee's Blog

Making a Simple, Serverless Analytics Solution

Coffeebank 🗓️ December 27, 2022

The Case Study 🐈

While building websites, I found that I only needed three things from analytics:

  • Time of site visit: Am I getting a flood of traffic at a certain time?
  • User agent: What browsers do I need to prioritize? Are there a lot of Mobile Safari users?
  • Pages visited: What is a normal user journey like? Are they running into 404 pages frequently?

The last one mattered most. I needed to know whether users were getting stuck moving from one page to another, not just basic page view counts.

However, many of the solutions on the market offered either too little or too much. Microanalytics.io and Beampipe were nice, but didn't track sessions. Matomo was too large and required maintaining a server.

So I decided to build my own. A simple, privacy-friendly analytics solution that logs user trips across pages. Self-hostable, serverless, and scalable.

Getting Started

Services

  • MongoDB (free)
  • Netlify Functions (free)

My approach follows the Jamstack philosophy. The data is stored in MongoDB. To receive and process the write requests, I create an API using Netlify Functions. Both are lightweight, and their free tiers are plenty for small sites.

On the frontend, a statically hosted script is loaded on every page and runs on each page load. It caches the timestamp of the first load in sessionStorage (which is cleared when the tab closes).

Every time the script runs, it sends a POST request to the Netlify Function with the cached timestamp (as a de facto session ID) and the basic data I'm looking for (in this case, the current time, the page path, and the user agent).

On the backend, MongoDB Compass can query and analyze the data directly from the database. You can also build your own frontend on top of the database, since the data is entirely yours.
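
As a rough illustration of the kind of analysis this enables, the sketch below groups exported records into one ordered page list per session. It assumes each record carries the session, time, and page fields that the function in step 2 stores. The buildJourneys helper and the sample records are hypothetical, made up for this example.

```javascript
// Hypothetical helper: turn raw page-view records into ordered journeys.
// Assumes each record looks like { session, time, page }, where `time`
// is a string that Date.parse() understands.
function buildJourneys(records) {
  const bySession = new Map();
  for (const record of records) {
    if (!bySession.has(record.session)) {
      bySession.set(record.session, []);
    }
    bySession.get(record.session).push(record);
  }

  const journeys = {};
  for (const [session, visits] of bySession) {
    journeys[session] = visits
      .slice() // copy so the caller's array order is left untouched
      .sort((a, b) => Date.parse(a.time) - Date.parse(b.time))
      .map((visit) => visit.page);
  }
  return journeys;
}

// Made-up sample data, for illustration only.
const sampleRecords = [
  { session: 1672142400000, time: '2022-12-27T12:00:05Z', page: '/pricing' },
  { session: 1672142400000, time: '2022-12-27T12:00:00Z', page: '/' },
  { session: 1672146000000, time: '2022-12-27T13:00:00Z', page: '/missing-page' },
];

console.log(buildJourneys(sampleRecords));
```

A session whose list ends on a page that doesn't exist is the kind of "stuck" journey I wanted to be able to spot.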

1. MongoDB

Create a database and a collection. Then, create a user scoped to only read/write to that database/collection.

2. Netlify Functions

Using GitHub, I create a basic repo with a /netlify/functions/postDb.js file. Following the Netlify docs, I write some code to accept a JSON payload, and then send it into my database.

Note that if your website isn't hosted on the same Netlify site as the function, you will need to explicitly allow the domain(s) you're using, because the browser enforces CORS.
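
If you need to allow more than one domain, one option is to echo back the request's Origin only when it is on an allowlist. The sketch below is an illustration under assumptions, not part of the function that follows: corsHeadersFor, ALLOWED_ORIGINS, and the example.com domains are hypothetical. Because the response also sets Access-Control-Allow-Credentials to true, browsers will not accept a wildcard origin, so a specific origin is returned.

```javascript
// Hypothetical allowlist; replace with the domains that load your script.
const ALLOWED_ORIGINS = ['https://example.com', 'https://www.example.com'];

// Returns CORS headers for an allowed origin, or null for anything else.
function corsHeadersFor(origin, allowedOrigins = ALLOWED_ORIGINS) {
  if (!origin || !allowedOrigins.includes(origin)) {
    return null;
  }
  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Credentials': 'true',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Allow-Methods': 'POST, OPTIONS',
    'Vary': 'Origin', // responses differ per origin, so caches must key on it
  };
}

console.log(corsHeadersFor('https://example.com'));
console.log(corsHeadersFor('https://not-my-site.example'));
```

Inside the function, the origin should be readable from the request headers on the event object. When the helper returns null, the request can be rejected without any CORS headers.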

/netlify/functions/postDb.js
const { MongoClient } = require("mongodb");

// To enable CORS
const headers = {
  'Access-Control-Allow-Origin': '🟨🟨🟨🟨🟨',
  'Access-Control-Allow-Credentials': 'true',
  'Access-Control-Allow-Headers': 'Content-Type',
  'Access-Control-Allow-Methods': 'POST, OPTIONS'
};

async function postData(jsonSession, jsonTime, jsonUserAgent, jsonPage) {
  const uri = 🟨🟨🟨🟨🟨;
  const client = new MongoClient(uri, {
    useNewUrlParser: true,
    useUnifiedTopology: true,
  });
  try {
    await client.connect();
    const commitData = await client
      .db('🟨🟨🟨🟨🟨')
      .collection('🟨🟨🟨🟨🟨')
      .insertOne({
        session: parseInt(jsonSession, 10),
        time: jsonTime,
        useragent: jsonUserAgent,
        page: jsonPage
      })
    return commitData;
  } catch (err) {
    console.log(err); // output to netlify function log
    throw err; // rethrow so the handler responds with 500 instead of 'Success'
  } finally {
    await client.close();
  }
}

function errOutput(statusCode, err) {
  return {
    statusCode: statusCode,
    headers: {
      'Access-Control-Allow-Origin': '🟨🟨🟨🟨🟨',
      'Access-Control-Allow-Headers': 'Content-Type'
    },
    body: JSON.stringify({ error: err })
  }
}

exports.handler = async function(event, context) {
  // Add CORS support
  switch (event.httpMethod) {
    case 'POST':
      // Missing or incomplete input is a client error, so respond with 400
      if (!event.body) {
        return errOutput(400, "No data in json");
      }
      try {
        const data = JSON.parse(event.body);
        if (!data.sessionId || !data.time || !data.useragent || !data.page ) {
          return errOutput(400, "Improper data in json");
        }
        const commitData = await postData(data.sessionId, data.time, data.useragent, data.page);
        console.log(commitData);
        return {
          statusCode: 200,
          headers,
          body: 'Success'
        };
      } catch (error) {
        return {
          statusCode: 500,
          headers,
          body: JSON.stringify({ error: 'Failed' }),
        };
      }
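    case 'OPTIONS':
      return {
        statusCode: 200, // <-- Must be 200 otherwise pre-flight call fails
        headers,
        body: '200'
      };
    default:
      // Without this, any other method would make the handler return undefined
      return errOutput(405, "Method not allowed");
  }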
};

3. Frontend Script

Save this script with your website, then load it on every page. It uses Navigator.sendBeacon() to send the data, so the request can still go out even if the tab closes right afterwards.

You can exclude a device by setting "analyticsIgnore" to "true" in localStorage. Note that some ad blockers, such as those using the EasyPrivacy list, may block Navigator.sendBeacon() requests.

For frameworks with client-side routing (e.g. React, Vue, Svelte), you can add routing middleware that calls Navigator.sendBeacon() on every route change.
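
If your router doesn't expose a hook, one possible approach is to wrap history.pushState and listen for popstate. This is only a sketch: trackRouteChanges and sendPageView are hypothetical names, and the window object is passed in as a parameter so the helper can be exercised outside a browser. It does not cover history.replaceState or hash-based routing.

```javascript
// Hypothetical helper: report a page view on every client-side navigation.
// `win` is the browser's window object; `sendPageView` receives the new path.
function trackRouteChanges(win, sendPageView) {
  const originalPushState = win.history.pushState;

  // Programmatic navigations (router links) go through pushState.
  win.history.pushState = function (...args) {
    const result = originalPushState.apply(this, args);
    sendPageView(win.location.pathname);
    return result;
  };

  // Back/forward buttons fire popstate instead.
  win.addEventListener('popstate', () => {
    sendPageView(win.location.pathname);
  });
}

// In a browser, this could be wired up roughly like so:
// trackRouteChanges(window, (path) => {
//   /* build the same payload as the script below and call navigator.sendBeacon() */
// });
```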

index.js
// Compare the raw stored string: JSON.parse("true") yields the boolean true, which never equals "true"
if (localStorage.getItem("analyticsIgnore") !== "true" && window.location.hostname !== "localhost") {
  let varSession;
  let varTime;
  let dateObj;
  let varUserAgent;
  let varPage;

  try {
    // sessionStorage is cleared/reset when tab closes
    // use the Unix timestamp (in milliseconds) of the first page load as session id
    let varId = sessionStorage.getItem('varId');
    
    if (varId) {
      // ongoing session
      varSession = varId;
      dateObj = new Date();
      varTime = dateObj.toString();
    } else {
      // new session
      dateObj = new Date();
      varSession = dateObj.getTime(); // milliseconds since epoch, without a string round trip
      varTime = dateObj.toString();
      sessionStorage.setItem("varId", JSON.stringify(varSession));
    }
    console.log(varSession);

    varUserAgent = navigator.userAgent;
    varPage = window.location.pathname;
  } catch (error) {
    console.log({ error });
  }

  // send to database
  try {
    const options = new Blob(
      [JSON.stringify({
        sessionId: parseInt(varSession),
        time: varTime,
        useragent: varUserAgent,
        page: varPage,
      })],
      {type : 'application/json'}
    );
    navigator.sendBeacon('🟨🟨🟨🟨🟨', options);
  } catch (error) {
    console.log({ error })
  }
} else {
  console.log("analyticsIgnore");
}

Results

I was pleasantly surprised at how nicely this setup worked. The benefits are great:

  • MongoDB comes with powerful tools, so everything works out-of-the-box.
  • You own the database, so you have full data independence, and can host it anywhere.
  • You create the API, so you can easily customize what data you want to collect and store.
  • Since there is no server, nothing ever goes "out of date", goes "down" for maintenance, or racks up expensive server costs on traffic spikes.

This journey has walked us through NoSQL databases, keeping your MongoDB login out of the browser by using Node.js (Netlify Functions), and using browser APIs to persist and send data across pages.

Some enhancements to consider are rate limiting the Netlify Function, enforcing a MongoDB schema, and validating incoming data.
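
As a starting point for the validation idea, here is a minimal sketch of a stricter payload check that could run before the insert. The validatePayload name and its length limits are assumptions for illustration, not part of the function above.

```javascript
// Hypothetical validator for the { sessionId, time, useragent, page } payload.
// Returns a list of problems; an empty list means the payload looks acceptable.
function validatePayload(data) {
  if (data === null || typeof data !== 'object') {
    return ['payload must be a JSON object'];
  }
  const errors = [];
  if (!Number.isSafeInteger(data.sessionId) || data.sessionId <= 0) {
    errors.push('sessionId must be a positive integer');
  }
  if (typeof data.time !== 'string' || Number.isNaN(Date.parse(data.time))) {
    errors.push('time must be a parseable date string');
  }
  // The 512 and 2048 limits are arbitrary example values.
  if (typeof data.useragent !== 'string' || data.useragent.length > 512) {
    errors.push('useragent must be a string of at most 512 characters');
  }
  if (typeof data.page !== 'string' || !data.page.startsWith('/') || data.page.length > 2048) {
    errors.push('page must be a path starting with "/" of at most 2048 characters');
  }
  return errors;
}

console.log(validatePayload({ sessionId: 'abc', time: 'not a date', useragent: 42, page: 'nope' }));
```

The handler could then respond with a 400 whenever the returned list is not empty.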

About

Cats.js stands for "Coffeebank Analytics Tracking Solution - JavaScript".

Yes, I also happen to like cats.

License: MIT - See License. Hope this is helpful for your analytics needs!

Last Updated: December 27, 2022