I want to create a link shortener and I'm still trying to figure out how to track every single view. I'm using Node.js, MongoDB and Mongoose. The main reason my schemas below are so deeply nested is that I want a system that can show stats for a specific link and for a whole user, and in both cases for specific days. The all-time stats are stored as plain counters, so they can be read instantly without iterating over thousands of database entries.
Everyone who wants to share links has to create an account with the following schema:
{
  // User.js
  // some other unimportant properties

  // all time stats for the user
  stats: {
    views:       { type: Number, default: 0 },
    uniqueViews: { type: Number, default: 0 }
  },

  days: {
    // This is explained below
    type: mongoose.Schema.Types.Mixed,
    default: {}
  }
}
The link schema looks like this:
{
  // Link.js
  // some other unimportant properties

  user: { type: String, ref: 'User' },

  stats: {
    views:       { type: Number, default: 0 },
    uniqueViews: { type: Number, default: 0 }
  },

  days: {
    // This is explained below
    type: mongoose.Schema.Types.Mixed,
    default: {}
  }
}
Every time someone visits a link generated by a user, I want to track the impression. To show statistics for a single day or, for example, the last 7 days, I have a (dynamic) `days` field in both Link.js and User.js. If you took a snapshot of the database, the `days` fields would look similar to this:
// User.js
days: {
  '1456095600000': { // new Date() with cleared time and getTime()
    views: 20,
    validViews: 18,
    device: {
      'desktop': { views: 20, validViews: 18 },
      'tablet':  { views: 14, validViews: 12 },
      'mobile':  { views: 45, validViews: 40 }
    },
    country: {
      // etc.
    }
  }
  // a bunch of other days
}

// Link.js
days: {
  '1456095600000': {
    views: 3,
    validViews: 2,
    earnings: 0.0045,
    device: {
      'desktop': { views: 20, validViews: 18 },
      'tablet':  { views: 14, validViews: 12 },
      'mobile':  { views: 45, validViews: 40 }
    },
    country: {
      // etc.
    }
  }
  // a bunch of other days
}
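To make the day key concrete, here is a minimal sketch of how it is derived. The helper name `dayKey` is only for illustration and does not exist in my code; it assumes the server's local time zone decides where a day starts.

```javascript
// Hypothetical helper: returns the key used inside `days` for a given date.
// The key is the timestamp (in ms) of local midnight, as a string.
function dayKey(date) {
  var d = new Date(date.getTime()); // copy, so the caller's date is not mutated
  d.setHours(0, 0, 0, 0);           // clear the time part (local time zone)
  return String(d.getTime());
}

// Every view on the same calendar day maps to the same key.
var morning = dayKey(new Date(2016, 1, 22, 9, 15));
var evening = dayKey(new Date(2016, 1, 22, 23, 59, 59));
console.log(morning === evening);
```

Because the key depends on the local time zone, all app instances would presumably need to run with the same time zone setting for the keys to line up.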
Now finally the route where the actual tracking is happening:
app.get('/:id', function (req, res, next) {
  var today = new Date();
  today.setHours(0, 0, 0, 0);
  var day = today.getTime(); // now we have "today 0:00" in ms

  // some (magic) tracking data, like country or device
  var device = req.device,
      country = req.country;

  // checking for a (magic) "valid" view (a valid view is one view per session)
  var isValid = !!req.valid;

  Link.findById(req.params.id).populate('user').exec(function (err, link) {
    if (err) return next(err);
    if (!link) return res.sendStatus(404);

    // redirect the user first, then track
    // (the redirect needs the loaded link, so it has to happen in here)
    res.redirect(link.destination);

    // user first:
    link.user.stats.views++;
    if (isValid) link.user.stats.uniqueViews++;

    // assume that the following objects were initialized before.
    link.user.days[day].views++;
    if (isValid) link.user.days[day].validViews++;
    link.user.days[day].device[device].views++;
    if (isValid) link.user.days[day].device[device].validViews++;
    // and so on with `country` and `referrer`

    // the same now for the link:
    link.stats.views++;
    if (isValid) link.stats.uniqueViews++;

    // assume that the following objects were initialized before.
    link.days[day].views++;
    if (isValid) link.days[day].validViews++;
    link.days[day].device[device].views++;
    if (isValid) link.days[day].device[device].validViews++;
    // and so on with `country` and `referrer`

    // mark the mixed objects as modified and save both documents
    link.markModified('days');
    link.user.markModified('days');
    link.save();
    link.user.save();
  });
});
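One thing I'm unsure about with the load, modify and save approach is that two simultaneous views could overwrite each other's counters. As a rough sketch of an alternative I'm considering, the same counters could be expressed as a single `$inc` update with dotted paths. The helper `buildIncrements` is hypothetical, and it assumes `device` is a simple value like 'desktop' without dots in it:

```javascript
// Hypothetical helper: builds the `$inc` document for a single view.
// `day` is the "today 0:00" timestamp, `device` a plain string such as 'mobile'.
function buildIncrements(day, device, isValid) {
  var base = 'days.' + day;
  var inc = {};

  // every view counts towards the plain view counters
  inc['stats.views'] = 1;
  inc[base + '.views'] = 1;
  inc[base + '.device.' + device + '.views'] = 1;

  // only "valid" views count towards the unique/valid counters
  if (isValid) {
    inc['stats.uniqueViews'] = 1;
    inc[base + '.validViews'] = 1;
    inc[base + '.device.' + device + '.validViews'] = 1;
  }
  return inc;
}

// Shows the paths that would be incremented for one valid mobile view.
console.log(buildIncrements(1456095600000, 'mobile', true));
```

The result could then presumably be passed to something like `Link.updateOne({ _id: req.params.id }, { $inc: inc })` and the equivalent call on the user. As far as I understand, MongoDB's `$inc` creates missing fields, so the day objects would not need to be initialized first, but I have not verified how this behaves with the `Mixed` type.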
Am I doing it right, or would you do it completely differently? A friend of mine asked whether MongoDB is even the right database for this, but I don't think that is a problem, is it? With the approach I described, the `days` objects of users with dozens of links will eventually become really big. Can this be a problem (assuming I have 6 powerful cores (with the app clustered, of course), 40 GB of RAM and an SSD)?