
Project Architecture

I’m going to go over a few points of my project architecture as it currently exists - this won’t include any token-based authorization (which I’m in the process of adding). You can check out my GitHub repo to see the project as it currently stands.

index.js

The server is set up with the following files - index.js, handlers.js, data.js, config.js and helpers.js.

We begin by exporting the correct config object from our config.js based on the environment specified - the code defaults to staging unless PRODUCTION is specified as the process.env.NODE_ENV parameter when starting the server. The config object contains some other properties to be discussed later - the important part for now is that, depending on the production or staging environment, we have a specific port our server listens on.

Once a request comes in at the port we’re on (pre-deployment here), our server receives it and passes it along to a server logic function.

That function uses the Node.js URL module to parse the request object

  • It first grabs the pathname and trims it. We use this trimmed path later to check that our router has the specified path.

  • Next, it gets the query string (for GET requests, since our POST requests will be coming in the request body).

  • Next, it gets the HTTP request method (GET, POST, PUT, DELETE).

  • Finally, it gets the request headers (content-type, for example).

The same server logic function then handles collecting body data (for POST and PUT requests) using event listeners (req.on). It currently uses a string decoder to append data to an empty string, though this could likely be simplified by specifying utf-8 encoding on the request stream up front.

Once the request data has been collected, we implement our routing logic with a small router object. This is where the trimmed pathname comes into play - if the path exists in our router object, we assign the corresponding handler to a handler variable. If it doesn’t exist, we deal with a “not found” scenario.
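
The router itself is just an object lookup. A sketch with the handlers stubbed inline (the real ones live in handlers.js):

```javascript
// the routing step as a sketch - handler functions are stubbed inline here,
// where the real project pulls them in from handlers.js
var handlers = {
  users: function (data, callback) { callback(200) },
  notFound: function (data, callback) { callback(404) }
}

var router = {
  'users': handlers.users
}

// pick the handler for a trimmed path, falling back to a "not found" handler
function chooseHandler (trimmedPath) {
  return typeof router[trimmedPath] !== 'undefined'
    ? router[trimmedPath]
    : handlers.notFound
}
```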

Our handler variable now contains a reference to a handler function in our handlers.js file. We then create an object using the information gathered from the request object - this includes parsing our gathered data string into an object using a helper function from our helpers.js file.

Our handler variable (which is a function) calls the appropriate function in our handlers.js file, passes along our data object (containing the information gathered from the client request) and receives a status code and a payload from a callback.

data.js

Before we dive into the handlers.js file, we need to look at the data storage functions we have here. These are what ultimately get called by the handler functions once they extract the appropriate data from the client request data object sent from the index.js file.

It’s important to note that we have a folder within our project directory called “.data” - this is where we store our data. Additionally, that folder has subfolders such as “token” and “users” for specific pieces of data.

Our data object that we export from the data.js file has four functions - create, read, update and delete. These all correlate to handler functions that work with CRUD functionality. This project currently only writes to local file stores (soon to be updated to use MongoDB or a hosted PostgreSQL DB - I haven’t decided which) - as such, the first thing the data.js file does is grab the current project directory. We also create a .data folder to store data files (one per user) and append it to the project directory filepath so that we can write files to it.

The create function takes a directory, a filename to create, a data object and a callback (which returns an error and data - data in the event that we’re reading a file and need to return the contents). We open the file by concatenating the project directory, file directory, filename and “.json”, and pass the appropriate I/O flags - they allow for writing to the file and also create the file if it doesn’t exist. If there is no error, we have a file descriptor ready to use.

We take the data object to write (which is a JavaScript object), stringify it and call fs.write, passing along our file descriptor and the stringified data object. If there is no error, we close the file. We also handle all possible errors sent from callbacks along the way.

The read function accepts a directory and a filename as parameters and sends a callback to the handler that invokes it (responding with an error and the data we’re reading from the file system). The subfunction in the handlers.js file that calls data.read passes along a data object containing identifying data used to find the right file to read from (we’re using a phone number in this case). Just as with the create function, we use the Node fs module to read the file at the project directory + file directory + filename passed + “.json” - if there is no error and there IS data from the callback, we parse the JSON into an object and include it in the callback.

The update function accepts the exact same parameters as the create function - we need to check that the file exists so we first open it. If it exists, we truncate the file using the file descriptor passed from opening it - if there is no error, we use fs.writeFile and stringify the data before passing it to the file. Then we close the file and we’re all done.

Our delete function takes the same parameters as our read function (since, at its core, it performs the same task of locating the file). We build the filepath using our project directory variable + directory + filename + “.json” - we pass that to fs.unlink and if there is no error, we callback “false” as an error.

handlers.js

Our entire file is a large exported object with functions attached - this is similar to our data.js file. Our file has several objects - a handlers object (which is what we export) and a handlers._users object which houses all of our route handlers. This is a form of encapsulation - files importing our handlers.js file only interact with the handlers.users method, not the handlers._users property of handlers (which itself is an object). We access handlers._users FROM the handlers.users method. To illustrate this point :

var handlers = {}
// our index.js router is going to hit this function here but has NO knowledge of handlers._users
handlers.users = function (data, callback) {
    var acceptedMethods = ['get', 'post', 'put', 'delete']
    var requestMethod = data.method.toLowerCase()
    if (acceptedMethods.indexOf(requestMethod) > -1) {
        handlers._users[requestMethod](data, callback)
    } else {
        callback(405)
    }
}
handlers._users = {}
handlers._users.post = function (data, callback) { /* logic here… */ }
module.exports = handlers

Here’s what the handlers object that is exported would look like as an object literal -

var handlers = {
    users: function (data, callback) { /* put request method validation code here */ },
    _users: {
        post: function (data, callback) { /* post route logic here */ },
        get: function (data, callback) { /* get route logic here */ }
    }
}

The _users property of the handlers object is itself an object - however, that is abstracted away from our index.js file, which only calls the handlers.users method (which in turn calls the requisite handlers._users method).

The post function is passed data and a callback method from our index.js file - we’ll return a status code and a payload object (an error, if we have one) back to the calling function in index.js. The data object, remember, has a querystring or a payload we collected using an event listener on the client request object.

It’s important to note that we’re going to consistently use an inputted phone number as our FILENAME - this will be used for all lookups. We have several required fields that must be present in our data payload in order to create a user - phone number, first name, last name, password and tosAgreement. We extract all of those values from data.payload - if we have successfully extracted ALL of them, we can continue. First we sanity check to see if the user exists - we call data.read and pass in the phone number as our filename and “users” as our file directory. IF THERE IS AN ERROR, THAT MEANS THE USER DOES NOT EXIST - we may proceed with creating the user.

First we need to hash our password - we create a hash function in our helpers.js file that hashes our user password. You can use an HMAC and store the hash key in the config.js file (and add that file to your .gitignore) or just use a SHA256 hash without a hash key. We sanity check the length and type of the value to be hashed and then return the hash. Back in our post function, within our call to data.read, if we were successfully returned a hashed password, we build an object to store on file. It contains the phone number, first name, last name, tosAgreement and the HASHED password - we then call data.create, pass in the users directory and the extracted phone number as the filename and pass in our object we created. We then handle all err-first callback responses accordingly and we ourselves return 200 as the statusCode in our callback to the index.js handler function.

Our get function accepts data and a callback - as with all other handler functions, we’re going to call back a status code and a payload (which is an optional error object). We are looking for the query string here in the URL since we aren’t doing anything with data collection from the request object - we need to extract the phone number from data.query.phone. This SHOULD be the same as the filename in our .data/users folder so we can use that for lookup. If we successfully got the phone number from the query, we initiate a call to data.read and pass in the users directory and the phone number as the filename. If there is no error and there is data, we can remove the hashed password from the data object before returning it in a callback along with a 200 status code.

Our put function is a little more involved in terms of logic - we only want to overwrite fields that need to be changed. While we’re not going to compare new values to the original ones, we can always add that in later. As it stands, if a user submitted a new first name of “Karan” and the first name on file is “Karan”, we’ll still replace it. A phone number is a required field here - we need it to read the file contents. If we managed to extract a phone number, we see if there is a first name, last name or password to update. If ANY of those fields have been sent in the request body, we need to change them. We call data.read and pass in our phone number as the filename - if there is no error and we ARE returned a data object, we replace the data object’s property values with our new ones if they have been included in the request body. We then call data.update, pass along our new data object and proceed to handle all err-first callback cases.

Our delete function is very straightforward - we take in data and return a callback as with all other handlers methods thus far. We check the data query for a phone number - if we manage to extract it, we read the user’s file to make sure they exist. If we can read the file, we then call data.delete and handle all err-first callback cases.

That’s it so far! Follow up posts will include more writing and code samples for the next set of features to add in - token based authentication, building out a simple front end with vanilla Javascript and (probably) replacing all callbacks with async/await since it comes standard out of the box with the new V8 engine now.

Lessons Learned Building a Pure Node Project

This is an article I’ll be adding to as I (inevitably) come across more issues working with node.js - this is a list of problems encountered and the solutions that worked for me.

I’ve been building out an API for the last few weeks whenever I get a chance - it started as a CLI to check website responses (think isitdown) but I’m now going to build out a vanilla JS front end for it. The API is in pure node - no external dependencies, no package.json file, nothing but the local modules bundled with the node.js installation package.

This was originally a project meant to teach me more about node.js itself and general basic backend work. Little did I know how much I’d gain from working on this - working with pure node.js is pretty fun. The biggest downside I’ve experienced so far is callback hell - I wanted to work without promises or async/await and thus have had no choice but to deal with callbacks. Even in a file with as few as 150 LOC, it’s difficult at first glance to understand where each callback fits in with the program flow.

Besides my index.js file which creates an instance of an HTTP and HTTPS server, I have a few other typical files - a config.js file, a data.js file which contains my local file storage methods using the fs module, and most importantly, a handlers.js file which is just a layer over the data.js functions and routes requests to the appropriate data.js function (with a little extra code in there).

The First Problem

One thing I spent far too much time on was an undefined object value in JavaScript - this could be due to a million different things, so combing through the source code was necessary. Was I performing an asynchronous operation somewhere (either setting or getting the object property/value pair) and accessing it before the operation completed? No. This was the most likely culprit to me, but it didn’t seem to be happening anywhere.

Essentially I was trying to access a configuration property value - it was my HMAC secret key. The config file had it right there in front of my face so I knew I wasn’t trying to access something that just didn’t exist. Additionally, and THIS was the confusing part - I could get the value of any other config object property. The only one that came back undefined every time was my secret key.

I was stumped. At some point I realized I was probably referencing the config file incorrectly (this was a shot in the dark). I found my answer - I had changed the directory structure of my project and had put the config.js file in a new spot. I had forgotten to change my require() statement at the top of my file referencing the config object to point to the new file location.

How was it possible that I was still able to reference PART of my original config object even though I wasn’t pointing my require statement to the correct location? Cache, cache, cache. I’ve been using VSCode and, according to what I’ve read, it makes heavy use of caching. It had cached the old file and was keeping it around in the editor - whatever changes I made to that file (adding in the HMAC key) were SHOWING that they were being made to the code, but they were saving to the NEW location that wasn’t actually being referenced.

Once I changed the require() statement, everything worked perfectly.

Weird Errors in JS

I was met today with two error messages while continuing to build my Node.js project - one I’d seen before, the other was new. JavaScript, as a dynamically-typed and interpreted language, can display strange behavior that can be tricky to understand at times.

Today I received the following error when trying to log an object:

[screenshot: recreating the issue using a null prototype object]

This happened when I tried to concatenate a string I was printing to the console with a URL query string object:

console.log('this was the URL query string contained in the client request:' + queryString)  

To try and pin down what was happening, since I knew I could make no assumptions about the queryString object being received since I didn’t create it, I created a mock queryString object:

var queryString = {'foo' : 'bar'}
console.log('this was the URL query string contained in the client request:' + queryString)

This time, I managed to print it out without any issue (even though the object didn’t actually print out and instead showed [object Object]). That output is the correct string representation of an object that hasn’t been stringified - I realized that the JavaScript engine was using type coercion to convert my mock object into a string (since I was concatenating it to an actual string, it was treating the whole thing as one string). I figured the engine must have been calling toString() on that object - so why wasn’t this happening to the ACTUAL query object I was getting from the client request?

I opened up Chrome and used the browser console there since it lists out type methods available that can be called on a given variable. I recreated my mock query string object and, using the dropdown, saw that toString() was available to be called on it. That showed me what was happening, more or less, in my node project - by concatenating to a string, the engine was coercing my object into a string by calling toString() on it, leaving the [object Object] string representation printed to the console.

Again in Chrome, I created a null prototype object with the same key-value pair that I had in my original mock queryString object. When I used the dropdown to check the prototype methods available, there was no toString() to be seen - my guess was that the request URL query string object was inheriting from a prototype that didn’t have the toString() method available (or from no prototype at all). As a result, when trying to coerce the object to a string, the engine tried calling toString(), found that it couldn’t and printed the error to the console about being unable to convert to a primitive value.

The Solution

Instead of concatenating an object of unknown origins to a string you’re printing, just use a comma instead of the “+” operator! This has the additional bonus of not converting (or attempting to convert) the object to a string.
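
Here’s the difference in miniature, recreated with a null prototype object like the one the query parser hands back:

```javascript
// recreating the issue: a null prototype object has no toString(),
// so string concatenation can't coerce it to a primitive
var queryString = Object.create(null)
queryString.foo = 'bar'

// console.log('query: ' + queryString) // TypeError: Cannot convert object to primitive value

// passing it as a separate argument avoids the coercion entirely
console.log('query:', queryString)
```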

The Next Error

The second error I came across was in a block of code that, I realized, executed two callbacks when dealing with the same request. “Error: Can't set headers after they are sent.” was what I was shown on the terminal after my server crashed - I could figure out WHAT was happening from the error message but I wasn’t sure WHY it was happening.


The error message itself can be a little confusing - what’s happening in this case is that a handler callback is sending a status code and payload to our index.js file. We send a response header when the first callback is returned - we’re then calling res.end(), closing out the response. A little further down in our handler function, we’re trying to send ANOTHER callback for the same request (even though the response has ended), resulting in our error.

The Solution

Use log statements or your preferred debugging method to figure out which function is causing the issue - once you’ve pinned it down, look for multiple callbacks responding to the same request and reduce to just one callback.
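
Beyond debugging, one defensive pattern (not something the original project does) is to wrap the response callback so it can only ever fire once:

```javascript
// wrap a function so repeated calls are ignored - a defensive guard
// against accidentally responding to the same request twice
function once (fn) {
  var called = false
  return function () {
    if (called) return // a second callback would otherwise trigger the header error
    called = true
    return fn.apply(null, arguments)
  }
}
```

Wrapping the (statusCode, payload) callback in once() before handing it to a handler means a stray second call is silently dropped instead of crashing the server.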



An Introduction To Node.js

Node.js is nothing but a runtime for JavaScript to run outside of the browser - people often think it’s somehow different from pure JS, but it isn’t. NPM is the most commonly used package manager to download and install dependencies for projects. Node.js can get pretty low level - there are plenty of built-in modules (http, for example) that make it easy to create an HTTP server and write backend code.

Node.js was designed to make use of asynchronous I/O - many traditional programming languages utilize synchronous I/O. What happens with synchronous code is that when several tasks are being executed on the same thread, one task must complete before control returns to the thread and the next task can be executed. To get past this bottleneck, developers can make use of multiple threads and dispatch certain tasks to certain threads, allowing processes to run concurrently. Managing threads is, to understate it, a challenging task.

Node.js gets rid of the complexities of multithreading by allowing for asynchronous code (see the asynchronous section further down) - functions take callbacks (other functions passed as parameters) that can make use of the results of the original function. When the original function finishes executing, the callback function is called.

The thing about writing pure Node.js code is that, in using callbacks that are nested within callbacks, we quickly reach “callback hell” (think pyramid of doom in the context of multiple if/else statements). We can tame our indented code using things like promises or a module such as Async.js. 

Let’s talk about routes - enter Express. Express.js is a lightweight framework built on top of Node.js that leverages the asynchronous event-driven programming paradigms of Node.js and makes it easier to build web applications. As a quick example of how it makes life easier, sending a file in an HTTP response object in pure Node.js can take quite a few lines of code. After creating a server with the http module, we need to specify the response header data (content type, etc.), possibly create a data stream out of our file and then pipe it to our response object. In Express.js, we’d just use the “sendFile” function.

Streams and Buffers

What are streams and buffers? In general computer science, a buffer represents a temporary place to house data while it’s being moved from one place to another. Once data has been collected and stored in a buffer, it can be manipulated in some way (read from, written to, etc). In Node.js, a buffer is a structure that allows for manipulation or reading of binary data - like a fixed-length array, a buffer cannot be resized, though its contents can be changed. It allows for much lower level access to data (to the binary data that composes a string vs the encoded value of the string itself, for example). If you use buffers, you gain some performance since you can avoid, in our string example, string management functions.

A stream represents a sequence of objects (sometimes bytes) that are accessed in sequential order. They’re core to I/O processes (file access, networking, processes). Streams access data in chunks instead of all at once - they’re associated with event emitters so that developers can write callbacks for when certain things have happened involving stream data (encountering an error, receiving data, ending the reading of data). 

In contrast to buffers, streams can read data piecemeal. Buffers need to be processed all at once before any action can be taken to alter the data contained in the buffer. 

HTTP uses a request/response paradigm whereas TCP servers utilize bidirectional streams. We can create readable and writeable streams using the filesystem and then pipe that data into an HTTP response (which itself is a writable stream) or we can pipe an HTTP request (a readable stream) into a data stream. TCP sockets are bidirectional meaning there is an open connection that we can both read and write streams to. 

Asynchronous Code

With asynchronous code in Node.js, we don’t have to deal with multiple threads - that complexity is abstracted away from us within the context of the event loop. Instead, we take advantage of the asynchronous nature of Node.js to write our software. With synchronous code, if we want tasks to run in parallel, they must be executed on separate cores or threads. With asynchronous code, once a process begins, we can begin another one without waiting for the original process to complete. We use callback functions to perform operations with the return data after a process finishes. 

Let’s use reading a file as an example. With asynchronous code, once the file starts being read, we can go do some other task. When the file is finished being read (whenever that may be), our callback function that we wrote earlier handles the results. 

Imagine we were to write a program that did the following - [print message to console] -> [read contents of file asynchronously and print contents to console] -> [print end message to console]. In asynchronous code, we don’t know when the contents of the file will actually finish being read. It may very well turn out that our first print statement and our end print statement get logged, THEN the file contents are logged. If we did this synchronously, the first print statement would be logged, then the program would hang while the file was read and the contents were logged, and we would see our end print statement last.

For this reason, being able to serialize asynchronous tasks is an important part of writing code in Node.js. As a very simple illustration of this concept, say you're writing client side code and you need to get data from several APIs - you have the URLs in an array and you want to execute a GET request for each. If your code is asynchronous, you can't be sure of the order in which the requests will be executed. You can solve this by treating your URL array as a queue structure. Shift the first URL off of the array, execute a get request and add the data to a new array. If there are any URLs left in your array, recursively call the function again, this time shifting the next URL off of the array. When the URL array is empty, do something with the data array which will have each response in the correct order. 
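
A sketch of that queue approach - getData here is a stand-in for whatever callback-style async GET function you’d actually use:

```javascript
// fetch a list of URLs one at a time so responses land in request order -
// getData is a stand-in for any callback-style async GET function
function fetchInOrder (urls, getData, done, results) {
  results = results || []
  if (urls.length === 0) { return done(results) }
  var nextUrl = urls.shift() // treat the array as a queue
  getData(nextUrl, function (response) {
    results.push(response)
    fetchInOrder(urls, getData, done, results) // recurse until the queue is empty
  })
}
```

Because each request only starts after the previous one completes, the results array ends up in the same order as the original URL array, regardless of how long each individual request takes.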

This was just a quick primer on writing some I/O code in Node.js! I'll be including more code as I write a backend for a new mobile app and continue learning more about Node.js and Express.js. 

_____________________________________________________________________________

In using Swift, I've enjoyed messing around with pointers to visualize that Swift foundational structures are pass by value and classes are pass by reference (and to help show the idea of Copy-On-Write). In light of learning new language paradigms, it's important to note that JavaScript always passes by value.