Node.js performance hooks and measurement APIs

You’ve written and deployed your application and gathered users – congrats! But what’s next?

Next come improvements: getting rid of bottlenecks, increasing execution speed, and other enhancements.

In order to make these improvements, you first have to be aware of your app’s existing performance characteristics. Only when you’ve identified the slow parts and the bottlenecks of the logic can you effectively improve performance.

However, nobody likes a trial-and-error process of guessing which parts might be slower.

Luckily for you, Node.js provides various built-in performance hooks to measure execution speed, find out what parts of code are worth optimizing, and collect a granular view of your app’s code execution.

In this article, you’ll learn how to use Node.js performance hooks and measurement APIs to identify bottlenecks and enhance your application’s performance for faster response times and improved resource efficiency.

Before reading this piece, you should have some basic knowledge of Node and JavaScript as well as some experience building applications with both.

Overview of Node.js Performance API

We will start by taking a look at why and when we should use the Performance API provided by Node and the various options it provides.

Consider a case where you want to measure the execution time for a specific block of code. For this, you might have used the Date object like this:

let start = Date.now();
for (let i = 0; i < 10000; i++) { } // stand-in for some complex calculation
let end = Date.now();
console.log(end - start);

However, if you run the above code, you'll notice that it isn't precise enough.

For example, an empty loop like the one above will log either 0 or 1 as the difference, which doesn't give us enough granularity. The Date class only offers millisecond-level granularity, so if the code runs on the order of hundreds of nanoseconds, this approach cannot measure it correctly.

For that, we can use the Performance API instead to get a better measurement:

const { performance } = require('node:perf_hooks');
let start = performance.now();
for (let i = 0; i < 10000; i++) {}
let end = performance.now();
console.log(end - start);

With this, we get a more granular value, which on my system is in the range of 0.18 to 0.21 milliseconds, with a precision of up to 15-16 decimal places. This is a simple way we can use the Node Performance API for a better measurement of execution time.

The API also provides a method to precisely mark a point in time during the run of the program. We can use the performance.mark method to get a timestamp of an event with high precision, such as the start of loop iterations.

Let’s run the following code:

let start_mark = performance.mark("loop_start",
  {detail:"starting loop of 10000 iterations"}
);
for (let i = 0; i < 10000; i++) {}
let end_mark = performance.mark("loop_end",
  {detail:"ending loop of 10000 iterations"}
);
console.log( start_mark, end_mark );

When we do, we’ll get this output:

PerformanceMark {
  name: 'loop_start',
  entryType: 'mark',
  startTime: 27.891528000007384,
  duration: 0,
  detail: 'starting loop of 10000 iterations'
}
PerformanceMark {
  name: 'loop_end',
  entryType: 'mark',
  startTime: 28.996093000052497,
  duration: 0,
  detail: 'ending loop of 10000 iterations'
}

The mark function takes the name of the mark as the first parameter. The detail in the second parameter object allows for extra details regarding that mark, such as the number of iterations run, database query parameters, and so on.

The object returned by the mark function can then be used to export the timing data to something like Prometheus using the Prometheus exporter SDK. This allows us to query and visualize the timing info outside the application. Because a mark is an instantaneous point in time, the duration field in the returned object is always zero.

Instead of manually calling performance.now and calculating the difference between two events, we can do the same using marks and the measure function. We can use the names given to the marks above to measure the duration between two marks:

performance.mark("loop_start",
  {detail:"starting loop of 10000 iterations"}
);
for (let i = 0; i < 10000; i++) {}
performance.mark("loop_end",
  {detail:"ending loop of 10000 iterations"}
);
console.log(performance.measure("loop_time","loop_start","loop_end"));

The first argument to measure is the name we want to give to the measurement. Then the next two arguments specify the names of the marks to start and end the measurement on, respectively.

Both of these arguments are optional — if neither is given, performance.measure will return the time elapsed between the application start and the measure call. If only the start mark's name is provided, the function will return the time elapsed between the performance.mark call with that name and the measure call.

If both are provided, the function will return a high-precision time difference between them. For the above example, we will get an output like this:

PerformanceMeasure {
  name: 'loop_time',
  entryType: 'measure',
  startTime: 27.991639000130817,
  duration: 1.019368999870494
} 

This can again be used with Prometheus exporter in order to export custom measurement metrics. If you have a setup which does blue-green or canary deployments, you can compare the performance of the old and new versions to see if your optimization works as expected or not.

Finally, one thing to note is that the Performance API internally uses a fixed-size buffer to store the marks and measures, so we need to clear them once we are done using them. This can be done using this:

performance.clearMarks("mark_name");

Or this:

performance.clearMeasures("measure_name");

These functions will remove the mark/measure with the given name from the respective buffer. If you call these functions without providing any argument, they will clear all the marks/measures that are present in the buffers, so be careful when calling these functions without any argument.

Using the Performance hooks to optimize your app

Let us now see how we can use this API to optimize our application. For our example, we will consider a case where we are fetching some data from the database, then manually sorting it and returning it to the user.

We want to see how much time each operation takes, and what would be the best place to optimize first. For this, we will first measure various events that take place:

async function main(){
    const querySize = 10; // ideally this will come from the user's request
    performance.mark("db_query_start",{detail:`query size ${querySize}`});
    const data = await fetchData(querySize);
    performance.mark("db_query_end",{detail:`query size ${querySize}`});
    performance.mark("sort_start",{detail:`sort size ${querySize}`});
    const sorted = sortData(data);
    performance.mark("sort_end",{detail:`sort size ${querySize}`});
    console.log(performance.measure("db_time","db_query_start","db_query_end"));
    console.log(performance.measure("sort_time","sort_start","sort_end"));
    // clear the marks and measures once done
}

We start by declaring the query size, which in a real app would probably come from the user's request.

Then we use the performance.mark function to mark the starts and ends of database fetch and sorting operations. Finally we output the duration between these events using the performance.measure function. We get an output like this:

PerformanceMeasure {
  name: 'db_time',
  entryType: 'measure',
  startTime: 27.811830999795347,
  duration: 1.482880000025034
}
PerformanceMeasure {
  name: 'sort_time',
  entryType: 'measure',
  startTime: 29.31366699980572,
  duration: 0.09800400026142597
}

To see how both of these operations perform with increasing query size, we will change the query size value and note the measurements. On my system, I get the following:

| Query size | Database fetch time (ms) | Sort time (ms) |
| --- | --- | --- |
| 10 | 1.48 | 0.098 |
| 100 | 1.65 | 1.235 |
| 1000 | 2.11 | 7.214 |
| 10000 | 3.82 | 1036.7 |

As we can see here, the sorting time is growing rapidly as the query size increases, and optimizing it first can be more beneficial. By using some different sorting algorithms, we get the below:

| Query size | Database fetch time (ms) | Sort time (ms) |
| --- | --- | --- |
| 10 | 1.5 | 0.28 |
| 100 | 1.97 | 0.4 |
| 1000 | 2.35 | 5.78 |
| 10000 | 3.5 | 17.53 |

While the sorting time is slightly worse for very small query sizes, the time grows slowly compared to the original measurements. Hence, changing the sorting algorithm here would be beneficial if we expect to frequently deal with large query sizes.

Similarly, we can measure the difference in database fetch times before and after creating an index on the queried fields. Then we can decide if the index creation is useful or which fields provide more benefits when used for indices.

Using background workers to offload tasks

When creating UI-based apps, we need the UI to remain responsive even while some heavy processing task is in progress. If the UI freezes while processing large data, it makes for a bad user experience. On websites, offloading this work can be done using web workers.

For apps running directly using Node, we can use Node’s worker_threads module to offload the computationally intensive tasks to background threads.

Note that this is useful only when the task is CPU-intensive, such as sorting or parsing data. If the task depends on I/O such as reading a file or fetching a network resource, using Node’s async-await is more efficient than using workers.

We can create and use workers as follows:

const { Worker, isMainThread, parentPort, workerData } =
    require("node:worker_threads");
async function main() {
  const data = await fetchData(10);
  let sorted = await new Promise((resolve, reject) => {
    const worker = new Worker(__filename, {
      workerData: data,
    });
    worker.on("message", resolve);
    worker.on("error", reject);
    worker.on("exit", (code) => {
      if (code !== 0)
        reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}
function worker() {
  const data = workerData;
  sortData(data);
  parentPort.postMessage(data);
}
if (isMainThread) {
  // we are in the main thread of our application
  main().then(() => {
    console.log("done");
  });
} else {
  // we are in the background thread spawned by the main thread
  worker();
}

We start by importing the required functions and variable declarations from the worker_threads module. We then define two functions — main which will run in the main thread and worker which will run in the worker thread.

We then check if the script is being executed as the main thread or as a worker thread and call the main/worker functions accordingly. To keep this example simple, we define all these in a single file, but we can also separate out the main and worker functions in their own files.

In the main function, we fetch the data as before. We then create a promise, and in it we create a new worker. The Worker constructor requires a file path, which will be run as the worker thread.

Here we give it the same file using the __filename variable provided by Node. In the second parameter, we pass the data to be sorted as workerData. This workerData will be provided to the worker thread by the Node runtime.

Finally, we listen to the events from the worker — on receiving a message, we resolve the promise, and in the case of errors or non-zero exit code, we reject the promise.

In the worker thread, we get the data passed from the main thread in the variable workerData which is imported from the worker_threads module. Here we sort it and post a message to the main thread containing the sorted data.

In the main thread, instead of immediately awaiting the promise, we can keep it in a queue or check on it periodically. This way we can keep our main thread responsive when the worker thread is doing the sorting calculations. We can also send intermediate messages from the worker thread indicating the sorting progress.

Common tips to optimize your Node app

While each app will have its own way to optimize for performance, there are some common starting points for Node.js apps.

Observe before optimizing

You must instrument and measure the performance of your app before you start optimizing it so you can know exactly which functions or API/DB calls need to be optimized.

Blind, trial-and-error optimization can even worsen performance, which is why using the Performance hooks and APIs provided by Node to measure first is a good starting point.

Have an easy way to repeat measurements

To decide if your optimization works or not, you should have a handy way to measure the performance before and after.

This can be done by having two builds — one with and one without the changes, having a script that runs tests and measurements, and something that can give you a comparison. Having clear before-and-after performance values for changes can help you decide if the changes are worth it.

Try indexing the database and caching request/responses

If your application uses a database and queries it frequently, you should consider creating an index on the queried parameters for improving the retrieval performance.

This will come at a cost of potentially increased storage size and possibly higher insert/update query times, so you should carefully measure the before/after in your use cases and decide if the trade-offs are good or not.

Another way to improve performance is by using some caching scheme in order to quickly respond to database or API queries. This can be used effectively if you can cache the API responses with query parameters and then use this cache to respond to later requests.

Note that caching is a double-edged sword. You need to carefully evaluate how long to keep a cache entry, on what basis to evict the entries, and when to invalidate the cache. Incorrectly doing this can not only worsen your performance but also risk sending incorrect data or leaked data across users.
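As a minimal illustration of the time-based part of that trade-off, here is a tiny in-memory cache with a per-entry TTL. This is a sketch only; a production cache would also need a size bound, an eviction policy, and explicit invalidation hooks:

```javascript
// Minimal in-memory cache with per-entry expiry (TTL in milliseconds).
// A sketch only: no size bound, no eviction policy, no invalidation hooks.
class TtlCache {
  constructor() {
    this.entries = new Map();
  }
  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlMs });
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // lazy eviction on read
      return undefined;
    }
    return entry.value;
  }
}

const cache = new TtlCache();
cache.set('user:42', { name: 'Ada' }, 1000);
console.log(cache.get('user:42')); // { name: 'Ada' }
```

Entries here are only evicted lazily when read after expiry; how aggressively to evict, and when to invalidate proactively, is exactly the judgment call described above.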

Reduce your dependencies

If you have ever looked inside node_modules or checked how much disk space it takes up, you know how heavy dependencies can be in a Node project.

You need to be careful while adding new dependencies, as they can add a lot more transitive dependencies, and parsing all these can impact the startup performance of your app. You can try mitigating this through the following:

  • Removing unused packages — sometimes there are several packages in package.json that are no longer used in the app and can be removed. This can be useful to shrink the number of dependencies and the build size of your package
  • Using a bundler to tree-shake and remove unused modules from the final build — when bundling and packaging your applications, you can use functionality provided by bundlers to remove the unused modules from your dependencies. You keep only the parts of dependencies that are used by your code, and do not include the remaining parts in your final build
  • Vendoring the specific code needed from the dependency — instead of adding the whole package as a dependency when you only need a small part of it, you can vendor that specific part of the code. Make sure to check and respect the license of original code while doing so
  • Lazily loading dependencies — you can lazily load the dependencies to improve the starting performance and reduce memory usage in cases where that dependency is not needed

Conclusion

The Performance API provided by Node can help in determining not only which parts are slower but also how much time they take. You can further explore this data by exporting it as traces or metrics to something like Jaeger or Prometheus.

Remember — having a ton of data only makes it harder to explore, so a good strategy would be to first only measure timings of coarse events such as function calls or even end-to-end processing of requests, and then add more and more fine grained measurements to functions which are taking the most time.

Source: blog.logrocket.com
