Understanding memory leaks in Node.js, part 1
Intro
The following article is a transcription of my talk "Memory leaks in NodeJs", which I presented at our local JSMD Community.
When I speak about memory leaks, I have the following definition in mind:
a programmer incorrectly manages memory allocations in such a way that memory that is no longer needed is not released.
There are many ways to leak memory, and no developer is immune to them. Memory leaks are frequent in:
- the V8 garbage collector
- Node core
- npm libraries
- your code
In this article the focus will be on your code and how you can leak memory in Node.js.
Internals
In Node.js the state of your memory can be tracked with `process.memoryUsage()`,
which returns an object like the following:

```js
{
  rss: 30781440,
  heapTotal: 6537216,
  heapUsed: 3746720,
  external: 8272
}
```
key | description
---|---
rss | resident set size: the total amount of RAM allocated to the process
heapTotal | total space allocated for JavaScript objects
heapUsed | total space occupied by JavaScript objects
external | memory used by C++ objects bound to JavaScript, such as Buffers
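
To watch these numbers over time, you can log them periodically. Here is a minimal sketch (the 5-second interval and the MB formatting are arbitrary choices, not from the talk):

```js
// log memory statistics every 5 seconds
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

setInterval(() => {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  console.log(`rss: ${toMB(rss)} MB, heap: ${toMB(heapUsed)}/${toMB(heapTotal)} MB`);
}, 5000);
```

A `heapUsed` value that keeps growing between garbage collection cycles is usually the first sign of a leak.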
Streams
Streams are one of the most powerful features of Node.js. Many devs are afraid of them, but once you befriend them, your understanding of Node skyrockets.
```js
const fs = require('fs');

const destination = getWritableStream(); // hypothetical helper: obtain the writable somehow
const source = fs.createReadStream('./big.file');
source.pipe(destination);
```
Even this simple code may leak memory. The problem lies in the `pipe` method, which does not automatically destroy the source or destination stream when an error occurs or when one of the streams closes.
More production-grade code needs to have the following structure:
```js
const fs = require('fs');

const destination = getWritableStream(); // hypothetical helper: obtain the writable somehow
const source = fs.createReadStream('./big.file');
source.pipe(destination);

// whenever one side closes, errors, or ends, destroy the other side
destination.on('close', () => {
  source.destroy();
});
destination.on('error', () => {
  source.destroy();
});
source.on('error', () => {
  destination.destroy();
});
source.on('end', () => {
  destination.destroy();
});
```
For every failure case, we need to add boilerplate to catch the event and destroy the streams. Imagine we had three or more streams to pipe: we would need a combination of handlers to destroy every other stream, and the amount of boilerplate would very quickly become unmaintainable.
It is not advisable to use `pipe` in production. Use `pump` or `pipeline` instead. `pipeline` was added in Node v10.0.0 and is accessible from the `stream` module. `pump` is an npm package compatible with earlier Node versions; both have the same interface.
```js
const { pipeline } = require('stream');

pipeline(source, destination, (err) => {
  if (err) {
    // pipeline has already destroyed both streams; just report the error
    handleError(err); // hypothetical error handler
  }
});

// pump has the same interface:
// pump(readable, transform1, transform2, writable, onEndOrErrorCallback);
```
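
To see how this scales, here is a sketch piping three streams at once; the gzip transform and the file names are only illustrative:

```js
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream');

// if any stream errors or closes early, pipeline destroys all of them
pipeline(
  fs.createReadStream('./big.file'),
  zlib.createGzip(),
  fs.createWriteStream('./big.file.gz'),
  (err) => {
    if (err) console.error('Pipeline failed:', err);
  }
);
```

No matter how many streams are involved, the cleanup logic stays in a single callback.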
Promises
Promises came as an abstraction to solve the problems of callbacks, but they are not entirely safe against memory leaks either.
The most common example is a promise that never settles.
```js
// unSolvablePromise() is assumed to return a promise that never settles,
// e.g. () => new Promise(() => {})
async function handleServer(req, res) {
  await unSolvablePromise();
  res.end('Done'); // never reached
}
```
Since the promise never settles, the user will never receive the message `Done`. This code leaks file descriptors, as well as everything retained by the pending `unSolvablePromise` handler.
For this particular case you need to make sure your promise settles; an easy fix can be the following:
```js
async function handleServer(request, response) {
  await Promise.race([unSolvablePromise(), timeout(500)]);
  response.end('Done');
}

function timeout(timer) {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve();
    }, timer);
  });
}
```
We are using `Promise.race`, which settles as soon as the first of its promises settles; in our case the timeout resolves first, so the user receives a proper response.
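
Note that `Promise.race` does not cancel the losing promise: whatever `unSolvablePromise` retains stays alive until it settles, so the race bounds the user-facing latency rather than freeing the memory itself. One possible variant, sketched here as an assumption rather than a recommendation: make `timeout` reject instead of resolve, so the handler can tell a timeout apart from a real result (the 504 status code is just one reasonable choice):

```js
// variant: reject on timeout so the caller can distinguish the outcome
function timeout(timer) {
  return new Promise((resolve, reject) => {
    setTimeout(() => reject(new Error('Request timed out')), timer);
  });
}

async function handleServer(request, response) {
  try {
    await Promise.race([unSolvablePromise(), timeout(500)]);
    response.end('Done');
  } catch (err) {
    response.statusCode = 504; // Gateway Timeout
    response.end(err.message);
  }
}
```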
End of part one. Part two contains examples of leaking memory with event emitters and cached objects, plus the tools I use for identifying and handling memory leaks in Node.js.