This may come as a surprise, but I can be very lazy. If I can get good results without much effort, then π₯³! So the idea of adding debug logging to a whole codebase sounds absolutely exhausting to me, and I want to find a better (lazier) way...
The problem:
I have a tool that I've been working on for a while, and debugging it can be kind of a pain - especially when it's running inside VS Code. It'd be nice to have an easy way to get information about what is going on when the tool runs, without having to manually write a bunch of debug logging code and release a new version. That means that the usual approaches are not going to work:
- no straight up
console.log()
statements - no
debug(function foo () {})
wrappers - no fancy
@debug
decorators
I want something I can set and forget, and probably turn on and off via an environment variable. Is this even possible?
Attempt 1 - override Function.prototype.constructor
:
I know that in the past I've done something where I wrapped every function call in some logging, so I want to try to get that working again. I'm pretty sure I did it by overwriting Function.prototype.constructor
, which is a pretty insane hack π
JavaScript is a very flexible language, so I can just do stuff like this:
Function.prototype.constructor = (() => {
const original = Function.prototype.constructor;
return function (...args) {
// do stuff...
return original.call(this, ...args);
};
})();
I can store a reference to the original constructor
function, and replace it with a new function that does some debug logging, and then calls the original function. Seems pretty smart! Job done!
Unfortunately, this doesn't actually work... π
function foo () { return 'foo'; } <- Function.prototype.constructor doesn't get called here
foo();
So perhaps I can overwrite Function.prototype.call
instead?
Function.prototype.call = (() => {
const original = Function.prototype.call;
return function (...args) {
// do stuff...
return original.apply(this, args);
};
})();
function foo () { return 'foo'; }
foo(); // <- Function.prototype.call doesn't get called here either
I know I've done this in the past, so I must just be missing something. Off to Stack Overflow I go!
I found a few relevant questions/answers, like this one, which does exactly what I just tried? I also found an answer for this exact problem that I wrote 7 years ago - thanks past-Craig! π¦
This is one alternative, not entirely sure how reliable it is though, it feels a bit wrong:
(function () {
var oldCall = Function.prototype.call;
var newCall = function(self) {
Function.prototype.call = oldCall;
console.log('Function called:', this.name);
var args = Array.prototype.slice.call(arguments, 1);
Function.prototype.call = newCall;
this.apply(self, args);
}
Function.prototype.call = newCall;
})();
β¦
It even handles a few edge cases, like handling the correct version of this
that is passed in, and swapping back to the original for the actual call to console.log
... But I tried it again and still no cigar. Maybe it used to work but doesn't now?
I used n
to go back to an older version of Node.js (0.12), and it still doesn't seem to work. Maybe my old answer was just wrong?
Now that I think about it, this approach is probably a bit too much anyway - I would get debug logs for every single function call, including internal Node.js functions! Probably a bit too verbose.
Attempt 2 - Wrap require
in a Proxy
?
JavaScript has evolved quite a bit since I answered that question in 2013 - and one of the things that has been added is meta-programming capabilities. There's a special JavaScript object called a Proxy which is designed for this kind of function wrapping. But I still don't want to have to wrap all functions in a call to new Proxy
...
One of the cool things about Node.js is that a lot of it is written in JavaScript, which means I can mess with the core functionality, for example require
. I can do the same monkey-patching kind of thing from before.
global.require = (() => {
const original = global.require;
return function (...args) {
const result = original.apply(this, args);
// do stuff...
return result;
};
})();
So now I can examine the result of every call to require
and mess with it!
I want to look at each of the object on the result (the module.exports
object of the module) and find which ones are functions:
function getFunctions (exports) {
const functions = {};
Object.keys(exports)
.filter((functionName) => typeof exports[functionName] === 'function')
.forEach((functionName) => {
functions[functionName] = map[functionName];
});
return functions;
}
So I can change the // do stuff
to something like:
global.require = (() => {
const original = global.require;
return function (...args) {
const exports = original.apply(this, args);
const functions = getFunctions(exports);
// wrap functions:
return exports;
};
})();
And the wrap functions
code will look something like this:
Object.keys(functions).forEach((functionName) => {
Object.defineProperty(exports, functionName, {
value: new Proxy(exports[functionName], {
// ... Proxy options:
})
});
});
This is looking pretty promising I think! But right now it is still going to wrap every single exported function from every module, and that's quite a lot of functions. It'd be great to exclude all dependencies, and only instrument the code from my library. So how does require work?
require('fs'); // Built in module
require('express'); // Third-party code in the node_modules folder
require('./some/thing'); // Local code for my library
I only want to intercept the last case. I can just intercept any call to require where the argument starts with a .
, that should get ./
and ../
calls.
global.require = (() => {
const original = global.require;
return function (id) {
const exports = original.apply(this, [id]);
if (shouldWrap(id) {
const functions = getFunctions(exports);
// wrap functions:
}
return exports;
};
})();
function shouldWrap (id) {
return id.startsWith('.');
}
Ah, so slight problem. Third-party code in node_modules
will refer to other local files in the same module, which also start with .
, so I'm still getting too many functions. I need some better way of filtering the require ids!
I need to dig into require
a bit more. It turns out that the global require
function actually delegates to the Node.js Module
class, which has a require
function on its prototype. That function internally calls a _resolveFilename
function which turns the require id
into a file path. We can call that directly, and check if the path contains node_modules
. We can also use the list of builtinModules
to filter out things like path
and fs
:
immport { buildinModules, Module } from 'module';
Module.prototype.require = (() => {
const original = Module.prototype.require;
return function (id) {
const requirePath = Module._resolveFilename(id, this);
const module = original.apply(this, [id]);
if (module && shouldWrap(requirePath)) {
// wrap module:
}
return module;
};
})();
function shouldWrap (requirePath) {
const isNodeModule = builtinModules.includes(requirePath) || requirePath.includes('node_modules');
return !isNodeModule;
}
One more little issue is that the require
function isn't just a function - it's also an object that has some other stuff on it like require.resolve
and require.cache
. Right now I am just clobbering everything, so I have to do that a bit better:
immport { buildinModules, Module } from 'module';
const original = Module.prototype.require;
const debugRequire = function (id) {
// ...
};
Module.prototype.require = Object.assign(debugRequire, original);
Okay, that's sorted, so how will the actual wrapping work? So far I have this:
Object.keys(functions).forEach((functionName) => {
Object.defineProperty(exports, functionName, {
value: new Proxy(exports[functionName], {
// ... Proxy options:
})
});
});
The way a Proxy
works is that you have a set of traps that describe different operations in JavaScript. I care about functions, so there are two traps that I care about, apply
and construct
.
Proxy apply
trap:
The apply
trap gives you control to what happens when a function is called - exactly what I want! The basic form of an apply
trap handler is like this:
function apply (target, thisArg, argumentsList) {
return target.apply(thisArg, argumentsList);
}
I can add a bit more stuff to make it more useful:
import { performance } from 'perf_hooks';
function apply (target, thisArg, args) {
const startTime = performance.now();
console.log(args);
const result = target.apply(thisArg, args);
const endTime = performance.now();
console.log(result);
console.log(endTime - startTime);
return result;
}
One other thing is that result
could be a promise, so I need to handle that case as well. Usually I'd prefer to not use the .then()
syntax, but using async
would make this function always return a Promise
, which I don't want!
import { types } from 'util';
function apply (target, thisArg, args) {
// ...
const result = target.apply(thisArg, args);
if (types.isPromise(result)) {
return result.then((result) => {
const endTime = performance.now();
console.log(result);
console.log(endTime - startTime);
return result;
});
}
// ...
}
Nice, that's pretty rough but it mostly works!
Proxy construct
trap:
The construct
trap gives you control to what happens when a function is called with new
! The basic form of a construct
trap handler is like this:
function construct (target, args) {
return new target(...args);
}
I'm going to do basically the same thing as the apply
trap, but I don't need to worry about Promises
, since constructors
are synchronous.
import { performance } from 'perf_hooks';
function construct (target, args) {
const startTime = performance.now();
console.log(args);
const instance = new target(...args);
const endTime = performance.now();
console.log(endTime - startTime);
console.log(instance);
return instance;
}
It'd be great if I could also wrap any methods of the instance too! I can access them via target.prototype
and use the getFunctions
function again:
function construct (target, args) {
// ...
const proto = target.prototype;
const prototypeFunctions = getFunctions(proto);
Object.keys(prototypeFunctions).forEach((functionName) => {
Object.defineProperty(proto, functionName, {
value: new Proxy(proto[functionName], {
apply: applyTrapHandler
})
});
});
// ...
return instance;
}
Okay so now I have Proxies that wrap Proxies π€―, but it should still work!
Adding function names:
One thing that would be good to add is for the traps to log the name
of the functions. I have access to the name everywhere I need it, but I need the trap to know about it:
function createFunctionCallWrap (name) {
return function apply (target, thisArg, args) {
// ...
console.log(name);
const result = target.apply(thisArg, args);
// ...
return result;
};
}
function createConstructorCallWrap (name) {
return function apply (target, thisArg, args) {
// ...
console.log(name);
const instance = new target(...args);
// ...
return instance;
};
}
Nice! Now I get logs for all function calls, all constructor calls, and all instance method calls!
Handling function arguments:
It would also be good to have logging for any function which are passed as arguments (like callback functions). Both traps give me access to the function arguments, so I just need to add another wrapper:
function wrapArgs(args) {
return args.map((arg) => {
if (typeof arg !== 'function') {
return arg;
}
return new Proxy(arg, {
apply: ...
});
});
}
Okay but now I have a little problem. In all the other cases I have access to the name of the function or constructor that is being wrapped, whether it is from the module.exports
object, or the target.prototype
object. I don't have the names of the arguments to the function, so I can't know the name of any argument that happens to be a function!
I do have access to the function though, which means I can call toString()
and get the full text of the function.
function foo (callback) { callback(); };
foo.toString() // "function foo (callback) { callback(); };"
I could do some RegExp magic and read the argument name?
const [argsString] = foo.toString().match(/function .* \((.*?)\)/);
const argNames = argsString.split(', ');
This is going to fall apart very quickly - I'd have to handle async functions, generators, arrow functions, optional arguments, etc. etc.
But I could use a proper parser and it could handle all that for me!
import { parseScript } from 'esprima';
function getFunctionArgumentNames(func) {
const AST = parseScript(`const f = ${func.toString()}`)
// find argument names in AST
}
I can use esprima
to get an Abstract Syntax Tree for the function. If func
is an anonymous function (like function () {}
) the parser will throw, so I need the const f =
bit to handle that case.
Once I've got the AST, I can query it for the bits I care about:
import { parseScript } from 'esprima';
import { query } from 'esquery';
function getFunctionArgumentNames(func) {
const [func] = query(parseScript(`const f = ${func.toString()}`), '[type=/Function/]');
return func.params.map((param) => {
const [identifier] = query(param, 'Identifier');
return identifier.name;
});
}
I find the first (and only) function in the AST (which is the representation of the target function), and get its list of params
. This could be normal parameters like a, b, c
, or complex parameters like d = 6, e = getE()
. In either case I want to grab the first Identifier
, and return its name. That would be 'a'
, 'b'
, 'c'
, 'd'
, and 'e'
in my above examples.
Now I can finish off wrapArgs
function wrapArgs(argNames, args) {
return args.map((arg, index) => {
if (typeof arg !== 'function') {
return arg;
}
return new Proxy(arg, {
apply: createFunctionCallWrap(argNames[index])
});
});
}
And change createFunctionCallWrap
and createConstructorCallWrap
:
function createFunctionCallWrap (name) {
return function apply (target, thisArg, args) {
// ...
const result = target.apply(thisArg, wrapArgs(args));
// ...
return result;
};
}
function createConstructorCallWrap (name) {
return function construct (target, args) {
// ...
const instance = new target(...wrapArgs(args));
// ...
return instance;
};
}
Nicer logs:
Okay, so now I have wrapped all functions, constructors, prototype methods, and any of their arguments that happen to be functions. And each time any of those are called I have access to their name, arguments, return value, and start and end times. It would be awesome to render these as a list with each entry and exit on their own line so I can scan it and see how the program ran.
It would also be cool to show somehow how deep in the call stack the program is. I can track that by incrementing a count before executing the real function, and decrementing it once it has run. I'm going to wrap all this logic up into logStart
and logEnd
functions:
let depth = 0;
function logStart (name, args) {
depth += 1;
let debugString = printDepth(name);
debugString += ` args: ${JSON.stringify(args)}`;
process.stdout.write(`${debugString}\n`);
}
function logEnd (name, startTime, endTime, result) {
let debugString = printDepth(name);
debugString += ` time: ${endTime - startTime}ms`;
debugString += ` return: ${JSON.stringify(result)}`;
process.stdout.write(`${debugString}\n`);
depth -= 1;
}
function printDepth (name) {
return `${'βΈ'.repeat(depth)} ${name}`;
}
One issue with this is that if the args
or result
have any circular references in them, then JSON.stringify
will throw. Node.js has already solved that for me though, so I can replace JSON.stringify
with util.inspect
.
I'm going to wrap that in a little helper so I can have all the options in one place:
import { inspect } from 'util';
function printObject(object) {
return inspect(object, { getters: true, depth: Infinity }).replace(/\n/g, '');
}
This will print all the details about an object that I need, and then strip any new lines so that each log only uses one line.
So with some dummy code this is the output I get:
βΈ init args: []
βΈβΈ Fixture args: [ 1, [ 'hello' ], [Function] ]
βΈβΈ Fixture return: Fixture { _a: 1, _b: [ 'hello' ], _c: [Function] }
βΈ init return: Fixture { _a: 1, _b: [ 'hello' ], _c: [Function] }
βΈ Fixture.func args: []
βΈβΈ Fixture._func args: [ 1 ]
βΈβΈβΈ _c args: [ 1 ]
βΈβΈβΈ _c return: 1
βΈβΈ Fixture._func return: [ 'hello' ]
βΈ Fixture.func return: undefined
Adding this into my library:
So this is all going great and I want to try to add it to library. Since I want to turn it on and off with a environment variable, I have wrapped it all in a function like this:
function debug () {
if (process.env.DEBUG) {
// Overwrite require so it wraps everything.
}
};
I've added the debug()
call inside library and now when I run it nothing happens... The code looks something like this:
import { blah } from './blah';
import { whatever } from './whatever';
export function library () {
debug();
// run library
}
And when I put a debugger;
inside the overwritten require
call, it is never hit... π
So there are two things going on here.
1) All the import ...
statements are converted into const { blah } = require('./blah');
calls.
2) The whole dependency graph is traversed long before my call to debug()
happens.
All the calls to require
happen before I've even turned on my debug mode...
To check that I'm right, I move the debug()
call to the first line of my index file, so now it looks more like:
debug();
import { blah } from './blah';
import { whatever } from './whatever';
export function betterer () {
// run Betterer
}
And cool, now it works! π
This is probably fine, but it means that the tool is much less configurable, and I can't change settings in between different Betterer runs. Maybe that's good enough?
Wrapping require.cache
:
I think I can do better than that! As a Node.js application executes, and the dependency graph is traversed, a special cache on the require
function is populated. I can access this cache using require.cache
, and it contains the complete exports of any given dependency.
When I call debug()
I can iterate over that cache and wrap all the functions that have already been required:
Object.keys(require.cache).forEach((requirePath) => {
const module = require.cache[requirePath];
if (module && shouldWrap(requirePath)) {
wrap(module.exports);
}
});
Boom, just like that. Now I can have my debug()
call wherever I want and all past and future requires will be wrapped correctly!
Wrapping up π
So that's pretty neat and is working quite well! I'm quite happy with how it ended up, cause it looked pretty unlikely here and there! I've taken my prototype from here, rewritten it in TypeScript, added a bunch of options, and released it on npm at @phenomnomnominal/debug
. Maybe that will be of some use to someone else?!
Hit me up with questions/comments/suggestions π
Top comments (0)