Building a user scriptable decision engine in Node.js - {recursion}

As part of the revamp of the DCO engine, we have been adding support for quite a few highly customizable scenarios for dynamic creatives, including storyboarding, geo-location and weather. If you haven’t seen them yet, you should give them a look: the story-board demo, e-commerce demo and geo-location demo.

Considering we allow highly customized dynamic creatives, one of the challenges we faced was the trade-off between keeping our algorithms generic and keeping them flexible enough to accommodate individual advertiser use cases.

Let’s take an example: Advertiser ‘A1’ wants to show different creatives based on the current weather - sunny, cloudy, snow or rain. Another advertiser, ‘A2’, wants to show different creatives based on the current temperature - cold, pleasant or hot. Here is how these algorithms would look in pseudo-code.

//Advertiser 'A1'

IF humidity < 30 and temperature > 25 THEN
  return 'sunny'
ELSE IF precipitation > 95 and temperature < 5 THEN
  return 'snow'
ELSE IF precipitation > 95 and temperature > 5 THEN
  return 'rain'
ELSE IF humidity > 30 and precipitation < 95 and temperature > 25 THEN
  return 'cloudy'
END

//Advertiser 'A2'

IF temperature < 15 THEN
  return 'cold'
ELSE IF 15 <= temperature < 30 THEN
  return 'pleasant'
ELSE
  return 'hot'
END

Even in these two really simple use cases, the decision making has nothing in common, and in most systems they would end up as two separate algorithms. Imagine the number of custom algorithms that would need to be supported. Separate algorithms would also mean additional development and deploy cycles for each new use case. This is exactly why we started thinking of building a user scriptable decision engine, which would let us support new use cases with ease and flexibility.

Overall we evaluated 3 different approaches to achieve this:

1. Custom parser for our decision engine

This is an interesting approach in which we build our own grammar and a parser. In essence, the parser checks the input against the language constructs specified in the grammar and builds a parse tree representation of the program. In JavaScript, we could also reuse existing parser generators like Jison or PEG.js, each with its own limitations. By limiting the syntax and use cases, it is feasible to build a custom parser for our use case. Interestingly, as I write this, Rohith has also built a decision tree parser of his own as a proof of concept.

// decision engine
while b ≠ 0
  if a > b
    a := a − b
  else
    b := b − a
return a

Source: Wikipedia

Pros:

Considering the decision engine is expressed in an abstract language and not programming language code we can sanitize it and ensure that it’s safe even if it’s coming from an untrusted source.
If we were to build a GUI to allow non-programmers to write the decision logic - this approach would probably be the only choice which would give us control.

Cons:

Building a grammar from scratch means it will have its own limitations with regard to flexibility. Things that we take for granted in a programming language, like loops, iterators, data types and scope, would now have to be designed, built and supported by us.
Since this would be a new abstract language, anyone writing decision logic would have to learn it from scratch.
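To make this trade-off concrete, here is a minimal sketch of what a heavily restricted rule format might look like if the decision logic were stored as data rather than code. It is not our engine: the rule shape (`when`, `then`), the `OPERATORS` table and the `evaluate` helper are all hypothetical names made up for illustration. It expresses Advertiser ‘A2’ from the pseudo-code above.

```javascript
// Hypothetical whitelist of comparison operators the rule language supports.
const OPERATORS = new Map([
  ['<',  (a, b) => a < b],
  ['<=', (a, b) => a <= b],
  ['>',  (a, b) => a > b],
  ['>=', (a, b) => a >= b],
]);

// Advertiser 'A2' expressed as data; the first matching rule wins.
const a2Rules = [
  { when: [['temperature', '<', 15]], then: 'cold' },
  { when: [['temperature', '>=', 15], ['temperature', '<', 30]], then: 'pleasant' },
  { when: [], then: 'hot' }, // no conditions: acts as the ELSE branch
];

// Walk the rules in order and return the outcome of the first one
// whose conditions all hold for the given facts.
function evaluate(rules, facts) {
  for (const rule of rules) {
    const matches = rule.when.every(([field, op, value]) => {
      const compare = OPERATORS.get(op);
      if (!compare) throw new Error(`Unsupported operator: ${op}`);
      return compare(facts[field], value);
    });
    if (matches) return rule.then;
  }
  return undefined; // no rule matched
}

console.log(evaluate(a2Rules, { temperature: 19 })); // pleasant
```

Anything outside the operator table is rejected, which is what makes this style easy to sanitize. The flip side is the con listed above: every extra capability (boolean OR, arithmetic, loops) has to be added to the format and the evaluator by hand.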

2. Using `eval()`

Wouldn’t it be interesting to read a string from a database field at runtime, parse it, interpret it and then invoke it as a first-class citizen in your code base whenever needed? This would mean that any time you want to change the logic, all you need to do is update the code in your database and your algorithm is updated. Store it as a setting at the campaign level and every new campaign can support a custom algorithm as and when needed. It sounds like fantasy in a statically typed language, but in JavaScript and Node.js it is readily available.

// global scope: the evaluated function reads this variable directly
var current_temp;
// decision engine algo to determine weather
var code = `function weather_engine() {
	if (current_temp < 15) {
		return 'cold';
	} else if (current_temp >= 15 && current_temp < 30) {
		return 'pleasant';
	} else {
		return 'hot';
	}
}`;

// defines weather_engine in the current scope
eval(code);

current_temp = 19;
console.log(weather_engine());
// pleasant

current_temp = 31;
console.log(weather_engine());
// hot

current_temp = 6;
console.log(weather_engine());
// cold

Pros:

The decision logic will be expressed in Javascript - which is a familiar language to developers.
All the language constructs available in JavaScript are readily available. For the current use case, this gives us both flexibility and extensibility.

Cons:

eval is considered evil in the JS community, and rightly so. A direct eval call executes in the scope of the calling code, so using eval for untrusted code is a complete no-no. In our case, even a script from a well-meaning user could interfere with the code of the main app.
There is no way to expose only a selected scope or namespace to eval.
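The scoping problem is easy to reproduce. The snippet below is a small sketch with made-up names (`appConfig`, `runUserScript`) showing how a script passed to a direct eval call can both read and overwrite state that the application never meant to expose.

```javascript
// Hypothetical application state that user scripts should never touch.
const appConfig = { apiKey: 'secret-key', maxBids: 10 };

// Naive runner: a direct eval call sees every variable visible at the call site,
// including variables from the enclosing scopes.
function runUserScript(code) {
  return eval(code);
}

// A "decision script" that reads and then overwrites application state.
const leaked = runUserScript('appConfig.apiKey');
runUserScript('appConfig.maxBids = 0');

console.log(leaked);            // secret-key
console.log(appConfig.maxBids); // 0
```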

3. Sandboxing `eval`

Knowing that eval has its own quirks and issues, I started exploring how we could sandbox the code that is evaluated at runtime so that we get all the flexibility of eval while protecting ourselves from its side-effects. That’s when I stumbled upon the Node.js vm module, which is part of the standard library.

The vm module lets you run arbitrary JavaScript code much like eval, but without most of its quirks. Internally, the code is compiled and run within separate V8 contexts, with support for pre-compilation and sandboxed global objects. By default, scripts do not have access to require and a few other Node.js globals. More importantly, you can explicitly expose selected values from the calling scope when needed. One caveat: the Node.js documentation states that vm is not a security mechanism, so it should not be the only safeguard against untrusted code.

var vm = require('vm');
// global scope
var current_temp;


// decision engine algo to determine weather
var code = `function weather_engine() {
	if (temp < 15) {
		return 'cold';
	} else if (temp >= 15 && temp < 30) {
		return 'pleasant';
	} else {
		return 'hot';
	}
}
console.log(weather_engine());
`;

current_temp = 19;
vm.runInNewContext(code, {temp: current_temp, console: console});
// pleasant

current_temp = 31;
vm.runInNewContext(code, {temp: current_temp, console: console});
// hot

current_temp = 6;
vm.runInNewContext(code, {temp: current_temp, console: console});
// cold
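The pre-compilation mentioned earlier can be sketched with `vm.Script`, which compiles the source once so that it can be run against a fresh context for every request. This is only an illustration: the `decide` helper is a hypothetical name, and the script returns its result as the completion value instead of logging it.

```javascript
const vm = require('vm');

// Compile once; the value of the last expression becomes the script's result.
const script = new vm.Script(`
  (function weather_engine() {
    if (temp < 15) return 'cold';
    if (temp < 30) return 'pleasant';
    return 'hot';
  })();
`);

// Run the compiled script against a fresh context that exposes only temp.
function decide(temp) {
  return script.runInNewContext({ temp });
}

console.log(decide(19)); // pleasant
console.log(decide(31)); // hot
console.log(decide(6));  // cold

// require is not defined inside a new context unless we pass it in.
console.log(vm.runInNewContext('typeof require')); // undefined
```

Returning the value rather than calling console.log inside the script also means console no longer has to be exposed to the user code.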

Pros:

Safer than using eval directly.
Like with eval all the language constructs which are available in Javascript are readily available.
Can selectively expose aspects from the global scope.

Cons:

Checking for type safety in user input is tricky. We would also need to handle erroneous code, such as scripts that fail to compile, throw at runtime or never terminate.
Not suited for non-programmers or for building a GUI driven decision engine.
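One possible way to deal with erroneous scripts is to wrap every run in a guard. The sketch below assumes a hypothetical `safeDecide` helper that falls back to a default value; the 50 ms limit is an arbitrary number chosen for illustration. It relies on the `timeout` option of `vm.runInNewContext`, which aborts synchronous code that runs for too long.

```javascript
const vm = require('vm');

// Hypothetical helper: run a user script and never let it crash the caller.
function safeDecide(code, sandbox, fallback) {
  try {
    const result = vm.runInNewContext(code, sandbox, { timeout: 50 });
    // Accept only plain strings as a decision; anything else is rejected.
    return typeof result === 'string' ? result : fallback;
  } catch (err) {
    // Syntax errors, runtime errors and timeouts all end up here.
    return fallback;
  }
}

console.log(safeDecide("temp < 15 ? 'cold' : 'hot'", { temp: 6 }, 'default')); // cold
console.log(safeDecide('temp <', { temp: 6 }, 'default'));                     // default
console.log(safeDecide('while (true) {}', { temp: 6 }, 'default'));            // default
console.log(safeDecide('42', { temp: 6 }, 'default'));                         // default
```

A real engine would probably also want to log the error and validate the result against the set of creatives the campaign actually has, rather than accepting any string.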

Closing Thoughts

We still have to take a call on this and are open to your thoughts. Please feel free to reach out to me or Rohith. Finally, it’s interesting to see the kind of use cases that we are coming across while building the DCO engine. On one side we are tackling scale and on the other some ingenious scripting.