I recently ran into an issue while trying to represent a nested, discriminator-based schema using Mongoose in a Node.js client project. The goal was to represent a logical formula by creating a hierarchy of “reducers” (&&, ||, etc…) that would reduce a series of nested “checks” down into a single value.

Let’s make that a little more relatable with an example. Imagine we’re trying to represent the following formula:


x == 100 || (x <= 10 && x >= 0)

If we wanted to store this in MongoDB, we’d have to represent that somehow as a JSON object. Let’s take a stab at that:


{
  type: "reducer",
  reducer: "||",
  checks: [
    {
      type: "check",
      field: "x",
      comparator: "==",
      value: 100
    },
    {
      type: "reducer",
      reducer: "&&",
      checks: [
        {
          type: "check",
          field: "x",
          comparator: "<=",
          value: 10
        },
        {
          type: "check",
          field: "x",
          comparator: ">=",
          value: 0
        }
      ]
    }
  ]
}

What a behemoth!

While the JSON representation is ridiculously more verbose than our mathematical representation, it gives us everything we need to recreate our formula, and lets us store that formula in our database. This is exactly what we want.
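To make "everything we need to recreate our formula" concrete, here is a minimal sketch of how the stored object could be reduced back down to a single value. The evaluateFormula helper and the comparators table are hypothetical names for illustration, not part of Mongoose or the original project, and only the three comparators used in the example are covered:

```javascript
// Hypothetical lookup table: only the comparators from the example formula.
const comparators = {
  "==": (a, b) => a === b,
  "<=": (a, b) => a <= b,
  ">=": (a, b) => a >= b
};

// Hypothetical helper: recursively reduces a formula node to a boolean,
// given a map of field names to values.
const evaluateFormula = (node, values) => {
  if (node.type === "check") {
    const compare = comparators[node.comparator];
    if (!compare) {
      throw new Error(`Unknown comparator: ${node.comparator}`);
    }
    return compare(values[node.field], node.value);
  }
  if (node.type === "reducer") {
    const evaluateChild = child => evaluateFormula(child, values);
    if (node.reducer === "||") return node.checks.some(evaluateChild);
    if (node.reducer === "&&") return node.checks.every(evaluateChild);
    throw new Error(`Unknown reducer: ${node.reducer}`);
  }
  throw new Error(`Unknown node type: ${node.type}`);
};

// The formula from above: x == 100 || (x <= 10 && x >= 0)
const formula = {
  type: "reducer",
  reducer: "||",
  checks: [
    { type: "check", field: "x", comparator: "==", value: 100 },
    {
      type: "reducer",
      reducer: "&&",
      checks: [
        { type: "check", field: "x", comparator: "<=", value: 10 },
        { type: "check", field: "x", comparator: ">=", value: 0 }
      ]
    }
  ]
};

console.log(evaluateFormula(formula, { x: 100 })); // true
console.log(evaluateFormula(formula, { x: 5 }));   // true
console.log(evaluateFormula(formula, { x: 50 }));  // false
```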


The trouble comes when we try to represent this schema with Mongoose.

We can break our entire JSON representation into two distinct “types”. We have a “check” type that has field, comparator, and value fields, and a “reducer” type that has a reducer field, and a checks field that contains a list of either “check” or “reducer” objects.
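The two types described above can be written down as a small recursive shape check in plain JavaScript. This is only a sketch of the structure we want Mongoose to enforce, and isFormulaNode is a hypothetical helper name, not something Mongoose provides:

```javascript
// Hypothetical helper: returns true when a value has the shape of a
// "check" or of a "reducer" whose checks are themselves valid nodes.
const isFormulaNode = node => {
  if (node === null || typeof node !== "object") {
    return false;
  }
  if (node.type === "check") {
    return (
      typeof node.field === "string" &&
      typeof node.comparator === "string" &&
      typeof node.value === "number"
    );
  }
  if (node.type === "reducer") {
    return (
      typeof node.reducer === "string" &&
      Array.isArray(node.checks) &&
      node.checks.every(isFormulaNode) // the recursive part
    );
  }
  return false;
};

const validNode = {
  type: "reducer",
  reducer: "&&",
  checks: [{ type: "check", field: "x", comparator: ">=", value: 0 }]
};
const invalidNode = {
  type: "reducer",
  reducer: "&&",
  checks: [{ type: "check", field: "x" }] // missing comparator and value
};

console.log(isFormulaNode(validNode));   // true
console.log(isFormulaNode(invalidNode)); // false
```

The recursive call inside the "reducer" branch is exactly the part that turns out to be awkward to express with Mongoose schemas.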

Historically, Mongoose had trouble with a field in a document adhering to either one schema or another. That all changed with the introduction of “discriminators”, and later, “embedded discriminators”. Embedded discriminators let us say that an element of an array adheres to one of multiple schemas defined with different discriminators.

Again, let’s make that more clear with an example. If we wanted to store our formula within a document, we’d start by defining the schema for that wrapping “base” document:


const { Schema } = require("mongoose");
const baseSchema = new Schema({
  name: String,
  formula: checkSchema
});

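The formula field will hold our formula. Because baseSchema references checkSchema, checkSchema has to be declared before baseSchema in the actual file. We can define the shell of our checkSchema like so: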


const checkSchema = new Schema(
  {},
  {
    discriminatorKey: "type",
    _id: false
  }
);

Here we’re setting the discriminatorKey to "type", which means that Mongoose will look at the value of "type" to determine what kind of schema the rest of this subdocument should adhere to.

Next, we have to define each type of our formula. Our "reducer" type has a reducer field and a checks field:


baseSchema.path("formula").discriminator("reducer", new Schema(
  {
    reducer: {
      type: String,
      enum: ['&&', '||']
    },
    checks: [checkSchema]
  },
  { _id: false }
));

Similarly, our "check" type has its own unique set of fields:


baseSchema.path("formula").discriminator("check", new Schema(
  {
    field: String,
    comparator: {
      type: String,
      // Comparison operators from the example formula, not the reducer operators
      enum: ['==', '<=', '>=']
    },
    value: Number
  },
  { _id: false }
));

Unfortunately, this only works for the first level of our formula. Trying to define a top-level "reducer" or "check" works great, but trying to put a "reducer" or a "check" within a "reducer" fails. Those nested objects are stripped from our final object.


The problem is that we’re defining our discriminators based off of a path originating from the baseSchema:


baseSchema.path("formula").discriminator(...);

Our nested "reducer" subdocuments don’t have any discriminators attached to their checks. To fix this, we need to create two new functions that recursively build each layer of our discriminator stack.

We’ll start with a buildCheckSchema function that simply returns a new schema for our "check"-type subdocuments. This schema doesn’t have any children, so it doesn’t need to define any new discriminators:


const buildCheckSchema = () =>
  new Schema({
    field: String,
    comparator: {
      type: String,
      // Comparison operators from the example formula, not the reducer operators
      enum: ['==', '<=', '>=']
    },
    value: Number
  }, { _id: false });

Our buildReducerSchema function needs to be a little more sophisticated. First, it needs to create the "reducer"-type sub-schema. Next, it needs to attach "reducer" and "check" discriminators to the checks field of that new schema with recursive calls to buildCheckSchema and buildReducerSchema:


const buildReducerSchema = () => {
  const reducerSchema = new Schema(
    {
      reducer: {
        type: String,
        enum: ['&&', '||']
      },
      checks: [checkSchema]
    },
    { _id: false }
  );
  reducerSchema.path('checks').discriminator('reducer', buildReducerSchema());
  reducerSchema.path('checks').discriminator('check', buildCheckSchema());
  return reducerSchema;
};

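While this works in concept, it blows up in practice. Mongoose’s discriminator function needs a fully built schema passed into it, so every call to buildReducerSchema has to call buildReducerSchema again before it can return, and nothing ever stops that chain. The result is an infinite recursive loop that blows the top off of our stack.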
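The failure can be reproduced without Mongoose at all. The sketch below swaps schemas for plain objects (buildReducerNode is a hypothetical stand-in, not the real schema builder) and relies on the fact that JavaScript evaluates a call’s arguments before making the call:

```javascript
// Stand-in for buildReducerSchema: plain objects instead of Mongoose schemas.
const buildReducerNode = () => {
  const node = { type: "reducer", children: [] };
  // buildReducerNode() is evaluated before push() ever runs, so each call
  // starts another call and none of them gets a chance to return.
  node.children.push(buildReducerNode());
  return node;
};

let overflowed = false;
try {
  buildReducerNode();
} catch (error) {
  // Node.js reports an exhausted call stack as a RangeError.
  overflowed = error instanceof RangeError;
}

console.log(overflowed);
```

Nothing in the function body ever produces a base case, which is what the rest of this post sets out to fix.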


The solution I landed on for this problem is to limit the number of recursive calls we can make to buildReducerSchema to some maximum value. We can add this limit by passing an optional n argument to buildReducerSchema that defaults to 0. Every time we call buildReducerSchema from within buildReducerSchema, we’ll pass it an incremented value of n:


reducerSchema.path('checks').discriminator('reducer', buildReducerSchema(n + 1));

Next, we’ll use the value of n to enforce our maximum recursion limit:


const buildReducerSchema = (n = 0) => {
  if (n > 100) {
    return buildCheckSchema();
  }
  ...
};

Once n exceeds one hundred, we simply force the next layer to be a "check"-type schema, gracefully terminating the schema stack.
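The same depth guard can be sketched with plain objects standing in for schemas, which makes it easy to count how many layers actually get built. All of the names below (MAX_DEPTH, buildCheckNode, buildReducerNode, countReducerLevels) are hypothetical stand-ins for illustration:

```javascript
const MAX_DEPTH = 100;

// Stand-in for buildCheckSchema: a leaf with no children.
const buildCheckNode = () => ({ type: "check" });

// Stand-in for buildReducerSchema, using the same n > 100 guard.
const buildReducerNode = (n = 0) => {
  if (n > MAX_DEPTH) {
    return buildCheckNode(); // terminate the stack with a leaf
  }
  return {
    type: "reducer",
    children: [buildReducerNode(n + 1), buildCheckNode()]
  };
};

// Counts the reducer layers along the nested "reducer" branch.
const countReducerLevels = node =>
  node.type === "reducer" ? 1 + countReducerLevels(node.children[0]) : 0;

console.log(countReducerLevels(buildReducerNode())); // 101
```

Because the guard is n > 100 and n starts at 0, this version builds 101 reducer layers before falling back to a leaf. If exactly one hundred matters, the comparison would need to be n >= 100.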

To finish things off, we need to pass our baseSchema these recursively constructed discriminators (without an initial value of n):


baseSchema.path("formula").discriminator("reducer", buildReducerSchema());
baseSchema.path("formula").discriminator("check", buildCheckSchema());

And that’s it!

Against all odds, we managed to build a nested, discriminator-based schema that can fully represent any formula we throw at it, up to a depth of roughly one hundred reducers. At the end of the day, I’m happy with that solution.
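Since the schema only goes so deep, it may be worth measuring a formula before trying to save it. The sketch below is one way to do that in plain JavaScript. The formulaDepth helper and the MAX_REDUCER_DEPTH constant are hypothetical, and the value of 100 is an assumption that should match whatever limit buildReducerSchema really enforces:

```javascript
// Assumed limit; keep in sync with the guard inside buildReducerSchema.
const MAX_REDUCER_DEPTH = 100;

// Hypothetical helper: number of nested reducer layers in a formula.
// A lone "check" has depth 0; a reducer adds 1 to its deepest child.
const formulaDepth = node =>
  node.type === "reducer"
    ? 1 + Math.max(0, ...node.checks.map(formulaDepth))
    : 0;

// x == 100 || (x <= 10 && x >= 0)
const exampleFormula = {
  type: "reducer",
  reducer: "||",
  checks: [
    { type: "check", field: "x", comparator: "==", value: 100 },
    {
      type: "reducer",
      reducer: "&&",
      checks: [
        { type: "check", field: "x", comparator: "<=", value: 10 },
        { type: "check", field: "x", comparator: ">=", value: 0 }
      ]
    }
  ]
};

const depth = formulaDepth(exampleFormula);
console.log(depth);                      // 2
console.log(depth <= MAX_REDUCER_DEPTH); // true
```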