Nesting Structure Comparison

This post is written as a set of Literate Commits. The goal of this style is to show you how this program came together from beginning to end.

Each commit in the project is represented by a section of the article. Click each section's header to see the commit on Github, or check out the repository and follow along.

Written by Pete Corey on Aug 3, 2016.

Project Setup

The goal of this code kata is to implement a method on the Array prototype that determines when the current array and another array provided as input have the same nested structure.

To get a better idea of what we mean by “nested structure”, these calls to sameStructureAs would return true:


[ 1, 1, 1 ].sameStructureAs( [ 2, 2, 2 ] );
[ 1, [ 1, 1 ] ].sameStructureAs( [ 2, [ 2, 2 ] ] );

And these would return false:


[ 1, [ 1, 1 ] ].sameStructureAs( [ [ 2, 2 ], 2 ] );
[ 1, [ 1, 1 ] ].sameStructureAs( [ [ 2 ], 2 ] );

We’ll start by initializing our project with a basic Babel and Mocha setup.

.babelrc

+{ + "presets": ["es2015"] +}

.gitignore

+node_modules/

package.json

+{ + "main": "index.js", + "scripts": { + "test": "mocha ./test --compilers js:babel-register" + }, + "dependencies": { + "babel-preset-es2015": "^6.9.0", + "babel-register": "^6.9.0", + "chai": "^3.5.0", + "lodash": "^4.12.0", + "mocha": "^2.4.5" + } +}

test/index.js

+import { expect } from "chai"; + +describe("index", function() { + + it("works"); + +});

Extending Array

The simplest test we can write to get going is to compare the structure of an empty array to another empty array:


expect([].sameStructureAs([])).to.be.true;

After writing this test, our test suite fails. A naively simple way to get our suite back into the green is to define a very simple sameStructureAs function on the Array prototype that always returns true.

index.js

+Array.prototype.sameStructureAs = function(other) { + return true; +}

test/index.js

+import "../"; import { expect } from "chai"; -describe("index", function() { +describe("sameStructureAs", function() { - it("works"); + it("works", () => { + expect([].sameStructureAs([])).to.be.true; + });

Comparing Types

Next, we’ll add our first real test. We expect an array who’s first element is a Number to have a different structure than an array who’s first element is an Array:


expect([1].sameStructureAs([[]])).to.be.false;

The most obvious way to make this test pass is to check the types of the first element of each array. If both first elements are arrays, or both are not arrays, we’ll return true. Otherwise, we’ll return false.

index.js

+import _ from "lodash"; + Array.prototype.sameStructureAs = function(other) { - return true; + if (_.isArray(this[0]) && _.isArray(other[0])) { + return true; + } + else if (!_.isArray(this[0]) && !_.isArray(other[0])) { + return true; + } + return false; }

test/index.js

... expect([].sameStructureAs([])).to.be.true; + expect([1].sameStructureAs([[]])).to.be.false; });

Generalizing

Now that we have our base case figured out, it’s time to generalize and check the structure of arrays of any length.

To guide us on this process, we added two new tests:


expect([[], []].sameStructureAs([[], []])).to.be.true;
expect([[], 1].sameStructureAs([[], []])).to.be.false;

The general idea is that we’ll want to compare each element of this and other using the same kind of structural comparison we previously wrote. Lodash’s _.zipWith function is a great way of comparing array elements like this:


let comparisons = _.zipWith(this, other, (a, b) => {
    ...
});

Now, comparisons will hold a list of true/false values representing whether the values at that position shared the same structure.

We can use _.every to return true if all comparisons are true, or false otherwise.

During this process, we also did some refactoring of our comparison code, just to clean things up a bit:


let bothArrays = _.isArray(a) && _.isArray(b);
let bothNot = !_.isArray(a) && !_.isArray(b);
return bothArrays || bothNot;

index.js

... Array.prototype.sameStructureAs = function(other) { - if (_.isArray(this[0]) && _.isArray(other[0])) { - return true; - } - else if (!_.isArray(this[0]) && !_.isArray(other[0])) { - return true; - } - return false; + let comparisons = _.zipWith(this, other, (a, b) => { + let bothArrays = _.isArray(a) && _.isArray(b); + let bothNot = !_.isArray(a) && !_.isArray(b); + return bothArrays || bothNot; + }); + return _.every(comparisons); }

test/index.js

... expect([1].sameStructureAs([[]])).to.be.false; + expect([[], []].sameStructureAs([[], []])).to.be.true; + expect([[], 1].sameStructureAs([[], []])).to.be.false; });

Getting Recursive

So far we’ve only handled a single layer of structure. However, the goal of sameStructureAs is to compare the nested structures of our inputs. Consider these examples:


expect([[], [1]].sameStructureAs([[], [1]])).to.be.true;
expect([[], [1]].sameStructureAs([[], [[]]])).to.be.false;

While at first glance this seems to add a lot of complexity, there’s actually a very elegant solution to this problem. When we find two matching arrays in our inputs, all we need to do is ensure that the contents of those arrays have the same structure:


return (bothArrays && a.sameStructureAs(b)) || bothNot;

This recursive call into sameStructureAs will handle any nested depths that we can throw at it (as long as we don’t blow our stack).

index.js

... let bothNot = !_.isArray(a) && !_.isArray(b); - return bothArrays || bothNot; + return (bothArrays && a.sameStructureAs(b)) || bothNot; });

test/index.js

... expect([[], 1].sameStructureAs([[], []])).to.be.false; + expect([[], [1]].sameStructureAs([[], [1]])).to.be.true; + expect([[], [1]].sameStructureAs([[], [[]]])).to.be.false; });

Bug Fixes

Submitting our solution revealed a few bugs in our solution. The first bug can be pointed out with this failing test:


expect([].sameStructureAs(1)).to.be.false;

We were assuming that other would be an array. Unfortunately in this case, other is 1, which causes _.zipWith to return an empty array.

We can fix this bug by adding an upfront check asserting that other is an Array.

After fixing that issue, another bug reared its ugly head:


expect([1].sameStructureAs([1, 2])).to.be.false;

In the last iteration of _.zipWith, a was undefined and b was 2. While checking if both of these values were not arrays returned true, we didn’t check if both values actually existed.

The fix for this bug is to check that both a and b are not undefined:


let bothDefined = !_.isUndefined(a) && !_.isUndefined(b);
let bothNot = bothDefined && !_.isArray(a) && !_.isArray(b);

With those fixes, out suite flips back to green!

index.js

... Array.prototype.sameStructureAs = function(other) { + if (!_.isArray(other)) { + return false; + } let comparisons = _.zipWith(this, other, (a, b) => { let bothArrays = _.isArray(a) && _.isArray(b); - let bothNot = !_.isArray(a) && !_.isArray(b); + let bothDefined = !_.isUndefined(a) && !_.isUndefined(b); + let bothNot = bothDefined && !_.isArray(a) && !_.isArray(b); return (bothArrays && a.sameStructureAs(b)) || bothNot;

test/index.js

... expect([[], [1]].sameStructureAs([[], [[]]])).to.be.false; + expect([].sameStructureAs(1)).to.be.false; + expect([1].sameStructureAs([1, 2])).to.be.false; });

Final Refactor

To make things a little more readable, I decided to move the undefined check into a guard, rather than including it withing the comparison logic of our zipWith function.

After implementing this refactor, our tests still pass.

index.js

... let comparisons = _.zipWith(this, other, (a, b) => { + if (_.isUndefined(a) || _.isUndefined(b)) { + return false; + } let bothArrays = _.isArray(a) && _.isArray(b); - let bothDefined = !_.isUndefined(a) && !_.isUndefined(b); - let bothNot = bothDefined && !_.isArray(a) && !_.isArray(b); + let bothNot = !_.isArray(a) && !_.isArray(b); return (bothArrays && a.sameStructureAs(b)) || bothNot;

Final Thoughts

Looking at the other submitted solutions to this kata, I realize that it’s a fairly interesting problem with lots of possible solutions. There are shorter solutions out there, but I like ours for its readability.

Brevity doesn’t always lead to better code. This is an important lesson that has taken me many years to fully appreciate.

Be sure to check out the final solution on GitHub. If you spot a bug or see a place ripe for improvement, be sure to submit a pull request!

Method Imports and Exports

Written by Pete Corey on Aug 1, 2016.

Organizing our Meteor code into modules can be a powerful improvement over the old globals-everywhere approach that many of us are used to.

Modules emphasize isolation. Everything within a module is scoped locally within that module, unless its explicitly exported to the outside world. This kind of isolation can lead to better readability and testability.

Unfortunately, many of the core features of the Meteor framework still rely on modifying global state. Defining things like methods, publications and template helpers all update the global state of the application.

This dichotomy can be confusing to Meteor developers new to the module system. How do we define methods in modules? Should we be exporting methods from modules? How do we import those methods into our application?

Importing Methods

When you call Meteor.methods(...), you’re modifying your application’s global state. The methods you pass into Meteor.methods are pushed onto the global list of callable methods within your application.

Because it affects global state in this way, Meteor.methods is said to have side effects.

Imagine that we have a module called paymentMethods.js. The purpose of this module is to define several Meteor methods related to payment processing.

Inside of that module, we have a createPayment method that lets logged in users create new payments based on some provided options. It looks something like this:


import { Meteor } from "meteor/meteor";

Meteor.methods({
    createPayment(options) {
        if (this.userId) {
            ...
        }
        else {
            throw new Meteor.Error("unauthenticated");
        }
    }
});

You’ll notice that nothing is being exported from this module. This is because the call to Meteor.methods is modifying the global state for us. We don’t need to export our methods to make them accessible outside of the module.

We could even define our methods locally and make them accessible globally using Meteor.methods:


import { Meteor } from "meteor/meteor";

const methods = {
    createPayment(options) {
        ...
    }
};

Meteor.methods(methods);

We’ve moved our method definitions into a constant called methods which is only accessible within our paymentMethods.js module. However, our call to Meteor.methods still pulls our methods into our global method map, making them accessible anywhere in our application.


While we don’t need to export our methods from our modules, we still need some piece of our application to import our paymentMethods.js module. If our code never runs, our methods will never be defined and added to our list of global methods.

In our /server/main.js file, we can import our paymentMethods.js module. This will run our module’s code, defining the methods and passing them into Meteor.methods:


import "/imports/paymentMethods.js";

Notice that we’re not importing anything from paymentMethods.js. That’s because there is nothing to import. We just want this module to be executed.

Testing Methods

What if we wanted to unit test our paymentMethods.js module?


it("creates a payment", function() {
    let options = { type: "cc", ... };
    // TODO: call createPayment?
    expect(paymentId).to.be.ok;
});

How would we call createPayment? The obvious answer would be to use Meteor.call:


it("creates a payment", function() {
    let options = { type: "cc", ... };
    let paymentId = Meteor.call("createPayment", options);
    expect(paymentId).to.be.ok;
});

But, our method is expecting a logged in user. This leads to all kinds of difficulties.

To successfully Meteor.call our "createPayment" method, we’ll have to either override Meteor.call with a version that passes in a custom this context that we can control, or go through the process of creating and logging in as an actual user before calling "createPayment".

Following these approaches, this kind of test arguably isn’t a unit test. It relies on the entire Meteor method and accounts infrastructures to execute.

A possible way around these problems is to try to access and call our method functions directly. On the server, Meteor.call moves your method definitions into an internal object accessible through Meteor.server.method_handlers.

In theory, we could call "createPayment" directly through this object, and pass in a custom userId:


it("creates a payment", function() {
    let options = { type: "cc", ... };
    let createPayment = Meteor.server.method_handlers.createPayment;
    let paymentId = createPayment.bind({
        userId: "1234567890"
    })(options);
    expect(paymentId).to.be.ok;
});

While this works, it’s a very fragile solution. Meteor.server.method_handlers is an undocumented API and is subject to change. It’s dangerous to rely on this internal structure when there is no guarantee that it will be around in future versions of Meteor.

There has to be an better way to test our methods!

Exporting Methods

To solve this problem, let’s go back and revisit our paymentMethods.js module:


import { Meteor } from "meteor/meteor";

const methods = {
    createPayment(options) {
        ...
    }
};

Meteor.methods(methods);

We can make the testing of this module vastly easier with one simple change:


import { Meteor } from "meteor/meteor";

export const methods = {
    createPayment(options) {
        ...
    }
};

Meteor.methods(methods);

By exporting the methods object, we can now get a direct handle on our "createPayment" method without having to dig through the Meteor internals. This makes our test much more straightforward:


import { methods } from "./paymentMethods.js";

it("creates a payment", function() {
    let options = { type: "cc", ... };
    let paymentId = methods.createPayment.bind({
        userId: "1234567890"
    })(options);
    expect(paymentId).to.be.ok;
});

This is a clean test.

We’re directly calling our "createPayment" method with a controllable this context and input options. This makes it very easy to test a variety of situations and scenarios that this method might encounter.

We’re not relying on any Meteor infrastructure or internal structures to run our tests; they’re completely independent of the platform.

Beyond Methods

What’s beautiful about this module-based approach is that it’s not limited to just Meteor methods. We could apply this same technique to creating and testing our publications and our template helpers, lifecycle callbacks and event handlers.

As a quick example, imagine defining a template in a module like this:


import { Template } from "meteor/templating";

export function onCreated() {
    this.subscribe("foo");
    ...
};

export const helpers = {
    foo: () => "bar",
    ....
};

export const events = {
    "click .foo": function(e, t) {
        ...
    }
};

Template.foo.onCreated(onCreated);
Template.foo.helpers(helpers);
Template.foo.events(event);

Because we’re defining our template’s helpers, events, and onCreated callback as simple functions and objects, we can easily import them into a test module and test them directly:


import { events, helpers, onCreated} from "./foo.js";

describe("foo", function() {

    it("sets things up when it's created", function() {
        let subscriptions = [];

        onCreated.bind({
            subscribe(name) => subscriptions.push(name);
        })();
        
        expect(subscriptions).to.contain("foo");
    });

    it("returns bar from foo", function() {
        let foo = helpers.foo();
        expect(foo).to.equal("bar");
    });

    ...
});

Our onCreated test creates a custom version of this.subscribe that pushes the subscription name onto an array, and after calling the onCreated callback, asserts that "foo" has been subscribed to.

The helpers.foo test is less complex. It simply asserts that helpers.foo() equals "bar".

Final Thoughts

Hopefully I’ve given you a clear picture of how I approach defining and testing my methods, publications, and template helpers in a post-1.3 world.

Modules are a very powerful tool that can make our lives as developers much easier if we take the time to explore and understand their full potential. Modularity makes code readability, testability, and re-usability significantly easier.

Happy modularizing!

Molecule to Atoms

This post is written as a set of Literate Commits. The goal of this style is to show you how this program came together from beginning to end.

Each commit in the project is represented by a section of the article. Click each section's header to see the commit on Github, or check out the repository and follow along.

Written by Pete Corey on Jul 27, 2016.

Project Setup

Today we’re going to be tackling the Molecule to Atom kata. The goal of this kata is to parse a string representation of a molecule into its component elements, or atoms.

As always, the first step is to get our project set up. We’ll be using Mocha as our test runner and Babel for ES6 support.

.babelrc

+{ + "presets": ["es2015"] +}

.gitignore

+node_modules/

package.json

+{ + "main": "index.js", + "scripts": { + "test": "mocha ./test --compilers js:babel-register" + }, + "dependencies": { + "babel-preset-es2015": "^6.9.0", + "babel-register": "^6.9.0", + "chai": "^3.5.0", + "lodash": "^4.12.0", + "mocha": "^2.4.5" + } +}

test/index.js

+import { expect } from "chai"; + +describe("index", function() { + + it("works"); + +});

Laying the Groundwork

The best place to start is at the beginning. We’ll begin solving this kata by writing the most basic test we can think of. We expect an empty string to be parsed into an empty object:


expect(parseMolecule("")).to.deep.equal({});

After writing this test, it fails. parseMolecule is not defined. We can quickly remedy this by importing parseMolecule into our test file and then exporting it from our main module.

Lastly, we need parseMolecule to return an empty object. Just like that, our tests are green.

index.js

+export function parseMolecule(formula) { + return {}; +}

test/index.js

import { expect } from "chai"; +import { parseMolecule } from "../index"; -describe("index", function() { +describe("Molecule to Atoms", function() { - it("works"); + it("it parses a molecule", () => { + expect(parseMolecule("")).to.deep.equal({}); + });

Introducing Our Abstractions

Knowing this solution is going to get complex fairly quickly, we’d like to leverage some abstractions to make it more testable and maintainable.

The most obvious abstraction we can think of is a molecule. We create a molecule by passing in a formula. From there, we can call parse to get an object-based representation of the molecule.

We can rewrite our parseMolecule function to use molecule in a clean way.

index.js

+export function molecule(formula) { + function parse() { + return {}; + } + + return { + parse + }; +} + export function parseMolecule(formula) { - return {}; + return molecule(formula).parse(); }

Testing Hydrogen

Let’s add a new test case:


expect(parseMolecule("H")).to.deep.equal({ H: 1 });

This test fails. It’s expecting {} to equal { H: 1 }.

The fastest way we can get ourselves to green is to return an object that maps the provided formula to 1.


return {
    [formula]: 1
};

After making this change, our first test fails. We now need to explicitly handle this base case:


if (!formula) {
    return {};
}

And now we’re back to green again.

index.js

... function parse() { - return {}; + if (!formula) { + return {}; + } + return { + [formula]: 1 + }; }

test/index.js

... expect(parseMolecule("")).to.deep.equal({}); + expect(parseMolecule("H")).to.deep.equal({ H: 1 }); });

Multiple Elements

Next we’ll move onto parsing a molecule with multiple types of elements. Let’s try to parse "HMg":


expect(parseMolecule("HMg")).to.deep.equal({ H: 1, Mg: 1 });

As expected, this test fails. parse is returning { HMg: 1 }, rather than correctly splitting "H" and "Mg".

To fix this, we’ll need to implement a new method on molecule that will split a formula into a number of parts. We’ll start things off by splitting formulas into its component elements:


formula.match(/[A-Z][a-z]*/g) || [];

The || [] default is required to keep our base case of parsing "" happy.

After rewriting parse to use this new parts method to assemble the molecule object, our tests flip back to green.

index.js

export function molecule(formula) { + function parts() { + return formula.match(/[A-Z][a-z]*/g) || []; + } + function parse() { - if (!formula) { - return {}; - } - return { - [formula]: 1 - }; + return parts().reduce((result, part) => { + result[part] = 1; + return result; + }, {}); } ... return { - parse + parse, + parts };

test/index.js

import { expect } from "chai"; -import { parseMolecule } from "../index"; +import { parseMolecule, molecule } from "../index"; ... expect(parseMolecule("H")).to.deep.equal({ H: 1 }); + expect(parseMolecule("HMg")).to.deep.equal({ H: 1, Mg: 1 }); + }); + + describe("molecule", () => { + it("splits a formula into parts", () => { + expect(molecule("HMg").parts()).to.deep.equal(["H", "Mg"]); + }); });

Compiling Parts

Things are getting more complicated. Instead of just breaking apart our formula and looking for elements, we need to look for pairs of elements and their corresponding count, or ratio in the molecule.

A nice way to do this is using regular expression match groups and ES6 pattern matching. We can define our regex, and wrap the element and optional count in match groups:


let regex = /([A-Z][a-z]*)(\d*)/g;

As we iterate over our regex matches, we can destructure the result into its corresponding element and count pair.

Finally, we refactored parts to return an object with part and count keys to keep things explicit. Thinking ahead, I think this might come in handy when we need to deal with nested expressions.

index.js

... function parts() { - return formula.match(/[A-Z][a-z]*/g) || []; + let parts, results = []; + let regex = /([A-Z][a-z]*)(\d*)/g; + while (parts = regex.exec(formula)) { + let [_, part, count] = parts; + results.push({ + part, + count: parseInt(count) || 1 + }); + } + return results; } ... function parse() { - return parts().reduce((result, part) => { - result[part] = 1; + return parts().reduce((result, {part, count}) => { + result[part] = count; return result;

test/index.js

... expect(parseMolecule("HMg")).to.deep.equal({ H: 1, Mg: 1 }); + expect(parseMolecule("H2Mg")).to.deep.equal({ H: 2, Mg: 1 }); }); ... it("splits a formula into parts", () => { - expect(molecule("HMg").parts()).to.deep.equal(["H", "Mg"]); + expect(molecule("HMg").parts()).to.deep.equal([ + { part: "H", count: 1 }, + { part: "Mg", count: 1 }, + ]); });

Fixing an Edge Case

One thing I noticed while looking over our solution is that it did not support multiple instances of the same element in a molecule. For example, with "HMgH", the last "H" element would override the previous.

I added a test to confirm my suspicions, and quickly fixed the problem. Instead of overriding the element in the result object with count, increment it (if it exists) by count:


result[part] = ~~result[part] + count;

The ~~ operator is rarely seen, but it’s an incredibly useful operator. In this case, if result[part] is undefined, it will be converted to 0, preventing a NaN result from our addition.

index.js

... return parts().reduce((result, {part, count}) => { - result[part] = count; + result[part] = ~~result[part] + count; return result;

test/index.js

... expect(parseMolecule("H2Mg")).to.deep.equal({ H: 2, Mg: 1 }); + expect(parseMolecule("H2MgH")).to.deep.equal({ H: 3, Mg: 1 }); });

Beginning Nested Expressions

Next up on our feature todo list is adding support for nested expressions. A simple nested expressions to get us going could be "[H]Mg". Let’s set our expectations with a test:


expect(parseMolecule("[H]Mg")).to.deep.equal({ H: 1, Mg: 1 });

Now each “part” in our parts method could be either an element, or a nested expressions within square brackets. Let’s update our regular expression to reflect that:


let regex = /(\[(.*)\]|([A-Z][a-z]*))(\d*)/g;

Thanks to the magic of match groups and destructing, we can assign the nested expression (if it exists) to square:


let [_, __, square, part, count] = parts;

Finally, if square exists, let’s recursively process the nested expression and append its parts onto our list of results.

With those changes, our tests flip back to green. Beautiful.

index.js

... let parts, results = []; - let regex = /([A-Z][a-z]*)(\d*)/g; + let regex = /(\[(.*)\]|([A-Z][a-z]*))(\d*)/g; while (parts = regex.exec(formula)) { - let [_, part, count] = parts; - results.push({ - part, - count: parseInt(count) || 1 - }); + let [_, __, square, part, count] = parts; + if (square) { + let nested = molecule(square).parts(); + results = results.concat(nested); + } + else { + results.push({ + part, + count: parseInt(count) || 1 + }); + } }

test/index.js

... expect(parseMolecule("H2MgH")).to.deep.equal({ H: 3, Mg: 1 }); + expect(parseMolecule("[H]Mg")).to.deep.equal({ H: 1, Mg: 1 }); });

Refactoring

Looking forward, the next feature we’ll want to support is using ratios, or count on nested expressions. This means that we’ll need some notion of “multiplying” molecules by some count.

I imagine calling molcule(...).multiply(2) would return a new molecule. This means that multiply will need access to our parsed interpretation of the formula.

We could have multiply call parts, multiply each element by count, serialize the result back into a formula and then create and return a new molecule, but that’s a very roundabout way of doing things.

Instead, let’s define an internal _parts list that’s created every time a molecule is created. The parts method will simply return _parts, and multiply will be able to modify _parts directly.

After doing our refactoring, all of our tests are still passing. We’re safe.

index.js

export function molecule(formula) { - function parts() { - let parts, results = []; - let regex = /(\[(.*)\]|([A-Z][a-z]*))(\d*)/g; - while (parts = regex.exec(formula)) { - let [_, __, square, part, count] = parts; - if (square) { - let nested = molecule(square).parts(); - results = results.concat(nested); - } - else { - results.push({ - part, - count: parseInt(count) || 1 - }); - } + let _parts = []; + + let matches; + let regex = /(\[(.*)\]|([A-Z][a-z]*))(\d*)/g; + while (matches = regex.exec(formula)) { + let [_, __, square, part, count] = matches; + if (square) { + let nested = molecule(square).parts(); + _parts = _parts.concat(nested); + } + else { + _parts.push({ + part, + count: parseInt(count) || 1 + }); } - return results; + } + + function parts() { + return _parts; }

Multiplying Nested Expressions

With that refactor out of the way, we can easily implement multiply. To give ourselves an explicit objective, let’s write a test first:


expect(molecule("H").multiply(2).parse()).to.deep.equal({ H: 2 });

Implementing multiply is straight forward. We just loop over each of the _parts of the molecule, multiplying its count by the provided count. Finally, we return this, so calls to molecule methods can be chained together.

Next, let’s write a test for handling ratios on nested expressions:


expect(parseMolecule("[HO]2Mg")).to.deep.equal({ H: 2, O: 2, Mg: 1 });

This test fails, as expected, but it’s not difficult to get our suite back to green. We just need to build our nested molecule, multiply is by its corresponding count, and finally append its parts onto our results array.

index.js

... let [_, __, square, part, count] = matches; + count = parseInt(count) || 1; if (square) { - let nested = molecule(square).parts(); - _parts = _parts.concat(nested); + let nested = molecule(square).multiply(count); + _parts = _parts.concat(nested.parts()); } ... part, - count: parseInt(count) || 1 + count }); ... + function multiply(count) { + _parts.forEach((part) => { + part.count *= count; + }); + return this; + } + function parts() { ... parse, - parts + parts, + multiply };

test/index.js

... expect(parseMolecule("[H]Mg")).to.deep.equal({ H: 1, Mg: 1 }); + expect(parseMolecule("[HO]2Mg")).to.deep.equal({ H: 2, O: 2, Mg: 1 }); }); ... }); + it("multiplies an object", () => { + expect(molecule("H").multiply(2).parse()).to.deep.equal({ H: 2 }); + }); });

Regex Woes

Unfortunately, there’s a big problem with our solution. Our regex (\[.*\]) is looking for opening and closing brackets and assuming that everything within them are a nested expression.

This assumption breaks down when we have two nested expressions in the same formula, like "[H]O[H]". Our regex will match on the first and very last square brackets, returning "H]O[H" as the nested expression. We can write this as a test to see the failure in action:


expect(molecule("[H]2O[H]").parts()).to.deep.equal([
    { part: "H", count: 2 },
    { part: "O", count: 1 },
    { part: "H", count: 1 }
]);

Switching to a non-greedy regex (\[.*?\]) would still fail, but this time on nested subexpressions, like "[H[O]]".

We need a better way to parse out our formula parts.

Our new solution uses a regex to match on all opening brackets, closing brackets, and elements with a trailing count:


let regex = /((\[)|(\])|([A-Z][a-z]*))(\d*)/g;

Every time we encounter an opening bracket, we push a new formula onto our new stack:


if (open) {
    stack.push({
        formula: ""
    });
}

A non-empty stack means that we’re collecting the pieces of a nested expression. That means we should just append any elements and counts we find to that sub-expression’s formula:


else if (stack.length) {
    stack[stack.length - 1].formula += part + count;
}

Finally, whenever we encounter a closing bracket, we want to create a new molecule out of the nested expression we just collected, and appends its parts onto our list of parts:


else if (close) {
    let nested = molecule(stack.pop().formula).multiply(count);
    _parts = _parts.concat(nested.parts());
}

Now we’re properly processing sub-expressions and our tests return to green.

index.js

... let matches; - let regex = /(\[(.*)\]|([A-Z][a-z]*))(\d*)/g; + let stack = []; + let regex = /((\[)|(\])|([A-Z][a-z]*))(\d*)/g; while (matches = regex.exec(formula)) { - let [_, __, square, part, count] = matches; + let [_, __, open, close, part, count] = matches; count = parseInt(count) || 1; - if (square) { - let nested = molecule(square).multiply(count); + if (open) { + stack.push({ + formula: "" + }); + } + else if (close) { + let nested = molecule(stack.pop().formula).multiply(count); _parts = _parts.concat(nested.parts()); } + else if (stack.length) { + stack[stack.length - 1].formula += part + count; + } else {

test/index.js

... ]); + expect(molecule("[H]2O[H]").parts()).to.deep.equal([ + { part: "H", count: 2 }, + { part: "O", count: 1 }, + { part: "H", count: 1 } + ]); });

Alternate Groupings and Bug Fixes

Last up on our feature list is the ability to use parentheses and curly brackets in addition to square brackets as sub-expression dividers.

We can add a test for this:


expect(parseMolecule("K4{ON(SO3)2}2")).to.deep.equal({K: 4, O: 14, N: 2, S: 4});

The most straight-forward way to accomplish this is to just replace all "{" and "(" characters with "[", and "}" and ")" characters with "]" in our formula:


formula.replace(/[{(]/g, "[").replace(/[})]/g, "]");

Unfortunately, this test also exposed another bug in our solution. it looks like counts on sub-expressions aren’t being applied to other nested expressions.

We can fix this by giving our stack a bit more state. In addition to building up our current expressions’s formula as we go, we also want to keep track of any nested molecules we encounter, so we can multiply them by our count.

After making this change, all of our tests pass.

index.js

... + formula = formula.replace(/[{(]/g, "[").replace(/[})]/g, "]"); + let matches; ... stack.push({ - formula: "" + formula: "", + molecules: [] }); ... else if (close) { - let nested = molecule(stack.pop().formula).multiply(count); - _parts = _parts.concat(nested.parts()); + let popped = stack.pop(); + popped.molecules.push(molecule(popped.formula)); + popped.molecules.forEach((molecule) => { + molecule.multiply(count); + }); + if (!stack.length) { + popped.molecules.forEach((molecule) => { + _parts = _parts.concat(molecule.parts()); + }); + } + else { + let last = stack[stack.length - 1]; + last.molecules = last.molecules.concat(popped.molecules); + } }

test/index.js

... expect(parseMolecule("[HO]2Mg")).to.deep.equal({ H: 2, O: 2, Mg: 1 }); + expect(parseMolecule("K4{ON(SO3)2}2")).to.deep.equal({K: 4, O: 14, N: 2, S: 4}); + expect(parseMolecule("{[Co(NH3)4(OH)2]3Co}(SO4)3")).to.deep.equal({ + Co: 4, + N: 12, + H: 42, + O: 18, + S: 3 + }); });

Final Thoughts

This was a doozy of a kata.

All in all, there are a lot of ideas flying around here and I fear I didn’t explain myself as well as I could have. Balancing trying to keep things as concise, while staying true to the intent of explaining every change is definitely a challenge.

The limitations of regular expressions are explored in this problem was well. Initially, a totally regex based solution seemed like a good approach, but the nested structure of our data made regexes difficult to use.

When using regexes, it’s always important to keep in mind that there are entire classes of problem where they’re simply not suitable.

Finally, my final solution isn’t as clean and clear as I would have liked. By the time I landed on a working solution, I feared that more refactor commits would have muddied the water. In future problems, I need to focus on refactoring sooner and more often.