Be Careful Using With in Tests

Written by Pete Corey on Jun 4, 2018.

Last week I struck a chord in the Elixir community when I tweeted about a trap I fell into while writing a seemingly simple test using Elixir’s with special form. Based on the reaction to that tweet, I thought it’d be a good idea to explore where I went wrong and how I could have prevented it.

The Test

The test in question was fairly simple. Let’s imagine it looked something like this:


test "foo equals bar" do
  with {:ok, foo} <- do_foo(),
       {:ok, bar} <- do_bar() do
    assert foo == bar
  end
end

We’re using with to destructure the results of our do_foo/0 and do_bar/0 function calls. Next, we’re asserting that foo should equal bar.

If do_foo/0 or do_bar/0 return anything other than an :ok tuple, we’d expect our pattern match to fail, causing our test to fail. On running our test, we see that it passes. Our do_foo/0 and do_bar/0 functions must be working as expected!

The False Positive

Unfortunately, we’re operating under a faulty assumption. In reality, our do_foo/0 and do_bar/0 functions actually look like this:


def do_foo, do: {:ok, 1}
def do_bar, do: {:error, :asdf}

Our do_bar/0 is returning an :error tuple, not the :ok tuple our test is expecting, but our test is still passing. What’s going on here?

It’s easy to forget (at least for me, apparently) that when a with expression fails a pattern match, it doesn’t throw an error. Instead, it immediately returns the unmatched value. So in our test, our with expression is returning the unmatched {:error, :asdf} tuple without ever executing its do block and skipping our assertion entirely.

Because our assertion is never given a chance to fail, our test passes!
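
We can see this behavior for ourselves in a quick iex session. Here’s a minimal sketch of the same failure mode, substituting literal tuples for our do_foo/0 and do_bar/0 calls:


iex> with {:ok, foo} <- {:ok, 1},
...>      {:ok, bar} <- {:error, :asdf} do
...>   foo == bar
...> end
{:error, :asdf}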

The Fix

The fix for this broken test is simple once we recognize the problem: we want our pattern matches to raise errors when they fail. One surefire way to accomplish that is to use plain pattern-matched assignments, which raise a MatchError on failure, rather than a with expression.


test "foo equals bar" do
  {:ok, foo} = do_foo()
  {:ok, bar} = do_bar()
  assert foo == bar
end

Now, the :error tuple returned by our do_bar/0 function will fail to match with our :ok tuple, and the test will fail. Not only that, but we’ve also managed to simplify our test in the process of fixing it.

Success!

The Better Fix

After posting the above fix in response to my original tweet, Michał Muskała replied with a fantastic tip to improve the error messaging of the failing test.

Michał's pro tip.

Currently, our test failure looks like this:


** (MatchError) no match of right hand side value: {:error, :asdf}
code: {:ok, bar} = do_bar()

If we add assertions to our pattern matching assignments, we set ourselves up to receive better error messages:


test "foo still equals bar" do
  assert {:ok, foo} = do_foo()
  assert {:ok, bar} = do_bar()
  assert foo == bar
end

Now our failing test reads like this:


match (=) failed
code:  assert {:ok, bar} = do_bar()
right: {:error, :asdf}

While we’re still given all of the same information about the failure, it’s presented in a way that’s easier to read and internalize, leading to a quicker understanding of how and why our test is failing.

I’ll be sure to incorporate that tip into my tests from now on. Thanks Michał!

Modeling Formulas with Recursive Discriminators

Written by Pete Corey on May 28, 2018.

I recently ran into an issue while trying to represent a nested, discriminator-based schema using Mongoose in a Node.js client project. The goal was to represent a logical formula by creating a hierarchy of “reducers” (&&, ||, etc…) that would reduce a series of nested “checks” down into a single value.

Let’s make that a little more relatable with an example. Imagine what we’re trying to represent the following formula:


x == 100 || (x <= 10 && x >= 0)

If we wanted to store this in MongoDB, we’d have to represent that somehow as a JSON object. Let’s take a stab at that:


{
  type: "reducer",
  reducer: "||",
  checks: [
    {
      type: "check",
      field: "x",
      comparator: "==",
      value: 100
    },
    {
      type: "reducer",
      reducer: "&&",
      checks: [
        {
          type: "check",
          field: "x",
          comparator: "<=",
          value: 10
        },
        {
          type: "check",
          field: "x",
          comparator: ">=",
          value: 0
        }
      ]
    }
  ]
}

What a behemoth!

While the JSON representation is ridiculously more verbose than our mathematical representation, it gives us everything we need to recreate our formula, and lets us store that formula in our database. This is exactly what we want.


The trouble comes when we try to represent this schema with Mongoose.

We can break our entire JSON representation into two distinct “types”. We have a “check” type that has field, comparator, and value fields, and a “reducer” type that has a reducer field, and a checks field that contains a list of either “check” or “reducer” objects.

Historically, Mongoose had trouble with a field in a document adhering to either one schema or another. That all changed with the introduction of “discriminators”, and later, “embedded discriminators”. Embedded discriminators let us say that an element of an array adheres to one of multiple schemas defined with different discriminators.

Again, let’s make that more clear with an example. If we wanted to store our formula within a document, we’d start by defining the schema for that wrapping “base” document:


const baseSchema = new Schema({
  name: String,
  // Note: checkSchema, defined next, must be declared before this line runs.
  formula: checkSchema
});

The formula field will hold our formula. We can define the shell of our checkSchema like so:


const checkSchema = new Schema(
  {},
  {
    discriminatorKey: "type",
    _id: false
  }
);

Here we’re setting the discriminatorKey to "type", which means that Mongoose will look at the value of "type" to determine what kind of schema the rest of this subdocument should adhere to.
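
To make that concrete, when Mongoose validates a subdocument like the one below, it sees type: "check" and applies whichever schema we register under the "check" discriminator:


{
  type: "check",
  field: "x",
  comparator: "==",
  value: 100
}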

Next, we have to define each type of our formula. Our "reducer" has a reducer field and a checks field:


baseSchema.path("formula").discriminator("reducer", new Schema(
  {
    reducer: {
      type: String,
      enum: ['&&', '||']
    },
    checks: [checkSchema]
  },
  { _id: false }
));

Similarly, our "check" type has its own unique set of fields:


baseSchema.path("formula").discriminator("check", new Schema(
  {
    field: String,
    comparator: {
      type: String,
      enum: ['==', '<=', '>=']
    },
    value: Number
  },
  { _id: false }
));

Unfortunately, this only works for the first level of our formula. Trying to define a top-level "reducer" or "check" works great, but trying to put a "reducer" or a "check" within a "reducer" fails. Those nested objects are stripped from our final object.


The problem is that we’re defining our discriminators based off of a path originating from the baseSchema:


baseSchema.path("formula").discriminator(...);

Our nested "reducer" subdocuments don’t have any discriminators attached to their checks. To fix this, we’d need to create two new functions that recursively build each layer of our discriminator stack.

We’ll start with a buildCheckSchema function that simply returns a new schema for our "check"-type subdocuments. This schema doesn’t have any children, so it doesn’t need to define any new discriminators:


const buildCheckSchema = () =>
  new Schema({
    field: String,
    comparator: {
      type: String,
      enum: ['==', '<=', '>=']
    },
    value: Number
  }, { _id: false });

Our buildReducerSchema function needs to be a little more sophisticated. First, it needs to create the "reducer"-type sub-schema. Next, it needs to attach "reducer" and "check" discriminators to the checks field of that new schema with a call to buildCheckSchema and a recursive call to buildReducerSchema:


const buildReducerSchema = () => {
  let reducerSchema = new Schema(
    {
      reducer: {
        type: String,
        enum: ['&&', '||']
      },
      checks: [checkSchema]
    },
    { _id: false }
  );
  reducerSchema.path('checks').discriminator('reducer', buildReducerSchema());
  reducerSchema.path('checks').discriminator('check', buildCheckSchema());
  return reducerSchema;
};

While this works in concept, it blows up in practice. Mongoose’s discriminator function greedily consumes the schemas passed into it, which creates an infinite recursive loop that blows the top off of our stack.


The solution I landed on for this problem is to limit the number of recursive calls we can make to buildReducerSchema to some maximum value. We can add this limit by passing an optional n argument to buildReducerSchema that defaults to 0. Every time we call buildReducerSchema from within buildReducerSchema, we’ll pass it an incremented value of n:


reducerSchema.path('checks').discriminator('reducer', buildReducerSchema(n + 1));

Next, we’ll use the value of n to enforce our maximum recursion limit:


const buildReducerSchema = (n = 0) => {
  if (n > 100) {
    return buildCheckSchema();
  }
  ...
};

If we reach one hundred recursions, we simply force the next layer to be a "check"-type schema, gracefully terminating the schema stack.
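
Stitched together from the snippets above, the finished buildReducerSchema looks something like this:


const buildReducerSchema = (n = 0) => {
  // Once we hit our maximum depth, force the next layer to be a "check".
  if (n > 100) {
    return buildCheckSchema();
  }
  let reducerSchema = new Schema(
    {
      reducer: {
        type: String,
        enum: ['&&', '||']
      },
      checks: [checkSchema]
    },
    { _id: false }
  );
  reducerSchema.path('checks').discriminator('reducer', buildReducerSchema(n + 1));
  reducerSchema.path('checks').discriminator('check', buildCheckSchema());
  return reducerSchema;
};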

To finish things off, we need to attach these recursively constructed discriminators to our baseSchema (without an initial value of n):


baseSchema.path("checks").discriminator("reducer", buildReducerSchema());
baseSchema.path("checks").discriminator("check", buildCheckSchema());

And that’s it!
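
As a quick sanity check, we can compile a model from our baseSchema and persist the example formula from earlier. The "Formula" model name here is a placeholder of my own, and this sketch assumes an active Mongoose connection:


const Formula = mongoose.model("Formula", baseSchema);

Formula.create({
  name: "example",
  formula: {
    type: "reducer",
    reducer: "||",
    checks: [
      { type: "check", field: "x", comparator: "==", value: 100 },
      {
        type: "reducer",
        reducer: "&&",
        checks: [
          { type: "check", field: "x", comparator: "<=", value: 10 },
          { type: "check", field: "x", comparator: ">=", value: 0 }
        ]
      }
    ]
  }
});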

Against all odds we managed to build a nested, discriminator-based schema that can fully represent any formula we throw at it, up to one hundred reducers deep. At the end of the day, I’m happy with that solution.

Spreading Through the Bitcoin Network

Written by Pete Corey on May 21, 2018.

Previously, we beefed up our Elixir-based Bitcoin-node-in-progress to use the Connection behavior to better manage our connection to our peer node. Now that we can robustly connect to a single peer node, let’s broaden our horizons and connect to multiple peers!

Let’s refactor our node to use a dynamic supervisor to manage our collection of connections, and start recursively connecting to nodes in the Bitcoin peer-to-peer network!

Going Dynamic

Each of our connections to a Bitcoin peer node is currently managed through a BitcoinNetwork.Node process. We’ll manage this collection of processes with a new dynamic supervisor called BitcoinNetwork.Node.Supervisor.

Let’s create that new supervisor now:


defmodule BitcoinNetwork.Node.Supervisor do
  use DynamicSupervisor

  def start_link([]) do
    DynamicSupervisor.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init([]) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end
end

The code here is largely boilerplate. Our Node.Supervisor initializes itself with a :one_for_one strategy (the only supervision strategy currently available to a dynamic supervisor). It’s also important to note that, like all dynamic supervisors, our Node.Supervisor starts without children.

Back to Where We Started

Next, we’ll go into our BitcoinNetwork.Application supervisor and replace our BitcoinNetwork.Node child specification with a specification for our new dynamic supervisor:


Supervisor.start_link(
  [
    {DynamicSupervisor, strategy: :one_for_one, name: BitcoinNetwork.Node.Supervisor}
  ],
  strategy: :one_for_one
)

After our Application has successfully started its Node.Supervisor child, we’ll go ahead and add our Node process as a child of our new dynamic supervisor:


DynamicSupervisor.start_child(BitcoinNetwork.Node.Supervisor, %{
  id: BitcoinNetwork.Node,
  start:
    {BitcoinNetwork.Node, :start_link,
     [
       {
         Application.get_env(:bitcoin_network, :ip),
         Application.get_env(:bitcoin_network, :port)
       }
     ]},
  restart: :transient
})

We simply moved our BitcoinNetwork.Node child specification out of our old supervisor’s child list, and dropped it into our call to DynamicSupervisor.start_child/2.

What we’re really trying to do here is “connect to a node”, but all of this boilerplate is confusing our intentions. Let’s create a new function in our BitcoinNetwork module called connect_to_node/2 that takes a node’s IP address and a port, and adds a child to our Node.Supervisor that manages the connection to that node:


def connect_to_node(ip, port) do
  DynamicSupervisor.start_child(BitcoinNetwork.Node.Supervisor, %{
    id: BitcoinNetwork.Node,
    start: {BitcoinNetwork.Node, :start_link, [{ip, port}]},
    restart: :transient
  })
end

Now we can replace the start_child/2 mess in the start/2 callback of our Application module with a call to our new connect_to_node/2 function:


BitcoinNetwork.connect_to_node(
  Application.get_env(:bitcoin_network, :ip),
  Application.get_env(:bitcoin_network, :port)
)

That’s much nicer.

Now it’s clear that when our application starts up, it creates a new dynamic supervisor, Node.Supervisor, and then connects to the Bitcoin node specified in our application’s configuration.

At this point, we’re back up to feature parity with our original one-node solution. All we’ve really managed to do is add a supervisor layer into our supervision tree.

Our new supervision tree.

Adding Nodes

Now that we’re equipped with our connect_to_node/2 function and our new dynamic node supervisor, we’re ready to rapidly expand our network of known Bitcoin nodes.

Our Node process is currently listening for incoming node addresses in one of our handle_payload/2 functions:


defp handle_payload(%Addr{addr_list: addr_list}, state) do
  log([:bright, "Received ", :green, "#{length(addr_list)}", :reset, :bright, " peers."])

  {:ok, state}
end

We can connect to each of these additional peer nodes by mapping each node address in addr_list over our new connect_to_node/2 function:


Enum.map(addr_list, &BitcoinNetwork.connect_to_node(&1.ip, &1.port))

Let’s clean this up a bit by adding a connect_to_node/1 variant that accepts a single NetAddr struct as a parameter:


def connect_to_node(%NetAddr{ip: ip, port: port}), do: connect_to_node(ip, port)

Now we can simply map over the list of NetAddr structures we receive in our addr_list variable:


Enum.map(addr_list, &BitcoinNetwork.connect_to_node/1)

Beautiful.
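
For completeness, our updated handle_payload/2 clause now looks something like this, with our new mapping dropped into the original clause:


defp handle_payload(%Addr{addr_list: addr_list}, state) do
  log([:bright, "Received ", :green, "#{length(addr_list)}", :reset, :bright, " peers."])

  Enum.map(addr_list, &BitcoinNetwork.connect_to_node/1)

  {:ok, state}
end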

Now our application fires up, connects to our initial Bitcoin peer node, receives that node’s list of peers, and spawns a dynamically supervised process that attempts to connect to each of those peers. If any of those peers successfully connect and return their list of peers, we’ll repeat the process.

So many peers!

Uncontrolled Growth

At this point, our Bitcoin node will happily spread itself through the Bitcoin peer-to-peer network, introducing itself as a peer to tens of thousands of nodes. However, this level of connectivity might be overkill for our node.

We need some way of limiting the number of active peer connections to some configurable value.

We’ll start implementing this limit by adding a max_peers configuration value to our config.exs:


config :bitcoin_network, max_peers: 125

Let’s start with a limit of one hundred twenty-five connections, just like the default limit in the Bitcoin Core client.

Next, we’ll make a new function in our BitcoinNetwork module to count the number of active peer connections. This is fairly straightforward thanks to the count_children/1 function on the DynamicSupervisor module:


def count_peers() do
  BitcoinNetwork.Node.Supervisor
  |> DynamicSupervisor.count_children()
  |> Map.get(:active)
end

Next, in our connect_to_node/2 function, we’ll wrap our call to DynamicSupervisor.start_child/2 with a check that we haven’t reached our max_peers limit:


if count_peers() < Application.get_env(:bitcoin_network, :max_peers) do
  DynamicSupervisor.start_child(BitcoinNetwork.Node.Supervisor, %{
    ...
  })
else
  {:error, :max_peers}
end

And that’s all there is to it! Now, every time we receive a peer and try to connect to it, our connect_to_node/2 function will first check that we haven’t yet reached the max_peers limit defined in our application’s configuration.
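
Assembled from the snippets above, the finished connect_to_node/2 looks something like this:


def connect_to_node(ip, port) do
  if count_peers() < Application.get_env(:bitcoin_network, :max_peers) do
    DynamicSupervisor.start_child(BitcoinNetwork.Node.Supervisor, %{
      id: BitcoinNetwork.Node,
      start: {BitcoinNetwork.Node, :start_link, [{ip, port}]},
      restart: :transient
    })
  else
    {:error, :max_peers}
  end
end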

Our Bitcoin node will now limit its pool of peers to a maximum of one hundred twenty-five connections.

Final Thoughts

Elixir’s dynamic supervisor is a breeze to work with and made it possible to easily and quickly scale up our pool of peers from one to tens of thousands of connections in the blink of an eye.

While our Bitcoin node is working its way through the Bitcoin peer-to-peer network, it doesn’t actually do anything. We’ll need to spend some time in the future figuring out how to process incoming blocks and transactions. Maybe at some point we’ll even be able to send our own transactions and mine for new blocks!

It sounds like we’ll have to dust off Mastering Bitcoin and finish off the few remaining chapters.