Living the Simple Life with Recursive Parsing and Serialization

I just pushed a massive refactor of my Elixir-powered Bitcoin full node project that considerably simplifies the parsing and serialization of Bitcoin network messages.

I’m a big fan of the solution I landed on, and I wanted to share it with you. The key insight I had was to switch to a recursive solution where each sub-component of every messages handles its own parsing and serialization.

Obviously the devil is in the details, so let’s dive in.

What’s the Problem?

Before I took on this refactor, I was handling the parsing and serialization of Bitcoin network messages entirely manually. For every message, I’d define a parse/1 function and implement a corresponding serialize/1 protocol. Every field within the message was manually parsed and serialized using Elixir’s various binary manipulation operations.

As an example, here’s how a NetAddr message would be parsed using this technique:


def parse(binary) do
  with {:ok, time, rest} <- parse_time(binary),
       {:ok, services, rest} <- parse_services(rest),
       {:ok, ip, rest} <- parse_ip(rest),
       {:ok, port, rest} <- parse_port(rest) do
    {:ok, %NetAddr{time: time, services: services, ip: ip, port: port}, rest}
  end
end

defp parse_time(<<time::32-little, rest::binary>>),
  do: {:ok, time, rest}

defp parse_time(_binary),
  do: {:error, :bad_time}

defp parse_services(<<services::64-little, rest::binary>>),
  do: {:ok, services, rest}

defp parse_services(_binary),
  do: {:error, :bad_services}

defp parse_ip(<<ip::binary-size(16), rest::binary>>),
  do: {:ok, ip, rest}

defp parse_ip(_binary),
  do: {:error, :bad_ip}

defp parse_port(<<port::16-big, rest::binary>>),
  do: {:ok, port, rest}

defp parse_port(_binary),
  do: {:error, :bad_port}

While this was fantastic practice at manipulating binaries within Elixir, it wasn’t a scalable solution. There are simply too many messages in the Bitcoin protocol to implement in this time consuming way. Not only that, but many of the messages share common sub-structures who’s parse/1 and serialize/1 implementations would need to be repeated throughout the project.

Daunted with the task of implementing a parse/1 and serialize/1 function for every message in the protocol’s peer-to-peer vocabulary, I decided I needed a better solution.

Taking Advantage of Sub-Structures

As I mentioned up above, many Bitcoin messages share common sub-structures. Instead of dooming me to tedious repetition, I realized that these repeated structures were actually a blessing from the DRY gods.

If we could architect our parse/1 and serialize/1 implementations in a way that offloads the responsibility of parsing and serializing these shared sub-structures, the parsing and serialization implementations of our top-level messages could be substantially simplified.

Not only that, but we could take the notion of “sub-structures” even further. In many ways, the types of the primitives that compose together to build the protocol’s messages and sub-structures are sub-structures in and of themselves. For example, a uint32_t, which is a C type commonly used to define unsigned integers throughout the protocol’s various messages, is actually a sub-structure that has a single field and specific parsing and serialization rules.

We could implement a UInt32T struct with a corresponding parse/1 function like so:


defmodule BitcoinNetwork.Protocol.UInt32T do
  defstruct value: nil

  def parse(<<value::little-unsigned-integer-32, rest::binary>>),
    do: {:ok, %BitcoinNetwork.Protocol.UInt32T{value: value}, rest}
end

Similarly, we could reverse the process and serialize our newly parsed UInt32T:


defimpl BitcoinNetwork.Protocol.Serialize, for: BitcoinNetwork.Protocol.UInt32T do
  def serialize(%{value: value}),
    do: <<value::little-unsigned-integer-32>>
end

Composing Sub-Structures

Now we have parsing and serialization rules built for these base-level sub-structures like UInt32T and other primitive types. We can build upon the work we’ve done by composing these sub-structures together into more complex structure.

For example, a NetAddr is really just a UInt32T, a UInt64T, a sixteen byte Binary, and a UInt16T representing an addresses’ time, services, ip, and port, respectively. We can write a NetAddr struct complete with a parse/1 function that calls out to the parse/1 functions of these more primitive sub-structures:


defmodule BitcoinNetwork.Protocol.NetAddr do
  defstruct time: nil,
            services: nil,
            ip: nil,
            port: nil

  alias BitcoinNetwork.Protocol.{Binary, NetAddr, UInt32T, UInt64T, UInt16T}

  def parse(binary) do
    with {:ok, time, rest} <- UInt32T.parse(binary),
         {:ok, services, rest} <- UInt64T.parse(rest),
         {:ok, ip, rest} <- Binary.parse(rest, 16),
         {:ok, port, rest} <- UInt16T.parse(rest),
         do:
           {:ok,
            %NetAddr{
              time: time,
              services: services,
              ip: ip,
              port: port
            }, rest}
  end
end

Serializing a NetAddr structure is even easier. We simply build a list of the fields we want serialized, in the order we want them serialized, and then map over that list with our serialize/1 function:


defimpl BitcoinNetwork.Protocol.Serialize, for: BitcoinNetwork.Protocol.NetAddr do
  def serialize(net_addr),
    do:
      [
        net_addr.time,
        net_addr.services,
        net_addr.ip,
        net_addr.port
      ]
      |> BitcoinNetwork.Protocol.Serialize.serialize()
end

We’re left with an Elixir binary that represents the entire serialized NetAddr structure, but we didn’t have to do any of the heavy lifting ourselves.

The best part of this solution is that we can repeatedly build on top of our sub-structures. An Addr message is composed of a VarInt and a list of NetAddr sub-structures. It’s sub-structures all the way down.

Special Cases and Rough Edges

While the general case for this solution works beautifully, there are a few special cases and rough edges we need to smooth over.

The first of these rough edges comes when parsing and serializing fixed-size binaries. For example, within the NetAddr structure, we need to parse sixteen bytes off of the wire and interpret those bytes as an IP address. We instructed our NetAddr parser to do this by calling Binary.parse/2 with 16 as a second argument.

Our Binary module’s parse/2 function accepts an optional second argument that lets us specify exactly how many bytes we want to parse out of the incoming binary:


defmodule BitcoinNetwork.Protocol.Binary do
  def parse(binary, size \\ 1) do
    <<binary::binary-size(size), rest::binary>> = binary
    {:ok, binary, rest}
  end
end

Notice that Binary.parse/2 returns a primitive Elixir binary, rather than a struct. This is an intentional decision and makes our serialization that much easier:


defimpl BitcoinNetwork.Protocol.Serialize, for: BitString do
  def serialize(binary),
    do: binary
end

Another special case we need to handle is made apparent when we need to parse and serialize lists of “things”. A perfect example of this appears in our code when we need to parse an Addr structure, which is composed of a VarInt number of NetAddr structures:


with {:ok, count, rest} <- VarInt.parse(binary),
     {:ok, addr_list, rest} <- Array.parse(rest, value(count), &NetAddr.parse/1),
     do:
       {:ok,
        %Addr{
          count: count,
          addr_list: addr_list
        }, rest}

Like Binary.parse/2, Array.parse/3 has some special behavior associated with it. Our Array module’s parse/3 function takes our binary to parse, the number of “things” we want to parse out of it, and a function to parse each individual “thing”:


defmodule BitcoinNetwork.Protocol.Array do
  def parse(binary, count, parser),
    do: parse(binary, count, parser, [])
end

Our parse/3 function calls out to a private parse/4 function that builds up an accumulator of our parsed “things”. Once we’ve parsed a sufficient number of “things”, we return our accumulated list:


defp parse(rest, 0, parser, list),
  do: {:ok, Enum.reverse(list), rest}

The non-base case of our parse/4 function simply applies our parser/1 function to our binary and appends the resulting parsed “thing” to our list of “things”:


defp parse(binary, count, parser, list) do
  with {:ok, parsed, rest} <- parser.(binary),
       do: parse(rest, count - 1, parser, [parsed | list])
end

Once again, the result of our Array.parse/3 function returns a primitive Elixir list, not a struct. This makes our serialization fairly straight forward:


defimpl BitcoinNetwork.Protocol.Serialize, for: List do
  def serialize(list),
    do:
      list
      |> Enum.map(&BitcoinNetwork.Protocol.Serialize.serialize/1)
      |> join()

  def join(pieces),
    do: 
      pieces
      |> Enum.reduce(<<>>, fn piece, binary -> <<binary::binary, piece::binary>> end)
end

We simply map serialize/1 over our list of “things”, and concatenate the newly serialized pieces together.

If you remember back to our NetAddr serialization example, you’ll notice that we’ve been using our List primitive’s serialization protocol this whole time.

Awesome!

Final Thoughts

I struggled with this refactor on and off for a good few weeks. Ultimately, I’m happy with the solution I landed on. It’s more complex than my original solution in terms of the number of moving parts, but it’s a much more scalable and mentally manageable solution than the one I was originally working with.

Now that this is out of my system, I can turn my attention to the interesting pieces of building a Bitcoin full node: processing blocks!

Expect articles digging into that topic in the future. In the meantime, check out the entire project on Github to get a more hands-on feel for the refactor and the recursive parsing and serialization solution I ultimately landed on.