Building a Better Receive Loop

I’ve been putting quite a bit of time this past week into overhauling and refactoring my in-progress Elixir-based Bitcoin node.

As a part of that overhaul, I turned my attention to how we’re receiving packets from connected peers. The way we’ve been handling incoming packets is overly complicated and can be greatly simplified by taking advantage of the Bitcoin protocol’s packet structure.

Let’s go over our old solution and dig into how it can be improved.

The Original Receive Loop

Our Bitcoin node uses Erlang’s :gen_tcp module to manage peer to peer communications. Originally, we were using :gen_tcp in “active mode”, which means that incoming packets are delivered to our node’s Elixir process in the form of :tcp messages:


def handle_info({:tcp, _port, data}, state) do
  ...
end

Because TCP is a streaming protocol, no guarantees can be made about the contents of these messages. A single message may contain a complete Bitcoin packet, a partial packet, multiple packets, or any combination of the above. To handle this ambiguity, the Bitcoin protocol deliminates each packet with a sequence of “magic bytes”. Once we reach this magic sequence, we know that everything we’ve received up until that point constitutes a single packet.

My previous receive loop worked by maintaining a backlog of all incoming bytes up until the most recently received sequence of magic bytes. Every time a new message was received, it would append those incoming bytes to the backlog and chunk that binary into a sequence of packets, which could then be handled individually:


{messages, rest} = chunk(state.rest <> data)

case handle_messages(messages, state) do
  {:error, reason, state} -> {:disconnect, reason, %{state | rest: rest}}
  state -> {:noreply, %{state | rest: rest}}
end

This solution works, but there are quite a few moving pieces. Not only do we have to maintain a backlog of all recently received bytes, we also have to build out the functionality to split that stream of bytes into individual packets:


defp chunk(binary, messages \\ []) do
  case Message.parse(binary) do
    {:ok, message, rest} ->
      chunk(rest, messages ++ [message])

    nil ->
      {messages, binary}
  end
end

Thankfully, there’s a better way.

Taking Advantage of Payload Length

Every message sent through the Bitcoin protocol follows a specific format.

The first four bytes of every packet are reserved for the network’s magic bytes. Next, twelve bytes are reserved for the name of the command being sent across the network. The next four bytes hold the length of the payload being sent, followed by a four byte partial checksum of that payload.

These twenty four bytes can be found at the head of every message sent across the Bitcoin peer-to-peer network, followed by the variable length binary payload representing the meat and potatoes of the command being carried out. Relying on this structure can greatly simplify our receive loop.

By using :gen_tcp in “passive mode” (setting active: false), incoming TCP packets won’t be delivered to our current process as messages. Instead, we can ask for packets using a blocking call to :gen_tcp.recv/2. When requesting packets, we can even specify the number of bytes we want to receive from the incoming TCP stream.

Instead of receiving partial messages of unknown size, we can ask :gen_tcp for the next 24 bytes in the stream:


{:ok, message} <- :gen_tcp.recv(socket, 24)

Next, we can parse the received message bytes and request the payload’s size in bytes from our socket:


{:ok, %{size: size}} <- Message.parse(message),
{:ok, payload} <- :gen_tcp.recv(socket, size)

And now we can parse and handle our payload, knowing that it’s guaranteed to be a single, complete Bitcoin command sent across the peer-to-peer network.

Final Thoughts

There’s more than goes into the solution that I outlined above. For example, if we’re receiving a command like "verack", which has a zero byte payload, asking for zero bytes from :gen_tcp.recv/2 will actually return all of the available bytes it has in its TCP stream.

Complications included, I still think this new solution is superior to our old solution of maintaining and continually chunking an ongoing stream of bytes pulled off the network.

If you’re eager to see the full details of the new receive loop, check it out on Github!

I’d also like to thank Karl Seguin for inspiring me to improve our Bitcoin node using this technique. He posted a message on the Elixir Slack group about prefixing TCP messages with their length to easily determine how many bytes to receive:

I’d length prefix every message with 4 bytes and do two recvs, {:ok, <<length::big-32>>} = recv(socket, 4, TIMEOUT) {:ok, message} = recv(socket, length, TIMEOUT)

This one line comment opened my mind to the realization that the Bitcoin protocol was already doing this, and that I was overcomplicating the process of receiving messages.

Thanks Karl!