Shutting Down and Open Sourcing Inject Detect

Written by Pete Corey on Apr 2, 2018.

It’s with a heavy heart that I’m announcing that my security-focused SaaS application, Inject Detect, is shutting down.

The goal of Inject Detect was to fight against NoSQL Injection vulnerabilities in Meteor applications. I still believe that this is a worthy cause, but I don’t think the approach taken by Inject Detect was the right one.

I talked with many customers, and their primary concern with Inject Detect was the idea of sending their applications’ queries to a third-party service. No amount of explaining that only the structure of these queries, not the queries themselves, was being transmitted could assuage their worries.

It makes me happy to think that my customers’ focus on security dissuaded them from using an application focused on security, and it’s obvious, in hindsight, that this would be an issue.


Inject Detect was the largest Elixir-based project I’d ever worked on at the time it was released, and it was my first real foray into the world of Event Sourced systems. I invested nearly six months of my free time and time between client engagements working on Inject Detect, and I don’t want that work to go to waste.

With that in mind, I’ve decided to open source the Inject Detect project on GitHub. While you’re digging through the code, be sure to check out the InjectDetect.CommandHandler module and the InjectDetect.State module. These two modules are the heart of the system and the driving force behind my implementation of Event Sourcing.

Truth be told, I’m still in love with the concept of Event Sourcing, and I believe that it’s the future of web development. I plan on spending more time in the future diving into that topic.


While I’m shutting down Inject Detect, I’m not giving up the war against NoSQL Injection. Instead, I’m doubling down and focusing my efforts on my newest project, Secure Meteor.

Secure Meteor is an upcoming guide designed to help you secure your Meteor application by teaching you the ins and outs of Meteor security.

If you’re a Meteor application owner, a Meteor developer, or are just interested in Meteor security and NoSQL Injection, I highly recommend you head over to www.securemeteor.com and grab a copy of my Meteor security checklist.

RIP Inject Detect. Long live Secure Meteor!

Building Mixed Endian Binaries with Elixir

Written by Pete Corey on Mar 19, 2018.

I’ve never had much of a reason to worry about the “endianness” of my binary data when working on Elixir projects. For the most part, everything within an application will be internally consistent, and everything pulled in from external sources will be converted to the machine’s native ordering several layers of abstraction below where I tend to work.

That blissful ignorance came to an end when I found myself using Elixir to construct packets conforming to the Bitcoin peer-to-peer network protocol.

The Bitcoin Protocol

The Bitcoin protocol is a TCP-based protocol used by Bitcoin nodes to communicate over a peer-to-peer ad hoc network.

The real-world specifications of the protocol are defined to be “whatever the reference client does,” but this can be difficult to tease out from the code. Thankfully, the Bitcoin wiki maintains a fantastic technical description of the protocol.

The structures used throughout the protocol are a mishmash of endianness. As the wiki explains, “almost all integers are encoded in little endian,” but many other fields like checksums, strings, network addresses, and ports are expected to be big endian.

The net_addr structure is an excellent example of this endianness confusion. Both time and services are expected to be little endian encoded, but the IPv6/4 and port fields are expected to be big endian encoded.

How will we build this with Elixir?

First Attempt

My first attempt at constructing this net_addr binary structure was to create a net_addr function that accepts time, services, ip, and port arguments and returns a binary of the final structure in correct mixed-endian order.


def net_addr(time, services, ip, port) do
end

When manually constructing binaries, Elixir defaults to a big endian byte order. This means that I’d need to convert time and services into little endian byte order before adding them to the final binary.
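A quick way to see that default in action is to compare a constructed binary against its bytes written out by hand. This is a minimal sketch that can be pasted into iex; the value 1 is just an arbitrary example:

```elixir
# With no endianness modifier, an integer segment is encoded big endian:
# the most significant byte comes first.
default = <<1::32>>

# The same four bytes written out one at a time.
expected = <<0, 0, 0, 1>>

IO.inspect(default == expected)
```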

My first attempt at endian conversion was to create a reverse/1 helper function that would take a binary, transform it into a list of bytes using :binary.bin_to_list, reverse that list of bytes, transform it back into a binary using :binary.list_to_bin, and return the result:


def reverse(binary) do
  binary
  |> :binary.bin_to_list()
  |> Enum.reverse()
  |> :binary.list_to_bin()
end

Before I could pass time and services into reverse/1, I needed to transform them into binaries. Thankfully, this is easy with Elixir’s binary special form.

For example, we can convert time into a four-byte (32-bit) big endian binary and then reverse it to create its corresponding little endian representation:


reverse(<<time::32>>)
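To make the effect visible, here is a small standalone sketch. The ReverseSketch module name and the 0x01020304 value are assumptions chosen for illustration; the value simply makes each byte easy to spot:

```elixir
defmodule ReverseSketch do
  # Same list-based approach as reverse/1 above, wrapped in a
  # hypothetical module so it can run as a standalone script.
  def reverse(binary) do
    binary
    |> :binary.bin_to_list()
    |> Enum.reverse()
    |> :binary.list_to_bin()
  end
end

# An example value whose four bytes are 1, 2, 3 and 4.
time = 0x01020304

big = <<time::32>>
little = ReverseSketch.reverse(big)

IO.inspect({big, little})
```

The first element of the tuple holds the bytes in big endian order, and the second holds the same bytes reversed.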

Using our helper, we can create our final net_addr binary:


<<
  reverse(<<time::32>>)::binary,
  reverse(<<services::64>>)::binary,
  :binary.decode_unsigned(ip)::128,
  port::16
>>

This works, but there’s some room for improvement.

A Faster Second Attempt

After doing some research, I discovered this set of benchmarks for several different techniques of reversing a binary in Elixir (thanks Evadne Wu!).

I realized that I could significantly improve the performance of my packet construction process by replacing my slow list-based solution with a solution that leverages the optional Endianness argument of :binary.decode_unsigned/2 and :binary.encode_unsigned/2:


def reverse(binary) do
  binary
  |> :binary.decode_unsigned(:little)
  |> :binary.encode_unsigned(:big)
end
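One caveat with this version is worth checking: :binary.encode_unsigned/2 produces the shortest binary that can hold the integer, so an input that ends in zero bytes comes back shorter than it went in. The sketch below shows the effect alongside a width-preserving variant. The PaddedReverse module and reverse_padded/1 are hypothetical names, not part of the original code:

```elixir
defmodule PaddedReverse do
  # The decode/encode approach from above, unchanged.
  def reverse(binary) do
    binary
    |> :binary.decode_unsigned(:little)
    |> :binary.encode_unsigned(:big)
  end

  # Hypothetical variant: re-encode the integer into exactly as many
  # bits as the input had, so leading zero bytes are kept.
  def reverse_padded(binary) do
    bits = bit_size(binary)
    <<:binary.decode_unsigned(binary, :little)::size(bits)>>
  end
end

input = <<0, 0, 1, 0>>

# The trailing zero byte of the input is lost: three bytes come back.
IO.inspect(PaddedReverse.reverse(input))

# The padded variant returns all four bytes in reversed order.
IO.inspect(PaddedReverse.reverse_padded(input))
```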

While this was an improvement, I still wasn’t happy with my solution. Using my reverse/1 function meant that I had to transform my numbers into a binary before reversing them and ultimately concatenating them into the final binary. This nested binary structure was awkward and confusing.

After I asked for guidance on Twitter, the ElixirLang account reached out with some sage advice, pointing me toward the big and little modifiers.

Using Big and Little Modifiers

The big and little modifiers are binary special form modifiers, much like the bitstring and binary types. They can be used to specify the resulting endianness when coercing an integer, float, utf16 or utf32 value into a binary.
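As a rough illustration of how these modifiers behave across types, the following sketch encodes an integer, a float, and a UTF-16 code point both ways. The specific values are arbitrary examples:

```elixir
# Integers: the same 16-bit value in both byte orders.
big_int = <<0x0102::16-big>>
little_int = <<0x0102::16-little>>

# Floats: 1.0 as a 32-bit IEEE 754 value is 0x3F800000.
big_float = <<1.0::float-size(32)-big>>
little_float = <<1.0::float-size(32)-little>>

# UTF-16: the letter "a" is code point 0x0061.
big_utf16 = <<?a::utf16-big>>
little_utf16 = <<?a::utf16-little>>

IO.inspect({big_int, little_int})
IO.inspect({big_float, little_float})
IO.inspect({big_utf16, little_utf16})
```

In each pair, the second binary holds the bytes of the first in reverse order.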

For example, we can replace our calls reversing the time and services binaries in our final binary concatenation by simply appending little to the size of each:


<<
  time::32-little,
  services::64-little,
  :binary.decode_unsigned(ip)::128,
  port::16
>>

Awesome! That’s much easier to understand.

While Elixir defaults to a big endian format for manually constructed binaries, it doesn’t hurt to be explicit. We know that our ip and port should be big endian encoded, so let’s mark them that way:


<<
  time::32-little,
  services::64-little,
  :binary.decode_unsigned(ip)::128-big,
  port::16-big
>>
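Putting the pieces together, a complete round trip might look something like the sketch below. NetAddrSketch, serialize/4, and parse/1 are hypothetical names used for illustration and are not the BitcoinNetwork.Protocol.NetAddr implementation; the sketch also assumes ip arrives as a 16-byte binary:

```elixir
defmodule NetAddrSketch do
  # Build a 30-byte net_addr: little endian time and services,
  # big endian IP (given as a 16-byte binary) and port.
  def serialize(time, services, ip, port) when byte_size(ip) == 16 do
    <<
      time::32-little,
      services::64-little,
      :binary.decode_unsigned(ip)::128-big,
      port::16-big
    >>
  end

  # The same modifiers work in a pattern match, so parsing mirrors building.
  def parse(<<time::32-little, services::64-little, ip::binary-size(16), port::16-big>>) do
    %{time: time, services: services, ip: ip, port: port}
  end
end

# IPv4 127.0.0.1 mapped into IPv6 form (an example address).
ip = <<0::80, 0xFFFF::16, 127, 0, 0, 1>>

packet = NetAddrSketch.serialize(1, 1, ip, 8333)

IO.inspect(packet, limit: :infinity)
IO.inspect(NetAddrSketch.parse(packet))
```

Because the endianness lives in the segment modifiers, the parsing side needs no separate byte-swapping step.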

Beautiful.

Final Thoughts

I’m continually amazed by the quantity, diversity, and quality of the tooling that ships out of the box with Elixir and Erlang. Even when it comes to something as niche as low-level binary manipulation, Elixir’s tools are top notch.

If you want to see complete examples of the endian conversion code shown in this article, check out the BitcoinNetwork.Protocol.NetAddr module in my new bitcoin_network project on GitHub.

J's Low-level Obfuscation Leads to Higher Levels of Clarity

Written by Pete Corey on Mar 19, 2018.

After reading recent articles by Hillel Wayne and Jordan Scales, I’ve become fascinated with the J programming language. Trying to learn J has opened my mind to new ways of thinking about code.

One of the things I find most interesting about J is its ties to natural language, and its corresponding use of code constructs called “hooks” and “forks”.

Many people argue that J is a “write-only” language because of its extreme terseness and complexity of syntax. As a beginner, I’d tend to agree, but I’m starting to warm up to the idea that it might be more readable than it first lets on.

What is J?

Other developers far more knowledgeable than I have written fantastic introductions to the J programming language. I highly recommend you check out Hillel Wayne’s posts on hand writing programs in J and calculating burn rates in J. Also check out Jordan Scales’ posts on computing the Fibonacci numbers and Pascal’s triangle in J.

If you’re still not inspired, check out this motivating talk on design patterns vs anti-patterns in APL by Aaron Hsu, and watch Tracy Harms wax poetic about the mind-expanding power of consistency and adjacency in the J programming language.

Next be sure to check out the J Primer, J for C Programmers, and Learning J if you’re eager to dive into the nuts and bolts of the language.

If you’re only interested in a simple “TL;DR” explanation of J, just know that it’s a high-level, array-oriented programming language that follows in the footsteps of the APL programming language.

Language and J

One of the most interesting aspects of J is that each component of a J expression is associated with a grammatical part of speech.

For example, plain expressions of data like the number five (5), or the list of one through three (1 2 3) are described as nouns. The “plus” operator (+) is a verb because it describes an action that can be applied to a noun. Similarly, the “insert” or “table” modifier (/) is an adverb because it modifies the behavior of a verb.

Unlike the human languages I’m used to, J expressions are evaluated from right to left. The following expression is evaluated as “the result of three plus two, multiplied by five”:

   5 * 2 + 3
25

We can modify our + verb with our / adverb to create a new verb that we’ll call “add inserted between”, or more easily, “sum”:

   +/ 1 2 3
6

Interestingly, all J verbs can easily operate over different dimensions (or ranks) of data. The + verb will happily and intuitively (in most cases) work on everything from a single atom of data to a many-dimensioned behemoth of a matrix.

Hooks and Forks

In J, any string of two consecutive verbs is called a “hook”, and any string of three consecutive verbs is called a “fork”. Hooks and forks are used to reduce repetition and improve the readability of our code.

Let’s look at a fork:

   mean =: +/ % #

Here we’re defining mean to be a verb composed of the “sum” (+/), “divided by” (%), and “tally” (#) verbs. This grouping of three verbs creates a fork.

When you pass a single argument into a fork, the two outer verbs (+/ and # in this case) both operate on that argument, and both results are passed as arguments to the middle verb. The resulting value of applying this middle verb is the final result of the fork expression.

A monadic fork applied to a noun.

Let’s use our mean verb to average the numbers one through four:

   mean 1 2 3 4
2.5

Just as we’d expect, the average of 1 2 3 4 is 2.5.
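For readers more at home in Elixir, the argument routing of a monadic fork can be approximated with a higher-order function. This is only a sketch of the idea, not how J implements it; ForkSketch and fork/3 are hypothetical names, and Elixir lists stand in for J arrays:

```elixir
defmodule ForkSketch do
  # A monadic fork (f g h) applied to y computes g.(f.(y), h.(y)):
  # both outer verbs see y, and the middle verb combines their results.
  def fork(f, g, h) do
    fn y -> g.(f.(y), h.(y)) end
  end
end

sum = &Enum.sum/1      # plays the role of +/
divide = &(&1 / &2)    # plays the role of %
tally = &length/1      # plays the role of #

mean = ForkSketch.fork(sum, divide, tally)

IO.inspect(mean.([1, 2, 3, 4]))
```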


Now let’s try our hand at writing a hook. The routing of arguments within a monadic hook is slightly different from that of our monadic fork. Let’s consider this example:

   append_length =: , #

Here we’re defining an append_length verb that first applies “length” (#) to append_length’s argument, and then applies “append” (,) to append_length’s original argument and the result of applying the “length” verb.

A monadic hook applied to a noun.

All that is to say that append_length is a hook that calculates the length of the provided argument and appends it to the end of that argument:

   append_length 0
0 1
   append_length append_length 0
0 1 2
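The same routing can be sketched in Elixir as well. As before, this is an approximation under stated assumptions: HookSketch and hook/2 are hypothetical names, and the one-element list [0] stands in for J’s atom 0:

```elixir
defmodule HookSketch do
  # A monadic hook (u v) applied to y computes u.(y, v.(y)):
  # the right verb sees y, and the left verb combines y with that result.
  def hook(u, v) do
    fn y -> u.(y, v.(y)) end
  end
end

append = fn list, item -> list ++ [item] end   # plays the role of ,
tally = &length/1                              # plays the role of #

append_length = HookSketch.hook(append, tally)

IO.inspect(append_length.([0]))
IO.inspect(append_length.(append_length.([0])))
```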

I highly recommend checking out the guide on forks, hooks, and compound adverbs for a more complete explanation and overview of all of the hook and fork forms available to you as a J programmer.

Speaking with Forks

On first exposure, the application rules for hooks and forks struck me as confusing and disorienting. I found myself asking, “why can’t they just stick to the right-to-left evaluation order?” My eyes were opened to the expressive power of forks when I stumbled across this gem of a quote in Jordan Scales’ article on computing the Fibonacci numbers in J:

[Forks] help us read out our expressions like sentences. For instance, “the sum times the last” can be written as +/ * {:, and our argument is automatically passed to both sides as it needs to be.

Let that sink in. I’ll be the first to admit that the argument routing built into J’s fork and hook constructs isn’t immediately obvious or intuitive, but that low level obfuscation leads to a higher level of clarity. We can read our forks like English sentences.

Let’s try it out with our mean verb:

   mean =: +/ % #

Here we’re saying that mean “is” (=:) the “sum” (+/) “divided by” (%) the “tally” (#).

We can even apply that idea to more complex J expressions, like this tacit expression from Hillel Wayne’s article on hand writing programs in J:

   tacit =: (?@:$ #) { ]

Without context, this expression would have initially overwhelmed me. But armed with our new tools for parsing and reading hooks and forks, let’s see if we can tease some meaning out of it.

Let’s look at the expression in parentheses first. It’s actually a hook of two verbs, ?@:$ and #. Using a little J-fu, we can recognize the first verb as being “shape of random” (?@:$), and the second verb is “tally” (#), or “indices” in this context. Put together, the expression in parentheses is a verb that creates a “shape of random indices.”

All together, the tacit expression is a fork that reads, tacit “is” (=:) “shape of random indices” ((?@:$ #)) “from” ({) “the right argument” (]).

With arguments, this becomes a little more concrete. We want “a 3 4 shape of random indices from 'aaab'”.

   (3 4) tacit 'aaab'

And that’s just what we get:

abaa
aaab
abbb

Our tacit expression has constructed a 3 4 matrix filled with random values from the 'aaab' string.
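For comparison, here is a rough Elixir rendering of what the expression computes. It is a direct, spelled-out translation rather than a tacit one; TacitSketch and tacit/2 are hypothetical names, and a list of strings stands in for the J character matrix:

```elixir
defmodule TacitSketch do
  # Fill a rows-by-cols grid with characters picked at random from string.
  def tacit({rows, cols}, string) do
    chars = String.graphemes(string)
    # "tally": the number of characters we can pick from.
    n = length(chars)

    for _ <- 1..rows do
      for _ <- 1..cols, into: "" do
        # "random index" in 0..n-1, then "from" the right argument.
        Enum.at(chars, :rand.uniform(n) - 1)
      end
    end
  end
end

grid = TacitSketch.tacit({3, 4}, "aaab")

Enum.each(grid, &IO.puts/1)
```

The output varies from run to run, but it is always three rows of four characters drawn from the input string.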

Hooked on Hooks and Forks

I’m far from being an expert in J. To be honest, I’m not even sure I’d call myself a beginner. The truth is that J is a very hard language to learn. Despite its difficulty (or maybe because of it?), I’m enamored with the language.

That said, I’m not going to go out tomorrow and start writing all of my production projects in J. In fact, I don’t imagine ever writing any production code in J.

J may not give me any concrete tools that I can use in my day-to-day work as a software developer, but it’s teaching me new ways of approaching old problems. In a field where constant growth is required and expected, this is an invaluable gift.