Property Testing our Base58Check Encoder with an External Oracle

Written by Pete Corey on Feb 12, 2018.

Recently, we wrote a Base58Check encoder to power our Bitcoin private key and public address generator. Being the diligent developers that we are, we added a unit test to ensure that our encoder was working as we expected.

But was that enough?

Call me a coward, but relying on a solitary unit test based on a single example pulled from a wiki article doesn’t instill huge amounts of confidence in our solution.

Let’s thoroughly test our solution with the help of property-based testing tools and an external oracle!

Oracles and Property Testing

The Base58Check encoding algorithm has been implemented many times by many different developers. Wouldn’t it be great if we could automatically check our implementation against theirs?

We can!

In property-based testing vernacular, this is known as using an oracle. An oracle is another implementation of your solution that is known to be correct under some domain of inputs.

Thankfully, we have a perfect oracle in the form of the Bitcoin Explorer’s CLI tools. Bitcoin Explorer ships with a base58check-encode utility that Base58Check encodes any Base16 string with a given version byte:


> bx base58check-encode abc123 --version 0
17WWM7GLKg9

Given this oracle, we can thoroughly and concisely test our implementation with a single property. The primary desired property of our solution is that it should match the output of bx base58check-encode for all valid inputs.

Getting Comfortable with our Tools

Property testing is simple in concept, but more difficult in practice.

It’s easy to say that for any given binary and any given byte, the output of our solution should match the output of my oracle. Actually generating those inputs and coordinating those test executions is a whole different ball game.

Thankfully, the groundwork has already been laid for us, and there are plenty of Elixir-based property testing tools for us to choose from. For this exercise, let’s use StreamData.


To get our feet wet, let’s write a simple property test using StreamData that verifies the associative property of the Kernel.+/2 addition function:


property "addition is associative" do
  check all a <- integer(),
            b <- integer(),
            c <- integer() do
    l = Kernel.+(Kernel.+(a, b), c)
    r = Kernel.+(a, Kernel.+(b, c))
    assert l == r
  end
end

The property keyword defines our new property test with a short description of the property under test.

The check all block lets us define our automatically generated inputs and a function block that will use those inputs to make assertions about our property.

Put simply, we’re telling StreamData that we want three random integers: a, b, and c. For every set of a, b, and c, we want to verify that (a + b) + c equals a + (b + c).

StreamData does this by generating many (one hundred by default) random sets of a, b, and c and checking them against our assertions. If any assertion fails, StreamData will try to “shrink” the input set (a, b, and c, in this case) to the simplest possible failing test case and present it to us.
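To see shrinking in action, we could deliberately check a property that doesn’t hold. Subtraction, unlike addition, is not associative, so a property like this is guaranteed to fail:

```elixir
# A deliberately failing property. StreamData will report a
# counterexample and shrink it toward the smallest failing
# set of inputs it can find.
property "subtraction is associative (it isn't!)" do
  check all a <- integer(),
            b <- integer(),
            c <- integer() do
    assert (a - b) - c == a - (b - c)
  end
end
```

Running this test fails, and the counterexample StreamData reports after shrinking will be a small set of integers rather than whatever large random values first triggered the failure.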


> mix test
.

Finished in 0.06 seconds
1 property, 0 failures

Thankfully, addition is associative, and our test passes!

Consulting the Oracle

Now let’s take the training wheels off and write a property test for our Base58Check encoder against our external oracle.

First, we’ll define a new test block:


property "gives the same results as bx base58check-encode" do
end

Within our test, we’ll generate two random variables, key and version:


check all key <- binary(min_length: 1),
          version <- byte() do
end

We’re telling StreamData that key can be any non-empty binary, and that version can be any byte.

Now that we have our set of test data, we’ll need to get the result of encoding key with version using our own implementation of the Base58Check encoding algorithm:


result = Base58Check.encode(key, <<version>>)

Next, we’ll use Elixir’s System.cmd to call bx base58check-encode, passing in our Base16-encoded key string and our version byte:


oracle =
  System.cmd("bx", [
    "base58check-encode",
    Base.encode16(key),
    "--version",
    "#{version}"
  ])
  |> elem(0)
  |> String.trim()

Now all that’s left to do is to verify that our result matches the output of our oracle:


assert result == oracle

If StreamData detects any failures in this assertion, it will simplify key and version to the simplest failing case and report the failure to us.
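Assembled into a single test, the pieces we’ve written fit together like so:

```elixir
# The full oracle property: our encoder must agree with
# bx base58check-encode for every non-empty key and version byte.
property "gives the same results as bx base58check-encode" do
  check all key <- binary(min_length: 1),
            version <- byte() do
    result = Base58Check.encode(key, <<version>>)

    oracle =
      System.cmd("bx", [
        "base58check-encode",
        Base.encode16(key),
        "--version",
        "#{version}"
      ])
      |> elem(0)
      |> String.trim()

    assert result == oracle
  end
end
```

Note that this test requires the bx command line tool to be installed and available on your path.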

But thankfully, our implementation of the Base58Check encoding algorithm passes the test:


> mix test
.

Finished in 1.0 seconds
1 property, 0 failures

Final Thoughts

I won’t pretend to be a property testing expert. I’m just a guy who’s read a few articles and who’s hopped on board the hype train. That said, property testing was the perfect tool for this job, and I can see it being an incredibly useful tool in the future. I’m excited to incorporate it into my testing arsenal.

If you’re interested in property-based testing, I recommend you check out Fred Hebert’s PropEr Testing, and Hillel Wayne’s articles on hypothesis testing with oracle functions and property testing with contracts.

Lastly, if you’re interested in Bitcoin development, I encourage you to check out Andreas Antonopoulos’ Mastering Bitcoin.

Mining for Bitcoin Vanity Addresses with Elixir

Written by Pete Corey on Feb 5, 2018.

We previously worked through the process of generating a Bitcoin private address and translating it into a shareable public address using only the tools and libraries shipped with Elixir and Erlang.

The guiding force behind that article was Andreas Antonopoulos’ excellent Mastering Bitcoin book.

Let’s take another bite out of Mastering Bitcoin and implement the algorithm Andreas describes for “mining for vanity addresses” at the end of chapter four. After we implement the basic algorithm, we’ll add our Elixir special sauce and turn it into a fully parallelized procedure.

What is a Vanity Address?

The concept of a vanity address is simple. It’s a normal Bitcoin public address that contains some sequence of desired characters.

For example, a random Bitcoin public address might look like the following:


1HKz4XU7ENT46ztEzsT83jRezyiDjvnBV8

On the live network, Bitcoin addresses always begin with 1, but the remaining characters are entirely random.

A vanity address might look like this:


1pete7qrCiembh8AEf1zRP2zn6nDsLoHC

You’ll notice that the first five characters of this address are 1pete. This isn’t an accident! I’ve intentionally sought out a public address that begins with my name, Pete, so people know who they’re sending their large sums of Bitcoin to.

While the term “mining” sounds intimidating, the actual process of generating these vanity addresses is laughably simple.

How do you Mine Vanity Addresses?

“Mining,” in this context, is just another term for repeatedly doing something until some condition is met. As in, “keep digging until you find gold!”

We’ll mine our vanity public address by repeatedly generating a private key, transforming it into a public address, and checking if the resulting address matches our desired pattern.

That’s it!

Building that in Elixir should be a walk in the park. We’ll start off by creating a new VanityAddress module and stubbing out a generate_private_key/2 function:


defmodule VanityAddress do
  def generate_private_key(regex, version \\ <<0x00>>)
end

Our generate_private_key/2 function expects a regex which represents the pattern we’re trying to find in a vanity address (like ~r/^1pete/), and a version byte that will be used to indicate which network this Bitcoin address is intended for.

Within our generate_private_key/2 function, we’ll kick off the mining process by generating a random private key and transforming it into a public address:


private_key = PrivateKey.generate
public_address = PrivateKey.to_public_address(private_key, version)

If the public_address we generated matches the pattern we provided in our regex, we’ve successfully mined a vanity address! In that case, we’ll return the private_key. Otherwise, we’ll repeat the entire process with a recursive call to generate_private_key/2:


case public_address =~ regex do
  true -> private_key
  false -> generate_private_key(regex, version)
end

That’s all there is to it.
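Pieced together, and assuming the PrivateKey module from the previous article, the whole module reads:

```elixir
defmodule VanityAddress do
  # Repeatedly generate private keys until one produces a public
  # address matching the given regex. The version byte is threaded
  # through so the address we match against has the right prefix.
  def generate_private_key(regex, version \\ <<0x00>>) do
    private_key = PrivateKey.generate
    public_address = PrivateKey.to_public_address(private_key, version)

    case public_address =~ regex do
      true -> private_key
      false -> generate_private_key(regex, version)
    end
  end
end
```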

We can use our new generate_private_key/2 function in conjunction with the PrivateKey.to_public_address/2 function we built last time to view our newly mined vanity key:


VanityAddress.generate_private_key(~r/^1pete/)
|> PrivateKey.to_public_address

Congratulations; we’re miners!

Thinking in Parallel

The problem with our simple implementation of generate_private_key/2 is that it’s slow.

While it’s true that the mining algorithm is inherently slow, there are many optimizations we could make to the code we’ve written. The most obvious improvement that comes to mind when using a “process-oriented” programming language like Elixir is to parallelize the mining algorithm across multiple processes.

However, parallelizing our mining algorithm presents an interesting set of challenges.

Each individual call to generate_private_key/2 is completely synchronous and sequential. We won’t see much of a benefit by queuing up multiple concurrent calls to generate_private_key/2 on the same CPU core. That said, while we’re running generate_private_key/2 within a single process bound to a single CPU core, any other cores available to us are sitting idle.

Ideally, we could simultaneously run as many instances of our generate_private_key/2 execution as we have cores. The moment any of our parallel executions find a matching key, it would be returned to the caller.

Creating a Stream of Parallel Tasks

Elixir’s little known (to me) Task.async_stream/3 function is the tool we need to implement this functionality.

Task.async_stream/3 expects an enumerable as its first argument and a function as its second. It applies that function to each element of the enumerable concurrently, each in a new process.

If we squint our eyes a little, we can see that this gives us what we need. The “enumerable” we pass into Task.async_stream/3 will really be an infinite stream of zero-argument anonymous functions. Each of those anonymous functions simply calls generate_private_key/2.

We’ll use Stream.cycle/1 to create an infinite stream of these functions:


[fn -> generate_private_key(regex, version) end]
|> Stream.cycle

The function that we want to run in parallel simply executes each of those passed in anonymous functions, one at a time, each in its own process:


|> Task.async_stream(fn f -> f.() end)

This is where our parallelization happens. Each call to generate_private_key/2 is happening in a new process, and Elixir’s scheduler will spread each new process out over the available cores in the system.

By default, Task.async_stream/3 will run up to System.schedulers_online/0 parallel instances of our generate_private_key/2 execution, and System.schedulers_online/0 defaults to the number of available CPU cores in the system. This means we’ll always have one instance of generate_private_key/2 running on each of our cores.

Perfect!
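We can sanity check that behavior in isolation. In this small sketch, unrelated to mining, Task.async_stream/3 squares four numbers, one process per element, and hands back the results in input order:

```elixir
# Each element is squared in its own process. By default, results
# come back in input order, wrapped in {:ok, value} tuples, with
# at most System.schedulers_online() processes running at once.
result =
  1..4
  |> Task.async_stream(fn n -> n * n end, max_concurrency: System.schedulers_online())
  |> Enum.map(fn {:ok, n} -> n end)

# result is [1, 4, 9, 16]
```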

Filtering Our Stream

Task.async_stream/3 returns a stream that produces either an {:ok, value} tuple on success, or an {:exit, reason} tuple on failure. We don’t anticipate or care about failures in this situation, so we’ll nil them out with Stream.map/2:


|> Stream.map(fn
  {:ok, thing} -> thing
  _ -> nil
end)

Now we can use Stream.reject/2 to filter out any nil values from our mapped stream:


|> Stream.reject(&(&1 == nil))

Let’s wrap what we’ve done in a function called stream_private_keys/2 that accepts a regex and a version:


def stream_private_keys(regex, version \\ <<0x00>>) do
  [fn -> generate_private_key(regex, version) end]
  |> Stream.cycle
  |> Task.async_stream(fn f -> f.() end)
  |> Stream.map(fn
    {:ok, thing} -> thing
    _ -> nil
  end)
  |> Stream.reject(&(&1 == nil))
end

What we’re left with is a stream that will produce any number of valid Bitcoin vanity addresses for a given regex and version, using all of the available CPU cores on our system.

Putting Our Stream to Use

Our stream doesn’t actually do anything until we try to pull values out of it using a function from the Enum module. Let’s use Enum.take/2 to pull out three vanity Bitcoin addresses that match our desired pattern (123):


VanityAddress.stream_private_keys(~r/^123/)
|> Enum.take(3)
|> Enum.map(&PrivateKey.to_public_address/1)

["123avbA76Zk98jik3ymTHkjbjKKftAhJiZ",
 "123aGknGk5F2NPkB6e4e6pehVGL7gBR8az",
 "123hzDB1CxcyDwusfsb8Pfh3Ti2i4NQLGR"]

If we take a look at our CPU usage while our mining pipeline is chugging away, we’ll see that all of the CPUs on our machine are being fully utilized.

Success!

Final Thoughts

Spoiler alert: the process of mining for Bitcoin is nearly identical to mining for vanity addresses. Instead of hashing private keys and looking for a random leading string like 1pete, Bitcoin miners hash transaction data, looking for hashes that begin with some number of leading zeros corresponding to the current block difficulty.
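The analogy can be made concrete with a toy sketch. This is not real Bitcoin mining; it simply hashes a piece of data with an incrementing nonce until the hash begins with a given prefix, standing in for the block difficulty:

```elixir
defmodule ToyMiner do
  # Increment a nonce until the SHA-256 hash of data <> nonce
  # begins with the given prefix (our stand-in for difficulty).
  def mine(data, prefix \\ "00", nonce \\ 0) do
    hash =
      :crypto.hash(:sha256, data <> Integer.to_string(nonce))
      |> Base.encode16(case: :lower)

    if String.starts_with?(hash, prefix) do
      {nonce, hash}
    else
      mine(data, prefix, nonce + 1)
    end
  end
end
```

The longer the required prefix, the more nonces we have to grind through, which is exactly how raising the block difficulty slows down real miners.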

There’s a huge amount of pomp and circumstance around the term “mining”, but at its core, it’s an incredibly simple and approachable idea.

Be sure to check out the VanityAddress module in my hello_bitcoin project on Github, and if this kind of thing is at all interesting to you, I highly recommend you pick up a copy of Andreas Antonopoulos’ Mastering Bitcoin.

Hacking Prototypal Inheritance for Fun and Profit

Written by Pete Corey on Jan 29, 2018.

Every object in Javascript is built on top of a “prototype”. A prototype can be either another object or null. When an object’s prototype is another object, the first object can inherit fields from the second.

This type of prototypal inheritance is well-accepted and relatively well understood by most Javascript developers, but there are dangerous implications behind this kind of inheritance model.

Let’s dive into how we can hack prototypal inheritance for fun and profit!

How are Prototypes Created?

A Javascript object’s prototype is referenced through the hidden __proto__ field. This field can be found on every object in an application. The __proto__ field can either point to another object, or null if the current object has no prototype.

The two options for an object's prototype.

The prototype of an object can explicitly be set using Object.create, and passing in your desired prototype.


let p = { foo: 123 };
let a = Object.create(p);
let b = Object.create(null);

In this example, our new a object inherits the foo field from the p object being used as its prototype. Our new b object has no prototype, and inherits no fields.

Our two new objects and their prototype chains.

The prototype of an object can also be manually set through the __proto__ field:


let c = {};
c.__proto__ = { bar: 234 };

In this case, we replace the reference to c’s original prototype with a reference to a new object. We can now access the inherited bar field through c.

It's objects all the way down.

By default, all Javascript objects created through the literal notation point to Object.prototype as their prototype. Object.prototype is an object that holds helper functions like constructor, hasOwnProperty, and toString. Additionally, Object.prototype has a prototype of null.

This means that in addition to the bar field, our c object also has access to everything living in Object.prototype via its prototype’s prototype!
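We can demonstrate both hops of that chain directly, using the same c object from the snippet above:

```javascript
let c = {};
c.__proto__ = { bar: 234 };

// bar comes from c's immediate prototype...
console.log(c.bar); // 234

// ...while toString comes from that prototype's prototype: Object.prototype.
console.log(typeof c.toString); // "function"

// Two hops up the chain lands us on Object.prototype itself.
console.log(Object.getPrototypeOf(Object.getPrototypeOf(c)) === Object.prototype); // true
```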

Setting the Scene

Armed with this information, let’s think about how we can exploit a simple (read: contrived) Node.js application.

Let’s assume that we’re building an application using an Express-like framework. We’ve created one endpoint to update values in an in-memory key-value store:


const store = {
    cats: "rule",
    dogs: "drool"
};

app.post('/update/:key/:value', function(req, res) {
    let { key, value } = req.params;
    res.send(_.set(store, key, value));
});

The /update route is used to update our store with various facts. This route performs no authorization checks, as it’s intended to be used by unauthenticated clients.


We have another route, /restricted, that’s only intended to be used by authenticated, authorized users:


app.post('/restricted', function(req, res) {
    let user = getUser(req);
    if (!user || !user.isAdmin) {
        throw new Error("Not authorized!");
    }
    res.send("Permission granted!");
});

Let’s assume that the getUser function returns a user object based on a session token provided through req. Let’s also assume that the isAdmin field is set to true on administrator user objects, and unset on non-administrator user objects.

Hacking the Prototype

Now that the scene is set, imagine that we’re a normal, non-administrator, user of this application, and we want access to the /restricted endpoint.

Our calls to /restricted return a "Not authorized!" exception because our user object returned by getUser doesn’t have an isAdmin field. With no way of updating our admin flag, it seems we’re stuck.

Or are we?

Thankfully, our recent reading on prototypal inheritance has given us a flash of malevolent insight!

The /update endpoint is using Lodash’s _.set function to update the value of any field in our store, including nested fields. We can use this to our advantage. We quickly make a call to /update with a key of "__proto__.isAdmin", and a value of "true" (or any other truthy value), and try our restricted endpoint again:


Permission granted!

Victory! We’ve given ourselves access to a restricted endpoint by modifying an arbitrary object within our Javascript application!

But how did we do it?

Explaining the Magic

As we mentioned earlier, unless specifically created with a different prototype, all objects reference Object.prototype as their prototype. More specifically, all objects in an application share the same reference to the same instance of Object.prototype in memory.

If we can modify Object.prototype, we can effectively modify the fields inherited by all of the objects in our application.
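We can watch this happen in miniature with plain Javascript, no Lodash required. Writing through one object’s __proto__ changes what every other object inherits:

```javascript
let store = { cats: "rule" };
let user = { name: "Pete" };

// Writing through store.__proto__ actually writes to Object.prototype...
store.__proto__.isAdmin = "true";

// ...so a completely unrelated object now inherits an isAdmin field.
let inherited = user.isAdmin; // "true"

// Clean up the shared prototype so nothing else inherits the field.
delete Object.prototype.isAdmin;
```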

Our request to the /update endpoint, with a key of "__proto__.isAdmin", and a value of "true" effectively turned into this expression on our server:


_.set(store, "__proto__.isAdmin", "true")

This expression reaches into Object.prototype through the __proto__ field of our store and creates a new isAdmin field on that object with a value of "true". This change has far reaching consequences.

Everything is an admin!

After we update our “store”, every object that exists in our application now inherits an isAdmin field with a value of "true". This means that on retrieving our user object from getUser, it looks something like this:


{
  _id: 123,
  name: 'Pete',
  __proto__: {
    isAdmin: 'true',
    constructor: ...,
    hasOwnProperty: ...,
    toString: ...,
    ...
    __proto__: null
  }
}

Because our base user object has no isAdmin field of its own, trying to access isAdmin on this object results in the isAdmin field from the underlying Object.prototype object being returned. That inherited field holds the value "true", causing our server’s permission check to pass, and giving us access to juicy, restricted functionality.

In Reality

Obviously, this is a fairly contrived example. In the real world, this type of vulnerability wouldn’t present itself in such a simple way. That said, this vulnerability does exist in the real world. When it rears its head, it’s often incredibly ugly. Adding unexpected fields to every object in your system can lead to disastrous results.

For example, imagine a vulnerability like this existing in a Meteor application. Once the underlying Object.prototype is updated with superfluous fields, our entire application falls to pieces. Any queries made against our MongoDB collections fail catastrophically:

Exception while invoking method 'restricted' MongoError: 
  Failed to parse: { 
    find: "users", 
    filter: { 
      _id: "NktioYhaJMuKhbWQw", 
      isAdmin: "true" 
    }, 
    limit: 1, 
    isAdmin: "true" 
  }. Unrecognized field 'isAdmin'.

MongoDB fails to parse our query object with the added isAdmin fields, and throws an exception. Without being able to query our database, our application is dead in the water.

Fixing the Vulnerability & Final Thoughts

The fundamental fix for this issue is incredibly simple. Don’t trust user-provided data.

If a user is allowed to update a field on an object (or especially a nested field in an object), always whitelist the specific fields they’re allowed to touch. Never use user-provided data in a way that can deeply modify an object (any object) on the server.
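One possible shape for that fix is a hypothetical safeSet helper that only permits explicitly whitelisted, top-level keys, sketched here without Lodash:

```javascript
// Only these top-level keys may ever be written by a client.
const ALLOWED_KEYS = new Set(["cats", "dogs"]);

function safeSet(store, key, value) {
  // Reject anything outside the whitelist, including nested
  // paths like "__proto__.isAdmin".
  if (!ALLOWED_KEYS.has(key)) {
    throw new Error(`Illegal key: ${key}`);
  }
  store[key] = value;
  return store;
}

let store = { cats: "rule", dogs: "drool" };
safeSet(store, "cats", "reign");                   // fine
// safeSet(store, "__proto__.isAdmin", "true");    // would throw
```

Because the whitelist only contains plain field names, there’s no path by which user input can reach the prototype chain.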

If you’re interested in this kind of thing, I encourage you to check out my latest project, Secure Meteor! It’s an in-the-works guide designed to help you secure your past, present, and future Meteor applications. As a token of thanks for signing up, I’ll also send you a free Meteor security checklist!