Let's Get Personal

Written by Pete Corey on Dec 18, 2017.

Two and a half years ago I officially incorporated my first company and started my independent consulting career under the name “East5th, Inc.”

At that time, I really didn’t know what I wanted out of my business, or what it would ultimately grow into. When deliberating whether to go into business under my own name or under the guise of a company, I figured that working behind a fictional entity would be the safer choice.

Over the years, I’ve come to realize that this was the wrong choice for me.

Fun fact: I picked the name “East5th, Inc.” because:

  1. I lived on East 5th Avenue.
  2. My accountant suggested I tack “Inc.” to the end because, well, that’s just what you do.

Pros and Cons

You’re afforded certain freedoms when you do business as a fictional entity. You’re able to easily scale your business to multiple employees, which in turns lets you easily distance yourself from your business and eventually exit.

I don’t want either of these things.

When I first started out on my own, I imagined building a small team around myself and managing projects entirely in-house from start to finish. While I’ve done some of that and I do enjoy it, I’ve found that I do my best work when I deeply integrate myself into an existing team or project. I want to focus on the technical aspects of my work, not the complexities of managing a team.

I’ve also come to the realization that I don’t want an exit strategy. I’ve never been happier with my career than I am right now. I love the teams I’ve worked with over the years, and I love the projects we’ve worked on together. I love writing and sharing the things I’m excited about with the world.

I’m in this for life.

Good Night, East5th

All of that is a roundabout way of saying that I’ve made the decision to drop the “East5th” name. Effective immediately, this site will become www.petecorey.com. Old links to www.east5th.co will continue to work and redirect to the new domain. Additionally I’ve renamed my Medium account from “@east5th” to @petecorey.

I’m hoping this change will bring a me a new set of freedoms, like the freedom to be more honest, open, and authentic with my writings and projects.

Generating Sequences with Elixir Streams

Written by Pete Corey on Dec 11, 2017.

It’s almost Christmas, which means the 2017 edition of Advent of Code is officially upon us! While studying other people’s solutions to the first few problems, Sasa Juric‏’s use of streams stood out to me.

The resource/3 and the closely related iterate/2 and unfold/2 functions in the Stream module were completely unknown to me and seemed worthy of some deeper study.

After spending time researching and banging out a few examples, I realized that these functions are extremely useful and underrated tools for generating complex enumerable sequences in Elixir.

Let’s dive into a few examples to find out why!

Simple Sequences

The resource/3, iterate/2, and unfold/2 functions in the Stream module are all designed to emit some new values based on previous information. While they all accomplish the same task, they each have their own nuances.

Stream.iterate/2 is best suited for generating simple sequences. We’ll define a “simple sequence” as a sequence who’s next value can entirely be determined by its preview value.

For example, we can implement a simple incrementing sequence:


Stream.iterate(0, &(&1 + 1))

Taking the first 5 values from this stream (|> Enum.take(5)) gives us [0, 1, 2, 3, 4].

We ramp the complexity up a notch and generate a Mandelbrot sequence for a given value of c:


defmodule Mandelbrot do
  def sequence({cr, ci}) do
    Stream.iterate({0, 0}, fn {r, i} -> 
      # z₂ = z₁² + c
      {(r * r - i * i) + cr, (i * r + r * i) + ci}
    end)
  end
end

The value of z is entirely determined by the value of z and some constant.

Sequences with Accumulators

There are times when the next value in a sequence can’t be generated with the previous value alone. Sometimes we need to use other information we’ve built up along the way to generate the next value in our sequence.

A perfect example of this type of sequence is the Fibonacci sequence. To generate the nth value in the Fibonacci sequence, we need to know the previous value, n-1, and also the value before that, n-2.

This is where Stream.unfold/2 shines. The unfold/2 function lets us build up an accumulator as we generate each value in our sequence.

Here’s an stream that generates the Fibonacci sequence:


Stream.unfold({0, 1}, fn {a, b} -> {a, {b, a + b}} end)

Stream.unfold/2 takes the initial value of our accumulator and a next_fun function. Our next_fun function returns a tuple who’s first element is the value emitted by our stream, and who’s second element is the accumulator that will be passed into the next call to next_fun.

Taking the first 10 elements of this stream gives us a familiar sequence of numbers:


[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

For a more complex example of the power of Stream.unfold/2, check out how I implemented a stream that generates an integer spiral as part of an Advent of Code challenge.

Multiple Values per Accumulator

Sometimes we’re faced with a true doozy of a sequence, and even an accumulator isn’t enough to accurately describe the next value. For example, what if we need to emit multiple sequential values for each invocation of our next_fun?

Stream.resource/3 to the rescue!

The Stream.resource/3 function was originally intended to consume external resources, hence its name and start_fun and after_fun functions, but it also works beautifully for building sufficiently complex sequences.

Wolfram-style cellular automata, such as Rule 30, is a perfect example of this type of complex sequence.

To generate our Rule 30 sequence, we’ll start with an initial list of values (usually just [1]). Every time we call our next_fun, we want to calculate the entire next layer in one go (which would be [1, 1, 1]). Unfortunately, the size of these layers continues to grow (the next would be [1, 1, 0, 0, 1], etc…).

How would we write a stream that outputs each cell of each layer in order, and tells us the cell’s value and depth?


defmodule Rule30 do
  def stream(values) do
    Stream.resource(fn -> {values, 0} end, &next_fun/1, fn _ -> :ok end)
  end

  defp next_fun({values, layer}) do
    next = ([0, 0] ++ values ++ [0, 0])
    |> Enum.chunk_every(3, 1, :discard)
    |> Enum.map(&rule/1)
    {values |> Enum.map(&{layer, &1}), {next, layer + 1}}
  end

  defp rule([1, 1, 1]), do: 0
  defp rule([1, 1, 0]), do: 0
  defp rule([1, 0, 1]), do: 0
  defp rule([1, 0, 0]), do: 1
  defp rule([0, 1, 1]), do: 1
  defp rule([0, 1, 0]), do: 1
  defp rule([0, 0, 1]), do: 1
  defp rule([0, 0, 0]), do: 0
end

Stream.resource/3 makes short work of this potentially difficult problem. We start with our initial set of values, and a depth of 0. Each call to our next_fun tacks on a few buffering 0s to each side of our values list, and applies the rule function to each chunk of 3 values. Once we’ve finished calculating our next layer, we emit each value grouped with the current depth along with our updated accumulator tuple.

We can take the first 9 values from this stream and notice that they match up perfectly with what we would expect from Rule 30:


Rule30.stream([1])
|> Enum.take(9)
|> IO.inspect
# [                {0, 1},
#          {1, 1}, {1, 1}, {1, 1}, 
#  {2, 1}, {2, 1}, {2, 0}, {2, 0}, {2, 1}]

Final Thoughts

While we’ve mostly talked about mathematical sequences here, these and techniques ideas apply to any type of enumerative data that needs to be generated.

In closing, here are a few rules to remember when working with Stream.iterate/2, Stream.unfold/2, and Stream.resource/3:

  • Use Stream.iterate/2 when you want to emit one value per call to your next_fun function, and that value can be entirely generated from the previously emitted value.
  • Use Stream.unfold/2 when you want to emit one value per call to your next_fun function, but need additional, accumulated data to generate that value.
  • Use Stream.resource/3 when you want to emit multiple values per call to your next_fun function.

I highly encourage you to check out Elixir streams if you haven’t. The Stream module is full of ridiculously useful tools to help streamline your data pipeline.

Fleshing out URLs with Elixir

Written by Pete Corey on Dec 11, 2017.

My most recent side project, Affiliate Crawler, is a web crawler designed to find links in your written content that can be turned into affiliate links.

If you’re curious, be sure to read more about the how, what, and why behind Affiliate Crawler.

Affiliate Crawler works by being given a starting URL that points to your content. It crawls that page, looking for external links that can potentially be monetized, and internal links to crawl further.

Unfortunately, the Elixir-based crawler that powers Affiliate Crawler only works correctly when given fully qualified URLs, such as http://www.east5th.co/. Partial URLs, like www.east5th.co result in parsing errors and a general “failure to crawl”.

This is unfortunate because most people deal exclusively in partially formed URLs. You tell people to go to google.com, not http://www.google.com/. This seems to be especially true when prompted for URLs online.

What we need is a way to infer fully fleshed out URLs from user-provided partial URLs, just like we naturally would in conversation.

Thankfully, this is ridiculously easy thanks to Elixir’s URI module, pattern matching, and the awesome power of recursion!

The Parts of a Parsed URL

Elixir’s URI.parse/1 function accepts a uri string as an argument and returns a URI struct that holds the component pieces that make up that uri.

For example, parsing "http://www.east5th.co/" returns the following data:


%URI{authority: "www.east5th.co", fragment: nil, host: "www.east5th.co",
 path: "/", port: 80, query: nil, scheme: "http", userinfo: nil}

URI.parse/1 only works on fully fleshed out URLs. Attempting to parse a partially specified uri string, like "east5th.co", returns a mostly empty struct:


%URI{authority: nil, fragment: nil, host: nil, path: "east5th.co", port: nil,
 query: nil, scheme: nil, userinfo: nil}

The major pieces of missing information needed to properly parse this uri are the path, and the scheme. Given those two pieces of information, everything else can be inferred by URI.parse/1.

Thankfully, we can come up with some fairly reasonable defaults for both path and scheme. If no path is provided, as in a uri like "http://www.east5th.co", we could assume a default of "/". If no scheme is specified, like in "www.east5th.co/, we could assume a default of http.

But how do we know when and where to add these default values to our uri string?

Always prepending "http://" and appending "/" leads to obvious problems in most cases. Using a regex-based check sounds fragile and error-prone.

There has to be some other way.

Pattern Matching and Recursion

In turns out that pattern matching makes it incredibly easy to know wether we need to provide a default path or scheme. After our first pass through URI.parse/1, we can look for nil values in either path or scheme:


case URI.parse(uri) do
  %URI{scheme: nil} ->
    # Needs default scheme
  %URI{path: nil} ->
    # Needs default path
  %URI{} ->
    # Successfully parsed
end

But pattern matching alone doesn’t give us our solution…

But what if uri needs both a scheme and path, as in the case of "www.east5th.co"? And wouldn’t we still need to re-parse the newly fleshed out uri to populate the inferred fields in the URI struct like port, and authority?

Thankfully, wrapping our URI.parse/1 call in a function and calling that function recursively elegantly solves both problems:


defp parse(uri) do
  case URI.parse(uri) do
    %URI{scheme: nil} ->
      parse("http://#{url}")
    %URI{path: nil} ->
      parse("#{url}/")
    parsed ->
      parsed
  end
end

We define a function called parse and within that function we parse the provided uri string with a call to URI.parse/1.

If URI.parse/1 returns a URI struct without a scheme, we recursively call parse with "http://" prepended to the current uri.

Similarly, if URI.parse/1 returns a URI struct without a path, we recursively call parse with "/" appended to the current uri string.

Otherwise, we return the parsed URI struct generated from our fully-qualified URL.


Depending on your personal preference, we could even write our parse/1 function with our pattern matching spread across multiple function heads:


def parse(uri) when is_binary(uri), do: parse(URI.parse(uri))
def parse(uri = %URI{scheme: nil}), do: parse("http://#{to_string(uri)}")
def parse(uri = %URI{path: nil}),   do: parse("#{to_string(uri)}/")
def parse(uri),                     do: uri

Pattern matching and recursion, regardless of your preferred style, can lead to some truly beautiful code.

Final Thoughts

I’m continually amazed by the tools offered by the Elixir programming language straight out of the box. Pattern matching, a recursion-first philosophy, and even the incredibly useful URI module are examples of techniques and features that make your day-to-day development life easier when working this this language.

If you want to see how I used these tools to build fully-qualified URLs from partial user input, check out the Affiliate Crawler source code on Github, and check it out in action at Affiliate Crawler.