Detecting NoSQL Injection

Written by Pete Corey on Jul 10, 2017.

The entire premise behind my latest project, Inject Detect, is that NoSQL Injection attacks can be detected in real-time as they’re being carried out against your application.

But how?

In this article, I’ll break down the strategy I’m using for detecting NoSQL Injection in MongoDB-based web applications.

At a high level, the idea is to build up a set of expected queries an application is known to make, and to use that set to detect unexpected queries that might be the result of a NoSQL Injection attack.

Let’s dig into the details.

Expected Queries

Every web application has a finite number of queries it can be expected to make throughout the course of its life.

For example, a shopping cart application might query for single orders by _id:


Orders.findOne(orderId);

Similarly, it might query for a number of orders created in the past three days:


Orders.find({createdAt: {$gte: moment().subtract(3, "days").toDate()}});

These queries aren’t limited to fetching data. When a user “deletes” an order, the application may want to set a flag on the order in question:


Orders.update({userId: this.userId}, {$set: {deleted: true}});

Each of these individual queries can be generalized based on the shape of the query and the type of data passed into it.

For example, we expect the Orders.findOne query to always be called with a String. Similarly, we expect the Orders.find query to be passed a Date for the createdAt comparison. Lastly, the Orders.update query should always be passed a String as the userId.


Orders.findOne({_id: String});

Orders.find({createdAt: {$gte: Date}});

Orders.update({userId: String}, ...);

An application might make thousands of queries per day, but each query will match one of these three generalized query patterns.
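To make the idea of generalizing a query concrete, here is a minimal sketch in plain JavaScript. The helper names generalize and queryPattern are hypothetical, invented for illustration, and this is not necessarily how Inject Detect does it. The sketch replaces every value in a query with the name of its type, and leaves field names and MongoDB operators untouched.

```javascript
// Hypothetical helper (an assumption for illustration): replace every value
// in a query with the name of its type, keeping field names and operators.
function generalize(value) {
    if (Array.isArray(value)) {
        return value.map((item) => generalize(item));
    }
    if (value instanceof Date) {
        return "Date";
    }
    if (value === null) {
        return "Null";
    }
    if (typeof value === "object") {
        const shape = {};
        // Sort keys so {a, b} and {b, a} produce the same pattern.
        for (const key of Object.keys(value).sort()) {
            shape[key] = generalize(value[key]);
        }
        return shape;
    }
    const name = typeof value; // "string", "number", "boolean", ...
    return name.charAt(0).toUpperCase() + name.slice(1);
}

// Hypothetical helper: Orders.findOne(orderId) is shorthand for
// Orders.findOne({_id: orderId}), so normalize a bare string first.
function queryPattern(collection, method, selector) {
    const normalized = typeof selector === "string" ? { _id: selector } : selector;
    return JSON.stringify({ collection, method, selector: generalize(normalized) });
}

console.log(queryPattern("Orders", "findOne", "8sKfP2"));
// {"collection":"Orders","method":"findOne","selector":{"_id":"String"}}
console.log(queryPattern("Orders", "find", { createdAt: { $gte: new Date() } }));
// {"collection":"Orders","method":"find","selector":{"createdAt":{"$gte":"Date"}}}
```

Serializing the pattern to a string is just a convenient way to compare two patterns for equality.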

Unexpected Queries

If our application makes a query that does not fall into this set of expected queries, we’re faced with one of two possibilities:

  1. We left a query out of our set of expected queries.
  2. Our application has a NoSQL Injection vulnerability that is being exploited.

Imagine our application makes the following query:


Orders.findOne({_id: { $gte: "" }});

A query of this pattern (Orders.findOne({_id: {$gte: String}})) doesn’t exist in our set of expected queries. This means that this is either an expected query that we missed, or our application is being exploited.

It’s unlikely that our application is trying to find a single Order whose _id is greater than or equal to an empty string.

In this case, it’s far more likely that someone is exploiting our Orders.findOne({_id: String}) query and passing in an orderId containing a MongoDB query operator ({$gte: ""}) rather than an expected String.

We’ve detected NoSQL Injection!

By watching for queries that fall outside our set of expected queries, we can detect NoSQL Injection as it happens.
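The check itself can be sketched in a few lines. This is a simplified illustration, not Inject Detect’s implementation: toPattern and createDetector are hypothetical names, and for brevity the sketch compares selectors only, ignoring the collection and method a real detector would also need to track.

```javascript
// Hypothetical helper: turn a selector into a pattern string by swapping
// each value for its type name. A regular function is used as the replacer
// so that this[key] exposes the raw value before Date#toJSON is applied.
function toPattern(selector) {
    return JSON.stringify(selector, function (key, value) {
        const raw = this[key];
        if (raw instanceof Date) return "Date";
        if (raw !== null && typeof raw === "object") return raw;
        return typeof raw;
    });
}

// Build a detector from example selectors the application is known to use.
function createDetector(expectedSelectors) {
    const expected = new Set(expectedSelectors.map(toPattern));
    return function isExpected(selector) {
        return expected.has(toPattern(selector));
    };
}

const isExpected = createDetector([
    { _id: "any string" },
    { createdAt: { $gte: new Date() } },
    { userId: "any string" },
]);

console.log(isExpected({ _id: "8sKfP2" }));       // true
console.log(isExpected({ _id: { $gte: "" } }));   // false: possible injection
```

Anything that comes back false is either a query we forgot to register, or a query worth investigating.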

Similar Expected Queries

Basing our NoSQL Injection detection strategy around expected and unexpected queries has an added bonus.

Because we have a set of all expected queries for a given application, unexpected queries that are the result of NoSQL Injection attacks can often be linked back to the query being exploited.

To illustrate, in our previous example we detected an unexpected query against our application:


Orders.findOne({_id: {$gte: ""}});

Inspecting our set of expected queries, it’s obvious that the most similar expected query is the Orders.findOne query:


Orders.findOne({_id: String});

As the application owner, we know that we need to enforce more stringent checks on the type of the provided orderId.

Based on this information, an application owner or developer can deduce which specific query is being exploited within their application and respond quickly with a fix.
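One rough way to automate that lookup is to compare the fields each query touches. The sketch below is an assumption on my part rather than a description of how Inject Detect ranks queries: fieldPaths, similarity, and mostSimilar are hypothetical helpers, and the score is a simple Jaccard overlap of field paths with operator keys such as $gte ignored. A real implementation would also limit the candidates to the same collection and method.

```javascript
// Hypothetical helper: collect the field paths a selector touches,
// skipping operator keys (those starting with "$").
function fieldPaths(selector, prefix = "", paths = new Set()) {
    if (Array.isArray(selector)) {
        selector.forEach((item) => fieldPaths(item, prefix, paths));
    } else if (selector !== null && typeof selector === "object" && !(selector instanceof Date)) {
        for (const [key, value] of Object.entries(selector)) {
            if (key.startsWith("$")) {
                // An operator does not name a field; keep walking under the same prefix.
                fieldPaths(value, prefix, paths);
            } else {
                const path = prefix ? `${prefix}.${key}` : key;
                paths.add(path);
                fieldPaths(value, path, paths);
            }
        }
    }
    return paths;
}

// Jaccard overlap of the two sets of field paths: shared / total.
function similarity(a, b) {
    const pathsA = fieldPaths(a);
    const pathsB = fieldPaths(b);
    const union = new Set([...pathsA, ...pathsB]);
    if (union.size === 0) return 0;
    let shared = 0;
    for (const path of pathsA) {
        if (pathsB.has(path)) shared += 1;
    }
    return shared / union.size;
}

// Return the expected selector that overlaps most with the unexpected one.
function mostSimilar(unexpected, expectedSelectors) {
    let best = null;
    let bestScore = -1;
    for (const candidate of expectedSelectors) {
        const score = similarity(unexpected, candidate);
        if (score > bestScore) {
            best = candidate;
            bestScore = score;
        }
    }
    return best;
}

const expectedQueries = [
    { _id: "String" },
    { createdAt: { $gte: "Date" } },
    { userId: "String" },
];
console.log(mostSimilar({ _id: { $gte: "" } }, expectedQueries)); // { _id: 'String' }
```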

Final Thoughts

Progress on Inject Detect continues to move along at a steady pace. Once finished, Inject Detect will automatically apply this type of real-time NoSQL Injection detection to every query made by your Meteor application.

If you’re interested in learning more, be sure to sign up for the Inject Detect newsletter.

What is NoSQL Injection?

Written by Pete Corey on Jul 3, 2017.

Progress on Inject Detect continues to chug along. I’ve been working on building out an educational section to hold a variety of articles and guides designed to help people better understand all things NoSQL Injection.

This week I put the finishing touches on two new articles: “What is NoSQL Injection?”, and “How do you prevent NoSQL Injection?”.

For posterity, I’ve included both articles below.

What is NoSQL Injection?

NoSQL Injection is a security vulnerability that lets attackers take control of database queries through the unsafe use of user input. It can be used by an attacker to:

  • Expose unauthorized information
  • Modify data
  • Escalate privileges
  • Take down your entire application

Over the past few years, we’ve worked with many teams building amazing software with Meteor and MongoDB. But to our shock and dismay, we’ve found NoSQL Injection vulnerabilities in nearly all of these projects.

An Example Application

Let’s make things more real by introducing an example to help us visualize how NoSQL Injection can occur, and the impact it can have on your application.

Imagine that our application accepts a username and a password hash from users attempting to log into the system. We check if the provided username/password combination is valid by searching for a user with both fields in our MongoDB database:


Meteor.methods({
    login(username, hashedPassword) {
        return Meteor.users.findOne({ username, hashedPassword });
    }
});

If the user provided a valid username, and that user’s corresponding hashedPassword, the login function will return that user’s document.

Exploiting Our Application

In this example, we’re assuming that username and hashedPassword are strings, but we’re not explicitly making that assertion anywhere in our code. A user could potentially pass up any type of data from the client, such as a string, a number, or even an object.

A particularly clever user might pass up "admin" as their username, and {$gte: ""} as their password. This combination would result in our login method making the following query:


db.users.findOne({ username: "admin", hashedPassword: {$gte: ""}})

This query will return the first document it finds with a username of "admin" and a hashed password that is greater than or equal to an empty string. Because every string satisfies that comparison, the admin user’s document will be returned by this query regardless of their password.

Our clever user has successfully bypassed our authentication scheme by exploiting a NoSQL Injection vulnerability.
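To see why the injected operator matches, here is a toy stand-in for MongoDB’s query matching, written in plain JavaScript. The matches function is hypothetical and supports only equality and $gte on values of the same type, and the stored hash is a made-up value. It is only meant to show that every string compares greater than or equal to the empty string.

```javascript
// Toy matcher (an assumption for illustration, not MongoDB's implementation):
// supports exact equality and {$gte: value} on same-typed values only.
function matches(doc, selector) {
    return Object.entries(selector).every(([field, condition]) => {
        const value = doc[field];
        if (condition !== null && typeof condition === "object" && "$gte" in condition) {
            return typeof value === typeof condition.$gte && value >= condition.$gte;
        }
        return value === condition;
    });
}

// Made-up user document for the example.
const admin = { username: "admin", hashedPassword: "5f4dcc3b5aa765d61d8327deb882cf99" };

// A wrong password supplied as a string does not match...
console.log(matches(admin, { username: "admin", hashedPassword: "wrong" }));        // false
// ...but the injected operator matches any string at all.
console.log(matches(admin, { username: "admin", hashedPassword: { $gte: "" } }));   // true
```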

How do you prevent NoSQL Injection?

In our previous example, our code was making the assumption that the user-provided username and hashedPassword were strings. We ran into trouble when a malicious user passed up a MongoDB query operator as their hashedPassword.

Speaking in broad strokes, NoSQL Injection vulnerabilities can be prevented by making assertions about the types and shapes of your user-provided arguments. Instead of simply assuming that username and hashedPassword were strings, we should have made that assertion explicit in our code.

Using Checks

Meteor’s Check library can be used to make assertions about the type and shape of user-provided arguments. We can use check in our Meteor methods and publications to make sure that we’re dealing with expected data types.

Let’s secure our login method using Meteor’s check library:


Meteor.methods({
    login(username, hashedPassword) {
        check(username, String);
        check(hashedPassword, String);
        return Meteor.users.findOne({ username, hashedPassword });
    }
});

If a user passes in a username or a password that is anything other than a string, one of the calls to check in our login method will throw an exception. This simple check stops NoSQL Injection attacks dead in their tracks.

Using Validated Methods

Meteor also gives us the option of writing our methods as Validated Methods. Validated methods incorporate this type of argument checking into the definition of the method itself.

Let’s implement our login method as a validated method:


new ValidatedMethod({
    name: "login",
    validate: new SimpleSchema({
        username: String,
        hashedPassword: String
    }).validator(),
    run({ username, hashedPassword }) {
        return Meteor.users.findOne({ username, hashedPassword });
    }
});

The general idea here is the same as our last example. Instead of using check, we’re using SimpleSchema to make assertions about the shape and types of our method’s arguments.

If a malicious user provides a username or a hashedPassword that is anything other than a string, the method will throw an exception, preventing the possibility of NoSQL Injection attacks.

Distributed Systems Are Hard

Written by Pete Corey on Jun 26, 2017.

As I dive deeper and deeper into the world of Elixir and distributed systems in general, I’ve been falling deeper and deeper into a personal crisis.

I’ve been slowly coming to the realization that just about every production system I’ve worked on or built throughout my career is broken in one way or another.

Distributed systems are hard.

Horizontal Scaling is Easy, Right?

In the past, my solution to the problem of scale has always been to scale horizontally. By “scale horizontally”, I mean spinning up multiple instances of your server processes, either across multiple CPUs, or multiple machines, and distributing traffic between them.

As long as my server application doesn’t persist in-memory state across sessions, or persist anything to disk, it’s fair game for horizontal scaling. For the most part, this kind of shoot-from-the-hip horizontal scaling works fairly well…

Until it doesn’t.

Without careful consideration and deliberate design, “split it and forget it” scaling will eventually fail. It may not fail catastrophically - in fact, it will most likely fail in subtle, nuanced ways. But it will always fail.

This is the way the world ends
Not with a bang but a whimper.

Let’s take a look at how this type of scaling can break down and introduce heisenbugs into your system.

Scaling in Action

For the sake of discussion, imagine that we’re building a web application that groups users into teams. A rule, or invariant, of our system is that a user can only be assigned to a single team at a time.

Our system enforces this rule by checking if a user already belongs to a team before adding them to another:


function addUserToTeam(userId, teamId) {
    if (Teams.findOne({ userIds: userId })) {
        throw new Error("Already on a team!");
    }
    Teams.update({ _id: teamId }, { $push: { userIds: userId } });
}

This seems relatively straightforward, and has worked beautifully in our small closed-beta trials.

Great! Over time, our Team Joiner™ application becomes very popular.

To meet the ever growing demand of new users wanting to join teams, we begin horizontally scaling our application by spinning up more instances of our server. However, as we add more servers, mysterious bugs begin to crop up…

Users are somehow, under unknown circumstances, joining multiple teams. That was supposed to be a premium feature!

With Our Powers Combined

The root of the problem stems from the fact that we have two (or more) instances of our server process running in parallel, without accounting for the existence of the other processes.

Imagine a scenario where a user, Sue, attempts to join Team A. Simultaneously, an admin user, John, notices that Sue isn’t on a team and decides to help by assigning her to Team B.

Sue’s request is handled entirely by Server A, and John’s request is handled entirely by Server B.

Diagram of conflict between Server A and Server B.

Server A begins by checking if Sue is on a team. She is not. Just after that, Server B also checks if Sue is on a team. She is not. At this point, both servers think they’re in the clear to add Sue to their respective team. Server A assigns Sue to Team A, fulfilling her request. Meanwhile, Server B assigns Sue to Team B, fulfilling John’s request.

Interestingly, both servers do their jobs flawlessly individually, while their powers combined put the system in an invalid, unpredictable, and potentially unrecoverable state.


The issue here is that between the point in time when Server B verifies that Sue is not on a team and the point when it assigns her to Team B, the state of the system changes.

Server B carries out its database update operating under the assumptions of old, stale data. The server process isn’t properly designed to handle, or even recognize, these types of conflicting updates.
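The interleaving can be reproduced deterministically in a single process. In the sketch below, a plain object stands in for the Teams collection, and a generator function stands in for a server handling a request. Both are assumptions made purely for illustration. The yield marks the gap between the check and the update, where another server is free to run.

```javascript
// In-memory stand-in for the Teams collection (illustrative only).
const teams = { A: [], B: [] };

// Same logic as addUserToTeam, written as a generator so we can pause it
// between the check and the update and control the interleaving by hand.
function* addUserToTeam(userId, teamId) {
    const alreadyOnTeam = Object.values(teams).some((ids) => ids.includes(userId));
    if (alreadyOnTeam) {
        throw new Error("Already on a team!");
    }
    yield; // another server may run here
    teams[teamId].push(userId);
}

const serverA = addUserToTeam("sue", "A"); // Sue's own request
const serverB = addUserToTeam("sue", "B"); // John's request

serverA.next(); // Server A checks: Sue is not on a team
serverB.next(); // Server B checks: Sue is still not on a team
serverA.next(); // Server A adds Sue to Team A
serverB.next(); // Server B adds Sue to Team B, based on stale data

console.log(teams); // { A: [ 'sue' ], B: [ 'sue' ] }
```

Neither call throws, and yet the invariant is broken.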

Interestingly (and horrifyingly), this isn’t the only type of bug that can result from this type of haphazard scaling.

Check out the beginning of Nathan Herald’s talk from this year’s ElixirConf EU to hear about all of the fantastic ways that distributed systems can fail.

Handling Conflicts

While this specific problem is somewhat contrived and could be easily fixed by a database schema that more accurately reflects the problem we’re trying to solve (by keeping teamId on the user document), it serves as a good platform to discuss the larger issue.
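As a rough sketch of that schema change, the check and the update can be collapsed into a single conditional update on the user document. The Users argument below is an assumption: any object with an update(selector, modifier) method that returns the number of modified documents, which is how Meteor’s server-side collection.update behaves. The tiny fake collection exists only so that the example runs on its own, and understands nothing beyond this one selector.

```javascript
// Assign a team only if the user does not already have one. The database
// applies the selector and the modifier to a single document atomically,
// so there is no gap between "check" and "update" for another server to use.
function addUserToTeam(Users, userId, teamId) {
    const modified = Users.update(
        { _id: userId, teamId: { $exists: false } },
        { $set: { teamId } }
    );
    if (modified === 0) {
        // Either the user is already on a team, or no such user exists.
        throw new Error("Already on a team!");
    }
}

// Fake collection for illustration only. It handles exactly the selector
// and modifier used above and nothing else.
function createFakeUsers(docs) {
    return {
        update(selector, modifier) {
            const doc = docs.find((d) => d._id === selector._id && !("teamId" in d));
            if (!doc) return 0;
            Object.assign(doc, modifier.$set);
            return 1;
        },
    };
}

const userDocs = [{ _id: "sue" }];
const FakeUsers = createFakeUsers(userDocs);

addUserToTeam(FakeUsers, "sue", "A"); // succeeds
try {
    addUserToTeam(FakeUsers, "sue", "B"); // rejected: Sue already has a team
} catch (error) {
    console.log(error.message); // Already on a team!
}
console.log(userDocs); // [ { _id: 'sue', teamId: 'A' } ]
```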

Distributed systems are hard.

When building distributed systems, you need to be prepared to work with data that may be inconsistent or outdated. Conflicts should be an expected outcome that is designed into the system and strategically planned for.

This is part of the reason I’ve gravitated towards an Event Sourcing approach for my latest project, Inject Detect.

Events can be ordered sequentially in your database, and you can make assertions (with the help of database indexing) that the event you’re inserting immediately follows the last event you’ve seen.
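Here is a minimal in-memory sketch of that assertion. It is not Inject Detect’s implementation: the EventStore class, its append method, and the stream and event names are all hypothetical, and a Set stands in for a unique database index on the stream and version pair. Each writer states the last version it has seen, and the append fails if another writer has already claimed the next version.

```javascript
// Hypothetical in-memory event store. The Set plays the role of a unique
// database index on (streamId, version).
class EventStore {
    constructor() {
        this.events = [];
        this.uniqueIndex = new Set();
    }

    // Append an event that must immediately follow lastSeenVersion.
    append(streamId, lastSeenVersion, event) {
        const version = lastSeenVersion + 1;
        const key = `${streamId}:${version}`;
        if (this.uniqueIndex.has(key)) {
            // Someone else already wrote this version; our view is stale.
            throw new Error(`Conflict on ${streamId} at version ${version}`);
        }
        this.uniqueIndex.add(key);
        this.events.push({ streamId, version, ...event });
        return version;
    }
}

const store = new EventStore();

// Both servers have seen version 0 of Sue's stream.
store.append("user-sue", 0, { type: "joined_team", teamId: "A" }); // Server A wins
try {
    store.append("user-sue", 0, { type: "joined_team", teamId: "B" }); // Server B is stale
} catch (error) {
    console.log(error.message); // Conflict on user-sue at version 1
}
```

The losing server now knows its data was stale, and can reload the stream and decide whether its request still makes sense.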

We’ll dive into the details surrounding this type of solution in future posts.

Final Thoughts

Wrapping up, I feel like this article ranks high in fear-mongering and low in actionable value. That definitely isn’t my intention.

My goal is to show that working with distributed systems is unexpectedly hard. The moment you add a second CPU or spin up a new server instance, you’re entering a brave new (but really, not so new) world of computing that requires you to more deeply consider every line of code you write.

I encourage you to re-examine projects and code you’ve written that exist in a distributed environment. Have you ever experienced strange bugs that you can’t explain? Are there any race conditions lurking there that you’ve never considered?

Is your current application ready to be scaled horizontally? Are you sure?

In the future, I hope to write more actionable articles about solving these kinds of problems. Stay tuned for future posts on how Event Sourcing can be used to write robust, conflict-free distributed systems!