Node Vulnerability Scanners in a 1.3 World

Written by Pete Corey on Jun 20, 2016.

Meteor’s recent transition to using NPM modules has opened up a world of possibilities for Meteor developers. Unfortunately, with great power comes great responsibility.

Along with a host of new functionality, NPM packages also come with a world of vulnerabilities and security concerns. In fact, over 14% of all NPM modules have known vulnerabilities.

Node Security Project

Thankfully, there are teams and tools dedicated to tackling the problem of documenting and cataloging known Node.js module vulnerabilities. A very popular option for scanning and monitoring your NPM dependencies for known vulnerabilities is the Node Security Platform.

In its most basic form, NSP offers a command line tool that scans your package.json or your npm-shrinkwrap.json for known vulnerabilities.

Because all of your NPM dependencies are saved in the package.json file in your project root, using the nsp tool to scan your Meteor project for vulnerabilities is a simple process:


> cd $YOUR_METEOR_PROJECT
> nsp check
(+) No known vulnerabilities found

If NSP finds any vulnerable dependencies, you’ll be given more information and, hopefully, a patched version you can upgrade to in order to fix the issue.

Snyk

Snyk is another tool designed to find vulnerable NPM dependencies within your Node.js project. The Snyk command line tool can be used just like the NSP command line tool:


> cd $YOUR_METEOR_PROJECT
> snyk test
✓ Tested ... for known vulnerabilities, no vulnerabilities found.

Snyk even lets you test GitHub repositories or individual NPM modules using their web tool.

I’m a big fan of Snyk. Their VulnDB is built on top of the Node Security Project’s advisory database, and the Snyk team is taking strides to improve and build upon that great foundation. At the time of writing this article, Snyk has documented 105 Node.js vulnerabilities in their vulnerability database.

The Snyk team also regularly posts insightful blog posts about a variety of security topics.

Meteor Package Dependencies

While NSP and Snyk are great options for testing your base project’s NPM dependencies for known vulnerabilities, there are other avenues for vulnerable Node packages to find their way into your Meteor project.

Pre-1.3 Meteor projects relied on using Meteor packages to pull in NPM dependencies or using the meteorhacks:npm package to simulate direct dependencies within the base project. Both of these techniques obfuscate the actual NPM dependencies being used and make it difficult to scan them using traditional techniques.

Check out my post on Scanning Meteor Projects for Node Vulnerabilities for information on writing a bash script to dive into a Meteor project’s build bundle to call nsp check or snyk test on a project’s entire dependency tree.

Final Thoughts

Both the Node Security Platform and Snyk offer fantastic tools for scanning your Node.js and Meteor projects for known vulnerabilities. I recommend you pick one of these two tools and incorporate this type of vulnerability scanning into your development, deployment, and continuous integration workflow.

Using Snyk or NSP with a Meteor-specific vulnerability scanning tool such as Package Scan will help give you some peace of mind as you move forward developing fantastic Meteor applications.

NoSQL Injection and GraphQL

Written by Pete Corey on Jun 13, 2016.

It’s no secret that GraphQL has been making some serious waves in the developer community lately, and for good reason. It acts as an abstraction around your application’s data and seemingly gives any consumers of that data super powers. In my opinion, GraphQL is one of the most exciting pieces of technology to come out of the “Facebook stack”.

While the obvious benefits of GraphQL are exciting, in my mind the security repercussions of using GraphQL are even more amazing!

Because every query and mutation must correspond to a strictly typed schema (which is validated both on the client and server), GraphQL eradicates an entire class of injection vulnerabilities that I’ve spent so much time talking about.

An Intro to NoSQL Injection

If you’re a reader of this blog, you’re probably very familiar with the dangers of NoSQL injection. For a very quick crash course, let’s check out an example.

Imagine you have a Meteor publication that publishes a single item from the Foo collection based on an ID (_id). The ID of the desired Foo document is provided by the client when they subscribe to the publication.

Meteor.publish("foo", function(_id) {
  return Foo.find({ _id });
});

In the context of a Meteor application, _id is assumed to be a String. But because that assumption isn’t codified or asserted, our application can run into some serious trouble.

What would happen if a malicious user were to pass something other than a String into the "foo" publication’s _id argument? What if they made the following subscription:

Meteor.subscribe("foo", { $gte: "" });

By passing in an object that houses a MongoDB query operator, the malicious user can modify the intended behavior of the publication’s query. Because all of the IDs are greater than an empty string, all documents in the Foo collection would be published to the attacking user’s client.
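The effect is easy to see even outside of MongoDB: every string compares greater-than-or-equal to the empty string, so the injected selector matches every document. Here’s a rough simulation in plain JavaScript; the array stands in for the collection, and matches is a hypothetical helper that approximates the $gte selector, not a real Mongo API:

```javascript
// A stand-in "collection" of document IDs:
const ids = ["aX3k9", "bQ7m2", "cJ5n8"];

// A tiny, hypothetical approximation of Mongo's selector matching,
// supporting only exact matches and the $gte operator:
const matches = (id, selector) =>
  typeof selector === "object" ? id >= selector.$gte : id === selector;

// An honest subscription matches a single document:
console.log(ids.filter(id => matches(id, "aX3k9"))); // one match

// The malicious { $gte: "" } selector matches every document:
console.log(ids.filter(id => matches(id, { $gte: "" }))); // all three
```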


Hopefully that quick primer shows you how serious NoSQL injection can be. For more information on this type of vulnerability, check out some of my previous posts:

NoSQL Injection in Modern Web Applications
Why You Should Always Check Your Arguments
Rename Your Way to Admin Rights
DOS Your Meteor Application With Where
Meteor Security in the Wild
NoSQL Injection - Or, Always Check Your Arguments

Check to the Rescue - Kind of…

The recommended way of dealing with these types of vulnerabilities in a Meteor application is to use the check function to make assertions about user provided arguments to your Meteor methods and publications.

Going back to our original example, if we’re expecting _id to be a String, we should turn that expectation into an assertion. Using check, it’s as simple as adding a line to our publication:

Meteor.publish("foo", function(_id) {
  check(_id, String);
  return Foo.find({ _id });
});

Now, whenever someone tries to subscribe to "foo" and provides an _id argument that is not a String, check will throw an exception complaining about a type mismatch.

While using check correctly can prevent all instances of NoSQL vulnerabilities, it doesn’t come without its downsides.


Unfortunately, using check correctly can be a significant undertaking. Not only does it require that you explicitly check every argument passed into all of your methods and publications, but it requires that you remember to continue to do so for the lifetime of your application.

Additionally, you must remember to write exhaustive checks. Lax checks will only lead to pain down the road. For example, check(_id, Match.Any) or check(_id, Object) won’t prevent anyone from passing in a Mongo operator. Incomplete argument checks can be just as dangerous as no checks at all.
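To see why a lax pattern is dangerous, consider this stripped-down, hypothetical stand-in for check — not Meteor’s actual implementation, just an illustration of the assertion it performs:

```javascript
// A minimal, hypothetical stand-in for Meteor's check function,
// supporting only the String and Object patterns:
const check = (value, pattern) => {
  if (pattern === String && typeof value !== "string") {
    throw new Error("Match error: Expected string");
  }
  if (pattern === Object && typeof value !== "object") {
    throw new Error("Match error: Expected object");
  }
};

check("12345", String);      // passes: a plain string ID
check({ $gte: "" }, Object); // also passes: the lax Object pattern
                             // happily lets the Mongo operator through

try {
  check({ $gte: "" }, String); // throws: the strict pattern stops the operator
} catch (e) {
  console.log(e.message);
}
```

The strict String pattern rejects the operator object outright, while the lax Object pattern offers no protection at all.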

There are tools (east5th:check-checker, audit-argument-checks, aldeed:simple-schema) and patterns (Validated Methods) designed to overcome these shortcomings, but the truth is that check will always be a superfluous security layer that sits on top of your application.

Security Built In

What if instead of having our argument assertions be a superfluous layer slapped on top of our data access methods, they were a core and integral part of the system? What if it simply weren’t possible to write a method without having to write a thorough, complete and correct argument schema?

We know that we would be protected from NoSQL injection attacks, and because the assertion system is an integral part of the system, we know that our checks would always be up-to-date and accurate.

Enter GraphQL.

GraphQL allows you to define strongly-typed queries or mutations (similar to Meteor’s methods). The key word here is “strongly-typed”:

Given a query, tooling can ensure that the query is both syntactically correct and valid within the GraphQL type system before execution, and the server can make certain guarantees about the shape and nature of the response.

This means that every defined query or mutation must have a fully defined schema associated with it. Similar to our previous example, we could write a query that returns a Foo document associated with a user provided _id:

import { GraphQLNonNull, GraphQLString } from "graphql";

let FooQuery = {
  type: FooType,
  args: {
    _id: { type: new GraphQLNonNull(GraphQLString) }
  },
  resolve: function (_, { _id }) {
    return Foo.findOne(_id);
  }
};

After wiring FooQuery into our GraphQL root schema, we could invoke it with a query that looks something like this:

{
  foo(_id: "12345") {
    bar
  }
}

If we try to pass anything other than a String into the "foo" query, we’ll receive type errors and our query will not be executed:

{
  "errors": [
    {
      "message": "Argument \"_id\" has invalid value 54321.\nExpected type \"String\", found 54321.",
      ...

So we know that GraphQL requires us to write a schema for each of our queries and mutations, but can those schemas be incomplete, or so relaxed that they don’t provide any security benefits?

It is possible to provide objects as inputs to GraphQL queries and mutations through the use of the GraphQLInputObjectType. However, the fields defined within the input object must be fully fleshed out. Each field must either be a scalar, or a more complex type that aggregates scalars.

Scalars and Enums form the leaves in [request and] response trees; the intermediate levels are Object types, which define a set of fields, where each field is another type in the system, allowing the definition of arbitrary type hierarchies.

Put simply, this means that an input object will never have any room for wildcards, or potentially exploitable inputs. Partial checking of GraphQL arguments is impossible!
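The key is that validation happens at the leaves. Here’s a drastically simplified sketch of that leaf-level validation — not graphql-js’s actual implementation, just an illustration of why a String leaf in the schema leaves operator objects nowhere to hide:

```javascript
// A drastically simplified sketch of scalar-leaf validation. A "String"
// leaf in a GraphQL schema only ever accepts a primitive string, so an
// object like { $gte: "" } is rejected before the query executes:
const validateStringLeaf = (value) => {
  if (typeof value !== "string") {
    return [`Expected type "String", found ${JSON.stringify(value)}.`];
  }
  return []; // no errors
};

console.log(validateStringLeaf("12345"));      // no errors, query proceeds
console.log(validateStringLeaf({ $gte: "" })); // type error, query rejected
```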

The King is Dead…

So what does all of this mean, especially from a Meteor developer’s perspective? Unfortunately, when writing vanilla Meteor methods or publications, we’ll still have to stick with using either check or aldeed:simple-schema for making assertions on our arguments.

However, GraphQL is becoming a very real possibility in the Meteor ecosystem. If you choose to forgo the traditional Meteor data stack, you can start using GraphQL with your Meteor application today.

Additionally, the Meteor team has been taking strides with the Apollo stack. Apollo is “an incrementally-adoptable data stack that manages the flow of data between clients and backends.” Because Apollo is built on top of GraphQL, it inherently comes with all of the baked in security features we’ve discussed.

Another thing to remember is that everything we’ve talked about here relates to type-level checking in order to prevent a very specific type of NoSQL injection attack. It’s still your responsibility to ensure that all user provided input is properly sanitized before using it within your application.

No matter which data stack you land on, be sure to check all user provided inputs!

MongoDB With Serverless

Written by Pete Corey on Jun 6, 2016.

Last week I wrote about how excited I was about AWS Lambda’s pricing model. Fueled by that excitement, I spent some time this week experimenting on how I could incorporate Lambda functions into my software development toolbox.

As a Meteor developer, I’m fairly intimately associated with MongoDB (for better or worse). It goes without saying that any Lambda functions I write will most likely need to interact with a Mongo database in some way.

Interestingly, using MongoDB in a Lambda function turned out to be more difficult than I expected.

Leveraging Serverless

Rather than writing, deploying and managing my Lambda functions on my own, I decided to leverage one of the existing frameworks that have been built around the Lambda platform. With nearly nine thousand stars on its GitHub repository, Serverless seems to be the most popular platform for building Lambda functions.

Serverless offers several abstractions and seamless integrations with other AWS tools like CloudFormation, CloudWatch and API Gateway that help make the micro-service creation process very simple (once you wrap your head around the massive configuration files).

Using the tools Serverless provides, I was able to quickly whip up a Lambda function that was triggered by a web form submission to an endpoint. The script would take the contents of that form submission and store them in a MongoDB collection called "events":

"use strict";

import _ from "lodash";
import qs from "qs";
import { MongoClient } from "mongodb";

export default (event, context) => {

    // Parse the url-encoded form body and timestamp it:
    let parsed = _.extend(qs.parse(event), {
        createdAt: new Date()
    });

    MongoClient.connect(process.env.MONGODB, (err, db) => {
        if (err) { return context.done(err); }
        // Wait for the insert to finish before closing the connection:
        db.collection("events").insert(parsed, (err) => {
            db.close();
            context.done(err);
        });
    });

};

Unfortunately, while the process of creating my ES6-based MongoDB-using Lambda function with Serverless was painless, the deployment process turned out to be more complicated.

MongoDB Module Problems

Locally, I was using Mocha with a Babel compiler to convert my ES6 to ES5 and verify that my script was working as expected. However, once I deployed my script, I ran into problems.

After deploying, submitting a web form to the endpoint I defined in my project resulted in the following error:

{
  "errorMessage": "Cannot find module './binary_parser'",
  "errorType": "Error",
  "stackTrace": [
    "Function.Module._load (module.js:276:25)",
    "Module.require (module.js:353:17)",
    "require (internal/module.js:12:17)",
    "o (/var/task/_serverless_handler.js:1:497)",
    "/var/task/_serverless_handler.js:1:688",
    "/var/task/_serverless_handler.js:1:17260",
    "Array.forEach (native)",
    "Object.a.12../bson (/var/task/_serverless_handler.js:1:17234)",
    "o (/var/task/_serverless_handler.js:1:637)"
  ]
}

At some point during the deployment process, it looked like the "binary_parser" module (an eventual dependency of the "mongodb" module) was either being left behind or transformed beyond recognition, resulting in a broken Lambda function.

Over Optimized

After hours of tinkering and frantic Googling, I finally made the realization that the problem was with the serverless-optimizer-plugin. Disabling the optimizer and switching to using ES5-style JavaScript resulted in a fully-functional Lambda.

While I could have stopped here, I’ve grown very accustomed to writing ES6. Transitioning back to writing ES5-style code seemed like an unacceptable compromise.

While weighing the decision of forking and hacking on the serverless-optimizer-plugin to try and fix my problem, I discovered the serverless-runtime-babel plugin. This new plugin seemed like a promising alternative to the optimizer. Unfortunately, after removing the optimizer from my project and adding the babel plugin, I deployed my Lambda only to receive the same errors.

Webpack Saves the Day

Finally, I discovered the serverless-webpack-plugin. After installing the Webpack plugin, and spending some time tweaking my configuration file, I attempted to deploy my Lambda function…

Success! My ES6-style Lambda function deployed (albeit somewhat slowly), and successfully inserted a document into my MongoDB database!

PRIMARY> db.events.findOne({})
{
        "_id" : ObjectId("5751e06e1aba0e0100313db7"),
        "name" : "asdf",
        "createdAt" : ISODate("2016-06-03T19:54:22.139Z")
}

MongoDB With Lambda

While I still don’t fully understand how the optimizer or babel plugins were corrupting my MongoDB dependencies, I was able to get my ES6-style Lambda function communicating beautifully with a MongoDB database. This opens many doors for exciting future projects incorporating Lambda functions with Meteor applications.

Check out the full serverless-mongodb project on GitHub for a functional example.


While working on this project, some interesting ideas for future work came up. In my current Lambda function, I’m re-connecting to my MongoDB database on every execution. Connecting to a Mongo database can be a slow operation. By pulling this connection request out of the Lambda handler, the connection could be re-used if several executions happen in quick succession. In theory, this could result in significantly faster Lambda functions and lower costs.
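The idea can be sketched with a tiny connection memoizer. Note that memoizeConnection and the stub connect function below are illustrative helpers I’ve invented for this sketch, not mongodb or Lambda APIs; in a real handler you’d pass a function wrapping MongoClient.connect and skip calling db.close():

```javascript
// A generic memoizer for an async "connect" step: the first call pays
// the connection cost, and later calls during the same warm container
// reuse the cached result. `connect` is any (callback) => void opener.
const memoizeConnection = (connect) => {
  let cached = null;
  return (callback) => {
    if (cached) { return callback(null, cached); }
    connect((err, db) => {
      if (err) { return callback(err); }
      cached = db;
      callback(null, db);
    });
  };
};

// Simulated usage with a stub connect, counting how often we "connect":
let connections = 0;
const getDb = memoizeConnection((cb) => {
  connections += 1;
  cb(null, { name: "fake-db" });
});

getDb(() => {}); // first invocation connects...
getDb(() => {}); // ...a second "warm" invocation reuses the cache
console.log(connections); // the stub only connected once
```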

Finding explicit details on this kind of container sharing is difficult. The information that I’ve been able to find about it is incomplete at best, but it’s definitely an interesting area to look into.