Suffixer! Find Meaningful Unregistered Domains

Written by Pete Corey on Feb 2, 2015.

Early last week I released a small web app, Suffixer! The idea behind Suffixer is to compile a database of all words that can be created with registerable Top Level Domains as their suffixes. You can then search through that database and quickly determine if any of the resulting domains are available for registration.

Like most of my recent projects, Suffixer was built with Meteor. You can find the source on GitHub.

Building Suffixer’s Database

Suffixer was built using data from Wiktionary and Namecheap’s API. I wrote a few custom Meteor packages to help initially populate Suffixer’s database. The first, pcorey:namecheap, is simply a wrapper around a modified version of Chad Smith’s node-namecheap library. Another custom package, pcorey:namecheap-tlds, exposes a Meteor collection which is populated with a call to Namecheap’s getTldList API method. A third package, pcorey:wiktionary, parses a Wiktionary TSV dump file and fills a collection with all relevant words.
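To make the matching idea concrete, here is a minimal sketch of how a word could be paired with TLD suffixes. This is not the code from the unpublished packages: the wordToDomains helper, the sample TLD list, and the rule that a word must contain only letters and digits are all assumptions made for illustration.

```javascript
// Hypothetical helper (not from the Suffixer source): return every domain
// that can be formed by treating a TLD as the suffix of the given word.
function wordToDomains(word, tlds) {
    var lower = word.toLowerCase();
    // Assumption: skip entries with spaces, apostrophes, and so on.
    if (!/^[a-z0-9]+$/.test(lower)) {
        return [];
    }
    var domains = [];
    tlds.forEach(function(tld) {
        var suffix = tld.toLowerCase();
        // Require at least one character in front of the TLD.
        if (lower.length > suffix.length &&
            lower.slice(-suffix.length) === suffix) {
            domains.push(lower.slice(0, -suffix.length) + '.' + suffix);
        }
    });
    return domains;
}

// Sample TLDs for illustration only; the real list came from getTldList.
var sampleTlds = ['us', 'es', 'ly'];
console.log(wordToDomains('delicious', sampleTlds)); // [ 'delicio.us' ]
```

A word can match more than one TLD, which is why the sketch returns an array rather than a single domain.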

None of these packages are currently published. If you would like to use any of them, let me know and I’ll happily make them available.

Searching and Checking

Searching through the database is fairly straightforward thanks to MongoDB’s text search functionality and the power of Meteor’s Publish & Subscribe. Searching on the client is initiated through a debounced subscription. The server has a matching publish method that takes the search term as an argument (among other things):

Meteor.publish('wiktionary-namecheap', function(suffix, definition, limit, hideRegistered, favorites) {
    var results = Wiktionary.find(
        getSelector(suffix, definition, hideRegistered, favorites),
        getOptions(limit));
    var domainMap = getDomainMap(results);
    checkAndUpdateDomains(domainMap);
    return results;
});

This publish method does a few interesting things. First, it queries Mongo for any results matching the provided search term (definition). Next, it loops through all of the results looking for any domains whose registration status is either unregistered or unknown. Those domains are passed to Namecheap’s check API, and the results of that call update the status of the corresponding Mongo documents. The real magic is that while the API callback updating the Mongo documents is asynchronous, those changes are automatically and instantly pushed to the client. How cool is that?
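The filtering step can be sketched in plain JavaScript. The real getDomainMap helper is not shown in this post, so everything here is an assumption: the getDomainMapSketch name, the document shape ({ _id, domain, status }), and the choice to map each domain to its document id.

```javascript
// Hypothetical stand-in for getDomainMap. Assumed document shape:
// { _id, domain, status }, where status is 'registered', 'unregistered',
// or undefined when the domain has never been checked.
function getDomainMapSketch(docs) {
    var domainMap = {};
    docs.forEach(function(doc) {
        // Unregistered domains are re-checked too, since someone may have
        // registered them since the last lookup.
        if (doc.status !== 'registered') {
            domainMap[doc.domain] = doc._id;
        }
    });
    return domainMap;
}

var sampleDocs = [
    { _id: 1, domain: 'a.us', status: 'registered' },
    { _id: 2, domain: 'b.us', status: 'unregistered' },
    { _id: 3, domain: 'c.us' }
];
console.log(getDomainMapSketch(sampleDocs)); // { 'b.us': 2, 'c.us': 3 }
```

Keeping the document id next to each domain would let the API callback update the right document once the check result comes back.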

Known Problems and Lessons Learned

My goal with this project was to create a Minimum Viable Product. That means that Suffixer was released with a few issues:

Static Data

Namecheap’s getTldList is only called when the database is first populated, and all words in the Wiktionary TSV that are not paired with a Top Level Domain are excluded from the database. This means that if Namecheap makes any new TLDs available in the future, the Suffixer database would have to be wiped and rebuilt. A much better way to future-proof this functionality would be to store all Wiktionary entries in Mongo, along with the list of available TLDs. Periodically, getTldList could be called, potentially updating the TLD collection and any new Wiktionary word matches. This would most likely require some schema re-working.

Wikitext Is Hard

Wikitext is hard. Specifically, correctly rendering wikitext templates is hard. I looked into a few different options for rendering wikitext into plain text, but all of them fell short when it came to expanding templates. From what I’ve found, the only way to expand a template is to use Wikipedia’s API. Unfortunately, due to the huge number of wikitext entries I needed to parse, this wasn’t a viable option. This version of Suffixer leaves the unexpanded templates in the definition text. For the most part, they’re human readable and add valuable context to the definitions.

Final Thoughts

Overall, I’m happy with how the project turned out. I learned a good deal about Mongo and even more about Meteor. Meteor continues to be a very interesting and exciting platform to work with.

If you have any comments or suggestions about Suffixer, please let me know!

Mongo Text Search with Meteor

Written by Pete Corey on Jan 26, 2015.

My most recent project, Suffixer, involves doing a primarily text-based search over more than 80k documents. Initially, I was using $regex to do all of my querying, but this approach was unacceptably slow. I decided to try out MongoDB’s text search functionality to see if I could get any performance gains.

I replaced my main query with something like this:

MyCollection.find({$text: {$search: searchText}});

Unfortunately, Meteor seemed very unhappy with this change. I immediately began seeing errors in my server logs:

Exception from sub ZskAqGy2t2jJckpXK MongoError: invalid operator: $search

A quick investigation showed what was wrong. Meteor bundles Mongo 2.4 rather than 2.6. You can check this by running db.version() in your Mongo shell (meteor mongo). The text search syntax in 2.4 differs significantly from the syntax in 2.6.

If you insist on using Meteor’s bundled version of Mongo, this Meteorpedia post shows how to manually kick off the search command in a reactive context.

A much better solution is to simply use your own instance of Mongo 2.6. Follow the available installation guides to get an instance running on your machine (or remotely). Once Mongo is successfully installed, you can instruct Meteor to use this new instance of Mongo by pointing to it with the MONGO_URL environment variable.

Using Mongo’s text search coupled with a text index drastically improved the performance of my web app.

The Dangers of Debouncing Meteor Subscriptions

Written by Pete Corey on Jan 19, 2015.

I’ve been working on a Meteor app with instant search functionality. When users type data into an input box, the system updates a session value which kicks off a Meteor.subscribe.

Template.controls.events({
    'keyup #search': function(e) {
        Session.set('search', e.target.value);
    }
});

Meteor.autorun(function() {
    Meteor.subscribe('my-collection', Session.get('search'));
});

While this worked, triggering a new subscribe for every keypress put unnecessary strain on the system. Typing the word “test” triggered 4 different subscriptions, and the first 3 sets of subscription results were thrown out in a fraction of a second. I needed to limit the rate at which I was triggering new subscriptions and the database queries that followed. A great way to do that is with Lo-Dash’s debounce method.
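For anyone unfamiliar with debouncing, the idea can be sketched in a few lines. This is a simplified model, not Lo-Dash’s actual implementation (which supports more options); the debounceSketch and recordSearch names are made up for this example.

```javascript
// Simplified debounce: fn runs only once `wait` milliseconds have passed
// without another call. Each new call restarts the timer.
function debounceSketch(fn, wait) {
    var timer = null;
    return function() {
        var context = this;
        var args = arguments;
        clearTimeout(timer);
        timer = setTimeout(function() {
            timer = null;
            fn.apply(context, args);
        }, wait);
    };
}

var searches = [];
var recordSearch = debounceSketch(function(term) {
    searches.push(term);
}, 300);

// Simulate typing "test": four rapid calls, one per keypress.
['t', 'te', 'tes', 'test'].forEach(function(term) {
    recordSearch(term);
});
// Nothing has run yet. After 300 ms of quiet, only 'test' is recorded.
```

The first three calls are cancelled before their timers fire, so only the final search term ever reaches the wrapped function.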

Debounce Meteor.subscribe

My initial idea was to debounce the Meteor.subscribe function used within the Meteor.autorun callback. Since the session variables being tracked by the Tracker computation could be updated in other places in the app as well, I figured this would be a clean way to limit excessive subscriptions being made to the server.

I changed my code to look like this:

var debouncedSubscribe = _.debounce(Meteor.subscribe, 300);
Meteor.autorun(function() {
    debouncedSubscribe('my-collection', Session.get('search'));
});

This had a very interesting effect on my application. While changing the session variable did trigger a new subscription, and the call was being debounced as expected, I noticed that old subscription results were being kept around on the client. The collection was starting to balloon in size.

Down to Debugging

I fired up my trusty Chrome Dev Tools and started debugging within the subscribe method itself in livedata_connection.js. After comparing behavior with a normal subscription invocation and a debounced invocation, the problem made itself known on line 571.

When it ran as a debounce callback, the Meteor.subscribe call was no longer part of a computation because it was executed asynchronously, after the computation had finished running. Tracker.active returned false within the context of the debounce callback. This meant that the Tracker.onInvalidate and Tracker.afterFlush callbacks were never registered within the subscribe call, as they would have been if subscribe had been called directly from within a computation. As a result, the subscription was never stopped and its data stayed on the client indefinitely. Effectively, I was piling up new subscriptions every time the search string changed:

Meteor.subscribe('my-collection', 't');
Meteor.subscribe('my-collection', 'te');
Meteor.subscribe('my-collection', 'tes');
Meteor.subscribe('my-collection', 'test');
...
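The timing issue can be reproduced with a toy model of a "current computation" flag. To be clear, this is not Meteor’s Tracker implementation; ToyTracker and the observed object are hypothetical names used only to show why a deferred callback no longer sees an active computation.

```javascript
// Toy model only: the flag is true while the autorun function is running
// synchronously, and false again as soon as it returns.
var ToyTracker = {
    active: false,
    autorun: function(fn) {
        ToyTracker.active = true;
        try {
            fn();
        } finally {
            ToyTracker.active = false;
        }
    }
};

var observed = {};

ToyTracker.autorun(function() {
    // A direct call happens inside the computation.
    observed.direct = ToyTracker.active;

    // A deferred call (like a debounced one) runs after autorun returns.
    setTimeout(function() {
        observed.deferred = ToyTracker.active;
        console.log(observed); // { direct: true, deferred: false }
    }, 0);
});
```

Any cleanup that depends on the flag being set at call time, such as stopping an old subscription when the computation reruns, is silently skipped in the deferred case.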

The Solution

I spent some time trying to find a way to run an asynchronous callback under an existing computation, but I wasn’t able to find a good way to do this. Ultimately, my solution was to not debounce the Meteor.subscribe call, but to debounce the keyup event handler:

Template.controls.events({
    'keyup #search': _.debounce(function(e) {
        Session.set('search', e.target.value);
    }, 300)
});

Meteor.autorun(function() {
    Meteor.subscribe('my-collection', Session.get('search'));
});