Adding fuzzy search capability to Rails 3

I wanted to add ‘fuzzy’ search to my Rails app without having to resort to heavier solutions like Solr or Sphinx. This takes more than a simple LIKE clause, since you want to match on things that sound similar or are otherwise ‘very close’.
I came across this blog post

http://unirec.blogspot.com/2007/12/live-fuzzy-search-using-n-grams-in.html

which took me to https://github.com/bmaland/no_fuzz, which doesn’t seem to work with Rails 3. After looking through the other branches of the plugin, I managed to get it installed and running against my users table.

In your Rails app directory, pull the plugin:

rails plugin install git://github.com/beno/no_fuzz.git

Then run the generator for the model you want to search (User in my case):

rails generate no_fuzz User

Then we need to run the migration to actually create the table that holds our trigrams:

rake db:migrate
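
For anyone new to trigrams, here is a minimal plain-Ruby sketch (no Rails needed) of why they catch near misses that a LIKE clause does not. The trigrams and shared_trigrams helpers are hypothetical names of my own, not part of no_fuzz; the padding with one space on each side mirrors what the plugin does to the search term.

```ruby
# Split a word into overlapping three-character chunks, padding it with
# one space on each side so the start and end of the word count too.
def trigrams(word)
  padded = " #{word} "
  (0..padded.length - 3).map { |idx| padded[idx, 3] }
end

# Count how many distinct trigrams two words have in common.
def shared_trigrams(a, b)
  (trigrams(a) & trigrams(b)).size
end

typo = "mathew"
name = "matthew"

# A LIKE '%mathew%' style check fails, because the typo is not a substring.
puts "substring match: #{name.include?(typo)}"

# The trigram overlap is still large, so the name is a strong candidate.
puts "trigrams of typo: #{trigrams(typo).inspect}"
puts "shared trigrams:  #{shared_trigrams(typo, name)}"
```

The more trigrams a stored username shares with the search term, the closer the match, which is the idea the trigram table is built around.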


Now we need to tell our model to populate the trigrams when a record is created. I also added a simple wrapper method for doing the fuzzy searches:

class User < ActiveRecord::Base
  include NoFuzz
  fuzzy :username

  # Rebuild the trigram index whenever a new User is instantiated.
  # Arguments are passed through so User.new(attributes) keeps working.
  def initialize(*args, &block)
    super
    User.populate_trigram_index
  end

  # this returns an array of User objects
  def self.fuzzy_search(search_term)
    User.fuzzy_find(search_term, 10)
  end

  # this returns an active relation object
  def self.search(search_term)
    if search_term
      where('username LIKE ?', "%#{search_term}%")
    else
      scoped
    end
  end
end

Now we can use both search methods on our data, as shown here:

User.fuzzy_find("ma", 10)  # returns an array of records
User.search("Mat")         # returns an active record relation

Edit: I wasn’t fully happy with the previous solution, since it didn’t play well with the sorting and pagination code that I use elsewhere. So I ‘fixed’ the no_fuzz plugin to play nicely with the rest of the app.

I edited the no_fuzz.rb file, replacing the fuzzy_find method with the following:

    def get_fuzzy_matches(word, limit = 0)
      word = " #{word} "
      trigram_list = (0..word.length-3).collect { |idx| word[idx,3] }
      trigrams = @fuzzy_trigram_model.where(["tg IN (?)", trigram_list])
      trigrams = trigrams.group(@fuzzy_ref_id)
      trigrams = trigrams.order('SUM(score) DESC')
      trigrams = trigrams.includes(belongs_to_association)
      trigrams = trigrams.limit(limit) if limit > 0
      trigrams
    end
    
    #this just returns the ids for the records that match
    def fuzzy_find_ids(word, limit = 0)
      trigrams = get_fuzzy_matches(word, limit)
      trigrams.all.collect do |trigram|
        trigram[belongs_to_association + '_id'] 
      end
    end

    #this returns the actual records that match.
    def fuzzy_find(word, limit = 0)      
      trigrams = get_fuzzy_matches(word, limit)
      trigrams.all.collect do |trigram|
         trigram.send(belongs_to_association)
      end
    end
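
To make the ranking in get_fuzzy_matches easier to follow, here is a rough in-memory imitation of that query in plain Ruby. It is only a sketch under assumptions: the array of hashes stands in for the trigram table, every row is given a score of 1 (I have not checked what the plugin actually stores), and ties are broken by id, which the SQL version does not promise. build_index and the sample usernames are hypothetical.

```ruby
# Same padding and slicing that get_fuzzy_matches applies to the search term.
def trigrams(word)
  padded = " #{word} "
  (0..padded.length - 3).map { |idx| padded[idx, 3] }
end

# Stand-in for the trigram table: one row per trigram per user.
# ASSUMPTION: each row carries a score of 1.
def build_index(usernames)
  usernames.flat_map do |id, name|
    trigrams(name).map { |tg| { tg: tg, score: 1, user_id: id } }
  end
end

# Imitates: WHERE tg IN (...) GROUP BY user_id ORDER BY SUM(score) DESC LIMIT n
def fuzzy_find_ids(index, word, limit = 0)
  wanted = trigrams(word)
  totals = Hash.new(0)
  index.each do |row|
    totals[row[:user_id]] += row[:score] if wanted.include?(row[:tg])
  end
  # Highest total first; ties broken by id (an assumption for determinism).
  ranked = totals.sort_by { |user_id, total| [-total, user_id] }.map(&:first)
  limit > 0 ? ranked.first(limit) : ranked
end

usernames = { 1 => "matthew", 2 => "martha", 3 => "john" }
index = build_index(usernames)

# "mathew" shares five trigrams with "matthew", one with "martha", none with "john".
puts fuzzy_find_ids(index, "mathew", 10).inspect
```

Users with no matching trigram never appear in the result at all, and a limit of 0 means "no limit", just like in the plugin code.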

Then I modified my user model to the following:

  # this returns an array of User objects
  def self.fuzzy_search(search)
    User.fuzzy_find(search, 10)
  end

  # this returns an active relation object
  def self.search(search)
    if search.present?
      user_ids = User.fuzzy_find_ids(search, 10)
      where(["id IN (?)", user_ids])
    else
      scoped
    end
  end

Finally, in my controller I was able to start using it like this (I’m using Kaminari for my pagination):

 @users = User.search(params[:search]).page(params[:page]).per(10)

Now that the code plays nicely with AR, I was able to use it in my ajax’ified search box, so it provides lookahead suggestions as a user is typing (using jQuery, of course). It also integrates with the dynamic table view I have, updating the table’s results based on what’s being searched for.

It should be noted that this has a side effect: because the matching ids are fed into an IN clause, the relevance ordering returned by no_fuzz is not preserved, and any ordering you specify on the relation takes over. If you’re only returning a small number of possible matches, it’s not a very big deal.
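
If the relevance order does matter, one option is to re-sort the fetched records by their position in the id list that fuzzy_find_ids returned. This is a plain-Ruby sketch rather than something I have wired into the app; Record is a hypothetical stand-in for a User row and in_ranked_order is a made-up helper name.

```ruby
# Hypothetical stand-in for an ActiveRecord User row.
Record = Struct.new(:id, :username)

# Reorder records to follow the ranking in ranked_ids.
# Records whose id is not in the list are pushed to the end.
def in_ranked_order(records, ranked_ids)
  position = {}
  ranked_ids.each_with_index { |id, idx| position[id] = idx }
  records.sort_by { |record| position.fetch(record.id, ranked_ids.length) }
end

ranked_ids = [1, 2] # best match first, as fuzzy_find_ids would return them
fetched = [Record.new(2, "martha"), Record.new(1, "matthew")] # database order

puts in_ranked_order(fetched, ranked_ids).map(&:username).inspect
```

Note that this returns an Array rather than a relation, so in a controller it would have to run after .page and .per, on the records of the current page only.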
