Adding full text search to your Rails app in 1 hour and 10 lines of code
Yup, it's that fast and simple. Using Ferret, a Ruby port of Lucene (the de facto full text search engine, written in Java), and the acts_as_ferret plugin, I added "Did you mean this?" functionality to a small Rails application (non-perma link) today, quickly and painlessly.
Adding full text search functionality to your Rails applications
Install Ferret via RubyGems.
sudo gem install ferret
Install the acts_as_ferret plugin for your Rails application.
ruby script/plugin install <ACTS_AS_FERRET_SVN_REPOSITORY>
Add the acts_as_ferret mixin method to your ActiveRecord model, specifying the fields you want indexed.
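class Location < ActiveRecord::Base
  belongs_to :country, :foreign_key => 'country_code'
  belongs_to :state

  # Index these fields; country_name is defined below as a method.
  acts_as_ferret :fields => [:location_name, :country_name, :state_name]

  # Expose the associated Country's name so Ferret can index it.
  def country_name
    country.country_name
  end
end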
If you have any associations that you want searched as well, one way (there is more than one way to do this) would be like what I did above with the country_name attribute of Country. For a more detailed discussion on Ferret and ActiveRecord associations, I found this thread in the Ruby forum enlightening.
In your controller, use Model.find_by_contents() to perform the full text search with your query. If you need fuzzy matching, append a tilde ("~") to the end of your query. If you're here, you probably need fuzzy matching anyway, as you're most likely building "Did you mean this?"-type functionality.
# In the controller...
@locations = Location.find_by_contents("#{@search_request.location_entered}~") # The "~" is for fuzzy matching.
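One thing to watch: in Lucene-style query syntax, as far as I know, the "~" applies to the single term it follows, so tacking one "~" onto a multi-word query only fuzzes the last word. Here is a minimal sketch of a helper that appends "~" to every word instead. The name fuzzy_query is my own invention, not part of Ferret or acts_as_ferret, and it makes no attempt to escape other query-syntax characters in the user's input.

```ruby
# Hypothetical helper (not part of Ferret or acts_as_ferret).
# Appends "~" to every whitespace-separated term so that each word
# is fuzzy-matched, not just the last one.
def fuzzy_query(user_input)
  terms = user_input.to_s.split
  terms.map { |term| term.end_with?("~") ? term : "#{term}~" }.join(" ")
end

puts fuzzy_query("new yrok")  # prints: new~ yrok~
```

In the controller, the result would be passed along in place of the interpolated string, e.g. Location.find_by_contents(fuzzy_query(@search_request.location_entered)).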
All that's left is to flash the appropriate message in your view and display your model objects as you normally would.
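For the message itself, something along these lines could work. This is only a sketch in plain Ruby: did_you_mean_message and its arguments are hypothetical names, and it assumes you have already pulled the names out of the model objects returned by the search.

```ruby
# Hypothetical helper; the names here are illustrative only.
# Returns a "Did you mean ...?" string, or nil when there is nothing
# to suggest (no results, or the user's text already matches exactly).
def did_you_mean_message(entered, suggestions)
  return nil if suggestions.empty?
  return nil if suggestions.any? { |name| name.casecmp(entered).zero? }
  "Did you mean #{suggestions.first(3).join(', ')}?"
end

puts did_you_mean_message("Sidney", ["Sydney"])  # prints: Did you mean Sydney?
```

In a controller action, the return value could be assigned to flash[:notice] when it is not nil, and the view would then show it above the list of results.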
Some gotchas:
- If you change the way your model is indexed, remember to clear out the existing index. acts_as_ferret stores the index by default in an 'index/' directory in the root of your Rails application. Stop your mongrel/Webrick process to remove the lock on the index files and delete the whole directory - Ferret will rebuild it the next time you perform a full text search.
- The first time you do a full text search, Ferret builds your index for you and this takes quite a while - don't be alarmed if your Rails application appears to freeze or returns a 500 internal server error. You can observe the contents of the 'index/' directory in the root of your Rails application to tell when Ferret has finished indexing (files stop appearing and disappearing). I found it easier to run a 'find_by_contents' call on my models via 'script/console' and wait for some results to appear.
- Append a "~" to your query string for fuzzy matching! This allows you to retrieve results that are spelled similarly to the query, such as misspellings (the match is based on spelling, not on how the words sound). There are more complex matching rules - check out the Lucene documentation.
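To get a feel for what "fuzzy" means here: as far as I know, Lucene-style fuzzy queries are based on edit distance (Levenshtein distance), the number of single-character insertions, deletions and substitutions needed to turn one word into another. The sketch below is a plain Ruby implementation of that distance for illustration only. It is not Ferret's code, and Ferret's actual scoring and thresholds are its own.

```ruby
# Illustration only: classic dynamic-programming Levenshtein distance.
# Not taken from Ferret; it just shows the idea behind fuzzy matching.
def edit_distance(a, b)
  # prev[j] holds the distance between the processed prefix of a
  # and the first j characters of b.
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution (or match)
    end
    prev = curr
  end
  prev.last
end

puts edit_distance("Sidney", "Sydney")   # prints: 1
puts edit_distance("kitten", "sitting")  # prints: 3
```

A query like "Sidney~" can match "Sydney" because the two are only one edit apart.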