Rails, PostgreSQL and Case-Sensitivity
Listen to this article
Possibly the only thing I like about MySQL is when performing a search, the values 'SIMON' and 'sImOn' are considered equal—case-insensitive searching. PostgreSQL on the other hand considers them to be different—case-sensitive searching. Now I don't know about you but for %99.999~ of the applications I've ever written, I'd rather 'Australia' and 'AuStRaLiA' weren't considered different countries.
The "standard" approach to solving this problem is to change a query from this:
SELECT * FROM countries WHERE name = ?;
To something like this:
SELECT * FROM countries WHERE LOWER(name) = LOWER(?);
Thereby forcing the database to perform a pseudo case-insensitive search. The only problem is that all those nice indexes you've created to ensure fast, efficient searching are totally ignored. (Who can spell full-table scan?) Performance issues aside (I mean after all we know that premature optimisation is the root of all evil right?) what's just as annoying is that I can't actually guarantee uniqueness, which is pretty much the whole point! No, even with a unique index on countries.name, the database will still quite happily allow me to:
INSERT INTO countries (name) VALUES ('Australia');
INSERT INTO countries (name) VALUES ('AUSTRALIA');
INSERT INTO countries (name) VALUES ('aUsTrAlIa');
So when I perform a case-insensitive search as previously discussed, I'll end up with three (count 'em 3) records. Thankfully, there is a solution (of sorts): expression indexes.
PostgreSQL allows you to create indexes based on expressions, say for example LOWER(name), allowing us to create a unique, case-insensitive index as simply as:
CREATE UNIQUE INDEX index_countries_on_name ON countries (LOWER(name));
Ok, so perhaps you knew this already and you're wondering what all this has to do with Rails? Well I'm glad you asked.
Rails (as of 1.2) has a new option for validates_uniqueness_of named, oddly enough, case_sensitive. This is assumed to be true by default (meaning all searches are case-sensitive). Set it to false however and you'll magically get validation queries that look like:
SELECT * FROM countries WHERE (LOWER(countries.name) = 'australia') LIMIT 1;
To compliment this feature, I've recently enhanced the RedHill on Rails Plugins in two interesting (and hopefully useful) ways.
The first is in the core and supports the creation of case-insensitive indexes during schema migration:
add_index :countries, [:name], :unique => true, :case_sensitive => false
The second is in schema validations and causes case_sensitive => false to be passed as an option to validates_uniqueness_of whenever a case-insensitive index is detected.
(I also looked at the possibility of automagically surrounding query parameters, etc. with LOWER() inside find methods but given the myriad forms queries can take, it seems altogether too difficult for my feeble mind at this point.)
The upshot of all this is that at the very least, it should now be possible to add case-insensitivity to your queries and be assured that (bugs not withstanding) the performance of your application won't suddenly plummet as a consequence.
Comments
Excellent and as always thanks for sharing with the community.
Posted by: Ted | February 21, 2007 10:43 AM
Note that the lower() construct is only necessary when you want to make use of a functional index (ie. you can make an index that will work with that query).
Where you don't need to use an index, you can just use ILIKE instead of LIKE.
FYI.
Posted by: Mark Aufflick | February 21, 2007 08:22 PM
Indeed. I kinda think of this the other way around. The only reason I have the functional index is because I don't want the performance hit of ilike which is pretty steep. So rather than use the lower() to make use of the functional index, I'm making the functional index so that I can use lower(). :)
Posted by: Simon Harris
|
February 21, 2007 09:38 PM
Interestingly, I had a similar need for a functional index, but I was actually looking for the ability to use ANY functional index, not just for lower(). I did a patch to the rails code as I needed it only locally, but it could also be done into a plugin or a gem.
Take a look here: http://devblog.famundo.com/articles/2007/01/29/adding-support-for-functional-indexes-in-rails
And thanks for providing a solution to the (by far) most common case.
Guy.
Posted by: Guy Naor | February 23, 2007 04:23 AM