
September 29, 2006

Deploying to Multiple Rails Environments


On one Rails project, we have two deployment environments: production and UAT. The default Capistrano configuration makes deploying to both rather difficult, so I thought I'd share our deploy.rb with a bit of explanation along the way. Ok, here goes:

For a start, we deploy to a directory that includes the environment as part of the path:

set :deploy_to, lambda { "/home/#{user}/www/#{rails_env}" }

For subversion, we check out the code as the user who is running the deployment, making sure not to cache authentication details on the server:

set :svn_user, ENV['USER']
set :svn_password, lambda { Capistrano::CLI.password_prompt('SVN Password: ') }
set :repository, lambda { "--username #{svn_user} --password #{svn_password} --no-auth-cache svnurl/trunk/#{application}" }

In both cases, we run a mongrel cluster. Because the mongrel configuration files share a lot in common and because they largely duplicate information contained within the deployment script, we generate an appropriate configuration on deployment. More of that in a bit but for now, the common bits look like:

set :mongrel_address, "127.0.0.1"
set :mongrel_environment, lambda { rails_env }
set :mongrel_conf, lambda { "#{current_path}/config/mongrel_cluster.yml" }

Now, for the environment specific portions. For each environment we have a task that simply sets variables appropriately—I toyed with using an environment variable such as RAILS_ENV rather than the pseudo-tasks but it was more typing and I'm allergic to typing :).

For production, we want 3 mongrel instances in the cluster, listening on ports 8000-8002:

desc "Production specific setup"
task :production do
  set :rails_env, :production
  set :mongrel_servers, 3
  set :mongrel_port, 8000
end

For UAT, we want 2 mongrel instances in the cluster, listening on ports 8010-8011:

desc "UAT specific setup"
task :uat do
  set :rails_env, :uat
  set :mongrel_servers, 2
  set :mongrel_port, 8010
end

And finally, a custom deployment script based almost entirely on the built-in deploy_with_migrations with the major difference being the configuration of the mongrel cluster just prior to restart:

desc "Generic deployment"
task :deploy do
  update_code

  begin
    old_migrate_target = migrate_target
    set :migrate_target, :latest
    migrate
  ensure
    set :migrate_target, old_migrate_target
  end

  symlink
  
  configure_mongrel_cluster

  restart
end

That's it really. Now whenever we need to deploy to a particular environment, say for example UAT, we do something like:

cap uat deploy
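
The configure_mongrel_cluster task itself isn't shown in this post. Purely as a sketch of the generation step, and not the task we actually run, the configuration could be built from the same values the variables above hold. The helper name mongrel_cluster_config and the pid_file location are made up for illustration, and the key names are the ones I'd expect mongrel_cluster to read, so check them against a file generated by your own version:

```ruby
require 'yaml'

# Hypothetical helper (not from the original deploy.rb): turns the values held
# in the Capistrano variables above into the text of mongrel_cluster.yml.
def mongrel_cluster_config(cwd:, environment:, address:, port:, servers:)
  {
    'cwd'         => cwd,
    'environment' => environment.to_s,  # :uat becomes "uat"
    'address'     => address,
    'port'        => port,              # first port; the cluster counts up from here
    'servers'     => servers,
    'pid_file'    => 'log/mongrel.pid'  # assumed location
  }.to_yaml
end

puts mongrel_cluster_config(
  cwd: '/home/deploy/www/uat/current',  # illustrative path
  environment: :uat,
  address: '127.0.0.1',
  port: 8010,
  servers: 2
)
```

Inside the real task, the resulting string would be written to the path held in mongrel_conf on each server just before restart.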

Update: By request, here is our database.yml file :

common: &common
  adapter: postgresql
  username: <%= ENV['USER'] %>

development:
  database: foo_development
  <<: *common

test:
  database: foo_test
  <<: *common

uat:
  database: foo_uat
  <<: *common

production:
  database: foo_production
  <<: *common

As you can probably tell, we're lucky enough that the database user is always the same as the user under which the application will be run and that the database itself is named according to the environment. That makes it very easy to wrap up most of the common parts. Thanks go to Jon Tirsen for that YAML tip.

This could also easily be generated. I guess it just hasn't needed any attention since it was created so YAGNI overrode DRY ;-)
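
For anyone unsure what the &common anchor and the <<: *common merge key actually produce, here is a small stand-alone check. It uses a trimmed copy of the file above and mimics what Rails does (ERB first, then YAML); the unsafe_load fallback is only there because newer YAML libraries refuse aliases by default:

```ruby
require 'erb'
require 'yaml'

# Trimmed copy of the database.yml above (two environments only).
template = <<~YAML
  common: &common
    adapter: postgresql
    username: <%= ENV['USER'] %>

  uat:
    database: foo_uat
    <<: *common

  production:
    database: foo_production
    <<: *common
YAML

# Rails runs the file through ERB before parsing it as YAML.
yaml = ERB.new(template).result

# Newer Psych versions reject aliases in YAML.load, so prefer unsafe_load
# where it exists; older versions only have the permissive YAML.load.
config = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)

# Each environment ends up with its own database plus everything from common.
p config['uat']
p config['production']
```

Each environment hash contains database, adapter and username, with no trace of the merge key itself.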

September 28, 2006

No Really, Perforce Does Suck


Ok, so after my rant yesterday I was feeling a bit better. So many people rushed to the defence of Perforce and on the authority of people I know, respect and work for—not mutually exclusive roles—I thought I'd get stuck into it and read the manuals, read news groups and even rushed out to buy a copy of Practical Perforce.

The documentation is plentiful and very informative and the support groups are very helpful. As for the book, well, the book is most excellent, a very easy read indeed and full of tonnes of really great tips—recipes, idioms, patterns, hacks, call them what you will—which just about sums up my experience thus far: Lots and lots of rather involved processes to do what I consider to be normal everyday activities. (At this point I feel compelled to direct you to an excellent article on why patterns are indicative of unsophisticated systems.)

To give you a 100% practical example, just today I committed 1600 files which I had to back out almost immediately because I realised I had broken something. Now, ignoring the whys and hows of how I managed to get myself into such a pickle, the fact is I needed to roll back a commit. Here's what I did:

svn merge -c -27289 svn+ssh://me@therepositoryurl
svn commit

Tricky stuff that!

So then on my way home I picked up the book mentioned earlier and went straight to the index to find "Backing out a recent change". Whoot! Just what I wanted to know. So here's the deal:

p4 files @=27289 # This lists all the files that have changed
p4 sync @27288
p4 add ... # For each deleted file
p4 edit ... # For each changed file
p4 sync
p4 delete ... # For each added file
p4 resolve -ay
p4 submit

Yes! Pretty impressive! And, straight from the book, re-printed without any permission whatsoever (emphasis added by yours-truly):

When a change involves a lot of files, you can filter the output of the files command to produce a list of files to open. Unfortunately, files can't be piped directly to other p4 commands because its format isn't acceptable to them. This can be easily fixed by using a filter; namely sed.

Wow. Cool! Just what I wanted to have to do. Ok, so let's try that:

p4 sync @27288
p4 files @=27289 | sed -n -e "s/#.* - delete .*//p" | p4 -x- add
p4 files @=27289 | sed -n -e "s/#.* - edit .*//p" | p4 -x- edit
p4 sync
p4 files @=27289 | sed -n -e "s/#.* - add .*//p" | p4 -x- delete
p4 resolve -ay
p4 submit

Awesome! That's sooooo much better. Sheesh, I might even be able to script it, fan-bloody-tastic. Thankfully, Perforce is touted as being lightning fast because unless I'm very much mistaken, that's seven, count 'em, seven calls to the server!
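
For what it's worth, the filtering half of that recipe might look something like this as a script. It is a sketch only: the sample lines are made up, modelled on the path#rev - action change N (type) shape that the sed expressions above rely on, and undo_plan is a hypothetical name. It covers only the sorting of files into add, edit and delete, not the syncs, resolve and submit around it:

```ruby
# Made-up sample of `p4 files @=27289` output; the depot paths are illustrative.
P4_FILES_OUTPUT = <<~OUT
  //depot/app/models/order.rb#7 - edit change 27289 (text)
  //depot/app/models/invoice.rb#1 - add change 27289 (text)
  //depot/app/models/legacy.rb#3 - delete change 27289 (text)
OUT

# To back a change out, each original action needs the opposite p4 command.
UNDO_COMMAND = { 'delete' => 'add', 'edit' => 'edit', 'add' => 'delete' }.freeze

# Groups depot paths by the p4 command needed to undo them.
def undo_plan(files_output)
  plan = Hash.new { |hash, command| hash[command] = [] }
  files_output.each_line do |line|
    match = line.match(/\A(.+?)#\d+ - (\w+) change \d+/)
    next unless match && UNDO_COMMAND.key?(match[2])
    plan[UNDO_COMMAND[match[2]]] << match[1]
  end
  plan
end

undo_plan(P4_FILES_OUTPUT).each do |command, paths|
  puts "p4 #{command} #{paths.join(' ')}"
end
```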

So, what have we learned so far? We've learned that in precisely the scenario I've been told Perforce is great at handling, it really, really, really, ok once more, really, sucks!

Oh, but there's more. I forgot to mention that I was also working offline before I committed the original sin. When I eventually connected this is what I did:

svn commit

Ok, so technically I did:

svn up
svn commit

So, what would have been the equivalent if I had been using Perforce you might ask?

p4 sync
p4 diff -se | p4 -x- edit
p4 diff -sd | p4 -x- delete
p4 submit

(As a side note, adding new files in both systems is about the same amount of work. That said, at least with subversion a simple svn status will show me which files are not yet under version control. For the life of me I can't seem to find an easy way to do this with Perforce.)

Not too bad but technically, three times as many commands. And yes, again, I could script it but why should I need to? This is something I, as a developer, do every day. Am I mistaken for thinking that developers are by far the largest users of a tool such as this? Perhaps.

It's no wonder Google want people to know how to use Perforce; it pretty much proves the candidate has a brain large enough to even feel like working out how to use it.

September 27, 2006

Perforce: Just A Faster CVS?


So, it's 7am-ish and I've had 6 or so hours of sleep to ruminate on this but yup, from a developer's perspective, I still think Perforce sucks.

Can anyone tell me why they believe it seems like a good idea to:

  • Require an ssh tunnel to have encrypted communication;
  • Keep a secondary workspace to enable offline revert;
  • Have a command-line tool that uses environment variables—or command-line arguments—to specify connection details;
  • Display a diff of which files changed as a tree—I just want to see the individual files not my entire project;
  • The list goes on...

I like to work offline, a lot, on planes, trains and in taxi-cabs; I like to be able to see immediately what's changed; and I like to be able to revert everything (or only some things) several times while I'm prototyping.

With subversion I get a lot out-of-the-box and while there will always be nice-to-have features such as "add all unknown files", it does pretty much everything I need.

As I moved from C to C++ to Java and then to Ruby, I felt empowered each step of the way. I had a similar experience moving from CVS to SVN. Perforce seems like a step backwards.

Google may use and recommend Perforce but when the answer to "why can't I do ..." is "you can, just write a script to ..." I'm not sure I'm convinced.

September 19, 2006

ActiveRecord Identity Map for Rails Transactions


I happened to be reading a blog entry last night that mentioned some "shortcomings" in Rails' ActiveRecord and its handling of record loading. Specifically, AR will load the same record twice, into two different instances, within the same transaction, i.e. the following test fails:

Customer.transaction do
  c = Customer.find_by_name('RedHill Consulting, Pty. Ltd.')
  assert_same c, Customer.find(c.id)
end

To be honest, I've not yet been burned by this but it may just catch some people out so I quickly whipped up a very basic plugin to see how difficult it would be to solve:

module RedHillConsulting
  module IdentityMap
    class Cache
      def initialize
        @objects = {}
      end

      def put(object)
        objects = @objects[object.class] ||= {}
        objects[object.id] ||= object
      end
    end

    module Base
      def self.included(base)
        base.extend(ClassMethods)

        base.class_eval do
          alias_method_chain :create, :identity_map
        end
      end

      module ClassMethods
        def self.extended(base)
          class << base
            [:instantiate, :increment_open_transactions, :decrement_open_transactions].each do |method|
              alias_method_chain method, :identity_map
            end
          end
        end

        def instantiate_with_identity_map(record)
          enlist_in_transaction(instantiate_without_identity_map(record))
        end

        def enlist_in_transaction(object)
          identity_map = Thread.current['identity_map']
          return object unless identity_map
          identity_map.put(object)
        end

        private
          def increment_open_transactions_with_identity_map
            increment_open_transactions_without_identity_map
            Thread.current['identity_map'] ||= Cache.new
          end

          def decrement_open_transactions_with_identity_map
            Thread.current['identity_map'] = nil if decrement_open_transactions_without_identity_map < 1
          end
      end

      def create_with_identity_map()
        create_without_identity_map
        self.class.enlist_in_transaction(self)
        id
      end
    end
  end
end

The code essentially interferes with create and instantiate (called from find) and ensures that, within a transaction, the same record will always be returned for the same id (IdentityMap).
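
The heart of it is the ||= in Cache#put: whichever instance is stored first for a given class and id is the one handed back from then on. A stand-alone illustration of just that behaviour, with no Rails involved (IdentityCache and the Struct are stand-ins invented for this sketch):

```ruby
# Stand-alone copy of the Cache idea, using a Struct in place of a model.
class IdentityCache
  def initialize
    @objects = {}
  end

  # Returns whichever instance was stored first for this class and id.
  def put(object)
    by_id = @objects[object.class] ||= {}
    by_id[object.id] ||= object
  end
end

Customer = Struct.new(:id, :name)

cache  = IdentityCache.new
first  = cache.put(Customer.new(1, 'RedHill'))
second = cache.put(Customer.new(1, 'RedHill'))  # a second load of the same row
other  = cache.put(Customer.new(2, 'Another'))

puts first.equal?(second)  # same instance handed back for the same id
puts first.equal?(other)   # a different id gets its own instance
```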

As I mentioned, unlike all my other plugins, I've never used nor needed to use this one—and I'm not sure I will unless it proves to be a problem for me—but it's yet another example of how easy it is to extend Rails to do pretty much whatever you might imagine.

September 15, 2006

Automatically Validate Uniqueness of Columns with Scope


The first cut at Schema Validations only applied validates_uniqueness_of for single-column unique indexes. This removed 80% of the cases in my code base but there were still cases where a scope was specified that lingered. Not any more.

The plugin now automatically generates validates_uniqueness_of with scope for multi-column unique indexes as well.

As always, there are some assumed conventions—which I believe will handle close to 99% of cases—around how to decide which column to validate versus which columns to consider part of the scope. The column to validate is chosen to be either:

  1. The last column in the index definition not ending in ‘_id’; or simply
  2. The last column in the index definition.

All remaining columns are considered part of the scope, following what I believe to be a typical composite unique index column ordering.

So, for example, given either of the following two statements in your schema migration:

add_index :states, [:country_id, :name], :unique => true
add_index :states, [:name, :country_id], :unique => true

The plugin will generate:

validates_uniqueness_of :name, :scope => [:country_id]
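
The column-picking convention is simple enough to express in a few lines. This is a sketch of the rule as described above, not the plugin's actual code, and uniqueness_validation_for is a hypothetical name:

```ruby
# Hypothetical helper, not the plugin's code: splits the columns of a unique
# index into the column to validate and the columns that form the scope.
def uniqueness_validation_for(index_columns)
  columns = index_columns.map(&:to_s)
  # Rule 1: the last column not ending in _id; rule 2: otherwise the last column.
  validate = columns.reverse.find { |name| !name.end_with?('_id') } || columns.last
  scope = columns - [validate]
  [validate.to_sym, scope.map(&:to_sym)]
end

p uniqueness_validation_for([:country_id, :name])       # => [:name, [:country_id]]
p uniqueness_validation_for([:name, :country_id])       # => [:name, [:country_id]]
p uniqueness_validation_for([:country_id, :region_id])  # => [:region_id, [:country_id]]
```

The third call shows the fallback: when every column ends in _id, the last one is validated and the rest become the scope.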

My next stop is to have a look at simple column constraints such as IN('male', 'female') and turn them into validates_inclusion_of :gender, :in => ['male', 'female'].

Perhaps tomorrow :)

September 14, 2006

validates_presence_of association Gotcha


The more I use Rails (and the more plugins I create) the more quirks I find.

Imagine I have a one:many relationship between Country and State:

State.belongs_to :country
Country.has_many :states

We then issue the following sequence of statements (I've interleaved the output of tailing the development log):

c = Country.find_by_name('Australia')
  Country Load (0.006506)   SELECT * FROM countries WHERE (countries."name" = 'Australia' ) LIMIT 1
s = c.states.build(:name => 'Victoria', :abbreviation => 'VIC')
s.country
  Country Load (0.009738)   SELECT * FROM countries WHERE (countries.id = 1)

Notice the SELECT to find the country? Now why would that be necessary? I just used .states.build on the country. I would have thought that would set the association but that doesn't appear to be the case.

Looking at the code, my suspicions were confirmed: only the parent's id is set. That seems decidedly odd given that we know for a fact the parent exists—we just used it to create the child.

So anyway, I'm pretty sure this is considered a "feature" but to be honest, I can't see why it is desired behaviour over and above the fact that doing otherwise would be more work and why would you need this if you already have the parent yada, yada, yada.

Well, for a start, I'd like this behaviour because I'd like to use validates_presence_of on foreign-keys and have it work for newly constructed graphs. Usually this barfs no matter what but I concocted a work-around last night and committed it to my Foreign Key Associations plugin which, if done manually, would look something like this:

class State < ActiveRecord::Base
  validates_presence_of :country_id, :if => lambda { |record| record.country.nil? }
  ...
end

Essentially this says to validate the presence of country_id but only if there isn't an associated country. This means that for cases where the parent record is also new, the validation checks for the presence of the associated object rather than the foreign-key column. If you had simply used validates_presence_of :country_id then save would fail because country_id was still nil.

OK that's all very well and good but it still doesn't help because, as shown above, the association isn't set anyway. So, I'm now back to manually setting the association; at least the validation works hehe

I'm sure someone far smarter than I will point out why the behaviour as it stands is obviously the most appropriate and that no one in their right mind would want to do anything else, of course ;-)

September 13, 2006

Procrastinating in Ruby is Delicious


As I was bookmarking something on del.icio.us today, I noticed the dates on which I had bookmarked the last couple of times and wondered if there was any correlation between frequency and day of the week. So, I downloaded a summary using https://api.del.icio.us/v1/posts/all? and whipped up a little ruby script to compile some statistics:

Wednesday = 41
Tuesday = 39
Thursday = 37
Friday = 32
Monday = 26
Saturday = 24
Sunday = 12

Looks like Wednesday is the biggest day for bookmarking—also known as procrastinating—and what do you know? Today is...Wednesday!

So then I thought I'd see if there was anything interesting in the time of day:

12 = 26
13 = 20
4 = 17
22 = 15
0 = 14
23 = 12
5 = 12
2 = 12
20 = 10
11 = 10
1 = 10
7 = 10
3 = 9
6 = 7
21 = 7
9 = 6
15 = 4
14 = 3
8 = 3
10 = 2
19 = 2

Phew! Most of my bookmarking is done around lunchtime although an awful lot were done at 4am!
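
The script itself isn't reproduced here, but the counting is the easy part. A minimal sketch, assuming the timestamps have already been pulled out of the download as ISO 8601 strings (the four below are made up, and tally is just a name chosen for this example):

```ruby
require 'time'

# Made-up timestamps in the ISO 8601 form the download is assumed to contain.
timestamps = %w[
  2006-09-13T04:12:00Z
  2006-09-13T12:30:00Z
  2006-09-12T12:05:00Z
  2006-09-10T22:45:00Z
]

# Counts values and lists the most frequent first.
def tally(values)
  counts = Hash.new(0)
  values.each { |value| counts[value] += 1 }
  counts.sort_by { |_value, count| -count }
end

times = timestamps.map { |stamp| Time.iso8601(stamp) }

# Bookmarks per day of the week, then per hour of the day (UTC here).
tally(times.map { |t| t.strftime('%A') }).each { |day, count| puts "#{day} = #{count}" }
tally(times.map(&:hour)).each { |hour, count| puts "#{hour} = #{count}" }
```

Note that the hours come out in UTC in this sketch, so a conversion to local time would be needed before reading too much into that 4am spike.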

September 09, 2006

Not My SQL


Everyone else's favourite database just gave me the shits, again!

As part of my Schema Validations plugin for rails, I needed to see if a column has a default value. If it does, then there's no point in adding a validates_presence_of as the database will add one in. Ok, sounds sensible. Works just fine under PostgreSQL but my tests were failing when run against MySQL. Specifically, there was no validation being added for integer columns marked as NOT NULL. Huh?!

After a little investigation, I noticed that the meta-data that rails was collecting for mandatory integer columns included a default of 0. So I looked in the test database and sure enough the columns all had a default of 0. But how? Why? I didn't put a default in my migrations...

A little more investigation and I noticed that the schema dump that is generated out of the development database and then run against the test database did indeed include the very same defaults. I then looked in the development database and to my surprise found no such defaults there. Aha! Mystery solved I presumed. Rails must have a bug for MySQL.

So I go and look at the code but alas, the code is the same for both PostgreSQL and MySQL. Something else must be happening. Time to get down and dirty on the command-line.

mysql> create table foo (col1 int, col2 int not null, col3 int default null) engine=InnoDB;

mysql> show columns from foo;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| col1  | int(11) | YES  |     | NULL    |       | 
| col2  | int(11) | NO   |     |         |       | 
| col3  | int(11) | YES  |     | NULL    |       | 
+-------+---------+------+-----+---------+-------+

If I have a nullable column then the default default (if that makes sense) is NULL. If I mark a column as mandatory, the default default is...an empty string!? I wonder what would happen if I tried inserting a row and letting MySQL default all the values:

mysql> insert into foo () values ();

mysql> select * from foo;
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| NULL |    0 | NULL | 
+------+------+------+

You have to be shitting me! I attempted to insert a row into a table without specifying a value for a column that is marked as NOT NULL and it inserts 0!? Hold on a second...what if I force the default to be NULL so that it behaves just like every other sensible database on the planet:

mysql> create table bar (col1 int not null default null) engine=InnoDB;
ERROR 1067 (42000): Invalid default value for 'col1'

Egads! OK let me try that in PostgreSQL:

psql=# create table bar (col1 int not null default null);
CREATE TABLE

Thank-you!

Sure, I could make the assumption that 0 was never going to be a valid identifier for a record in another table but why should I have to? As far as I can tell, MySQL is just making shit up! No wonder my brother says it reminds him of using Microsoft Access.

So, now I'm left with the task of working out how to patch rails to get around this. I think I'll just have to presume that empty strings are equivalent to NULL for mandatory columns. Sheesh.
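
In other words, something along these lines. This is only a sketch of the presumption, not the eventual patch; both method names are hypothetical, and the arguments stand for the raw Default and Null values that show columns reports:

```ruby
# Hypothetical helper: treats the empty-string default MySQL reports for a
# NOT NULL column as "no default at all".
def effective_default(default, null_allowed)
  return nil if !null_allowed && default.to_s.empty?
  default
end

# A presence validation is only worth adding when the column is mandatory
# and the database won't fill in a value of its own.
def needs_presence_validation?(default, null_allowed)
  !null_allowed && effective_default(default, null_allowed).nil?
end

p needs_presence_validation?('', false)   # col2 above: NOT NULL, blank default
p needs_presence_validation?('0', false)  # NOT NULL with an explicit default of 0
p needs_presence_validation?(nil, true)   # col1 above: nullable
```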

RedHill on Rails Plugin Refactoring


I mentioned in my previous entry that I'd done quite a bit of refactoring of the plugins. Among the various changes that will affect developers using them are:

  • Schema Defining (schema_defining) has been deleted;
  • Foreign Key Support (foreign_key_support) has been deleted; and
  • RedHill on Rails Core (redhillonrails_core) has been added to replace the previous two as well as subsuming some of the more generic functionality from other plugins.

So, why all these changes?

The main reason is manageability. We're actually eating our own dog food and using these plugins in production applications and we're adding functionality at quite a surprising rate. Each time we add something, we first put it into the plugin that needs it directly. That works great for a while but then, someday, we decide we need that functionality in two or more plugins. What to do?

Our original idea had been to create new plugins and this worked for us up to a point. Unfortunately, of late, the number of extra plugins—with very specific functionality mind you—was just getting out of hand and needed to be simplified.

In the end, we decided on a two-tiered approach to plugins: those which add functionality but no (or at least minimal) behaviour; and those that add behavioural magic.

As an example, the new core plugin adds functionality to manage foreign keys, lookup indexes, add unique column meta-data, etc. but doesn't do anything particularly magic that will affect the running of your application.

On the other hand, the foreign key migrations, foreign key associations, schema validations, etc. plugins—which all rely on core—add funky rails magic to automatically generate foreign keys, associations, model validation, etc.

Another change we made was in the way documentation is generated. We used to manually generate a nice HTML file containing all the plugins. This was becoming rather tedious and meant that the documentation was often quite out of date. We've now remedied this with a nice ruby script using Erb and RDoc to generate the online documentation directly from the README files.

I also mentioned previously that we've added "lots" of tests. I say lots because we're still playing catchup so relatively, there are lots but we still need lots more. As a group of developers that are ardent TDD evangelists, the conspicuous lack of tests was somewhat embarrassing to say the least. Unfortunately, testing plugins (especially those related to schema and database) is pretty difficult so we opted to bypass the whole problem and just create a standard rails app with standard rails tests and all is well again.

And lastly, besides all the extra features we've added (see the CHANGELOGs for the specific plugins), you'll notice that the subversion URL has changed slightly: it used to contain an extra slash (/) which was not only unnecessary but caused SVN to regularly crap out.

My apologies to all those that have been trying to keep up but we hope that's the last of it. From now on, we'll continue to beef up core as we need and then add plugins only when we need new behaviour.

Of course we'll always reserve the right to change our minds ;-)

September 07, 2006

Foreign Key Associations Plugin


I've done quite a bit of refactoring of my Ruby on Rails plugins lately which, unfortunately, broke some stuff (thanks to all those that let me know) but the upshot is a much cleaner division of responsibility between plugins; and some sorely needed unit tests.

Another of the benefits from all of this was yet another plugin, this time to automatically generate associations based on foreign-keys.

For example, given a foreign-key from a customer_id column in an orders table to an id column in a customers table, the plugin generates:

  • Order.belongs_to :customer; and
  • Customer.has_many :orders.

(In the near future we intend to support has_one associations for foreign-key columns having a unique index.)

If there is a uniqueness constraint (e.g. a unique index) on a foreign-key column, then the plugin will generate a has_one instead of a has_many.

For example, given a foreign-key from an order_id column with a uniqueness constraint in an invoices table to an id column in an orders table, the plugin generates:

  • Invoice.belongs_to :order; and
  • Order.has_one :invoice.
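
To make the naming convention concrete, here is a rough sketch of how the two declarations fall out of a foreign key. It is not the plugin's code: associations_for is a hypothetical name, and the naive "drop the trailing s" singularisation stands in for Rails' inflector:

```ruby
# Hypothetical sketch, not the plugin's code. Dropping a trailing "s" is a
# crude stand-in for Rails' inflector and only works for regular plurals.
def singular(table)
  table.to_s.sub(/s\z/, '')
end

def class_name(table)
  singular(table).split('_').map(&:capitalize).join
end

# Returns the two association declarations implied by one foreign key.
def associations_for(table:, column:, references:, unique: false)
  parent_macro = unique ? "has_one :#{singular(table)}" : "has_many :#{table}"
  [
    "#{class_name(table)}.belongs_to :#{column.to_s.sub(/_id\z/, '')}",
    "#{class_name(references)}.#{parent_macro}"
  ]
end

puts associations_for(table: :orders, column: :customer_id, references: :customers)
puts associations_for(table: :invoices, column: :order_id, references: :orders, unique: true)
```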

You can download the latest version directly from svn://rubyforge.org//var/svn/redhillonrails/trunk/vendor/plugins/foreign_key_associations

For all those that have asked for pure HTTP access, I hear you and I'm working on it. (It seems ./script/plugin install doesn't understand the format of the browse repository pages on RubyForge. DOH!)


Creative Commons License
This weblog is licensed under a Creative Commons License.
