-
Setting up svnsync-ed (mirrored) SVN repositories on Ubuntu (part 2 of 2)
Note: if you haven't already, you may want to read part 1 of this article first.
Phew, and that was all the work you needed to do to migrate your Subversion repository to another server. We've barely touched that other server that we wanted to use for mirroring the repository!
"Bootstrapping" your mirror SVN server for svnsync
This mirror SVN server also needs Subversion 1.4.x installed, so go ahead and do (almost) the same things we did in part 1 to get Subversion installed. You should be able to use the .deb package generated by checkinstall on your main Subversion server to install Subversion 1.4.x on the mirror SVN server. Just scp it to the mirror SVN server and install it (dpkg -i subversion-1.4.3.deb).
With Subversion installed, create a new Subversion repository (using svnadmin create, remember?), but don't load the repository dump like we did on the main SVN server in part 1 (of course, since we are going to be mirroring the repository!).
Setting up svnsync to mirror your repository
First, create an SVN user for svnsync to use - let's call this the 'svnsync user'. The easiest (and best) way to do this is to edit the svnserve.conf and passwd files:
conf/svnserve.conf
    # Uncomment this line.
    password-db = passwd
conf/passwd
    svnsync = secret
This gives read and write access to the 'svnsync user'. The svnsync program will authenticate with our repositories as this user via the svn:// protocol (i.e. via svnserve).
Next, we need to create a pre-revprop-change hook for the destination repository. The svnsync documentation has a detailed explanation. Create a hooks/pre-revprop-change file under your destination repository's directory.
    #!/bin/sh
    USER="$3"
    if [ "$USER" = "svnsync" ]; then exit 0; fi
    echo "Only the svnsync user can change revprops" >&2
    exit 1
Make it executable, and then initialize the sync:
    chmod +x hooks/pre-revprop-change
    svnsync init file:///var/svn/repositories/destination_repos svn://source.host/source_repos
Don't worry, this only sets up the sync - there's no actual data copying yet. Syncing your repository data may take a long time if you have a big source repository, so I suggest using nohup to run it overnight (or something), or at least saving the output in a log. Either way, the command to start the sync (pointed at the same destination repository you just initialized) is:
    svnsync sync --username svnsync file:///var/svn/repositories/destination_repos
You should start seeing svnsync committing changes from your source repository. Instant gratification (well, almost)! Should your svnsync process get aborted or killed, you can remove the hanging lock from the destination repository by running:
    svn propdel svn:sync-lock --revprop -r 0 file:///var/svn/repositories/destination_repos
Setting up 'on-the-fly' syncing
So now you have your source and destination repositories synced, but what happens when you start committing changes to your source repository? Nothing! That's because svnsync is merely a passive syncing tool (meaning you have to run it to sync, instead of it knowing when to sync automatically).
There are two ways you can set up 'real-time' syncing:
- Use cron (or a similar scheduler) on the destination repository server. Add something like this to your crontab (note that svnsync sync takes the destination repository URL):
      * * * * * /usr/local/bin/svnsync --non-interactive sync file:///var/svn/repositories/destination_repos
  This basically runs svnsync on your destination repository server every minute to pull down any changes to your source repository.
- Add a post-commit hook to the source repository. I found this svnsync entry by Paul Querna that has a sample post-commit hook. If I recall correctly I tried it but it didn't work for me, so I settled on using cron to sync up my repositories.
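One thing to watch with an every-minute cron entry is that a long sync can still be running when the next one starts. svnsync guards the mirror with the svn:sync-lock property, but the extra runs still pile up. Here is a minimal sketch of a wrapper script that simply skips a run while the previous one is active. The lock directory, the script path and the sync_once name are placeholders of mine, not anything svnsync provides, and the destination URL is assumed to be the one used with svnsync init.

```shell
#!/bin/sh
# Hypothetical cron wrapper, e.g. saved as /usr/local/bin/svnsync-cron.sh.
# All three values are assumptions; adjust them to your setup.
DEST_URL="${DEST_URL:-file:///var/svn/repositories/destination_repos}"
SVNSYNC="${SVNSYNC:-/usr/local/bin/svnsync}"
LOCK_DIR="${LOCK_DIR:-/tmp/svnsync-cron.lock}"

sync_once() {
    # mkdir either creates the directory or fails if it already exists,
    # so it doubles as a simple lock.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous sync still running, skipping this run" >&2
        return 0
    fi
    "$SVNSYNC" --non-interactive sync --username svnsync "$DEST_URL"
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# The real script would end with a call to sync_once.
```

With something like this in place, the crontab line would call the wrapper script instead of svnsync directly. If the machine reboots mid-sync, the stale lock directory has to be removed by hand.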
Things that I skipped
There're some things that I skipped over while writing this, mainly to do with SVN authentication.
- If you're accessing your repository via the svn+ssh:// protocol, you have to manage the (group) permissions of the repository files in the filesystem appropriately (basically the repository should be group writable by your users). chmod and chown are your friends, as is NIS (or something similar) to manage your users. I use these steps to create a new SVN repository that is accessed via the svn+ssh:// protocol:
      sudo mkdir /var/svn/repositories/funky_project
      mkdir /tmp/funky_project
      mkdir /tmp/funky_project/trunk
      mkdir /tmp/funky_project/branches
      mkdir /tmp/funky_project/tags
      sudo svnadmin create /var/svn/repositories/funky_project
      sudo svn import /tmp/funky_project file:///var/svn/repositories/funky_project -m "Initial import."
      rm -rf /tmp/funky_project
      sudo chown -R www-data:www-data /var/svn/repositories/funky_project
      sudo chmod -R g+w /var/svn/repositories/funky_project
  As you can see, my SVN users are part of the www-data group, and the repository directory is made group-writable.
- The svn:// protocol has authentication configuration files in the conf/ directory of your repository. The SVN book has a section explaining how to configure authentication for svnserve.
- Apache httpd can be used to expose your SVN repositories via the WebDAV protocol. This allows for the very commonly seen http:// repository URLs (especially for Open Source projects). Configuration is a little more involved and you would probably have to install Apache from source as well. The SVN book has the details.
Wrapping up
I hope someone found this entry useful - I know I could have used one when I was setting up Subversion and svnsync.
-
Testing rescue_action_in_public with RSpec
After overriding rescue_action_in_public in the ApplicationController to deal with ActiveRecord::RecordNotFound exceptions (a very common exception to rescue in the canonical 'show' actions of your controllers), I decided to test it. I've been getting used to BDD with RSpec (and the Spec::Rails plugin), so I stumbled a bit when writing the spec.
I finally settled on this:
    class DummyController < ApplicationController
      def index
      end
    end
    context 'A child class of ApplicationController' do
      controller_name :dummy
      specify 'should render a 404 error for ActiveRecord::RecordNotFound, ActionController::UnknownController, ActionController::UnknownAction, ActionController::RoutingError exceptions (in public)' do
        exceptions_404 = [
          ActionController::RoutingError.new('test'),
          ActiveRecord::RecordNotFound.new,
          ActionController::UnknownController.new,
          ActionController::UnknownAction.new]
        exceptions_404.each do |exception|
          controller.eigenclass.send(:define_method, :index) do
            raise exception.class, 'some message'
          end
          lambda { get 'index' }.should_raise(exception.class)
          controller.send :rescue_action_in_public, exception
          response.should be_missing
        end
      end
    end
Notice the use of a dummy controller so that we can actually make a request to it (and get all the Rails magic and environment set up ready for testing). Also, I had to use instances of the exceptions rather than their classes because I'm sending a rescue_action_in_public message to the controller without knowing how to instantiate the exceptions (for example, ActionController::RoutingError actually has a constructor which requires at least 1 argument). So I create the exceptions first.
The eigenclass method simply returns Ruby's canonical singleton class or metaclass, depending on who you talk to (i.e. class << self; self; end;) and I modify the dummy 'index' action to raise the exception. And here's the stinky part:
    lambda { get 'index' }.should_raise(exception.class)
    controller.send :rescue_action_in_public, exception
Make a GET to the 'index' action, make sure it raises the exception and catch it (with should_raise - the assertion is unnecessary since I did override 'index' to raise the exception), and then force rescue_action_in_public to be called. Something's fishy here - why isn't the exception caught by default by rescue_action_in_public? I've set these to make sure that rescue_action_in_public is called but it seems like it never is called:
    ActionController::Base.consider_all_requests_local = false
    controller.eigenclass.send(:define_method, :local_request?) do
      false
    end
I traced the code into ActionController::Rescue and everything seems to be in order. I'm stumped and weary, I think I'll look at this again tomorrow. Anyone see any obvious mistakes?
-
Setting up svnsync-ed (mirrored) SVN repositories on Ubuntu (part 1 of 2)
This is a 2-part journal on migrating and upgrading a Subversion repository, and then using svnsync to mirror the newly created repository. (Part 2)
Initial setup
Ever since Subversion 1.4 was released, I'd been eyeing the new svnsync tool because we had a single repository that was not, erm, really backed up (we had daily server backups and occasional manual repository dumps but that was it). svnsync promised to make repository mirroring simple, and after doing some repository migration and upgrading, I can assure you it really does make things easier than any other (more manual) repository backup solution I had seen before. This is a walkthrough of how you can upgrade your pre-1.4 SVN repositories to 1.4.x, and set up svnsync to mirror your repositories. It's going to be very biased towards Ubuntu but I'm sure you can translate any Ubuntu-specific steps to your favorite distro.
Here's our initial setup:
- A pre-1.4 (it was version 1.2.3) SVN repository that needed to be upgraded and migrated to another server.
- 2 cleanly installed Ubuntu 6.06 LTS VPSs, one of which is the intended target for the repository migration. The other would mirror the new 1.4.x repository (using svnsync).
An un-installable Subversion 1.4.x?
I wish I could have simply run sudo apt-get install subversion and have Ubuntu pull down the latest 1.4.x .debs. Unfortunately, the version of Subversion in the Ubuntu apt-get repository is still 1.3.1 (which doesn't have svnsync). If anyone knows a reliable way to install Subversion 1.4.x via apt-get, let me know! I looked around for a good edge sources.list but came back empty-handed.
I balk at installing stuff from source because I never did figure out how to easily clean out the stuff that gets installed. All thanks to this reluctance, I went digging around and found checkinstall. This thing is awesome - I wonder why I didn't manage to find it earlier.
What checkinstall basically allows you to do is replace the usual make install after the usual configure and make steps: it creates a Debian package (it also does RPMs and Slackware packages) for you that is easily un-installable with dpkg, and then proceeds to install the files just as it would have for any other deb.
On Ubuntu it's really easy to install checkinstall, just:
    sudo apt-get install checkinstall
Now, you should no longer type 'make install' - always use the 'checkinstall' command instead:
    ./configure
    make
    sudo checkinstall # instead of "make install"
checkinstall will ask you a bunch of stuff but you can just go with the defaults for most of them - I did name my packages 'XXX from source', like 'Subversion 1.4.3 from source' so it's easier to check which packages are checkinstall-generated with a simple grep of dpkg -l.
checkinstall generates a .deb (Debian) package before it actually installs your software (Subversion, in this case). It should tell you right at the end of its installation process about where to find this .deb and how to uninstall your newly installed (from source!) package (something like dpkg -r subversion-1.4.3). Don't delete this .deb yet as we will be using it to install Subversion 1.4.x on our mirror SVN server.
Installing an un-installable Subversion 1.4.x on Ubuntu
Now, blessed with our new checkinstall-granted powers, we can install Subversion from source without any qualms. Before we start, be sure to purge any existing Subversion packages you may have installed (do note that if you're using any packages that depend on the official Ubuntu Subversion packages, you may run into library version problems).
    dpkg -l | grep svn
    dpkg -l | grep subversion
    sudo dpkg --purge subversion
    sudo dpkg --purge libsvn0
Now, it's time to get the source. Get it from the official Subversion website. Look for the source code download - the file should be something like this: http://subversion.tigris.org/downloads/subversion-1.4.3.tar.gz. Remember to get the SVN dependencies as well (something like this: http://subversion.tigris.org/downloads/subversion-deps-1.4.3.tar.gz). If you want to use Subversion to connect to a server via a http:// or https:// URL, you will require these dependencies (more specifically, the Neon library).
Use something efficient like wget, curl or Axel (love Axel) to get the sources on the server where you want to install Subversion. Unpack them to the same directory. configure. make. checkinstall.
    tar zxf subversion-1.4.3.tar.gz
    tar zxf subversion-deps-1.4.3.tar.gz
    cd subversion-1.4.3
    ./configure # Be sure to read the INSTALL file for any options you may want to set (such as SSL)
    make
    sudo checkinstall
If you get a warning "configure: WARNING: we have configured without BDB filesystem support" during your configure step, you'll get by just fine. Unless you specifically want your Subversion repositories in Berkeley DB format, you can ignore the warning (Subversion will use the FSFS filesystem for your repositories) - see FSFS notes and Choosing a Data Store if you want to make an educated decision.
Anyway, now with a brand new Subversion 1.4.x installed, we are finally ready for the real work - migrating your Subversion repository!
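Before moving on, it can be worth confirming that the svn found on your PATH really is the freshly built one and not a leftover 1.3.x binary. The following is only a rough sketch: the function name svn_ready_for_svnsync is made up for illustration, and it assumes svn --version --quiet prints just the version number.

```shell
# Succeeds only if the svn client reports version 1.4 or later
# (the first release that ships svnsync) and svnsync is on the PATH.
svn_ready_for_svnsync() {
    version=$(svn --version --quiet 2>/dev/null) || return 1
    [ -n "$version" ] || return 1
    case "$version" in
        0.*|1.[0-3]|1.[0-3].*) return 1 ;;   # pre-1.4: no svnsync
    esac
    command -v svnsync >/dev/null 2>&1
}
```

Running svn_ready_for_svnsync && echo "good to go" after the install gives a quick yes/no answer. Running which svn alongside it shows which binary is actually being picked up.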
Dumping and importing a repository
Dumping a Subversion repository is dead easy:
    svnadmin dump /path/to/repository > repository_name.dump
Depending on how big your repository is, you could end up with a pretty large dump file. gzip it, then scp it over to your new server, then gunzip it. Use svnadmin to load the repository dump.
    cd /var/svn # I like to keep my svn repositories under /var/svn
    mkdir repository_name
    svnadmin create repository_name
    svnadmin load repository_name < /path/to/repository_name.dump
If you have a good pipe between the source and destination servers, you can do this in a one-liner:
    svnadmin dump /path/to/repository | ssh -C [IP/domain of destination server] svnadmin load /path/to/new_repository
Of course, all this dumping would require a temporary suspension of any repository write actions, otherwise you're just going to have an inconsistent dump - just send out an email to your fellow developers and disable svn access.
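For the gzip, scp, gunzip route described above, here is a minimal sketch of the two halves as small shell functions. The function names and file paths are placeholders I made up; the only real commands involved are svnadmin dump, svnadmin load, gzip and gunzip.

```shell
# Sketch only: function names and file paths are placeholders.
dump_compressed() {
    # $1 = repository path, $2 = dump file to write (gzip adds the .gz suffix)
    svnadmin dump "$1" > "$2" || return 1
    gzip -f "$2"
}

load_compressed() {
    # $1 = gzipped dump file, $2 = repository already created with svnadmin create
    gzip -t "$1" || return 1          # refuse a truncated or corrupt transfer
    gunzip -c "$1" | svnadmin load "$2"
}
```

The idea is to run dump_compressed on the old server, scp the resulting .gz file across, and run load_compressed on the new server. The gzip -t check is there because a half-finished scp is an easy mistake to miss.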
Setting up access to your new repository
Now, you have a Subversion repository that is only accessible via the local filesystem (file:// 'protocol'), which isn't very useful. We'll need to set up remote access. Your Subversion repository can be accessed in a variety of ways, including:
- svnserve standalone daemon (svn://)
- svnserve with inetd (svn://)
- svnserve over an SSH tunnel (svn+ssh://)
- over the HTTP protocol (http:// and https://)
The svnserve documentation details how to deal with the first 3, and setting up http:// and https:// access to your repository is really a subject that deserves its own tutorial. Try the SVN book or Google.
Personally I prefer svn+ssh:// access for internal projects since it allows me to unify authentication for my Subversion repositories with UNIX user accounts. Be wary of an angry cadre of Windows developers though, since they need to take quite a few steps to set up public key authentication and integrate it with their svn clients on Windows machines. Integration with TortoiseSVN is quite a pain, though my Windows-using colleague at work found these useful: Putty and TortoiseSVN, Using Cygwin, Keychain, SVN+SSH and TortoiseSVN in Windows.
svn:// access
I also expose my repositories via svn:// (as we'll see later, this is useful for allowing access to a svnsync user without messing around with any UNIX user accounts) and use the xinetd daemon (apt-get install xinetd on Ubuntu to install) to launch the svnserve process. If you're taking this path, create a file (I name it 'svn') in /etc/xinetd.d to tell xinetd about svnserve.
In /etc/xinetd.d/svn:
    service svn
    {
        port = 3690
        socket_type = stream
        protocol = tcp
        wait = no
        user = www-data
        server = /usr/local/bin/svnserve
        server_args = -i -r /var/svn
    }
Notice that I needed to use the full path to svnserve (do a which svnserve to get the full path, making sure this is the 1.4.x version that you just installed). The server_args parameter also bears some explanation. The -i option tells svnserve to run in inetd mode (xinetd is a variant of inetd, sorta). The -r /var/svn option tells svnserve to only expose repositories below that path. This basically makes your repository at /var/svn/my_cool_project accessible via svn://your.hostname/my_cool_project.
svn+ssh:// access
Accessing your repository this way basically logs in to the host server of your repository over SSH, invokes the svnserve process, and accesses your repository in a very file://-like manner. What this means is that your repository path is taken from the root of your filesystem. An example: a repository located in /var/svn/my_cool_project would be available at svn+ssh://your.hostname/var/svn/my_cool_project. For this reason I often symlink /svn to /var/svn (to get repository URLs like svn+ssh://your.hostname/svn/my_cool_project instead).
Relocating working copies
Now, all your working copies are still pointing to the old Subversion server - no need to fret, a simple svn switch fixes things:
    svn switch --relocate [from] [to]
Replace '[from]' and '[to]' with the source and destination Subversion repository URLs.
Remember to stop access to your old server so no one is making commits to the wrong place.
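If you have a handful of working copies checked out on one machine, the relocation can be looped. This is just a sketch: relocate_all is a name I made up, and the URLs and directories in the usage line are placeholders.

```shell
relocate_all() {
    # $1 = old repository URL, $2 = new repository URL, rest = working copy directories
    from=$1
    to=$2
    shift 2
    failed=0
    for wc in "$@"; do
        # The subshell keeps the cd from leaking into the caller.
        if ! ( cd "$wc" && svn switch --relocate "$from" "$to" ); then
            echo "could not relocate $wc" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Usage would look something like relocate_all svn://old.host/my_cool_project svn://new.host/my_cool_project ~/work/project_a ~/work/project_b. The function returns non-zero if any working copy could not be switched, so failures don't go unnoticed.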
Setting up svnsync
I'd intended to write this entire piece in one blog post, but I'm running out of steam at this point. In Part 2, we'll actually set up svnsync for some repository mirroring goodness!
-
Checking for duplicate ActiveRecord objects
I've been writing a database importer plugin for a Rails application that needs data from some "legacy" production databases (well, not really legacy, but the schema differs from ActiveRecord conventions) with the intention of scheduling a cron job to run the imports. Why not connect the Rails app to the legacy databases? Hmm, let's see:
- the records don't have to be up to date (so I can afford to, say, import yesterday's records today),
- less jumping through hoops molding ActiveRecord models to the legacy databases,
- the production database schema is liable to change - but this should not affect my Rails application,
- there will be lower loads on the legacy databases which are in full-blown production use, and
- most importantly, it gives me an excuse to figure out writing a data importer for a Rails application.
And I am surprised that it actually was rather fun writing the importer plugin (data importing stuff is normally one of the most unexciting things a programmer can do, right next to writing lengthy requirements documentation and any kind of contact sport). It's basically a plugin that defines ActiveRecord models on the source (legacy) databases and then creates our Rails app's models from these. Importer classes allow me to then run the imports using script/runner like so:
    script/runner "HotelsImporter.import :start => 2.days.ago.to_date, :end => 1.day.ago.to_date" -e production
Put that in a cron job and there you go, scheduled daily (or hourly, whatever) imports.
But I digress. What was I actually going to talk about? Oh yes, checking for duplicate ActiveRecord objects. Now, the importers I wrote were run daily but there was the risk of re-importing the same data again (due to failed cron jobs, running the same job twice, acts of god, etc.). To be defensive, I needed to check that there were no existing records before importing them from the legacy databases.
At this point I could decide to run uniqueness checks on any natural keys of each table (and Rails makes this really easy with AR validations, as we all know), or rely on a more convenient "the whole hog" field-by-field comparison. I settled on doing a field-by-field comparison after realizing that:
- it's easier and I don't have to specify which natural fields constitute the natural keys, and
- there are some tables which don't really have natural keys (these generally belong to the has_many side of an association).
Update: As choonkeat pointed out in a comment below, I can simply use Post.find(:all, :conditions => new_post.attributes) since that stood out very clearly as the way to do it. This was actually the first way I tried to do this but it didn't work in the importer - I must have been doing something stupid! Doh! Thanks choonkeat for pointing out my blooper. Anyway you can mostly ignore what follows below but I'll keep it here to remind myself of my error.
So I went looking for an easy way or a Railism to check whether a new ActiveRecord object already exists in the database. Hmm, I couldn't find anything helpful - I guess everyone is relying on AR validations. Still, I went ahead and mixed in a to_conditions instance method to ActiveRecord::Base - looks like my answer to everything nowadays is to re-open existing classes.
    module Bezurk #:nodoc:
      module ActiveRecord #:nodoc:
        module Extensions
          def to_conditions
            attributes.inject({}) do |hash, (name, value)|
              hash.merge(name.intern => value)
            end
          end
          alias :to_conditions_hash :to_conditions
        end
      end
    end
    # ...
    ActiveRecord::Base.send(:include, Bezurk::ActiveRecord::Extensions)
So now in my importers I can easily check for potential duplicate entries:
    new_post.save! if Post.find(:all, :conditions => new_post.to_conditions).empty?
Now, I just have this nagging suspicion that there is a better way to do this...
-
irb and script/console tab-completion
Ugh, I wish I'd found this earlier: Tab Completion in IRb. I only went googling for this after I realized I had been tabbing to get auto-completion in script/console for a bit but it never sank in that tab-completion wasn't ever working. Useful stuff, go set it up if you haven't already.