bmpercy

Friday, March 30, 2012

Some fun with ruby 1.9(.3) and string encoding..

Okay, I should probably start by directing you here: http://blog.grayproductions.net/articles/understanding_m17n (by the way, m17n = multilingualization: an "m", 17 letters, then an "n"). If you really wanna get dirty with character encoding in ruby, read up.

Anyway, so I absorbed some percentage of that, but was still a little surprised by some of ruby 1.9's default behavior, in particular what happens when you concatenate/interpolate strings with mixed US-ASCII and UTF-8 encodings. If you combine two such strings, the result is sometimes a US-ASCII-encoded string and sometimes a UTF-8-encoded string, depending on whether the UTF-8 substring actually contains any multibyte chars:


> irb
1.9.3p125 :001 > foo = "foo"
 => "foo" 
1.9.3p125 :002 > bar = "bar"
 => "bar" 
1.9.3p125 :003 > baz = "báz"
 => "báz" 
1.9.3p125 :004 > foo.encoding.name
 => "UTF-8" 
1.9.3p125 :005 > bar.encoding.name
 => "UTF-8" 
1.9.3p125 :006 > baz.encoding.name
 => "UTF-8" 
1.9.3p125 :007 > foobar1 = "#{foo.force_encoding(Encoding::US_ASCII)}#{bar}#{bar}"
 => "foobarbar" 
1.9.3p125 :008 > foobar2 = "#{foo.force_encoding(Encoding::US_ASCII)}#{bar}#{baz}"
 => "foobarbáz" 
1.9.3p125 :009 > foobar1.encoding.name
 => "US-ASCII" 
1.9.3p125 :010 > foobar2.encoding.name
 => "UTF-8"
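
For anyone who wants to poke at this outside irb, here is a minimal sketch of the same rule. It assumes plain String#+ rather than interpolation (interpolation may pick the encoding differently on other ruby versions), and concat_encoding is just a made-up helper name. Encoding.compatible? is the built-in that reports which encoding a combination would get.

```ruby
# Hypothetical helper (not from the original post): report the encoding
# that results from joining two strings with String#+.
def concat_encoding(left, right)
  (left + right).encoding
end

# A US-ASCII string, a UTF-8 string with only 7-bit chars, and a UTF-8
# string with a multibyte char (written as an escape so the source file's
# own encoding does not matter).
ascii          = "foo".dup.force_encoding(Encoding::US_ASCII)
utf8_plain     = "bar".encode(Encoding::UTF_8)
utf8_multibyte = "b\u00E1z"

puts concat_encoding(ascii, utf8_plain).name
puts concat_encoding(ascii, utf8_multibyte).name

# Encoding.compatible? answers the same question without building the string.
puts Encoding.compatible?(ascii, utf8_multibyte).name
```

When both sides are 7-bit clean, the left-hand string's encoding wins, which is how a single forced US-ASCII string can "infect" the result.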

As a result, at Goodreads, we had to do some monkey-patching as we were getting some US-ASCII strings back from some rails helper code (pluralize(), number_with_delimiter()) as well as some ruby built-in classes (to_s() from NilClass, Float and Fixnum; join() from Array). There must be a better way, but we've now got this force-utf8 monkey patch file with stuff like this:


module ActionView
  module Helpers
    module NumberHelper
      def number_with_delimiter_with_force_utf8(*args)
        number_with_delimiter_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
      end
      alias_method_chain :number_with_delimiter, :force_utf8
    end
  end
end

# bunch of to_s that need fixing...maybe see if there's a [Class1, Class2].each way of
# doing this that's a little DRYer...
class Array
  def join_with_force_utf8(*args)
    join_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :join, :force_utf8
end
class Fixnum
  def to_s_with_force_utf8(*args)
    to_s_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :to_s, :force_utf8
end
class Float
  def to_s_with_force_utf8(*args)
    to_s_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :to_s, :force_utf8
end
class NilClass
  def to_s_with_force_utf8(*args)
    to_s_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :to_s, :force_utf8
end
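
On the "[Class1, Class2].each" idea from the comment above: a rough sketch of what a DRYer version could look like. This is not the patch file described here. It assumes ruby 2.0+ so it can use Module#prepend instead of alias_method_chain, it uses Integer as a stand-in for Fixnum (which newer rubies no longer define), and ForceUtf8ToS is a made-up module name.

```ruby
# Hypothetical module name; wraps to_s and re-tags the result as UTF-8.
module ForceUtf8ToS
  def to_s(*args)
    # dup first: some core to_s results (e.g. nil.to_s) are frozen on
    # newer rubies, and force_encoding would raise on a frozen string.
    super(*args).dup.force_encoding(Encoding::UTF_8)
  end
end

# Integer stands in for Fixnum here (assumption, see lead-in).
[Integer, Float, NilClass].each do |klass|
  # send is used because prepend was a private method on older rubies.
  klass.send(:prepend, ForceUtf8ToS)
end

puts 42.to_s.encoding.name
puts 3.5.to_s.encoding.name
puts nil.to_s.encoding.name
```

Arguments are passed straight through, so things like 42.to_s(2) keep working.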

Note: we also started marking a bunch of our code files with the magic comment (more here) because that seems to be the most effective way to force ruby to default new strings to UTF-8 encoding (there are a few other options, but this has been easiest and most effective). Being an emacser, I tend to propagate the # -*- coding: utf-8 -*- form...

# posted by brainpercy @ 10:20 PM 0 comments

Tuesday, February 21, 2012

How to test code in your ActiveRecord after_commit callbacks by disabling transactional fixtures per-test, without hiding bugs

Okay, so at Goodreads we're finally moving to rails 3.2, and in the process we've discovered the cool after_commit and after_rollback callbacks that easily let you execute things outside the transaction that wraps an active record save/destroy. This was exactly what we needed for a few cases where we were doing non-mission-critical updates in callbacks that can on occasion take a bit of time (hitting memcached or redis servers when our resque queue was overwhelmed) inside transactions.

So we had reason to try converting some of our after_(save|update|destroy|create) callbacks to after_commit calls. The easy step is figuring out how to deal with code that used to do name_changed? or name_was helpers (they retain their expected behavior only within the transaction; once the commit takes place they all get reset). So all we did was keep some after_xxx methods around that set instance variables, then let the after_commit methods read those instance variables (then reset to some disabled state lest we re-process the same code again if models get saved repeatedly for some reason).
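
To make the instance-variable trick concrete, here is a plain-ruby sketch with no ActiveRecord dependency. Everything in it is a stand-in: Book, remember_name_change and reindex_if_name_changed are invented names, name_changed? is a toy version of the dirty-tracking helper, and save just calls the two hooks in the order that after_save and after_commit would fire.

```ruby
# Plain-ruby stand-in for a model. In rails, remember_name_change would be
# registered with after_save and reindex_if_name_changed with after_commit.
class Book
  attr_accessor :name
  attr_reader :reindexed_names

  def initialize(name)
    @name = name
    @committed_name = name
    @name_change_pending = false
    @reindexed_names = []
  end

  # Toy version of ActiveRecord's dirty-tracking helper.
  def name_changed?
    @name != @committed_name
  end

  # after_save stand-in: still "inside the transaction", dirty state visible.
  def remember_name_change
    @name_change_pending = true if name_changed?
  end

  # after_commit stand-in: dirty state is already reset, so use the flag.
  def reindex_if_name_changed
    return unless @name_change_pending
    @name_change_pending = false # reset so a repeat save does not re-process
    @reindexed_names << @name    # placeholder for the slow memcached/redis work
  end

  # Simulates save + commit, firing the hooks in the usual order.
  def save
    remember_name_change
    @committed_name = @name # the commit resets dirty tracking
    reindex_if_name_changed
  end
end

book = Book.new("old title")
book.name = "new title"
book.save
book.save # nothing changed this time, so nothing is re-processed
p book.reindexed_names
```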

Okay, but what about testing? By default, "transactional fixtures" are enabled. I'm not really sure why that's the terminology: we rely on this behavior (every test wrapped in a transaction that gets rolled back), yet I really, really hate testing with fixtures. In fact, I just spent about 18 hours straight yesterday ripping a ton of them out of our codebase to get this all working. But I digress.

The problem is that if we want our after_commit callbacks to fire, they can't be wrapped by transactions around the entire test, because the commit never happens (the transaction gets rolled back after the test runs, pass or fail). So we had to find another way. The only options we could come up with were:

1. Hack the callbacks to call the commit callbacks without actually committing the records (a la something like this)

2. Disable all transactional fixtures and handle the data cleanup ourselves (set use_transactional_fixtures = false in your TestCase class)

3. Hack Test::Unit to not wrap (or unwrap) the tests in transactions

As with any good blog post, we chose door number 3. Reasoning:

The first option leaves us wide open to bugs, especially of the variety where we might depend on name_changed?-type logic, which may still return true if the transaction hasn't been committed yet...which would result in tests passing even if the code would fail in production (!). This was too much of a risk for me to stomach.

The second option sounds like a lot of work, a pain every time you add a new model/table, and potentially slow if we're deleting data from every table after each test (we could do "TRUNCATE TABLE foo" to be fairly fast, but it seemed to take about 0.5 sec each time we did this for our schema; with thousands of tests, it starts to add up). We're also kind of sick of having "data leakage" between tests, and this approach seems to encourage it. One note: I prefer rspec syntax, but we're kinda stuck with Test::Unit for historical reasons...I'm not sure if rspec's hierarchical structure would allow enabling/disabling transactional fixtures at a more granular level? I think I remember having to do it at the class level there too (all tests are either transactional or not; it's not determined at the individual test level).

So option three: find a way to hack Test::Unit to not wrap the tests in transactions, allowing us to select on a per-test basis whether to wrap in a transaction or not. I actually never dug deep enough to find where that code was...and considered monkey-patching or subclassing to maintain a separate queue of tests that are to be run outside transactions. In the end I took a bit of a shortcut and actually roll back the transaction at the beginning of the test. Yeah, the first thing we do is roll back the transaction, turn off transactional fixtures (so activerecord doesn't try to roll things back), run the test, then restore everything the way it was (minus the transaction...it appears ActiveRecord plays well and doesn't try to roll back transactions that aren't there). Bam, done!

So we're using rails 3.2.1, hopefully this is stable across several revisions, I'd hate to have to chase this down again. :( But here's the key:


# activerecord-3.2.1/lib/active_record/fixtures.rb:
...
module ActiveRecord
  module TestFixtures
    ...
    def run_in_transaction?
      use_transactional_fixtures &&
        !self.class.uses_transaction?(method_name)
    end
    ...
    def teardown_fixtures
      return unless defined?(ActiveRecord) && !ActiveRecord::Base.configurations.blank?

      unless run_in_transaction?
        ActiveRecord::Fixtures.reset_cache
      end

      # Rollback changes if a transaction is active.
      if run_in_transaction?
        @fixture_connections.each do |connection|
          if connection.open_transactions != 0
            connection.rollback_db_transaction
            connection.decrement_open_transactions
          end
        end
        @fixture_connections.clear
      end
      ActiveRecord::Base.clear_active_connections!
    end
    ...
  end
end

So two things:

run_in_transaction?: by setting use_transactional_fixtures = false, we can force this method to return false

so in teardown_fixtures, it won't bother trying to do any rollback at all...so we can "safely" roll back the transaction before even beginning our test

And here's the code I wrote. I created a module that we just monkey-patch/mix in to ActiveSupport::TestCase. I aimed for a one-liner to deactivate fixtures (and not require an explicit cleanup call at the end, cause someone's gonna forget).

gist here

A few notes:

setup_fixtures is a complement to teardown_fixtures, in ActiveRecord::TestFixtures. I'm just calling it here to restore any fixture data that tests actually need

delete_everything is a method specific to our code that knows which tables/models to delete. There are a few options for this...maybe just do a "SHOW TABLES" query and delete what you've got (except maybe schema_migrations ;) ). Maybe query all descendant classes of ActiveRecord::Base. We chose to maintain a list of models and tables so we can be a little more selective of which tables we clear out (to save a little processing time). It'll mean a little more maintenance, but we're a little more stable in terms of our schema, so saving a few minutes on running our full set of tests is probably worth it.

You just need to call a single method at the top of your test (or in a setup method if you want to apply it to all tests in a suite--again, here's a place I prefer rspec, as you can effectively have a separate setup method for an arbitrary grouping of tests within a given test suite, but meh):


test "some_method is supposed to do something interesting" do
  disable_transactional_fixtures # optionally pass in args telling specific tables/models to delete
  # do your tests...
  # everything gets deleted magically on its own! :)
end
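
On the delete_everything note above: the real method isn't shown in this post, so what follows is only a guess at the simplest "list the tables and delete what you've got" variant. It assumes a connection object that responds to tables and execute (as ActiveRecord::Base.connection does); RecordingConnection is a made-up stand-in so the sketch runs without a database, and SKIPPED_TABLES and the only: argument are invented for illustration.

```ruby
# Tables that should never be wiped (assumption: just the migrations table).
SKIPPED_TABLES = %w[schema_migrations].freeze

# Hypothetical sketch: delete rows from the given tables, or from every
# table the connection reports, skipping SKIPPED_TABLES.
def delete_everything(connection, only: nil)
  tables = (only || connection.tables) - SKIPPED_TABLES
  tables.each { |table| connection.execute("DELETE FROM #{table}") }
  tables
end

# Stand-in for a database connection that just records the SQL it is given.
class RecordingConnection
  attr_reader :tables, :statements

  def initialize(tables)
    @tables = tables
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end

connection = RecordingConnection.new(%w[books users schema_migrations])
delete_everything(connection)
puts connection.statements
```

Passing only: %w[books] narrows the cleanup to specific tables, which is the "be a little more selective" trade-off described in the notes.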

---

Goodreads is hiring! Please check us out and make the world a better place for readers!

Labels: fixtures, rails 3.2, testing, transactions

# posted by brainpercy @ 6:02 PM 1 comments

Sunday, May 09, 2010

mailx, msmtp and gmail

So we wanted to send automated emails out from our ubuntu server via gmail. We found a recipe and all was fine...till google changed their certificate. We had sort of blindly followed directions, which included downloading a single CA certificate, and pointing msmtp to that one cert.

(From the comment below, I'm guessing our recipe came from http://philogroky.blogspot.com/2009/08/fixing-msmtp-to-send-mails-via-gmail.html, so you may want to start there to see the early steps for mailx and msmtp...though if you're on ubuntu, I think you'll be set with sudo apt-get install msmtp bsd-mailx (or some other flavor of mailx)...)

We eventually caught on and realized that was silly when we had a whole slew of CA certificates that we already trusted on the server. So instead, point msmtp to that!

So here's our msmtp config file (~/.msmtprc is where msmtp looks by default) and we've been pretty happy ever since!:


  account gmail
  auth on
  host smtp.gmail.com
  port 587
  user ouraccount@somedomain.com
  password somepassword
  from ouraccount@somedomain.com
  tls on
  tls_starttls on
  # tls_trust_file argument is the full path to the certificate
  # changed to this as suggested on
  #   http://philogroky.blogspot.com/2009/08/fixing-msmtp-to-send-mails-via-gmail.html
  # hopefully never have to update this stupid thing again...
  tls_trust_file /etc/ssl/certs/ca-certificates.crt
  maildomain gmail.com
  account default : gmail

# posted by brainpercy @ 7:00 PM 0 comments

Friday, April 16, 2010

Getting cassandra up to hit with ruby and C++ clients

So at Discovereads we're giving cassandra a spin on the dance floor to see how she moves. We need to connect via both ruby and C++ clients (most of our writes will come from C++, while the ruby side will mostly read, though it'll do some writing as well).

Seems everyone has a blog post like this one where he or she says "it took forever, kept looking at other sites and none of them just worked, so hopefully I'll save someone else some time and put the steps I took here". Well, this is mine. Pretty skeptical it'll actually help anyone else, but I hope it does!

I got a lot of help (on the ruby side) from http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/

And once I could hit cassandra from the ruby client, I moved on to trying to hit it with the libcassandra lib from posulliv: http://posulliv.github.com/2010/02/22/cpp-cassandra.html , http://github.com/posulliv

First, get cassandra (the ruby way), following evan weaver's instructions (link above)

Now, to get the C++ lib working, there were a few unsatisfied dependencies:
- boost (I used 1.42.0): http://sourceforge.net/projects/boost/files/boost/1.42.0/
- thrift (I used thrift 0.2.0): http://incubator.apache.org/thrift/download/

Cassandra C++ lib

get boost:

tar xzf boost_1_42_0.tar.gz
cd boost_1_42_0
./bootstrap.sh
./bjam
sudo ./bjam install --prefix=/usr/local (i think i'm being redundant with the /usr/local but oh well)

get thrift:

tar xzf thrift-0.2.0-incubating.tar.gz
cd thrift-0.2.0
(note: i was getting desperate when encountering errors so installed libevent (sudo port install libevent) in here, but i don't think it was necessary for thrift to install. so skip this and see if everything still works....)
quick check of autoconf version:
autoconf -V (note the capital "V")
if the version is 2.61 you're fine; otherwise, see http://wiki.apache.org/thrift/ThriftInstallationMacOSX and try to figure something out
./configure
make
sudo make install

libcassandra:

git clone git://github.com/posulliv/libcassandra.git
cd libcassandra
./config/autorun.sh
./configure
make
It was complaining about variadic macros and C99, so i went into the Makefile and removed the two instances of -Werror. yes, total hack, but it did compile.
sudo make install

still verifying if all looks well...

# posted by brainpercy @ 3:49 PM 0 comments

Wednesday, August 20, 2008

boo!

# posted by brainpercy @ 10:23 AM 0 comments