Friday, March 30, 2012

 

Some fun with ruby 1.9(.3) and string encoding..

Okay, I should probably start by directing you here: http://blog.grayproductions.net/articles/understanding_m17n If you really wanna get dirty with character encoding in ruby, read up (by the way, m17n = multilingualization: "m", 17 letters, "n").

Anyway, I absorbed some percentage of that, but was still a little surprised by some of ruby 1.9's default behavior, in particular what happens when you concatenate or interpolate strings with mixed US-ASCII and UTF-8 encodings. If you combine two such strings, the result is sometimes a US-ASCII-encoded string and sometimes a UTF-8-encoded one, depending on whether the UTF-8 substring contains any multibyte chars:

> irb
1.9.3p125 :001 > foo = "foo"
=> "foo"
1.9.3p125 :002 > bar = "bar"
=> "bar"
1.9.3p125 :003 > baz = "báz"
=> "báz"
1.9.3p125 :004 > foo.encoding.name
=> "UTF-8"
1.9.3p125 :005 > bar.encoding.name
=> "UTF-8"
1.9.3p125 :006 > baz.encoding.name
=> "UTF-8"
1.9.3p125 :007 > foobar1 = "#{foo.force_encoding(Encoding::US_ASCII)}#{bar}#{bar}"
=> "foobarbar"
1.9.3p125 :008 > foobar2 = "#{foo.force_encoding(Encoding::US_ASCII)}#{bar}#{baz}"
=> "foobarbáz"
1.9.3p125 :009 > foobar1.encoding.name
=> "US-ASCII"
1.9.3p125 :010 > foobar2.encoding.name
=> "UTF-8"
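
As far as I can tell, the rule behind the session above is the one Encoding.compatible? implements: if both strings contain only 7-bit characters the left-hand encoding wins, otherwise the string that actually contains non-ASCII bytes decides. Here's a minimal sketch of that using String#+ (I'm assuming plain concatenation rather than interpolation here, since interpolation may behave differently across ruby versions; the variable names are just for illustration):

```ruby
# Build the strings with explicit encodings so the result does not depend on
# the encoding of the source file itself.
ascii      = "foo".dup.force_encoding(Encoding::US_ASCII)
utf8_plain = "bar".dup.force_encoding(Encoding::UTF_8) # UTF-8, but 7-bit chars only
utf8_multi = "b\u00E1z"                                # "báz": UTF-8 with a multibyte char

# Both sides are 7-bit only, so the left-hand (US-ASCII) encoding is kept.
both_7bit = ascii + utf8_plain

# The right-hand side has a multibyte char, so the result must be UTF-8.
mixed = ascii + utf8_multi

puts both_7bit.encoding.name                       # US-ASCII
puts mixed.encoding.name                           # UTF-8
puts Encoding.compatible?(ascii, utf8_multi).name  # UTF-8
```

So the "surprise" is really ruby picking whichever encoding can represent both operands, and falling back to the first one when either would do.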

As a result, at Goodreads, we had to do some monkey-patching as we were getting some US-ASCII strings back from some rails helper code (pluralize(), number_with_delimiter()) as well as some ruby built-in classes (to_s() from NilClass, Float, Fixnum, Array). There must be a better way, but we've now got this force-utf8 monkey patch file with stuff like this:

module ActionView
  module Helpers
    module NumberHelper
      def number_with_delimiter_with_force_utf8(*args)
        number_with_delimiter_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
      end
      alias_method_chain :number_with_delimiter, :force_utf8
    end
  end
end

# bunch of to_s that need fixing...maybe see if there's a [Class1, Class2].each way of
# doing this that's a little DRYer...
class Array
  def join_with_force_utf8(*args)
    join_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :join, :force_utf8
end

class Fixnum
  def to_s_with_force_utf8(*args)
    to_s_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :to_s, :force_utf8
end

class Float
  def to_s_with_force_utf8(*args)
    to_s_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :to_s, :force_utf8
end

class NilClass
  def to_s_with_force_utf8(*args)
    to_s_without_force_utf8(*args).force_encoding(Encoding::UTF_8)
  end
  alias_method_chain :to_s, :force_utf8
end
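
For what it's worth, here's one way the "[Class1, Class2].each" idea from the comment above might look. This is only a sketch and not what we run: it assumes a newer ruby (2.0+) where Module#prepend exists instead of relying on alias_method_chain, it uses Integer where 1.9.3 would need Fixnum, and the ForceUtf8ToS module name is made up for illustration. It also dups frozen results, since newer rubies return a frozen string from nil.to_s.

```ruby
# Hypothetical module name; wraps to_s and re-tags the result as UTF-8.
module ForceUtf8ToS
  def to_s(*args)
    str = super                      # call the original to_s with the same args
    str = str.dup if str.frozen?     # e.g. nil.to_s is frozen on newer rubies
    str.force_encoding(Encoding::UTF_8)
  end
end

# One loop instead of one near-identical class body per core class.
[Integer, Float, NilClass].each do |klass|
  klass.prepend(ForceUtf8ToS)
end

puts 42.to_s.encoding.name   # UTF-8
puts 42.to_s(2)              # 101010 (arguments still pass through)
```

Array#join would need its own small module, since it wraps a different method.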

Note: we also started marking a bunch of our code files with the magic comment (more here), because that seems to be the most effective way to get ruby to default new string literals to UTF-8 encoding (there are a few other options, but this has been the easiest). Being an emacser, I tend to propagate the # -*- coding: utf-8 -*- form...

Tuesday, February 21, 2012

 

How to test code in your ActiveRecord after_commit callbacks by disabling transactional fixtures per-test, without hiding bugs

Okay, so at Goodreads we're finally moving to rails 3.2, and in the process we've discovered the cool after_commit and after_rollback callbacks that easily let you execute things outside the transaction that wraps an active record save/destroy. This was exactly what we needed for a few cases where we were doing non-mission-critical updates in callbacks that can on occasion take a bit of time (hitting memcached or redis servers when our resque queue was overwhelmed) inside transactions.

So we had reason to try converting some of our after_(save|update|destroy|create) callbacks to after_commit calls. The easy step is figuring out how to deal with code that used the name_changed? or name_was helpers (they retain their expected behavior only within the transaction; once the commit takes place they all get reset). All we did was keep some after_xxx methods around that set instance variables, then let the after_commit methods read those instance variables (and then reset them to some disabled state, lest we re-process the same code if a model gets saved repeatedly for some reason).
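
To make that concrete, here's a plain-ruby stand-in for the pattern. None of this is ActiveRecord: FakeRecord, its save method and the two hook names (remember_name_change, process_name_change) are hypothetical and only mimic the ordering of "after_save inside the transaction, dirty state reset, then after_commit". In a real model the two private methods would be registered as after_save and after_commit callbacks.

```ruby
# Hypothetical stand-in for a model; it only imitates the callback ordering.
class FakeRecord
  attr_accessor :name
  attr_reader :notifications

  def initialize(name)
    @name = name
    @saved_name = name          # what is "in the database"
    @name_changed_in_txn = false
    @notifications = []
  end

  def name_changed?
    @name != @saved_name
  end

  # Imitates save: after_save hook, then dirty tracking resets, then after_commit hook.
  def save
    remember_name_change        # "inside the transaction"
    @saved_name = @name         # commit: name_changed? is false from here on
    process_name_change         # "after the commit"
    true
  end

  private

  # after_save stand-in: name_changed? still works here, so stash the answer.
  def remember_name_change
    @name_changed_in_txn = true if name_changed?
  end

  # after_commit stand-in: rely on the stashed flag, then reset it so a
  # repeated save does not re-process the same change.
  def process_name_change
    return unless @name_changed_in_txn
    @name_changed_in_txn = false
    @notifications << "name is now #{@name}"
  end
end

record = FakeRecord.new("old title")
record.name = "new title"
record.save
record.save                     # second save: nothing changed, nothing re-processed
puts record.notifications.inspect
```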

Okay, but what about testing? By default, "transactional fixtures" are enabled. I'm not really sure why that's the terminology, since we rely on this behavior even though I really, really hate testing with fixtures. In fact, I just spent about 18 hours straight yesterday ripping a ton of them out of our codebase to get this all working. But I digress.

The problem is that if we want our after_commit callbacks to fire, they can't be wrapped by transactions around the entire test, because the commit never happens (the transaction gets rolled back after the test runs, pass or fail). So we had to find another way. The only options we could come up with were:

As with any good blog post, we chose door number 3. Reasoning:

We're using rails 3.2.1; hopefully this is stable across several revisions, as I'd hate to have to chase this down again. :( But here's the key:

# activerecord-3.2.1/lib/active_record/fixtures.rb:
...
module ActiveRecord
  module TestFixtures
    ...
    def run_in_transaction?
      use_transactional_fixtures &&
        !self.class.uses_transaction?(method_name)
    end
    ...
    def teardown_fixtures
      return unless defined?(ActiveRecord) && !ActiveRecord::Base.configurations.blank?

      unless run_in_transaction?
        ActiveRecord::Fixtures.reset_cache
      end

      # Rollback changes if a transaction is active.
      if run_in_transaction?
        @fixture_connections.each do |connection|
          if connection.open_transactions != 0
            connection.rollback_db_transaction
            connection.decrement_open_transactions
          end
        end
        @fixture_connections.clear
      end
      ActiveRecord::Base.clear_active_connections!
    end
    ...
  end
end

So two things:
And here's the code I wrote. I created a module that we just monkey-patch/mix in to ActiveSupport::TestCase. I aimed for a one-liner to deactivate fixtures (and not require an explicit cleanup call at the end, cause someone's gonna forget).

gist here

A few notes:
You just need to call a single method at the top of your test (or in a setup method if you want to apply it to all tests in a suite--again, here's a place I prefer rspec, as you can effectively have a separate setup method for an arbitrary grouping of tests within a given test suite, but meh):

test "some_method is supposed to do something interesting" do
  disable_transactional_fixtures # optionally pass in args telling specific tables/models to delete
  # do your tests...
  # everything gets deleted magically on its own! :)
end


---

Goodreads is hiring! Please check us out and make the world a better place for readers!



Sunday, May 09, 2010

 

mailx, msmtp and gmail

So we wanted to send automated emails out from our ubuntu server via gmail. We found a recipe and all was fine...till google changed their certificate. We had sort of blindly followed directions, which included downloading a single CA certificate, and pointing msmtp to that one cert.

(From the comment below, I'm guessing our recipe came from http://philogroky.blogspot.com/2009/08/fixing-msmtp-to-send-mails-via-gmail.html, so you may want to start there to see the early steps for mailx and msmtp...though if you're on ubuntu, I think you'll be set with sudo apt-get install msmtp bsd-mailx (or some other flavor of mailx).)

We eventually caught on and realized that was silly when we had a whole slew of CA certificates that we already trusted on the server. So instead, point msmtp to that!

So here's our .msmtprc file, and we've been pretty happy ever since:



account gmail
auth on
host smtp.gmail.com
port 587
user ouraccount@somedomain.com
password somepassword
from ouraccount@somedomain.com
tls on
tls_starttls on
# tls_trust_file argument is the full path to the certificate
# changed to this as suggested on
# http://philogroky.blogspot.com/2009/08/fixing-msmtp-to-send-mails-via-gmail.html
# hopefully never have to update this stupid thing again...
tls_trust_file /etc/ssl/certs/ca-certificates.crt
maildomain gmail.com
account default : gmail

Friday, April 16, 2010

 

Getting cassandra up to hit with ruby and C++ clients

So at Discovereads we're giving cassandra a spin on the dance floor to see how she moves. We also need to connect via ruby and C++ clients (most of our writing will come from C++, and most of the reading (though some writing as well) from ruby).

Seems everyone has a blog post like this one where he or she says "it took forever, kept looking at other sites and none of them just worked, so hopefully I'll save someone else some time and put the steps I took here". Well, this is mine. Pretty skeptical it'll actually help anyone else, but I hope it does!

I got a lot of help (on the ruby side) from http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/

And once I could hit it from the ruby client, I moved on to trying to hit it with the libcassandra lib from posulliv: http://posulliv.github.com/2010/02/22/cpp-cassandra.html , http://github.com/posulliv

First, get cassandra (the ruby way), following evan weaver's instructions (link above)

Now, to get the C++ lib working, there were a few unsatisfied dependencies:
- boost (I used 1.42.0): http://sourceforge.net/projects/boost/files/boost/1.42.0/
- thrift (I used thrift 0.2.0): http://incubator.apache.org/thrift/download/

Cassandra C++ lib

get boost:

get thrift:

libcassandra:

still verifying if all looks well...

Wednesday, August 20, 2008

 
boo!
