A Memcached Trick and an Old Watchdog

Posted by joe Oct 9th, 2014 | Comments

This article describes some code we developed to fix an issue with our Memcached environment. This took the form of adding a method to a class that we didn't own. The article then describes the way that we used the Watchdog gem to ensure that our class extension won't cause problems further down the road.

The problem

We use Memcached both for page fragment caching and for caching large collections of objects that require significant work to generate and fetch. For example, we have a page where we need to display a dot on a scatter plot for every building owned by the user. Associated with each dot is a small amount of work to 1) get associated data out of the database and 2) perform calculations on the data to get derived values that are used in the display of the scatter plot. This work is negligible for small numbers of buildings, but for users with thousands of buildings the work adds up to unacceptably long request times (actually they hit our self-imposed server timeouts first). For these larger accounts we simply can't afford to perform all those database queries and calculations during the server request cycle.

A scatter plot showing building values

Instead we cache all that work server-side so that a request for the same data can be served up directly without the need to fetch most data from the database or perform any calculations. We also employ an aggressive cache-warming strategy so that data is usually pre-cached when a user requests it.

There's much more to say about how to implement effective caching in an enterprise Rails application. Here we just want to talk about a specific pitfall we ran into with our caching setup.

For our caching we use the built-in caching mechanism in Rails along with the dalli gem, which provides a Memcached caching store called ActiveSupport::CacheDalliStore. This allows us to cache objects using the following simple API:

def meaning_of_the_universe
  a_bunch_of_work_to_calculate # => 42
end
 
Rails.cache.fetch('meaning_of_the_universe') { meaning_of_the_universe } # => 42

The way this works is that if the key "the_meaning_of_the_universe" is found in Memcached (called a cache hit) then we can skip the expensive calculation contained in the block. If the key isn't set (called a cache miss) then we do the work but then send the value to Memcached so that we can avoid the work the next time around. We can also do Rails.cache.read, which will read the value but do nothing if there is a cache miss, and Rails.cache.write, which will write the value without checking first to see if it already exists.

Every time you check the value of a cache, or write to it, you are making a request to a Memcached server (and a cache miss followed by a write will actually be two requests). Unless your Memcached server is running on the same machine as your application server there is going to be some network overhead associated with performing a lookup. The more keys you check for the more that network IO will add to the time of your request, which will eventually defeat the purpose of having a cache in the first place.

To avoid this it can make sense to perform a Rails.cache.read_multi operation, where you pass a list of keys and get back a hash of keys and values, with the value set to nil where there was a cache miss. This was the strategy we took on our page with the thousands of buildings displayed on a scatter plot. The code looked (very roughly) something like this:

buildings_hash = Rails.cache.read_multi(*building_keys)
buildings_hash.each do |key, value|
  buildings_hash[key] = Rails.cache.fetch(key) unless value
end

That is, you perform a bulk read on the keys and then go back and fill in the ones that were missing, instead of checking each one.

In general this strategy has worked out well for us. The problem (finally) that we ran into is that with our particular hosted Memcached provider and very large accounts we would get strange behavior. Sometimes the page would load right away and other times it would take minutes to generate, with no rhyme or reason. When we looked into things it turned out that sometimes the #read_multi operation was failing and we were treating the result as an empty hash. This meant that the page had to go recalculate and write the values for all of those buildings, even though the values actually existed in Memcached.

The solution

For whatever reason our hosted Memcached provider couldn't handle the large bulk read request we were sending them. The obvious solution was to break up the request into smaller chunks. To this end we created the following logic:

def read_multi_chunked(cache_keys, chunk_size: nil)
  return {} if cache_keys.empty?
  chunk_size ||= ENV.fetch('MEMCACHED_READ_MULTI_CHUNK_SIZE') { 1_000 }
  chunked_keys = cache_keys.in_groups_of(chunk_size, false)
  chunked_keys.reduce({}) do |hash, chunk|
    result = begin
               Rails.cache.read_multi(chunk)
             rescue Dalli::RingError
               {}
             end
    hash.merge(result)
  end
end

A quick explanation of the code:

take in a list of cache keys
get the chunk size from a parameter or from an environment variable
chunk the keys
perform read_multi operations for each chunk
combine the results into a single hash

In practice this new method has the same method signature (when the chunk size parameter is omitted) and output as the original read_multi method.

Next problem

The only problem remaining was deciding where to put this method. We try to keep methods out of the global scope, so we wanted to put it on some logical module or class. We discussed briefly the possibility of overriding ActiveSupport::DalliStore#read_multi and using alias trickery to call the original read_multi inside the new implementation, but decided against it on moral grounds. We decided, however, that it would make sense to put the method on ActiveSupport::DalliStore as an additional method, so that any code that was previously using Rails.cache.read_multi could just switch over to using Rails.cache.read_multi_chunked. In other words we could do this:

module ActiveSupport
  module Cache
    class DalliStore
      def read_multi_chunked(cache_keys, chunk_size: nil)
        return {} if cache_keys.empty?
        chunk_size ||= ENV.fetch('MEMCACHED_READ_MULTI_CHUNK_SIZE') { 1_000 }
        chunked_keys = cache_keys.in_groups_of(chunk_size, false)
        chunked_keys.reduce({}) do |hash, chunk|
          result = begin
                     read_multi(chunk)
                   rescue Dalli::RingError
                     {}
                   end
          hash.merge(result)
        end
      end
    end
  end
end

(Note that we can drop the reference to Rails.cache and just use read_multi now, since an instance of DalliStore is what Rails.cache referred to previously).

We liked this because it allowed us to look for our read_multi_chunked operation in a familiar place, but we were also aware that we had opened us up to a somewhat unlikely but nevertheless troubling risk that is inherent in all such extensions: what if the maintainers of Dalli decided that they needed a method that could handle chunked read_multi operations and what if they implemented a method with the same name? We wouldn't necessarily notice the change when we updated the gem. There might be subtle differences or implications of their implementation that could affect our application. What if the maintainers updated internal code to use their new method, which would now be using our own implementation instead?

That might sound like a lot of ifs and in this case it's hard to think of a way that there would be a catastrophic effect (since it's unlikely that "read_multi_chunked" could mean something radically different), but it highlights the general issue with creating extensions on classes that you don't own. If nothing else, you might want to know about and use the maintainer's version of the method because it could be more efficient and you can stop maintaining your own version.

The extended solution

WegoWise maintains and uses a gem called Watchdog (written by a former Wegonaut, Gabriel Horner) that you can use to put an end to all this fretting about hypotheticals involving collisions with upstream code. By now it's a pretty old dog (last updated in 2011) but that's because what it does is fairly simple. Watchdog will check to make sure that your modification to a class doesn't collide with any existing methods. It performs this check at runtime so the moment upstream code creates a method that you are extending you will get an exception.

To use Watchdog you need to convert your extension into a mixin module. This is necessary because Watchdog needs to hook into the extend_object callback method that is run when extend is called on a class. This is what our code looks like with Watchdog:


 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
module Ext module ActiveSupport module CacheDalliStore extend Watchdog 
      def read_multi_chunked(cache_keys, chunk_size: nil) return {} if cache_keys.empty? chunk_size ||= ENV.fetch('MEMCACHED_READ_MULTI_CHUNK_SIZE') { 1_000 } chunked_keys = cache_keys.in_groups_of(chunk_size, false) chunked_keys.reduce({}) do |hash, chunk| result = begin read_multi(chunk) rescue Dalli::RingError {} end hash.merge(result) end end end end class="k">end 
module ActiveSupport module Cache class DalliStore include Ext::ActiveSupport::CacheDalliStore end end class="k">end

The new version is certainly more verbose, but we think that's acceptable.

If it wasn't clear by now, we are okay with occasionally modifying classes that we don't own. Sometimes monkeypatching (or freedom patching!) can be the most elegant solution to a problem, especially when you want to make a change to an interface that is called in many places throughout your codebase.

That said, we don't think that class modification should be your first, or even second, option when thinking about a software problem. There are usually better approaches, such as creating a wrapper class that you can define your own methods on, or creating a subclass of the class that you want to modify. If, after careful consideration, you still decide that you want to modify that class, we think it's reasonable for there to be some ceremony and cruft involved.

Watchdog makes extending an upstream class considerably less dangerous, for those occasions when a patch seems like an appropriate option.

Great New Feature in RSpec 3: Verifying Doubles

Posted by Joe Sep 3rd, 2014 | Comments

We recently upgraded our test suite to RSpec 3. We did this to stay up to date with the most recent code rather than because we wanted any particular new feature. And for the most part RSpec 3 doesn't seem to add many features so much as it cleans up the API (the biggest change is that you can now use expect(something) syntax everywhere, instead of something.should). In fact, a lot of the changes that we had to make when upgrading involved adding gems that had been extracted from RSpec. This post by RSpec core team developer Myron Marston is a comprehensive summary of the changes.

There is one new feature in RSpec 3 that introduces a new and powerful behavior that has important implications for testing practice. I would go so far as to say that the project seriously buried the lede on this. The new feature is called verifying doubles, and in a nutshell it allows you to create test doubles that verify themselves as conforming to the same interface as a "real" object. Thanks to Xavier Shay for adding this great feature.

There are two steps for gaining access to this magic. The first is by using instance_double (and the related class_double) instead of double throughout your code. The second is to enable verifying partial doubles in your test suite.

Using instance_double in your code

Here is an example of using instance_double in a spec. Note that you need to pass the class that the double is meant to represent as the first argument to instance_double:

class Blog
  attr_accessor :posts
 
  def descriptions
    posts.map(&:description)
  end
end
 
class Post
  attr_accessor :title, :author, :body
end
 
# the test
require 'spec_helper'
 
describe Blog do
  specify '#descriptions returns descriptions from posts' do
    post = [instance_double(Post, description: 'My great post')]
    blog = Blog.new(posts: [post])
    blog.descriptions.should eq(['My great post'])
  end
end

When you run this test you will get the following error:

Failure/Error: post = [instance_double(Post, description: 'My great post')]
  Post does not implement: description

The verifying double alerted us to the fact that we hadn't implemented #description on Post. Hopefully this bug would have been caught further down the line in an acceptance test but there are often gaps in coverage.

Even better, verifying doubles will verify the number of arguments on a method:

class Invoice
  attr_accessor :total
 
  def add_line_item_to_total(line_item)
    self.total ||= 0
    self.total += line_item.amount
  end
end
 
class LineItem
  attr_accessor :amount
 
  def amount(include_cents)
    include_cents ? amount.to_f : amount.to_i
  end
end
 
# the test
require 'spec_helper'
 
describe Invoice do
  specify '#add_line_item_to_total adds line item amount to total' do
    line_item = instance_double(LineItem)
    line_item.stub(:amount).and_return(10)
    invoice = Invoice.new
    invoice.new.add_line_item_to_total(line_item)
    invoice.total.should eq(10)
  end
end

This results in a different error:

Failure/Error: self.total += line_item.amount
ArgumentError:
  Wrong number of arguments. Expected 1, got 0.

Just in case you missed it, RSpec is telling us that we are using the wrong number of arguments to call a method on the double. This is really powerful when you think about how we tend to extend method signatures. First we start off with a method call without arguments. Then a little later we decide we need more flexibility and add a switch flag like include_cents. If our test suite uses a lot of doubles we might not realize that we've just broken a lot of code (in the real world we would probably be more careful and make include_cents an optional argument).

You can also use class_double, which works the same as instance_double but for classes.

(Update: see Myron Marston's comment below about other ways that the method signature is verified)

Enabling verifying partial doubles

A partial double is what is more commonly known as a stubbed method. It is arguably even more important that stubbed methods on real objects correspond to reality, especially when you are stubbing code for external libraries that you do not test directly:

class FileUploader
  def upload(file)
    ExternalLibrary.upload(file)
  end
end
 
describe FileUploader do
  specify '#upload sends the file to External Service' do
    ExternalLibrary.should_receive(:upload).with('somefile.csv')
    FileUploader.new.upload('somefile.csv')
  end
end

I haven't included the code for ExternalLibrary because the point is that we don't own this code and so we tend to be less sure about the interface. What if we update the gem version and the method signature changes? As written this test will keep on happily chugging along, even though the code will throw errors in production.

I call this situation when your code is broken but your test suite is all green due to use of stubs a La La Land. To avoid a La La Land you can enable RSpec's verify partial doubles feature:

RSpec.configure do |config|
  config.mock_with :rspec do |mocks|
    mocks.verify_partial_doubles = true
  end
end

Now your stubs will behave like verifying doubles, throwing an error if you try to stub a method that doesn't exist on the object, or if you stub a method with the wrong number of arguments.

I presented this step second because I wanted to explain verifying doubles through examples using instance_double first, but you probably want to enable this option first. After that you can go through your test suite replacing double with instance_double at your leisure. You might find it eye-opening when you turn on verifying partial doubles. Suddenly a lot of tests that you thought were sound might turn out to contain stubs on methods that don't exist, or that take different numbers of arguments. It can be an opportunity to find out what your test suite is really testing.

Gotchas

There are two gotchas involving verifying doubles that we've run into so far.

The first involves redefinition of #respond_to?. Under the hood RSpec uses #respond_to? to determine if it should throw an error when a verifying double tries to implement a method. Sometimes you'll find that certain classes redefine #respond_to? (for example if they lean on #method_missing? to respond to undefined method calls), but they don't always implement it correctly, leaving off the optional second argument. This can cause an ArgumentError when stubbing on these classes. For example, this used to be an issue with Rails::Railtie::Configuration but it appears to be fixed in recent releases. To be clear, this was never a bug in RSpec but an issue of classes incorrectly implementing #respond_to?.

The second issue is more endemic. Classes that uses tricks to defer defining methods will cause your verifying doubles to fail if an actual instance of the class hasn't been created and put through its paces first. This can be frustrating when working with ActiveRecord models, for instance, since they use a ton of fancy introspection on database columns to define methods on the fly. Depending on the order of your tests you can end up with false positive errors where RSpec will complain that a method doesn't exist on the class. Merely using the method on a real instance in your test anywhere prior to the stub request will silence the error, however. We don't have a good solution for this problem, so we've been avoiding using instance_double on ActiveRecord models for the time being, instead using mock_model from the RSpec-activemodel-mocks gem. Something you can try is to create in-memory instances of your models before your test suite runs. (Update: in the comments Myron Marston points out that you can use object_double with an instance of the object)

Some testing theory

You can skip this section if you just wanted to know how to use verifying doubles. That's all done.

There have recently been some (in my opinion) rather productive debates in the Ruby world about the role of mocks and stubs in testing. On the one side were the proponents of mockist testing as a component of Test Driven Development (TDD). On the other side was the argument that mockist tests are poor tests that produce (in my terms) dangerous La La Lands where all the code is small and "testable" but doesn't necessarily work together. The anti-mockist argument further went that if you can get your tests to run "fast enough" (since another argument for using doubles is that they have less overhead) then you're better off using doubles much less frequently and simply exercising your real code.

Something that seemed to get lost in that debate is the role that doubles play in the most important aspect (he asserted) of Ruby programming, which is duck typing. When you write a method definition that takes in an object, like

def add_line_item_to_total(line_item)
  self.total ||= 0
  self.total += line_item.amount
end

you may think that you wrote a method that takes in a LineItem, but you actually wrote a method that takes in anything that quacks like a LineItem, that is anything with an #amount method. Doubles in specs serve as a form of documentation of the always implicit contract that methods require incoming objects to uphold. They help keep the programmer honest by forcing them to explicitly specify all the methods that an object is required to implement. If you need to stub 10 values on a double to make a test pass, that should tell you something about the degree of coupling in your code and should make you wonder if your classes are following Single Responsibility Principle.

I'm excited about verifying doubles because they introduce something close to the Interface concept that other languages have, but without the formality of having to create an explicit definition. Interfaces as language constructs in typed languages like Java are used to allow for more flexibility in passing arguments and returning values. Instead of saying that a method can only receive an argument of type Widget you can say that a method can receive any object that implements the Widget interface, which has to be formally specified in a separate class-like definition. Often the interface will be named IWidget to distinguish it from the Widget class that implements IWidget. These languages perform compile and runtime checks to make sure that classes correctly implement the interfaces that they claim to implement. So if Widget claims to implement the IWidget interface then it must implement all the methods (with the correct number of arguments and argument types) that are specified by IWidget, or else an error will be raised.

In essence, verifying doubles do an interface check in your test rather than at runtime in your code. Instead of maintaining a separate formal interface definition you can simply refer to the method signatures of a canonical class, like String or File. You can also use verifying doubles to make sure that calling code only uses the common interface for an inheritance structure of classes, by passing in the abstract class to instance_double:

class Publication
  def title
    raise NotImplementedError, 'Subclasses must implement #title'
  end
end
 
class Magazine < Publication
  attr_accessor :title
end
 
class Book < Publication
  attr_accessor :title, :summary
end
 
class Publisher
  attr_accessor :publication_listings
 
  def publish(publication)
    self.publication_listings ||= []
    self.publication_listings << "#{publication.title} -- #{publication.summary}"
  end
end
 
# test
 
require 'spec_helper'
 
describe Publisher do
  describe '#publish' do
    it 'adds a publication to the publications list' do
      magazine = instance_double(Publication, title: 'A book', summary: 'words')
      publisher = Publisher.new
      publisher.publish(magazine)
      publisher.publication_listings.should eq(["A book -- words"])
    end
  end
end

This test fails because somebody extended the #publish method's contract beyond what Publication's interface promises. Note however that this test would not help us detect that Magazine does not implement a particular interface if we went back and implemented #summary on Publication. That is a different problem which in other languages would be solved by, well, an interface check. In her Practical Object Oriented Design in Ruby book Sandi Metz addresses this problem by using a set of shared tests for the common interface that she mixes into the unit tests for each concrete class. It would be nice to have an automated way of doing this that reuses the verifying double mechanism.

Conclusion

Verifying doubles may sound intimidating, but they are easy to switch over to and use today. Start by turning on verifying partial doubles and then gradually switch over your existing doubles to use instance_double. Expect to have to fix some tests, but take comfort in the fact that you're making your test suite so much more rigorous and meaningful with a relatively small amount of work. Verifying doubles make RSpec a much more powerful and relevant testing tool.

Timezones Are Hard: A Presentation

Posted by Joe Aug 25th, 2014 | Comments

In this post I will describe an interesting presentation that one of the WegoWise team gave at work. But before I do that I need to give some context.

Prologue

Toward the end¹ of every week the WegoWise development team takes time out to hold a small hour-long "conference". We call it, creatively enough, Mini-conference, or just mini-conf. Mini-conf is a very informal, very optional occasion for developers to present on various topics that are loosely related to software development². The purpose of Mini-conf is to share information or techniques, and also simply to provide an occasion for us to hang out³. Our team is completely remote and although we see each other on video chat during standups and pairing sessions we don't otherwise get many opportunities to all be "in the same room" for an extended period of time.

Here are some topics that have come up during mini-conf:

Jeff - Using JRuby, i.e. 'Software as a platypus'
Janet - Notes from Burlington Ruby Conference
Joe - Everything You Wanted to Know About Joe's Workflow but Were Afraid to Ask
Dan - Introduction to Compass: more than just a bunch of mixins
Nathan - Learn this one weird trick to decrease your Vim startup time 97.62%
John - BEES! (a statistical analysis of the chance that a Git ref will have the string 'bee' in it.)
Barun - A primer on color theory
Jed - Search & Replace Across Files in Vim

As you can see, the topics range from information that is directly useful to our daily work, to tips about useful tools and ways to improve our software environments, to presentations about concepts from other technologies or domains that can provide us with new perspectives. The topics aren't always completely serious, but there is usually something that we can take away and apply. We will often have two or three presentations per mini-conf, although sometimes one presentation will take up the whole hour or go even longer.

What follows is the shortest presentation we have ever had at Mini-conf.

The Presentation

This presentation was given by Nathan and goes as follows:

Timezones are Hard

... unless you use Date.current and Time.current.

The end.

Epilogue

Despite its brevity, this was one of the more useful and memorable presentations we've had at Mini-conf. Let me explain.

The issue this presentation addresses, if you are not familiar with it, is that in Rails datetime columns are usually stored in the database as UTC and converted to a local timezone that is set in the application configuration files. If you save a new datetime to the database it will automatically be converted from the application time back to UTC.

The problem comes in when the application server (or a developer's computer) is in a different timezone than the one that is set for the application, for example if the application is set to Eastern Time and the server is located in Central Time. In this case if you use Time.now in your code you will get an instance of Time with the time zone offset for Central Time. This can lead to a variety of subtle problems, mostly to do with intermittent tests but potentially leading to serious bugs in production as well. The worst of these bugs involve Date.today, because Date does not know about timezones. Consider this somewhat realistic accounting bug:

class Invoice < ActiveRecord::Base
  def overdue?
    (Date.today - issued_at.to_date) >= 10
  end
end
 
# test file
 
describe Invoice do
  let(:invoice) { create(:invoice) }
 
  describe '#overdue?' do
    it 'is overdue if 10 or more days have gone by' do
      contract.issued_at = 10.days.ago
      contract.overdue?.should eq(true)
    end
  end
end

So the issue here is that this test will fail between 11PM and midnight CDT, if the computer running the test is in CDT and the application is configured for EDT. Let's say it is midnight EDT on the 29th of August. This means that it is 11PM on the 28th for the computer in CDT. As a result the statement Date.today will return the date as the 28th. But the statement 10.days.ago will use the Rails application timezone, because these methods come from active_support. So the issued_at date for the contract will be the 19th, but today's date will be the 28th, and less than 10 days will have elapsed.

Or something. Timezones are hard. The point is that you can avoid this confusion in Rails code by just using the active_support methods Date.current and Time.current instead of Date.today and Time.now to always get the current time or date in the application timezone.

Also, team-building is hard, especially when everybody is remote. One thing that can help is to set aside some time during the week for people to hang out and share what they've learned with their colleagues.

The end.

Mini-confs were initially held at the end of the workday on Friday, but our team is highly distributed and this meant that some developers in places like, say, Spain, had to log into their computers late on a Friday night to participate.↩
Sometimes instead of a presentation we will just take an article or hot button issue and have a conversation for the hour.↩
Using, of course, Google Hangouts.↩

← Older Blog Archives