Friday, June 12, 2009

Ruby case statements and kind_of?

You're an object - Stand up straight and act like one!


Imagine you have this code:

Code 1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class MyView
attr_reader :target

def initialize(target)
@target = target
end

def double
case target
when Numeric then target * 2
when String then target.next # lazy example fail
when Array then target.collect {|o| MyView.new(o).double}
else
raise "don't know how to double #{target.class} #{target.inspect}"
end
end
end

It does just what you'd expect.

Output 1
1
2
3
4
5
6
7
8
9
10
11
>> MyView.new('aaa').double
=> "aab"
>> MyView.new(49).double
=> 98
>> MyView.new(['b', 6]).double
=> ["c", 12]
>> MyView.new({'x'=>'y'}).double
RuntimeError: don't know how to double Hash {"x"=>"y"}
from (irb):73:in `double'

from (irb):80
from :0

You're probably familiar with this pattern. Its everywhere in Rails and you likely use it in your own code.

I want to say, in the nicest possible way, that this style of code is wrong, wrong, wrong and that you should do a different thing.

Okay, now that I have your attention, I'm not trying to start a fight. I'm not the best Ruby person around and I'm definitely not the best OO designer, but I do have an alternative pattern to suggest.

I'm aiming to start a discussion, not a religious war. Strong opinions are welcome.

What's happening up there?

MyView needs to operate on several other objects.  It knows:
  • the classes of all the objects that it can interact with, and
  • the behavior that should occur for each of those classes.
The case statement above is really an 'if' statement that checks 'kind_of?' on each collaborating object.

I object to this code because:
  • use of 'kind_of?' is a code smell that says your code is procedural, not object oriented, and
  • if you write procedural code your application will gradually become impossible to change and everyone will hate you.

Why is it wrong?

If I change how 'double' works on any of these classes, MyView must change, but that's not the real problem.  What happens if MyView wants to double some new kind of object? I have to go into MyView and add a new branch to the case statement.  How annoying is that?

But that's the least of it.  If I'm writing code that follows this pattern, I likely have many classes that do stuff based on the classes of their collaborators.  My entire application behaves this way.  Every time I add a new collaborating object I have to go change code everywhere.  Each subsequent change makes things worse.  My application is a teetering house of cards and eventually it will come tumbling down.

Also, what if someone else wants to use MyView with their new SuperDuper object? They can't reuse MyView without changing it since MyView has a very limited notion of what kinds of objects can be doubled.

MyView is both rigid and closed for extension.

What should I have done instead?

Something like this.

Code 2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
class Numeric
def double
self * 2
end
end

class String
def double
self.next
end
end

class Array
def double
collect {|e| e.double}
end
end

class Object
def double
raise "don't know how to double #{self.class} #{self.inspect}"
end
end

class MyView
attr_reader :target

def initialize(target)
@target = target
end

def double
target.double
end
end

Using this new code, Output 1 will be the same as before, but now we can also:

Output 2
1
2
3
4
5
6
>> 'aaa'.double
=> "aab"
>> 49.double
=> 98
>> ['b', 6].double
=> ["c", 12]
In this example, objects are what they are and because of that they behave the way they do.

That statement is deceptively simple but incredibly important.  Objects ARE what they are so they do what they do.

It is not the job on any object to tell any other object how to behave.  Objects create behavior by passing messages back and forth.  They tell each other WHAT, not HOW.  

WHAT is the event/notification/request and it is the responsibility of the sender.

HOW is the behavior/implementation and it should be completely hidden in the receiver.

Code 2 is object oriented because it relies on a network of interacting objects to produce behavior.  Each object knows its own implementation and it exhibits that behavior when it receives a message.

No object in your system should have to know the class of any other object in order to know how to behave.  Everything is a Duck.  Tell the Duck WHAT and the Duck should know HOW.

This way you can change how any object is doubled by changing its own specific implementation.  More importantly, MyView can now collaborate with any kind of object, as long as the object implements 'double'.

This is a nice theory, but it seems impossible in practice.

Nah, it's easy.  But it means thinking about objects in a more Object Oriented way.

In order to write code like Code 2, you have to believe in objects.

In Ruby, everything is an object.  I know that you know this, but I suspect that you don't feel it in your bones.  You came from those languages where you couldn't change String so now you operate as if String (and Symbol and Array and ...) are static types.

Throw off your shackles.  Ruby is a network of interacting objects and you can add behavior to all of them.  When you find yourself saying, if you're this kind of thing, do this, but if you're that kind of thing, do that, it's your cue to push the behavior back into those individual objects.  

Sound bites.

Always send messages if you can.  
Implement behavior only when there's no one left to ask.  
The last object that could possibly receive the message is the object that should implement the behavior.

Therefore:

MyView should not know how Numeric implements 'double'.
MyView should just ask target to double itself.
All possible targets must implement 'double'.
Object should implement 'double' to help you get your Ducks in a row.

Lest you think this is all academic...

Here's the code to use this Pattern in some of the form_for methods in ActionView::Helpers.  Ponder the implications.
Diff of the changes.
actionpack/lib/action_view/helpers/form_helper.rb
actionpack/lib/action_view/helpers/prototype_helper.rb
actionpack/lib/action_view/helpers/form_helper_core_extensions.rb

12 comments:

  1. Arg! I completely agree with you, but it pains me to see methods added to core classes. Unfortunately, in this case, I can't think of a better way.
    ReplyDelete
  2. @Aaron Can you articulate why you feel that pain? Fear of method naming collisions, or something else?
    ReplyDelete
  3. Name collisions in particular. To me, adding methods to core classes is like modifying a global variable. It's especially painful when your implementation of "double" doesn't agree with mine. Take your modification of the Array class for example. Array responds to "*", so it may be perfectly reasonable to think that the implementation of Array#double would be "self * 2". Which one is right? IMHO, neither should be added.

    Take it one step further. What if I packaged my implementation of Array#double in a gem, and my gem depended on that functionality but you didn't know that! Redefining that method could have unexpected results. A prime example of this problem would be the native YAML gem vs. ActiveSupport's YAML generator. They both extend core classes, and have slightly different API's. Because of that do not play nicely together at all.

    Switching on class type seems like a hack because we really should be duck typing, but extending core classes seems dangerous. Unfortunately I don't have a "good" solution. Subclasses or module extension maybe?
    ReplyDelete
  4. Smalltalk has a pattern which allows adding this kind of behavior while avoiding method naming collisions.

    It's called double-dispatch and basically it involves creating somewhat ugly, but unique, method names by using a namespace (usually something about the original receiver) to make the method names unique.

    The technique is used extensively in Smalltalk. It works. I'd like to think we can use it in Ruby to make our code more OO and save us some pain.

    Would some form of double-dispatch solve the problem you see?
    ReplyDelete
  5. Yes. I use double dispatch heavily for writing visitors, usually when dealing with ASTs. It would be nice if all objects had an "accept" method for doing this pattern. The only problem is, we have to examine the type, and isn't that what we're trying to avoid?
    ReplyDelete
  6. Hmm, I see double-dispatch as avoiding the problem of having to examine the type of any object. Are we talking about the same thing? Can you post an example to help me understand what you're saying?
    ReplyDelete
  7. Hopefully. I could be wrong. ;-)

    Here is a simplified implementation of the double dispatch / visitor pattern I typically use:

    http://gist.github.com/131378

    When it calls "send", it's still examining the type in order to figure out which method to dispatch. IIRC, if I was doing this in Java, I would have to implement a "visit" method for each type I expect to receive. In ruby I cheat by using the class name in the method call, but I'm still asking the object for it's type.

    Are we talking about the same thing?
    ReplyDelete
  8. Naming collisions don't come up nearly as often as critics suggest.

    You could avoid reopening these classes if you wanted to, wrapping them with an object that implements double and then delegates the rest to the original object.

    Anyway, the principle at work here is the Open-Closed principle. With the original case-based implementation, if you want to extend the behavior by adding a new type, you would have to modify the original code. Bad.

    Nice post :)
    ReplyDelete
  9. @Aaron, modifying core classes in gems is something I'm cautious of, but in application code I do it regularly without incident. And don't forget that Ruby gives us the option of modifying an individual object, limiting the potential scope for error even further.
    ReplyDelete
  10. Technically, you're not dealing with a double-dispatch problem here. Double dispatch occurs when you have TWO objects which should determine which implementation of a method to call; "double" only depends on the type of its receiver (since it has no arguments).

    The "CollideWith" method in the Wikpedia article is a double-dispatch issue because to know which one of the four implementations to call, you need to examine BOTH the type of the asteroid AND and the type the spaceship. C++ (and Java and Ruby) cannot do this on its own; it'll handle a dispatch on the receiving object, but in the general cases, the programmer will have to handle dispatching on the argument.

    That being said, the visitor pattern IS a solution for this "double" method, and all the issues raised here are perfectly valid. It's just not technically a double-dispatch issue.
    ReplyDelete
  11. This comment has been removed by the author.
    ReplyDelete
  12. NIce article, and yes I dont yet feel it in my bones. I guess need to really believe in what i am doing.
    ReplyDelete