Sunday, March 11, 2007

Ruby builtin in pure Ruby

[Update 03/12/2007: If you know how to implement Integer#times in pure Ruby so that it behaves the same as in Ruby 1.8.5, please let me know. Thank you!]

One of the things I love most about the Rubinius project is that its developers try to keep the dependency on the system language (C in their case) minimal, and they take it seriously. Take a look at http://code.fallingsnow.net/svn/rubinius/trunk/kernel/core/ and you will find that lots of builtin libraries are implemented in pure Ruby, even lots of string functions.

Pure Ruby builtins have some drawbacks, though. One is the performance penalty; the other is potential side effects. For example, Integer#times can basically be implemented as simply as this in pure Ruby:


class Integer
  def times
    i = 0
    while i < self
      yield i
      i += 1
    end
    self
  end
end


But this version of Integer#times does not work exactly the same as the one in Ruby 1.8.5. If a user overrides Fixnum#+ (this may never happen in real life):


class Fixnum
  def + x
    return 9999
  end
end


The behavior of our Integer#times will change, while Ruby 1.8.5's won't. That is because Ruby 1.8.5's implementation (int_dotimes() in numeric.c) optimizes for Fixnum: it does not call the '+' method dynamically for a Fixnum; instead, it just increments the integer directly. If you want to implement this method with exactly the same behavior as Ruby 1.8.5, you have to write the code in the system language.
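To make the difference concrete, here is a small sketch of the experiment described above. It rests on a few assumptions that are not in the original post: it targets Ruby 2.4 or later, where Fixnum has been merged into Integer, so it patches Integer#+ instead of Fixnum#+; the method name pure_times and the helper with_broken_plus are made up for illustration; and the override is undone right away, because leaving '+' broken would affect everything else in the process.

```ruby
class Integer
  # Hypothetical name, so the builtin Integer#times stays untouched.
  def pure_times
    i = 0
    while i < self
      yield i
      i += 1 # dispatches to Integer#+, so an override changes the loop
    end
    self
  end
end

# Hypothetical helper: runs the block while Integer#+ always returns 9999,
# then restores the original method.
def with_broken_plus
  Integer.send(:alias_method, :__original_plus, :+)
  Integer.send(:define_method, :+) { |_other| 9999 }
  begin
    yield
  ensure
    Integer.send(:alias_method, :+, :__original_plus)
    Integer.send(:remove_method, :__original_plus)
  end
end

normal = []
3.pure_times { |i| normal << i }

broken = []
with_broken_plus { 3.pure_times { |i| broken << i } }

p normal # the usual 0, 1, 2
p broken # only 0: after the first step i becomes 9999 and the loop ends
```

The second run stops after one iteration because "i += 1" goes through the overridden method, which is exactly the side effect a C implementation that increments the integer directly does not have.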

This kind of optimization is all over the place in C Ruby. I am not sure about the motivation, but I guess performance is one of the reasons. For example, "30000000.times {|x|}" is about twice as fast as "i = 0; while i < 30000000; i +=1; end" in Ruby 1.8.5.

Different people may have different opinions on what the 'right' behavior should be. As for me, I like the behavior of the pure Ruby implementation better.

6 comments:

cabo said...

See also the Rubinius mailing list.

(Note that this independence property of MRI only applies to the core; once you are in the standard library, MRI and Rubinius behave the same.)

For one attempt at solving this, you may want to google "selector namespaces".

xue.yong.zhi said...

Thank you cabo for the link.

Since Ruby does not expose the behavior of native math operations at all, I think "selector namespaces" does not apply to Integer#times, right?

I am very interested to see a pure Ruby implementation of Integer#times which has the same behavior as Ruby 1.8.5.

Unknown said...

You could save the original implementation of those methods and use the originals in methods like Integer#times. For example

class Fixnum
  alias_method :basic_plus, :+
end

and then in Integer#times use

  i = i.basic_plus 1

instead of

  i += 1

xue.yong.zhi said...

Thank you for the suggestion, Pit. But the user can override basic_plus again, then the old problem reappears...

riffraff said...

mh.. I don't know if it would work but using a closure on an unbound method seems reasonable, something akin to

plus=Fixnum.instance_method '+'
Fixnum.class_eval do
  define_method :times_again do
    i=0
    while i < self
      yield i
      i=plus.bind(i).call(1)
    end
  end
end

but take care that this would require you to also override "<", and it could possibly not work because blocks don't take block args in ruby < 1.9 (I don't know about xruby).
You can still use instance vars or class vars for this, though, and it would work even if it still allows breakage and would probably be uberslow.
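For readers on a newer Ruby, the closure idea can be sketched roughly as follows. This is only a sketch under assumptions: it needs Ruby 2.4 or later (Integer instead of Fixnum, and blocks that accept a &block argument, which replaces the yield that does not work inside define_method), and it also captures "<" as suggested. It still depends on bind and call not being redefined, so it narrows the problem rather than closing it.

```ruby
# Capture the original operators before anyone can redefine them.
plus      = Integer.instance_method(:+)
less_than = Integer.instance_method(:<)

# times_again is the name used in the comment above; the body only touches
# the captured methods, never the current Integer#+ or Integer#<.
Integer.send(:define_method, :times_again) do |&block|
  i = 0
  while less_than.bind(i).call(self)
    block.call(i)
    i = plus.bind(i).call(1)
  end
  self
end

# Override Integer#+ for a moment, run the loop, then put the original back.
seen = []
begin
  Integer.send(:define_method, :+) { |_other| 9999 }
  3.times_again { |i| seen << i }
ensure
  Integer.send(:define_method, :+, plus)
end

p seen # still 0, 1, 2 even though '+' was overridden during the loop
```

As expected for this approach, every step costs two extra method objects and calls, so it is likely to be much slower than a plain "i += 1".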

But you want another crazy special case? Try

def puts(*args)
  super *args
  super *args
end

Few people know that builtins can use super to access old definitions :)

xue.yong.zhi said...

Hi, Riff, it is a very good idea. Thank you!

Since your solution uses 'bind' and 'call', we need to preserve them as well. It seems we can use the same technique you provided.

I will give it a try to see how long the final solution is going to be.

And thank you for providing a new challenge:)