Technical Archives

Sometimes a test case must detect when production code creates new records, as a side-effect. assert_latest{} detects all new records of a given type, and returns them like this:

    f1, f2 = assert_latest Foo do
               2.times{ Foo.create }
             end

    assert 'items return ordered by id' do
      f1.id > 0 and f2.id > f1.id
    end

This post shows how to use assert_latest{} in more advanced configurations. It can detect records of more than one type, and can detect records that belong to only one association. Our platform is Ruby on Rails, yet - as usual - the lessons apply to any unit tests.
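One way to picture what such an assertion has to do: remember the highest id before the block runs, then collect everything that appeared afterwards. The sketch below is only an illustration under loud assumptions. `Foo` is a plain-Ruby stand-in for a model, and `latest_records` is a hypothetical helper name, not the real assert_latest{} implementation.

```ruby
# A stand-in for an ActiveRecord model (hypothetical): it hands out
# ascending ids and remembers every record it creates.
class Foo
  @records = []

  class << self
    attr_reader :records

    def create
      record = new(records.size + 1)
      records << record
      record
    end
  end

  attr_reader :id

  def initialize(id)
    @id = id
  end
end

# Hypothetical helper: note the highest id, run the block, then return
# whatever was created in the meantime, ordered by id.
def latest_records(model)
  last_id = model.records.map(&:id).max || 0
  yield
  model.records.select { |record| record.id > last_id }.sort_by(&:id)
end

Foo.create                      # a record that existed before the block
f1, f2 = latest_records(Foo) { 2.times { Foo.create } }
puts [f1.id, f2.id].inspect     # prints [2, 3]
```

The record created before the block is ignored, which is the whole point: only side effects of the block come back.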

Object-Relational Mappers, like Rails’s ActiveRecord, help generate very complex queries. Sometimes we need that complexity - without slow execution times! We might not notice when our queries have grown beyond our database’s tuning and indices.

This post shows how to trap, interpret, and EXPLAIN SQL SELECT statements under MySQL. Our platform is Ruby on Rails, yet the lessons apply to any ORM.

This post is not about your boss growling at you. Another title could have been “whisper-driven development”. It’s about your code whispering its status to you.

To learn this newfangled kind of development, whip out an Apple computer, and start a project using Test-Driven Development. We use Rails, but you can use any platform you like. (You can use any computer you like, so long as you can find and configure a Growl-style program for it. See below.)

In Rails, use ZenTest’s autotest command, or write your own batch file, to trigger a test run each time you save a source file. I have one here; it’s a little more “generic” than ZenTest’s version. “Generic” is a programming euphemism for “scrappy - a fixer-upper”.

Gregory Brown

I’m happy to announce I just cut the first alpha release of Prawn. It is chock full of features, and since the release notes are fairly detailed with good links, I’ve just pasted them for your review after the cut.
But first, I have a few words that I couldn’t quite convince myself deserved a separate post:

On a somewhat sad note, this will be my last post to O’Reilly Ruby. Things are changing a lot behind the scenes here, and there has been a lot of re-shuffling of resources. I’ll still be blogging for O’Reilly but over on the News Site. However, since that blog will be a bit more ‘big story’ oriented, I’m going to also be starting up my own tech blog at majesticseacreature.com. Keep an eye out for that, but it may take me a few days to find time to write some blog software I can live with, and the page will be a ghost town until then.

So, O’Reilly Ruby: so long, and thanks for all the fish. I had fun here, and I hope that folks will still find my stuff over on the news site interesting.

Prawn 0.1.0 Release notes to follow…

Gregory Brown

I’m happy to announce that the Prawn PDF library has hit another milestone on the Ruby Mendicant project roadmap. This time we’ll look at Prawn’s shiny new table drawing support, as well as some of the other features that have been added over the last several weeks. We’ll also look at where things are headed in the future, including when to expect a first gem release. All this and more lies just beyond the cut…

Gregory Brown

I really like the open command on OS X, but I was too lazy to look for its Linux equivalent.

Actually, my solution probably took less time than sifting through a Google search:

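# Pick a viewer from the file extension, then hand the file to it.
file = ARGV[0]
viewer = case file
         when /\.pdf\z/        then "epdfview"
         when /\.html\z/       then "firefox"
         when /\.(rb|pl|pm)\z/ then "vim"
         else abort("Don't know how to open #{file.inspect}")
         end
system(viewer, file)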

Anyone else have fun little hacks they want to share?

Gregory Brown

Back in March, I announced the Ruby Mendicant project after several readers of this blog encouraged me to pursue the idea. For those who didn’t see the follow-up details elsewhere, here’s the Reader’s Digest version:

Thanks to 70 donors, and donation matching from Ruby Central, Inc. and MountainWest Ruby LLC, I am now able to take 22 weeks off from my commercial work to focus on open source development in Ruby. I had a number of project ideas, but the general consensus is that my time would be best spent building a fast, sleek, and simple PDF library for Ruby. I’ve decided to call this library Prawn, for the time being.

Since I have just reached my first checkpoint on the project, I figured I’d give folks an update on where things are.

Gregory Brown

While working on Prawn, I ran into this (not-so) fun little gotcha:

>> 1.to_sym
=> nil
>> 101241.to_sym
=> nil

Anyone cool enough to tell me what this feature is all about?

Update:

I guess it isn’t totally clear what I was asking here. I’m not actually trying to convert integers to Symbols. In fact, my specs were failing because I expected some_number.respond_to?(:to_sym) to be false!

As it turns out, Fixnum#to_sym does have a purpose, but it is quite different than something like String#to_sym.

>> :foo.to_i
=> 14369
>> 14369.to_sym
=> :foo

I knew about the existence of #id2name and it didn’t surprise me much, given the way Symbol objects are implemented. Still, for the folks who keep reminding me about what Symbol objects are, please save your comments because that’s not the point here.

The point is that when I see “foo”, I can easily say: Oh… “foo”.to_sym will give me :foo.
When I see [16393, 16401, 16409], I don’t think “Gee… that must be [:cat, :monkey, :tomato], I just need to map it with to_sym”.

So what this boils down to is an API clarity thing. Even if a Symbol is closely bound to an integer implementation-wise, I think it’s a bit of a flaw to assume that to be important conceptually. Among our readers, has anyone used Fixnum#to_sym in real code? I’d be very interested in seeing a common use case for this feature. If there isn’t one, maybe a less ambiguous name, such as id2sym, might be more appropriate.

What’s more, even with the existing name, it’d be nice if some_number_that_has_no_symbol.to_sym would blow up with an error rather than return nil.

Sorry for the earlier confusion; hopefully this update clarifies that I was talking about a design peculiarity, and not some burning desire to get myself back a :1.

Gregory Brown

Reposting from the official announcement on RubyTalk

Gobi version 1.0.0 has been released!

* http://gobi.stonecode.org
* http://metametta.blogspot.com
* gregory.t.brown@gmail.com

I am happy to announce the first release of my new fork of Ruby called
Gobi. The goal of Gobi is to implement features that I have noticed to
be completely missing.

For example, Ruby’s standard library does not even implement a
data structure that can easily represent a Go board. Gobi has this
built in as an NArray-based, highly efficient structure:

 >> x = Goban.new
 >> x.place_stone(:black, :at => "a4")
 >> x.place_stone(:white, :at => "c16")
 >> x.place_stone(:black, :at => "a4")
 StoneCollisionError: There is already a stone at a4.
   from (irb):3 in `place_stone`
   from (irb):6

As you can see from the example above, Gobi is very friendly to those
writing computer Go applications. For those wishing to write AI bots
to play the game, Gobi also goes through a lot of effort to make Ruby
more efficient.

A major improvement in performance was gained through the removal of
automatic garbage collection. This means that programmers need to be
sure to clean up after themselves, but any Rubyist who also has an
interest in Go will surely be sufficiently skilled to design programs
that avoid memory leaks.

The implementation of object destructors in Gobi is simple, due to the
addition of a release_resources hook in Object. A delete keyword has
also been added, which will explicitly start up the garbage collection
process.

Here’s an example of manual garbage collection in Gobi:

 class Foo
   def initialize
     @board = Goban.new
   end
   def release_resources
     delete @board
   end
 end

Please keep in mind that although the built-in classes all have
sensible release_resources implementations, you can of course
override them if you’re feeling adventurous. A current fun game in
Gobi is to run a stopwatch and see how quickly memory runs out when
you write some code like this:

 class String
   def release_resources; end
 end
 string = 'a'
 1_000_000_000.times do
   string = string.succ
 end

Of course, though humorous, this should serve as a warning to you: Be
sure to properly discard your objects!

This announcement just scratches the tip of the iceberg of what Gobi offers.

Other cool features include:

- The removal of Ruby 1.9’s giant interpreter lock, as Go programs
tend to benefit from the power of true concurrency. (Unfortunately,
these patches are very platform dependent)

- A major reshifting of Ruby’s standard library. Things like option
parsing, zlib support, and fileutils aren’t really that useful for
programming computer Go applications, so they have been removed. Many
new libraries have been added, including an SGF analysis tool and a
GTP network implementation.

- An interface to a special (Proverb Semantics Parsing) PSP tool,
which allows you to train Go playing robots by simply reminding them
of proverbs such as “Hane at the head of two stones”, and “The empty
triangle is bad”, rather than resorting to low level, complicated AI
programming. This system can be used via irb while a game is under
review in Gobi’s built in Tk based SGF editor. Gobi shows that by
mindlessly memorizing proverbs, Go playing bots can decrease their
rank by two stones in half the time that an average human can.

- The removal of all lesser data structures such as the Array, the
Hash, and the Set. In Gobi, all of these structures could trivially
be built as a subclass of the Goban, so there is no need to keep them
around.

Though I will be taking off 6 months from Gobi development to work on
the Ruby Mendicant project, I hope that people enjoy this early
experimental release and that soon Ruby will be free from the core
team’s shackles to do what it truly deserves to: Reach 30 kyu on KGS
with a Gobi based bot!

Though only time will tell, I am considering reworking Gobi to fork
Aaron Patterson’s excellent Brobinius implementation, as Gobi deserves
some high quality Grosenbach screencasts.

This project upgrades an online forum to add a search engine, using Test Driven Development. Our tools are RoR’s Beast, Sphinx, and (naturally) assert{ 2.0 }.

We follow this MVC guideline:

Anything a user can do to the data through the Views,
a unit test can do, the same way, through the Models.

Our test cases simulate a user searching.

I like developer tests, but I don’t like the primitive assertions - assert_equal, assert_match, assert_not_nil, etc. They only exist for one reason - to print out their input values when they fail. And they don’t even reflect their variable names.

So I wrote an assertion to replace all of them. Put whatever you want into it; it prints out your expression, and all its values. Essentially like this:

  __source__           __failure_diagnostic__
  x = 43
  assert{ x == 42 }    assert{ x == 42 } --> false - should pass
                           x --> 43
  deny{ x == 43 }      deny{ x == 43 } --> true - should not pass
                           x --> 43

The classic versions require a lot more typing, and reflect much less information:

  assert_equal(x, 42)     --> <43> expected but was \n<42> 
  assert_not_equal(x, 43) --> <43> expected to be != to \n<43>

Install this system with:

  gem install assert2

Some systems might require sudo, to tell the ‘puter who’s boss. The “assert2” gem will pull in RubyNode, the library that inspects Ruby blocks. Then add require 'assert2' to your test suites, or to your test_helper.rb file.

With the addition of Java Native Access (JNA) to JRuby, systems programmers using JRuby now have greater flexibility when interfacing with the underlying operating system.

Some Ruby users are familiar with the ‘Win32API’ library that ships as part of the Ruby standard library. That library lets you interface with the Windows API by defining function pointers from specific DLLs that you later call. With JRuby’s JNA interface you can now interface with Windows in a similar fashion.

To ensure your test cases call efficient MySQL

  def test_my_case
    assert_efficient_sql do

      # just wrap them in this block!

    end
  end

The assertion intercepts and copies out your MySQL SELECT statements, then calls EXPLAIN on each one, and inspects the results for common problems.

The goal is test cases that resist database pessimization, even as you change your data relations, to add new features. If you run your tests after every few changes, you can easily detect which change broke your database’s indices and relations.

This article is a reference for this assertion’s options. The techniques should port to any database with an EXPLAIN or similar system.

Did you know you can do this with Ruby out of the box?

# A real lambda
λ { puts 'Hello' }.call => 'Hello'

# Sigma - sum of all elements
∑(1,2,3) => 6

# Square root
√ 49 => 7.0

How difficult was this to implement? Keep reading!
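The rest of the post isn't reproduced in this archive, so here is a minimal sketch of one way to get those three calls working. It assumes nothing more than ordinary method definitions with non-ASCII names (which Ruby accepts in UTF-8 source); the bodies are my guess at a reasonable implementation, not necessarily what the author wrote.

```ruby
# Hypothetical definitions: each Unicode name is just an ordinary method.

# λ hands back the block it was given as a callable object.
def λ(&block)
  block
end

# ∑ adds up all of its arguments.
def ∑(*numbers)
  numbers.inject(0) { |total, n| total + n }
end

# √ delegates to Math.sqrt.
def √(number)
  Math.sqrt(number)
end

λ { puts 'Hello' }.call   # prints Hello
puts ∑(1, 2, 3)           # prints 6
puts √(49)                # prints 7.0
```

Note that this λ returns a plain Proc rather than a strict lambda, which is close enough for a toy like this.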

Gregory Brown

These days, it seems I hardly have the time for doing fun random hacks. So here I’ve started one, and if anyone finds it interesting, please take it from here and let me know how it turns out.

Loosely based off of AIML, kind of, but not really:

class Conversation

  def initialize(person)
    @person = person
    @response_id = 0
  end  

  attr_reader :response_id

  def say(msg)
    print "#{@person}: "
    @response_id = Response[@response_id].respond_to(msg)
  end 

  class Response

    def self.responses
       @responses ||= {}
    end

    def self.[](id)
      responses[id]
    end  

    def initialize(id)
      @id = id
      self.class.responses[id] = self
      @matchers = []
      @messages = []
    end               

    attr_reader :id,:matchers 

    def when(pattern,id)
      @matchers << [pattern,id]
    end     

    def inherits(arr)
      arr.each do |id|
        @matchers += Response[id].matchers
      end
    end

    def might_say(msg,weight=1)
      weight.times do
        @messages << msg
      end
    end        

    def talk
      puts @messages.sort_by { rand }.pop
    end

    def respond_to(msg)
      @matchers.each do |pattern,id|
        if msg =~ pattern
          Response[id].talk
          return id
        end
      end
      return @id
    end

  end     

end           

def response(id)
  r = Conversation::Response[id] || Conversation::Response.new(id)
  yield(r)
end

UPDATE: Check out this annotated code submitted by a reader.

I’m looking for someone to take over PDF::Writer, color-tools, and Transaction::Simple. I do not have time to maintain these anymore. I should have done this months ago, but pride of ownership and a belief that more free time would be just around the corner got in the way.

You can read more details on my original blog posting at my personal blog.

Anyone interested? Anyone know anyone interested?

Gregory Brown

I think most Rubyists have picked up a good trick or two from Jim Weirich. Though it’s only a tiny part of his latest article (Using Flexmock to Test Computational Fluid Dynamics Code), I got excited to see his ‘Existence Test’ in his code:

  def test_initial_conditions
    q = F3DQueue.new
    assert_not_nil q
  end

Looks pretty simple, eh? You might be quick to say that this doesn’t do anything. However, it is actually a pretty clever practice. This test makes sure the tests themselves are working as expected. I was already in the habit of starting with a failure, usually something like:

  def test_doomed
    flunk
  end

The purpose of the above is simply to make sure your tests are picked up within your suite, and aren’t being overlooked by your Rakefile, autotest, or whatever runner you’re using. But the existence test actually goes a little farther. Because you’re initializing an object, you’re making sure that the files you need to be loading are present, that you can build your objects, *and* that your tests are hooked up.

After you’ve got a couple tests passing, you can remove this sanity check or morph it into a setup(), whatever makes sense.

Many people think this is a little paranoid, and most of the time, it is. Still, all it takes is one bad experience coding under falsely passing tests, and you’ll be converted in no time. :)

From Nick Sutterer, A Computer Science Undergraduate at Albert-Ludwigs University (Freiburg, Germany)

When writing an article about Apotomo I had to make a decision: either introduce it as a simple widget plugin for Rails or - as the name Apotomo (“all power to the model”) implies - end up in monologues about model-driven, component-oriented enterprise concepts. Today I will simply introduce Apotomo as a widget library for Rails.

Gregory Brown

Hey folks, I’ve picked a winner for June’s Ruby project spotlight and will have a post out within the next few days about it, but I’d like to remind folks that this is an ongoing project.

What that means is that I’m now accepting July submissions. Every submission we got for June was excellent, and if you were not selected, you can always resubmit for a later month. Here’s a recap of the rules, but see the original post for details.

  • Project must be fresh / actively developed
  • Project must be released before the time I post
  • Proposal should consist of nothing more than a code example and a link to your project page, no additional commentary needed
  • Project can’t be Rails-centric
  • You must be a developer on the project
  • My project choices will be entirely subjective and unjustified. :)

Please email me if you’ve got a cool project to submit!

Gregory Brown

To answer a question on RubyTalk the other day, I had to reference Mauricio Fernandez’s nicely compiled list of Changes in Ruby 1.9. While I was there I took another walk through the whole thing.

There are of course some features I *don’t* like.

a = ->(b,c){ b + c }
a.call(1,2) # => 3

But there are quite a few that I do, and here I’ve listed ten I think will totally rock. I use Mauricio’s examples, so all credit goes to him. Also, this article is from February, so if you find any features below that have changed, shout and I’ll update.

Consider this fact: Multi-core CPUs are not only the future, they’re the only way CPUs can continue to grow at their current pace. It’s also a hotly debated subject in the software world. Multi-threaded programming is different and not seen as often as procedural programming, and therefore it’s not yet as well understood. So the question is, how can programming languages (and Ruby in particular) make it easier to harness these systems?

As Ruby struggles to graduate from its current implementation into something more powerful, we’ve already seen several projects attempt to update Ruby to help developers cope. Those who’ve been working with Ruby for a while may remember YARV, which promises to provide more threading support. JRuby offers all the power of Java’s threads to Ruby, if it can harness it. And Evan Phoenix’s small but rapidly growing project Rubinius is attempting to be the next big contender.

No matter what implementation becomes the next de facto Ruby platform, one thing is clear: People are interested in taking advantage of their newer, more powerful multi-core systems (as the surge of interest in Erlang at recent RailsConfs and RubyConfs has shown). As Ruby becomes increasingly part of solutions that deal in high volumes of data processing, this demand can only increase.

That’s why it’s so very surprising to see David Heinemeier Hansson dismiss the whole notion out of hand regarding Rails. His argument seems to be that Rails already scales to multiple cores in the same way it scales to multiple machines, via UNIX process distribution. After all, isn’t this the very crux of “Share Nothing?”

Jeremy McAnally

I was not mowing my lawn like Gregory, but I was reading this blog when I got an idea for a Rails version of the Ruby Project Spotlight series Gregory is spinning up.

The idea is that I’ll post an entry once or twice a month about a new and active Rails project that’s looking for more exposure. The project can be a gem, a plugin, an open source Rails application: basically anything that’s related to Rails. The process for getting a project mentioned is simple: send me an e-mail about your project. It should follow these simple rules:

  • Keep it fresh — I’d rather not look at code from pre-Rails 1.0 or that’s been sitting on Rubyforge for 11 months with no activity.
  • Keep it real — Project should be released before the time I post.
  • Keep it brief — A code sample, a project link, and up to two sentences of commentary is all I need.
  • You must be a developer on the project. (Sorry. No clever phrase for this one.)

The rules are so specific because there is already plenty of coverage of things that don’t fit within them, while a lot of the newer and (usually) more interesting projects aren’t really gaining the exposure they need for a flourishing community to spring up around them.

So, if you’re a developer on one of these projects, e-mail me your submission by June 30th. I’ll judiciously pick through them and make my rather subjective choice for July soon after. Hopefully this can bring some really cool Rails projects some exposure, and expose our readers to tools and projects they can make use of.

Gregory Brown

As I was mowing the lawn, I had an idea for a series I thought might be fun.

I’d like to put out an entry once a month about a new or highly active Ruby project that’s looking to gain some extra exposure. People can send me proposals for their projects, and I’ll pick one each month to write about here. This way, if you’re way behind on your mailing list reading, you’ll be able to easily find at least one new project announcement each month.

Below are the semi-arbitrary rules for submission:

  • Project must be fresh / actively developed
  • Project must be released before the time I post
  • Proposal should consist of nothing more than a code example and a link to your project page, no additional commentary needed
  • Project can’t be Rails-centric
  • You must be a developer on the project
  • My project choices will be entirely subjective and unjustified. :)

The reason these rules are fairly specific is that I’m hoping that this series will be a sort of grass-roots effort to have Ruby recognized as a useful language standing on its own two legs. If someone else wanted to start up a similar series about Rails projects on this blog or elsewhere, that’d be great.

I’m also stipulating that the project needs to be relatively new and fresh, because there is no shortage of coverage of the more popular Ruby projects out there. This is a chance to give new folks and new projects some time to shine.

Please email me your submissions by June 30th. I’ll get in touch with my favorite pick some time in the first week of July, and have a post out then. Hopefully this will be a fun little series, and a useful service to those having trouble keeping up on the latest Ruby software.

At RailsConf 2007 DHH mentioned that Rails 2.0 would support query caching on the client side in order to speed up AR. I immediately thought to myself, “Huh? Why do it on the client side when the database server will handle that?”.

Last year, one of the most difficult things about keeping track of the progress of the Ruby projects in Google’s Summer of Code was finding where the students / mentors were talking about their projects. Since several of the bloggers on O’Reilly Ruby are directly involved in the Summer of Code in one way or another, we decided that we’d try to make things a little easier for the community for GSoC 2007.

We’ve sent out an open invite to all students and mentors who are assigned to RubyCentral for the summer. Rather than just relaying second hand news, we’ve encouraged those involved to submit blog posts to us, and we’ll post them all here using the special GSoC account. If you missed the original announcement, please contact Gregory Brown, as he’ll be coordinating the effort.

Better than half of the students involved this summer have expressed interest in participating with us. We’re busy collecting bios, and will soon make a post that introduces the folks who will be blogging with us this summer, and a little more detail about their projects.

One of the students involved has plans to have an announcement about their project ready by the end of the month, so keep an eye out for that!

Gregory Brown

Just because Everyone Is Here In The Future, doesn’t mean you should be too!

The culprit, in camping’s reloader:

# The timestamp of the most recently modified app dependency.
def mtime
  ((@requires || []) + [@script]).map do |fname|
    fname = fname.gsub(/^#{Regexp::quote File.dirname(@script)}\//, '')
    begin
      File.mtime(File.join(File.dirname(@script), fname))
    rescue Errno::ENOENT
      remove_app
      @mtime
    end
  end.max
end

If your most recent modified time is more recent than your current system time, your reloader will break until you go Back To The Future.

I figure this is probably a rare case, and not really a bug, but if you’re playing around with system time dependent apps (I am), this might bite you.
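If you want to see the situation for yourself, here is a small sketch that stamps a scratch file with a modification time in the future and then compares it to the clock. The `clamped_mtime` helper is purely hypothetical; it is one possible guard, not something Camping ships.

```ruby
require "tempfile"

# Hypothetical guard: never report a modification time later than "now".
def clamped_mtime(path, now = Time.now)
  [File.mtime(path), now].min
end

file = Tempfile.new("app")
one_hour_ahead = Time.now + 3600
File.utime(one_hour_ahead, one_hour_ahead, file.path)   # stamp the file in the future

puts File.mtime(file.path) > Time.now       # prints true
puts clamped_mtime(file.path) > Time.now    # prints false
file.close!
```

Whether clamping is the right fix depends on how the reloader compares timestamps, so treat it as a starting point only.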

Gregory Brown

I’m excited to be able to finally get around to another post in this Digging Deep series, in which I hope to delve head first into some Ruby esoterica. This time around, I have an interview with Ara T. Howard about some of the hackery he does with packaging rq.

Hope you enjoy it! Questions follow.

Gregory Brown

I spend most of my time building relatively large applications with Ruby, and this makes me forget how easy the quick and dirty hacks are. In less than the time it’d take me to google the right UNIX tool for escaping HTML, here is my tiny script that I use for things like blog entries and mass spam emails.

#!/usr/bin/env ruby

require "cgi"
puts CGI.escapeHTML(ARGF.read)

Mmh,… sweet simplicity. If you’ve not worked with the CGI lib before, there are probably other goodies in there so have a look at the API docs.
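For anyone who hasn't seen the method in action, here is a quick illustration on a made-up snippet of markup (the input string is just an example I invented):

```ruby
require "cgi"

# An invented piece of markup containing every character that gets escaped.
snippet = %q{<a href="/posts?id=1&page=2">Tom & Jerry</a>}

puts CGI.escapeHTML(snippet)
# prints &lt;a href=&quot;/posts?id=1&amp;page=2&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

The script above does the same thing to whatever arrives via ARGF, which means either the files named on the command line or standard input.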

UPDATE: Sam Aaron does a good job of explaining what this script actually does in the comments.

Gregory Brown

For those who saw my other post on what the RubyForge forum is about, I apologize for the redundancy here. However, I feel like perhaps if some folks pass this reminder along, it’ll get the message out. I’ve seen a huge resurgence of off topic posts, and I’m actually feeling bad that people end up waiting for replies only to get the same ‘we don’t deal with those questions here’ reply.

The RubyForge support forum is meant to support RubyForge itself. That means that if you have a feature request you might want to talk about before submitting a formal proposal, if you think you might have found a broken service in RubyForge, but you’re not sure, or if you just want to talk to us about some of the stuff we offer, you’ve found the right place.

If you have svn access that works on Windows but not on Linux, if you can’t install Rails but you don’t suspect our gem servers are broken, or if you just want to ask what a particular library does, please, don’t use the RubyForge forum. You will get much better help elsewhere.

I am the only active volunteer monitoring our forum right now, so please… help me out a bit by using the great mailing lists out there!

This isn’t to discourage people from using our forum; in fact, if in doubt, post to us anyway. But please, read the FAQ before you post. If others can spread the word by linking my other article on what our forum is for, that’d be very helpful!

Gregory Brown

Here’s a little problem I ran into in some old code of mine.

Sometimes we’ve got methods where we want to return a new instance of the same class.

It’s tempting to write the following code:

>> class A
>>   def a_whole_new_me
>>      A.new
>>   end
>> end

Sure enough, that seems to work:

>> a = A.new
>> a.class
=> A
>> a.a_whole_new_me.class
=> A

So what’s wrong with it? We forgot about subclasses!

>> class B < A; end
>> b = B.new
>> b.class
=> B
>> b.a_whole_new_me.class
=> A

If we’re expecting a copy of B and get A, this is certainly going to cause trouble, but the scary part is that it might not do so right away (since B usually has all of A’s methods, but not necessarily the other way around).

Luckily, this is easy to fix: just use self.class.

>> class A
>>   def a_whole_new_me
>>     self.class.new
>>   end
>> end
>> a = A.new
>> a.class
=> A
>> a.a_whole_new_me.class
=> A
>> class B < A; end
>> b = B.new
>> b.class
=> B
>> b.a_whole_new_me.class
=> B

This practice is usually a good idea whenever we want to refer to our class object. Rather than making things rigid, if you use self.class when possible, your code will be easier to extend and behave better in general. Of course, your mileage may vary depending on your task.

Over the last couple of Ruby releases I’ve made some improvements (with Eric Hodel’s help and blessing) to RDoc for C extensions that I thought I would share with you. If you write C extensions with Ruby then keep reading. If you don’t do C and/or don’t care that much about RDoc, this post may not be that interesting for you. :)

Ryan Leavengood

I haven’t posted here in a very long time, but I recently got a full-time job using Ruby and Rails (hurray), so Ruby is more on my mind lately. In fact, I’ve gotten a better understanding of what life is like for the average Rails developer by seeing how my co-worker Alex writes his Ruby code. Now, Alex is a smart guy: he has been doing websites for years and is proficient in ColdFusion, PHP, Flash, HTML and CSS, yet his Ruby code is not always as elegant as he or I would like. Of course, I’ve been using Ruby for almost 6 years, so I know it quite well (and that is a big reason why I was hired).

Still, even I find the occasional new nugget, and I figured this blog would be a good forum to expose some of my new insights. This way, other Rails developers like Alex who aren’t as proficient in Ruby as they would like can benefit from my experience.

Recently I was perusing the documentation for the Enumerable module and took a closer look at the grep method. This method is surprisingly more powerful than it might seem at first glance. To learn more, please continue reading this entry…
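The full entry isn't included in this archive, but as a taste of what grep can do beyond regular expressions, here is a short sketch. It relies only on the documented behaviour that grep selects elements with `===`, so anything that defines `===` (a Regexp, a Class, a Range) can act as the pattern. The sample data is invented.

```ruby
words   = %w[apple banana cherry]
mixed   = [1, "two", :three, 4.0]
numbers = (1..20)

p words.grep(/an/)        # prints ["banana"]
p mixed.grep(Numeric)     # prints [1, 4.0]
p numbers.grep(5..8)      # prints [5, 6, 7, 8]

# With a block, grep filters and transforms in one pass.
shouting = words.grep(/rr/) { |word| word.upcase }
p shouting                # prints ["CHERRY"]
```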

Gregory Brown

This came up in #camping today and I figured it was worth at least a mention:

Vanilla HashWithIndifferentAccess is slightly more choosy than Camping’s.

A quick irb session with each shows the difference.

From active_support

>> require "active_support"
=> true
>> a = HashWithIndifferentAccess.new
=> {}
>> a.apple
NoMethodError: undefined method `apple' for {}:HashWithIndifferentAccess
        from (irb):4
>> a.apple "bar"
NoMethodError: undefined method `apple' for {}:HashWithIndifferentAccess
        from (irb):5
>> a.apple = "bar"
NoMethodError: undefined method `apple=' for {}:HashWithIndifferentAccess
        from (irb):6

From camping

>> require "camping"
=> true
>> a = HashWithIndifferentAccess.new
=> {}
>> a.apple
=> nil
>> a.apple "bar"
NoMethodError: apple
        from /usr/local/lib/ruby/gems/1.8/gems/camping-1.5/lib/camping.rb:51:in `method_missing'
        from (irb):5
>> a.apple = "bar"
=> "bar"

This is not a complaint, just an observation I hope will be helpful. :)
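For the curious, the Camping behavior seen above can be approximated with method_missing. The sketch below is not Camping's actual source; DotHash is a hypothetical name, and the rules are only my reading of the irb session: a bare call reads a key, a call ending in = writes one, and anything else falls through to the usual NoMethodError.

```ruby
# Hypothetical class for illustration; not taken from Camping.
class DotHash < Hash
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=")
      # a.apple = "bar" stores the value under the key "apple"
      self[key.chomp("=")] = args.first
    elsif args.empty?
      # a.apple reads the key "apple" (nil when it is absent)
      self[key]
    else
      # a.apple "bar" is neither a read nor a write, so give up
      super
    end
  end
end

a = DotHash.new
a.apple          # nil, nothing stored yet
a.apple = "bar"
a.apple          # now returns the stored string
```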

Jim Alateras


Since picking up RoR I have had to dig in to technologies which I had previously only glanced at. One of these is CSS, part of the DHTML set of technologies.

The experience has been rewarding and at times frustrating, but the outcome has been positive. CSS is a technology you need to get your head around in order to develop an effective and usable front end for your web application.

Here are some resources that eased the learning curve:

Books

Debugging Tools

Editors

Gregory Brown


The following may be a bit of an ‘Advanced NubyGem’ but I think this is an interesting idiom that most people will have to work with.

Sometimes in Ruby you’ll see things, both in core and third party libraries, that look a bit like class names that take arguments.

You may have already seen these in the form of Integer(), Array(), etc.

>> Integer(1)
=> 1
>> Integer("1")
=> 1
>> Integer("")
ArgumentError: invalid value for Integer: ""
        from (irb):3:in `Integer'
        from (irb):3

These weird looking methods are just alternate constructors. They can really come in handy as a way to simplify the most common ways of building a given object, or giving them some different behaviors.

Here is an example from Ruport:

The long way of constructing a Table object:

  table = Ruport::Data::Table.new :data => [[1,2,3],[4,5,6]],
                 :column_names => %w[a b c]

This is the shortcut interface:

 table = Table(%w[a b c]) << [1,2,3] << [4,5,6]

This is usually less typing and looks a little cleaner. Of course, it doesn't support exactly the same options as the original constructor, but that's what we're going for after all.

How it's done

Implementing these methods is as simple as using capitalized method names. In Ruport, we are currently a little surly and do this right on Kernel so you can get the constructors everywhere:

module Kernel
  def Table(*args)
    # implementation here
  end
end

A safer way

Instead of sticking these methods in Kernel, you can play it safe and put them in a module. This way, they are only available where you include them. For example, if enough people complain, we’ll probably do something like this:

module Ruport::Data::Shortcuts
  def Table(*args)
    # ....
  end
end

This way, people would be able to use these shortcuts just in certain classes by including the module in them. If you’re worried about your names conflicting with others, this might be a good idea.
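To make the module approach concrete, here is a small self-contained sketch. Point, PointShortcuts and Canvas are hypothetical names invented for the example and have nothing to do with Ruport's real code; the point is only that a capitalized method and a constant of the same name can coexist.

```ruby
# Hypothetical names throughout; not from Ruport.
Point = Struct.new(:x, :y)

module PointShortcuts
  # A capitalized method name is legal Ruby. Methods and constants
  # live in separate namespaces, so this does not clash with Point.
  def Point(x, y)
    Point.new(Integer(x), Integer(y))
  end
end

class Canvas
  include PointShortcuts

  def origin
    # The parentheses tell Ruby this is a method call rather than
    # a reference to the constant Point.
    Point(0, 0)
  end
end
```

Only classes that include PointShortcuts see the shortcut, so a name collision elsewhere in the program cannot touch it.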

Don’t Overuse Your Shortcuts!

I should fix this problem before I tell folks not to do it, but right now bits of Ruport code use the shortcut methods internally. Unless you have a really compelling reason (other than laziness) to do this, Don’t!

The very thing that makes it ‘safe enough’ to stick these shortcuts in core is that a conflict should only break the shortcut. If you use your shortcuts internally in your code, there is a chance of some serious issues if things collide.

Jim Alateras


Recently on the ror-talk mailing list there was a question on navigational menus, which revealed some interesting resources.

link_to_unless_current helper
Goldberg, Integrated Site Design for RoR
NavTab Plugin

The navtab plugin, which is part of SeeSaw toolbox, is a neat little plugin, which provides instant support for tabs in your web applications. We will definitely be using this on our next RoR application.

Gregory Brown


I’ve been running local special need gem servers as well as a gem server for beta versions of Ruport and support code (namely plugins). I got tired of manually wget’ing the gems I needed and their dependencies, so with a little help from mechanize and archive_tar_external, my friend Dinko and I came up with this quick (pretty messy) hack.

require 'rubygems'
require 'archive/tar_external'
require 'mechanize'
require 'fileutils'
include FileUtils
include Archive

module RFDownloader

  def populate(archive)
    mkdir ".rf_downloader"
    cd ".rf_downloader"

    yield

    Tar::External.new( "../#{archive}.tar",
                       "*.gem", "gzip")
    cd ".."
  end

  def get(pattern,options={})
    link = "http://rubyforge.org/projects/#{options[:project]}"
    agent = WWW::Mechanize.new
    page = agent.get(link)
    page.links.find { |l| l.text[/File/] }.click
    link = agent.page.links.find { |l| l.text[pattern] }
    link.click.save_as(link.text)
  end

  def fetch(opts,&block)
    populate(opts[:archive]) do
      module_eval &block
    end
  end

  module_function :fetch, :populate, :get

end

With this, we were able to grab automatically a tarball filled with all the gems needed to run Ruport 0.9.0 with a tiny little script:

require "lib/rf_downloader"

RFDownloader.fetch(:archive => "ruport_deps") do
  get /fastercsv.*gem/,          :project => "fastercsv"
  get /pdf-writer.*gem/,         :project => "ruby-pdf"
  get /color-tools.*gem/,        :project => "ruby-pdf"
  get /transaction-simple.*gem/, :project => "trans-simple"
  get /RedCloth.*gem/,           :project => "redcloth"
  get /hoe.*gem/,                :project => "seattlerb"
  get /rubyforge.*gem/,          :project => "codeforpeople"
  get /mailfactory.*gem/,        :project => "mailfactory"
  get /mime-types.*gem/,         :project => "mime-types"
  get /scruffy.*gem/,            :project => "scruffy"
  get /build.*gem/,              :project => "builder"
  get /gem_plugin.*gem/,         :project => "mongrel"
end

Elegant? Nope! A way to be lazy and quickly snapshot the newest versions of a number of gems, you bet ya. Maybe someone will see this and build a nice little library that does this ‘the right way’ :)

UPDATE: Thanks to Aaron Patterson for helping me get rid of the wget call

Gregory Brown


RubyForge has been growing fast. As the primary resource for project hosting in Ruby, it’s been expanding with the language. One thing that is interesting about it as a service is that it is more-or-less maintained by just one person (Tom Copeland) with the help of a few other folks and some RubyCentral funding.

A few months ago I joined the RubyForge team to try to lighten Tom’s load a bit, and have been monitoring the support forum since then. I think a whole lot of folks don’t even realize we have one while some others have mistakenly assumed it was a general Ruby forum. This post is meant to explain a little bit about what the forum is for and how you can use it to get the help you need.

Gregory Brown


Yet another series attempt

I think my NubyGems series has gone over pretty well, so I figured I’d try my luck with something else.

This new series, digging deep, will be targeted towards the more experienced developer, or at least folks who want to know a little more about some deeper concepts in Ruby. I’m going to do what I can to make these as clear as possible, but I’m relying on input from gurus and skeptics to help improve these posts and make them a good resource for folks.

So this post will focus on a recent topic on RubyTalk, the merit of using Mixins as a suitable replacement for multiple inheritance.

Gregory Brown


This was actually inspired by some thoughts at a new_haven.rb meeting I wasn’t able to attend, reintroduced in a RubyTalk thread, and solidified in another blog entry that’s worth a look.

The topic here is simple: for most common needs, there is absolutely no reason to use class variables. At the end of this article I will show how you can use class instance variables to avoid the problems associated with class variables in most cases, but first, take a look at two compelling and mind-melting reasons for *not* using class variables.

From Gary Wright:

>> class A
>>   @@avar = 'hello'
>> end
=> "hello"
>> A.class_variables
=> ["@@avar"]
>> A.class_eval { puts @@avar }
NameError: uninitialized class variable @@avar in Object
        from (irb):5
        from (irb):5
>> class A
>>   puts @@avar
>> end
hello
=> nil
>> class A
>>   def get_avar
>>     @@avar
>>   end
>> end
=> nil
>> a = A.new
=> #<A:0x...>
>> a.get_avar
=> "hello"
>> a.instance_eval { puts @@avar }
NameError: uninitialized class variable @@avar in Object
        from (irb):16
        from (irb):16

And the real scary one, from David A. Black:

  @@avar = 1
  class A
    @@avar = "hello"
  end
  puts @@avar  # => hello

  A.class_eval { puts @@avar }  # => hello

Yes, there are reasons why both of these problems occur. Yes, they are likely to give you a headache.

If you’re just looking to store some data in a variable which is unique to your class, but accessible from instances or externally, why not just use instance variables?

>> class A
>>   @foo = "bar"
>>   class << self; attr_reader :foo; end
>> end
=> nil
>> A.foo
=> "bar"

Yup, classes are just regular old ruby objects, and class instance variables are just regular old instance variables. This avoids the above problems, as well as simplifies the concepts of where your variables actually live.

Yeah, maybe @@foo is easier to type than self.class.foo
But good luck debugging it! :)
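To show what the self.class.foo style looks like from inside an instance, here is a minimal sketch. Widget and created are hypothetical names picked for the example; the idea is just a per-class counter kept in a class instance variable.

```ruby
# Hypothetical example class.
class Widget
  @created = 0   # class instance variable: it belongs to Widget itself

  class << self
    attr_accessor :created
  end

  def initialize
    # Instances reach the value through the class's accessor
    # instead of reaching for @@created.
    self.class.created += 1
  end
end

Widget.new
Widget.new
Widget.created   # counts the two instances created above
```

One thing to watch: a subclass gets its own @created, which starts out unset, so a subclass would need to initialize its own counter before this pattern works there.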

Gregory Brown


Beauty is in the eye of the beholder

In my experience, there are a few different attitudes when it comes to beautiful design. Many consider Rails to be a beautiful framework to work with. In its own right, I certainly would agree with these folks. If you’re willing to accept the 80/20 divide and the tool fits the job, it could be dreamy to work with Rails. But in my limited experience, I sort of feel like a decision to work with Rails is more or less an all-or-nothing decision.

Like a train that wants to go east but only has tracks running North to South, a change in course is going to create some major problems. I don’t think this is a bad design, that’s what a train is meant to do. But maybe for some jobs, you need a dune buggy. That’s where the little wheels come in.

The camping framework is designed in light of another definition of beauty, and that definition is closer to the austere. When I started a job in it, I asked the usual questions one might run into when dealing with a small web app. “Does camping support sessions|file uploads|static files”

The answers were ‘sort of’, ‘nope’, ‘nope’. But all three were also appended with a “but it’s easy to roll your own”. The rest of this article will show you one way to do all three, and hopefully show some appreciation for the simplicity of the task.

Geoffrey Grosenbach


In preparation for two Ruby on Rails workshops in Sydney, Australia in a few weeks, I’ve discovered a few testing tidbits.

Mike Loukides


I was going to write some more about DSLs, but I realized I had something more important to say. And nowhere near as verbose.

Lucas Carlson (author of the Ruby Cookbook) has been doing some great stuff. Check out starfish. It’s a simple, easy-to-use, implementation of Google’s MapReduce for Ruby. It’s available as a Gem, so just gem install starfish and you’re ready.

Mike Loukides


About a month ago, there was a flurry of blogs and articles on what is, and what isn’t, a domain-specific language. I’m not sure who started it, but articles include Martin Fowler’s, Nutrun’s, and Jeremy Voorhis’. At the same time, I was having some interesting conversations about DSLs with Terence Parr, the author of ANTLR. So much for link-dropping.

I think there’s a question that’s more interesting than “what’s the lower bound for a DSL” (essentially the question that Martin is asking), or “is Rails a DSL?” (the question that Jeremy and Nutrun are asking). What makes a working language? Well, there are lots of components, but basically, you have a vocabulary and a grammar. Ruby DSLs are structurally very similar: they define a lot of methods (often via metaprogramming); they typically don’t define any new grammar, but use Ruby’s relatively loose and flexible grammar. So what looks like a nice domain-specific language that requires minimal knowledge of a formal programming language is, in fact, just a bunch of method calls in a formal programming language. The method names are the vocabulary, and the grammar is Ruby. Jeremy calls this a “fluent interface” (and notes that it’s often just good design).
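To make "the method names are the vocabulary, and the grammar is Ruby" concrete, here is a toy sketch. Menu, item and define are hypothetical names invented for this example and are not taken from Rails or any of the linked articles.

```ruby
# Hypothetical mini-DSL for illustration only.
class Menu
  attr_reader :items

  def initialize
    @items = []
  end

  # The "vocabulary": an ordinary method that reads like a keyword.
  def item(name, options = {})
    @items << [name, options[:price]]
  end

  # Evaluate the block with self set to the new menu, so bare calls
  # to item inside the block land on that object.
  def self.define(&block)
    menu = new
    menu.instance_eval(&block)
    menu
  end
end

lunch = Menu.define do
  item "soup",  :price => 4
  item "salad", :price => 6
end
```

The block reads like a little menu language, yet every line in it is a plain Ruby method call with a hash argument; no new grammar was defined anywhere.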

Creating a really good fluent interface is a neat trick: it looks like you’re getting a new language “for free” (almost). You could argue that these fluent interfaces aren’t really DSLs–they’re really just Ruby. But that’s not a very useful or interesting argument. Here’s a more interesting question, which I haven’t seen discussed: what’s the point at which it becomes necessary for a DSL to have a distinct grammar? (I don’t think it was discussed at the DSL session at our FOO conference, which I missed). What’s the point at which the control structures, etc., in Ruby (or whatever the parent language) are no longer sufficient or adequate for the domain? What’s the point where you have to go beyond creating classes and metaprogramming, and generate your own parser with the venerable YACC, or a successor like ANTLR?

After all, it doesn’t matter whether you call it Ruby or Java or Python; most programming languages have fairly similar control structures (if/else, for, while, unless), and in most OO languages, the class system is fairly similar. That works well for computing–but what about other domains? When does the grammar of a traditional programming language, even one as flexible as Ruby, become inadequate?

Whether or not Rails is a DSL, it’s not terribly surprising that it doesn’t require any new grammar. There’s certainly an impedance mismatch between OO languages and relational databases, but it’s not a huge one: Web interfaces, OO languages, and RDBMSs are all things designed by us computer folk, and they’re more similar than different. Could you use a language like Ruby to model business strategies, musical compositions, and other domains where the impedance mismatch is greater? Or would you need a different grammar entirely?

Geoffrey Grosenbach


How much memory do you need to run a Rails app?

A few months ago I moved my Typo-based blog to a Virtual Private Server. A VPS is a server that is shared with a few other sites. You are guaranteed a certain amount of RAM and have root access to the box. I followed Ezra’s instructions and easily installed MySQL, Lighttpd, FastCGI, and Rails. Finally, I removed Apache since I wouldn’t be using it.

I didn’t do much tuning. MySQL is a basic install and I have one Rails FCGI process and one PHP-FCGI process running (my site stats application runs on PHP). The site gets about 1,500 to 2,000 page views a day, so it’s not a high activity site. Still, it has been much more reliable and I have had only one hour of downtime in the last two months when the server rebooted and my reboot script was incorrectly configured.

A few weeks after signing up, I was notified that the hosting company had made a mistake and would be moving my site to a new server. In exchange for the inconvenience, I received a free upgrade from 128MB to 192MB of RAM. The upgrade revealed a few facts about the memory usage of a Rails stack.


In May, John Lam (behind Ruby CLR) put me in touch with the Visual Studio team at Microsoft. They were very concerned that the Ruby interpreter was being built on an obsolescent version of the Visual C++ compiler and were wondering what could be done toward getting Ruby built with Visual Studio 2005. On June 26th, I had a conference call with the team and pointed out issues that I had experienced in the past and where I had stumbled currently. Immediately afterward, I posted a message ([ruby-talk:199211]) seeking specific issues to deal with. So here I am, seeking even more than I’ve gotten already.

They asked for as much information as I can gather on this matter. If you have had problems trying to get Ruby or an extension compiled or running on Windows for any reason, but especially because of Microsoft runtime DLL differences, please provide me as much information as possible so that I can pass it on to the VS team at Microsoft.

I’ve already heard complaints about compile-time compiler options and general annoyance at incompatible ABIs between runtimes, and suggestions on how to reduce the dependency of an extension on any given Windows runtime. It doesn’t solve the fundamental issue of memory management (and general resource management) between an extension, Ruby, and the library that the extension is written for possibly being tied to three different runtimes.

For what it’s worth, I have Ruby itself compiled with VS 2005, although some of the extensions are not being built automatically, from what I can tell. That may be a build script problem. However, based on earlier reports of runtime version incompatibilities, it was looking like I was going to have to recompile a *lot* of code. The discussion today suggested that as long as function accesses are used and not variable accesses, things will be okay (e.g., GetLastError() in Windows, not errno). The problem, as I pointed out to them, is that *most* Unix developers only ever have to worry about a single C runtime being on their system and therefore don’t need to worry about errno being in a different runtime DLL. Ideally, they would be able to give us an external way of getting at the errno from a specific runtime that may not be “our” runtime (e.g., the runtime with which we were linked).

So, I ask you: What other problems have people had and what can you provide me as evidence? Also, can I give them your name and email for direct contact? I will be headed to Europe soon and won’t be able to respond quickly.

[Update 2006.09.11: I am closing this entry’s comments as several spam attempts have been made. If you have issues to report, please report them on comp.lang.ruby and I’ll see them on ruby-talk.]

Geoffrey Grosenbach


I try to expand my knowledge of Ruby by reading code from other developers I admire. In the past few weeks, that has meant reading the meager source of Camping, Why the Lucky Stiff’s tiny website framework.

It also helped to learn by writing a test framework for Camping. It works pretty well so far and has taught me a lot about Ruby, testing, and Camping. I hope to package it as a gem in the next few weeks.

Last night at the Seattle.rb, Evan Webb answered a few questions about the finer points of Ruby and metaprogramming. (Note: This is not a Camping tutorial.)

Gregory Brown


Well, I just started using a very cool (albeit non-Ruby) wiki/blogging service called Infogami. The not so cool thing about it is that, as far as I could tell, it only had Atom feeds. Since some of the syndication stuff I use is only compatible with RSS, this was almost a show stopper. Until I found out about FeedTools, of course.

The script below is now tied to a cronjob which runs every 20 minutes converting my atom feeds into rss feeds and mirroring them on my own host.

#!/usr/bin/env ruby
%w[rubygems feed_tools fileutils date].each { |l| require l }
URL = 'http://ruport.infogami.com/blog/atom.xml'
File.open("index.rss","w") do |f|
  f.puts FeedTools::Feed.open(URL).build_xml("rss")
end
`scp ~/index.rss deleted@stonecode.org:~/reporting.stonecode.org/blog/`
FileUtils.rm('index.rss')
puts "feed synced at #{Time.now}"

Just another example of how Ruby makes life easier all over the place :)

Derek Sivers


I love that when Ruby surprises me, it’s never the kind of surprise like, “Huh? Why does it do that?” - but instead it’s always, “Oh, wow, OF COURSE that makes perfect sense, I just never thought of it like that!”

Newest case of that: passing blocks to gsub

I had always done gsub using a string as the replacement:

> s = 'pretty little ponies'
> s.gsub(/[^aeiou]/, '_')
=> "__e_____i___e__o_ie_"

But the book Ruby for Rails had an example of passing a block into gsub:

> s.gsub(/[^aeiou]/) {|c| c.upcase}
=> "PReTTY LiTTLe PoNieS"

Beautiful
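Two more small variations on the same idea, using sample strings made up for the occasion. Because the block's return value becomes the replacement, the replacement can be computed from each match rather than fixed in advance.

```ruby
s = "pretty little ponies"

# Upcase only the first letter of each word.
capitalized = s.gsub(/\b[a-z]/) { |c| c.upcase }

# Any Ruby expression works in the block, including arithmetic on
# the matched text.
doubled = "3 apples and 5 pears".gsub(/\d+/) { |n| (n.to_i * 2).to_s }
```

The first call gives "Pretty Little Ponies" and the second gives "6 apples and 10 pears".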

Steve Mallett


Yoichiro Hasebe has sent in this link to a Japanese translation of ONLamp’s seminal Rolling with Ruby on Rails - Part2 article.

If anyone who can read both Japanese and English notices any errors, Yoichiro would love to hear from you.

Kampai!

Gregory Brown


Now, be forewarned my gentle Nuby friend, I am not going to be explaining symbols here. Though the question is asked on a weekly basis on RubyTalk and other places, and the general consensus is “They’re named numbers… they’re handy to work with, and they aren’t the same thing as Strings, so just use ‘em when it feels right”, I am going to point out a simple little pitfall that you might want to be careful with.

If you want some primer reading on Symbols, there are already a few on this blog alone, and many more discussions in the RubyTalk archives. In fact, most of the top links from a google search for “Ruby Symbols” should get you on the right track.

So today’s topic is memory management. Though Ruby’s garbage collection makes it pretty easy for us to not even think of this topic day to day, occasionally the topic must rear its ugly head, if for nothing else than to remind us why we took that stupid “Intermediate C Programming” course or its equivalent, where seemingly innocuous things such as dynamic arrays were potential sources of memory leaks.

In Ruby, you see a lot of Hashes which use Symbols for their keys. It’s mostly because

{ :my => :super, :duper => :hash, :looks => :cooler, :this => :way }
{ "than" => "it", "does" => "this", "way" => "..." }

and because symbols are really fast little deallies to be working with.

If your hash is going to be used internally, you never need to worry about anyone except programmers indexing your data, and if they want a persons[:phone_number] , they can just ask using Symbols. But what if you were using a Hash to store some data accessible via a web form, or a command line application? Then you’d have to convert the Strings to Symbols.

Of course, that is very, very easy. "hello".to_sym happily will generate you a Symbol :hello.

So it’s very tempting to put something like this in your code:

do_something_funky_with(my_hash[some_string.to_sym])

High fives all around and you can happily use Symbols internally and let people type into text boxes or scribble bits of Cuneiform which get translated to strings which then are converted to symbols for indexing and you’re getting at your data as easy as pie. Ruby’s garbage collection will happily do away with those converted strings, won’t it?

The unfortunate answer is no. Symbols are designed so that once they spring into existence, they never die. Not of a natural death, anyway. With sufficient validation and sparse use of these immortal little constructs, no problem will ever arise. However, leave the floodgates open and the flood will come.

Take a look at the memory usage in a simple little irb session I was running:

At start:

sandal@harmonix:~$ pmap 24683 | grep total
 total     5980K

I then run this bit of code.

>> a = "a"
=> "a"
>> 10000.times { a.succ! }
=> 10000
>> a
=> "ntq"

You wouldn’t expect substantial memory growth here and you don’t find any at all:

sandal@harmonix:~$ pmap 24683 | grep total
 total     5980K

A minor change is made. I convert the values to symbols after I iterate to the next letter. But I don’t store the value of this anywhere, so you’d expect it to just disappear peacefully.

>> a = "a"
=> "a"
>> 10000.times { a.succ!.to_sym }
=> 10000
>> a
=> "ntq"

But, alas.. it is not so.

sandal@harmonix:~$ pmap 24683 | grep total
 total     6244K

We’ve grown by 264k! Now this may seem tiny, but imagine this on the end of a long running high volume server process that accepted user input… even potentially that of spambots and malicious Skr1ptk1ddz trying to crack in to set up the l33t35t w4r3z site.

Now we’ve got ourselves a memory leak, and that is generally considered A Bad Thing.

This memory doesn’t get released, either. I ran these tests right before I started blogging, and with irb still running and unchanged, running pmap still shows me at exactly 6244K.

If you already have a grasp of symbols, what they are, and how they should be used, it’s not very hard to see why this is not really a problem, but just something you need to be careful about. It’s also important to note that symbols get mapped one to one to unique values, and if you call the same symbol again, it’s not going to suck up more memory. This makes them completely safe to use if you just protect yourself a bit.

So that’s your NubyGem for the day… If you’re dealing with user input, it’s probably better not to convert it to symbols. But if you do, be sure to validate before your conversion, to avoid creating something like :some_really_l33t_symbols_that_will_eventually_starve_this_process.
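One way to do that validation, sketched under the assumption that the set of legal keys is known up front. PERSON, ALLOWED_KEYS and lookup are hypothetical names for this example. The trick is to never call to_sym on the user's string at all, and instead look it up in a String-keyed table built from Symbols we already own.

```ruby
# Hypothetical data and names, for illustration only.
PERSON = { :name => "Alice", :phone_number => "555-0100" }

# Build the String-to-Symbol mapping once, from keys that already exist.
ALLOWED_KEYS = {}
PERSON.each_key { |k| ALLOWED_KEYS[k.to_s] = k }

# Look the user's string up in the fixed mapping instead of calling
# to_sym on it, so unknown input never creates a new Symbol.
def lookup(user_input)
  key = ALLOWED_KEYS[user_input]
  key ? PERSON[key] : nil
end

lookup("phone_number")   # finds the stored number
lookup("l33t_garbage")   # nil, and no Symbol was minted for it
```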

Next time… we’ll get immediate with values. Anyone with a specific question or experience to share might want to email me, because I’m trying to find a good example for this upcoming article.

Gregory Brown


Introduction

Over the next few weeks, I will be posting short little blog entries about things that I found myself tripping over when I first learned Ruby. They will be minimal little blurbs that will hopefully help new users either understand a trap they’ve fallen into, or help them avoid a trap.

I hope the RubyGems folks forgive me for the terrible pun, but these are meant to be little gems of information for those new to Ruby. If you are an experienced rubyist, these posts will probably bore you, so be warned!

The Hash Initialization Problem

One fairly common need in programming is to set a default value for keys to map to in a hash. For example, if you are dealing with a hash of a bunch of numbers, you might just want a key that isn’t there to map to zero.

There are two different ways you can use the hash constructor to do this. One is to just pass the number in as a parameter, like Hash.new(0) and the other is to use a block such as
Hash.new { |h,k| h[k] = 0 }

Now, in this simple example, these two pieces of code do the same thing from a user’s perspective. Both result in the ability to get results like this:

>> a[:foo]
=> 0
>> a[:foo] += 1
=> 1
>> a[:foo]
=> 1
>> a[:bar]
=> 0

Now, as a lazy coder who didn’t have a strong grasp of blocks when I first started using Ruby, I much preferred the parameter form. However, there is a subtle difference between the two that can be very problematic if you aren’t careful.

The thing that is important to know is that the parameter form will return the same exact object for all the default values. Though the example before used a Fixnum, which is an immediate value, if you use something like… say a string, it’s not so simple. Take a look at the two chunks of code below, and note the difference.

irb(main):001:0> a = Hash.new("")
=> {}
irb(main):002:0> a[:foo]
=> ""
irb(main):003:0> a[:foo] << "bar"
=> "bar"
irb(main):004:0> a[:foo]
=> "bar"
irb(main):005:0> a[:train]
=> "bar"
irb(main):006:0> a = Hash.new { |h,k| h[k] = "" }
=> {}
irb(main):007:0> a[:foo]
=> ""
irb(main):008:0> a[:foo] << "bar"
=> "bar"
irb(main):009:0> a[:foo]
=> "bar"
irb(main):010:0> a[:train]
=> ""

See? The first bit of code shares a common string object, where the second bit creates a new string for each new key which is not mapped to a value. Though you might find the first behavior useful at times, it is usually the case that the second is what is desired, and this is the rule of thumb I go by to keep from getting snagged:

If I am using an immediate value for my default, I tend to use the parameter method. Otherwise, I tend to use the block form, especially when dealing with any type of collection or string.
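The same rule of thumb applied to collections, as a short sketch with a made-up word list. Grouping words by first letter is a typical case where the block form is what you want.

```ruby
words = %w[apple avocado banana blueberry cherry]

# Block form: each missing key gets its own fresh Array, and the
# assignment inside the block stores it in the hash.
by_letter = Hash.new { |h, k| h[k] = [] }
words.each { |w| by_letter[w[0, 1]] << w }

# Parameter form: every missing key returns the same Array, and
# << mutates that one Array without ever storing a key.
shared = Hash.new([])
words.each { |w| shared[w[0, 1]] << w }
```

After this runs, by_letter has three keys with the words sorted into them, while shared is still an empty hash whose single default Array has quietly collected all five words.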

Anyway, I hope this helps people understand what the two different initializers do and prevents some gotchas. Happy Hacking!

Gregory Brown


For those who have read Higher Order Perl, you might be interested in the work in progress which is James Edward Gray II’s Higher Order Ruby series.

James is doing his usual thing (being hardcore), and in the process has come up with 6 articles so far, many of them full of neat little tricks. Even if you haven’t read Higher Order Perl and are just looking for a little extra Ruby lovin’, this is a fine way to get some.

Being a former perl guy myself, it’s nice to see the way a lot of these things are handled so elegantly in ruby.


As you are probably aware, the Ruby interpreter and some of the core libraries are written in C. Over the next few weeks I plan to share a look at some of the internals of Ruby and how it achieves some of the things it does from the C side of things.

The first point of interest is the VALUE - Ruby’s internal representation of its objects. In the general sense, a VALUE is just a C pointer to a Ruby object data type. We use VALUEs in the C code like we would use objects in the Ruby code.

some_function(VALUE arg_object)
{
  some_method(arg_object);
}

One would expect that the VALUE is just a typedef to a C pointer and there’s a lookup table as to which object it represents, and this would be partially correct. However, there’s also some trickery involved.

Instead of implementing the VALUE as a pointer, Ruby implements it as an unsigned long. It just so happens that sizeof(void *) == sizeof(long) - at least on the platforms I’m familiar with. After all, what is a pointer? It’s just an n-byte integer that represents a memory address.

But because of this, there are some tricks Ruby can perform.

First, for performance purposes, Ruby doesn’t use the VALUE as a pointer in every instance. For Fixnums, Ruby stores the number value directly in the VALUE itself. That keeps us from having to keep a lookup table of every possible Fixnum in the system.

The trick lies in the fact that pointers are aligned in 4 byte chunks (8 bytes on 64 bit systems). For example, if there was an object stored at 0x0000F000, then the next would be one stored at 0x0000F004. This jump from 00 to 04 in the lowest byte is important. Expanding out as bits, it is: 00000000 and 00000100. This means that if we use the VALUE as a pointer, the lowest two bits will always be 0s.

Ruby uses this to its advantage. It will tuck a 1 in the lowest bit, and then use the rest of the space (31 bits) to store a Fixnum. One of the bits will be used for the sign, so a Ruby Fixnum can be up to 30 bits in length.

irb(main):021:0> (2 ** 30).class
=> Bignum
irb(main):022:0> (2 ** 30 - 1).class
=> Fixnum
irb(main):024:0> (-(2 ** 30)).class
=> Fixnum
irb(main):025:0> (-(2 ** 30)-1).class
=> Bignum

Ruby uses the other bit to help distinguish other common types, like false, true, and nil. Symbols and their IDs are also stored with this bit on, so Ruby recognizes it as a special instance and interprets accordingly.
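The Fixnum half of this scheme can be modelled in a few lines of plain Ruby integers. This is only a sketch of the idea, not the interpreter's C macros; encode_fixnum, decode_fixnum and fixnum_value? are hypothetical helper names.

```ruby
FIXNUM_FLAG = 0x01   # lowest bit set means "this VALUE holds a Fixnum"

# Shift the number left one bit and tuck a 1 into the lowest bit.
def encode_fixnum(n)
  (n << 1) | FIXNUM_FLAG
end

# An aligned pointer always has a 0 in its lowest bit, so checking
# that bit is enough to tell the two cases apart.
def fixnum_value?(value)
  (value & FIXNUM_FLAG) == FIXNUM_FLAG
end

# Shifting right drops the flag and restores the number, sign included.
def decode_fixnum(value)
  value >> 1
end

tagged = encode_fixnum(5)      # binary 1011
fixnum_value?(tagged)          # the flag bit is set
fixnum_value?(0x0000F004)      # an aligned address: flag bit is clear
decode_fixnum(tagged)          # back to 5
```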

The rest of the time a VALUE is a good old fashioned memory address, which points to an object structure in memory.

[Figure: ruby_value_diagram.png]

So there you have it. I hope this little snippet was of some VALUE to you.

Gregory Brown


UPDATE: The bug mentioned below is probably harmless in your applications. My issue was actually with some dependencies that were fighting each other, and not with the bug mentioned below. However, if you are annoyed about gems returning false when you require them, feel free to read below how to fix it.

UPDATE2: The bug in one of Ruport’s example programs (Ruport itself never actually requires active_record), was that Transaction::Simple is included in a vendor dir of ActiveRecord and is required via gems in PDF::Writer, so they clash, dumping a few warnings. I’m not sure if this will have any side effects or not.

“If there were any more hurdles on this track, the thing would be made out of hurdles”

For those who have been following Ruport development, you know that I’ve gone through 4 computers in the last 5 weeks. That’s insane. But now, as I finally have had a chance to spend a few hours on a machine without its hardware breaking on me, I’ve run into software issues.

The current stable version of RubyGems (0.8.11) does not correctly load gems which use the auto_require feature.

So in the course of Ruport development, I found require 'lafcadio' and require 'redcloth' returning big fat falses on me! (Which was causing unsavory results elsewhere in the system)
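For context, require returns true when it actually loads a file and false when that feature has already been loaded, so a false on the very first require is what looked wrong here. A minimal sketch of the normal behaviour, using the standard library's time purely as a stand-in for the gems mentioned above:

```ruby
# 'time' is only a stand-in library chosen for illustration.
first  = require 'time'   # true if this call loaded it, false if already loaded
second = require 'time'   # always false: the feature is loaded by now

p second  # false
```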

Searching RubyTalk, Jim Weirich mentioned that it had been fixed in the CVS head.
Sure enough, it was.

I deleted all of the rubygems stuff from my ruby and site_ruby dirs, and then ran:

cvs -d :pserver:anonymous@rubyforge.org:/var/cvs/rubygems login
cvs -d :pserver:anonymous@rubyforge.org:/var/cvs/rubygems checkout rubygems
cd rubygems
sudo ruby setup.rb

I had to reinstall my gems, but once I did, I was back to getting trues instead of falses ;)

Now I have no clue how stable the CVS head is, so this is not a recommendation, but rather just a little experience I wanted to share :)

The good news is, I am back to coding… so those of you out there on the Ruport mailing list, be on the lookout for a beta preview of Ruport in the next day or two.

Thanks to the RubyGems team for quickly fixing this bug so that people who want to be on the bleeding edge rather than downgrading can do so :)


Last week, I wrote about the Singleton pattern in Ruby much like Rusty Divine did about Java/C#/.NET at CodeSnipers. Since his second entry was about the Observer pattern, I thought I’d write about its Ruby implementation.


Note, Jan 10th 2005 — I removed this post after receiving enough angry letters about it to make me feel bad for having written it. In the 10 days or so since then, I’ve received angry letters telling me that I’m rewriting history by turning it off, and so on. It’s a lose-lose situation for me. At the risk of becoming the most hated person in blogdom, I’m turning it back on. Please make sure to read the follow-up, “Bambi Meets Godzilla”, before sending me your angry flames. Thanks. -steve

Everyone’s buzzing about Bruce Eckel’s “anti-hype” article. I hope the irony isn’t lost on him.

The thrust of Eckel’s article appears to be that hyper-enthusiasm is diminishing the Ruby camp’s message, and it’s spoiling a good gentleman’s argument. Those darn hyper-enthusiasts are focusing relentlessly on how cool Ruby is and how much they like it, when what’s really needed here is a balanced, objective, neutral, moderated, standards-based, point-by-point, academic discussion of Python vs. Ruby, in which we can all make well-informed decisions, and may the best language win, as long as it’s Python.

Python folks never really did understand marketing.

I’m surprised we need a history lesson here; we’ve all been through this so many times before. But let’s look once again at the basics of language adoption.

First, inferior languages and technologies are just as likely to win. Maybe even more likely, since it takes less time to get them right. Java beat Smalltalk; C++ beat Objective-C; Perl beat Python; VHS beat Beta; the list goes on. Technologies, especially programming languages, do not win on merit. They win on marketing. Begging for fair, unbiased debate is going to get your language left in the dust.

You can market a language by pumping money into a hype machine, the way Sun and IBM did with Java, or Borland did back with Turbo Pascal. It’s pretty effective, but prohibitively expensive for most. More commonly, languages are marketed by a small group of influential writers, and the word-of-mouth hyping extends hierarchically down into the workplace, where a bunch of downtrodden programmers wishing they were having more fun stage a coup and start using a new “forbidden” language on the job. Before long, hiring managers start looking for this new language on resumes, which drives book sales, and the reactor suddenly goes supercritical.

Perl’s a good example: how did it beat Python? They were around at more or less the same time. Perl might predate Python by a few years, but not enough for it to matter much. Perl captured roughly ten times as many users as Python, and has kept that lead for a decade. How? Perl’s success is the result of Larry Wall’s brilliant marketing, combined with the backing of a strong publisher in O’Reilly.

“Programming Perl” was a landmark language book: it was chatty, it made you feel welcome, it was funny, and you felt as if Perl had been around forever when you read it; you were just looking at the latest incarnation. Double marketing points there: Perl was hyped as a trustworthy, mature brand name (like Barnes and Noble showing up overnight and claiming they’d been around since 1897 or whatever), combined with that feeling of being new and special. Larry continued his campaigning for years. Perl’s ugly deficiencies and confusing complexities were marketed as charming quirks. Perl surrounded you with slogans, jargon, hip stories, big personalities, and most of all, fun. Perl was marketed as fun.

What about Python? Is Python hip, funny, and fun? Not really. The community is serious, earnest, mature, and professional, but they’re about as fun as a bunch of tax collectors.

One could write a fat book about this, but just to give you the flavor, consider what happens when you type “python” at a command prompt. It fires up a little interactive interpreter. At the prompt, if you type “quit”, it responds with ‘Use Ctrl-D (i.e. EOF) to exit.’

Well that’s not very nice, is it? It *knows* you want to quit, even going so far as to call you an EOF, whatever that means. (Yes, you and I both know, but is it really the right thing to show to a beginner? Hardly.) Why didn’t it just quit, then?

If you were to bring this issue up on a Python newsgroup at any time in the past 10 years, someone would tersely have instructed you to go look at the FAQ. Or they’d have explained that having ‘quit’ quit would be a strict violation of the semantics of the REPL, which has no a priori knowledge of English, and as Ctrl-D is universally recognized as the EOF char on most terminal emulators, excepting of course broken ones on win32 and VAX platforms, and the interactive shell’s clean design allows the interpreter to treat the input as if it were coming from a file or similar stream, blah Blah BLAH, ergo, the current behavior is correct, quod erat demonstrandum.

Never mind that it’s patently obvious that “quit” should just quit the frigging shell, semantics be damned. They don’t care a whit, because they’re focused on the “right thing” at the expense of the user experience. There’s an old adage for this; it’s called “missing the forest for the trees.”

Of course it’s just as difficult to figure out how to exit the Perl shell, if not more so. But if you were to bring it up on a mailing list or newsgroup, some nice Perl person would come along, eager to show you how to add one more snippet of job security to your lineup of Perl folklore, and would spend an hour explaining how cool it is that you can quit the shell with a single keystroke, one that works in other Unix commands as well, and then maybe show you how to hack the Perl binary so that “quit” also exits the shell for you. The difference is huge: both shells have that crappy misfeature, but Python folks will bore you with justifications while the Perl folks excite you with marketing.

Pedantry: it’s just how things work in the Python world. The status quo is always correct by definition. If you don’t like something, you are incorrect. If you want to suggest a change, put in a PEP, Python’s equivalent of Java’s equally glacial JSR process. The Python FAQ goes to great lengths to rationalize a bunch of broken language features. They’re obviously broken if they’re frequently asked questions, but rather than ‘fessing up and saying “we’re planning on fixing this”, they rationalize that the rest of the world just isn’t thinking about the problem correctly. Every once in a while some broken feature is actually fixed (e.g. lexical scoping), and they say they changed it because people were “confused”. Note that Python is never to blame.

In contrast, Matz is possibly Ruby’s harshest critic; his presentation “How Ruby Sucks” exposes so many problems with his language that it made my blood run a bit cold. But let’s face it: all languages have problems. I much prefer the Ruby crowd’s honesty to Python’s blaming, hedging and overt rationalization.

As for features, Perl had a very different philosophy from Python: Larry would add in just about any feature anyone asked for. Over time, the Perl language has evolved from a mere kitchen sink into a vast landfill of flotsam and jetsam from other languages. But they never told anyone: “Sorry, you can’t do that in Perl.” That would have been bad for marketing.

Today, sure, Perl’s ugly; it’s got generations of cruft, and they’ve admitted defeat by turning their focus to Perl 6, a complete rewrite. If Perl had started off with a foundation as clean as Ruby’s, it wouldn’t have had to mutate so horribly to accommodate all its marketing promises, and it’d still be a strong contender today. But now it’s finally running out of steam. Larry’s magical marketing vapor is wearing off, and people are realizing that Perl’s useless toys (references, contexts, typeglobs, ties, etc.) were only fun back when Perl was the fastest way to get things done. In retrospect, the fun part was getting the job done and showing your friends your cool software; only half of Perl’s wacky features were helping with that.

So now we have a void. Perl’s running out of steam for having too many features; Java’s running out of steam for being too bureaucratic. Both are widely beginning to be perceived as offering too much resistance to getting cool software built. This void will be filled by… you guessed it: marketing. Pretty soon everyone (including hiring managers) will see which way the wind is blowing, and one of Malcolm Gladwell’s tipping points will happen.

We’re in the middle of this tipping-point situation right now. In fact it may have already tipped, with Ruby headed to become the winner, a programming-language force as prominent on resumes and bookshelves as Java is today. This was the entire point of Bruce Tate’s book. You can choose to quibble over the details, as Eckel has done, or you can go figure out which language you think is going to be the winner, and get behind marketing it, rather than complaining that other language enthusiasts aren’t being fair.

Could Python be the next mega-language? Maybe. It’s a pretty good language (not that this really matters much). To succeed, they’d have to get their act together today. Not in a year, or a few months, but today — and they’d have to realize they’re behind already. Ruby’s a fine language, sure, but now it has a killer app. Rails has been a huge driving and rallying force behind Ruby adoption. The battleground is the web framework space, and Python’s screwing it up badly. There are at least five major Python frameworks that claim to be competing with Rails: Pylons, Django, TurboGears, Zope, and Subway. That’s at least three (maybe four) too many. From a marketing perspective, it doesn’t actually matter which one is the best, as long as the Python community gets behind one of them and starts hyping it exclusively. If they don’t, each one will get 20% of the developers, and none will be able to keep pace with the innovation in Rails.

The current battle may be over web frameworks, but the war is broader than that. Python will have to get serious about marketing, which means finding some influential writers to crank out some hype books in a hurry. Needless to say, they also have to abandon their anti-hype position, or it’s a lost cause. Sorry, Bruce. Academic discussions won’t get you a million new users. You need faith-based arguments. People have to watch you having fun, and envy you.

My guess is that the Python and Java loyalists will once again miss the forest for the trees. They’ll debate my points one by one, and declare victory when they’ve proven beyond a doubt that I’m mistaken: that marketing doesn’t really matter. Or they’ll say “gosh, it’s not really a war; there’s room for all of us”, and they’ll continue to wonder why the bookshelves at Barnes are filling up with Ruby books.

I won’t be paying much attention though, ‘cuz Ruby is soooo cool. Did I mention that “quit” exits the shell in Ruby? It does, and so does Ctrl-D. Ruby’s da bomb. And Rails? Seriously, you don’t know what you’re missing. It’s awesome. Ruby’s dad could totally beat up Python’s dad. Check out Why’s Poignant Guide if you don’t b’lieve me. Ruby’s WAY fun — it’s like the only language I want to use these days. It’s so easy to learn, too. Not that I’m hyping it or anything. You just can’t fake being cool.


Recently, I started an attempt to upgrade one of our critical pieces of software to a new underlying library revision, and it is proving to be quite a challenge.

I’ll spare the details, but suffice to say it’s not pretty. One aspect I’m currently focusing on is a nice way to share data between modules, those already written and those yet to be written. Historically we’ve used IPC shared memory, which for ruggedly defined structures works very well. However, for dynamic streaming data, it doesn’t fare so well. Furthermore, I need a well defined interface - an abstract one, if you will, that works between various programming languages and toolkits.

Enter the filesystem. I recently discovered the fuse filesystem, and its Ruby counterpart FuseFS. Man oh man what a gem this little guy is.

Here’s a small snippet of what I’m currently accomplishing with FuseFS:

I have an application config file, which is stored in INI format like this:

[Group1]
key1=value1
key2=value2
...

And here’s a snippet of Ruby:

require 'fusefs'

$configfile = "/home/me/blah/myconfigfile.ini"

module INI
  def self.read(filename)
    inimap = { }
    hdr = nil
    # File.foreach closes the file when the block finishes
    File.foreach(filename) do |row|
      if row =~ /^\[(.+)\]$/
        hdr = $1
        inimap[hdr] = { }
      elsif row =~ /^(.+)=(.+)$/
        hdr && inimap[hdr][$1] = $2
      end
    end
    inimap
  end
end

class SettingsDir < FuseFS::MetaDir
  def initialize
    super
    confighash = INI.read($configfile)
    confighash.each do |key,value|
      mkdir('/' + key)
      value.each do |key2,value2|
        write_to('/' + key + '/' + key2, value2 + "\n")
      end
    end
  end
end
root = SettingsDir.new
FuseFS.set_root(root)
FuseFS.mount_under ARGV.shift
FuseFS.run

This code is simple. It reads an INI file, and creates a hash of hashes, and uses that to create a fake directory structure within the operating system. If I run this code, I can open a shell in another window and access this information via the filesystem:
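To make the "hash of hashes" concrete without needing FUSE at all, here is a small self-contained sketch of the parsing step. The sample data and the parse_ini name are made up for illustration; it reads from a StringIO rather than the real config file.

```ruby
require 'stringio'

# Hypothetical sample data standing in for the real config file.
SAMPLE_INI = <<~INI
  [Group1]
  key1=value1
  key2=value2
  [Group2]
  key1=other
INI

# Build { "Group" => { "key" => "value" } } from INI-style text.
def parse_ini(io)
  groups  = {}
  current = nil
  io.each_line do |row|
    row = row.chomp
    if row =~ /^\[(.+)\]$/
      current = $1
      groups[current] = {}
    elsif current && row =~ /^([^=]+)=(.*)$/
      groups[current][$1] = $2
    end
  end
  groups
end

settings = parse_ini(StringIO.new(SAMPLE_INI))
p settings["Group1"]["key2"]  # "value2"
p settings.keys               # ["Group1", "Group2"]
```

Each top-level key becomes a directory and each inner key becomes a file in the mounted tree.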

tc@tc8 ~/configfs $ ls settings
Group1 Group2 Group3

tc@tc8 ~/configfs $ ls settings/Group1
key1 key2 key3

tc@tc8 ~/configfs $ more settings/Group1/key1
value1

As you can see, my hash of hashes is now available via the filesystem. Any application now has access to this abstracted information. From this point, it’s fairly trivial to implement a reverse setup, where when you write data to one of the files it saves that back to the hash, which in turn updates the INI file.
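The reverse direction might look something like the sketch below, which only covers turning the hash back into INI text. How a write to a mounted file gets routed into the hash is left out, and dump_ini is a name invented here, not part of FuseFS.

```ruby
# Serialize { "Group" => { "key" => "value" } } back into INI text.
def dump_ini(groups)
  sections = groups.map do |group, pairs|
    lines = pairs.map { |key, value| "#{key}=#{value}" }
    (["[#{group}]"] + lines).join("\n")
  end
  sections.join("\n") + "\n"
end

settings = { "Group1" => { "key1" => "value1", "key2" => "value2" } }
settings["Group1"]["key1"] = "changed"   # pretend a file write updated the hash
puts dump_ini(settings)
```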

Granted, this is a mildly convoluted way to read and write to an INI file, but the point here is that we don’t have to worry about the INI file at all. This system would work for any backend, such as a SQL system or CSV file backend.

I’m pretty impressed so far, and I’ve only scratched the surface of what’s possible.

James Britt


So here’s the spiel: Object-oriented programming is all about sending messages. See, given some object, you don’t directly call methods; oh no, you send messages, like asking a favor; “Hey, object, do you think, like, you could, ….”, and the object gets to decide if it wants to do you the favor or not. As a practical matter, the distinction between methods and messages varies among OO languages. For example, with Java, code using the standard message-invocation syntax will only compile if the message directly corresponds to a method.
foo.someMessage() // Compiler says the class behind foo must have a someMessage method
In Ruby, on the other hand, you are free to send pretty much any message you choose; the receiver does not need a message-method mapping in order to handle it.
foo.some_message # Code behind foo may or may not have a some_message method.
In the extreme case, you can have a Ruby class that handles all messages:
  class Pushover
    def method_missing( sym, *args )
      puts "Do I want to  #{sym.to_s.gsub( '_', ' ' )}?  Sure!" 
    end
  end

 po = Pushover.new
 po.buy_a_fake_rolex
 po.make_money_fast  
But even though Ruby affords this complete disconnect between messages and methods, most people code with the view that messages map to methods, or some well-defined set of message-based behavior. Maybe because coding pushover classes opens one up to too many risks. (But note that, in the right hands, you get great results.) I wonder, though, if part of the reason is the syntax itself, coupled with the phrase “object-oriented”. Alan Kay, creator of Smalltalk, has said:
Again, the whole point of OOP is not to have to worry about what is inside an object. Objects made on different machines and with different languages should be able to talk to each other—and will have-to in the future. Late-binding here involves trapping incompatibilities into recompatibility methods—a good discussion of some of the issues is found in [Popek 1984].
He’s also reported as saying, “Smalltalk is object-oriented, but it should have been message oriented.” From a certain point of view, while objects handle the gory details, it’s the message exchange that’s really key. But objects always seem to be the center of attention; they always come first:
big_shot_object.poor_message_comes_last
In fact, you can’t even think about sending a message without first having an object. Or can you? Let’s see. First, we’ll need some new syntax. I’m not yet loopy enough to go invent my own language, so we’ll munge up Ruby code for our purposes. So this may not look as slick and clean as we might like. But, instead of the familiar
receiver.message( arg1, arg2 )
we’ll flip things around a bit, and use
:my_message.>>( receiver1, receiver2 )
What this syntax means is, “Send the message my_message to all of the listed receivers.” And this
:my_message[ arg1, arg2, arg3 ].>>( receiver1, receiver2 )
means the same thing, but also passing arg1, arg2, etc. as message arguments. Oh, and before you wig out over funky syntax or the violation of the sanctity of Symbols or countless other details, know that much of what drives me to code (or do much of anything, for that matter) is an attitude of, “Gee, what happens if I push this button?” So there’s an element of the, um, unpolished and experimental in the mix. (Though I do expect that admissions of this sort merely prompt many readers to leer and start rubbing their hands as the imagination revs.)

Message-oriented Programming

What I wanted to explore was some form of message-oriented programming, where one could start with a message, then decide on a set of receivers. I thought of different ways to write this, and settled on using Symbols because I wanted something that afforded a decent literal syntax (e.g., “my string”, [ :my, :array ], 123). The idea of having to instantiate an instance of a Message class before using one seemed clumsy. (String literals were my first choice, but having to type all those quote marks got tiresome, and it tended toward the fugly side of things.) Symbols are also handy because they don’t actually do anything. They have very few methods, and are unlikely to pop up in unexpected places where the results of a classotomy might cause conflict. Adding new methods to the Symbol class seemed reasonably safe.

I picked the double-angle syntax as something of a visual indicator of transmission. My original version used block notation for passing arguments; this followed the list of receivers. But while writing this I decided it didn’t look very good, and, more important, didn’t seem to correctly convey its role. So I added the [] method. The basic goal was to have something that made some sort of visual sense. YMMV, blah blah blah, this is an experiment. Syntax aside, the neat stuff is what happens inside Symbol. In the basic case, something like
:message.>>( recv1, recv2 )
would be translated somehow into
recv1.send( :message )
recv2.send( :message )
(And of course, if there are any arguments involved they’d get passed along, too.) This raises a new issue: what is the return value, if any, of a cross-receiver message dispatch? I picked an array (and we’ll be getting to implementation details shortly). So, send a message to some number of receivers, and get back an array of response values.
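Stripped of the Symbol syntax, the basic translation amounts to mapping a send over the receivers and collecting the results. A minimal sketch, using a plain helper method (broadcast is a name invented here) rather than anything patched into Symbol:

```ruby
# Send one message, with optional arguments, to every receiver
# and collect the return values in an array.
def broadcast(message, receivers, *args)
  receivers.map { |receiver| receiver.public_send(message, *args) }
end

p broadcast(:upcase, ["fee", "fi"])   # ["FEE", "FI"]
p broadcast(:+, [1, 2, 3], 10)        # [11, 12, 13]
```

A helper like this sidesteps storing state on the Symbol itself, at the cost of the message-first literal syntax the experiment is after.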

Shooting in the Dark

I also wondered about possible use cases. If you already have a list of known receivers handy, then you might simply write a loop to do the message sending. But what if you wanted to send a message to some unknown set of objects, a set where the recipients are defined by some property or behavior? For example, imagine your application has all sort of IO-like objects floating around. You have no obvious way to locate them all, but you’d like to tell each of them to close, perhaps in preparation of some shutdown command.

My approach is to allow the list of receivers to include Proc objects. Each Proc would need to define a conditional to determine if an object should be included in the receiver list. If, while processing the >> command, a Proc is encountered, the code loops over all objects in ObjectSpace, using the Proc to determine if an object qualifies as a receiver.

Code, please

So here we go:
class Symbol

  def []( *args)
    @args = args
    self
  end

  def >>( *objs )
    results = []
    arglist  = @args 
    objs.each{ |obj| 
      begin
        if obj.class == Proc    
          temp_ary =  select_by_proc( obj ) 
          results.concat(  self.>>( *temp_ary ){ arglist  }  )
        else
          results << dispatch( obj, arglist  )
        end

      rescue Exception
        results << $!.clone
      end
    }
    # Symbols stick around like a bad cold, so we need to reset the 
    # arg list after dispatching the message 
    @args = nil
    results
  end

  def dispatch( obj, arglist  )
    return obj.send( self.to_s, *arglist ) if arglist && ( arglist.size > 0 )
    obj.send( self.to_s )
  end

  def select_by_proc( pk )
    objs = []
    ObjectSpace.each_object do |obj| 
      begin
        objs << obj  if pk.call( obj )  
      rescue Exception
        warn "select_by_proc exception: #{$!}" 
      end
    end
    objs.uniq 
  end

end

Pass me the MOP, please

So let’s see an example. Earlier we looked at dynamic code that might be a bit too easygoing. Now let’s look at some really gullible code:
 class Gullible
    def initialize( name )
      @name = name 
    end

    def herbal_cialas
      "Sure, I respond to herbal cialas!" 
    end

    def bank_account_details
       "#{@name}: #{@name.intern.to_i}" 
    end
  end

Suppose then that there are a few gullible objects floating around:
 jim = Gullible.new( 'Jim' )    
 greg = Gullible.new( 'Greg' )
Now, if we decided to engage in some object-level spamming, we could really, um, mop up; we don’t need to know who these poor souls are, we just need to go find all objects that say they respond to messages about, say, herbal cialas, and ask them a favor, such as “Give me your bank account details”. Like so:
 poor_souls_accounts = :bank_account_details.>>( lambda { |o| o.respond_to?(  :herbal_cialas ) })
 p poor_souls_accounts # ["Greg: 17221", "Jim: 17229"]
Now I just need to figure out spam filters. Shouldn’t be too hard.
James Britt


Francis Hwang once posted an item about modifying Ruby’s require method so that you can load files over HTTP (or, really, pretty much any file transfer protocol).

It’s really quite clever. I think, though, that having to explicitly include the protocol scheme and path when calling require spoils half the fun.

My take on this is to alter Kernel#require so that it knows to look for requested files outside of the local file system.

# File: hyperactive-require.rb
require 'open-uri'

def require( resource )
  begin
    super
  rescue LoadError
    $:.each do |lp|
      if lp =~ /http:\/\//i
        begin
          lp << '/' unless lp =~ /\/$/
          s = open( "#{lp}#{resource}" ) { |f| f.read}
          eval s
          return
        rescue; end
      end
    end
    raise LoadError.new( "Cannot find '#{resource}'")
  end
end

The trick here is that the load path must be given one or more Web addresses; these sites will then become just more places to look for code. Aside from that, client code does not need to know if some required file is local or remote.
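The lookup order can be illustrated without touching the network. The sketch below only builds the URLs that would be tried for a given resource; remote_candidates is a name made up for this example, and example.com is a placeholder host.

```ruby
# Given a load path that mixes local directories and http:// entries,
# list the URLs a remote-aware require would try, in order.
def remote_candidates(load_path, resource)
  load_path.grep(%r{\Ahttp://}i).map do |base|
    base = "#{base}/" unless base.end_with?("/")
    "#{base}#{resource}"
  end
end

load_path = ["/usr/local/lib/ruby", "http://example.com/lib", "http://example.com/more/"]
p remote_candidates(load_path, "vsocial")
# ["http://example.com/lib/vsocial", "http://example.com/more/vsocial"]
```

Unlike the code above, this version does not append the slash to the load path entry in place.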

A nice side-effect: No more installations. Well, maybe fewer, simpler installations. Distribute a basic script, define the appropriate URLs, and have it require the app libs as if they were local. You could also switch library version by changing LOAD_PATH URLs to fetch from different repositories.

Dangerous? Um, perhaps. But c’mon, it’s a new year. Live it up!

Besides, everyone knows how dangerous Ruby is. So dangerous that, in the spirit of the season, I even dreamed up Yet Another Ruby Motto.

Ruby: You’ll Shoot Your Eye Out.

So let’s try an example and see if we lose an eye or something.

# File: trusted-sites.rb
# I trust these guys!
%w{
  http://www.30secondrule.com/living-dangerously/
}.each { |uri|
 $:.push uri
}

#!/usr/local/bin/ruby
# File: example.rb
# See if we can grab my vSocial.com RSS feed

require 'hyperactive-require'
require 'trusted-sites'
require 'vsocial' 

vs = VSocial.new( 'jamesbritt' )
puts vs.rss

Whew! Scary, but no ocular mishaps. It’s certainly no worse than running with scissors.

One side detail: The hyperactive-require code issues an HTTP request for a file with no extension, but it is likely that the actual source file on disk at the server will end in .rb. I used the MultiViews option with Apache2 to allow the server to return a disk file with a .rb extension even if the request URL did not specify that.

Oh, and note, too, that file requests going through an intermediary process open the door for all sorts of entertainment. You could, for example, dynamically assemble the code returned, perhaps returning different versions based on the client’s IP address. Or have the server invoke a CVS or Subversion checkout to grab the latest code. Or send back a new version of Kernel#require or another set of LOAD_PATH URLs.

Or something. Just don’t hurt anyone.

Gregory Brown


    require 'ostruct'

    class OpenNode < OpenStruct
      include Enumerable
      def initialize(my_name, parent_name, children_name, name, options={})
        @my_children_name = children_name
        @my_parent_name   = parent_name
        @my_name          = my_name
        super(options)
        self.name = name
        self.send(@my_children_name) || self.send(:"#{@my_children_name}=",{})
      end
      def each(&p)
        self.send(@my_children_name).values.each &p
      end
      def add_child(klass,name,options={})
        options[@my_name] = self
        self << klass.new(name, options)
      end
      def <<(child)
        child.send(:"#{@my_name}=", self)
        self.send(@my_children_name)[child.name] = child.dup
      end
      def [](child_name)
        self.send(@my_children_name)[child_name]
      end
    end

Update: Thanks to steer on #ruby-lang for re-writing my messy each method

James Britt


Symbols: They’re numbers with a human face.


I really like Austin’s earlier blog about Symbols and how they are just names.

I think the real confusion around symbols is threefold:

  1. The colon in the name
  2. The name Symbol
  3. How they relate to Strings
Gregory Brown


This will not come as a surprise to the experienced Rubyist, but for those in the crowd who are just learning, here is a bit of neat trivia for you to ponder.

Well, we all know that Fixnums (i.e. 1,2,3..) are objects in Ruby, right?
If you didn’t, now you do!

So it’s easy to do little tricks like

class Fixnum
   def +(other)
     self - ( -1 * other ) - 2
   end
end

which will break your math and let you impress people who are new to open class systems.
(i.e. 4 + 9 == 11 with the above definition)

You could also do functional hacks like

class Fixnum
  def squared
    self ** 2
  end
end

and get yourself 2.squared == 4
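On newer Ruby versions Fixnum and Bignum have been folded into Integer, so the same open-class trick would be written against Integer instead. A minimal sketch, assuming a Ruby where Integer is the unified class:

```ruby
# Reopen Integer, the unified integer class on newer Rubies.
class Integer
  def squared
    self ** 2
  end
end

p 2.squared          # 4
p (2 ** 40).squared  # also works for values that used to be Bignums
```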

But one thing that I didn’t think about until Gary Wright mentioned it on RubyTalk is how Fixnums can have instance variables!

class Fixnum
  attr_accessor :letter
end

('a'..'z').each_with_index { |letter, index| index.letter = letter }

now 0.letter == 'a', 1.letter == 'b', etc etc

Is this useful? I’m not really sure, to be quite honest. But it sure is neat!
See the discussion in its original context at: http://www.ruby-forum.com/topic/50170

If you get any killer ideas for how to use such a feature, drop a comment here or there.


Lots of people have been discussing symbols in Ruby, and seem to have converged on the explanation that symbols should be used whenever you’re referring to a name (i.e. an identifier or keyword, essentially), even if you’re talking about a hypothetical name that doesn’t really exist in actual code yet.

I think this is the correct idiomatic usage, and it’s a pretty good way to explain symbols. But I also think it’s going to feel a bit hollow or contrived to someone coming to Ruby from a background in (say) JavaScript, Python, or even Java. If I were them, I’d be thinking: “Um, OK. Intent, intent, intent. Got it. But… isn’t a program-source identifier a fairly abstract notion to reify as a first-class object type, especially going so far as to give it a special syntax? And did I just use the word ‘reify’? Geez.”

I mean, Ruby symbols are right up there with numbers, strings, regexps and the like as first-class lexical entities. I’m guessing that this feels like a really odd decision to a lot of programmers. They might be comfortable with the “intent” explanation (which, incidentally, is similar to why I tell people I like tuples in Python so much — they help me express tuple-ish intent better than a list). Comfortable, sure, but they’re probably not wholly satisfied. It still smells a little fishy.

Am I right?

I’d like to offer my own humble take on Ruby symbols, in the hope that it’ll clear things up a teeny bit more. Nothing I’m going to say in any way negates what folks have concluded already, which is that symbols are best viewed as representing names in program code, not as “lightweight strings”.

Metaprogramming crash-course

Symbols as first-class objects are an idea that’s usually associated with Lisp. I don’t want to force you to learn any Lisp, and I won’t show you any Lisp today. But hopefully I can give you the flavor of how symbols are used in Lisp by describing a “hole” in Ruby that I hope will be fixed someday.

As a toy example, let’s take a look at the following Ruby code, which dynamically creates four methods and attaches them to an empty holder class, using eval.

#!/usr/bin/env ruby
# define a blank class as a holder for some methods
class BigMeanGiant
end

# Now add some silly-ish methods, using a flavor of eval.
# They're going to be instance methods, because it's as if
# we defined them inline inside the class definition above.
# When invoked, the giant yells the name of the method.

%w(fee fi fo fum).each do |name|
  BigMeanGiant.class_eval <<-EOS
    def #{name}() 
      puts 'Giant says:  #{name.upcase}!'
    end
  EOS
end

# invoke the methods, just for fun
begin
  g = BigMeanGiant.new
  g.fee
  g.fi
  g.fo
  g.fum
end

When you run this little program, it obligingly prints:

Giant says:  FEE!
Giant says:  FI!
Giant says:  FO!
Giant says:  FUM!

This program is roughly the “hello, world” of metaprogramming in Ruby. We’ve written some code that generates code on the fly: in our case, four nearly identical methods on BigMeanGiant called ‘fee’, ‘fi’, ‘fo’, and ‘fum’. It’s almost the same as if we’d written the code like this instead:

#!/usr/bin/env ruby

class BigMeanGiant
  def fee() puts "Giant says:  FEE!" end
  def fi()  puts "Giant says:  FI!"  end
  def fo()  puts "Giant says:  FO!"  end
  def fum() puts "Giant says:  FUM!" end
end

# invoke the methods, just for fun
begin
  g = BigMeanGiant.new
  g.fee
  g.fi
  g.fo
  g.fum
end

Running this version of the program has the same output.

What did we do that for?

Although this isn’t meant to be a lesson in metaprogramming, let’s make sure we’re all on the same page here. The second version is clearer, right? Why would you ever do the first version?

You almost certainly wouldn’t do it in an example this small, but the DRY principle tells us to avoid duplicating code. You can only get so far with function abstraction. Without metaprogramming, you can’t really compress the BigMeanGiant class much. You might factor out some of the repetition with a helper function:

class BigMeanGiant
  def say(msg) puts "Giant says #{msg}!" end
  def fee() say "FEE" end
  def fi()  say "FI" end
  def fo()  say "FO" end
  def fum() say "FUM" end
end

But it’s not much of a savings, because you still have to write all the stubs. Imagine you’re writing an HTMLOutputter class, with one method for every HTML tag — you’ll have to write a few dozen stubs, which is more than just annoying. It’s also probably more error-prone, since you’ll have so much code it’ll be harder to spot missed tags, duplicated tags, incorrect method bodies, and so on. And if you have to go back and change them all in some minor way, your refactoring editor may or may not be able to help, depending on what change you have in mind.
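To make the stub problem concrete, here is a minimal sketch of such a class with the stubs generated in a loop. The tag list, the method bodies, and the use of define_method (instead of the class_eval string from the first example) are all assumptions made for illustration.

```ruby
# Hypothetical HTMLOutputter: the tag list and method bodies are assumptions.
class HTMLOutputter
  TAGS = %w(div em strong h1 h2)

  # Generate one instance method per tag instead of writing each stub by hand.
  TAGS.each do |tag|
    define_method(tag) do |content|
      "<#{tag}>#{content}</#{tag}>"
    end
  end
end

out = HTMLOutputter.new
puts out.em("fee fi fo fum")
puts out.h1("Giant")
```

Adding a tag means adding one word to the list, and changing every method means editing one body.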

In short, having lots of similar-looking code is a Bad Thing.

To solve problems like this in Java, you either have to build elaborate and inevitably awkward dispatching infrastructure, or you have to use external code generators, then hack your build system to know how to generate and then use the generated code.

This, incidentally, is why you so often see generated code in large Java projects — it’s because Java offers no language-level ways to deal with problems like this. And of course, this is only one type of problem that’s solved elegantly with metaprogramming; there are many other classes of problem that are equally difficult to implement cleanly in Java.

OK, we’re all on the same page now, right? Generating code on the fly can lead to cleaner, more maintainable code, assuming you use taste and good judgement and blah blah blah. You get the idea.

The example explained

Continuing with my quest to get us all on the same page, let me make sure you understand the code in the first example. The relevant part is this blob right here:

%w(fee fi fo fum).each do |name|
  BigMeanGiant.class_eval <<-EOS
    def #{name}() 
      puts 'Giant says:  #{name.upcase}!'
    end
  EOS
end

This weird-looking snippet, interpreted in English, is saying:

  1. Make me a list of the strings “fee”, “fi”, “fo”, and “fum”.
  2. For each one of those strings:
    • substitute it into another string below, containing a Ruby method definition
      • The first time, use it as the method name.
      • The second time, use it (uppercased) as what the Giant says.
    • Then call class_eval to turn it into a real method on the BigMeanGiant class.

Make sense? We’re constructing method definitions in a loop, as strings, then passing them to the Ruby interpreter to attach them to a class. It’s not all that different from putting the code in a Ruby source file, then invoking the interpreter; it’s just that we’re controlling the process ourselves at runtime.
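One way to see the two stages separately is to build the string first, look at it, and only then hand it to class_eval. This is a sketch for a single name; the variable called code is an assumption, and the method returns its message instead of printing it so the result is easy to check.

```ruby
class BigMeanGiant
end

name = "fum"

# Interpolation happens here, while the string is built, not inside class_eval.
code = <<-EOS
  def #{name}()
    "Giant says:  #{name.upcase}!"
  end
EOS

puts code                      # the text the interpreter is about to see
BigMeanGiant.class_eval(code)  # now it becomes a real method
puts BigMeanGiant.new.fum
```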

The argument to class_eval is a string. The string contains code. Before class_eval gets hold of it, it’s Pinocchio, wanting to be a Real Boy. class_eval is the fairy that sends him off to Pleasure Island to be ridiculed and learn valuable lessons, or whatever the interpreter does in its Big Black Box.

So far, so good. eval seems like a useful thing to have in your language, if you use it with caution.

Trouble in Paradise

So let’s say there’s a bug in my generated methods. Maybe the giant isn’t saying anything, or he’s saying the wrong thing. Let’s say I’m having trouble figuring out the bug by staring at my code-string, which is really just a template. It’s not real code until the interpreter finishes evaluating it and attaching it to the BigMeanGiant class.

So I fire up the debugger, and step through the code, and immediately notice a few things:

  1. The call to class_eval is atomic. The debugger just steps right over it.
  2. Calls to the generated methods are also atomic.
  3. I have no way of printing out the generated code.

In other words, your metaprogramming-generated code isn’t “first class” in the same way your normal source code is. It’s not visible to the debugger, and it’s not available to other tools either. (For instance, rdoc lets you include the source code in the generated documentation, but I don’t think there’s any easy way to have it know about your eval-generated code.)

There are some games you can play that might make some of these things achievable. For instance, you might be able to override class_eval to store the original source code (after the template substitution) in the class somewhere, and then provide an API for getting at it for your favorite debugger. But to the best of my knowledge, it’s not something that’s supported “out of the box” in Ruby, and it means that working with generated code is harder than it really needs to be.
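A rough sketch of that game might look like the following. It wraps class_eval instead of overriding it, and the names SourceRecorder, class_eval_with_source, and generated_sources are invented for this illustration; nothing here is a built-in Ruby facility.

```ruby
# Hypothetical helper: remembers every code string before evaluating it.
module SourceRecorder
  def generated_sources
    @generated_sources ||= []
  end

  def class_eval_with_source(code)
    generated_sources << code  # keep the text after template substitution
    class_eval(code)
  end
end

class BigMeanGiant
  extend SourceRecorder
end

%w(fee fi fo fum).each do |name|
  BigMeanGiant.class_eval_with_source <<-EOS
    def #{name}()
      "Giant says:  #{name.upcase}!"
    end
  EOS
end

# The generated code can now be printed on demand.
puts BigMeanGiant.generated_sources.last
```

This only recovers the text. It does nothing to help a debugger step through the generated methods.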

Even if I’m completely mistaken here, and someone comments with a way to print out a generated method’s source code (which would be pretty nifty), the whole experience still falls remarkably short of the metaprogramming facilities in Lisp.

To clarify, let’s peer more closely into the lifecycle of that generated code. There are some distinct activities that rush right by us in Ruby, things we might actually want some control over.

We really will make our way to symbols soon, promise.

Constructing the code string

We start with a string, which the first example has in a “here doc” — one of Ruby’s genuine Perl-isms that you’re free to view with suspicion. Python’s syntax would be a triple-quoted string, which I think is nicer, but what’s done is done. Here’s the string again:

    def #{name}() 
      puts 'Giant says:  #{name.upcase}!'
    end

It could just as easily have been a normal, double-quoted string, even a one-liner:

  "def #{name}() puts 'Giant says: #{name.upcase}!' end"

However, because dynamically-generated code is notoriously tricky to debug, most of the time you’ll want to format code in template strings as clearly as possible.

I’m calling it a template because Ruby strings can contain inline expressions, delimited with #{}. In Java you’d use string concatenation, e.g.:

"Giant says: " + getThingGiantSays() + "!"

Python has the printf-like % operator, and other languages have their own approaches. The Ruby way is probably more readable if the substituted expressions are short; using something like sprintf (which Ruby also has) will be better if there are long expressions. Basically you want to do whatever makes the code template look as much as possible like the code it’s going to turn into.
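For comparison, here is the one-line template built both ways. The variable names are assumptions; format is Ruby's alias for sprintf.

```ruby
name = "fum"

# Inline interpolation: reads well when the substituted expressions are short.
inline = "def #{name}() puts 'Giant says: #{name.upcase}!' end"

# format keeps the expressions out of the template and lists them afterwards.
via_format = format("def %s() puts 'Giant says: %s!' end", name, name.upcase)

puts inline
puts inline == via_format
```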

Here’s Secret Observation #1: in Lisp, your code template isn’t a string. It’s a data structure that represents the tokenized and partially-parsed code. If Ruby had this feature, the BigMeanGiant example might look something like this:

%w(fee fi fo fum).each do |name|
  BigMeanGiant.class_eval START_CODE_TEMPLATE
    def #{name}() 
      puts 'Giant says:  #{name.upcase}!'
    end
  END_CODE_TEMPLATE
end

I put those big START/END tokens there in an attempt to make it clear that what’s inside them is NOT a real boy; it’s Pinocchio, and it will take some major Good Fairy work to make it real code.

But notice that the code inside the template is actually syntax-highlighted properly. When it was all inside a string (heredoc, double-quoted, or otherwise — it’s still just a string), it was all highlighted in light blue, which is what my editor tells me Strings should look like. My editor was nice enough to highlight the substitution expressions in brown, but you still need to realize they’re substituted before the final string is used as an argument to class_eval. But inside the CODE_TEMPLATE, we know it’s going to be code, so we can invoke the syntax-highlighter on it. Helps you see what’s going on more clearly. And auto-indenting, tagging, and other IDE functions will work on it. Muuuuuch nicer than code in a string, wouldn’t you agree?

Imagine that you could pass around one of those CODE_TEMPLATE doohickeys as an object, one that actually represented the Pinocchio-code in a way that let you traverse it and modify it before passing it off to eval. That seems like it could come in quite handy, and in fact it does. For one thing, it makes it far easier to do meta-metaprogramming, where you’re writing code that generates those code templates. But at a perhaps more mundane level, it makes it possible to create new syntactic constructs in the Ruby language.

At this point, some people will cringe and shudder and proclaim: “Evil! What you just said is Pure Evil!” Lots of programmers, maybe even most of them, are so irrationally afraid of new syntax that they’d rather leaf through hundreds of pages of similar-looking object-oriented calls than accept one new syntactic construct. I blogged about this once, in an article called Language Trickery and EJB. That article actually managed to convince a bunch of hardcore Java programmers that new syntax might actually be a useful tool. Maybe it’ll convince you too. If not, well, feel free to skip to the next section.

It would actually take me too far afield to go through a detailed example of how adding a new syntactic control-flow construct to Ruby could turn into a huge benefit for your project. Imagine, though, that Ruby didn’t have here-docs, and that you were practically drooling with jealousy over Python’s triple-quoted strings. If you’re a Java programmer, and you’re not drooling purely out of habit, then you should definitely drool over multi-line strings. It boggles the mind that they didn’t include it as a language feature, and in Java we wind up doing zillions of manual concatenations to produce long strings (which usually by then look nothing like the thing they’re trying to represent.) Ah, me.

If Ruby didn’t have here-docs, but Ruby had those CODE_TEMPLATE thingies and one system hook that allowed you to control the evaluation of those templates, then you could implement here-docs pretty easily. Because the code-to-be is represented as a data structure, allowing you to quickly and easily filter out the #{}-substitution elements, you could simply evaluate whatever’s inside those elements, and not evaluate anything else in the template. That’s all they really do. And of course (much) more sophisticated syntactic constructs are also possible, if you put in more work.

That’s the kind of thing Lisp programmers do for breakfast before going and writing their application code. And the funny thing is, it could really be super easy in Ruby — maybe even easier than in Lisp. It’s just that Ruby doesn’t support it today.

If the hairs are all standing up on the back of your neck, and you’re just recovering from shock and trying to think of the dirtiest word you could possibly call me, well, take a few deep breaths, nice and slow. It just means I’m a dog person and you’re a cat person, or something like that. Let’s not bite each other. Many people (notably Paul Graham in “On Lisp”) have spent lots of effort explaining how this kind of programming has to be treated with MUCH more deference and caution than ordinary API programming. Language extensions and minilanguages can be extremely powerful and useful — imagine where we’d be without regular expressions, for instance — but they also require tons more care, documentation, and thought than defining an ordinary function.

You’re already sort of doing this kind of “language extension” programming every time you call eval — for that matter, you’re doing it whenever you invoke a separate code generator, or open up a class and add stuff to it, or use a tool like yacc or ANTLR. We’re completely surrounded by languages, large and small, all the way down to the minilanguage you use for ordering coffee at Starbucks. It’d be hard to get along without them.

Evaluation

Once you have that code template as an actual object, as opposed to a string that you need to parse yourself, then you could do all sorts of things with it. For one thing, you could pretty-print it. It’s effectively in parse-tree format, so all you’d need to do is decide the rules for line breaks and spacing between various token types. For another thing, you could tell your debugger about it, which would allow you to inspect and step through generated code. And evaluation — the creation of actual code from your template — would no longer be the black box that it is in Ruby today (and in Python, Perl and JavaScript, for that matter.) More control means more opportunities to remove DRY violations, and do so in a way that has strong(er) long-term maintainability characteristics. I mean, you have to admit, not being able to inspect or step through your generated code makes maintenance a bit of a tricky proposition.

(Note: see the important correction Jim Weirich made in the comments section. –steve)

Symbols at last

Those nonexistent code templates I’ve been referring to — that is, objects (collections, really) that represent snippets of code to be evaluated — they’re really just syntax trees representing your source code. They’re similar to the output you’d get from any parser, including generated parsers from tools like ANTLR. Or maybe a more familiar example is the XML DOM — an object-tree representation of the parsed XML file. You have to admit, working with a DOM is a lot more convenient than working with a string containing raw XML. It’s a huge difference, and it’s a feature Lisp has that Ruby mostly lacks, at least today. A set of features, really: it’s a rich programming domain.

In a system with first-class syntax trees represented as language entities, in a way that allows you to interact with the lexer, parser, and evaluator (i.e. different components of the Ruby interpreter), symbols make a whole lot more sense. A symbol is literally an object that represents a name in the code tree. If you had a code template snippet representing this code:

  def fum() say "FUM" end

Then your syntax tree would contain a Symbol object for each token in the code, with two exceptions. The string “FUM” would be a String, because that’s just a string and not a source identifier or keyword. The parens in the arg list are the other exception, but that’s another long story that we don’t have time for today.
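For a taste of what such a tree can look like, Ruby releases from 1.9 onward bundle a parser library called Ripper, which postdates this post. The sketch below is not the Lisp-style facility described here: Ripper only reads code, and it hands identifier names back as Strings tagged with a Symbol node type instead of as Symbols.

```ruby
require 'ripper'
require 'pp'

# Ripper.sexp parses source text and returns nested arrays describing it.
tree = Ripper.sexp('def fum() say "FUM" end')
pp tree

# Node types show up as Symbols; the literal text "FUM" stays a String.
flat = tree.flatten
puts flat.include?(:def)
puts flat.include?("FUM")
```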

So Ruby’s symbols are really a placeholder for grand things to come. Ruby is already a very powerful, capable language, but it has some weaknesses in its ability to process Ruby code at runtime. Your only real tool today is eval (which comes in several flavors in Ruby, but that’s irrelevant to our discussion), and it’s a big black box. Once your code template is handed over to the Good Fairy, crossing that magical line between your program and the Ruby interpreter, you’ve lost it, and what you get back is effectively an opaque binary blob wrapped in a thin Method (or UnboundMethod, etc.) class that doesn’t remember much about its original symbolic representation.
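A small sketch shows how little the resulting object remembers. The method body returns a string here, which is a change from the original example. source_location is available on Ruby 1.9 and later, and for string-evaluated code it typically reports a placeholder instead of a real file.

```ruby
class BigMeanGiant
end

BigMeanGiant.class_eval <<-EOS
  def fum()
    "Giant says:  FUM!"
  end
EOS

m = BigMeanGiant.instance_method(:fum)
puts m.class   # UnboundMethod
puts m.name    # fum
puts m.owner   # BigMeanGiant
puts m.arity   # 0

# There is a location, but no way back to the template text itself.
p m.source_location
```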

Well, that went on way too long. Was it helpful?

Both Jim Weirich and Yohanes Santoso posted about Ruby’s Symbol class today. As Yohanes noted, not many weeks go by without a question about Symbols. I thought I’d throw in my two cents, expanding on a mailing list response ([ruby-talk:172842]). First, let’s look at what ri has to say about Symbol objects:

Symbol objects represent names and some strings inside the Ruby interpreter. They are generated using the :name and :"string" literals syntax, and by the various to_sym methods. The same Symbol object will be created for a given name or string for the duration of a program’s execution, regardless of the context or meaning of that name. Thus if Fred is a constant in one context, a method in another, and a class in a third, the Symbol :Fred will be the same object in all three contexts.
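The "same Symbol object" claim in that description is easy to check. This is a minimal sketch; the variable names are assumptions, and the strings are created with dup so that each one is guaranteed to be a separate object.

```ruby
a = :Fred
b = "Fred".to_sym
puts a.equal?(b)     # true: one Symbol object per name

s1 = "Fred".dup
s2 = "Fred".dup
puts s1 == s2        # true: same characters
puts s1.equal?(s2)   # false: two separate String objects
```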

The beginning of the confusion about Symbols being like Strings is straight from the source in this case. As both Yohanes and Jim point out, though, Symbols shouldn’t be used as “Strings-lite”. At a minimum, to be useful that way, you’d have to convert each Symbol into a String (#to_s) before you could operate on it. If you shouldn’t use Symbols as if they were immutable strings, what good are they? Yohanes covers the theory well: helping to express intent. Jim covers the practical reasons well:

  1. Naming keyword options in a method argument list.
  2. Naming enumerated values (e.g. like enums in C).
  3. Naming options in an option hash table.
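Here is a small sketch of those uses. The connect method, its option names, and the STATES list are hypothetical, invented only for this illustration.

```ruby
# Hypothetical method: Symbols name the options in an option hash.
def connect(options = {})
  defaults = { :host => "localhost", :port => 80 }
  defaults.merge(options)
end

# Symbols as enumerated values, much like an enum in C.
STATES = [:pending, :active, :closed]

# At the call site the Symbol reads like a keyword argument.
settings = connect(:port => 8080)
puts settings[:host]
puts settings[:port]
puts STATES.include?(:active)
```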

As Jim says, “Symbols are about naming and identifying things.” I don’t quite agree with his definition (“A Symbol is an object with a name”), preferring to say that a Symbol is an object that is a name. It’s a very subtle difference, mind you, but an important one in that there is no meaningful distinction between a Symbol and the name that it represents. That said, we’re still left with the question asked by Steve Litt, “One thing—why not some_call(:@my_variable)?”

This is where Yohanes’s discussion on intent matters. You aren’t naming your variable when you call attr_accessor. You’re naming your method. The magic here isn’t in the Symbol; it’s in attr_accessor.

>> class Foo
>>   attr_accessor :bar
>> end
=> nil
>> baz = Foo.new
=> #<Foo:0x2d8aea8>

Thus far, baz has no instance variables. But it does have two instance methods:

>> baz.methods - Object.methods
=> ["bar", "bar="]

If I call the reader method (Foo#bar), I still don’t get an instance variable:

>> baz.bar
=> nil
>> baz
=> #<Foo:0x2d8aea8>

It’s only when I call the setter method (Foo#bar=) that my instance variable is created:

>> baz.bar = 32
=> 32
>> baz
=> #<Foo:0x2d8aea8 @bar=32>

The intent of the Symbol is that it’s just a name. The intent of attr_accessor is that it creates two methods for each name that it’s been given. It is almost coincidental that these methods work on a variable of the same name. It doesn’t have to be the same, as I demonstrate below.

require 'digest/md5'
class Module
  def md5_accessor(*names)
    names.each do |name|
      var = Digest::MD5.hexdigest(rand(65536).to_s)
      define_method(name) { || instance_variable_get("@_#{var}") }
      define_method("#{name}=") { |v| instance_variable_set("@_#{var}", v) }
    end
    nil
  end
end

class Foo
  md5_accessor :bar, :baz
end

moo = Foo.new
moo.bar = 5
moo.baz = 7
moo
puts moo.bar, moo.baz

Gregory Brown

I’ve always liked the fact that nil evaluates to false. This is handy in conditional expressions and makes ||= possible.

However, something I didn’t expect is a bit annoying:
nil.to_i gives you a 0, which evaluates to true in Ruby.

So stuff like this happens:

irb(main):002:0> puts "hello" if nil.to_i
hello
=> nil

At first glance, it seems like nil or an exception might have been a better result than zero.

Does anyone have any insight as to why nil.to_i returns zero and why that might be more useful or better than an alternative solution?
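This is not an answer to the question, only a sketch of the surrounding behavior. The to_* conversions on nil each return the "empty" value of the target type, while the stricter Kernel#Integer raises instead.

```ruby
puts nil.to_i          # 0
puts nil.to_s.empty?   # true
puts nil.to_a.empty?   # true

# Kernel#Integer refuses nil instead of inventing a zero.
begin
  Integer(nil)
rescue TypeError => e
  puts "Integer(nil) raised #{e.class}"
end

# 0 is truthy in Ruby, so test for nil itself when that is what you mean.
value = nil
puts "hello" unless value.nil?   # prints nothing
```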

In my previous entry I had some takers on trying to write a nice, compact method that rounds a float to a given number of decimal places.

For fun, I decided to benchmark each implementation, along with a small C implementation I wrote. Here’s the lineup:

require 'benchmark'
require 'prec_c'
include Benchmark

class Float
  def prec_caleb1(x)
    sprintf("%.0" + x.to_i.to_s + "f", self).to_f
  end

  def prec_caleb2(x)
    (("%.0" + x.to_i.to_s + "f") % self).to_f
  end

  def prec_james(x)
    ("%.0#{x.to_i}f" % self).to_f
  end

  def prec_jonas(x)
    (self * 10**x).to_i / (10**x).to_f
  end

  def prec_fansipans(x)
    to_s[/.*\..{#{x}}/]
  end
end

n = 50000
bm do |x|
  r = rand(6)
  x.report { for i in 1..n; 5.44235102.prec_c(r); end; }
  x.report { for i in 1..n; 5.44235102.prec_caleb1(r); end; }
  x.report { for i in 1..n; 5.44235102.prec_caleb2(r); end; }
  x.report { for i in 1..n; 5.44235102.prec_james(r); end; }
  x.report { for i in 1..n; 5.44235102.prec_jonas(r); end; }
  x.report { for i in 1..n; 5.44235102.prec_fansipans(r); end; }
end

I ran the code a couple of times to make sure the results came up roughly the same. They did. Here are the results (sorry for the formatting):

impl        user      system    total     real
caleb_c     0.330000  0.000000  0.330000  ( 0.325121)
caleb1      0.450000  0.000000  0.450000  ( 0.454775)
caleb2      0.460000  0.000000  0.460000  ( 0.456334)
james       0.410000  0.000000  0.410000  ( 0.405698)
jonas       0.480000  0.000000  0.480000  ( 0.486741)
fansipans   0.950000  0.000000  0.950000  ( 0.944716)

It looks like James is the winner, at least in terms of efficiency (as long as you don’t count my C extension, which kind of breaks the Ruby spirit).
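Benchmark can print a label on each row itself, which avoids adding them by hand. The sketch below repeats two of the implementations from the lineup; the label width of 10 and the small n are assumptions chosen so it runs quickly.

```ruby
require 'benchmark'

class Float
  def prec_james(x)
    ("%.0#{x.to_i}f" % self).to_f
  end

  def prec_jonas(x)
    (self * 10**x).to_i / (10**x).to_f
  end
end

n = 1_000
# The argument to bm is the label column width; report takes the row label.
Benchmark.bm(10) do |x|
  x.report("james") { n.times { 5.44235102.prec_james(3) } }
  x.report("jonas") { n.times { 5.44235102.prec_jonas(3) } }
end

# The two are not interchangeable: one rounds, the other truncates.
puts 5.449.prec_james(2)
puts 5.449.prec_jonas(2)
```

The last two lines are worth a look before comparing timings, since the format-based versions round and the arithmetic version truncates.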

Here’s that code, in case you’re interested:

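#include "ruby.h"
#include <stdio.h>

/* Float#prec_c(prec): round self to prec decimal places via a "%.Nf" format. */
static VALUE prec_c(VALUE self, VALUE prec)
{
  /* Build the format string "%.<prec>f" by calling prec.to_s from C. */
  VALUE str = rb_str_plus( rb_str_new2("%."), rb_funcall(prec, rb_intern("to_s"), 0) );
  VALUE str2 = rb_str_plus( str, rb_str_new2("f") );

  char s[64];

  /* snprintf bounds the write so a long result cannot overflow the buffer. */
  snprintf(s, sizeof(s), StringValuePtr(str2), NUM2DBL(self) );
  return rb_funcall(rb_str_new2(s), rb_intern("to_f"), 0);
}

void Init_prec_c(void) {
  rb_define_method(rb_cFloat, "prec_c", prec_c, 1);
}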

I’m sure someone can find some flaws in my code that make the benchmark unfair, but for the most part I think the data is fairly reliable.

Robby Russell

Disclaimer: This is my first post on the O’Reilly Ruby blog. I decided that I wanted to do my own Hello World… but decided that I’d just pick on our friend Enumerable and give you a quick 90 second tutorial. Start your watch… now.

Let’s build an array of hashes

Let’s take for example this nice array that we then populate with some hashes. In this case, a few of my co-workers and their current job titles.

argonistas = Array.new
argonistas << { :full_name => 'Allison Beckwith', :title => 'Creative Director' }
argonistas << { :full_name => 'Jeremy Voorhis', :title => 'Lead Architect' }
argonistas << { :full_name => 'Robby Russell', :title => 'Founder' }
argonistas << { :full_name => 'David Gibbons', :title => 'Lead Systems Adminstrator' }

This should be pretty straight-forward.

How do we #sort_by full name?

Okay, let’s sort this array of hashes by the value of full_name in each hash. To do this, we can use #sort_by, which comes with the Enumerable module.

argonistas.sort_by { |argonite| argonite[:full_name] }

Great. We have the ability to quickly sort the array of hashes.
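One detail worth a quick sketch: #sort_by returns a new array and leaves the original alone, and it gives the same order as #sort with an explicit comparison block. The two-element people array below is an assumption, used to keep the example short.

```ruby
people = [
  { :full_name => 'Robby Russell',    :title => 'Founder' },
  { :full_name => 'Allison Beckwith', :title => 'Creative Director' }
]

by_key   = people.sort_by { |person| person[:full_name] }
by_block = people.sort { |a, b| a[:full_name] <=> b[:full_name] }

puts by_key == by_block          # both approaches give the same order
puts by_key.first[:full_name]    # the sorted copy starts with Allison
puts people.first[:full_name]    # the original array is unchanged
```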

Let’s throw that array into a block

Let’s take it one step further and iterate through each hash in the array, printing everyone’s full name and job title. To do that, we chain #each onto the sorted array and pass it a block that prints these values.

argonistas.sort_by { |argonite| argonite[:full_name] }.each do |argonite|
  puts "#{argonite[:full_name]}, #{argonite[:title]}"
end

Your STDOUT loves your hash keys

With this, we now get the following output:

Allison Beckwith, Creative Director
David Gibbons, Lead Systems Adminstrator
Jeremy Voorhis, Lead Architect
Robby Russell, Founder

It’s Over!

As you can see, it doesn’t take much to really embrace the power of Ruby without turning your code into an ugly mess.

I have a small repository of utility code that almost all of my Ruby projects require, because I find it so beneficial to what I do. One such function is a quick precision hack for the Float class. Here’s the code:

class Float
  def prec(x)
    sprintf("%.0" + x.to_i.to_s + "f", self).to_f
  end
end

This is nice, because it allows you to quickly round a float to a certain number of decimal places without having to resort to much trickery.

irb> i = 100.0 / 9.4
=> 10.6382978723404
irb> i.prec(2)
=> 10.64

It did get me thinking though: is there a smarter way of implementing this method?

We can get rid of the sprintf and just use the % method.

class Float
  def prec(x)
    (("%.0" + x.to_i.to_s + "f") % self).to_f
  end
end

It still seems to me like that code could be simplified even further. Do any of you have any ideas how to make it any more compact?
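One possible answer, assuming Ruby 1.9 or later where Float#round accepts a number of digits, is to skip the string round trip entirely. The name prec_round is made up here to avoid clashing with prec above.

```ruby
class Float
  # prec_round is a hypothetical name; it delegates to the built-in round.
  def prec_round(x)
    round(x.to_i)
  end
end

i = 100.0 / 9.4
puts i.prec_round(2)   # prints 10.64
```

On such versions the wrapper is optional, since i.round(2) does the same job directly.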