
Integrating R into your Ruby services and applications

Posted on April 30, 2010

R is an excellent tool for statistical analysis and machine learning. It was designed with statistics, data analysis, and graphics in mind, and CRAN, the Comprehensive R Archive Network, is a rich resource of contributed packages. Any researcher interested in data analysis will benefit from using R. It is, however, a specialized tool that should ideally be integrated into larger toolsets or more general-purpose environments (for example Ruby or Python).

Recently Randall Thomas of Evil Martini and Engine Yard came to pdxruby to give a talk on machine learning and Ruby. I enjoyed Randall’s tour through the basics of statistical analysis with R, and I also enjoyed the fact that even though the talk was billed as a Ruby talk, it was essentially a short talk on R and statistical analysis given to a Ruby crowd. The ease with which Randall introduced R to Rubyists speaks volumes, in my opinion, about the maturing of the open source communities and the ease with which the “artifacts” of open source development are now wielded. That ease is due in no small part to collaboration within and between the various communities.

PDXRuby is an excellent example of a thriving community. I had heard that Portland had the best and most enthusiastic Open Source user groups, and I was really impressed with the Portland Ruby Group. Randall concurred that the Portland group was impressive. I can see why companies want to move here.

I installed RSRuby and RPy2 along with R 2.10 on CentOS 5.4, which went fairly smoothly. As in many areas of scientific computing, the Python community is stronger and further along than the Ruby community, and Python is still the leader when it comes to bridging to R. RPy2 is therefore the reference implementation, and RSRuby is modeled on the Python implementation.

I built R 2.10.1 from source with:


./configure --enable-R-shlib
make
sudo make install

The configure script for R is mature and will advise you to install any desirable libraries, such as BLAS, or any missing compilers.

If you are working with Ruby 1.9.1, for now I’d recommend either getting the gem from Alex Gutteridge’s GitHub account or cloning the repository and installing with setup.rb.

The gem install did not work for me, even though I set the R_HOME variable in my .bashrc before running the gem command, as per the instructions:


export R_HOME=/usr/local/lib/R

The following did work.

First I cloned the repository:


git clone git://github.com/alexgutteridge/rsruby.git

Then:


ruby setup.rb config -- --with-R-dir=$R_HOME
ruby setup.rb setup
sudo ruby setup.rb install

When the install completed, I fired up irb:


irb(main):001:0> require 'rsruby'
=> true
irb(main):002:0> r = RSRuby.instance
=> ##, "T"=>true, "TRUE"=>true, "F"=>false, "FALSE"=>false, "parse"=>#, "eval"=>#, "NA"=>-2147483648, "NaN"=>NaN, "help"=>#, "helpfun"=>#}>
irb(main):003:0>
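Since the install above hinges on R_HOME pointing at the R installation, it can help to check the variable from Ruby before requiring rsruby. The following is only a minimal sketch: the helper name `r_home_ok?` is hypothetical (it is not part of RSRuby), and it checks nothing more than that R_HOME is set and names an existing directory.

```ruby
# Hypothetical helper, not part of RSRuby: returns true when R_HOME is set,
# non-empty, and names an existing directory. It does not verify that the
# directory actually contains an R installation.
def r_home_ok?(env = ENV)
  home = env['R_HOME']
  !home.nil? && !home.empty? && File.directory?(home)
end

if r_home_ok?
  puts "R_HOME looks usable: #{ENV['R_HOME']}"
else
  warn "R_HOME is unset or not a directory; export it before installing or loading rsruby"
end
```

Passing the environment in as an argument keeps the check easy to exercise with a plain Hash instead of the real ENV.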

Now we are ready. Fortunately, Peter Lane, an AI researcher who keeps a blog called Ruby for Scientific Research, has already posted on working with R from Ruby, and I’ve linked directly to his RSRuby posts.

Below I am going to modify his first script to use the R exp() function and plot a line:

require 'rsruby'

r = RSRuby.instance

# mod of Peter Lane's Ruby for Scientific Research plot example
# construct data to plot, graph of x vs exp(x)
xs = (0...10).to_a   # the integers 0 through 9
ys = xs.collect {|x| r.exp(x)}

r.png("exp_example.png")  # tell R we will create png file
r.plot(:x => xs,
       :y => ys,                            # (x,y) coordinates to plot
       :type => "o",                        # draw a line through points
       :col => "blue",                      # colour the line blue
       :main => "Plot of x against exp(x)", # add title to graph
       :xlab => "x", :ylab => "exp(x)")     # add labels to axes
r.eval_R("dev.off()")          # finish the plotting
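The script opens a PNG device with r.png and closes it with dev.off(). If the plotting code raises in between, the device stays open. One possible way to guard against that is a small block helper. This is a sketch under assumptions: `with_png` is a hypothetical name of my own, not an RSRuby method, and it relies only on the `png` and `eval_R` calls already used in the script above.

```ruby
# Hypothetical helper, not part of RSRuby: opens a PNG device on the given
# R object, yields it for plotting, and always closes the device afterwards,
# even when the block raises.
def with_png(r, path)
  r.png(path)                  # tell R we will create a png file
  begin
    yield r                    # caller does its plotting here
  ensure
    r.eval_R("dev.off()")      # finish the plotting, error or not
  end
end

# Usage, assuming rsruby is installed and R_HOME is set:
#   require 'rsruby'
#   with_png(RSRuby.instance, "exp_example.png") do |r|
#     r.plot(:x => [0, 1, 2], :y => [1, 3, 7], :type => "o")
#   end
```

The helper takes the R object as a parameter rather than calling RSRuby.instance itself, so it can be tried out with a stand-in object on a machine without R.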

And here is the plot:
[Figure: Plot of x against exp(x)]
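The values being plotted can also be sanity-checked without R. The sketch below swaps Ruby's own Math.exp in for the r.exp call used in the script (that substitution is mine, purely for checking) and prints the (x, exp(x)) pairs that end up on the graph.

```ruby
# Same data as the plot script, but computed with Ruby's Math.exp
# instead of R's exp(), so it runs without RSRuby or R.
xs = (0...10).to_a
ys = xs.map { |x| Math.exp(x) }

# Print each (x, exp(x)) pair, one per line.
xs.zip(ys).each do |x, y|
  puts format('%d  %.4f', x, y)
end
```

Comparing these numbers against the points in the PNG is a quick way to confirm that the data made the round trip through R intact.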

Randall covered some of the basics of initial data analysis in R, and if there is interest we can discuss some of the issues he brought up. I do remember an interesting discussion about initial methods of analysis and first looks at distributions of data.

2 Comments
  1. blkperl permalink

    Excellent Post. I look forward to my first RUBYPDX meeting.
