183 Unit Conversion

Description

Google added a calculator to its search engine a while back. Enter “convert 50 miles to kilometers”, or even just “50 mi to km”, and the first “search” result will tell you that 50 miles is 80.4672 kilometers. This works for units other than length. Try “33 ml to gal”, “6 hours to minutes”, and “50 stones to lbs”, and you’ll see that Google’s calculator knows a lot of different units and how to convert between them all.

Your task is to write a units converter script. The input to the script must be three arguments: the quantity, the source units, and the destination units. The first example above would be run like this:

$ ruby convert.rb 50 miles kilometers

Or, using abbreviations:

$ ruby convert.rb 50 mi km

Support as many units and categories of units (i.e. volume, length, weight, etc.) as you can, along with appropriate abbreviations for each unit.
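As a rough illustration of the command-line interface described above, here is a minimal sketch. It is not any submitted solution: the METERS_PER and ALIASES tables and the convert helper are hypothetical names, and only a handful of length units are covered, each expressed in meters.

```ruby
# Hypothetical table: each supported length unit expressed in meters.
METERS_PER = { "mi" => 1609.344, "km" => 1000.0, "m" => 1.0, "in" => 0.0254 }.freeze

# Hypothetical long names mapped onto the abbreviations above.
ALIASES = { "miles" => "mi", "kilometers" => "km", "meters" => "m", "inches" => "in" }.freeze

# Convert via the common base unit; unknown units raise KeyError.
def convert(quantity, from, to)
  from_factor = METERS_PER.fetch(ALIASES.fetch(from, from))
  to_factor   = METERS_PER.fetch(ALIASES.fetch(to, to))
  Float(quantity) * from_factor / to_factor
end

# Run as: ruby convert.rb 50 miles kilometers
if __FILE__ == $PROGRAM_NAME && ARGV.size == 3
  puts convert(*ARGV)
end
```

Routing every conversion through one base unit per category keeps the table small, at the cost of needing a separate table (or a category tag) for volume, weight, and so on.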

Summary

The right way, generally, to do a task such as unit conversion is to see if someone has already done all the hard work for you. As was pointed out, there are several options in this respect:

Many thanks to Martin Boese, whose solution had to be empirically confirmed. Repeatedly.

But I’m going to look at the solution from Robert Dober. While it is limited as posted, his data-driven approach could be expanded to include more conversions.

To understand how the expression 1.0.in.to.mm will generate the string “25.4mm”, I’ll trace it a step at a time, looking at the relevant bits of code.

First, we have the float value 1.0, but where does the method in come from? Clearly, class Float gets something by way of extension:

class Float
 include Conversion
end

Module Conversion only defines one method that will extend Float (with the rest of Conversion being helper classes and code executed when Conversion is first evaluated). That method is method_missing:

 def method_missing unit_name
   pc = ProxyClasses[ unit_name.to_s ] || super( unit_name )
   pc::new self
 end

So we will look for ProxyClasses["in"] and, if it is not found, we just defer to the parent class and hope it knows what to do with the method call in. But in this case, we’re expecting to find something in ProxyClasses… a Class, in fact, which we attempt to instantiate immediately using new. But where do we fill ProxyClasses?

Ah, that would be the code right below method_missing in his solution: the code that makes use of LineParser.

conversions = LineParser::new
File::open "units.txt" do | f |
  f.each do | line |
    conversions.parse_line line
  end
end

Robert provided a minimal units.txt data file to show how the code works. (Note that the line beginning “use SI” is part of the data file and not a mistake; see parse_line for how that is handled.)

1 in = 0.0254 m
1 l  = 0.001 m3
use SI prefixes for m g l m3

It could be expanded greatly to support many more units. As each line is read, the LineParser object parses it, keeping track of the conversion rules (I’ll come back to that later). What I want to look at first is what gets done with those rules:

conversions.traverse do | src_unit, tgt_unit, conversion |
  ( ProxyClasses[ src_unit ] ||= Class::new ProxyClass ).module_eval do
    define_method tgt_unit do (@value * conversion).to_s + tgt_unit end
  end
end

traverse enumerates every valid conversion: source unit, target unit, and conversion factor. And here we see where the ProxyClasses originate… A new anonymous subclass of ProxyClass is created through the code Class::new ProxyClass, but only if one doesn’t already exist for the particular source unit. (Note the use of the ||= operator, which evaluates the right side and assigns it to the left only if the left was nil or false.)

After ensuring that the ProxyClass corresponding to the source units exists, we call module_eval in order to add methods to that anonymous class. The method name will be the target units, and the method multiplies in the conversion factor, converts to a string, and appends the target units.
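The class-building step can be tried in isolation. The following is a minimal sketch, assuming a bare-bones ProxyClass that only stores the value, a local proxy_classes hash, and a hand-written rules array standing in for whatever traverse would yield; none of these details are taken verbatim from the solution.

```ruby
# Assumed minimal base class: it only remembers the value being converted.
class ProxyClass
  def initialize(value)
    @value = value
  end
end

proxy_classes = {}

# Hypothetical, hand-written rules standing in for the output of traverse.
rules = [
  ["in", "m",  0.0254],
  ["in", "mm", 25.4],
  ["m",  "in", 1 / 0.0254]
]

rules.each do |src_unit, tgt_unit, conversion|
  # ||= creates the anonymous subclass only the first time a source unit is seen.
  klass = (proxy_classes[src_unit] ||= Class.new(ProxyClass))
  klass.class_eval do
    # The block is a closure, so each method remembers its own factor and unit.
    define_method(tgt_unit) { (@value * conversion).to_s + tgt_unit }
  end
end

puts proxy_classes["in"].new(1.0).mm   # prints 25.4mm
```

Because define_method takes a block rather than a string of code, the conversion factor is captured directly and never has to be interpolated into source text.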

So, getting back to our example 1.0.in.to.mm, we’ve now found the ProxyClass corresponding to 1.0.in. And we know that ProxyClass also has methods named by target units, which includes one that corresponds to the last part of the example: .mm.

If you’re wondering about to, every ProxyClass defines that method to return self: essentially a no-op (in the sense that 1.0.in.to.mm does nothing more than 1.0.in.mm). Its existence mimics other libraries, and the point is readability. (An alternative would be a more traditional call, such as 1.0.convert(:in, :mm) or similar.)
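Putting the traced pieces together, the whole chain can be reproduced in a few lines. This is a sketch under assumptions, not Robert’s code: the table is filled by hand with a single inches-to-millimeters rule instead of being read from units.txt, and the respond_to_missing? method is an addition that the solution does not necessarily have.

```ruby
class ProxyClass
  def initialize(value)
    @value = value
  end

  # Pure syntactic sugar: lets 1.0.in.to.mm read like a sentence.
  def to
    self
  end
end

# Hypothetical hand-built table; the solution fills it from units.txt instead.
ProxyClasses = { "in" => Class.new(ProxyClass) }
ProxyClasses["in"].class_eval do
  define_method("mm") { (@value * 25.4).to_s + "mm" }
end

module Conversion
  def method_missing(unit_name, *args)
    pc = ProxyClasses[unit_name.to_s]
    return super unless pc   # unknown names still raise NoMethodError
    pc.new(self)
  end

  # Added here so respond_to? agrees with method_missing.
  def respond_to_missing?(unit_name, include_private = false)
    ProxyClasses.key?(unit_name.to_s) || super
  end
end

class Float
  include Conversion
end

puts 1.0.in.to.mm   # prints 25.4mm
puts 1.0.in.mm      # same result without the sugar
```

Note that in is a Ruby keyword, but it is accepted as a method name when it follows a dot, which is what makes the expression legal.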

So once these proxy classes exist, there’s very little effort going on to evaluate calls such as our example. And creating the proxy classes isn’t much more difficult, assuming you have a proper conversion table. Now we come back to LineParser and what happens beyond its parse_line method. (I’ll skip parse_line itself, since it is a few, simple regular expressions.)

Most of units.txt that defines our conversions is going to be handled by add_conversion, which just receives as arguments the whitespace-separated pieces of each line of the data file. The conversion table (stored in @c) is a two-layered hash (a hash of hashes) and is set up with this code:

def add_conversion lhs_value, lhs_unit, equal_dummy, rhs_value, rhs_unit
  @c[ lhs_unit ][ rhs_unit ] = Float( rhs_value ) / Float( lhs_value )
  @c[ rhs_unit ][ lhs_unit ] = Float( lhs_value ) / Float( rhs_value )
end

The conversion ratio (and the inverse conversion ratio) are stored in two places based on the indexing order. By storing both ratios/orders, we can convert in “both directions”. That is, for our example, not only can we convert inches to millimeters, but millimeters to inches.
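To see what the table looks like after one line of the data file, here is a small sketch. The ConversionTable wrapper and the way @c is initialized (a Hash with a default block that creates inner hashes on demand) are assumptions; only the body of add_conversion comes from the code above.

```ruby
class ConversionTable
  attr_reader :c

  def initialize
    # Assumed setup: inner hashes spring into existence on first access.
    @c = Hash.new { |hash, unit| hash[unit] = {} }
  end

  def add_conversion(lhs_value, lhs_unit, _equal_dummy, rhs_value, rhs_unit)
    @c[lhs_unit][rhs_unit] = Float(rhs_value) / Float(lhs_value)
    @c[rhs_unit][lhs_unit] = Float(lhs_value) / Float(rhs_value)
  end
end

table = ConversionTable.new
# Splitting on whitespace yields exactly the five arguments, "=" included.
table.add_conversion(*"1 in = 0.0254 m".split)

puts table.c["in"]["m"]   # prints 0.0254
puts table.c["m"]["in"]   # the inverse factor, roughly 39.37
```

The factor stored at @c[a][b] is always "multiply a quantity in a by this to get b", which is the convention the rest of the code relies on.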

The last bit of file parsing is adding appropriate metric prefixes (SI prefixes). One line in the file indicates which units are worthy of metric prefixes. In the data file provided, we see that meters can accept metric prefixes (such as “kilo” and “milli”), but inches will not. These prefixes are handled by add_si_unit_for:

def add_si_unit_for unit
  SIUnits.each do | prefix, conversion |
    @c[ prefix + unit ][ unit ] = conversion
    @c[ unit ][ prefix + unit ] = 1 / conversion
  end
end

Here, unit is the particular unit for which we want to support metric prefixes. SIUnits is the hash mapping each metric prefix (as a string) to the corresponding power of ten. For every unit and metric prefix, two more conversions are added, each the inverse of the other: conversion between the naked unit and the adorned unit (e.g. between meters and millimeters, and vice-versa).
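A quick way to check the effect of the prefix expansion is to run it against an empty table. In this sketch the contents of SIUnits are a hypothetical three-entry subset, and add_si_units_to is a stand-alone helper that takes the table as an argument rather than using @c.

```ruby
# Hypothetical subset of prefixes; the real hash may contain many more.
SIUnits = { "k" => 1.0e3, "c" => 1.0e-2, "m" => 1.0e-3 }.freeze

# Stand-alone variant of add_si_unit_for that receives the table explicitly.
def add_si_units_to(table, unit)
  SIUnits.each do |prefix, conversion|
    table[prefix + unit][unit] = conversion      # e.g. km -> m is 1000
    table[unit][prefix + unit] = 1 / conversion  # e.g. m -> km is 1/1000
  end
end

table = Hash.new { |hash, unit| hash[unit] = {} }
add_si_units_to(table, "m")

puts table["km"]["m"]          # prints 1000.0
puts table["m"].keys.inspect   # prints ["km", "cm", "mm"]
```

Only links to and from the naked unit are stored; a conversion such as km to mm is left for the traversal to derive.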

Finally, traverse is an enumerator that will yield (via blk.call) every valid combination of units and the appropriate conversion factor. It manages this without storing every conversion (e.g. we store the inches to meters conversion, and the meters to millimeters conversion, but don’t explicitly store inches to millimeters). Enumerating every possible, valid conversion is done in the private method _traverse:

def _traverse src_unit, unit_conversions, traversed_units, f=1.0, &blk
  unit_conversions.each do | new_unit, conversion |
    next if traversed_units.include? new_unit
    blk.call src_unit, new_unit, f * conversion
    _traverse src_unit, @c[ new_unit ], traversed_units + [ new_unit ], f * conversion, &blk
  end
end

The final, recursive step here is what allows us to build a transitive closure over all units. src_unit is, of course, the source unit (e.g. inches). unit_conversions is the hash of units and conversion factors reachable in a single step from the unit currently being expanded (initially, the source itself). As you can see, we enumerate those into new_unit and conversion.

We skip a target unit if it’s already been visited (i.e. it is in traversed_units). Otherwise, we yield to the caller (blk.call) and recurse, now converting the source unit to everything the target unit can itself be converted to, multiplying the factors along the way (f * conversion) and extending traversed_units so that the recursion eventually terminates.
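The traversal can be exercised on a tiny graph to confirm that derived conversions appear. In this sketch the ConversionGraph class, its add method, and the public traverse wrapper (which starts one walk per known unit) are assumptions; _traverse follows the code shown above.

```ruby
class ConversionGraph
  def initialize
    @c = Hash.new { |hash, unit| hash[unit] = {} }
  end

  # Hypothetical helper: store a factor and its inverse.
  def add(src, tgt, factor)
    @c[src][tgt] = factor
    @c[tgt][src] = 1.0 / factor
  end

  # Assumed wrapper: start one walk from every unit in the table.
  def traverse(&blk)
    @c.keys.each do |src_unit|
      _traverse(src_unit, @c[src_unit], [src_unit], &blk)
    end
  end

  private

  def _traverse(src_unit, unit_conversions, traversed_units, f = 1.0, &blk)
    unit_conversions.each do |new_unit, conversion|
      next if traversed_units.include?(new_unit)
      blk.call(src_unit, new_unit, f * conversion)
      _traverse(src_unit, @c[new_unit], traversed_units + [new_unit], f * conversion, &blk)
    end
  end
end

graph = ConversionGraph.new
graph.add("in", "m", 0.0254)   # stored directly
graph.add("m", "mm", 1000.0)   # stored directly

derived = {}
graph.traverse { |src, tgt, factor| derived[[src, tgt]] = factor }

# in -> mm was never stored, yet the walk derives it (about 25.4).
puts derived[["in", "mm"]]
```

With three units the walk yields all six ordered pairs, including the two that exist only as a product of stored factors.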


Wednesday, February 04, 2009