169 Symbolify

Description

Your task this week is to count. Yup, that’s it.

Oh, by the way… You may only use the characters ?, *, (, ) and -.

Specifically, define a function symbolify that accepts an integer and returns a string composed of only the five characters shown above. The string, when evaluated, will equal the original number passed in.

That is, the following test code should raise no exceptions:

1000.times do |i|
  s = symbolify(i)
  raise "Not a string!"  unless s.is_a? String
  raise "Invalid chars!" unless s.delete("?*()-").empty?

  x = eval(s)
  raise "Decode failed!" unless i == x
end

There are at least a few different approaches that come to mind, so don’t expect everyone to have the same output. Well, not before the output is passed through eval, that is.

I am not requiring that you produce the shortest string (though you are welcome to attempt such a thing). The only thing I do ask is that you not produce megabytes of data to represent a simple integer.

P.S. Cheating is encouraged (except that you may not write code outside of symbolify).

Summary

The primary point of this week’s quiz was to highlight the ? pseudo-operator… Simply, a character preceded by ? is converted to its ASCII value. (I hesitate to call it a true operator, since it is almost certainly not defined nor implemented in such manner. Rather, I expect, it is part of Ruby’s lexer and so forms part of the definition of a number. But, in some ways, it looks like a unary operator.) Type ?( into irb and the interpreter will return the value 40.

Knowing this, the five quiz-permitted characters give easy access to five integer values.

> [?(, ?), ?*, ?-, ??]
=> [40, 41, 42, 45, 63]

But I specifically permitted four characters (aside from the ?) that can be used for a number of mathematical operations. Grouping/precedence, addition, subtraction, multiplication and exponentiation are all possible using those four characters, and so we can express values aside from the five shown above.

> ?) - ?(               # 41 - 40
=> 1

> (?* - ?() ** ?(       # 2 ** 40
=> 1099511627776

Using these techniques, it is possible to form an expression that evaluates to any number, though perhaps not compactly. Let’s look at a couple solutions.

Our first solution is from Alex LeDonne:

def symbolify(i)
  "??-??" + "-?(--?)" * i
end

Remember, it’s the string output from symbolify that, by request of the quiz specification, must contain only the five permitted characters. Alex begins with ??-??. As shove above, ?? evaluates to 63, so this expression is 63-63: that is, zero.

I’m going to examine the next string in a few steps, just to make it clear what’s going on. First, add in some whitespace.

"- ?( - - ?)"

Second, replace the ? pseudo-operators with appropriate ASCII values.

"- 40 - - 41"

Third, realize that one of those - characters now represent unary negation operators.

"- 40 - -41"

Fourth, simply the mathematical identity that subtracting a negative is the same as adding the positive.

"- 40 + 41"

Finally, do the math.

"+ 1"

So now let’s look at Alex’s solution again, using these equivalent strings, to understand the intent.

def symbolify(i)
  "0" + "+1" * i
end

So the call symbolify(5) results in a string that is effectively 0+1+1+1+1+1. Which, when evaluated, equals five. It’s not the most efficient way to count, but it is the simplest. Of course, this conversion was done just to show how Alex’s solution produces the correct values on evaluation. The actual output of symbolify(5) is:

=> "??-??-?(--?)-?(--?)-?(--?)-?(--?)-?(--?)"

I apologize for a bit long-winded explanation, but I wanted to make it clear for anyone who might be confused about all this. For the rest of the solutions I look at here, I’ll skip the string and symbol details and look specifically at the algorithm.

As programmers, I certainly expected someone to provide a simple powers-of-two solution, and Bill Kelly provides one. (Bill golfed this to a single line, but it is shown here with additional whitespace for readability.)

def symbolify(i)
  x = "(?(-?()"
  i.to_s(2).each_byte { |d|
    x="(#{x}*(?*-?()--(?#{(d-8).chr}-?())"
  }
  x
end

The first line assigns a default value of zero. The second line enumerates over the digits of the binary representation of the input (as provided by to_s(2)). For each digit, the preceding value of x is doubled (multiplication by (?*-?()), and then the next digit is added in. This is basic binary counting.

What might be a little confusing here is why eight is subtracted from d. Realize that enumerating the binary string using each_byte sets d to the ASCII value: in this case, either 48 or 49, the ASCII values of “0” and “1”. Subtracting eight gives us 40 or 41, equivalent to ?( or ?).

As a final non-cheating solution, I want to look at one from Dana Merrick.

require 'mathn'

def symbolify(i)
   i.zero?? "??-??":"-"+("(?(-?))-"*i).chop
end

def symbolify_short(i)
   return "??-??" if i.zero?
   factors = i.prime_division
   factors.inject("") do |string,pair|
     string << "(" << symbolify(pair.first) << ")**(" 
            << symbolify(pair.last) << ")*"
   end.chop
end

Dana’s symbolify function is very similar to Alex’s solution above: counting by ones. But in the added function, symbolify_short, Dana makes use of prime_division from the mathn module, which will calculate the prime factorization of an integer. For example:

> 360.prime_division
=> [[2, 3], [3, 2], [5, 1]

> (2 ** 3) * (3 ** 2) * (5 ** 1)
=> 360

This provides an efficient way to represent the inputs while still adhering to the permitted characters. symbolify_short alone isn’t enough, since it can’t provide output for the prime numbers, but it calls on the simpler symbolify to take care of that.

Now, I figured there might be some other representations possible, except that eval was going to be too limiting. So I allowed cheating. And cheats we got, of a few different varieties.

Chris Shea redefines the multiplication operator for numbers like so:

def symbolify(i)
  Fixnum.class_eval <<-EOD
    def *(other)
      #{i}
    end
  EOD
  '?**?*'
end

This isn’t, of course, a general purpose solution, and fails the delayed evaluation test I provided. Still, Chris managed to return a string (the same string every time) and have that evaluate to the number passed in.

Then we have one Ryan Davis, which converts the number to base-5, using the five symbols as the digits:

FROM, TO = '01234', '?*()-'

def symbolify n
   n.to_s(5).tr(FROM, TO)
end

Of course, using eval on the result of symbolify won’t work, so eval had to be redefined as well.

alias :real_eval :eval
def eval s
   return real_eval(s) unless s =~ /^[#{Regexp.escape TO}]+$/
   s.tr(TO, FROM).to_i(5)
end

Another method redefinition comes from Lucas (and done in similar fashion by a few other Rubyists), which was written solely to pass the automated tests provided.

def symbolify (n)
  s = "#{n}"
  def s.delete(*args)
    ""
  end
  s
end

Note that here Lucas adds a method directly to instance s rather than the String class, so it won’t interfere with other code. The sole intent of this delete method is to fool the automated tests.

It didn’t fool me as the output clearly contains characters other than those permitted, but that’s okay… I’ll let it slide. They’re ugly, but I like ‘em.

Finally, I present to you my own disturbingly ugly cheat:

def symbolify(num)
  def eval(str) $num end; $num = num; "()"
end

Not only did I redefine eval (bad) in a poor manner (badder), but I started using globals (baddest). Now that you’ve seen it, I recommend you never look at it again. Just learn from the experience.

Thanks for all the submissions! I applaud your techniques and occasional deviousness…


Wednesday, February 04, 2009