162 The Turing Machine

Description

Quiz description by James Edward Gray II

The Turing Machine is a simple computing architecture dating all the way back to the 1930s. While extremely primitive compared to any modern machine, there has been a lot of research showing that a Turing Machine is capable of just about anything the fancier machines can do (although much less efficiently, of course).

This week’s task is to build a Turing Machine, so we can play around with the architecture.

A Turing Machine has but three simple parts:

* A single state register.
* An infinite tape of memory cells that can hold one character each, with a
  read/write head that points to one of these cells at any given time.  The
  tape is filled with an infinite run of blank characters in either
  direction.
* A finite set of program instructions.  The program is just a big table of
  state transitions.  The Turing Machine will look up an instruction based
  the current value of the state register and the current character under
  head of the tape.  That instruction will provide a state for the 
  register, a character to place in the current memory cell, and an order to
  move the head to the left or the right.

To keep our Turning Machine simple, let’s say that our state register can contain words matching the regular expression /\w+/ and the tape only contains characters that match the expression /\w/. We will call our blank tape cell character the underscore.

Program lines will be of the form:

CurrentState _ NewState C R

The above translates to: if the current state is CurrentState and the character under the tape head is our blank character, set the state to NewState, replace the blank character with a C, and move the tape head to the right one position. All five elements will be present in each line and separated by one or more whitespace characters. Allow for trailing comments (using #) on a line, comment only lines, and blank lines in the program by ignoring all three.

The initial state of your Turing machine should be set to the CurrentState mentioned on the first line of the program. Optionally, the initial contents of the tape can be provided when the program is load, but it will default to an all blank tape. The program runs until it fails to find an instruction for the CurrentState and the character currently under the tape head, at which point it prints the current contents of the tape head from the first non-blank character to the last non-blank character and exits.

Here’s a sample run of a simple program through my Turing Machine so you can see how this plays out:

$ cat palindrome.tm
# Report whether a string of 0 and 1 (ie. a binary
# number) is a palindrome.
look_first   0  go_end_0     _  R
look_first   1  go_end_1     _  R
look_first   _  write_es     Y  R
go_end_0     0  go_end_0     0  R
go_end_0     1  go_end_0     1  R
go_end_0     _  check_end_0  _  L
go_end_1     0  go_end_1     0  R
go_end_1     1  go_end_1     1  R
go_end_1     _  check_end_1  _  L
check_end_0  0  ok_rewind    _  L
check_end_0  1  fail_rewind  _  L
check_end_0  _  ok_rewind    _  L
check_end_1  0  fail_rewind  _  L
check_end_1  1  ok_rewind    _  L
check_end_1  _  ok_rewind    _  L
ok_rewind    0  ok_rewind    0  L
ok_rewind    1  ok_rewind    1  L
ok_rewind    _  look_first   _  R
fail_rewind  0  fail_rewind  _  L
fail_rewind  1  fail_rewind  _  L
fail_rewind  _  write_o      N  R
write_es     _  write_s      e  R
write_o      _  done         o  R
write_s      _  done         s  R

$ ruby tm.rb palindrome.tm 011010110
Yes

$ ruby tm.rb palindrome.tm 01101
No

Summary

While waiting for 48 hours to pass, several examples of Turing Machine code were sent in, including one example of over 1,500 lines, generated by code. Most others were simpler, handcrafted rulesets and tapes, dealing primarily with the limited input set of binary numbers. This kept code size smaller and easier to understand.

A thought occurred to me that we might make an Extended Turing Machine, so we would no longer have to eliminate very repetitive state information. To take a small portion of one example:

...
mark_start v mark_start v L
mark_start w mark_start w L
mark_start x mark_start x L
mark_start y mark_start y L
mark_start z mark_start z L
mark_start _ mark_end   S R
mark_end   a mark_end   a R
mark_end   b mark_end   b R
mark_end   c mark_end   c R
mark_end   d mark_end   d R
...

Using a regular expression for the second argument with an implicit grouping, we could write our Turing Machines more easily.

...
mark_start [a-z] mark_start $1 L
mark_start _     mark_end    S R
mark_end   [a-z] mark_end   $1 R
...

I don’t doubt that there are other efficiencies we could impose on the simple Turing Machine language, to create a better language, a better Turing Machine. Yet, I think the point here is simply to explore the Turing Machine for what it is, rather than turn it into a higher level language. If we want that, we might as well stick with Ruby. (Still, a generator to help us write lengthy Turing Machine programs, such as was used for the 1500 line example, would be interesting.)

I’m going to look at Alpha Chen’s solution this week; it’s short and simple, yet managed to surprise me a little. We start with the last line: the “main” of the solution.

Turing.new(ARGV.shift, ARGV.shift).run if __FILE__ == $0

Chen creates a new Turing Machine, passing in the first two command line arguments, but only if __FILE__ (a constant containing the filename) is equal to $0. This is a typical check made to see if we are loading this module in an attempt to execute it, rather than loading it from another module via the require mechanism.

Now to initializing the Turing class.

def initialize(file, tape=nil)
  @tape = Hash.new('_')
  @head = 0

  # initialize tape
  tape.split(//).each_with_index {|c,i| @tape[i] = c } if tape

  # read program instructions
  @program = Hash.new
  File.open(file).each_line do |line|
    line.gsub!(/#.*/, '') # remove comments
    next if line =~ /^\s*$/ # skip blank lines

    line = line.split(/\s+/)
    @state = line[0] if @state.nil?
    @program[line[0..1]] = line[2..4]
  end
end

The initialization of @tape as a Hash I missed the first time around. And it may seem strange to use a Hash to represent a tape, since Array would seem the proper fit. Still, the use of Hash here is clever to simplify other code later. It might not be a general purpose trick; still, we’re implementing a Turing Machine here, which is hardly a critical application.

Initializing @tape is slightly more complex because of this. The provided tape, if not nil, is broken into an array of characters by use of split(//), but this must then be iterated over and each character inserted into the hash.

Reading the program rules is pretty simple. As each line is read in, comments are first removed, and then blank lines are skipped via:

next if line =~ /^\s*$/

This regular expression matches the beginning of the line with ^, then end of the line with $, and any amount of whitespace with \s*. If a blank line is found, next starts the next iteration of the loop.

After the line is split into words, the program is stored in the @program hash with this:

@program[line[0..1]] = line[2..4]

Chen’s program hash is keyed by two elements: the current machine state and the symbol under the machine head. These two items form a more complete state of the machine and keying the hash by this combined state makes for simple code later. To the hash, for that key, is inserted the rest of the rule.

Running the machine in Chen’s code is done as follows:

def run
  while next_state = @program[[@state, @tape[@head]]]
    @state, @tape[@head], dir = next_state
    @head += (dir == 'R') ? 1 : -1
  end

  puts @tape.sort.map {|_,v| v}.join.gsub(/^_*|_*$/, '')
end

As mentioned above, looking up the next state requires keying into the program with two pieces of information: the current state and the symbol under the tape’s head. If no entry can be found in the program, next_state will be nil and the loop will end.

Otherwise, next_state is broken down into its three parts and assigned to the appropriate portions of the Turing Machine. The direction is used to update @head, either forward or backward one cell. This is where the use of @tape as a Hash, rather than an Array, simplifies the code. Chen no longer needs to check Array boundaries, to see if the tape needs to be extended to either the left or the right. Introducing a new “index” to @tape simply allocates another Hash entry.

After the loop exits, the contents of the tape need to be output. Because @tape is a hash, there is a temptation to call Hash#values to get the content, but such is to be avoided, as I don’t believe Ruby guarantees the order of the values. To get the values out in order along the tape, Chen uses Hash#sort with no block, which will sort key-value pairs Since the keys are first in each of those pairs, and the keys are the indices of the tape, the values will be sorted correctly.

map is used to remove the keys and keep only the values, and join binds them back together into a single string. The final bit is a substitution attempting to strip off any external blanks, so only the “useful” content of the tape is displayed. Chen’s regular expression (slightly corrected in the code shown above), removes all blanks immediately after the start of the line or immediately preceding the end of the line.


Wednesday, February 04, 2009