159 Word Search Generator

Description

A word search puzzle presents a grid of seemingly random letters and a list of words. The goal of the puzzle is to find and mark all the words from the list hidden within the grid, searching along straight lines (horizontal, vertical and diagonal).

Your task for this week’s quiz is to write a Ruby program that generates word search puzzles, given a list of words and the desired width and height for the puzzle. A call to the script will look like this:

genpuzzle.rb words.txt 6x8

The first argument is the list of words, one word per line in a text file. The second argument is the dimensions of the puzzle, width by height.

Your program should output two files. The first file, search.txt, should contain the puzzle itself. All characters should be separated by spaces to allow the puzzle solver room to mark found words. As an example, here is a 6x8 puzzle containing the words from “zero” to “nine”.

e a e g g w
e e n i n t
r n o h h f
h q e g i r
t z i v u q
o e e o e l
w r f g e s
t o u s i x

The second file, solution.txt, should contain the same puzzle with the answers marked, like this:

e . e . . .
| |
e e-n-i-n t
| | /
r n o . h f
| \ / /
h . e g i r
| X / /
t z i v u .
|/ / X
o e e o e .
| | / \
w r f . . s
| |
t o . s-i-x

Summary

Adam Shelly was our sole submission this week, so we are going to take a look at his solution to the quiz.

Adam’s first bit of code is a simple class, CrossWord:

class CrossWord
  attr_accessor :word, :place, :dir
  def initialize w
    @word=w
    @dir=0
    @place=nil
  end
  def unplaced?
    @place==nil
  end
end

This class maintains information about each word to be placed into the puzzle, including the word itself but also the direction and placement. The rest of the code works with this class rather than with raw strings.

Let’s jump for a moment to the main code.

gridsize = ARGV[1].split('x').map{|v|v.to_i}
g = CharGrid.new *gridsize
words=File.open(ARGV[0],"r").read.split.sort_by{|s|s.length}.reverse
puts "unsolvable!" or exit if (words[0].size>gridsize.max)

g.fill words
File.open("search.txt","w"){|f|f.puts g.to_s}
File.open("solution.txt","w"){|f|f.puts g.solution}

Most of this should be fairly obvious work. The dimensions (e.g. “24x20”) are retrieved from the second command-line argument, split over the ‘x’ and converted to integers. These dimensions are then used to construct a new CharGrid object (which will be examined shortly).

The search words are read from the provided file and sorted from longest to shortest. Presumably, the intent here is to place the longest words into the puzzle first; the shortest (and more easily placed) words are placed last.

A quick sanity check is done by comparing the length of the longest word against the longest dimension of the grid. If the word is longer, a puzzle would be impossible to generate. Adam uses a neat, if perhaps confusing, technique to alert the user and exit out. To understand how it works, I had to parenthesize according to operator precedence:

((puts "unsolvable!") or exit) if (words[0].size > gridsize.max)

Now, realizing that the puts statement returns nil, this made sense. So, while this construct both outputs the warning message and exits the program, this might have been clearer:

if (words[0].size > gridsize.max)
  puts "unsolvable!"
	  exit
end

Getting back to Adam’s main code, the words – now sorted and checked – are handed off to the grid’s fill method, which does the bulk of the work. Once done, the final step is to output the two text files as requested, the puzzle itself using the to_s method, and the solution via the solution method.

Most of Adam’s code lives in the CharGrid class. The main algorithm is found in the fill method:

def fill words
  iterations = 0
  @words = words.map{|w| CrossWord.new(w)}
  words_todo = @words.select{|w|w.unplaced?}

The master list of all words is kept in @words, which, as mentioned earlier, is converted and kept as CrossWord instances rather than raw strings). Then Adam loops until his words_todo array is empty. For the first iteration of the loop, this array contains all of the words sorted from longest to shortest.

  until words_todo.empty?
    words_todo.each{|cw| place cw }

The first step is an attempt to place all of the words that have not yet been placed into the puzzle. We’ll come back to the place method in a bit.

    words_todo = @words.select{|w|w.unplaced?}.sort_by{rand}

Adam reevaluates what words still remain to be placed. Unlike before, where words were sorted from longest to shortest (in an attempt to place the more complex words first), remaining words are now randomized. I imagine this is an attempt to add a bit of chaos where order (i.e. word length) failed, but I question how much of a benefit this is, considering words are placed within the word grid mostly at random.

   if (iterations+=1) %(@w+@h)==0
     #if we are getting stuck, try removing some words
     puts "#{togo = words_todo.size} to go..."
     words_done = (@words-words_todo).sort_by{rand}
     (togo*2).times{|i| words_todo<< remove(words_done[i]) if words_done[i]}
    end
  end
end

As per Adam’s comment, if the loop continues for a long while without placing all the words in our list, some words are put back into the todo list. togo indicates how many words are currently unplaced, and twice as many are removed from the puzzle and put back into the words_todo list. The hope here is that, when the next loop iteration begins, the code will attempt to place the words remaining, in a location and orientation that differs from prior iterations of the loop.

Let’s look at the place method next:

def place cw
  @words.sort_by{rand}.each{|otherword|
    startpt = find_overlap(cw, otherword)
    return if test_place(cw, startpt)
  }
end

In an effort do provide an interesting puzzle, where words overlap frequently, the word to be placed is checked against all of the other words in random order. If the word is successfully placed, test_place returns a non-false value and the function exits.

def findoverlap cw, testword return nil if testword.unplaced? if (offset = testword.word.index cw.word0) startpt = testword.place offset.times{startpt=nextp(startpt,testword.dir)} else startpt = nil end startpt end

find_overlap determines if the first character of the word to be placed can be found in another, already placed word. If so, the location of that character in the grid is determined by walking along testword using nextp, which calculates the next grid index in a given direction. That index is returned, or nil if the character is not found.

This is certainly a good first step in trying to overlap words, though I think more work needs to be done to create an interesting word search puzzle… but I’ll come back to that at the end of this summary.

def test_place cw, suggestion=nil
  dir=randDir
  start= suggestion || randStart(cw.word[0])

Calling test_place attempts to place the supplied word. A random direction and starting point are chosen; the latter is random if no suggestion was made by find_overlap (i.e. there were no characters in common between the two compared words).

  8.times do
    pt = start
    good = true

The loop will attempt all eight directions to place the word from the starting location (but will exit the function as soon as the first good direction is found).

    cw.word.each_byte{|chr|
      good = (@g[pt]==?. ||  @g[pt]==chr) && (pt=nextp(pt,dir))
      break unless good
    }

Looking at each character of the word to place, and the corresponding grid spaces, we determine if each character either fills in an empty grid space (currently occupied by periods ?.) or matches existing grid characters. As long as all characters of the word fulfill this criteria, the variable good remains true.

    return add(cw, start, dir) if good

After examining the whole word, a true value for good indicates that the word fits into the grid and can be placed. The add method accomplishes this, and we exit test_place early, now that we have actually placed the word.

    dir=(dir+1)%8
  end
  nil
end

Finishing the loop, the next direction is tried if the previous direction did not allow the word to fit. If no direction works, nil is returned, and the place method moves onto the next word.

That covers the bulk of the algorithm; the rest of the class is bookkeeping, filling in characters or removing them, checking directions, and more. I’m going to move onto looking at the output, however please take a look at Adam’s code if you want to see in more detail how he manipulates the data.

Here now is the solution.txt output from one run of Adam’s generator, using the word list provided earlier:

e-x-p-r-e-s-s-i-o-n v-i-r-t-u-a-l . . .
                                       
. e-c-n-a-t-s-n-i p . . e-t-a-l-p-m-e-t
                 \ \                   
i c-o-n-d-i-t-i-o-n r e-v-a-l-u-a-t-e s
|                  \ \            |   |
n e c-o-n-s-t-a-n-t h i e . . . . c n c
| |                  \ \ \        | | |
t c p-a-r-a-m-e-t-e-r e m r . . . e o o
| |                    \ \ \      | | |
e n y-r-o-t-c-a-f . . . r i u . . j i p
| |                      \ \ \    | | |
r e . d-a-o-l-r-e-v-o . . i t s . b t e
| |                        \ \ \  | |  
f r . e-l-b-a-t-u-m-m-i . . t i o o p .
| |                          \ \ \  |  
a e . . t-c-a-r-t-s-b-a . . . a v l e .
| |                            \ \ \|  
c f p-r-o-t-o-t-y-p-e s-t-a-c-k n e c l
| |                              \  | |
e e . . . . n-o-i-t-a-u-n-i-t-n-o-c x i
  |                                \| |
o r . a m . m-s-i-h-p-r-o-m-y-l-o-p e t
|     | |                             |
p . k r h l k-c-o-l-b . m-e-s-s-a-g-e e
|   | | | |                         | |
e g e g t a m-e-t-h-o-d s-s-a-l-c . t r
| | | | | |                         | |
r l y u i c r-e-f-l-e-c-t-i-o-n t . a a
| | | | | |                     |   | |
a o w m r o . n-o-i-t-c-n-u-f . y . g l
| | | | | |                     |   |  
t b o e o l g-e-n-e-r-a-t-o-r . p . e .
| | | | |                       |   |  
o a r n g t-n-e-m-e-t-a-t-s . . e . l .
| | | | |                           |  
r l d t l . . . r-e-c-u-r-s-i-o-n . e .
        |                           |  
i-t-e-r-a-t-o-r i-d-e-n-t-i-f-i-e-r d .

A congratulations to Adam whose solution does indeed generate word search puzzles. His solution to the quiz worked better than my own (incomplete) code, and managed to get it into a tighter space and complete more often, due in large part to his algorithm that will remove and replace words in the hopes that they will fit better.

However, one thing that is very similar between Adam’s solution and my own is the output. As you’ll see, there are some very obvious groupings of words parallel to one another. For the list of 44 words, this particular word search has less than ten intersections, and the parallel groupings make finding words easier than might be expected. Handcrafted puzzles tend to be more difficult, visually interesting (i.e. no parallel groupings) and many more intersections of words.

My initial thought was that Adam’s overlap code would provide more interesting puzzles than my very rough generator, as mine had no metrics or heuristics beyond “Does it fit?” But there was very little difference between Adam’s output and my own. I have two suspicions.

First, Adam checks only the first character of one word against all of the characters of another. I think that might have been two limiting, and to my recollection, word search puzzles tend to have intersections in the interior of words, and not at the ends.

Second, I suspect that a simple two-word intersection test like Adam’s won’t be sufficient for interesting puzzles. I think three-word (triangle), and perhaps four-word (rectangle), overlap tests might be required. These are common patterns in word search puzzles, and the triangle pattern would better fend off the uninteresting parallel groupings.


Wednesday, February 04, 2009