224 Found Senryu
This week’s quiz comes from Martin DeMello:
Scan a text for runs of seventeen syllables formed from complete words
The quiz this week has solutions from Martin DeMello and wkm.
wkm’s solution uses the Lingua::EN package to count syllables. Installing Lingua::EN was something of a challenge: I was only able to install it without the dictionary and run it using the guessing library. Hopefully others will have better luck.
When the program runs, the first step is to read in the documents, extracting the words and their syllable counts. Whenever a word’s syllables are looked up, the result is cached to save time on common words. Next, the list of words with their syllable counts is iterated over at every possible word offset to find runs of 17 syllables. When such a run is found, the words that comprise it are printed out.
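The cache-and-scan approach described above can be sketched roughly as follows. This is a minimal Ruby illustration, not wkm’s actual code: `guess_syllables`, `SYLLABLE_CACHE`, and `runs_of` are hypothetical names, and the vowel-group counter is only a crude stand-in for a real syllable library.

```ruby
# Hypothetical stand-in for a real syllable counter: counts groups of vowels.
# It is deliberately crude and only serves to make the sketch runnable.
def guess_syllables(word)
  groups = word.downcase.scan(/[aeiouy]+/).size
  groups.zero? ? 1 : groups
end

# Cache each lookup so that common words are only counted once.
SYLLABLE_CACHE = Hash.new { |cache, word| cache[word] = guess_syllables(word) }

# Try every starting offset and extend the run one word at a time,
# stopping as soon as the running total reaches or passes the target.
def runs_of(words, target = 17)
  runs = []
  words.each_index do |start|
    total = 0
    words[start..-1].each_with_index do |word, offset|
      total += SYLLABLE_CACHE[word.downcase]
      runs << words[start, offset + 1] if total == target
      break if total >= target
    end
  end
  runs
end
```

Calling `runs_of(text.split)` on a hypothetical string `text` would return every run of complete words totalling 17 syllables under this counter, with no regard for where the line breaks would fall.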
There is, however, one issue with wkm’s solution: it does not check whether the 17 syllables split on word boundaries into 5-7-5 syllable chunks. This causes the program to greatly overestimate how many senryu there are in the text.
Martin’s solution uses the CMU Pronouncing Dictionary directly. The entire dictionary is loaded so that words can be looked up easily. The text is then iterated through, counting up the number of syllables. When a run of words totals 17 syllables, it is checked to see that the boundaries after the fifth and twelfth syllables also fall on word boundaries.
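The boundary check can be illustrated with a short Ruby sketch. It assumes the per-word syllable counts for a candidate run are already known; `five_seven_five?` is a hypothetical helper name, not taken from Martin’s code.

```ruby
# counts: the syllable count of each word in a candidate run, in order.
# A run fits 5-7-5 when some word ends exactly on syllable 5, another
# ends exactly on syllable 12, and the whole run ends on syllable 17.
def five_seven_five?(counts)
  running = 0
  totals = counts.map { |c| running += c }  # cumulative syllable totals
  totals.last == 17 && totals.include?(5) && totals.include?(12)
end
```

For example, counts of 2, 3, 4, 3, 5 give cumulative totals 2, 5, 9, 12, 17 and pass the check, while 3, 3, 4, 2, 5 fail because no word ends on the fifth syllable.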
If the number of syllables for a section is greater than 17, the first word in the section is removed and its syllables are subtracted from the count. This allows all possible runs of 17 syllables to be checked without having to iterate through the text multiple times. When Martin’s solution is run against Wodehouse’s “Right Ho, Jeeves” it produces many good excerpts, but when run solely against those excerpts it does not print them out. I haven’t been able to find the cause of this issue, but as they say, let’s leave that as an exercise for the reader.
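The single-pass technique of dropping words from the front of the section can be sketched in Ruby as below. This is an illustration under stated assumptions rather than Martin’s implementation: `senryu_windows` is a hypothetical name, the syllable counts are passed in as a parallel array, and every count is assumed to be at least 1.

```ruby
# words:  the text split into words
# counts: syllable count for each word (same length as words, all >= 1)
def senryu_windows(words, counts)
  found = []
  window_start = 0
  total = 0
  words.each_index do |i|
    total += counts[i]
    # Drop words from the front of the window while it is too long.
    while total > 17
      total -= counts[window_start]
      window_start += 1
    end
    next unless total == 17
    # Keep the window only if words end on the 5th and 12th syllables.
    running = 0
    totals = counts[window_start..i].map { |c| running += c }
    found << words[window_start..i] if totals.include?(5) && totals.include?(12)
  end
  found
end
```

Because each word enters and leaves the window at most once, the text is traversed only once, which is the saving the paragraph above describes.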
Thank you Martin and wkm for your solutions to this week’s quiz!