aaron.harnly.net

NLTK’s “ing words”: variations

NLTK, the “natural language toolkit” for Python, is a wonderful lightweight framework that provides a wealth of NLP tools. The other day, in reading through its documentation, I came across a little appendix describing the advantages of Python for implementing and (especially) teaching NLP.

The authors show a simple sample program to find and list words ending in “ing” from the standard input:

import sys
for line in sys.stdin.readlines():
    for word in line.split():
        if word.endswith('ing'):
            print word

and contrast this elegant Python implementation with a variety of monstrosities in other languages. I won’t disagree that the Python is nice, but it seemed like a good little exercise to see whether I could produce something almost as good in my languages du jour.
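One caveat for anyone trying this today: the sample uses the Python 2 print statement, so it will not run unchanged on Python 3, where `print` is a function. Here is a minimal sketch of an equivalent, with the filtering pulled into a generator so it can be tried without standard input. The name `ing_words` and the sample sentence are my own inventions for illustration, not anything from the NLTK documentation.

```python
def ing_words(lines):
    """Yield each whitespace-separated word that ends in 'ing'."""
    for line in lines:
        for word in line.split():
            if word.endswith('ing'):
                yield word


# A made-up sample line standing in for standard input.
sample = ["I was walking and singing\n"]
for word in ing_words(sample):
    print(word)
```

To get the original behaviour back, import `sys` and pass `sys.stdin` in place of `sample`; iterating over the file object reads it line by line, much as `readlines()` does.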

To wit, a Ruby version:

for line in ARGF
  for word in line.split
    if word.match(/ing$/) then puts word
    end
  end
end

which is almost identical to the Python version, though showing Ruby’s not-exactly-pretty fascination with the ‘end’ keyword.

And a Scala version using for-comprehensions. Note to Scala creators: It’s really frustrating having the various ways of executing Scala — as a script, as an object, etc. — all disagree slightly on how the outermost wrapper of a procedure should be formatted.

import scala.io._
object IngWords extends Application {
  for (
    val line <- Source.fromInputStream(System.in).getLines;
    val word <- line.split(" ");
    word.endsWith("ing")
  ) Console.println(word)
}

(Aside: I need a decent syntax highlighting package for WP, it seems.)
