April 22nd, 2007 by aaronharnly
NLTK, the “natural language toolkit” for Python, is a wonderful lightweight framework that provides a wealth of NLP tools. The other day, in reading through its documentation, I came across a little appendix describing the advantages of Python for implementing and (especially) teaching NLP.
The authors show a simple sample program to find and list words ending in “ing” from the standard input:
import sys
for line in sys.stdin.readlines():
    for word in line.split():
        if word.endswith('ing'):
            print word
and contrast this elegant Python implementation with a variety of monstrosities in other languages. I won’t disagree that the Python is nice, but it seemed like a good little exercise to see whether I could produce something almost as good in my languages du jour.
To wit, a Ruby version:
for line in ARGF
  for word in line.split
    if word.match(/ing$/) then
      puts word
    end
  end
end
which is almost identical to the Python version, though showing Ruby’s not-exactly-pretty fascination with the ‘end’ keyword.
And a Scala version using for-comprehensions. Note to Scala creators: It’s really frustrating having the various ways of executing Scala — as a script, as an object, etc. — all disagree slightly on how the outermost wrapper of a procedure should be formatted.
import scala.io._
object IngWords extends Application {
  for (
    val line <- Source.fromInputStream(System.in).getLines;
    val word <- line.split(" ");
    word.endsWith("ing")
  )
    Console.println(word)
}
(Aside: I need a decent syntax highlighting package for WP, it seems.)
Posted in ruby, scala | No Comments »
March 19th, 2007 by aaronharnly
Scala is my new Ruby, i.e. the language I love to tinker in. Rather more practical, too, as the fact that Ruby is dog-slow has gotten in the way of my work more than once recently.
Posted in ruby | No Comments »
March 7th, 2007 by aaronharnly
In the category of tools that I want, but better not make right now, lest it turn into a “paroxysm of generalization”:
I have a DTD describing a bunch of entities, their relationships, and their attributes. I’m going to push data from a set of XML files (adhering to said DTD) into a Ruby-on-Rails savvy database. Wouldn’t it be nice to have a simple tool that, in the most general way possible, would, given that DTD:
Issue a series of ‘script/generate model Foo’ commands for the various entities.
Populate the Rails migration files appropriately, to manage the creation of the database tables for these entities. That would include inserting :foo_id columns for has-many and has-and-belongs-to-many relationships (though differentiating between the two might require some human supervision), and exploiting the wonderful Red Hill Foreign Key Migrations to create appropriate FK constraints.
In addition / as an alternative to using the Red Hill plugin, insert the appropriate has_many / habtm declarations in the model files.
And finally, make a script that can read a set of such XML files and fill the database appropriately.
Well, sounds nice to me, anyway. Put it on the someday-maybe list.
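For the first step, a very rough sketch of what the DTD-reading half might look like. Everything here is an assumption made for illustration: the helper names entities_from_dtd and generate_commands, the sample library/book DTD, and the rule that only elements with child elements or attributes become models. It uses regular expressions rather than a real DTD parser, so it ignores parameter entities and comments and only sees the first attribute in each ATTLIST.

```ruby
# Hypothetical sketch: naive regex matching of DTD declarations.
# Assumes simple element names and one attribute per <!ATTLIST ...>.
ELEMENT_RE = /<!ELEMENT\s+(\w+)\s+([^>]*)>/
ATTLIST_RE = /<!ATTLIST\s+(\w+)\s+(\w+)\s/
CONTENT_KEYWORDS = ["PCDATA", "EMPTY", "ANY"]

# Map each declared element to its child element names and attribute names.
def entities_from_dtd(dtd)
  entities = Hash.new { |hash, name| hash[name] = { children: [], attributes: [] } }
  dtd.scan(ELEMENT_RE) do |name, content|
    entities[name][:children].concat(content.scan(/\w+/) - CONTENT_KEYWORDS)
  end
  dtd.scan(ATTLIST_RE) do |name, attribute|
    entities[name][:attributes] << attribute
  end
  entities
end

# Assumption: leaf elements (text only, no attributes) are columns, not models.
def generate_commands(entities)
  models = entities.select { |_, info| info[:children].any? || info[:attributes].any? }
  models.keys.sort.map { |name| "script/generate model #{name.capitalize}" }
end

# Made-up sample DTD.
dtd = <<~DTD
  <!ELEMENT library (book+)>
  <!ELEMENT book (title, author+)>
  <!ATTLIST book isbn CDATA #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
DTD

puts generate_commands(entities_from_dtd(dtd))
```

The children lists are where the has_many versus habtm guesswork would start, which is exactly the part that probably needs human supervision.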
Posted in ruby | No Comments »