FakeHackerNewspaper is a project mainly written in Ruby, it's free.
My contribution to the Hacker Newspaper project http://hacker-newspaper.gilesb.com/
h2. Overview
So this is my code... it's a bit of a hack :-) but it has a few rspec specs in the specs directory.
Run:
% rspec specs/*
to run them all... they should all pass.
To get actual useful work done, run:
% ruby parse_hn_front_page.rb > results.txt
This shows a dump of what it finds, and what the objects contain.
h2. Future Work
Obviously, a lot could be done. I think it would be useful to set up classes that recognize popular sites (nytimes.com, wsj.com, etc.) that don't easily give up their useful text.