Corpus is a free project written mainly in Clojure.
A tool used to train detokenization libraries
(use 'corpus.core)  ; load the library
(def w (corpus-file "data/alice-in-wonderland.txt"))  ; read the sample text
(count w)  ; size of the result
Copyright (C) 2010 Lee Hinman
Distributed under the Eclipse Public License, the same as Clojure.