
nlp-revindex

nlp-revindex is a free project written mainly in Python.

Simple tf-idf based reverse index experiment.

Simple scripts to generate a reverse index from a collection of text files, based on tf-idf weights.
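
As a rough illustration of the idea (not the project's actual scripts), a tf-idf weighted reverse index, more commonly called an inverted index, could be sketched as follows. The names `tokenize` and `build_index`, the regex tokenizer, and the choice of relative term frequency with an unsmoothed logarithmic idf are all assumptions made for this example; the project may tokenize and weight terms differently.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase runs of ASCII letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to a list of (document name, tf-idf weight) pairs.

    `docs` is a dict of document name -> raw text. Postings are sorted
    by descending weight.
    """
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    index = defaultdict(list)
    for name, counts in term_counts.items():
        total = sum(counts.values())
        for term, count in counts.items():
            tf = count / total                        # relative term frequency
            idf = math.log(n_docs / doc_freq[term])   # assumed idf variant, no smoothing
            index[term].append((name, tf * idf))

    for postings in index.values():
        postings.sort(key=lambda posting: posting[1], reverse=True)
    return dict(index)


docs = {
    "a.txt": "the cat sat",
    "b.txt": "the dog sat down",
    "c.txt": "the cat ran",
}
index = build_index(docs)
print(index["dog"])
```

With this weighting, a term that occurs in every document (here `the`) gets weight zero, so it contributes nothing when ranking documents for a query.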

We also use a shingling technique to calculate text containment between the files of the collection.
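
A minimal sketch of shingle-based containment, assuming word-level shingles of width 3 and the usual definition: the containment of A in B is |S(A) ∩ S(B)| / |S(A)|, where S(X) is the set of shingles of X. The function names and the whitespace tokenization are hypothetical and chosen only for this example; the project's scripts may use a different shingle width or tokenization.

```python
def shingles(text, w=3):
    # Set of all runs of w consecutive words (word-level shingles).
    words = text.lower().split()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def containment(a, b, w=3):
    """Fraction of the shingles of `a` that also occur in `b`."""
    shingles_a, shingles_b = shingles(a, w), shingles(b, w)
    if not shingles_a:
        # `a` is shorter than w words, so it has no shingles to compare.
        return 0.0
    return len(shingles_a & shingles_b) / len(shingles_a)


short = "the quick brown fox"
long = "the quick brown fox jumps over the lazy dog"
print(containment(short, long))
print(containment(long, short))
```

Containment is asymmetric: the short text is fully contained in the long one, while only a fraction of the long text's shingles appear in the short one.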

The tree-tagger English parameter file is available from here.