OpSysII is a project written mainly in Java.
Repo for the Operating Systems II project
This is a MapReduce program for Hadoop that calculates the TF-IDF value of every word in a set of input text files.
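For readers new to TF-IDF, here is a minimal single-machine sketch of the metric in plain Java. It is not the code from this repo: the exact weighting the MapReduce jobs use is not described here, so the sketch assumes a common variant (tf = occurrences / document length, idf = ln(N / df)). The class and method names (TfIdfSketch, tf, idf, tfIdf) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class TfIdfSketch {
    // Term frequency: occurrences of the word divided by the document length.
    static double tf(String word, List<String> doc) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        int count = 0;
        for (String w : doc) {
            if (w.equals(word)) {
                count++;
            }
        }
        return (double) count / doc.size();
    }

    // Inverse document frequency: ln(N / df), where df is the number of
    // documents that contain the word. Returns 0 for unseen words (assumption).
    static double idf(String word, List<List<String>> docs) {
        int df = 0;
        for (List<String> doc : docs) {
            if (doc.contains(word)) {
                df++;
            }
        }
        if (df == 0) {
            return 0.0;
        }
        return Math.log((double) docs.size() / df);
    }

    // TF-IDF of a word in one document, relative to the whole collection.
    static double tfIdf(String word, List<String> doc, List<List<String>> docs) {
        return tf(word, doc) * idf(word, docs);
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("the", "cat", "sat"),
            Arrays.asList("the", "dog", "barked"));
        for (String word : Arrays.asList("the", "cat")) {
            System.out.println(word + " -> " + tfIdf(word, docs.get(0), docs));
        }
    }
}
```

A word that appears in every document (here "the") gets an idf of 0 under this variant, so its TF-IDF is 0 no matter how often it occurs.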
This was developed as a part of a school project.
After running make jar, to create the inverted index, run:
hadoop jar TfIdf.jar gr.upatras.ceid.romo.Index
To create the TF-IDF metrics, run:
hadoop jar TfIdf.jar gr.upatras.ceid.romo.Tf
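As a rough picture of what an inverted index holds, here is a small in-memory sketch in plain Java. It is only an illustration under assumptions: the output format of the Index job is not described here, and the class name InvertedIndexSketch, the build method, and the file names are hypothetical. The sketch maps each word to the documents that contain it, with a per-document count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class InvertedIndexSketch {
    // Builds word -> (document name -> number of occurrences in that document).
    static Map<String, Map<String, Integer>> build(Map<String, List<String>> docs) {
        Map<String, Map<String, Integer>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> doc : docs.entrySet()) {
            for (String word : doc.getValue()) {
                index.computeIfAbsent(word, w -> new TreeMap<>())
                     .merge(doc.getKey(), 1, Integer::sum);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical input: two tiny documents, already split into words.
        Map<String, List<String>> docs = new TreeMap<>();
        docs.put("a.txt", Arrays.asList("the", "cat", "the"));
        docs.put("b.txt", Arrays.asList("the", "dog"));
        System.out.println(build(docs));
    }
}
```

In a MapReduce setting the same idea is usually split in two: mappers emit (word, document) pairs and reducers gather the pairs for each word.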
I hardcoded the number of reducers to 5 to suit my system. You might want to change it to suit yours.
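To see what the reducer count affects: Hadoop's default HashPartitioner sends each key to reducer (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The plain-Java sketch below mimics that rule, using String.hashCode() as a stand-in for the hash of Hadoop's key types. The class name PartitionSketch is hypothetical, and whether this project uses the default partitioner is an assumption. In a Hadoop driver the count is normally set with Job.setNumReduceTasks(int), though where this project sets it is not shown here.

```java
import java.util.Arrays;

class PartitionSketch {
    // Same arithmetic as Hadoop's default HashPartitioner: clear the sign bit,
    // then take the remainder modulo the number of reducers.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 5; // the value hardcoded in this project
        for (String word : Arrays.asList("hadoop", "index", "tf", "idf")) {
            System.out.println(word + " -> reducer " + partition(word, numReducers));
        }
    }
}
```

Changing the count changes which reducer each word lands on and how many output files the job writes (one per reducer).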