Pdf2xml is a free project written mainly in Java.
It converts a PDF into an XML representation of the PDF's layout.