Home > clsem

clsem

Clsem is a project mainly written in Common Lisp, it's free.

A tool that extracts links and definitions from the CL HyperSpec as an RDF graph

Interlinking information for the HyperSpec

The CLHS comes with ~110000 links. We know where each one of them goes, but where do they come from?

Imagine you're looking at a glossary entry. It's normative, but to which areas of the spec is it relevant? Which functions depend on its definition? I often asked myself that, and it bothered me enough that I wrote this program to extract the hyperlinking structure from the CLHS into an RDF graph, which can be imported into any database (of course, I recommend agraph) that can answer these types of questions.

= Creating the triple information

  1. Install the dependencies:
    • asdf
    • cxml-stp
    • closure-html
    • cl-ppcre
    • drakma
  2. Load the clsem.asd, (asdf:oos 'asdf:load-op :clsem)
  3. (clsem:do-it #p"/path/to/output/file.ttl")

This will query the lispworks HTTP servers and will take a long, long time. If you have a copy of the HyperSpec downloaded, you can use:

      (clsem:do-it #p"/path/to/output/file.ttl"
                   :prefix "file:///Users/asf/Downloads/HyperSpec-7-0/HyperSpec")

And it will finish in ~9 seconds.

This file is in the turtle RDF triple language format. If you import the data into a graph database, things may be easier if you convert it to the ntriples format first. I recommend using the most excellent rapper for this task.