Wikiextractor is a free project written mainly in Java and shell.
Extraction of Wikipedia plain text bi-lingual page pairs
This project contains code and scripts to extract plain text pairs of linked Wikipedia articles in English and a second language (L2).
Get Wikipedia dumps from wikimedia.org:
Extracting interlinked plain text page pairs
Compile the wikiextractor project (tools/build.sh). The script generates ../lib/wikiextractor.jar.
Extract plain text pages (extractAll.pl) for a set of languages. Pages are extracted if they have inter-language links to corresponding pages in English. Each output file is named after the corresponding English article; both the plain text article and the original markup are saved.
Pair up the articles in English and L2 (allpairs.pl), saving names of the pairs in list.txt.
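Because extracted files are named after the English article, the pairing step can be pictured as matching filenames across two language directories. The following is a minimal shell sketch of that idea, not the logic of allpairs.pl itself. The per-language directory layout (out/pages/en, out/pages/es) and the function name list_pairs are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: list article names that exist in both language directories.
# list_pairs and the directory layout are hypothetical, not part of the project.
list_pairs() {
  en_dir=$1   # directory holding the English pages
  l2_dir=$2   # directory holding the L2 pages
  for f in "$en_dir"/*; do
    [ -e "$f" ] || continue            # skip the unexpanded glob of an empty directory
    name=$(basename "$f")
    if [ -e "$l2_dir/$name" ]; then    # same filename on both sides means a pair
      printf '%s\n' "$name"
    fi
  done
}

# Possible usage (paths are assumptions):
#   list_pairs out/pages/en out/pages/es > list.txt
```

If the real output keeps plain text and markup side by side in one directory, the glob would need to be narrowed to the plain text files.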
E.g. (for en-es):
    cd tools
    perl download.pl languages dumps
    perl unzip.pl languages dumps
    bash build.sh
    perl extractAll.pl languages dumps out
    mkdir -p out/pagepairs/logs
    perl allpairs.pl en es out/pages out/pagepairs/en-es/pairs &> out/pagepairs/logs/en-es.pairs.log
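To pair English with more than one L2, the last step can be repeated per language. The sketch below assumes allpairs.pl accepts other language codes in the same argument positions as in the en-es example. The codes de and fr, the OUT variable, and the DRY_RUN switch are illustrative and not part of the project.

```shell
#!/bin/sh
# Sketch: repeat the en-es pairing step for several L2 languages.
# Language codes, OUT, and DRY_RUN are illustrative assumptions;
# the allpairs.pl arguments mirror the en-es example above.
OUT=${OUT:-out}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = run them

pair_language() {
  l2=$1
  pairs="$OUT/pagepairs/en-$l2/pairs"
  log="$OUT/pagepairs/logs/en-$l2.pairs.log"
  if [ "$DRY_RUN" = 1 ]; then
    echo "perl allpairs.pl en $l2 $OUT/pages $pairs > $log 2>&1"
  else
    mkdir -p "$OUT/pagepairs/logs"
    # stdout and stderr both go to the per-language log file
    perl allpairs.pl en "$l2" "$OUT/pages" "$pairs" > "$log" 2>&1
  fi
}

for l2 in es de fr; do
  pair_language "$l2"
done
```

By default the script only prints the commands it would run; setting DRY_RUN=0 from the tools directory would execute them, provided the languages file used for extraction covered the same codes.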