Officiele-bekendmakingen-scraper is a free project written mainly in Python.
Scrapes the search result pages of https://zoek.officielebekendmakingen.nl/ using Scrapy, and downloads the XML documents it finds along the way.
Author: Justin van Wees ([email protected])
Date: 2010-06-21
Officiële bekendmakingen scraper scrapes the search result pages of https://zoek.officielebekendmakingen.nl/ and downloads the XML documents it finds along the way.
After you've made sure that all the required Python packages are installed, edit "officielebekendmakingen/settings.py". The settings should be self-explanatory.
To start the crawl, run "python scrapy-ctl.py crawl zoek.officielebekendmakingen.nl".
You can monitor the Scrapy process by visiting http://[HOSTNAME]:6080 or by opening a Telnet session to port 6023 (the "stats" object contains information about the current run).
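
Before opening a browser or Telnet session, it can help to check that both monitoring ports are reachable. The following is a minimal sketch, not part of the project: the helper names port_is_open and console_status are hypothetical, and it assumes the crawler runs on "localhost" with the ports mentioned above (6080 and 6023). It only tests TCP connectivity and does not talk to Scrapy itself.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def console_status(host, ports=None):
    """Map each console name to whether its port accepts connections."""
    if ports is None:
        # Port numbers taken from the text above; the labels are illustrative.
        ports = {"web": 6080, "telnet": 6023}
    return {name: port_is_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # "localhost" is an assumption; substitute the host running the crawler.
    for name, reachable in console_status("localhost").items():
        state = "reachable" if reachable else "not reachable"
        print(f"{name} console: {state}")
```

If a port is reported as not reachable while the crawl is running, the consoles may be disabled or bound to a different port in the project settings.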