Very draft version :)
Ruby version: 1.9
Gems required: sqlite3, nokogiri, open-uri, htmlentities
Basic Rake steps workflow (run rake -T for the whole list of available tasks)
Create cache folders and SQLite DB
rake setup:init
OPTIONAL step: by default whole Switzerland timetables are fetched
if you want it only for a smaller area, run the following command
'bounds' parameter contains 4 values, separated by comma:
corner SW longitude,corner SW latitude,corner NE longitude,corner NE latitude
rake station:set_bounds bounds=8.53,47.41,8.61,47.46
Discovers the stations in the studied area
rake station:fetch
OPTIONAL step: exports the stations in tmp/station.csv for visualizing data (i.e. in QGIS)
rake station:export
OPTIONAL step(although recommended): remove the stations outside of Switzerland's borders
rake station:geo_clean
Fetches from SBB, the files containing departures for each station
rake departure:fetch
Removes the files that contains errors. If any files are mentioned, you have to run again the previous step
rake departure:files_clean
Inserts the timetables in DB
rake timetable:parse
Remove timetable duplicates (based on departure, vehicle_id, destination)
rake timetable:remove_duplicates
Remove stops of the stations that are now known (outside of Switzerland)
rake timetable:remove_notknown_stations
Determine the station type from the timetables
rake station:parse_type
OPTIONAL step: exports the stations in tmp/station.csv for visualizing data (i.e. in QGIS)
rake station:export
Build the vehicle table based on departures
rake vehicle:insert
Update the vehicle table with info for each vehicle and insert arrivals at destination in the timetables
rake vehicle:build
Remove again stops of the stations that are now known (outside of Switzerland), which may be inserted by the previous step
rake timetable:remove_notknown_stations
Remove vehicles that have only one station in Switzerland and the rest outside (like TGVs leaving to France from the border)
rake vehicle:remove_onestopper
Run this step again because me might have stations without vehicle stops, so we don't want to display this station
rake station:parse_type
Remove the stops of the vehicles that have duplicate consecutive stations but with different departure times. A bit similar with 'timetable:remove_duplicates' task
rake vehicle:check_duplicate_stations