Thomas Pain :: blog

RailMiles

I built an app to track how far I've travelled on trains because of course I did

2023-06-22 :: 919 words


If you know me, you know that I'm kind of a nerd about trains. It's not like I have an entire page dedicated to what I think of certain types, or anything.

In a similar vein, I decided that I wanted to keep track of how far I'd travelled by train, and that I was going to build a few pages on my website to do that for me.

Before I get into how all that was built and how that works, this is what the end product looks like:

Screenshot of the UI

The UI is fairly simple - there's a table of all the journeys logged in the last month (not pictured), a map of those journeys and some overall stats at the bottom of the page.

To add a journey, at the very least all you have to do is tell it the date you travelled on and the stations you started, changed and ended at:

Screenshot of the new journey UI

How it's built

When a new journey is entered into the system, there are two key things that need to happen to be able to show the user a map and update the mileage totals.

Getting data

Currently, I use RealTimeTrains to source distance data. Their API is first used to look up the schedules of the trains that made up the journey. A journey can have multiple legs - for example, if I'm travelling from Corby to Brighton, I would take two trains, one from Corby to London St Pancras and one from St Pancras to Brighton. For each leg in the journey, we look up the schedule for a train that took that route on the day the journey took place, then pull the stations it passed through and the distance travelled from that train.

However, the RealTimeTrains API only allows searching services by the stations they call at (as opposed to just by the origin and terminus stations) on the day they ran. This means that, sometimes, we have to manually enter the service IDs of the trains that formed the journey to get accurate results, for example if we're logging them for a journey we took yesterday. In this case, the program skips straight to requesting the routing and distance information for that train and doesn't bother doing any schedule lookups.
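As a rough sketch of that decision, assuming a leg is stored with an optional service ID, the lookup could be skipped like this. The `Leg` type, the `SearchFunc` signature and `resolveServiceID` are all hypothetical names for illustration; none of them come from RailMiles or from the RealTimeTrains API.

```go
package main

import "errors"

// Leg is a hypothetical record of one train within a logged journey.
type Leg struct {
	From      string // CRS code of the boarding station
	To        string // CRS code of the alighting station
	ServiceID string // optional: entered by hand when a search is not possible
}

// SearchFunc stands in for the schedule lookup. Its shape is an assumption,
// not the real RealTimeTrains client API.
type SearchFunc func(from, to, date string) (string, error)

// resolveServiceID returns the service to fetch routing and distance data for.
// A manually entered ID takes priority and skips the search entirely.
func resolveServiceID(leg Leg, date string, search SearchFunc) (string, error) {
	if leg.ServiceID != "" {
		return leg.ServiceID, nil
	}
	if search == nil {
		return "", errors.New("no service ID given and no search available")
	}
	return search(leg.From, leg.To, date)
}
```

Passing the search in as a function keeps the "manual ID wins" rule easy to test without touching the network.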

While scheduling information can be (and is) pulled from the RTT API, distance information for a service cannot. RealTimeTrains only serves this data in the frontend, so if we want it, we have to do some simple web scraping.

The distance numbers

This is the actual code that pulls that information out of the HTML, after which the numbers are parsed and converted into miles.

var waypoints [][3]string
doc.Find(".location.call,.location.pass").Each(func(i int, selection *goquery.Selection) {
    shortcode := shortcodeRegexp.FindString(
        selection.Find(".location a").Text(),
    )

    if shortcode == "" {
        // If this is a junction without a CRS code
        return
    }

    waypoints = append(waypoints, [3]string{
        shortcode,
        strings.TrimSpace(selection.Find("span.miles").Text()),
        strings.TrimSpace(selection.Find("span.chains").Text()),
    })
})

Note that here, a short/CRS code is one of the three-letter codes used to refer to a train station - e.g. BHM for Birmingham New Street, KGX for London King's Cross, or VIC for London Victoria.
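The parsing and conversion step isn't shown in the post, so here is a minimal sketch of how it might look. It assumes the scraped miles and chains strings are plain numbers and that an empty string means zero, and `toMiles` is a made-up name. The one hard fact it relies on is that there are 80 chains in a mile.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// chainsPerMile is the number of chains in one statute mile.
const chainsPerMile = 80

// toMiles converts scraped mile and chain strings into decimal miles.
// Treating empty strings as zero is an assumption about the scraped page.
func toMiles(milesText, chainsText string) (float64, error) {
	parse := func(s string) (float64, error) {
		s = strings.TrimSpace(s)
		if s == "" {
			return 0, nil
		}
		return strconv.ParseFloat(s, 64)
	}

	miles, err := parse(milesText)
	if err != nil {
		return 0, fmt.Errorf("parsing miles %q: %w", milesText, err)
	}
	chains, err := parse(chainsText)
	if err != nil {
		return 0, fmt.Errorf("parsing chains %q: %w", chainsText, err)
	}
	return miles + chains/chainsPerMile, nil
}
```

With this, a waypoint scraped as 12 miles and 40 chains comes out as 12.5 miles.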

From all of this, we get a sequence of locations that the train passes through and the distance between them all. Now what?

Generating maps

It's all well and good talking about abstract location data, but it'd be nice to put that on a map and visualise it. Enter: OpenStreetMap!

Using OSM data, it becomes trivially easy to translate a CRS code into a latitude and longitude using the Overpass API, which puts a layer on top of the raw OSM data to let you query it with a dedicated query language. In fact, to get a list of station locations (and while we're at it, names, since we only store short station codes instead of their full names in the database), all you need to run is the following query:

[out:json];
node["ref:crs"];
out geom;

... which gives you a load of JSON that's comprised of entries looking a little like this (you can play around with the query here).

{
  "type": "node",
  "id": 104734,
  "lat": 51.5656526,
  "lon": -1.7858762,
  "tags": {
    "name": "Swindon",
    "naptan:AtcoCode": "9100SDON",
    "network": "National Rail",
    "operator": "First Great Western",
    "platforms": "4",
    "public_transport": "station",
    "railway": "station",
    "ref:crs": "SWI",
    "toilets:wheelchair": "yes",
    "wheelchair": "yes",
    "wikidata": "Q3244572",
    "wikipedia": "en:Swindon railway station"
  }
}

I filtered this raw data so it only contained the stuff I was interested in, then embedded the file in the Go code that powers RailMiles.
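The filtering step could look something like the sketch below, which reduces an Overpass JSON response to a lookup table keyed by CRS code. The `Station` type and `stationsByCRS` function are illustrative names, not the ones RailMiles uses. The only thing taken as given is that Overpass wraps its results in a top-level `elements` array.

```go
package main

import "encoding/json"

// Station holds the only fields we care about from each OSM node.
type Station struct {
	Name string  `json:"name"`
	Lat  float64 `json:"lat"`
	Lon  float64 `json:"lon"`
}

// overpassResponse mirrors just enough of the Overpass JSON output to
// read node positions and tags.
type overpassResponse struct {
	Elements []struct {
		Lat  float64           `json:"lat"`
		Lon  float64           `json:"lon"`
		Tags map[string]string `json:"tags"`
	} `json:"elements"`
}

// stationsByCRS builds a CRS code -> Station map, dropping every other tag.
// Nodes without a ref:crs tag are skipped.
func stationsByCRS(raw []byte) (map[string]Station, error) {
	var resp overpassResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}

	stations := make(map[string]Station, len(resp.Elements))
	for _, el := range resp.Elements {
		crs := el.Tags["ref:crs"]
		if crs == "" {
			continue
		}
		stations[crs] = Station{Name: el.Tags["name"], Lat: el.Lat, Lon: el.Lon}
	}
	return stations, nil
}
```

The resulting map can be marshalled back to a much smaller JSON file, which is the kind of thing Go's standard `embed` package can then bundle into the binary.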

Now that we have train routes and the locations those routes correspond to, the last thing we have to do is render them to a map.

I chose to do all the visual mapping stuff with Leaflet using the default OpenStreetMap tiles and some fancy OpenRailwayMap tiles as an overlay to emphasize the main railway routes atop that. On the main page, the routes and stations visited in the last month are compiled into a blob of GeoJSON and fed to this chunk of code:

<div id="journey-map"></div>
<script>
    let map = L.map("journey-map");

    L.tileLayer('https://tile.openstreetmap.org/{z}/{x}/{y}.png', {
        maxZoom: 19,
        attribution: '...'
    }).addTo(map);

    // Served over HTTPS so the overlay isn't blocked as mixed content
    let ormOverlay = L.tileLayer('https://{s}.tiles.openrailwaymap.org/standard/{z}/{x}/{y}.png', {
        attribution: '...',
        minZoom: 2,
        maxZoom: 19,
        tileSize: 256,
        className: "tile-orm", // this is used to make the ORM tiles greyscale
    });

    let layerControl = L.control.layers({}, {"OpenRailwayMap": ormOverlay}).addTo(map);

    let gj = L.geoJSON("<< geoJSON goes here >>", {onEachFeature: (feature, layer) => {
        if (feature.properties && feature.properties.name) {
            layer.bindPopup(feature.properties.name);
        }
    }});
    gj.addTo(map);
    map.fitBounds(gj.getBounds());
</script>

Problems

As much as I'd like to say that this is perfect and works 100% of the time - it doesn't. Sometimes it falls flat on its face and ceases to work. The main issues I know about are:

I have a couple of ideas of how to fix these. More posts to follow!