Reconstructing movement trajectories with the Valhalla routing engine
This tutorial presents a practical use case for Valhalla, a popular open-source engine for routing and map matching that can enrich geospatial data.
Geospatial challenge: device track is incomplete
Our METER tech team works with anonymised footfall data captured by mobile apps. These mobility records are inherently fragmentary: a recording often ends when the app moves to the background, so we never observe a device’s complete journey.

Complete trajectory data offers significant value:
We want to measure realistic media metrics. By reconstructing the full device track, we can count how many times it intersects particular areas, such as a billboard's visibility zone or a point of interest (POI). This lets us calculate marketing KPIs like Reach, Opportunities to See (OTS) and Rating from real movement data.
We want to profile data. Track data also serves as the foundation for further analytics and machine learning. For example, we can generate a track embedding with an AI model and aggregate it into a device-level embedding (embeddings are vector representations of real-world items). We then feed those embeddings into models that predict a device’s profile (gender, age, income, interests) from data that was originally entirely anonymous.
An example of our machine-learning process and audience profiling is available in this slide deck.

By mapping every audience intersection with all city (D)OOH inventory such as billboards, we can design an optimal advertising campaign tailored to the target audience’s profile.
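To make the idea of counting intersections concrete, here is a minimal sketch that counts how many times a track enters a zone. It is not our production pipeline: the function names are hypothetical, the zone is treated as a simple planar polygon, and only the sampled points are tested, which is exactly why sparse tracks undercount and need to be enriched first.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat); treated as planar x/y in this sketch


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_zone_entries(track: List[Point], zone: List[Point]) -> int:
    """Count outside-to-inside transitions along the ordered track points."""
    entries = 0
    was_inside = False
    for point in track:
        is_inside = point_in_polygon(point, zone)
        if is_inside and not was_inside:
            entries += 1
        was_inside = is_inside
    return entries
```

A real implementation would more likely rely on a geometry library and a spatial index, since every track has to be checked against the whole inventory.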
Map matching as a tech solution
Enriching a device track involves generating additional points along its path. A routing engine computes a route from point A to point B, while map matching snaps a sequence of recorded points to the most plausible path on the road network. The beauty of a map-matching framework is that it provides the extra points we need and ensures they align with the road network, which is crucial.
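Once a road-aligned geometry is available, generating the additional points is a matter of interpolation. The sketch below inserts a point roughly every `step_m` metres along each segment of a line. The function names are hypothetical, spacing restarts at every vertex, and interpolation is linear in lon/lat, which is a reasonable approximation only for short road segments.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat) in degrees

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance between two (lon, lat) points, in metres."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def densify(line: List[Point], step_m: float) -> List[Point]:
    """Return the line with extra points every step_m metres on each segment."""
    if step_m <= 0:
        raise ValueError("step_m must be positive")
    if not line:
        return []
    dense = [line[0]]
    for start, end in zip(line, line[1:]):
        length = haversine_m(start, end)
        k = 1
        # Add intermediate points strictly before the end of the segment.
        while k * step_m < length:
            t = k * step_m / length
            dense.append((start[0] + t * (end[0] - start[0]),
                          start[1] + t * (end[1] - start[1])))
            k += 1
        dense.append(end)  # always keep the original vertex
    return dense
```

Applied to a raw, unmatched track this would only add points on straight lines between fixes, cutting through buildings, which is why the geometry should come from map matching first.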
Routing frameworks are available as both open-source and proprietary options. Our team selected Valhalla, utilizing its Meili component for map matching:
Fully open source: Valhalla’s code and Docker image are publicly available;
OSM-based: it relies on OpenStreetMap data, which is also freely downloadable thanks to its vibrant community;
Self-hosted deployment: we run Valhalla within our own infrastructure, so there are no restrictions on backend requests or map requests.
Valhalla is an open-source routing engine written in C++. It supports multiple travel modes - driving, cycling, walking, and public transit - that can be switched at runtime. Since it runs in a Docker container, you can host it on your own servers without any external limits on requests.
The Valhalla engine has been employed by BMW in their in-car navigation systems (based on a great interview with Valhalla developers Kevin and David - highly recommended), by Sidewalk Labs for route calculation and urban mobility analysis, by Stadia Maps, whose cloud routing APIs are built on Valhalla, and by Hammerhead, which used Valhalla to generate cycling routes on their Android devices.
For more information on comparing routing engines, you can find an excellent comparison table of approaches here (created by Nils Nolde - one of the co-maintainers of Valhalla). You can also benchmark different routing engines using the routingpy Python package.
Get started with Valhalla in Docker
Before you begin, ensure the following prerequisites are met:
Docker Desktop is installed and running;
The specified region’s OSM maps are downloaded.
Keep in mind that feeding OSM maps into the Docker container to generate tiles can be time-consuming: the larger the area you need, the longer it will take.
Tiles are small, square map segments loaded on demand, without using the entire map. Valhalla uses tiles to perform its computations.
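To give a feel for how a coordinate maps to a tile, here is a small sketch. It assumes the tiling scheme described in Valhalla's tile documentation as we understand it: three hierarchy levels with tile sizes of 4, 1 and 0.25 degrees, numbered row by row from the south-west corner. The constants and the function name are assumptions to verify against your Valhalla version.

```python
import math

# Assumed tile sizes in degrees per hierarchy level (see lead-in).
TILE_SIZE_DEG = {0: 4.0, 1: 1.0, 2: 0.25}


def tile_index(level: int, lat: float, lon: float) -> int:
    """Row-major index of the tile containing (lat, lon) at the given level."""
    size = TILE_SIZE_DEG[level]
    columns = int(360 / size)                # tiles per row around the globe
    row = math.floor((lat + 90.0) / size)    # counted from the south pole
    col = math.floor((lon + 180.0) / size)   # counted from the antimeridian
    return row * columns + col


# The first point of the Tashkent test track used later in this tutorial.
print(tile_index(2, 41.2872705, 69.2526042))
```

Under these assumptions, a request touching only a few streets needs only the handful of tiles that cover them, not the whole country extract.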
We’ll need to set up a directory called custom_files as the workspace for all map-related operations. Let’s create that folder and download the map files into it:
mkdir custom_files
Let’s say our target regions are Uzbekistan and Kazakhstan. Go to Geofabrik and locate the appropriate links. Valhalla requires the .pbf file format:
wget -P custom_files \
https://download.geofabrik.de/asia/uzbekistan-latest.osm.pbf \
https://download.geofabrik.de/asia/kazakhstan-latest.osm.pbf
Besides the official Docker image here, there’s a constantly-updated image maintained by Valhalla’s maintainer Nils Nolde. We’ll go with Nils’s image, since it doesn’t require a valhalla.json configuration file.
Pull the Docker image from the container registry and run it right away, specifying where the map files are located:
docker run -dt \
--name valhalla_gis-ops \
-p 8002:8002 \
-v "$PWD/custom_files:/custom_files" \
ghcr.io/nilsnolde/docker-valhalla/valhalla:latest
Example running the official Docker image:
docker run -d \
--name valhalla_official \
-p 8002:8002 \
-v "$PWD/custom_files:/custom_files" \
ghcr.io/valhalla/valhalla:latest \
valhalla_service /custom_files/valhalla.json
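Because tile generation can take a while, the service may not answer immediately after the container starts. The following sketch polls the server until it responds. It assumes the container exposes Valhalla's /status endpoint on the port mapped above; the function names, timeout and polling interval are illustrative choices, not recommendations.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url: str,
                     timeout_s: float = 3600.0,
                     interval_s: float = 10.0,
                     probe: Callable[[str], bool] = http_ok,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the URL until it responds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Assumed address: the port published by the docker run commands above.
    ready = wait_until_ready("http://localhost:8002/status")
    print("Valhalla is ready" if ready else "Timed out waiting for Valhalla")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.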
Production mode
Deploying Valhalla in a production environment on Kubernetes adds extra complexity.
In the example described above, we used maps for Kazakhstan and Uzbekistan as a basis, but what if you also need data from other regions? For example, countries included in the Asia region plus Turkey, whose maps are classified under the Europe region.
You could run two parallel Valhalla instances and alternate requests between them, but as the number of regions scales globally, maintaining a separate instance for each becomes impractical: it consumes extra resources and adds code complexity.
To run Valhalla with a single instance, merge all required map extracts into one file using osmium-tool (installation instructions are available in the official documentation).
Here’s a bash script that merges the selected regions into a single map file:
#!/bin/bash
set -xe
# Define URLs for the Geofabrik map extracts
TURKEY_URL="https://download.geofabrik.de/europe/turkey-latest.osm.pbf"
ASIA_URL="https://download.geofabrik.de/asia-latest.osm.pbf"
# Define the output file name
OUTPUT_FILE="combined_map.osm.pbf"
# Function to install osmium-tool
install_osmium() {
    echo "Installing osmium-tool..."
    sudo apt-get update
    sudo apt-get install -y osmium-tool
}
# Function to download map files
download_maps() {
    echo "Downloading map files..."
    # Download Turkey and Asia maps from Geofabrik
    wget -O turkey.osm.pbf "$TURKEY_URL"
    wget -O asia.osm.pbf "$ASIA_URL"
}
# Function to merge the maps
merge_maps() {
    echo "Merging map files..."
    # Merge Turkey and Asia maps using osmium
    osmium merge turkey.osm.pbf asia.osm.pbf -o "$OUTPUT_FILE"
}
# Main script logic
main() {
    # Install osmium-tool if it's not already installed
    if ! command -v osmium &> /dev/null; then
        install_osmium
    else
        echo "osmium-tool is already installed."
    fi
    # Download map files
    download_maps
    # Merge the maps
    merge_maps
    # Verify the merge result
    echo "Merged map file created: $OUTPUT_FILE"
    osmium fileinfo "$OUTPUT_FILE"
}
# Run the main function
main
When executed, the script downloads the two files specified by the following variables:
TURKEY_URL="https://download.geofabrik.de/europe/turkey-latest.osm.pbf"
ASIA_URL="https://download.geofabrik.de/asia-latest.osm.pbf"
and merges them into one more file, whose name comes from the variable:
OUTPUT_FILE="combined_map.osm.pbf"
For the Valhalla Operator to load this file at startup and prepare the required layers, it must be hosted on a service that allows the operator to download it. It can be placed in a private or public S3 bucket, or on a web server configured to serve it as a static file.
If you’re using a modified example.yaml from the Valhalla Operator repository, find the line:
pbfUrl: https://download.geofabrik.de/australia-oceania/marshall-islands-latest.osm.pbf
And replace it with the following:
pbfUrl: https://example.com/path/to/your/combined_map.osm.pbf
Once that’s done, redeploy the service with the updated configuration. Valhalla will be running with a map file that includes only the regions you need.
API testing
Let’s simulate a track for a non-existent device: place several points off a straight line, and check how the routing engine handles the turns.
We are going to use the /trace_attributes endpoint, which requires at least two points (latitude and longitude) and a costing mode, and optionally accepts a timestamp for each point. It returns a JSON response with detailed information: the complete matched route (as an encoded LineString), the attributes of every traversed edge, and the elapsed time accumulated along the route. It is well documented here.
curl -s -X POST http://localhost:8002/trace_attributes \
-H "Content-Type: application/json" \
-d '{
"shape":[
{"lat":41.2872705,"lon":69.2526042},
{"lat":41.283764,"lon":69.2421776}
],
"costing":"auto",
"shape_match":"map_snap"
}' | jq
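The same request can be assembled programmatically when the points come from a dataset rather than from a hand-written command. The sketch below mirrors the curl call above using only the standard library. The function names are hypothetical, and the optional per-point `time` field (epoch seconds) is included on the assumption that your timestamps are in that form.

```python
import json
import urllib.request
from typing import Dict, List, Optional, Sequence, Tuple


def build_trace_request(points: Sequence[Tuple[float, float]],
                        costing: str = "auto",
                        timestamps: Optional[Sequence[int]] = None) -> Dict:
    """Build a /trace_attributes payload from (lat, lon) pairs."""
    if len(points) < 2:
        raise ValueError("at least two points are required")
    if timestamps is not None and len(timestamps) != len(points):
        raise ValueError("timestamps must match points one-to-one")
    shape: List[Dict] = []
    for i, (lat, lon) in enumerate(points):
        item: Dict = {"lat": lat, "lon": lon}
        if timestamps is not None:
            item["time"] = timestamps[i]  # assumed to be epoch seconds
        shape.append(item)
    return {"shape": shape, "costing": costing, "shape_match": "map_snap"}


def post_trace_attributes(payload: Dict,
                          base_url: str = "http://localhost:8002") -> Dict:
    """POST the payload to the local Valhalla instance and parse the JSON."""
    request = urllib.request.Request(
        f"{base_url}/trace_attributes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    track = [(41.2872705, 69.2526042), (41.283764, 69.2421776)]
    result = post_trace_attributes(build_trace_request(track))
    print(len(result["edges"]), "edges matched")
```

Changing `costing` to another supported mode, such as `pedestrian` or `bicycle`, is how the travel mode is switched per request.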
Below is an example response from the /trace_attributes endpoint, in which the number of attributes and edges fields has been reduced (can you guess where the LineString representation is hiding?):
{
  "units": "kilometers",
  "osm_changeset": 12856148100,
  "shape": "af~vmAiazacCp[qKbPqFf@rD^lCjDtVpAhJ`DhUvLt{@jAjG`@tBfAzFdCxNpFhb@~AxLt@lEn@~Cr@nCz@dC|@`CxChGxJtQ~ErJpAxDnA|EdAbFfB`LxAnJz@`GrBrQdRrwAvSv|AF`GNfBbBfLpB`Nn@nEvO~eAtEn[",
  "confidence_score": 1.0,
  "raw_score": 30.302,
  "admins": [
    {
      "country_text": "None",
      "state_text": "None"
    }
  ],
  "edges": [
    {
      "truck_route": false,
      "speed_limit": 70,
      "road_class": "primary",
      "speed": 70,
      "country_crossing": false,
      "forward": true,
      "length": 0.085,
      "source_percent_along": 0.554,
      "names": [
        "Bobur ko'chasi"
      ],
      "end_node": {
        "intersecting_edges": [
          {
            "walkability": "both",
            "cyclability": "forward",
            "driveability": "forward",
            "from_edge_name_consistency": true,
            "to_edge_name_consistency": false,
            "begin_heading": 162,
            "use": "road",
            "road_class": "primary"
          }
        ],
        "elapsed_time": 4.399,
        "elapsed_cost": 4.729,
        "admin_index": 0,
        "type": "street_intersection",
        "traffic_signal": false,
        "fork": false,
        "transition_time": 5.51
      }
    }
  ],
  "matched_points": [
    {
      "lon": 69.252645,
      "lat": 41.28728,
      "type": "matched",
      "edge_index": 0,
      "distance_along_edge": 0.554415,
      "distance_from_trace_point": 3.604151
    },
    {
      "lon": 69.242168,
      "lat": 41.283784,
      "type": "matched",
      "edge_index": 21,
      "distance_along_edge": 0.535914,
      "distance_from_trace_point": 2.411607
    }
  ],
  "alternate_paths": []
}
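The LineString is hiding in the `shape` field, stored as an encoded polyline. The sketch below decodes it into (lon, lat) pairs, assuming the standard encoded polyline algorithm with six decimal digits of precision, which is what the response above appears to use. The function name is our own.

```python
from typing import List, Tuple


def decode_polyline(encoded: str, precision: int = 6) -> List[Tuple[float, float]]:
    """Decode an encoded polyline string into a list of (lon, lat) pairs."""
    factor = 10 ** precision
    coords: List[Tuple[float, float]] = []
    index = 0
    lat = 0
    lon = 0
    while index < len(encoded):
        # Each point stores a latitude delta followed by a longitude delta.
        for is_lon in (False, True):
            shift = 0
            result = 0
            while True:
                byte = ord(encoded[index]) - 63
                index += 1
                result |= (byte & 0x1F) << shift
                shift += 5
                if byte < 0x20:  # no continuation bit: last chunk of the value
                    break
            # The lowest bit carries the sign (zig-zag encoding).
            delta = ~(result >> 1) if result & 1 else result >> 1
            if is_lon:
                lon += delta
            else:
                lat += delta
        coords.append((lon / factor, lat / factor))
    return coords


# First 12 characters of the "shape" value above: the first matched point.
print(decode_polyline("af~vmAiazacC"))
```

The first decoded point, (69.252645, 41.287281), agrees with the first entry of `matched_points` in the response, which is a quick sanity check that the precision assumption holds.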
This tutorial demonstrated how Valhalla can enhance and enrich geospatial data through routing and map matching. We deployed Valhalla in Docker, loaded OpenStreetMap data, and showcased the API in action, providing a foundation for integrating Valhalla into your own geospatial workflows.