Self-hosting maps: taking control over UX and users' privacy

Sebastian Greger

While self-hosting is comparatively easy for content formats like photos, the same does not apply to interactive maps - given the sheer amount of data and/or the complexity of its processing. After experimenting with OpenMapTiles, I find it reasonable for the first time to satisfy some of my mapping needs beyond static maps without involving a third party.

Driven by a concrete need, but also triggered by discussions surrounding our collaborative Collection of Data Processing Agreements, I recently looked into current alternatives for embedding maps on a website. Looking for a user-centred, small-budget, privacy-friendly, and legally compliant way to embed rich maps, I believe I have found a viable alternative for certain use cases: self-hosting open source maps.

Evaluating common options

The go-to solutions that come to mind when a website needs a map are Google Maps or its open-source counterpart OpenStreetMap (OSM). In addition, there is a range of commercial providers on the market (and probably some services I’ve overlooked - comments below are open for your additions).

Google Maps

The elephant in the room, Google Maps, not only provides state-of-the-art maps and a range of powerful features, but is easy to embed and offers a plethora of styling options.

Claiming to reach over a billion users every month, Google Maps clearly is the most commonly used map service. At the same time, that also gives Google (in theory, at least) access to usage patterns of a billion people.

Yet, it is “free” only in the surveillance capitalism sense, exposing website users to tracking by Google and their partners. And with recently announced volume pricing, many website owners woke up to the fact that providing maps on the web indeed comes with a cost.

Google created a contractual construct in which the website owner and Google are independent data controllers. Under GDPR, this puts burdens on the website owner regarding the justification of data transfer to Google. In my personal opinion, this alone would be reason enough to seek an alternative maps provider.

OpenStreetMap

Thanks to good libraries and documentation, along with out-of-the-box plugins for many CMS, embedding OpenStreetMap is equally straightforward. Yet, while the code is easy to self-host, the maps still have to be pulled in from somewhere. The OpenStreetMap Foundation and a handful of services offer these for free - naturally limiting the visual options to pre-rendered styles.

Interestingly, while this is the interface most people associate with OpenStreetMap, the core of the project is actually a geographic database, not a “map service”.

The map rendering and delivery infrastructure of openstreetmap.org and associated initiatives is free both in the sense of “no payment” and - based on the information available - in the sense of “not paying for the service with users’ data”, but it must not be forgotten that this infrastructure runs on money. So, while this availability of a free Google Maps alternative is an important and generous service to the public, commercial entities using community-provided map tiles may want to consider a donation to the OpenStreetMap Foundation (OSMF).

In legal terms, the OSMF clearly states that it does not see itself as a data processor, but as a data controller. Hence a website owner embedding map images (tiles) from openstreetmap.org would be considered as transferring personal data (here: IP addresses) to a third party. This is essentially a similar setup as with Google, even though the OSMF’s motives are diametrically different (the OSMF is not primarily in the business of providing map tiles for third-party use) and the absence of tracking makes it easier to formulate a “legitimate interest” justification.

Commercial alternatives

When more customisation or additional features are needed, a viable solution is to use one of the commercial providers of mapping services, either OSM-based or using proprietary map data (the best known perhaps being Mapbox and Here, but there are many others as well). These often come with a wide range of implementation options and additional services.

Mapbox has for years been one of the most visible Google Maps competitors, building on OSM and other data to provide a range of services.

With such a solution, the contractual terms make it possible to evaluate privacy implications (e.g. compulsory or optional tracking) transparently. Some providers even offer a Data Processing Agreement under GDPR, possibly rendering this a suitable solution for the performant, customisable, privacy-friendly and legally compliant embedding of maps.

“Privacy-friendly maps” - a journey in UX and technical design

As long-time readers are aware, I have been experimenting since 2013 with designing websites for “ultimate privacy” - aiming not to leak users’ data to any third parties without informing them first. This stems not only from the belief that site owners are responsible for how they treat users’ privacy, but also from the desire to test the boundaries of what is possible in terms of being self-sufficient on the web.

Such an approach is possibly over-apprehensive for many contexts, but websites with privacy-sensitive audiences, for example, do care about these aspects. Hence, the design driver for this exercise was to create a solution for embedding an (interactive) map where the browser only contacts a third party after informing the user beforehand or - even better - not at all.

Static maps to the rescue?

Oftentimes, a full-grown interactive map is not needed. The use of static, pre-rendered maps is a convenient solution for simple use cases and particularly valuable for using large-size maps as page backgrounds etc.

While the Google Maps static maps service requires that its images be embedded directly from Google’s servers, without server-side caching, some tools provide the means to render static maps from OSM and host them locally:

  • staticMapLite is a simple, one-file PHP script that generates static map images by rendering the maps available from openstreetmap.org into a static image.

  • A slightly more complex, but also way more powerful, alternative is Static-Maps-API-PHP, which includes many more rendering options.

A static map, generated from the default OSM tiles via openstreetmap.org, using the staticMapLite script. To add “interactivity”, it would be good UX to provide a link to OpenStreetMap.org or other services that assist the user to contextualise.

Apart from the possibility to serve them first-party and hence avoid any privacy/legal issues, the great benefit of using static maps is that their performance is hard to beat, compared to the embedding of interactive JavaScript maps. This particularly applies when caching them locally. However, the images come at a predefined resolution, making it worthwhile to consider calling different versions for different screen sizes (otherwise a big static map rendered for the desktop screen could be rather useless on a smartphone).

Using third-party maps “responsibly”

The Shariff social media buttons were probably one of the first widely popularized “click-to-activate” solutions; later services implemented similar flows for video, initially for “Do Not Track” users - as recommended by the EFF DNT implementation guide - and recently more and more often for all users by default.
When embedding third-party content, the approach of asking for permission first is growing in popularity. I have long been using a self-written piece of code to override the WordPress oEmbed functionality for videos, so I could imagine simply re-using that implementation for maps as well.

In this solution, I display a locally hosted map placeholder (e.g. a static screenshot) with an interstitial overlay that informs the user and offers the option to embed the map from a third party or view it externally (unless a user has “Do Not Track” activated, in which case the embedding would normally not even be offered):

[Demo: static preview image for embedded content, with overlay]
By embedding this map you consent to retrieving third-party map data; the OpenStreetMap Foundation could thereby track that you are reading this page.
[Embed here] [View on osm.org]
The osm.org privacy policy applies. To learn more, read this site’s privacy notice.

A demo of information-first embedding (using a map from osm.org as a proof-of-concept here, to avoid embedding Google services on this site), as shown to a user with Do Not Track disabled (for the purpose of this demo, the DNT detection is temporarily disabled); if the embedded map doesn’t show, that may be due to privacy browser plugins - e.g. the EFF’s PrivacyBadger seems to sometimes block osm.org embeds.

[Demo: static preview image for embedded content, with overlay]
You have “Do Not Track” activated in your browser. Embedding this map would make the IP address of your device known to the OpenStreetMap Foundation and is therefore disabled.
[View on osm.org]
To learn more, read this site’s privacy notice.

The same demo, but as seen by a user with Do Not Track activated (identifying a best practice for what alternatives to offer would be subject to a separate design exercise, here just re-using my texts from the video embeds).
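The decision behind the two demo states can be sketched in a few lines. This is only an illustration in Python, not the WordPress code running on this site; the names `EmbedMode` and `choose_embed_mode` are made up for this sketch, and it assumes the server sees the browser's `DNT` request header (where the value `1` signals the tracking opt-out).

```python
from enum import Enum


class EmbedMode(Enum):
    """What the page renders in place of the map (names are hypothetical)."""
    ASK_FIRST = "placeholder with 'Embed here' and external link"
    LINK_ONLY = "placeholder with external link only"
    EMBED = "third-party map embedded"


def choose_embed_mode(dnt_header, has_consented):
    """Pick the embed state from the DNT header value and prior consent.

    dnt_header: value of the DNT request header, or None if absent.
    has_consented: True once the user has clicked 'Embed here'.
    """
    # With Do Not Track active, embedding is not even offered.
    if dnt_header is not None and dnt_header.strip() == "1":
        return EmbedMode.LINK_ONLY
    # Without DNT, the third party is only contacted after explicit consent.
    if has_consented:
        return EmbedMode.EMBED
    return EmbedMode.ASK_FIRST
```

In both placeholder states the browser only loads the locally hosted preview image, so no request reaches the map provider before the user acts.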

This solves a range of ethical, and possibly legal, implications, depending on who is the third party and what is the contractual relation with the website owner. But in terms of UX, this is not ideal. While still somewhat ok for a map embedded within a text as demonstrated here, this approach becomes rather obtrusive when maps are not so separated from a site’s content or layout (e.g. as background images).

Self-hosting interactive maps?

If the aim is to go “fully independent” and not rely on static map rendering only, there is no way around self-hosting map data. While the raw data of OSM is freely available, the journey from data to visual representation is long. It involves downloading and processing the raw data (40 GB for the entire world, compressed), generating map tiles (either pre-rendered for caching or on-the-fly in real time) and serving them to the user. In particular with conventional bitmap image tiles (i.e. readily visually rendered maps), this comes with huge demands for processing power and data storage - not realistic to set up for the occasional use of maps on a small or medium-size website.
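To get a feeling for the scale, a bit of arithmetic helps. The sketch below assumes the common web map tiling scheme, in which zoom level z divides the world into 2^z by 2^z square tiles; the function names are my own, and the result is a tile count only, not a storage estimate.

```python
def tiles_at_zoom(zoom):
    """Number of tiles covering the whole world at one zoom level."""
    # 2**zoom columns times 2**zoom rows
    return 4 ** zoom


def total_tiles(max_zoom):
    """Number of tiles for all zoom levels from 0 up to max_zoom."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


if __name__ == "__main__":
    for max_zoom in (10, 14, 18):
        print(f"zoom 0-{max_zoom}: {total_tiles(max_zoom):,} tiles")
```

Each additional zoom level quadruples the number of tiles, which is why pre-rendering bitmap tiles for the entire world down to street level quickly gets out of hand.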

OpenMapTiles - self-hosted interactive OSM maps

This is why I was excited to learn about OpenMapTiles, an open source project that eliminates most of the dreadful tasks of running OSM maps locally: the website owner no longer needs to set up a complex process, but can build self-hosted maps “in a matter of minutes”. Its inventors actively promote it as privacy-by-design.

The slogan of OpenMapTiles.org is “Open source maps made for self-hosting” — just what the doctor ordered!

This, however, is not as easy as embedding a Google/OSM map; it comes with a few extra steps and requires some degree of fluency with the backend of a website. On another note, while the pre-rendered map tiles are free for personal projects, educational use and for open source and open data projects, other uses require purchasing a license, which comes with (very reasonable) fees attached.

I chose to leave the more advanced solutions (Docker images etc.) out for now and try to set up a minimum viable implementation that can be replicated on even the most basic shared hosting with PHP.

1) Set up a PHP tile server

Tile Server PHP is an open source component (a single-file PHP script, to be precise) that takes on the role of delivering requested vector tiles (the square pieces a map is assembled from, containing all the map data for that area) to the user. Installation is very straightforward: download it and place it in a dedicated directory on the server (detailed instructions here).
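Which tile the browser requests for a given place follows from a well-known formula. The sketch below assumes the standard web map tiling scheme (Web Mercator, with tile row 0 at the top); the function name and the Berlin coordinates are only for illustration.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile indices containing a WGS84 coordinate."""
    n = 2 ** zoom  # tiles per row/column at this zoom level
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    # Web Mercator projection of the latitude, scaled to 0..n from the top
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid range for coordinates on the outer edges
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


if __name__ == "__main__":
    # Roughly the centre of Berlin, at the deepest zoom of the tile set
    print(lonlat_to_tile(13.405, 52.52, 14))
```

The resulting pair, together with the zoom level, is what later appears as {z}/{x}/{y} in the tile URLs.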

2) Upload the map data

Licenses and attributions: Backend and frontend components used in this demo are open source and free, including commercial use - some of the licenses however coming with attribution requirements. OSM’s data is also free to use with attribution, but pre-fabricated map tiles commonly have more restrictive licenses.

OSM by nature is not a map, but a database of geographical information. Hence, in order to present it visually, the raw data needs to be translated into map tiles (in this case: tiles with vector data).

The 100% free and independent (but time-intensive) way would be to self-generate map tiles from OSM data. Much easier to work with are the ready-made tile packages available from openmaptiles.com. These are free for certain non-commercial, non-profit or evaluation use, but come with a license fee for everybody else. For the purpose of this demo, I obtained the package for Berlin. It is a .mbtiles file, and simply has to be copied into the same folder as the tileserver.php file.

The PHP script, a .htaccess file and the .mbtiles archive - my vector tile server is up and running; at 121 MB ready to serve OSM maps on all zoom levels for the entire Berlin area!

Obviously, this is where storage space considerations - and depending on the website in question also bandwidth/traffic - come into play. In my personal explorations, I assume a small-scale site using maps as content not unlike images and videos; the feasibility of self-hosting would likely be evaluated very differently for a map-based web application.

While the Greater Berlin area weighs in at 120 MB (172 MB uncompressed), a map of Germany already consumes 3.4 GB of space, not to mention the entire world at a whopping 56 GB. But to be fair: a lot of websites use maps for rather limited areas. The amount of map data in combination with traffic volumes gives a good indication of where a hosted solution becomes more attractive. If the described open source, vector-based solution appeals: Klokan Technologies, the company behind OpenMapTiles, offers hosting plans as well.

3) Download a style

Since vector maps are just plain data, the Mapbox GL JS library requires a style sheet to render the data into a visual representation. OpenMapTiles comes with a range of freely usable styles, but to get started, it is easiest to create a copy of the default JSON file and save it under the same domain where the HTML page will reside.

The only change absolutely necessary for now is to point the URL under sources > openmaptiles > url to the tileserver.php file, providing the relative path to the style JSON as an attribute:

"url": "https://domain.tld/path-to-tileserver/tileserver.php?/name-of-the-style-json"
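For those who prefer a script over a text editor, the same change can be made programmatically. This is a minimal sketch, assuming the style JSON has the sources > openmaptiles > url structure described above; the function name is my own and the URL is the placeholder from the snippet, to be replaced with the real location.

```python
import copy
import json


def point_style_at_tileserver(style, tileserver_url, source_name="openmaptiles"):
    """Return a copy of the style dict with the source URL replaced."""
    updated = copy.deepcopy(style)  # leave the caller's dict untouched
    updated["sources"][source_name]["url"] = tileserver_url
    return updated


if __name__ == "__main__":
    # Stand-in for the downloaded default style; only the relevant part
    # is shown and the original URL here is a dummy value.
    style = {
        "version": 8,
        "sources": {
            "openmaptiles": {"type": "vector", "url": "https://example.org/tiles.json"}
        },
        "layers": [],
    }
    local = point_style_at_tileserver(
        style,
        "https://domain.tld/path-to-tileserver/tileserver.php?/name-of-the-style-json",
    )
    print(json.dumps(local["sources"], indent=4))
```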

4) Create the embed code

Next, a frontend HTML page needs to be created. Using the sample code provided, all that needs to be done is changing the “style” URL to point to the locally hosted style JSON from step 3.

Now, if everything went well, calling the HTML page should display a map from the locally hosted vector tiles and using the style from the locally hosted JSON file. The JS console helps to trace issues if something doesn’t work - most importantly cross-domain issues.

The remaining steps are about optimising user privacy and server performance.

5) Eliminate third-party calls

If the desire is to create a fully self-contained solution, there are still a few resources to be moved, as the provided demo code calls the Mapbox GL JS library from mapbox.com and the default style references resources (e.g. fonts, icons etc.) from GitHub or other services. The documentation explains this fairly well:

Mapbox GL JavaScript code

To use a local version of the JS code, the required library can either be generated with NodeJS from the GitHub repository or - for friends of shortcuts - the following three files can be copied into one place on the local server: mapbox-gl.js, mapbox-gl.css and mapbox-gl.js.map; then change the corresponding references in the HTML head.

Style assets

The assets for the chosen style are still referenced externally; to fix this in the case of the default theme used, the content of the GitHub repository needs to be stored with the style JSON. When for example copying them all into a subfolder named “style”, the corresponding “sprite” entry in the style JSON becomes:

"sprite": "https://domain.tld/path-to-style/style/sprite",

Fonts

The fonts referenced in the style JSON can be stored locally by retrieving the corresponding folders from GitHub and storing them in a folder below the style sheet. Then, the “glyphs” field in the JSON needs to be changed to reflect the new location.

"glyphs": "https://domain.tld/path-to-style/fonts/{fontstack}/{range}.pbf",
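To double-check that nothing in the style still points elsewhere, the JSON can be scanned for leftover external URLs. The following is a rough sketch, not part of the OpenMapTiles tooling: the function name is hypothetical, domain.tld is the placeholder used throughout this post, and example.org stands in for whatever external host a style may reference.

```python
from urllib.parse import urlparse


def find_external_urls(node, own_host, path="$"):
    """Recursively collect (json_path, url) pairs not served from own_host."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found.extend(find_external_urls(value, own_host, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_external_urls(value, own_host, f"{path}[{index}]"))
    elif isinstance(node, str) and node.startswith(("http://", "https://")):
        # Only absolute URLs can leak requests to another host
        if urlparse(node).hostname != own_host:
            found.append((path, node))
    return found


if __name__ == "__main__":
    style = {
        "sprite": "https://example.org/some-style/sprite",  # still external
        "glyphs": "https://domain.tld/path-to-style/fonts/{fontstack}/{range}.pbf",
    }
    for json_path, url in find_external_urls(style, "domain.tld"):
        print(f"{json_path}: {url}")
```

An empty result for the style JSON is a good sign, but the browser's network panel remains the final authority on whether any third-party request is still being made.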

6) Optimize performance

In terms of performance, the documentation suggests two more steps:

Unpacking the tiles from the .mbtiles file

Since the requested tile has to otherwise be extracted from this compressed sqlite file every time (unless cached in the browser of a repeat visitor, of course), it is suggested to unpack the tiles into a directory structure using the mbutil tool - eliminating almost all server load and enabling direct delivery as static files.
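To illustrate what the unpacking involves, here is a simplified sketch in Python. It is not a replacement for mbutil: it assumes the archive follows the MBTiles layout (a "tiles" table or view with zoom_level, tile_column, tile_row and tile_data, rows counted from the bottom in the TMS scheme), and the function name and the .pbf extension are my own choices.

```python
import sqlite3
from pathlib import Path


def unpack_mbtiles(mbtiles_path, out_dir, extension="pbf"):
    """Write every tile of an .mbtiles archive to out_dir/z/x/y.<extension>.

    Returns the number of tiles written.
    """
    out_root = Path(out_dir)
    conn = sqlite3.connect(str(mbtiles_path))
    count = 0
    try:
        rows = conn.execute(
            "SELECT zoom_level, tile_column, tile_row, tile_data FROM tiles"
        )
        for zoom, column, tms_row, data in rows:
            # MBTiles counts rows from the bottom (TMS); the {z}/{x}/{y}
            # URLs used by the frontend count from the top, so flip it.
            xyz_row = (2 ** zoom - 1) - tms_row
            tile_path = out_root / str(zoom) / str(column) / f"{xyz_row}.{extension}"
            tile_path.parent.mkdir(parents=True, exist_ok=True)
            # Bytes are written unchanged; if the archive stores tiles
            # gzip-compressed, the web server has to declare that encoding.
            tile_path.write_bytes(data)
            count += 1
    finally:
        conn.close()
    return count
```

Calling unpack_mbtiles("berlin.mbtiles", "tiles") (both names being examples) would produce the folder structure that can then be served as plain static files.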

Once these are uploaded to the server (e.g. in a subfolder), the “sources” values in the style.json have to be updated:

"sources": {
    "openmaptiles": {
        "type": "vector",
        "tiles": [
            "https://domain.tld/path/{z}/{x}/{y}.pbf"
        ],
        "maxzoom": 14
    }
},

Important at this point: if accessing the tiles from another domain, make sure to uncomment the section after “Option: CORS header for cross-domain origin access to all data” in the .htaccess file (otherwise WebGL runs into cross-origin issues).

Interestingly, once this has been done, it appears that the tileserver.php does not really play a role any more in all of this; once the .mbtiles file has been replaced with the folder structure, it is all managed by the frontend JS and the .htaccess file.

Multiple domains for simultaneous requests

Where applicable, several CNAME domain names could be set up to access the tile domain - this allows for more concurrent HTTP connections between browser and server than the per-host limit browsers apply (historically two, around six in current browsers), further improving speed and therefore UX.
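On the style side, this translates into listing several URL templates in the “tiles” array of the source, from which the map library can pick. A small sketch of generating that list follows; the hostnames a.domain.tld, b.domain.tld and c.domain.tld are purely hypothetical aliases, and the function name is my own.

```python
def sharded_tile_urls(hostnames, tile_path):
    """Build one tile URL template per hostname for the style's "tiles" array."""
    clean_path = tile_path.lstrip("/")
    return [f"https://{host}/{clean_path}" for host in hostnames]


if __name__ == "__main__":
    urls = sharded_tile_urls(
        ["a.domain.tld", "b.domain.tld", "c.domain.tld"],
        "path/{z}/{x}/{y}.pbf",
    )
    for url in urls:
        print(url)
```

All aliases need to serve the same files, and since they differ from the page's own domain, the CORS header mentioned above has to be enabled for them.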

Result

This map is displayed using a 100% self-hosted solution, as described in the six steps above; no HTTP requests to any third parties.

What I really like: even though my 172 MB of hosted map data only contain detailed maps for Greater Berlin, it is still possible to zoom out and see the entire world - only when zooming into other areas does the lack of detailed tiles become apparent. This is a great UX for a range of common use cases on websites - in order to indicate the location of a store in Berlin, users do not expect access to a high res map of downtown New York in the same frame, yet it may be welcome to be able to locate Berlin on a world map.

Conclusion

Initially triggered by the desire to work around some of the privacy-related implications (UX and legal) of common map use on websites, this little experiment led to a set of options for the responsible and user-centred embedding of maps - here in order of my personal preference:

A) Static maps as a lightweight alternative

First of all, the explorations led to highly usable options for using static maps. This already covers a good share of common use cases on websites, both as content within a page or a design element for the layout, and comes with great privacy, zero compliance overhead and unbeatable performance. And it is always completely free.

Combining that approach with the Tile Server GL suite, developers could even locally build a static map generator using self-hosted map tiles - therefore being able to take full control over their content and appearance (none of the reviewed static map generators was able to consume vector maps, hence simply feeding them with the tile URLs of self-hosted vector tiles would not be sufficient).

B) Self-hosting interactive maps

Unlike just a short while ago, self-hosting maps has suddenly become a viable alternative to using external services. The OpenMapTiles offering is rather easy to implement (though not a copy-paste solution for front-end users). Even better, with full access to editing the styles, website owners not only take control over hosting and delivery, but over every aspect of their maps. Depending on the nature/scope of the project and the chosen source of vector tiles, this can be anything from entirely free to competitively priced (e.g. acquiring the Berlin map set used in this demo, including the global basemap at low resolutions, would carry a reasonable US$20 price tag for commercial use).

For applications where self-hosting is not feasible due to volume, performance, or maintenance options, using a hosted service for open source vector maps is an interesting alternative to services like Mapbox et al.

C) Contracting a partner as data processor

The next alternative would be to find an external provider of mapping solutions. When aiming for maximum legal security (as always: depending on the circumstances of a project and the risk-affinity of the website operator), the provision of a GDPR-based Data Processing Agreement could be an important variable here - avoiding the potential pitfalls of transmitting personal data of users (at a minimum, IP addresses will always be processed) to another data controller.

D) Consent UI as a compromise

Where static maps are not sufficient, and neither the self-hosted solution nor a commercial service is the right answer, using Google Maps, openstreetmap.org or any other third-party service remains a possible alternative; however, every website owner may want to carefully evaluate the conduct of the partner and consider the contractual insecurity when embedding third-party content without a DPA. Implementing a two-step embed, i.e. asking the user to confirm their intent before connecting to external services, could be a safeguard for the related dilemmas, but inevitably comes with a rather harsh impact on UX.

Footnotes

  1. Depending on the applied interpretation of the GDPR, the transmission of user data to an outside party comes with rather high requirements for e.g. being able to rely on “legitimate interest” as a basis for processing - risk-averse natures may read this as an indication towards the need for explicit consent, especially considering the tracking involved.
  2. Before doing so, always verify the license of the map visuals used for possible limitations and obligatory attribution requirements.
  3. For a great demo in German, see “OpenMapTiles: Revolution in selbstgehosteten Karten” from the FOSSGIS conference 2018.
  4. With the promise of highest privacy, due to no personal data being stored, yet as it appears (based on a friendly e-mail exchange from May 28) currently without explicit provision of a DPA.
  5. In case you want to serve map tiles to more than one domain from your Tile Server PHP, follow these instructions to modify .htaccess.