
PMTiles version 3: Disk Layout and Compressed Directories

Brandon Liu · 6 min read

PMTiles is a single-file archive format for pyramids of tiled map data. The last post described the new Entry struct to compact repetitive data in-memory; the next step is to further shrink directories for storage on disk and transfer over the Internet.

PMTiles is designed to substitute for APIs like this:

https://example.s3-like-storage.com/tileset/{z}/{x}/{y}.pbf

One storage pattern is to store each tile as its own object, relying on cloud storage's filesystem-like paths:

                  ┌───┐
┌────────┐   ┌───▶│███│ /tileset/1/0/0.pbf
│        │───┘    └───┘
│        │        ┌───┐
│   S3   │───────▶│███│ /tileset/1/0/1.pbf
│        │        └───┘
│        │───┐    ┌───┐
└────────┘   └───▶│███│ /tileset/1/1/1.pbf
                  └───┘

But this approach doesn't scale up to planet-sized datasets, since millions of unique tiles can take days to upload.

A PMTiles archive is a single file upload:

https://example.s3-like-storage.com/tileset.pmtiles

The information mapping a Z,X,Y coordinate to the address of the tile must be stored in the file itself via an embedded Directory sector. Making interactive maps fast and affordable means making this directory as small as possible.


┌────────┐             /tileset.pmtiles
│        │         ┌─────────────────────┐
│        │         │1,0,0 ┌───┐┌───┐┌───┐│
│   S3   │───────▶ │1,0,1 │███││███││███││
│        │         │1,1,1 └───┘└───┘└───┘│
│        │         └─────────────────────┘
└────────┘

PMTiles v2 punts on compression completely; it has a 1:1 relationship between the file layout and in-memory data structures. Waiting for half a megabyte of directory data for every map is slow, but the implementation remains dead simple and has been good enough to prove out the design across diverse environments like Cloudflare Workers.

The goal of the next specification version is not just to gzip directories and call it a day, but hand-tune a custom compression codec specific to map datasets. Projects like Apache Parquet combine multiple compression techniques for arbitrary non-tiled data; our approach will look more like the domain-specific compression for PNG images, but tuned to map tiles instead of RGB pixels.

Disk Layout

PMTiles v2 did not enforce any particular ordering for tile contents in the archive, so it's easy to generate archives with multi-threaded programs like Tippecanoe. v3 adds an optional header field clustered: a boolean indicating the disk layout of tile contents is ordered by Hilbert TileID, analogous to FlatGeobuf's indexed layout. A clustered archive enables optimized clients to batch tile requests for lower latency, inspired by Markus Tremmel's COMTiles project.
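The Hilbert TileID ordering can be sketched with the textbook xy2d Hilbert-curve conversion: all tiles on zooms 0 through z−1 come first, then tiles within zoom z are ordered by their distance along the curve. A minimal sketch follows; function names are illustrative, and the v3 spec is the authority on the exact bit ordering.

```python
def xy2d(n: int, x: int, y: int) -> int:
    """Distance of (x, y) along a Hilbert curve filling an n-by-n grid."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/reflect the quadrant so the recursion lines up
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def tile_id(z: int, x: int, y: int) -> int:
    # all tiles on zooms 0..z-1 come first: (4^z - 1) / 3 of them
    return (4**z - 1) // 3 + xy2d(2**z, x, y)

print(tile_id(0, 0, 0), tile_id(1, 0, 0), tile_id(1, 1, 0))  # 0 1 4
```

Consecutive TileIDs are then neighbors on the map, which is what lets a clustered archive serve nearby tiles from adjacent byte ranges.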

clustered=false        clustered=true

──────────────▶        ▼ ┌───┐ ┌───┐ ▲
─────▶ ───────▶        └─┘ ┌─┘ └─┐ └─┘
──▶ ──────────▶        ┌─┐ └─┐ ┌─┘ ┌─┐
──────────▶ ──▶        │ └───┘ └───┘ │
──────────────▶        └─┐ ┌─────┐ ┌─┘
───────▶ ─────▶        ┌─┘ └─┐ ┌─┘ └─┐
────▶ ────────▶        │ ┌─┐ │ │ ┌─┐ │
───────▶ ─────▶        └─┘ └─┘ └─┘ └─┘

Test Dataset

Our starting example is a global vector basemap tileset. It addresses 357,913,941 individual tiles, or every tile on every zoom level between 0 and 14. (It includes both an earth and ocean layer, so there are no holes.) After Hilbert run-length encoding, 40,884,468 Entry records remain.

A direct serialization of these records to disk is 40884468 * 24 bytes or 981.2 MB. Simple gzip compression reduces this to 305.4 MB, but we should be able to do better.
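The 24 bytes per record follow from the Entry fields introduced in the last post, assuming two 64-bit fields (TileId, Offset) and two 32-bit fields (Length, RunLength) packed little-endian with no padding; a quick sanity check with Python's struct module:

```python
import struct

# assumed packing of the Entry struct from the last post:
# TileId u64, Offset u64, Length u32, RunLength u32, little-endian, no padding
ENTRY = struct.Struct("<QQII")
print(ENTRY.size)  # 24

total_bytes = 40_884_468 * ENTRY.size
print(total_bytes)  # 981227232, i.e. ~981.2 MB
```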

Varint Encoding

A web-optimized tileset should have individual tiles under a megabyte in size, so 32 bits for Length is overkill. We replace the fixed-size records with a stream of unsigned Varints. We also chop off unnecessary high bits used in TileId, RunLength and Offset.

This step reduces the 981.2 MB directory to 526.4 MB, or 53.6% of the original size.

 TileId      RL    Offset     Len
┌───────────┬─────┬──────────┬────┐         ┌─────┬───┬───┬────┐
│    100    │  1  │    0     │2200│         │ 100 │ 1 │ 0 │2200│
├───────────┼─────┼──────────┼────┤         ├─────┴┬──┴┬──┴───┬┴───┐
│    101    │  1  │   2200   │2300│         │ 101  │ 1 │ 2200 │2300│
├───────────┼─────┼──────────┼────┤ ──────▶ ├──────┼───┼──────┴─┬──┴─┐
│    103    │  1  │   4500   │2000│         │ 103  │ 1 │  4500  │2000│
├───────────┼─────┼──────────┼────┤         ├──────┴┬──┴┬───────┴┬───┴┐
│    104    │  1  │   6500   │1900│         │  104  │ 1 │  6500  │1900│
└───────────┴─────┴──────────┴────┘         └───────┴───┴────────┴────┘
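An unsigned varint stores seven payload bits per byte and uses the high bit as a continuation flag, so small values collapse to a single byte. A minimal writer, as a sketch rather than the spec's reference implementation:

```python
def write_uvarint(n: int, out: bytearray) -> None:
    """Append n as an unsigned varint: low 7 bits per byte, high bit = more to come."""
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)

buf = bytearray()
for value in (100, 1, 0, 2200):  # first row of the example directory
    write_uvarint(value, buf)
print(bytes(buf).hex())  # '6401009811': five bytes instead of a fixed 24
```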

Delta Encoding of TileID + Offset

Because a directory is sorted by ascending TileID, we can store deltas between consecutive entries instead of large numbers.

In a clustered archive, the physical layout of tile data will mostly match TileID order. Where tile contents are contiguous, we can keep Length while replacing Offset with 0, since the Length implies the delta to the next Offset.

Since this delta encoding makes values small, the varint step above should be even more effective.

These two encodings reduce the 526.4 MB directory to 243.2 MB, or 24.8% of the original size.

┌─────┬───┬───┬────┐                   ┌─────┬───┬───┬────┐
│ 100 │ 1 │ 0 │2200│                   │ 100 │ 1 │ 0 │2200│
├─────┴┬──┴┬──┴───┬┴───┐               ├───┬─┴─┬─┴─┬─┴──┬─┘
│ 101  │ 1 │ 2200 │2300│               │ 1 │ 1 │ 0 │2300│
├──────┼───┼──────┴─┬──┴─┐     ──────▶ ├───┼───┼───┼────┤
│ 103  │ 1 │  4500  │2000│             │ 2 │ 1 │ 0 │2000│
├──────┴┬──┴┬───────┴┬───┴┐            ├───┼───┼───┼────┤
│  104  │ 1 │  6500  │1900│            │ 1 │ 1 │ 0 │1900│
└───────┴───┴────────┴────┘            └───┴───┴───┴────┘
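The transformation above can be sketched in a few lines. This is a simplification: a real decoder must still distinguish a genuinely zero Offset from "contiguous with the previous entry", which the spec handles; the sketch only shows the shape of the encoding.

```python
def delta_encode(entries):
    """entries: (tile_id, run_length, offset, length) tuples sorted by tile_id."""
    out = []
    prev_id = 0
    expected = 0  # where contiguous tile data for the next entry would begin
    for tid, rl, off, length in entries:
        enc_off = 0 if off == expected else off  # 0 means "contiguous"
        out.append((tid - prev_id, rl, enc_off, length))
        prev_id = tid
        expected = off + length
    return out

directory = [(100, 1, 0, 2200), (101, 1, 2200, 2300),
             (103, 1, 4500, 2000), (104, 1, 6500, 1900)]
print(delta_encode(directory))
# [(100, 1, 0, 2200), (1, 1, 0, 2300), (2, 1, 0, 2000), (1, 1, 0, 1900)]
```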

Column transpose

Instead of storing each entry in order, we transpose the values to a columnar layout.

┌─────┬───┬───┬────┐           ┌─────┬───┬───┬───┐
│ 100 │ 1 │ 0 │2200│           │ 100 │ 1 │ 2 │ 1 │
├───┬─┴─┬─┴─┬─┴──┬─┘           ├───┬─┴─┬─┴─┬─┴─┬─┘
│ 1 │ 1 │ 0 │2300│             │ 1 │ 1 │ 1 │ 1 │
├───┼───┼───┼────┤     ──────▶ ├───┼───┼───┼───┤
│ 2 │ 1 │ 0 │2000│             │ 0 │ 0 │ 0 │ 0 │
├───┼───┼───┼────┤             ├───┴┬──┴─┬─┴──┬┴───┐
│ 1 │ 1 │ 0 │1900│             │2200│2300│2000│1900│
└───┴───┴───┴────┘             └────┴────┴────┴────┘

This step in isolation does not reduce the size of our directory. However, sparse geographic datasets will have long runs of repeated values: TileID deltas of 1, identical RunLengths, and Offsets zeroed by the delta step. Grouping each column's run together aids the next compression step.
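A sketch of the transpose, applied to the delta-encoded rows from the previous step:

```python
def transpose(entries):
    """Regroup row tuples into columns: all TileID deltas, then all RunLengths,
    then all Offsets, then all Lengths."""
    ids, run_lengths, offsets, lengths = (list(col) for col in zip(*entries))
    return ids + run_lengths + offsets + lengths

rows = [(100, 1, 0, 2200), (1, 1, 0, 2300), (2, 1, 0, 2000), (1, 1, 0, 1900)]
print(transpose(rows))
# [100, 1, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 2200, 2300, 2000, 1900]
```

After the transpose, the identical RunLengths and zeroed Offsets sit adjacent, handing the general-purpose compressor long runs to exploit.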

General-purpose compression

Finally, a general purpose compression algorithm like gzip is applied to the transposed directory.

This step reduces our 243.2 MB directory size to 91.6 MB, 9.3% of the original size. Without the column transpose above, the result is 102.0 MB.
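End to end, the pipeline is: delta-encode, transpose, varint-serialize, then hand the byte stream to a general-purpose compressor; Python's stdlib gzip stands in here. (On a toy four-entry directory the gzip header outweighs any savings; the win only appears at real directory sizes.)

```python
import gzip

def write_uvarint(n: int, out: bytearray) -> None:
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)

# delta-encoded, transposed columns from the running example
columns = [100, 1, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 2200, 2300, 2000, 1900]
buf = bytearray()
for v in columns:
    write_uvarint(v, buf)

compressed = gzip.compress(bytes(buf), mtime=0)
print(len(buf), len(compressed))  # 20 varint bytes before compression
```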

Compressors like Brotli and Zstandard improve on gzip and are supported by the spec for when they're widely available in browsers.

Conclusions + Next Steps

Our real-world, planet-scale dataset can address over 350 million individual tiles in just 91.6 megabytes, beating generic compression by a factor of 3.

The finishing touches to header design and directory partitioning are under discussion on GitHub and will be presented at the FOSS4G 2022 conference in Firenze, Italy, along with a richer tool ecosystem for PMTiles.