To deliver fast analysis of their documents to our clients, we need to retrieve a lot of images and their details as fast as possible.

We analyzed and compared the raw performance of several open-source IIIF servers compatible with the API 2.0 specification on specific base operations under different scenarios and image file formats.

All our tests were conducted against containerized versions of the IIIF servers (using Docker), hosted sequentially on the same bare metal server.

Our benchmark provides detailed information on how each server performs in the different scenarios. The source code is available, as the project is open-source software. Anyone can test their own servers; we have made it easy to add new server implementations to the test suite. We also showcase the evolution of the latest IIIF server updates and their impact on the aforementioned performance.

This analysis and benchmarking tool should allow project owners that require high-throughput from IIIF servers to make the best choice for their use-case.

Benchmark specifications

IIIF Servers

In this study, we compared 4 common servers compatible with the IIIF API 2.0 specification:

Server      Version  Language
Cantaloupe  5.0.2    Java
IIPsrv      1.1      C++
Loris       3.2.1    Python
RAIS        4.1.0    Go

Measurement environment

All the servers were run in Docker containers to make the tests reproducible. At the moment, only RAIS ships with an up-to-date Docker image. The other servers required us to build our own images for the benchmark, which are available on the registry of the project.

We also tried to implement Go-IIIF and Hymir servers, but they appeared to be too specialized and either required specific image formats or a custom path resolver.

The machine hosting the different servers was a Hetzner CPX41 VPS (8 vCPU, 16 GB of RAM), and the benchmark was executed through a VPN with a latency of 46.0 ms (σ = 0.53).

It was important for us to produce results via a network in a context of distributed processing. Moreover, we observed different data stream strategies among the servers, which could have an impact on results produced through a network.

Servers have been tested in a similar environment, aiming to use the maximum available CPU resource to complete the test. Cantaloupe and RAIS servers used threading by default. We had to spawn multiple processes (10 in the benchmark implementation) for Loris and IIPsrv to benefit from parallel processing.

Benchmark tool

We used the Drill load testing tool to perform the benchmark. In our opinion, this tool was easy to use and well suited to our use case, allowing us to adjust the concurrency or the number of repetitions during the tests.

A Python script is responsible for generating the scenarios depending on the tested operations, image set and chosen concurrency. The script then runs the load test on each running server by invoking Drill. Once a load test is finished, the script parses the Drill report to generate, store and compare benchmark results in JSON format.
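As an illustration of the scenario generation step, here is a minimal sketch that builds the text of a Drill benchmark file. It is not the project's actual script: the function name, the example base URL and the request paths are hypothetical, and the layout (concurrency, base, iterations, plan) assumes the benchmark file format described in Drill's README.

```python
def build_drill_plan(base_url, requests, concurrency, iterations):
    """Return the text of a Drill benchmark file (YAML).

    `requests` is a list of (name, path) pairs; each becomes one plan step.
    Names are assumed to be simple strings without YAML special characters.
    """
    lines = [
        f"concurrency: {concurrency}",
        f"base: '{base_url}'",
        f"iterations: {iterations}",
        "",
        "plan:",
    ]
    for name, path in requests:
        lines += [
            f"  - name: {name}",
            "    request:",
            f"      url: '{path}'",
        ]
    return "\n".join(lines) + "\n"


# Hypothetical server address and image identifier, for illustration only.
plan = build_drill_plan(
    "http://localhost:8182",
    [("information", "/iiif/2/sample_01.jpg/info.json")],
    concurrency=10,
    iterations=5,
)
print(plan)
```

Writing the returned text to a file gives one scenario per server, operation and concurrency level, which keeps each load test reproducible.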

Figure: Benchmark architecture

Operations

We evaluated the computational efficiency of 4 image operations that we use intensively in our Machine Learning processes:

Operation          Application example
Image information  Image size or availability
Full image         Segmentation
Resize             Classification
Crop               Process a section (e.g. sub-pages)
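Each of these operations maps to a URL pattern of the IIIF Image API 2.0 ({identifier}/info.json for information, {identifier}/{region}/{size}/{rotation}/{quality}.{format} for images). The sketch below builds the four URLs for one image. The base URL, the crop region and the target width are hypothetical values chosen for illustration, not the ones used in the benchmark.

```python
from urllib.parse import quote


def iiif_urls(base_url, identifier, crop=(0, 0, 500, 500), width=500):
    """Build IIIF Image API 2.0 URLs for the four benchmarked operations.

    `crop` is (x, y, width, height) in pixels; `width` is the resize target.
    Both defaults are arbitrary example values.
    """
    # The identifier must be URL-encoded when it contains reserved characters.
    root = f"{base_url.rstrip('/')}/{quote(identifier, safe='')}"
    x, y, w, h = crop
    return {
        "information": f"{root}/info.json",
        "full": f"{root}/full/full/0/default.jpg",
        "resize": f"{root}/full/{width},/0/default.jpg",
        "crop": f"{root}/{x},{y},{w},{h}/full/0/default.jpg",
    }


for operation, url in iiif_urls("http://localhost:8182/iiif/2", "sample_01.jpg").items():
    print(operation, url)
```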

All the servers have been configured not to use cache, so we could refine results with more iterations without altering the response time.

Formats comparison

Three formats supported by the IIIF 2.0 specification were used for the stored images: JPEG, TIFF and JPEG-2000.

We retrieved a set of 25 TIFF images from the public PPN321275802 Manifest, then we converted those images to JPEG and JPEG-2000 using respectively ImageMagick and libopenjp2-tools.

$ convert sample_01.tif -quality 90 sample_01.jpg
$ opj_compress -i sample_01.tif -o sample_01.jp2
$ opj_compress -r 10 -i sample_01.tif -o sample_01_10x.jp2

Format  Quality          Average size
TIF     lossless         2.53 MB
JP2     lossless         5.52 MB
JP2     10x compression  1.28 MB
JPG     q = 90%          1.56 MB
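The three conversion commands shown above can be applied to a whole image set with a small script. This is only a sketch: the function names and the directory layout are assumptions, while the command-line arguments are the ones from the commands above. It requires ImageMagick and the OpenJPEG tools to be installed.

```python
import subprocess
from pathlib import Path


def conversion_commands(tiff_path):
    """Return the three conversion commands for one TIFF file."""
    src = Path(tiff_path)
    stem = src.with_suffix("")  # same path without the .tif extension
    return [
        ["convert", str(src), "-quality", "90", f"{stem}.jpg"],
        ["opj_compress", "-i", str(src), "-o", f"{stem}.jp2"],
        ["opj_compress", "-r", "10", "-i", str(src), "-o", f"{stem}_10x.jp2"],
    ]


def convert_directory(directory):
    """Convert every .tif file found in `directory` (hypothetical layout)."""
    for tiff in sorted(Path(directory).glob("*.tif")):
        for command in conversion_commands(tiff):
            # check=True stops at the first failing conversion.
            subprocess.run(command, check=True)


print(conversion_commands("sample_01.tif"))
```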

To decide which format is best for storing images, we compared the response time of each server with the different formats. Servers were run with a basic configuration and the OpenJPEG codec. The concurrency was set to 10 to observe the servers' resilience:

Server         Cantaloupe  IIPsrv  Loris   RAIS
Success        11500       0       11500   11500
Total          11500       11500   11500   11500
Crop           194.76      null    118.96  849.72
Full image     825.24      null    172.2   1153.84
Information    182.88      null    53.16   50.6
Resized image  503.76      null    215.08  1021.32
TIFF (concurrency = 10)

Server         Cantaloupe  IIPsrv  Loris    RAIS
Success        11500       11500   11500    11500
Total          11500       11500   11500    11500
Crop           144.4       151.52  137      101.68
Full image     2399.2      1461.6  1923.68  1712.96
Information    78.56       115.2   51.08    49.8
Resized image  722.2       520.16  649.6    513.52
JPEG-2000 (concurrency = 10)

Server         Cantaloupe  IIPsrv  Loris    RAIS
Success        11256       11500   11500    11500
Total          11500       11500   11500    11500
Crop           317.52      93.2    99.48    78.4
Full image     1374.33     555.72  1028.16  788.32
Information    92.2        71.72   48.88    48.36
Resized image  522.75      322.92  455.68   353.88
JPEG-2000 10x (concurrency = 10)

Server         Cantaloupe  IIPsrv  Loris   RAIS
Success        11500       0       11500   11500
Total          11500       11500   11500   11500
Crop           146.64      null    89.32   588.56
Full image     77.48       null    131.72  889.68
Information    80.68       null    51.6    53.76
Resized image  1981.64     null    196.52  744.84
JPEG (concurrency = 10)

JPEG seems to offer the best performance at similar resolutions, especially when retrieving full resolution images. Crop operations are fast with JPEG-2000, but Loris seems to handle crops even better with JPEGs.

JPEG compression artifacts are not really a problem in our case, as they barely affect accuracy while reducing transfer and processing time. JPEG is also a widely adopted format, which helps avoid further complexity or errors in our processes.

In our specific context, IIPsrv, which does not handle JPEG as an input format, is not the best choice for performance, although it seems to have the best results for JPEG-2000 decoding. Nor does IIPsrv handle simple TIFF images, as it requires pyramidal tiles.

Note that the output image format used to compare the response time has been set to JPEG, whatever the input format.
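To illustrate how such result tables can be compared once they are stored as JSON-like data, the sketch below picks the fastest server for each operation. The dictionary layout and the function name are assumptions and not the benchmark's actual JSON schema. The values are copied from the JPEG table above, with null (presumably meaning that no request succeeded) represented as None.

```python
def fastest_per_operation(results):
    """Map each operation to the (server, time) pair with the lowest response time."""
    fastest = {}
    for server, timings in results.items():
        for operation, value in timings.items():
            if value is None:  # no measurement available for this server
                continue
            if operation not in fastest or value < fastest[operation][1]:
                fastest[operation] = (server, value)
    return fastest


# Response times from the JPEG table (concurrency = 10).
jpeg_results = {
    "Cantaloupe": {"Crop": 146.64, "Full image": 77.48, "Information": 80.68, "Resized image": 1981.64},
    "IIPsrv": {"Crop": None, "Full image": None, "Information": None, "Resized image": None},
    "Loris": {"Crop": 89.32, "Full image": 131.72, "Information": 51.6, "Resized image": 196.52},
    "RAIS": {"Crop": 588.56, "Full image": 889.68, "Information": 53.76, "Resized image": 744.84},
}

for operation, (server, value) in fastest_per_operation(jpeg_results).items():
    print(operation, server, value)
```

With these values, Loris comes first on crop, information and resize, while Cantaloupe comes first on the full image.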

Performance results

Concurrency

As JPEG seems to be the best format for our needs, we compared the response times of the compatible servers at different levels of concurrency:

Figure: Response times depending on concurrency

We can observe that RAIS performance drops quickly on most operations. This behavior is caused by a software limitation when decoding JPEGs with ImageMagick, which produces disk space errors under concurrency.

Loris seems to handle concurrency relatively well compared to Cantaloupe. Cantaloupe is extremely fast at serving the full image because it applies no transformation, which is rather smart but can make its output less consistent with that of other operations.
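For readers who want to reproduce this kind of measurement without Drill, the sketch below shows the general idea with a thread pool from the Python standard library. It is a simplified stand-in and not how the benchmark or Drill works internally. The function names are hypothetical, and `fetch` is any callable that requests a URL and raises an exception on failure.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(fetch, url):
    """Call fetch(url) and return (succeeded, elapsed time in milliseconds)."""
    start = time.perf_counter()
    try:
        fetch(url)
        succeeded = True
    except Exception:
        succeeded = False
    return succeeded, (time.perf_counter() - start) * 1000


def measure(fetch, urls, concurrency):
    """Request every URL with at most `concurrency` requests in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda url: timed_call(fetch, url), urls))
    durations = [elapsed for succeeded, elapsed in results if succeeded]
    return {
        "total": len(results),
        "success": len(durations),
        # None when no request succeeded.
        "mean_ms": mean(durations) if durations else None,
    }
```

A real run could pass, for example, `lambda url: urllib.request.urlopen(url).read()` as `fetch` and repeat the measurement for each concurrency level.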

Decoding libraries

We decided to compare different available libraries to decode images on the Cantaloupe server. This server is easily configurable and offers many options for image decoding.

Library        GraphicsMagick  ImageMagick  Jai      TurboJpeg  Java2d
Success        2875            2875         2875     2875       2875
Total          2875            2875         2875     2875       2875
Crop           219.56          252.72       1614.36  84.24      142.44
Full image     81.68           124.48       74.76    54.8       70.92
Information    131.72          174.24       81.8     54.72      86.52
Resized image  243.96          319.92       1739.24  178.76     1736.16
JPEG libraries comparison on Cantaloupe (concurrency = 1)

We had to use the 4.1.9 release of Cantaloupe to test ImageMagick and GraphicsMagick, as they were removed in later releases.

These results show that libjpeg-turbo clearly improves decoding performance on JPEGs compared to the default Java2d library:

Operation      Speed improvement
Crop           69.1%
Full image     29.4%
Information    58.1%
Resized image  871.2%
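These percentages are consistent with expressing the time saved relative to the faster library, that is (t_Java2d - t_TurboJpeg) / t_TurboJpeg. This formula is inferred from the figures rather than stated by the benchmark. A minimal sketch that reproduces the table from the Java2d and TurboJpeg columns of the JPEG libraries comparison:

```python
def speed_improvement(baseline, improved):
    """Relative speed-up in percent of `improved` over `baseline` response times."""
    return round((baseline - improved) / improved * 100, 1)


# Response times copied from the JPEG libraries comparison on Cantaloupe.
java2d = {"Crop": 142.44, "Full image": 70.92, "Information": 86.52, "Resized image": 1736.16}
turbojpeg = {"Crop": 84.24, "Full image": 54.8, "Information": 54.72, "Resized image": 178.76}

for operation in java2d:
    print(operation, speed_improvement(java2d[operation], turbojpeg[operation]))
# Crop 69.1
# Full image 29.4
# Information 58.1
# Resized image 871.2
```

Read this way, an improvement of 871.2% on resized images means the response is almost 10 times faster.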

Loris, which is based on Pillow, showed rather good performance handling JPEGs too.

Comparison of Loris and Cantaloupe (libjpeg-turbo)

We compared the performance of the two most promising servers for JPEGs at different levels of concurrency:

Figure: Comparison of Loris and Cantaloupe (libjpeg-turbo) performance depending on concurrency

These results lead to some interesting conclusions:

  • Cantaloupe's strategy of serving source images directly makes it incredibly fast, especially at low concurrency
  • Both servers have rather similar speed for other image processing at low concurrency
  • Loris is significantly faster on crop and resize operations at high concurrency (respectively 31.9% and 45.9% with a concurrency set to 50)

Other tests

The generic implementation of the benchmark allowed us to see how certain parameters could influence the performance.

JPEG quality factor
Figure: Evolution of Loris response time with JPEG quality (concurrency = 1)

Quality does not have a huge impact on response time for quality values between 75 and 95. However, lowering the quality from 100 to 95 can bring a significant improvement:

Operation      Speed improvement
Crop           26.2%
Full image     17.7%
Information    -0.41%
Resized image  10.3%

JPEG-2000 decoding libraries

We compared the performance of the Kakadu and OpenJPEG implementations on the Cantaloupe server:

Library        Kakadu   OpenJPEG
Success        2875     2875
Total          2875     2875
Crop           133.92   155.44
Full image     1052.84  2892.44
Information    97.08    81.04
Resized image  275.2    925.32
JPEG-2000 libraries comparison on Cantaloupe (concurrency = 1)

The Kakadu library is far more performant for most operations, especially for full and resized images:

Operation      Speed improvement
Crop           16.1%
Full image     174.7%
Information    -16.5%
Resized image  236.2%

Large source files

Many projects using IIIF use high resolution images. We ran the benchmark with a set of two images approaching 10,000 px to see whether it makes a significant difference.

Format  Quality   Average size
JPG     q = 90%   11.57 MB
TIF     lossless  137.00 MB
JP2     lossless  57.79 MB
Large image information

We tested resilience by serving large images on each server, with a concurrency of 10:

Server         Cantaloupe  IIPsrv  Loris  RAIS
Success        2087        2300    2100   1345
Total          2300        2300    2300   2300
Crop           696         480.5   675.5  337
Full image     null        4334.5  null   null
Information    108.5       546.5   73.5   71
Resized image  null        4320    null   6799
JPEG-2000 (concurrency = 10)

Server         Cantaloupe  IIPsrv  Loris   RAIS
Success        2245        0       2300    108
Total          2300        2300    2300    2300
Crop           969         null    643.5   9992
Full image     594.5       null    1070.5  9523
Information    559         null    85.5    95
Resized image  3885.5      null    1480    null
JPEG (concurrency = 10)

Server         Cantaloupe  IIPsrv  Loris   RAIS
Success        2295        0       2300    145
Total          2300        2300    2300    2300
Crop           142         null    261.5   9977.5
Full image     7158.5      null    679.5   9620
Information    112.5       null    76      93
Resized image  2782.5      null    1124.5  null
TIFF (concurrency = 10)

Although the TIFF format takes more disk space, it seems to be the best format on average for large images. Loris has the best results combined with TIFF. It is the only server that managed to handle this format and JPEG for large files. IIPsrv is the only server that managed to complete all requests with the JPEG-2000 format.

Performance in those 3 cases was close to the results obtained with a concurrency of 1. This can be explained by the fact that these implementations spawn one process per core, and the bare metal server used for the benchmark has 8 cores.

Conclusion

IIIF servers are usually optimized to allow fast navigation on high quality images. In the context of unpredictable high throughput, results showed that choosing the best format and server depending on the use case could have a major impact on performance. Response times generally explode under high concurrency, which may cause errors during Machine Learning processes.

We have been running a Cantaloupe server, and a first step would be to update its configuration to use libjpeg-turbo. This change could have a huge impact and make certain jobs 10 times faster. Depending on the need, different comparisons of servers and formats are possible. JPEG, which is a light, fast and widely adopted format, seemed to be the best option to store images in our case.

If we really want to optimize speed on crop and resize operations, we may consider switching to Loris. This server seems to handle concurrency pretty well on those operations, and is closer to our tech stack as it is written in Python.

The benchmark we created to generate this data is free and open-source software. Any re-use or contribution to this project is welcome, and could help to:

  • Update a server to a newer release
  • Add and compare results with a new server
  • Compare a different image set or format
  • Test a specific IIIF feature