Teklia - What not to do when implementing IIIF

In a previous blog post, we talked about the performance concerns with IIIF servers for Machine Learning workflows, but performance is not the only factor that we had to take into account. Well before we needed high performance from servers during the development of Arkindex, we ran into numerous issues caused by IIIF servers that do not comply with the IIIF specification, as well as by a lack of knowledge about the workflows that clients need.

This post documents some of those issues, both to help other IIIF consumers when interacting with external servers, and help implementors or server administrators in trying to avoid them. A few other posts will be published later to share how we managed those issues in Arkindex or how some can be detected automatically.

IIIF Image API

IIIF is divided into multiple APIs, with the two most important ones being the Presentation API and the Image API. Most of our troubles appeared with the Image API, mainly because it is the most widely supported and used API, both in our Arkindex platform and by others.

The API is divided into two parts:

The Image Information Request, more informally known as the info.json, a JSON file which describes how the image can be accessed with this server;
The Image Request, which returns an actual image with the specified processing parameters.
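To make the two endpoints more concrete, here is a minimal Python sketch of the URI templates defined by the Image API 2.1. The function names are ours, and the base URL and identifier used in the usage note are placeholders, not a real server.

```python
from urllib.parse import quote


def info_url(base: str, identifier: str) -> str:
    """Image Information Request: {base}/{identifier}/info.json"""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


def image_url(
    base: str,
    identifier: str,
    region: str = "full",
    size: str = "full",
    rotation: str = "0",
    quality: str = "default",
    fmt: str = "jpg",
) -> str:
    """Image Request: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}"""
    # The identifier is percent-encoded so that any slash it contains
    # is not mistaken for a path separator.
    return (
        f"{base.rstrip('/')}/{quote(identifier, safe='')}"
        f"/{region}/{size}/{rotation}/{quality}.{fmt}"
    )
```

For example, image_url("https://iiif.example.org/iiif", "abc", region="10,20,300,40") requests a 300 by 40 pixel crop, such as a single line of text, at its original resolution.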

Both of those API endpoints have been the source of various issues.

HTTP status codes

As a consumer of an HTTP resource, you can generally expect that the server will return an HTTP status code that is relevant to the actual status of the response: for example a 200 code for a successful request, a 400 for an error made on the consumer side, or a 500 for an error on the server's side.

It is common for proprietary APIs that do not follow any standards to return strange status codes, but with a proper specification like IIIF, clients can usually expect sensible values.

The Image API specification explicitly defines a meaning for some HTTP status codes. Those codes are very common in HTTP.

A 400 Bad Request means that the client made a request that does not fit the IIIF-defined URI syntax extension.
A 401 Unauthorized or a 403 Forbidden can only be related to authentication issues, and are usually associated with the use of the IIIF Authentication API.
A 404 Not Found occurs if the image is not found, or if the specified parameters, despite being valid according to the specification, cannot be satisfied by the server.

This last code strays away from the intended HTTP definition: most REST APIs will usually return a 400 error code, not a 404, if the parameters are unsupported. A large portion of the IIIF servers do not return either 400 or 404 for invalid parameters anyway, making this detail irrelevant.

Some servers will throw exceptions for invalid parameters or non-existent images, causing a 500 Internal Server Error, and some others will cause a 403 Forbidden as if the request parameters were used for authentication.

The only option for a client would then be to check that the status code is strictly 200 and nothing else, since no error code can be reliably interpreted as anything other than "Something went wrong". But a 200 status code does not mean a success either.

Some servers will return a 200 OK response with a Content-Type header set to a valid image MIME type, but will, instead of returning an actual image, return an error message. This tricks clients into thinking the request is successful and causes errors or crashes when trying to parse the image.
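One way a client can protect itself is to look at the first bytes of the body before handing it to an image library. The sketch below is only an illustration of that idea: the function name and the short list of file signatures are our own choices, and a real client would probably go on to decode the image fully.

```python
# File signatures ("magic numbers") for a few common image formats.
# This list is deliberately short and not exhaustive.
MAGIC_NUMBERS = {
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/gif": (b"GIF87a", b"GIF89a"),
    "image/tiff": (b"II*\x00", b"MM\x00*"),
}


def is_plausible_image(status: int, content_type: str, body: bytes) -> bool:
    """Return True only if the response looks like the image it claims to be."""
    # Any status other than 200 is treated as "something went wrong".
    if status != 200:
        return False
    # Drop parameters such as "; charset=utf-8" from the header value.
    mime = content_type.split(";")[0].strip().lower()
    signatures = MAGIC_NUMBERS.get(mime)
    if signatures is None:
        return False
    # An error message disguised as an image will not start with a signature.
    return body.startswith(signatures)
```

With this check, a 200 OK response declared as image/jpeg whose body is actually a text error message is rejected before it can crash an image parser.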

Management of invalid parameters

An invalid parameter is simply a parameter that doesn't make sense in an image context: for example, requesting a resize to zero pixels or cropping an image beyond its bounds. IIIF defines the expected behavior for some of these unexpected values: for example, if you ask to crop an image to a rectangle larger than the image itself, then there is no crop at all and the full image is returned. If you ask for zero pixels, you should get an HTTP 400 or 404 status code.

As mentioned above, some servers will not return the proper HTTP status codes, but some servers will also choose to return a valid image for invalid parameters. This could arguably be acceptable if the server were so well implemented that it went beyond the IIIF specification and supported combinations of parameters that would otherwise be impossible.

We do not currently know of such servers. Instead, it is entirely possible for a server to give you a full-sized image or show undefined behavior when it receives invalid parameters. What this means for IIIF clients is that they must strictly verify that they get the image they expect. For example, the client can check the image's file format, its size, or its color channels to check for a proper resize, rotation or requested quality.
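As an illustration of the size check, the sketch below computes the dimensions a response should have for a given request, so they can be compared with the dimensions of the decoded image. It is a simplification under our own assumptions: it only handles the "full" keyword, pixel regions written as x,y,w,h, and sizes written as "w,", ",h" or "w,h". The function names and the one-pixel tolerance for rounding differences are ours, not part of the specification.

```python
def expected_size(full_w: int, full_h: int, region: str = "full", size: str = "full"):
    """Return the (width, height) expected for a region and size request."""
    if region == "full":
        w, h = full_w, full_h
    else:
        x, y, rw, rh = (int(v) for v in region.split(","))
        # A region that extends beyond the image is cropped at its edges.
        w = min(rw, full_w - x)
        h = min(rh, full_h - y)
    if w <= 0 or h <= 0:
        raise ValueError("empty region: the server should return an error")

    if size != "full":
        sw, sh = size.split(",")
        if sw and sh:  # "w,h": exact size, aspect ratio may change
            w, h = int(sw), int(sh)
        elif sw:  # "w,": height follows the aspect ratio
            w, h = int(sw), round(h * int(sw) / w)
        else:  # ",h": width follows the aspect ratio
            w, h = round(w * int(sh) / h), int(sh)
    if w <= 0 or h <= 0:
        raise ValueError("empty size: the server should return an error")
    return w, h


def size_matches(actual, expected, tolerance: int = 1) -> bool:
    """Compare two (width, height) pairs, allowing for rounding differences."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected))
```

A server that silently returns the full image instead of the requested crop fails this comparison, and so does one that appends extra rows of pixels to the image.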

Management of unsupported parameters

While an invalid parameter is one that the server couldn't understand, an unsupported parameter is one that the server understands (that is, it falls within the IIIF specification), but that this particular server doesn't support. The specification states this should cause an HTTP 501 code, standing for "not implemented". This code is rather rarely seen in APIs, so a client that overlooks this part of the specification would rarely notice this code. We do not know of any IIIF server implementations that actually use this code, so the lack of implementation on the client side is not much of a concern.

Most IIIF servers will treat unsupported parameters like invalid parameters and return an HTTP 400 or 404 code. This is already better than a 500 or a "fake 200" code, as this makes it clear to the client that it did something wrong and the server is not at fault. However, some implementations will instead try to "clamp" the parameters down to a strict set of parameters that they can accept.

The most common use of IIIF is with web viewers like Mirador, which is based on OpenSeadragon. OpenSeadragon, like other viewer libraries, will make use of the tiling features of the Image API to load smaller tiles one by one, allowing servers to cache the most common requests and reducing bandwidth usage.

Some servers take this one step further by only allowing either the full resolution, unmodified image, or the pregenerated tiles, with nothing in between. To compensate for some possible miscalculations on the viewer's part, instead of returning an error code, the server looks for the tile that would be the closest to the requested crop or resize and will return it instead.

However, in Machine Learning workflows, we will likely request very specific crops and resizes. For example, we often try to retrieve a single line of text, to get the only specific image quality or format that a model supports, or to pass along any settings that a user configured on our platform such as a rotation angle. Getting a tile, which could have the wrong format, the wrong size, the wrong quality or the wrong rotation, just causes more errors and forces us to request a full image and do the image processing on our side; this cancels out any benefits in terms of bandwidth usage.

Images with footers

While developing Arkindex, we encountered an IIIF server with an unusual behavior: when requesting the original, full-sized image, or when requesting the tiles advertised in the info.json manifest, everything works as expected. But when requesting any crop, resize, rotation, quality or format other than those defaults, the returned image included an extra footer with the logo of the institution that hosts the image.

This means two things: first, the image is not of the expected size, and second, text recognizers could recognize the text in the institution's logo as if it were part of the image. This behavior is definitely not standard, and the Presentation API is supposed to provide the copyright or source information itself without having to add it to the image. Due to this dubious feature, it's again apparent that clients must re-verify every response from a server.

Advertised features

The Image Information Request can be used by a client to find out exactly what the server is capable of. The Image API defines compliance levels: lists of features that can be optional or mandatory depending on the level. Level 0 was designed to be implemented using a completely static website, not requiring any special-purpose server; all that is required is a specific directory structure to mimic the Image API's URI syntax. Each level adds more and more required features.

Servers can advertise that they are compliant with a specific level of the IIIF Image API, and additionally add any optional features that they also provide. For example, a fully-static server could implement Level 0 and additionally provide 90-degrees rotations in its files. An info.json file could then contain, among the other image metadata, a Profile Object:

{
  "profile": [
    "http://iiif.io/api/image/2/level0.json",
    {
      "supports": [
        "rotationBy90s"
      ]
    }
  ]
}

This system enables clients to fully understand what a server is capable of, and possibly circumvent its limits. For example, if mirroring is not supported, then the client might mirror the image by its own means after downloading it.

Many servers however specify a profile that does not truly match what they are capable of. This can be due to server administrators copy-pasting from the specification's examples, not actually testing the features before advertising them, or not updating their server's profile depending on its configuration. While the server profile could be a very useful tool, at the very least for troubleshooting, it usually cannot be trusted.
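Even if it cannot be fully trusted, reading the profile is a reasonable starting point for troubleshooting. The sketch below, with a function name of our own, extracts the compliance level URI and the explicitly listed features from an info.json that has already been parsed into a dictionary. It does not expand the features implied by the compliance level itself.

```python
from typing import Optional, Set, Tuple


def advertised_features(info: dict) -> Tuple[Optional[str], Set[str]]:
    """Return (compliance level URI, explicitly advertised features)."""
    profile = info.get("profile", [])
    # The profile may be a single URI or a list mixing a URI and objects.
    if isinstance(profile, str):
        profile = [profile]

    level = None
    features = set()
    for item in profile:
        if isinstance(item, str) and level is None:
            level = item
        elif isinstance(item, dict):
            features.update(item.get("supports", []))
    return level, features
```

A client can then compare this set with what the server actually returns for a few test requests, and keep its own record of the features that really work.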

Maximum size settings

One of the settings defined in the aforementioned Profile Object is the maximum size of returned images. Servers that have to handle very large images are often configured to refuse to return the entire image or very large crops, to avoid overloading themselves. To announce to clients that they may not be able to exceed a specific size, the IIIF specification defines three properties: maxWidth, maxHeight and maxArea. The latter defines a maximum number of pixels, no matter the exact dimensions.

While the specification defines that requests exceeding those parameters should be rejected, most servers will actually accept those requests but return an image resized to those maximum sizes. While this could be a useful feature, it leads to confusion and miscommunication when the server does not properly advertise its maximum size settings.

Some servers will just not include any of the settings, and some will advertise their maximum width or height using root properties:

{
  "profile": "http://iiif.io/api/image/2/level2.json",
  "maxWidth": 4000
}

This means that, again, the Image Information Request cannot be trusted. The only reliable way to find out the maximum width or height is to manually make multiple requests on very large images and look at the dimensions of the images that are actually returned.
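Before resorting to manual probing, a tolerant client can at least look for the limits in both places. The following sketch, whose function name is our own, reads maxWidth, maxHeight and maxArea from the Profile Object first and then falls back to the root of the info.json, as in the example above. The result is still only what the server advertises, not a guarantee.

```python
def max_size_limits(info: dict) -> dict:
    """Collect advertised size limits from the profile, then from the root."""
    profile = info.get("profile", [])
    if isinstance(profile, str):
        profile = [profile]

    # Profile objects come first; the root of the document is checked last.
    sources = [item for item in profile if isinstance(item, dict)]
    sources.append(info)

    limits = {}
    for source in sources:
        for key in ("maxWidth", "maxHeight", "maxArea"):
            value = source.get(key)
            # Keep the first integer value found for each property.
            if isinstance(value, int) and not isinstance(value, bool) and key not in limits:
                limits[key] = value
    return limits
```

An empty result does not mean that the server has no limit, only that it does not announce one.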

IIIF Presentation API

The Presentation API is used to assemble collections of images and link them to various metadata, including source information and physical dimensions. It also enables annotating images, for example by putting text in a rectangle to display a single line of text.

Most of the issues we had with this API were related to the IIIF viewers; those usually were implementing only the small portion of the specification that had examples within <pre> tags, and ignored everything else. Our Arkindex IIIF import implementation, however, also fell into this trap.

All of the IIIF specifications rely on JSON for Linked Data. JSON-LD defines a way to specify that a JSON document has a specific schema, akin to a Document Type Definition (DTD) or XML Schema Definition (XSD) for XML files. While this can be useful for data validation, it also means that the complexity of reading any response from an IIIF server is heavily increased.

The JSON-LD Considerations section describes some of the ways in which the typical IIIF manifest can be modified by JSON-LD:

Any property can be a URL to any file that holds the actual value of this property. It does not have to be a JSON file or use HTTP.

{
  "description": "ftp://host/path/to/file.xml"
}

Most properties can be an array of values instead of a single value.

{
  "description": [
    "This is a description.",
    "This is another description!"
  ]
}

Any value can be defined as an object to give it other metadata, in particular multilingual support:

{
  "description": {
    "@language": "en",
    "@value": "This is a description."
  }
}

It is possible to combine all of those characteristics together:

{
  "description": [
    "This is a description.",
    {
      "@language": "fr",
      "@value": "Ceci est une description."
    },
    {
      "@id": "ftp://host/path/to/file.xml",
      "@format": "text/xml"
    }
  ]
}

This situation means that the few examples given in code block sections of the specifications are not representative at all of what could happen with real-life data. Relying on JSON-LD also brings problems of its own:

When JSON-LD schemas are given as URLs, some validators will download the schema every time to validate it, meaning that if the server hosting the original specification goes down, a large number of servers will go down with it. This has caused some outages on ActivityPub servers, which also rely on JSON-LD.
This can introduce security issues, since the manifest could instruct a JSON-LD validator to load anything from anywhere.
JSON-LD is not very common and libraries are not available in every language, making it even harder to entirely comply with IIIF.

Because most IIIF viewers do not support most of those JSON-LD features, and since most IIIF Image and Presentation API servers have been developed and deployed solely for the purpose of displaying manifests within those viewers, encountering the Linked Data portion of JSON-LD is rather rare. However, multilingual strings can sometimes be seen, and care should be taken to handle an array of values and a @language attribute in most strings to avoid errors with more advanced manifests.
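A tolerant reader for such properties can be sketched in a few lines of Python. This is only an illustration with a function name of our own, not the actual Arkindex implementation: it flattens a property into (language, text) pairs and skips objects that only point to an external resource. Plain strings are kept as they are, even when they happen to look like URLs.

```python
def text_values(value) -> list:
    """Flatten a manifest property into a list of (language, text) pairs."""
    if value is None:
        return []
    if isinstance(value, str):
        # A plain string has no declared language.
        return [(None, value)]
    if isinstance(value, list):
        pairs = []
        for item in value:
            pairs.extend(text_values(item))
        return pairs
    if isinstance(value, dict) and "@value" in value:
        return [(value.get("@language"), str(value["@value"]))]
    # Anything else, such as {"@id": ...}, refers to an external resource
    # and is ignored here rather than downloaded.
    return []
```

Applied to the combined example above, this returns the plain description and the French one, and drops the reference to the XML file.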

In Arkindex, we had initially entirely overlooked the JSON-LD part in our IIIF manifest import. We now have support for multilingual strings, and external URLs are ignored for security reasons. We also ignore the schema validation part to allow for any workarounds that we might need, in case we have no choice other than to support a non-compliant manifest to complete a project.

General issues

More general issues related to the administration of IIIF servers have also hampered our development of Arkindex. While these are not related to IIIF specification compliance, they can also make the use of IIIF resources more complicated.

HTTPS and CORS

HTTPS is now both ubiquitous and very easy to set up, particularly with Let's Encrypt. Modern web browsers usually make HTTPS a requirement nowadays, especially when fetching resources using JavaScript. If a Mirador instance hosted on an HTTPS server tries to load a manifest from an IIIF server that only has HTTP, the web browser can sometimes reject the request.

Another security feature that IIIF server administrators should care about is cross-origin resource sharing (CORS). CORS defines a set of extra HTTP headers that instruct a web browser when to accept or reject an HTTP request from JavaScript code that points towards an external domain, relative to the current webpage. It is a controlled relaxation of the browser's same-origin policy, which protects users against various cross-site attacks, such as a malicious page reading data from another website on their behalf.

The IIIF specification recommends that servers set the Access-Control-Allow-Origin header to *, which tells web browsers that requests can be made to the IIIF server from anywhere. This is the standard setting for many public APIs as this means that any JavaScript application anywhere can access the data.

Over the last few years, we have seen most of the HTTP-only IIIF servers we knew of be forced to migrate to HTTPS and configure CORS properly. As mentioned earlier, most of those servers have only been built for and tested on IIIF web viewers such as Mirador: with the new web browser restrictions, HTTPS and CORS become mandatory.

However, it is still important to know that some servers used to use HTTP, so clients may still find HTTP URLs in some manifests. Some servers that migrated to HTTPS do not always have the HTTPS redirection configured properly, so trying the HTTPS version of a URL when HTTP fails can sometimes bring better results.
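The fallback itself is simple to sketch. The helper below, whose name is our own, returns the HTTPS version of an HTTP URL so that a client can retry with it after a failure. URLs with an explicit port are left alone, since the matching HTTPS port cannot be guessed.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def https_variant(url: str) -> Optional[str]:
    """Return the HTTPS equivalent of an HTTP URL, or None if there is none."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # With an explicit port, we cannot know where HTTPS is served, if at all.
    if parts.port is not None:
        return None
    return urlunsplit(parts._replace(scheme="https"))
```

A client would call this only after the original request has failed, and should still validate the response from the HTTPS URL like any other.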

No documentation

As developers, most of the time, what we will get from our users is a bug report with a URL to an image and nothing more. For some servers, we can find out which institution hosts the server, and maybe find a page of documentation somewhere on their website mentioning that they have an IIIF server. However, that documentation is usually written for users who do not know IIIF, and many times will just link to the IIIF specification.

If we find out that an image inside a manifest does not exist, or that a server does not behave properly at all, finding which contact form we should fill out or which email address we should contact can be very hard. A WHOIS lookup might sometimes return an email address for abuse reports or security issues, but this would not be the right point of contact. Using a university's contact form means an inquiry could take quite a while to be dispatched to the right person as well, if we even get any answer.

In a few research projects, the main point of contact for that project is able to navigate through the institution by themselves and warn of the issue. Most of the time, we end up adding another workaround specific to the server on our platform or ignoring some missing pages.

This does not have to be so hard. Just adding an email address, a specific service to contact, or a contact form is enough to allow users to give feedback on the server. Even if a server looks like it is running perfectly, it might end up going down or having an issue sometime later, so having a way to report bugs is always better.

No redirections

Another problem we encountered with some servers is that over time, data gets moved around, and URLs suffer from link rot. Images and manifests look like they have disappeared entirely, when they were actually just moved to another directory or server.

There are multiple HTTP status codes to handle redirections and you are probably already encountering them dozens of times a day while browsing online. Configuring an IIIF server to report an HTTP 301 Moved Permanently is a simple way to tell any client that they should now use another URL.

The alternative would be, again, to document the changes more. If the server is only used for one particular research project, notifying all the stakeholders can be enough, but if the server is public, it's more complicated: there's no way to truly know who might be using the server and who might need some help to cope with any changes on it.

Conclusion

While IIIF does bring promising opportunities for making digital archives available online, it also comes with its fair share of issues. At Teklia, we do not blame anyone for any of those issues, as we made our own mistakes as well. However, we hope that something better can be achieved. We are ready to provide our expertise with IIIF to libraries, research institutions or any other entity with digital archives to share.

In this post, we only mentioned IIIF 2.1, as this is the most widely implemented version. IIIF 3.0 came out over a year ago now, and does make a few interesting attempts at fixing some of the issues discussed in this post. We hope that, as the adoption of IIIF 3 begins, things will change for the better.

We will soon post some tips on how to use automated tools to make a server more compliant with an IIIF specification more easily, as well as how we coped with all of the issues mentioned here in our Arkindex product. Stay tuned!