IIIF at Scale

Tristan Roddis, Cogapp, UK, Rob Lancefield, Yale Center for British Art, USA, Jeffrey Campbell, Yale University, USA, Stefano Cossu, J. Paul Getty Trust, USA, Neil Hawkins, Cogapp, UK

Abstract

The International Image Interoperability Framework (IIIF) is rapidly gaining traction as the preferred way to share open-access images and metadata for cultural and scientific institutions worldwide. It provides efficient and easy use and reuse of high-resolution and zoomable images, and can provide internal and external users with powerful tools for scholarship, collaboration, storytelling, and more. However, once you are convinced by the why, there remains the tricky question of how. This presentation aims to demystify the tools and techniques needed to deliver IIIF images and metadata at scale by illustrating different approaches suited to image counts ranging from a few hundred to several million.

Keywords: iiif

The International Image Interoperability Framework (IIIF, https://iiif.io/) is a protocol, or set of specifications, for delivering images and metadata over the Web. A community of volunteers maintains these specifications, and open source and commercial software products implement them. 

IIIF is rapidly gaining traction as the preferred way to share open-access images and metadata for museums and other cultural and scientific institutions worldwide. It allows the easy use and reuse of high-resolution and zoomable images, and can provide internal and external users with powerful tools for scholarship, collaboration, storytelling, and more. More than a billion images are available online via the IIIF protocol today (Keller & Cramer, 2018).

Adopting IIIF also has significant advantages in regard to efficiency of human and computer resources in managing image and metadata delivery. However, the task of understanding and implementing even the core components of IIIF can be daunting, and the most suitable approaches to doing so often vary according to the amount of content that needs to be made available.

In this paper, we survey the processes and software needed to provide four key elements of the IIIF stable, namely these Application Programming Interfaces (APIs):

  • Image API: delivering images in a standards-based and easy-to-use manner 
  • Presentation API: describing those images in context and how they link to other images
  • Content Search API: finding “annotations” for those images (e.g., transcribed text or educational commentary) 
  • Authentication API: managing access to restricted resources based on access policies

Each section starts with a quick explanation of what each service provides, followed by a look at suitable implementations for small, medium, and large installations. We also discuss workflow considerations involved in preparing your data for delivery via these APIs.

Image API

Why?

Implementing the IIIF Image API specification (https://iiif.io/api/image/) allows websites to easily provide resized and cropped versions of their images simply by changing URL parameters, without the need to pre-generate derivatives of the images for all the possible variations. This is a significant advantage in regard to software and content management. In combination with open-source tools such as OpenSeadragon (https://openseadragon.github.io/), IIPMooViewer (https://github.com/ruven/iipmooviewer), and Leaflet (https://leafletjs.com/), it also allows the easy embedding of zoomable image viewers (figure 1). Furthermore, this API allows custom experiences using images, when combined with suitably modified viewers. One example is “slow looking” (figure 2, http://slowlooking.cogapp.com/).
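To make the "changing URL parameters" idea concrete, the following minimal sketch builds Image API request URLs from their path segments (identifier, region, size, rotation, quality, and format). The server address and image identifier are hypothetical, and the parameter values shown follow the Image API 2.1 and 3.0 conventions; check the specification version your server implements.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an Image API request URL from its path segments."""
    # The identifier is URL-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
BASE = "https://example.org/iiif"

# The whole image at the largest size the server will deliver.
full_image = iiif_image_url(BASE, "horse-painting")

# The whole image scaled to 200 pixels wide (height follows the aspect ratio).
thumbnail = iiif_image_url(BASE, "horse-painting", size="200,")

# A crop starting at x=1000, y=500, 800 wide by 600 high, scaled to 400 wide.
detail = iiif_image_url(BASE, "horse-painting",
                        region="1000,500,800,600", size="400,")
```

All three URLs are served from the same source image; no derivative has to be generated in advance.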

Image of a painting of a horse

Figure 1: Zoomable image viewer from the Yale Center for British Art.

 

Slow looking intro screen

Figure 2: Slow looking on the Clyfford Still Museum website (https://collection.clyffordstillmuseum.org/).

How?

The scalability of a system can be measured along three axes: availability, volume, and traffic. These axes can affect each other but may also be independently sized: e.g., a small data set may receive a very large amount of traffic, or a set of critically important data may receive low traffic but cannot afford to miss hits. This is especially important for the Image API, which is by far the most influential piece for all of these factors. Therefore, choosing an appropriate image delivery technology is critical to operating at scale. 

To implement the IIIF Image API, special server software is required that can understand URL requests and convert source images on disk into a format suitable for delivery online (usually JPEG). Popular open-source server software includes IIPImage (https://iipimage.sourceforge.io/), Cantaloupe (https://cantaloupe-project.github.io/), and Loris (https://github.com/loris-imageserver/loris). The best approach to use varies according to the amount of content, surrounding infrastructure, and other factors. 

In order for an image server to work efficiently, it needs a specially formatted derivative that embeds several resolutions of the same image. This is treated in further detail below in the “image preparation workflow” section. For now, suffice it to say that converting an institution’s high-quality images (e.g., TIFFs produced by an imaging department) to these specially formatted derivatives (just one per source image) is a strongly recommended step for using the IIIF Image API. It is also the only image conversion that a IIIF adopter needs to worry about.

Tens of images: single server

For a small number of images, a single standalone server is sufficient. In practice this is likely to be a virtual machine or Docker container, but the idea is the same: the IIIF server handles all requests, and it sources the images from its disk or attached storage (figure 3).

Single server

Figure 3: For a low-traffic website with a small number of images, a single server will work.

Other options for small installations, as well as larger ones discussed below, include using a third-party managed hosting provider such as IIIF Hosting (https://www.iiifhosting.com/) or leveraging existing infrastructure by using a IIIF-compliant digital asset management system (e.g., Luna, http://www.lunaimaging.com/iiif) or collections management system module (e.g., eMuseum https://www.gallerysystems.com/products-and-services/emuseum/). The latter approaches can have data-security implications that are beyond the scope of this paper, but are important to assess.

Thousands of images: server cluster

For larger installations involving thousands of images and moderate traffic, resilience becomes an important issue. Switching from a single server to a cluster of two or more IIIF servers (figure 4) eliminates a single point of failure and allows you to cope with spikes in demand by adding more servers to the cluster (horizontal scaling).

Two servers with shared storage, behind a load balancer

Figure 4: For higher-traffic sites, use a cluster of servers.

While a small number of images could be replicated across each server, with a larger number it is best to maintain a single copy of each image in order to reduce storage costs (this presumes, of course, robust backup of all images as well). These may be placed on a shared storage system such as Amazon’s Elastic File System (EFS, https://aws.amazon.com/efs/), dedicated storage appliances, or a storage cluster such as Gluster (https://www.gluster.org/). Not all image servers support all storage options, so the safest options are those that can exactly mimic disk storage. This architecture needs to route traffic between the different members of the server cluster; one way to do this is with a dedicated load balancer.

Millions of images: CDN, autoscaling cluster, low-cost storage

When working with millions of images, storage costs are especially important. This makes it important to choose the lowest-cost system available that offers instant, reliable access. If using Amazon Web Services, for example, the Simple Storage Service (S3) is over ten times cheaper than EFS ($22.88/TB versus $298.50/TB at the time of writing).
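The effect of that price difference at scale can be estimated with simple arithmetic. The sketch below uses only the two per-terabyte prices quoted above; the image count and average file size are hypothetical, and the prices are treated as applying to whatever billing period the provider uses.

```python
# Prices per terabyte as quoted in the text (at the time of writing).
S3_USD_PER_TB = 22.88
EFS_USD_PER_TB = 298.50


def storage_cost(image_count, avg_image_mb, usd_per_tb):
    """Estimate storage cost for a collection, using decimal units (1 TB = 1,000,000 MB)."""
    total_tb = image_count * avg_image_mb / 1_000_000
    return total_tb * usd_per_tb


# How many times more expensive EFS is than S3 at these prices (about 13).
ratio = EFS_USD_PER_TB / S3_USD_PER_TB

# Hypothetical collection: two million derivatives averaging 50 MB each (100 TB).
s3_cost = storage_cost(2_000_000, 50, S3_USD_PER_TB)
efs_cost = storage_cost(2_000_000, 50, EFS_USD_PER_TB)
print(f"S3: ${s3_cost:,.2f}  EFS: ${efs_cost:,.2f}  ratio: {ratio:.1f}x")
```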

Two servers behind a CDN

Figure 5: Use low-cost storage and aggressive caching to deal with lots of images and high traffic.

If your website receives a lot of traffic, it will benefit from a suitable caching strategy. This can take many forms, such as:

Of these, a content delivery network (CDN) placed in front of the IIIF servers (figure 5) provides the further advantage of improved protection against distributed denial-of-service (DDoS) attacks, by keeping malicious traffic from even reaching your IIIF servers.

Finally, to avoid having to manually maintain a server cluster, and to reduce server costs during times of low traffic, you can implement auto-scaling rules based on server load, traffic volume, etc. Doing this can be relatively straightforward or more complex, depending on which Docker orchestration system or cloud hosting provider you are using.
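An auto-scaling rule of this kind can be as simple as resizing the cluster in proportion to how far the observed load is from a target. The following is a minimal sketch of such a proportional rule, not the rule of any particular orchestration system; the function name, the load metric (average CPU percentage), and the minimum and maximum cluster sizes are all assumptions for illustration.

```python
import math


def desired_servers(current_servers, current_load_pct, target_load_pct,
                    min_servers=2, max_servers=10):
    """Return the cluster size that would bring average load near the target.

    Loads are average percentages across the cluster (e.g. CPU utilization).
    """
    if target_load_pct <= 0:
        raise ValueError("target_load_pct must be positive")
    wanted = math.ceil(current_servers * current_load_pct / target_load_pct)
    # Never drop below the minimum needed for resilience, nor exceed the budget cap.
    return max(min_servers, min(max_servers, wanted))
```

For example, two servers running at 90% against a 60% target would be scaled up to three, while four servers idling at 15% would be scaled down to the minimum of two.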

Image preparation workflow

Along with strategies for delivering images, you will need to consider how to prepare them in a suitable format for delivery by IIIF servers. This involves converting any images to “pyramidal” formats that store derivatives at multiple resolutions within a single file. At the time of writing the two supported formats are JPEG2000 and pyramidal TIFF (PTIFF).

Tens of images: manual conversion

For a very small number of images, it is sufficient to do this manually using image editing software that supports either of these formats (e.g., Adobe Photoshop, GraphicConverter, or Pixelmator).

Thousands of images: command line

With hundreds or thousands of images to process, opening each one in an editor becomes unsustainable. At this point you should consider ways to automate the operation. Popular ways to do this include command line tools such as ImageMagick and VIPS (or in certain cases, an automated batch process in software such as Adobe Bridge using Photoshop Actions). If your target format is JPEG2000, two utilities that can help are the open source OpenJPEG library (https://www.openjpeg.org/) and the proprietary Kakadu library (https://kakadusoftware.com/).
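As one possible way to automate this, the sketch below wraps the VIPS command line tool to convert a folder of TIFFs into tiled pyramidal TIFFs. It assumes the vips executable is installed and on the path; the tile size, JPEG quality, and folder layout are illustrative choices rather than recommendations from the institutions discussed here.

```python
import subprocess
from pathlib import Path


def ptiff_command(source, target, tile_size=256, jpeg_quality=90):
    """Build the vips command that writes a tiled, pyramidal, JPEG-compressed TIFF."""
    return [
        "vips", "tiffsave", str(source), str(target),
        "--tile", "--pyramid",
        "--compression", "jpeg", "--Q", str(jpeg_quality),
        "--tile-width", str(tile_size), "--tile-height", str(tile_size),
    ]


def convert_folder(source_dir, target_dir, dry_run=True):
    """Convert every .tif in source_dir; with dry_run, only return the commands."""
    target_dir = Path(target_dir)
    commands = []
    for source in sorted(Path(source_dir).glob("*.tif")):
        command = ptiff_command(source, target_dir / source.name)
        commands.append(command)
        if not dry_run:
            target_dir.mkdir(parents=True, exist_ok=True)
            # check=True raises an error if vips reports a failed conversion.
            subprocess.run(command, check=True)
    return commands
```

Running with dry_run=True first makes it easy to inspect the commands before committing to a long batch job.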

Image conversion and upload to your storage are ideally integrated into the normal image preparation process: for example, as triggers when an image is uploaded to a digital asset management system (DAMS) or collection management system, or by nightly tasks that poll their APIs. This is called an ETL (Extract, Transform, Load) pipeline and is often implemented with custom software that reflects the individual complexity of an institution’s IT architecture. 

Yale employs a custom workflow database to support the museums’ needs related to digital assets. Each museum adds records to this database and copies the associated master files to a storage location. The custom application then submits the master images to a digital preservation system (this is not part of the IIIF workflow), creates a 4K derivative of each image for ingest into the enterprise-scale DAM (also not part of IIIF workflow), and triggers a workflow to create and store a pyramidal TIFF as the file that will be used by the IIIF image server.

The Getty maintains its own ETL software (Cossu, 2018), which pulls data from sources owned by different Getty Programs: the GRI Library and Archives, the Museum Collection Management System and DAMS, etc. The complexity and volume of this data required a solid application framework that supports starting and managing ETL jobs via a web interface and REST API (https://en.wikipedia.org/wiki/Representational_state_transfer), visualizing logs, and, as planned, eventually relying on real-time notifications to trigger updates based on source data changes. This runs on one Amazon Elastic Compute Cloud (EC2) instance. Optimizing code and running many jobs in parallel has proven to be more efficient than horizontal scaling. More dynamic architecture has not been needed; even under peak load the main bottlenecks are the source data systems.

Institutions with simpler production workflows and less dynamic sources may opt for a less complex approach or even a prepackaged software solution. A reasonable tipping point for adding complexity is when someone is spending more time monitoring and assisting daily operations than writing code to automate them. In any case, there are very few scenarios in which a data set changes so little over time as to make completely manual updates viable. 

Millions of images: serverless architectures

If you need to convert lots of images very quickly, you need a system that can scale up very quickly but not incur excessive costs after initial bulk conversion is done. Temporarily renting high-capacity virtual machine (VM) server instances from cloud providers is one approach, but a more elegant solution is to use serverless computing such as AWS Lambda (https://aws.amazon.com/lambda/). These systems can process thousands of tasks in parallel, and they can automatically scale up or down according to need.

This technique is used for the Endangered Archives Programme site, which hosts millions of IIIF images. In this scheme, an AWS Lambda function that carries out the conversion to JPEG2000 is triggered by the arrival of an unconverted TIFF image in an S3 bucket (Roddis & Farquhar, 2018).
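The general pattern can be sketched as follows. This is not the code used for the Endangered Archives Programme; it is a minimal illustration of a function that reacts to an S3 upload notification, with the actual conversion left to a hypothetical convert callable that the caller supplies (for example, a wrapper around a JPEG2000 encoder).

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def derivative_key(source_key):
    """Name the JPEG2000 derivative after the source file, swapping the extension."""
    stem, _, _ = source_key.rpartition(".")
    return f"{stem or source_key}.jp2"


def handler(event, context, convert=None):
    """Entry point: convert each newly uploaded TIFF and report what was done."""
    results = []
    for bucket, key in uploaded_objects(event):
        if not key.lower().endswith((".tif", ".tiff")):
            continue  # ignore anything that is not an unconverted TIFF
        target = derivative_key(key)
        if convert is not None:
            convert(bucket, key, target)  # hypothetical conversion step
        results.append({"bucket": bucket, "source": key, "target": target})
    return results
```

Because each upload triggers its own invocation, a bulk load of many files is processed in parallel without any servers to provision or shut down afterwards.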

Presentation API

Why?

Implementing the IIIF Presentation API specification (https://iiif.io/api/presentation/) allows websites to provide metadata about their images in a standardized and interoperable way. Presentation “manifests” are files in JSON (JavaScript Object Notation, https://www.json.org) format that can carry structural information (e.g., about visible and raking-light images of a painting, reading order for pages in a book) and descriptive metadata (title, artist, accession number, materials, etc.).

Providing data in this standard format makes it possible to integrate third-party open-source viewers into your website. The most widely used IIIF viewers are Mirador (figure 6, https://projectmirador.org/) and the Universal Viewer (figure 7, https://universalviewer.io/). Providing manifests also enables interoperability with information from other repositories (figure 8).

Horse painting in Mirador

Figure 6: A Presentation manifest from the Yale Center for British Art displayed in Mirador.

 

Horse painting in UV

Figure 7: The same manifest displayed in Universal Viewer.

 

Horse painting and flower painting side-by-side in Mirador

Figure 8: Manifests from different organizations (Yale Center for British Art and J. Paul Getty Museum) displayed side-by-side in Mirador.

How?

Delivering manifests at different scales is only minimally affected by their sheer number, since unlike image files, these text files are lightweight to store and process. However, the optimal approach to producing manifests can change depending on their number and frequency of production, among other factors.

Tens of manifests: hand-crafted JSON

If you are only creating a few manifests, you can do so “by hand” by creating individual JSON files and putting these on a web server.

Text editor with JSON

Figure 9: Use a text editor with syntax highlighting to create JSON manifests.

To avoid errors you should use an editor with syntax highlighting (e.g., BBEdit, OxygenXML, or many others), and/or use a JSON validator service such as JSONLint (https://jsonlint.com/). To check the contents of a manifest, you can use the IIIF Consortium’s online validator (https://iiif.io/api/presentation/validator/service/) or the Tripoli validator (https://github.com/DDMAL/tripoli).
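Before submitting a hand-written manifest to one of these validators, a quick local check can catch the most common slips such as a stray comma or a missing identifier. The sketch below is only a basic pre-check under our own assumptions about which properties to look for; it does not replace the validators named above.

```python
import json


def check_manifest_text(text):
    """Return a list of problems found; an empty list means none were detected."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError as error:
        return [f"invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"]
    if not isinstance(manifest, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    if "@context" not in manifest:
        problems.append("missing @context")
    # Presentation API 2.x uses "@id" and "@type"; 3.0 uses "id" and "type".
    if "@id" not in manifest and "id" not in manifest:
        problems.append("missing identifier")
    if "@type" not in manifest and "type" not in manifest:
        problems.append("missing type")
    return problems
```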

Thousands of manifests: templating

Much better than creating manifests by hand is to use a templating system to generate them reproducibly from your source data. If the source data is available in a standard XML format such as LIDO (http://network.icom.museum/cidoc/working-groups/lido/what-is-lido/) you can use a transformation language such as XSLT (figure 10).

Source XML data being transformed by XSLT

Figure 10: Add your data to a standard template to pre-generate manifests.

Alternatively, if the data is available in other formats (e.g., JSON via an API call to your collections management system), you can use any programming language that can read and write JSON to convert the data from one format to the other. Once generated, these manifest files can be output statically, or stored in a database, before serving to site visitors.
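As an illustration of templating from source data, the sketch below turns one collection record into a single-image manifest. It follows the general shape of Presentation API 3.0 (version 2.x uses different property names), and the record fields, base URLs, and metadata labels are hypothetical; neither Yale nor the Getty is claimed to generate manifests this way.

```python
import json


def build_manifest(record, base_url, image_base):
    """Template a one-canvas manifest from a collection record."""
    manifest_id = f"{base_url}/{record['id']}/manifest.json"
    canvas_id = f"{base_url}/{record['id']}/canvas/1"
    image_service = f"{image_base}/{record['image_id']}"
    size = {"width": record["width"], "height": record["height"]}
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [record["title"]]},
        "metadata": [
            {"label": {"en": ["Artist"]}, "value": {"en": [record["artist"]]}},
        ],
        "items": [{
            "id": canvas_id, "type": "Canvas", **size,
            "items": [{
                "id": f"{canvas_id}/page", "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/painting", "type": "Annotation",
                    "motivation": "painting",
                    "body": {
                        "id": f"{image_service}/full/max/0/default.jpg",
                        "type": "Image", "format": "image/jpeg", **size,
                        # Points viewers at the Image API service for deep zoom.
                        "service": [{"id": image_service,
                                     "type": "ImageService3",
                                     "profile": "level1"}],
                    },
                    "target": canvas_id,
                }],
            }],
        }],
    }


# Hypothetical source record, e.g. from a collections management system API.
record = {"id": "obj-1", "image_id": "horse-painting", "title": "A Horse",
          "artist": "Unknown", "width": 6000, "height": 4000}
manifest = build_manifest(record, "https://example.org/iiif", "https://example.org/images")
manifest_json = json.dumps(manifest, indent=2)
```

The resulting string can be written to a static file or stored in a database, as described above.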

The process Yale is implementing for manifests involves harvesting metadata from each of its museums using the W3C Activity Streams protocol (https://www.w3.org/TR/activitystreams-core/) and writing that to a central database which will also contain image data from the custom workflow database mentioned above. Each museum will generate its Activity Stream from a collections management system or related system. A process will build manifests from this data and store them as flat JSON files in a central location.

The Getty creates manifests as part of the ETL pipeline that creates the IIIF images. Source records are pulled from TMS (Getty Museum) or Rosetta (Getty Research Institute). IIIF presentation structures are generated from these according to model configuration files. The manifests are stored as binary JSON fields in a Postgres database, which has the added benefit of being able to query elements inside the manifests.

Hundreds of thousands of manifests: on-the-fly

If you have information for hundreds of thousands of collection objects, chances are these already exist in some web-accessible database to power your online collection. In that case you may wish to retrieve the information directly from your database, search application server, or API, and reformat it to produce JSON (figure 11) instead of the HTML that you produce for site visitors.

A database generates templated files

Figure 11: Generate templated manifests on-the-fly from your data source.

You should be able to use the same programming tools that are available to template any other content for your online collection. Along these lines, the Getty plans to implement dynamic manifest creation to allow for custom collections (e.g., user-generated ones) to become portable as IIIF manifests; detailed plans for this are yet to be developed. 
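A minimal sketch of the on-the-fly approach is shown below as a small WSGI application. The in-memory RECORDS dictionary is a stand-in for your database, search application server, or API; the URL pattern and record fields are hypothetical, and the canvas list is left empty for brevity (a complete manifest needs at least one canvas).

```python
import json

# Stand-in for a collections database or search index (hypothetical records).
RECORDS = {"obj-1": {"title": "A Horse"}}

BASE_URL = "https://example.org/iiif"  # hypothetical public address


def manifest_for(object_id, record):
    """Reformat a stored record as a manifest instead of an HTML page."""
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{BASE_URL}/{object_id}/manifest.json",
        "type": "Manifest",
        "label": {"en": [record["title"]]},
        "items": [],  # canvases omitted here; build them from the record's images
    }


def app(environ, start_response):
    """Serve /<object id>/manifest.json by building the manifest per request."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    if len(parts) != 2 or parts[1] != "manifest.json" or parts[0] not in RECORDS:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    body = json.dumps(manifest_for(parts[0], RECORDS[parts[0]])).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        # Viewers hosted on other sites need cross-origin access to manifests.
        ("Access-Control-Allow-Origin", "*"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For local experimentation the application can be served with Python's built-in wsgiref.simple_server.make_server; in production the same logic would normally live inside whatever framework already powers the online collection.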

Content Search API

Why?

Implementing the IIIF Content Search API specification (https://iiif.io/api/search/) allows websites to provide structured access to “annotations” for each image. These annotations are mainly used to provide literal information about image regions, with one obvious use being to provide a transcription for handwritten or printed text in an image. Annotations are also starting to be used for other purposes including narrative text for online display (Roddis, 2018) and the teaching of art history. The Content Search API makes it possible to search these annotations, and optionally to provide coordinates for the matching regions of an image (figure 12) and/or spelling suggestions for possible matches.

Highlighted words on an image of a typewritten page

Figure 12: The Content Search API allows highlighting of matching image regions on documents on the Qatar Digital Library.

How?

Thousands of annotations: annotation server

If there are only a handful of annotations, it is not worth implementing this API; they can simply be stored as static “annotation list” JSON files. Once a substantial number exist, they can be stored in structured storage such as a dedicated IIIF annotation server (figure 13). The job of an annotation server is to store submitted annotations, and to return them in response to simple queries. (A list of the software available to do this can be found on the Awesome IIIF page at https://github.com/IIIF/awesome-iiif#annotation-servers.) A user-facing system can then query the annotation server to find matches and return them in the JSON format dictated by the Content Search API specification.
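The last step, returning matches in the Content Search format, can be sketched as below. The in-memory list stands in for an annotation server, the identifiers and texts are invented, and the response follows the general shape of a Content Search API 1.0 annotation list as we understand it; verify the details against the specification before relying on it.

```python
# Stand-in for annotations held by an annotation server (hypothetical data).
ANNOTATIONS = [
    {"id": "https://example.org/anno/1",
     "canvas": "https://example.org/iiif/book1/canvas/1",
     "xywh": (100, 100, 250, 20),
     "text": "A bird in the hand is worth two in the bush"},
    {"id": "https://example.org/anno/2",
     "canvas": "https://example.org/iiif/book1/canvas/2",
     "xywh": (80, 300, 310, 22),
     "text": "The early bird catches the worm"},
]


def search(query, request_url):
    """Return matching annotations as a search response (an annotation list)."""
    needle = query.casefold()
    resources = []
    for anno in ANNOTATIONS:
        if needle not in anno["text"].casefold():
            continue
        region = ",".join(str(value) for value in anno["xywh"])
        resources.append({
            "@id": anno["id"],
            "@type": "oa:Annotation",
            "motivation": "sc:painting",
            "resource": {"@type": "cnt:ContentAsText", "chars": anno["text"]},
            # The fragment gives the coordinates of the matching image region.
            "on": f"{anno['canvas']}#xywh={region}",
        })
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": request_url,
        "@type": "sc:AnnotationList",
        "resources": resources,
    }
```

A viewer can use the xywh fragment in each result to highlight the matching words on the page image.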

Annotation server and server-side languages

Figure 13: An annotation server can be used as a backing store.

Millions of annotations: search application server

An annotation server or simple database will work for moderate amounts of data; but in order to search large volumes of content very quickly, you will need to switch to a dedicated search application server (figure 14) such as Elasticsearch (https://www.elastic.co/) or Apache Solr (https://lucene.apache.org/solr/). These have the key advantages of providing extremely fast access to search results, while being robust and easily scalable via clustering. They can also provide features such as spelling suggestions based on stored data.
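As a rough sketch of what a query to such a server might look like, the function below builds an Elasticsearch request body that matches annotation text within one manifest and asks for spelling suggestions. The index field names (text, manifest_id) and the suggestion label are assumptions about how the annotations were indexed, not part of any IIIF specification.

```python
def annotation_query(text, manifest_id, size=20):
    """Build a search request body for annotations belonging to one manifest."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match on the annotation's transcribed text.
                "must": [{"match": {"text": text}}],
                # Exact filter restricting results to a single manifest.
                "filter": [{"term": {"manifest_id": manifest_id}}],
            }
        },
        # Ask for alternative spellings drawn from the indexed text.
        "suggest": {
            "spelling": {"text": text, "term": {"field": "text"}},
        },
    }
```

The server-side code would send this body to the search server, then reformat the hits into the JSON structure required by the Content Search API.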

Search application server and server-side programming

Figure 14: A search application server will speed up retrieval.

Authentication API

Why?

While IIIF embraces at its core the value of sharing public-access resources, some institutions have rights and policy limitations that only allow them to share restricted versions of certain media with the public (e.g., up to a defined size or quality). Since IIIF is also about avoiding duplication of data, the Authentication API was developed to allow a single resource to be presented in different forms to users with different access privileges. The Auth API (for short) defines how an application can interact with a IIIF image server in order to provide this functionality, without dictating methods of authentication or patterns for authorization policies.

How?

Deploying IIIF Auth requires a fair amount of setup work. The complexity of implementation is not affected as much by the volume of resources as it is by the complexity and diversity of source materials and policies to be enforced. Most of the work may go into areas that are not directly related to IIIF, such as defining access policies with stakeholders, building user privilege groups, ensuring that the policies are conflict-free, etc. IIIF assumes that an authorization system is already in place, and such a system should not require any IIIF-specific changes; however, policies may have to be crafted specifically for the resources being shared and the audiences involved.
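The policy side of this work can be illustrated with a small sketch that maps user privilege groups to a maximum deliverable image width. This is not part of the Auth API itself, which deliberately leaves such policies to the institution; the group names and pixel limits are hypothetical.

```python
# Hypothetical policy: the widest image (in pixels) each group may receive.
# None means unrestricted access.
GROUP_LIMITS = {"staff": None, "registered": 2000}
PUBLIC_LIMIT = 800  # applies to anyone without a recognized group


def max_width_for(groups):
    """Resolve overlapping group memberships in favor of the most generous limit."""
    limits = [GROUP_LIMITS[group] for group in groups if group in GROUP_LIMITS]
    if any(limit is None for limit in limits):
        return None
    return max(limits, default=PUBLIC_LIMIT)


def permitted_width(requested_width, groups):
    """Clamp a requested width to what the user's groups allow."""
    limit = max_width_for(groups)
    return requested_width if limit is None else min(requested_width, limit)
```

Resolving conflicts with an explicit rule, here "most generous wins", is one way to keep the policies conflict-free when a user belongs to several groups.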

Conclusions

Numerous tools for interacting with audiovisual media on the Web are available today. IIIF consists of a set of specifications by which institutions can develop, adopt, swap in and out, and exchange such tools collaboratively without reinventing the wheel. IIIF enables us to develop user experience more efficiently and cost-effectively, especially when even small changes in IT architectures can affect large volumes of content and complex networks of technical services.

Implementing the IIIF APIs provides advantages to the users and maintainers of systems with lots of high-resolution images, as well as to institutions with modestly sized data sets. However, the way in which each API is best implemented varies according to constraints unique to each institution. These constraints may arise from such factors as the organization’s:

  • Number of images
  • Amount of metadata
  • Amount of traffic
  • Internal systems
  • Availability and expertise of staff

In this paper we have presented a variety of widely applicable techniques that can be used as starting points for planning how to create or extend your IIIF implementation at scale. If you want to delve into further detail and more specific use cases, the IIIF community always welcomes discussion on a variety of communication channels (https://iiif.io/community/).

References

Cossu, S. (2018). “Getty Common Image Service Research & Design Report.” IIIF Annual Meeting. Published June 2018. Consulted February 13, 2020. https://drive.google.com/a/getty.edu/file/d/1SDiQ20pPGExFsfwZyx7sJyFEVUe9Bco0/view?usp=drive_open 

Keller, A., & T. Cramer. (2018). “Next Steps for the International Image Interoperability Framework.” Published January 2018. Consulted February 13, 2020.
https://iiif.io/community/consortium/next_steps/#6-budget-plan-for-2018-and-beyond

Roddis, T., & A. Farquhar. (2018). “From at Risk to Open Access: The Endangered Archives of the World.” MuseWeb 2018. Published February 4, 2018. Consulted February 13, 2020.
https://mw18.mwconf.org/paper/from-at-risk-to-open-access-the-endangered-archives-of-the-world/ 

Roddis, T. (2018). “Making Metadata into Meaning: Digital Storytelling with IIIF.” MuseWeb 2018. Published January 31, 2018. Consulted February 13, 2020.
https://mw18.mwconf.org/paper/making-metadata-into-meaning-digital-storytelling-with-iiif/ 


Cite as:
Roddis, Tristan, Lancefield, Rob, Campbell, Jeffrey, Cossu, Stefano and Hawkins, Neil. "IIIF at Scale." MW20: MW 2020. Published February 14, 2020. Consulted .
https://mw20.museweb.net/paper/iiif-at-scale/