AWS Public Sector Blog

Northwestern University Libraries make research more efficient, accessible with AWS Lambda


Libraries are where we explore, wade into the unknown, and seek answers. With that mission at the forefront, the technologists and librarians at Northwestern University Libraries (NUL) continually strive to improve their online libraries to enable user discovery and innovation. When they found a technology standard that could make research more efficient and open, they acted.

NUL leveraged an open-source standard, the International Image Interoperability Framework (IIIF), to make it simpler for researchers to examine, compare, share, and cite images and audio/visual files across libraries. “Becoming active in the IIIF Consortium made sense from the beginning,” said Carolyn Caizzi, department head of NUL’s Repository and Digital Curation Workgroup. “There wasn’t a specific technical challenge we were trying to solve. What motivated us was the opportunity to participate in this open standards community of cultural heritage institutions, all focused on how to best share collections efficiently across the globe.”

That community participation, together with NUL’s relationship with Amazon Web Services (AWS), led to innovative approaches to NUL’s digital collections suite. The new suite placed the IIIF standard at the heart of its infrastructure.

What is IIIF and how does it work?

IIIF is a set of open standards for making image and audio/visual files available at scale. The Image API lets researchers zoom in and analyze digitized resources in depth while saving bandwidth by downloading only the data they need. It also supports on-the-fly region selection, so portions of assets can be used and remixed. It’s a shared framework that standardizes how museums and cultural centers share resources among libraries, giving researchers a consistent, unified experience.
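To make region selection concrete, here is a minimal sketch of how an IIIF Image API request URL is put together. The path layout (identifier, region, size, rotation, quality and format) follows the published Image API URI syntax; the host `iiif.example.org`, the identifier `map-001`, and the helper names are hypothetical and chosen only for illustration.

```python
from urllib.parse import quote


def region_px(x, y, w, h):
    """Format a pixel region as the IIIF 'x,y,w,h' string."""
    return f"{x},{y},{w},{h}"


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an Image API URL: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}.

    'max' is the Image API 3.0 keyword for the largest available size;
    2.x servers use 'full' in the size slot instead.
    """
    ident = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and image identifier.
BASE = "https://iiif.example.org/iiif/3"

# The whole image at the largest size the server offers.
full_url = iiif_image_url(BASE, "map-001")

# Only a 512x512 pixel detail, scaled down to 256 pixels wide.
detail_url = iiif_image_url(BASE, "map-001",
                            region=region_px(2048, 1024, 512, 512),
                            size="256,")

print(full_url)
print(detail_url)
```

Because the region and size are part of the URL itself, a viewer can ask for just the detail it needs, and the same URL can be cited or shared to point at that exact portion of the image.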

IIIF works similarly to how we commonly view satellite imagery and maps. To allow images to load quickly, the IIIF protocol only requests the quality and the pixels the viewer sees at that moment, with the pixels on the edge of the image cached. The downside is that as the viewer manipulates the image—scrolling and zooming in and out—they essentially make thousands of requests to the server as new pixels load. “It’s a ‘bursty’ protocol, meaning sudden spikes, that has the potential to tax your infrastructure,” said David Schober, team lead and product manager for repository and digital curation at NUL.
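The burstiness is easier to see with a rough count of how many tile requests a single view can generate. The sketch below assumes a viewer that splits the image into fixed 512-pixel tiles; the tile size, image dimensions, and function name are illustrative assumptions, not details from NUL's deployment.

```python
def tile_regions(view_x, view_y, view_w, view_h, image_w, image_h,
                 tile=512, scale=1):
    """Return the IIIF region strings for every tile that overlaps a viewport.

    The viewport is given in full-resolution pixel coordinates. 'scale' is the
    zoom-out factor: at scale 2, one tile covers a 1024x1024 area of the
    full-resolution image.
    """
    step = tile * scale  # full-resolution pixels covered by one tile
    first_col = view_x // step
    first_row = view_y // step
    last_col = min(view_x + view_w - 1, image_w - 1) // step
    last_row = min(view_y + view_h - 1, image_h - 1) // step

    regions = []
    for row in range(first_row, last_row + 1):
        for col in range(first_col, last_col + 1):
            x, y = col * step, row * step
            # Tiles on the right and bottom edges are clipped to the image.
            w = min(step, image_w - x)
            h = min(step, image_h - y)
            regions.append(f"{x},{y},{w},{h}")
    return regions


# A 1920x1080 window onto a hypothetical 8000x6000 scan at full zoom.
visible = tile_regions(0, 0, 1920, 1080, image_w=8000, image_h=6000)
print(len(visible), "tile requests for one view")
```

Each region in the list becomes its own HTTP request, and every pan or zoom step produces a fresh batch. Multiply that by many simultaneous viewers and the sudden spikes Schober describes follow naturally.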

NUL’s home-grown innovation: Serverless-IIIF

NUL is no stranger to reinvention, having transformed their infrastructure multiple times since bringing their collection online nearly 20 years ago. The bursty nature of IIIF meant that NUL needed to reassess how they used their technological resources. NUL tried to support IIIF with monolithic servers, but quickly realized that the servers sat underutilized during slower periods and were overburdened during busier periods, all while NUL spent money to cover the excess load as needed.

With a bit of ingenuity and experimentation, the NUL team invented a new concept: Serverless-IIIF. AWS’s serverless technologies seemed almost custom-built for IIIF. The applications would allow NUL to leverage IIIF without provisioning server resources, because the technology can scale up and down as usage changes.

“It was such a good fit for our needs,” said Michael Klein, technical lead and developer on the repository and digital curation software team. “I realized I could write an AWS Lambda function that could respond to IIIF requests. I ended up creating a codebase that was smaller than just the config file from one server we had been using.”
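As a rough illustration of the idea Klein describes, the sketch below shows a Lambda-style handler that recognizes an IIIF image request. It is not the Serverless-IIIF code: it assumes an HTTP event that carries the request path in a `rawPath` field, and it only parses the request and echoes it back as JSON. The step where a real function would read the source image from storage, crop and scale it, and return the bytes is left out.

```python
import json
from urllib.parse import unquote


def parse_iiif_path(path):
    """Split '/prefix/{id}/{region}/{size}/{rotation}/{quality}.{format}' into its parts.

    Returns None when the path does not look like an IIIF image request.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 5:
        return None
    identifier, region, size, rotation, last = parts[-5:]
    if "." not in last:
        return None
    quality, fmt = last.rsplit(".", 1)
    return {
        "identifier": unquote(identifier),
        "region": region,
        "size": size,
        "rotation": rotation,
        "quality": quality,
        "format": fmt,
    }


def handler(event, context):
    """Hypothetical Lambda entry point; 'rawPath' is an assumed event field."""
    request = parse_iiif_path(event.get("rawPath", ""))
    if request is None:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "not an IIIF image request"}),
        }
    # A real image server would fetch the source image here, apply the
    # region, size, and rotation, and return the encoded result.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(request),
    }


response = handler({"rawPath": "/iiif/3/map-001/0,0,512,512/256,/0/default.jpg"}, None)
print(response["statusCode"], response["body"])
```

Because each request is handled independently and carries everything needed to produce its tile, this style of function fits a platform that starts more copies when traffic spikes and none when it is quiet.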

The proof-of-concept for Serverless-IIIF was faster and far less expensive than alternatives, and the application was adopted and in production within days. “It shows how agile you can be if you have the right tools and documentation,” said Schober. The team credits AWS’s resources and the AWS Serverless Application Repository for giving them the guidance and technology they needed to get Serverless-IIIF online quickly.

Interest in the project grew, attracting contributions from members of the IIIF community and of Samvera, a global group of technologists who create and maintain repository software. Contributors include Princeton University, the University of Notre Dame, Softserv, and Mnemoscene. The expanded scope and use cases further improved the codebase and design, with new conference presentations continuing to expand interest in the project. The current version of Serverless-IIIF uses AWS Lambda and Amazon CloudFront, along with Amazon Simple Storage Service (Amazon S3) for image storage and AWS Identity and Access Management (IAM) for permissions. It is managed and deployed via the AWS Serverless Application Repository.

Improving the user experience while lowering costs

Researchers are making heavy use of the new functionality of the digital collections site. During the academic year, NUL supports a daily average of more than 120,000 requests and over 5 gigabytes in transfers. With Serverless-IIIF’s ability to scale up and down to meet needs, high usage has not resulted in higher costs. Klein says NUL spends about $0.66 per day on Serverless-IIIF requests, a substantial saving considering the minimum spend on a load-balanced IIIF server is about $4 or more a day. “The ability to scale very wide drove us to the Lambda-based solution,” said Klein.
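The figures Klein cites can be turned into a quick back-of-the-envelope comparison. The calculation below uses only the numbers quoted above and treats them as exact, although the article gives them as approximations ("more than 120,000 requests," "about $4 or more"), so the results are rough bounds rather than precise costs.

```python
# Figures quoted in the article, treated as exact for the sake of the estimate.
requests_per_day = 120_000
serverless_cost_per_day = 0.66   # US dollars, Serverless-IIIF requests
server_cost_per_day = 4.00       # US dollars, minimum for a load-balanced server

cost_per_request = serverless_cost_per_day / requests_per_day
daily_saving = server_cost_per_day - serverless_cost_per_day
saving_fraction = daily_saving / server_cost_per_day

print(f"Cost per 1,000 requests: ${cost_per_request * 1000:.4f}")
print(f"Saved per day: ${daily_saving:.2f}")
print(f"Share of the server baseline saved: {saving_fraction:.1%}")
```

On these numbers, a thousand requests cost a little over half a cent, and the daily bill is roughly one sixth of the cheapest always-on alternative.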

From a cultural perspective, Caizzi is most excited about how the transformed digital collections can generate new knowledge. “Because of the speed and responsiveness of Serverless-IIIF, all of these primary resources are now more accessible globally. Researchers everywhere can delve deeper into our library and reuse these resources in a variety of contexts. Who knows what they’ll discover?” said Caizzi.

An uncommon collection supported by a common standard

NUL plans to continue championing the use of IIIF and, specifically, Serverless-IIIF within cultural institutions. Serverless-IIIF is now in use at Princeton, Notre Dame, the Royal Museums Brighton & Hove, and the British Library Shared Research Repository, among others. Still, they’re not limiting themselves to cultural centers. “We see broad relevance here for anyone who manages an image and audio/visual library, including publishers, newspapers, and other media organizations,” said Schober.

The NUL team isn’t resting; they already have their next technology challenge in mind. “We’re looking to implement 3D image manipulation next,” said Caizzi. “We’re just at the beginning of what we can achieve using Serverless-IIIF, and we’re excited to see where it takes us.”

Colleges, universities, and research institutions around the world use AWS to support research and learning, improve the student experience, make data-driven decisions that save money and resources, and more. Learn more by visiting the AWS Cloud for Higher Education hub.

Carolyn Caizzi

Carolyn Caizzi has worked in higher education and libraries for 20 years. She is the head of repository and digital curation at Northwestern University Libraries (NUL), a library department of 16 staff who specialize in digitization, open-source software development, project management, digital object curation, and metadata creation. She previously worked at Yale University Library and the University of Denver, supporting and managing technology projects to meet the needs of faculty, researchers, and students. Carolyn has held governance roles in open-source communities and is a certified Scrum Master and SHRM-CP.

David Schober

David Schober is the product manager and team lead of NUL's repository and digital curation software development team. He has worked at the intersection of Internet technology, cultural heritage, and higher education for more than 20 years. David has focused his career on helping institutions such as Northwestern, The Poetry Foundation, and Rotary International leverage technology to ensure the preservation of and access to knowledge and cultural heritage. He currently serves on Samvera’s board.

Michael B. Klein

Michael B. Klein is a lead developer on NUL's repository and digital curation software development team. He has devoted most of his career to creating solutions for libraries to create and preserve access to knowledge and cultural heritage. He is a certified AWS Cloud practitioner and acts as the cloud architect on the NUL team. Michael has worked at Stanford University, Oregon State University, and Boston Public Library. Before becoming a library technologist, he was a contract software developer with a diverse portfolio of clients ranging from the U.S. Census Bureau to Fortune 500 companies to small law firms.