Integration and Architecture

This effort is focused on identifying the systems and common interface frameworks, protocols and methods so as to inform development of the application. It will focus on identifying integration needs, touch points and solution paths so that the application can be readily adopted by partner libraries and other libraries.

Current Application Landscape

As part of the information gathering, the team discovered that almost all of the libraries share the same technology platforms. The Online Public Access Catalogue (OPAC) provides a common catalogue interface for all of a library’s physical and electronic collections, through either BiblioCommons or the web interface provided by the library's Integrated Library System (ILS) vendor, Triple I or Polaris. The primary eBook distribution platforms in use are OverDrive, 3M Cloud Reader and Baker and Taylor's Axis 360, along with Adobe Digital Editions, the free eReader program provided by Adobe for Adobe-encrypted content.

Alternate Application Landscape

Of the three eBook platforms, only OverDrive and Baker and Taylor actually transfer content outside of their platforms. Additionally, these platforms use Adobe DRM and Adobe software for ePub presentation. This underlying technology is not ePub 3 capable and induces further “lock-in” to the particular vendor's platform without expensive licensing and software acquisition. We also tested newer technologies such as Sony DADC’s User Rights Management System (URMS) DRM and found it easy to use, albeit hardly used by reading platforms. This is due to restrictive licensing terms for users of Adobe technologies as well as the antiquated rendering methods that limit interactive content and the more advanced interface technologies of today.

As you can see in the diagram to the left, the desired end state for libraries seeking to own the user experience and remove the intermediation of the library by its vendors is simply to provide the interface to the eBook collection. In short, this interface intermediates the vendors so that the discovery, borrowing and reading experiences are managed by the library's interfaces.

Possible Application Architecture

A possible application architecture for Library Simplified that could intermediate the different platforms, systems and content repositories of the library and 3rd party vendors would be a “plug-in” infrastructure. In some instances, Library Simplified systems would need to take on the functions of other systems when those systems cannot support the full scope of services needed to provide a unified view into the eBook collections. The architecture will also need to normalize the depiction of the collection and maintain the transactional consistency that a seamless user experience requires.
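
To make the “plug-in” idea concrete, the following is a minimal sketch in Python, not the project's actual design. Every name in it (VendorPlugin, PluginRegistry, LocalHoldQueue, the demo vendors) is hypothetical. It assumes each vendor platform is wrapped in an adapter, and that the registry falls back to a locally managed function when a vendor cannot provide it.

```python
from abc import ABC, abstractmethod


class VendorPlugin(ABC):
    """Hypothetical adapter interface: one subclass per vendor platform."""

    name = "unnamed"

    @abstractmethod
    def search(self, query):
        """Return a list of title strings that match the query."""

    def supports_holds(self):
        # Adapters for vendors that can place holds override this.
        return False

    def place_hold(self, patron_id, title):
        raise NotImplementedError


class LocalHoldQueue:
    """Fallback used when a vendor platform cannot manage holds itself."""

    def __init__(self):
        self.holds = []

    def place_hold(self, patron_id, title):
        self.holds.append((patron_id, title))
        return "local"


class PluginRegistry:
    """Presents all registered vendor plug-ins as one collection."""

    def __init__(self):
        self.plugins = {}
        self.local_holds = LocalHoldQueue()

    def register(self, plugin):
        self.plugins[plugin.name] = plugin

    def search_all(self, query):
        # Unified view: merge results from every plug-in, in name order.
        results = []
        for name in sorted(self.plugins):
            for title in self.plugins[name].search(query):
                results.append((name, title))
        return results

    def place_hold(self, vendor, patron_id, title):
        plugin = self.plugins[vendor]
        if plugin.supports_holds():
            return plugin.place_hold(patron_id, title)
        # The middleware takes on the function the vendor lacks.
        return self.local_holds.place_hold(patron_id, title)


class DemoVendorA(VendorPlugin):
    """Made-up vendor that supports holds."""

    name = "vendor_a"

    def __init__(self, titles):
        self.titles = titles

    def search(self, query):
        return [t for t in self.titles if query.lower() in t.lower()]

    def supports_holds(self):
        return True

    def place_hold(self, patron_id, title):
        return "vendor_a"


class DemoVendorB(VendorPlugin):
    """Made-up vendor without hold support."""

    name = "vendor_b"

    def __init__(self, titles):
        self.titles = titles

    def search(self, query):
        return [t for t in self.titles if query.lower() in t.lower()]
```

Registering both demo vendors and calling search_all returns one merged result list, while a hold request against vendor_b is routed to the local queue because that adapter does not report hold support.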

Server Footprint

The infrastructure footprint that could support such an application could be fully hosted by the library on license-free LAMP stacks (Linux, Apache, MySQL, PHP/Python) as well as on leased infrastructure such as Amazon Web Services (AWS).

One area that could be better optimized involves the data remediation services that “wrangle” the disparate metadata providers and repositories on the Internet. This service is needed to classify a book and fill in the missing descriptive data and related cover art for both trade eBooks and Public Domain eBooks. This data and content “wrangling” architecture enables the distribution of Public Domain works hosted by the library, such as those from Project Gutenberg, which houses over 40,000 eBooks, and Audio Books. This area of the application footprint is “heavy” in terms of data storage requirements because it aggregates a wide variety of data that may be duplicated but alternately indexed, redundant and/or unnecessary in the end. The processing services distill and remediate this data, and only the distilled result is needed for the application database. A potential solution for deployment and implementation would be to host this as a service for other libraries so as to eliminate the initial processing burden and infrastructure for adopters. The service would only process and remediate the difference between the partner library catalogues and the data on hand, plus new works introduced by the partner’s collection developers as needed. Where there is overlap in the respective collections, the service merely reuses the remediated data and works on hand. This service would allow a minimal footprint for other libraries looking to adopt Library Simplified.
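
The "process only the difference" idea described above can be sketched as follows. This is an illustration under stated assumptions rather than the service's actual logic: the function name plan_remediation, the use of plain identifier strings, and the dictionary standing in for the store of remediated data are all hypothetical.

```python
def plan_remediation(partner_catalogue, remediated_store):
    """Split a partner catalogue into reusable and still-to-process works.

    partner_catalogue: iterable of identifier strings (for example ISBNs).
    remediated_store: dict mapping identifier -> already remediated metadata.
    Returns (reusable, to_process), where reusable is a dict of metadata that
    is already on hand and to_process is a list of identifiers that still
    need wrangling.
    """
    reusable = {}
    to_process = []
    seen = set()
    for identifier in partner_catalogue:
        if identifier in seen:
            continue  # ignore duplicates within the partner catalogue
        seen.add(identifier)
        if identifier in remediated_store:
            # Overlap with existing collections: reuse the data on hand.
            reusable[identifier] = remediated_store[identifier]
        else:
            # New to the service: queue for remediation.
            to_process.append(identifier)
    return reusable, to_process
```

With a store that already holds two works and a partner catalogue that overlaps on one of them, only the non-overlapping identifiers end up in to_process, which is what keeps the adopter's footprint small.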

Resulting Architecture

Upon implementation, we arrived at a new architecture to facilitate two key technologies: 1) Adobe Vendor ID and 2) OPDS for Libraries. Additionally, this architecture decoupled the Public Domain and Open Access content store from the core circulation management functions so that it can exist as a stand-alone content store that is 3rd party hosted or locally hosted.

Adobe Vendor ID is a specification rather than a technology in its own right. This specification gives licensors the ability to protect their users' identities by taking on authentication and authorization for licensed content themselves. Vendor ID works seamlessly with Adobe's Adept DRM technologies and with RMSDK-based applications such as Adobe Reader and Adobe CMS. The other benefit is that it fits more naturally within the library technology ecosystem. Traditionally, libraries have relied on their ILS to provide patron identity management, authentication and authorization services for their digital properties. This native ability allows libraries to preserve that traditional technology implementation until they migrate to more modern and practical SSO and CRM technologies to manage their patron identities, authentication and authorization to services.

Open Publication Distribution System (OPDS) is an open specification for a protocol that is widely used in the European and North American (Canadian) eBook markets, where accessibility is more widely practiced and service providers have adopted a loosely coupled technology stack. The specification community (OPDS.org) consists of publishers, platform providers and eBook vendors. The OPDS specification is a syndication format for electronic publications based on Atom (RFC 4287) and HTTP (RFC 2616). OPDS Catalogs enable the aggregation, distribution, and discovery of books, journals, and other digital content by any user, from any source, in any electronic format, on any device. The OPDS Catalogs protocol prioritizes simplicity and speed. Because an OPDS catalog is essentially an XML document, it is easily consumed by a variety of clients and services. It is easily adopted because it allows program layers to evolve independently of each other. The OPDS Catalog 1.1 specification is based on a large body of existing, in-production software and on collaboration between eBook reading systems, publishers, and distributors. eBook readers such as Aldiko, Bluefire Reader, QuickReader, FBReader, Ibis Reader and others already support the evolving specification, as do eBook providers such as Feedbooks, Internet Archive, O'Reilly Media, etc.
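
Because an OPDS catalog is an Atom document, a client can read it with a generic XML parser. The sketch below uses Python's standard library to list the acquisition links in a small feed. The sample feed, its identifiers and its example.org URL are made up for illustration, and list_acquisitions is a hypothetical helper, not part of any OPDS library.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
ACQUISITION_PREFIX = "http://opds-spec.org/acquisition"

# Made-up sample feed; identifiers and URLs are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:example-feed</id>
  <title>Example catalog</title>
  <updated>2015-01-01T00:00:00Z</updated>
  <entry>
    <id>urn:uuid:example-book</id>
    <title>Moby-Dick</title>
    <updated>2015-01-01T00:00:00Z</updated>
    <author><name>Herman Melville</name></author>
    <link rel="http://opds-spec.org/acquisition/open-access"
          type="application/epub+zip"
          href="https://example.org/books/moby-dick.epub"/>
  </entry>
</feed>
"""


def list_acquisitions(feed_xml):
    """Return (title, rel, type, href) for every acquisition link in a feed."""
    root = ET.fromstring(feed_xml)
    books = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title")
        for link in entry.findall(ATOM + "link"):
            rel = link.get("rel", "")
            # OPDS acquisition relations all share this URI prefix.
            if rel.startswith(ACQUISITION_PREFIX):
                books.append((title, rel, link.get("type"), link.get("href")))
    return books
```

Calling list_acquisitions(SAMPLE_FEED) yields a single tuple for the open-access EPUB link. Nothing here is specific to one vendor, which is the loose coupling the specification aims for.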

Library Simplified APIs will be RESTful APIs exposed via the OPDS protocol. We are working with the OPDS community to extend and define the OPDS specification to include content rights management, authentication, authorization, and lending workflows. These protocol extensions, Open Distribution to Libraries (ODL) and Authentication for OPDS, will allow for greater interoperability between libraries, ILS systems and eBook vendors. OPDS will be used as the machine-to-machine interface between Library Simplified's Circulation Manager, the Metadata Wrangler and the Open Access Content Server. This opens up the potential for deep integration into the Library Simplified middleware application layers, so that an array of compliant vendor systems can be easily implemented within a library application ecosystem through a loosely coupled protocol rather than a vendor-specific API. Furthermore, we believe that OPAC and ILS vendors should adopt these protocols, which would give libraries greater opportunity to curate and create custom user experiences that link their eBook collections with their special collections, physical collections and exhibitions, and to integrate service providers.
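
As a rough illustration of a lending workflow expressed in OPDS, the sketch below builds an Atom entry whose link uses the "borrow" acquisition relation already defined by OPDS 1.1. It does not implement the ODL or Authentication for OPDS extensions. The function name build_borrow_entry, the choice of media type for the link and the example URL are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
BORROW_REL = "http://opds-spec.org/acquisition/borrow"
# Assumption: following the borrow link returns an OPDS entry for the loan.
ENTRY_TYPE = "application/atom+xml;type=entry;profile=opds-catalog"


def build_borrow_entry(identifier, title, updated, borrow_href):
    """Serialize a minimal OPDS entry that offers a title for borrowing."""
    # Serialize Atom elements without a namespace prefix.
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}id" % ATOM_NS).text = identifier
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = title
    ET.SubElement(entry, "{%s}updated" % ATOM_NS).text = updated
    ET.SubElement(
        entry,
        "{%s}link" % ATOM_NS,
        {"rel": BORROW_REL, "type": ENTRY_TYPE, "href": borrow_href},
    )
    return ET.tostring(entry, encoding="unicode")


xml_text = build_borrow_entry(
    "urn:uuid:example-book",
    "Example Title",
    "2015-01-01T00:00:00Z",
    "https://circulation.example.org/works/1/borrow",  # placeholder URL
)
```

A client that understands the borrow relation can present a "Borrow" action for this entry without knowing which vendor ultimately fulfils the loan.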

Application Technologies

The application consists of a combination of open source and commercial proprietary software. The following diagrams depict the software and resources that make up the application stack.

Mobile client stack

For the mobile client, the highest level of the stack is the open source eBook application, which delivers the data model, control functionality and information views (i.e. MVC). In simplest terms, however, the mobile app is 1) an OPDS reader (think of an Atom or RSS feed reader) that functions as the "Discovery" interface for the library's content and 2) an e-reader that functions as the rendering engine for EPUB packaged content files. The OPDS reader is based on the OPDS 1.2 specification. The Readium SDK EPUB rendering engine compiled into the app provides the e-reader function. The Digital Rights Management (DRM) functions required for certain content are handled by the appropriate commercial technology (e.g. Adobe Adept Connector), which manages protected file acquisition and decryption. There is a portion of the app for registering for a library card. This is handled through a web view within the app that is securely hosted in the middleware. The same card registration tool can easily be swapped out for other SSO methods, updated to work with a different ILS, or repurposed for other client interfaces such as the library's website.
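
One way a client could decide which component handles a download is to dispatch on the media type of the acquisition link. The sketch below is illustrative only and is written in Python rather than in the mobile app's own language. The handler names are hypothetical, and the mapping assumes that unprotected EPUB files go straight to the rendering engine while Adobe fulfilment tokens go to the DRM connector.

```python
EPUB_TYPE = "application/epub+zip"
# Media type of Adobe fulfilment tokens (.acsm files).
ACSM_TYPE = "application/vnd.adobe.adept+xml"

# Hypothetical handler names; a real app would map to actual components.
HANDLERS = {
    EPUB_TYPE: "open_in_reader",
    ACSM_TYPE: "fulfil_with_adobe_connector",
}


def choose_handler(link_type):
    """Pick a handler name for an acquisition link's media type."""
    return HANDLERS.get(link_type, "unsupported")
```

Keeping the mapping in one table means support for another DRM scheme could be added without touching the discovery or rendering code.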

Middleware Application Technologies

The middleware consists of custom application code and language-specific resource libraries. The bulk of the application is a Flask application written in Python. The database technology is PostgreSQL and the server technology is Linux. The card registration application is a Rails application that uses a PostgreSQL database for temporary data storage and a SOAP interface to transact that data into the Sierra ILS. The different application servers depicted in the diagram provide the core functionality and feature functions (e.g. card registration, open access content hosting and Vendor ID). The core function of the middleware is to consolidate the remotely hosted content services into a unified collection, normalize its representation, normalize the 3rd party host APIs into a standard transaction for the client, and manage access to content. All interfaces between the application servers are based on OPDS (except for the database).
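
The normalization step described above can be pictured as mapping each vendor's response fields onto one standard record. The following is a minimal sketch and not the middleware's actual code: the Loan record, the vendor names and every vendor field name in FIELD_MAPS are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loan:
    """Standard loan record presented to the client (hypothetical shape)."""

    source: str
    identifier: str
    title: str
    due: str  # ISO 8601 date string


# Hypothetical mappings: standard field name -> vendor-specific field name.
FIELD_MAPS = {
    "vendor_a": {"identifier": "isbn", "title": "bookTitle", "due": "expires"},
    "vendor_b": {"identifier": "ItemId", "title": "Name", "due": "DueDate"},
}


def normalize_loan(source, payload):
    """Translate one vendor payload (a dict) into a standard Loan."""
    mapping = FIELD_MAPS[source]
    fields = {field: payload[key] for field, key in mapping.items()}
    return Loan(source=source, **fields)


def unified_loans(responses):
    """Merge per-vendor payload lists into one list sorted by due date."""
    loans = [
        normalize_loan(source, payload)
        for source, payloads in responses.items()
        for payload in payloads
    ]
    return sorted(loans, key=lambda loan: (loan.due, loan.title))
```

The client only ever sees Loan records, so supporting another vendor in this sketch amounts to adding one more entry to the mapping table.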