Published in: J. STROBL and C. BEST (Eds.), 1998: Proceedings of the Earth Observation & Geo-Spatial Web and Internet Workshop '98 = Salzburger Geographische Materialien, Volume 27. Instituts für Geographie der Universität Salzburg. ISBN: 3-85283-014-1
James Frew, Nathan Freitas, Linda Hill, Kevin Lovette, Robert Nideffer, Qi Zheng
Alexandria Project
University of California, Santa Barbara
The Alexandria Project is a consortium of researchers, developers, and educators, spanning the academic, public, and private sectors, exploring a variety of problems related to distributed digital libraries for geographically-referenced information.
"Distributed" means the library's components may be spread across the Internet, as well as coexisting on a single desktop. "Geographically-referenced" means that all the objects in the library will be associated with one or more regions ("footprints") on the surface of the Earth.
The centerpiece of the Alexandria Project is the Alexandria Digital Library (ADL), an online information system inspired by the Map and Imagery Laboratory (MIL) in the Davidson Library at the University of California, Santa Barbara. The ADL currently provides access over the World Wide Web to a subset of the MIL's holdings, as well as other geographic datasets.
The ADL architecture is a three-tier client-server architecture, consisting of the client, the middleware, and the collection servers:

The crux of the architecture is a middleware layer with standard interfaces for metadata queries, metadata retrieval (in various formats), and digital object retrieval. These interfaces are intended to support arbitrary clients, of which the ADL client is the first sample implementation. The interfaces allow the clients to either ignore, or selectively explore, the idiosyncratic schemata of the underlying catalog databases.
The ADL client is a graphical interface designed to support interactive queries by ADL users. In the context of the overall ADL architecture, the salient feature of the ADL client is that the ADL client-server interfaces have been designed on the assumption that the client is a program. Specifically, the client is assumed to be capable of maintaining enough local state to support the notion of a "session", as well as supporting complex real-time user interactions (e.g. rollover help). The ADL client thus requires far less support from the rest of the ADL architecture than did its predecessor, the "Web Prototype" client, which was implemented entirely with server-generated HTML pages.
The current version of the ADL client (link will be provided here) is implemented entirely in Java. It may be run either as a standalone application or as an "applet" in the context of a Java-enabled Web browser (Netscape Navigator, Microsoft Internet Explorer, etc.).
The ADL client communicates with an ADL middleware layer via HTTP. We selected HTTP as the basic protocol owing to its ubiquity, simplicity, and the ease with which current HTTP servers can be extended to support the level of functionality we require. (As CORBA object brokers become ubiquitous, this decision will be revisited.)
Each family of interfaces supported by the ADL middleware is bound to a particular URL. This eliminates the need for top-level switch logic to vector requests to the appropriate handlers, but does require that the client be aware of which URL implements which interface.
Five standard interfaces are currently supported: session, query, result, browse graphic, and map browser.
Unless otherwise specified, all ADL interfaces are implemented as HTTP POSTs to the interface's URL. Method names and parameters are passed as the contents of the post, in KNF (Kevin's Normal Form), a simple LISP-like syntax that supports both method invocation and simple boolean expressions. Return values are provided as ASCII text (MIME type "text/plain"), with name-value pairs separated by CR-LFs. An error message may be returned in lieu of whatever other return values the method supports.
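As an illustration of this calling convention, the following Python sketch renders a method call as a parenthesized expression and wraps it in an HTTP POST. KNF is not formally specified here, so the rendering rule (space-separated tokens inside parentheses), the helper names `to_knf` and `build_request`, and the example URL are all assumptions made for illustration, not the ADL implementation.

```python
import urllib.request


def to_knf(expr):
    """Render nested lists as a parenthesized, LISP-like string.

    Assumption: tokens are separated by single spaces.
    """
    if isinstance(expr, (list, tuple)):
        return "(" + " ".join(to_knf(item) for item in expr) + ")"
    return str(expr)


def build_request(interface_url, expr):
    """Build (but do not send) an HTTP POST whose body is the KNF text."""
    body = to_knf(expr).encode("ascii")
    return urllib.request.Request(interface_url, data=body, method="POST")


# Hypothetical interface URL and session-id, for illustration only.
request = build_request(
    "http://adl.example.org/session",
    ["get-session-info", "session-42"],
)
```

Passing the request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a server.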
The session interface exposes two methods, get-session-id and get-session-info.
get-session-id accepts user data and client data, and returns a unique session-id:
(get-session-id user-data client-data)
The user data are any information required to identify the user to the rest of the ADL system, such as a name and a password. The client data are any information required to identify the client software, such as client version. The session-id is used in all subsequent calls to ADL interfaces.
The get-session-info method returns a number of session-specific parameters:
(get-session-info session-id)
This should be used to set parameters in the client, such as the URLs associated with the ADL interfaces, that would otherwise have to be hard-coded.
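A client might consume such a reply roughly as follows. This is a minimal Python sketch: the text above states only that replies are name-value pairs separated by CR-LFs, so the space between name and value, the parameter names `query-url` and `result-url`, and the URLs are assumptions.

```python
def parse_name_value_reply(body):
    """Parse a text/plain reply holding one name-value pair per CR-LF line.

    Assumption: the name ends at the first space on each line.
    """
    pairs = {}
    for line in body.split("\r\n"):
        if not line.strip():
            continue  # skip blank lines, e.g. after a trailing CR-LF
        name, _, value = line.partition(" ")
        pairs[name] = value.strip()
    return pairs


# Hypothetical reply; the parameter names are invented for illustration.
reply = (
    "query-url http://adl.example.org/query\r\n"
    "result-url http://adl.example.org/result\r\n"
)
interface_urls = parse_name_value_reply(reply)
```

The client can then look up each interface URL by name instead of compiling it in.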
The query interface exposes two methods, start-query and stop-query.
start-query accepts a session-id, a maximum number of results to return, and a query expression, and returns a query-id and minimal standard metadata for each object that satisfies the query expression:
(start-query (session-id session-id)
(maximum-results maximum-number-of-results)
(expression query-expression) )
The query-id, returned immediately, may be used by the stop-query method to terminate a query that appears to be hung or otherwise taking too long to complete.
The query-expression specifies boolean constraints on, and/or combinations of, the ADL search buckets, a standard set of high-level search metadata. Each collection supported by ADL must specify a mapping from its own metadata into the search bucket attributes. This mapping will almost always be many-to-one; i.e., there will inevitably be a loss of precision in querying the search buckets versus querying the collection-specific metadata directly. However, by exposing only a single high-level set of searchable metadata, the ADL middleware allows clients to be built that can both exploit search bucket semantics (e.g. spatially manipulate the "location" bucket), and can search all of ADL via a single connection.
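The many-to-one character of this mapping can be sketched in Python as below. Only the "location" bucket is named in the text; the "title" bucket, the collection attribute names, and the sample record are hypothetical.

```python
# Hypothetical collection-specific attributes mapped onto search buckets.
# Two distinct attributes collapse into the single "title" bucket.
CATALOG_TO_BUCKET = {
    "map_title": "title",
    "series_title": "title",
    "bounding_polygon": "location",
}


def to_buckets(record):
    """Group a collection record's values under search bucket names."""
    buckets = {}
    for attribute, value in record.items():
        bucket = CATALOG_TO_BUCKET.get(attribute)
        if bucket is not None:
            buckets.setdefault(bucket, []).append(value)
    return buckets


# Invented sample record, for illustration only.
record = {
    "map_title": "Sample Map",
    "series_title": "Sample Series",
    "bounding_polygon": "sample-footprint",
}
buckets = to_buckets(record)
```

Once both titles sit in the same bucket, a query can no longer distinguish a map title from a series title, which is the loss of precision described above.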
The result interface accepts a session identifier and a list of object identifiers, and returns a list of object metadata. The metadata are formatted in XML, a subset of SGML. XML tags are used to delineate sections, groups, and name-value pairs.
Sections are common to all ADL metadata, in that each ADL collection supplies a mapping that groups its metadata into the standard ADL sections.
Groups and name-value pairs are collection-specific; their semantics need not be known to ADL clients.
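A client can therefore walk the returned metadata structurally without interpreting it. The Python sketch below assumes hypothetical tag and attribute names (`metadata`, `section`, `group`, `pair`, `name`); the actual ADL tag vocabulary is not given here.

```python
import xml.etree.ElementTree as ET

# Invented sample document using hypothetical tag names.
SAMPLE = """
<metadata>
  <section name="identification">
    <group name="titles">
      <pair name="title">Sample Map</pair>
    </group>
  </section>
</metadata>
"""


def list_pairs(xml_text):
    """Return (section, group, name, value) tuples without interpreting them."""
    rows = []
    root = ET.fromstring(xml_text.strip())
    for section in root.findall("section"):
        for group in section.findall("group"):
            for pair in group.findall("pair"):
                rows.append((section.get("name"), group.get("name"),
                             pair.get("name"), pair.text))
    return rows
```

A generic display could use the section names for layout and show group and pair names verbatim, since their semantics are collection-specific.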
The browse graphic interface accepts a session identifier and a single object identifier, and returns a MIME-typed "browse graphic" representation of the object (usually a GIF, JPEG, or PNG image). If no specific browse graphic representation of the object is available, then a default image (e.g., a "not available" icon) is returned.
The map browser interface accepts a detailed map specification (footprint, projection, etc.) and returns a GIF image of a map of the corresponding Earth surface area.
The map browser interface is provided for the convenience of ADL clients that may need to generate arbitrary maps as visual navigational aids. Clients which do not require these maps (or which generate them by other means) may safely ignore this interface.
Unlike the other ADL interfaces, the map browser interface is implemented as a CGI GET, with the map parameters encoded in the interface URL. This is a historical artifact which we intend to update.
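Encoding a map specification in a GET URL might look like the following Python sketch. The base URL and the parameter names (`west`, `south`, `east`, `north`, `projection`) are hypothetical; the actual parameter set of the map browser interface is not listed here.

```python
from urllib.parse import urlencode


def map_request_url(base_url, **map_spec):
    """Encode a map specification as the query string of a GET URL."""
    # Sort the parameters so the same specification always yields the same URL.
    return base_url + "?" + urlencode(sorted(map_spec.items()))


# Hypothetical footprint and projection values, for illustration only.
url = map_request_url(
    "http://adl.example.org/map",
    west=-120.5, south=34.0, east=-119.5, north=35.0,
    projection="geographic",
)
```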
The ADL middleware is responsible for implementing the client-middleware interfaces described above, and mapping them into interactions with an arbitrary number of metadata catalogs. Additionally, the middleware implements whatever access controls ADL requires.
The current ADL middleware layer is implemented by an AOLserver (version 2.2) HTTP server. All ADL-specific functionality described in this section is implemented by Tcl scripts and C functions executing in the context of the AOLserver.
The ADL middleware supports whatever access policy is dictated by ADL. Two specific access controls are supported in the current implementation: host-based and user-based.
Host-based access control is used to refuse connections from clients that are not running at an ADL-approved Internet address. This is currently used to limit access to ADL to only those hosts connected to an IP network managed by the University of California. To this end, the access-control module maintains an explicit list of network numbers from which it will entertain connections. This mechanism is inherently unscalable, although it suffices to meet the UC-only distribution restrictions imposed by third parties on some proprietary materials in the ADL collections (e.g., commercial remote sensing imagery).
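The check against an explicit list of networks can be sketched in Python as below. This is not the AOLserver implementation (which is written in Tcl and C); the network list uses reserved documentation address ranges as hypothetical placeholders.

```python
import ipaddress

# Hypothetical approved networks (reserved documentation ranges).
APPROVED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_approved(client_address):
    """Return True if the client address lies in any approved network."""
    address = ipaddress.ip_address(client_address)
    return any(address in network for network in APPROVED_NETWORKS)
```

Every newly approved network requires another explicit entry in the list, which illustrates why the mechanism does not scale.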
User-based access control allows a client to specify a (username,password) tuple during its initial connection sequence. This mechanism is currently not being used for access control as such; rather, it is being used to associate ADL client "sessions" with specific users for evaluation purposes. Each client request is logged to an external database along with a unique session identifier, which includes the (username,password) tuple if it is available. (Further discussion of the logging mechanism is outside the scope of this paper.)
The ADL middleware maps KNF queries and retrieval requests into SQL queries specific to the underlying ADL servers. If multiple servers, or multiple databases on a single server, must be queried, this layer handles the requisite fan-out/fan-in. Note that we currently assume that multiple databases are disjoint; thus, the fan-in process is currently simple collation, with no duplicate detection or other conflict resolution.
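The fan-out/fan-in step with simple collation can be sketched in Python as follows. The `catalog` and `gazetteer` functions are hypothetical stand-ins for per-database query routines, and the use of a thread pool is an assumption of this sketch rather than a description of the middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out_fan_in(query, backends):
    """Send one query to every backend and concatenate the results.

    The databases are assumed disjoint, so no duplicate detection is done.
    """
    collated = []
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        # map() yields each backend's result list in the order of `backends`.
        for results in pool.map(lambda backend: backend(query), backends):
            collated.extend(results)
    return collated


# Hypothetical stand-ins for the per-database query functions.
def catalog(query):
    return ["catalog:" + query]


def gazetteer(query):
    return ["gazetteer:" + query]


hits = fan_out_fan_in("sample", [catalog, gazetteer])
```

If the databases were allowed to overlap, the `extend` call is where duplicate detection or conflict resolution would have to be added.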
In the current implementation, a basic architectural assumption is that the underlying servers will expose views that are, as closely as the particular server allows, exact correlates to the ADL search buckets and retrieval sections. Thus, the translation functionality of the mapping layer is currently limited to "transliterating" KNF into various dialects of server query languages (e.g., SQL).
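The kind of transliteration meant here can be sketched in Python as below. The operator vocabulary (`and`, `or`, `not`, `=`), the bucket names, and the representation of an already-parsed expression as nested lists are assumptions; the actual KNF grammar and the generated SQL dialects are not specified in this paper.

```python
def knf_to_sql(expr, params):
    """Transliterate a parsed boolean expression into a SQL condition.

    Assumptions: `expr` is a nested list, and each bucket name is also a
    column name in a server-side view. Values are appended to `params`
    and replaced by "?" placeholders in the returned text.
    """
    operator = expr[0]
    if operator in ("and", "or"):
        parts = [knf_to_sql(sub, params) for sub in expr[1:]]
        return "(" + (" " + operator.upper() + " ").join(parts) + ")"
    if operator == "not":
        return "(NOT " + knf_to_sql(expr[1], params) + ")"
    if operator == "=":
        _, column, value = expr
        if not column.isidentifier():
            raise ValueError("unexpected column name: " + column)
        params.append(value)
        return column + " = ?"
    raise ValueError("unsupported operator: " + operator)


# Hypothetical bucket names and values, for illustration only.
params = []
where = knf_to_sql(
    ["and", ["=", "type", "map"], ["not", ["=", "format", "paper"]]],
    params,
)
```

Because the views mirror the search buckets, the translation is purely syntactic: each node of the expression maps to one SQL fragment, with no schema-specific rewriting.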
The ADL middleware maintains a pool of client connections to whatever underlying servers are currently supported. The database connection layer is responsible for presenting a single functional interface to these shared connections. This serves both to localize whatever special knowledge (e.g. database client library) is needed to communicate with a particular server, and to minimize the setup and teardown time associated with making or breaking database client-server connections.
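The idea of a shared connection pool can be sketched in Python as follows. The class name `ConnectionPool` and the `fake_connect` function are hypothetical; a real pool would call the connect routine of a particular database client library.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed set of open connections and lend them out on demand."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # pay the setup cost once, up front

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing


opened = []


def fake_connect():
    """Hypothetical stand-in for a database client library's connect call."""
    opened.append(object())
    return opened[-1]


pool = ConnectionPool(fake_connect, size=2)
with pool.connection() as first:
    pass
with pool.connection() as second:
    pass
```

Both `with` blocks reuse connections opened when the pool was created, so no request pays the connection setup or teardown cost.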
The ADL system currently includes two collection servers, the ADL catalog and the ADL gazetteer. They have quite different schemas, which has served as a good preliminary test of our search bucket mapping strategy (congruent views in both the catalog and the gazetteer).
Both the catalog and the gazetteer are implemented as Illustra (version 3.3) databases. The only Illustra-specific feature that we currently exploit is the Polygon data type, and its associated operators and indices, to implement the "location" search bucket.
© 1998 Department of Geography, Salzburg University