We consider what is needed to create electronic document libraries which
mimic physical collections of books, papers, and other media.
The quantitative measures of merit for personal workstations (cost, speed, size of
volatile and persistent storage) will improve by at least an order of magnitude in the next
decade. Every professional worker will be able to afford a very powerful machine, but
databases and libraries are not really economical and useful unless they are shared. We
therefore see a two-tier world emerging, in which custodians of information make it
available to network-attached workstations. A client-server model is the natural
description of this world.
In collaboration with several state governments, we have considered what would be
needed to replace paper-based record management for a dozen different applications.
We find that a professional worker can anticipate most data needs and that (s)he is
interested in each clump of data for a period of days to months. We further find that
only a small fraction of any collection will be used in any period. Given expected
bandwidths, data sizes, search times and costs, and other such parameters, an effective
strategy to support user interaction is to bring large clumps from their sources, to
transform them into convenient representations, and only then start whatever investigation
is intended. A system-managed hierarchy of caches and archives is indicated.
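
The fetch strategy described above can be illustrated with a minimal sketch. It assumes a hierarchy of only two tiers and an in-memory dictionary per tier; the names Tier, fetch_clump, and transform are hypothetical and are not taken from the design itself. A whole clump is brought from the farthest tier that holds it, transformed once, and kept at the nearest tier for later use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tier:
    """One level of the storage hierarchy (hypothetical name)."""
    name: str
    clumps: Dict[str, List[bytes]] = field(default_factory=dict)


def fetch_clump(clump_id: str, tiers: List[Tier],
                transform: Callable[[bytes], bytes] = lambda b: b) -> List[bytes]:
    """Return a whole clump, searching tiers from nearest to farthest.

    Simplifying assumption: only the nearest tier (tiers[0]) holds
    transformed copies; farther tiers hold the source representation.
    """
    for depth, tier in enumerate(tiers):
        if clump_id in tier.clumps:
            items = tier.clumps[clump_id]
            if depth > 0:
                # Bring the entire clump, convert it once, cache it nearby.
                items = [transform(x) for x in items]
                tiers[0].clumps[clump_id] = items
            return items
    raise KeyError(clump_id)


# Illustrative data only: an archive holding one two-page clump.
archive = Tier("archive", {"case-17": [b"page one", b"page two"]})
workstation = Tier("workstation")
pages = fetch_clump("case-17", [workstation, archive], transform=bytes.upper)
```

After the first call the clump is held at the workstation tier, so a second request for "case-17" is served locally without another transfer or transformation.
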
Each library is a combination of a catalog and a collection, and each stored item has a
primary instance which is the standard by which the correctness of any copy is judged.
Most catalog records refer to one to three stored items. Weighted by the number of bytes
to be stored, immutable data dominate collections. These characteristics affect how
consistency, currency, and access control of replicas distributed in the network should
be managed.
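
As a rough sketch of the data model just described, a library can be represented as a catalog plus a collection of primary instances, with any replica judged against the primary. The class and method names below are hypothetical, and the use of a SHA-256 digest for the comparison is an assumption of this sketch, not something the design specifies.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StoredItem:
    """A stored item; frozen to reflect that most bytes are immutable."""
    item_id: str
    content: bytes


@dataclass
class CatalogRecord:
    """A catalog entry, typically referring to one to three stored items."""
    title: str
    item_ids: List[str]


class Library:
    """A catalog plus a collection of primary instances (hypothetical API)."""

    def __init__(self) -> None:
        self.catalog: Dict[str, CatalogRecord] = {}
        self.collection: Dict[str, StoredItem] = {}

    def add(self, key: str, record: CatalogRecord, items: List[StoredItem]) -> None:
        for item in items:
            self.collection[item.item_id] = item
        self.catalog[key] = record

    def replica_is_correct(self, item_id: str, replica: bytes) -> bool:
        # The primary instance is the standard; compare digests (assumed SHA-256).
        primary = self.collection[item_id].content
        return hashlib.sha256(replica).digest() == hashlib.sha256(primary).digest()


# Illustrative data only.
lib = Library()
lib.add("rec-1", CatalogRecord("Permit 42", ["img-1"]),
        [StoredItem("img-1", b"scanned page")])
ok = lib.replica_is_correct("img-1", b"scanned page")
bad = lib.replica_is_correct("img-1", b"scanned pagE")
```

Because an immutable item never changes after it is stored, a replica that once matched the primary stays correct, which is why immutability simplifies the management of currency for distributed copies.
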
We present the large features of a design for network document/image library services.
A prototype is being built for State of California pilot applications. The design allows
library servers in any environment with an ANSI SQL database; clients execute in any
environment; communications are with either TCP/IP or SNA LU 6.2.