Digital library technologies : complex objects, annotation, ontologies, classification, extraction, and security

Where to find it

Information & Library Science Library

Call Number
ZA4080 .D4495 2014
Status
Available

Summary

Since the early 1990s, digital libraries (DLs) have introduced new technologies and have leveraged, enhanced, and integrated related ones. These efforts have been enriched through a formal approach, e.g., the 5S (Societies, Scenarios, Spaces, Structures, Streams) framework, which is discussed in two earlier volumes in this series. This volume should help advance work not only in DLs, but also in the WWW and other information systems.

Drawing upon four completed dissertations (Kozievitch, Murthy, Park, Yang) and three in-process dissertations (Elsherbiny, Farag, Srinivasan), as well as the efforts of collaborating researchers and scores of related publications, presentations, tutorials, and reports, this book should advance the DL field with regard to at least six key technologies. By integrating surveys of the state-of-the-art, new research, connections with formalization, case studies, and exercises/projects, this book can serve as a computing or information science textbook. It can support studies in cyber-security, document management, hypertext/hypermedia, IR, knowledge management, LIS, multimedia, and machine learning.

Chapter 1, with a case study on fingerprint collections, focuses on complex (composite, compound) objects, connecting DL and related work on buckets, DCC, and OAI-ORE. Chapter 2, discussing annotations, as in hypertext/hypermedia, emphasizes parts of documents, including images as well as text, and the management of superimposed information. The SuperIDR system, and prototype efforts with Flickr, should motivate further development and standardization related to annotation, which would benefit all DL and WWW users. Chapter 3, on ontologies, explains how they help with browsing, query expansion, focused crawling, and classification. This chapter connects DLs with the Semantic Web, and uses CTRnet as an example. Chapter 4, on (hierarchical) classification, leverages LIS theory, as well as machine learning, and is important for DLs as well as the WWW. Chapter 5, on extraction from text, covers document segmentation, as well as how to construct a database from heterogeneous collections of references (from ETDs), i.e., converting strings to canonical forms. Chapter 6 surveys the security approaches used in information systems, and explains how those approaches can apply to digital libraries that are not fully open.

Given this rich content, those interested in DLs will be able to find solutions to key problems, using the right technologies and methods. We hope this book will help show how formal approaches can enhance the development of suitable technologies and how they can be better integrated with DLs and other information systems.

Contents

Complex objects / Nadia P. Kozievitch and Ricardo da Silva Torres -- Annotation / Uma Murthy, Lois M. Delcambre, Ricardo da Silva Torres, and Nadia P. Kozievitch -- Ontologies / Seungwon Yang and Mohamed Magdy Gharib Farag -- Classification / Venkat Srinivasan and Pranav Angara -- Text extraction / Sung Hee Park, Venkat Srinivasan, and Pranav Angara -- Security / Noha Elsherbiny.

1. Complex objects. Definitions -- Technologies for handling complex objects -- Comparison of co-related technologies (DCC, Buckets, OAI-ORE) -- Related work -- Formalization -- Complex object -- Case study: fingerprint digital library -- Integration of digital libraries -- Implementation.
2. Annotation. Related work -- Superimposed information -- Subdocuments and hypertext -- Subdocuments and SI in digital libraries -- Subdocuments and annotations -- Review of select definitions -- Complex objects -- Formalization and approach to a DL with superimposed information (SI-DL) -- 5S extensions -- Collections and catalogs -- Services -- SI-DL -- Case study: using the SI-DL metamodel to describe SuperIDR -- SuperIDR -- Analyzing and describing SuperIDR.
3. Ontologies. What is an ontology -- Kinds of ontologies -- Ontology languages -- Literature review -- Ontology engineering -- Ontology and digital libraries -- Ontology engineering -- Methodologies -- Tools -- Reasoning ontology -- Ontology applications -- Semantic web -- Focused crawling -- Case study: crisis, tragedy, and recovery (CTR) ontology -- Approach -- Exercises and projects.
4. Classification. Motivation -- ETDs and NDLTD -- Problem summary -- Research questions -- Contributions of this project -- Related work -- Definitions -- Hierarchical text classification -- Naive Bayes classifier -- Neural networks classifier -- Search-based strategy -- Comparative analysis -- Scalability analysis -- Streams -- Structures -- Spaces -- Scenarios -- Societies -- Formal definition of classification -- Hierarchical classification -- Case study: hierarchical classification of ETDs -- Building a taxonomy -- Crawling ETD metadata -- Categorizing ETDs.
5. Text extraction. Rationale and scope -- Research topic -- Problems and applications -- Related work -- Algorithms -- Feature selection -- Formalization -- Informal definitions -- Formal definitions -- Document segmentation -- Reference section extraction.
6. Security. Basic concepts -- Related work -- Content -- Performance -- User -- Functionality -- Architecture -- Quality -- Policy -- Formalization -- Streams -- Structures -- Spaces -- Scenarios -- Societies -- Connecting the 5s -- Case studies -- CTRnet/IDEAL -- CINET -- Summary -- Exercises and projects.
