AUTOMATIC INTEGRATION OF RELATIONAL DATABASE SCHEMAS
Date
2000-10-16
Abstract
This paper focuses on capturing the semantics of data stored in
databases with the goal of integrating data sources within a company, across a
network, and even on the World-Wide Web. Our approach to capturing data
semantics revolves around the definition of a standardized dictionary which
provides terms for referencing and categorizing data. These standardized
terms are then recorded in semantic specifications called X-Specs, which hold
metadata and semantic descriptions of the data. Using these semantic
specifications, it becomes possible to integrate diverse data sources even
though they were not originally designed to work together. We present the
centralized version of the architecture, which allows data source information
(represented using X-Specs) to be integrated independently into a unified view
of the data. The architecture preserves full autonomy of the underlying
databases, which the user accesses transparently from a central portal.
Distributing the architecture would bypass the central portal and allow
integration of web data sources to be performed by a user's browser.
A system that achieves automatic integration of data sources would have
a major impact on how the Web is used and delivered. Unlike wrapper
or mediator systems, which achieve data source integration by manually defining
an integrated view, our architecture automatically constructs an integrated
view from information independently provided by the data sources. Thus, the
contribution is an algorithm for schema integration, not just a methodology for
accessing data sources whose knowledge has been precombined into mediated
views. The integrated view is a hierarchy of concepts that is queried by
semantic name. As a result, the system provides both logical and physical
access transparency by mapping user queries on high-level concepts to physical
schema elements in the underlying data sources.
Notes
Jointly released technical report. Released as TR-00-15 for the University of
Manitoba, and 2000-662-14 for the University of Calgary.
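The mapping described in the abstract, from semantic names in an integrated view down to physical schema elements, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the X-Spec format, so the dictionaries, the source names (`orders_db`, `crm_db`), the field names, and the functions `integrate` and `resolve` are hypothetical and are not the report's algorithm.

```python
from collections import defaultdict

# Hypothetical X-Spec-like records (the real X-Spec format is not shown in
# the abstract). Each source independently maps its physical fields to a
# semantic name, written here as a path of shared dictionary terms.
XSPEC_ORDERS = {
    "source": "orders_db",
    "fields": {
        "cust.cname": ("Customer", "Name"),
        "cust.cid": ("Customer", "Id"),
    },
}
XSPEC_CRM = {
    "source": "crm_db",
    "fields": {
        "clients.full_name": ("Customer", "Name"),
        "clients.phone": ("Customer", "Phone"),
    },
}


def integrate(specs):
    """Build a view keyed by semantic name from independently written specs."""
    view = defaultdict(list)
    for spec in specs:
        for physical, semantic in spec["fields"].items():
            # Fields from different sources that share a semantic name
            # end up under the same concept in the integrated view.
            view[semantic].append((spec["source"], physical))
    return dict(view)


def resolve(view, semantic_name):
    """Map a query on a semantic name to (source, physical field) pairs."""
    return view.get(semantic_name, [])


view = integrate([XSPEC_ORDERS, XSPEC_CRM])
print(resolve(view, ("Customer", "Name")))
```

In this sketch neither source knows about the other; the shared dictionary terms alone decide which fields are grouped together, and the term paths give the view its concept hierarchy (everything under `Customer`, for example).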
Keywords
Computer Science