The Semantic Web: Architectural Patterns for Evolution
IS 3957: Doctoral Seminar: Systems and Technology
IS 2937: Advanced Topics: Systems and Technology
IS 3957/IS 2937
Spring 2003 (03-2)
CRN: 46626/38151
Tuesday, 3:00-6:00
Room 406
Michael B. Spring
Department of Information Science and Telecommunications
University of Pittsburgh
spring@imap.pitt.edu
This doctoral seminar, like most doctoral seminars, is intended as a directed but flexible exploration of a topic. What follows is a starting point for that exploration, not the whole of it. The course is also open as an Advanced Topics course to MSIS students who have had document processing and/or client-server systems. Both PhD and MSIS students should anticipate working with the instructor to define additional readings and work related to the course.
Introduction
This seminar will address the development of the semantic web. The term “semantic web” appears to have
been coined by Tim Berners-Lee.
The first reference I can find is from his book “Weaving the Web”. Since that time, both the concept he
put forward and the term have been used in a variety of different ways. At the
very least, the semantic web alludes to the fact that new methods of organizing
the resources on the web will be required, that agents as well as humans will
be browsing the web, and that the resources on the web will increasingly be
dynamic programs rather than static files.
This seminar will explore the technologies, concepts, and current research on building systems that offer a more semantic web. Its focus will be shaped in part by the interests of the students taking it, who will include advanced master’s students as well as PhD students. What is clear is that the semantic web will somehow contain more “information”, and that this information will take a form that enables algorithmic as well as human processing.
Whatever eventually evolves, some things about the web are already
clear:
1. HTML documents and HTTP have been widely accepted and represent an important base upon which anything new will be built. The semantic web will most likely use successors to the current array of clients and servers, able to handle the web as currently configured as well as new forms of resources that add functionality. Put more simply, the Semantic Web will supplement, not supplant, the Simple Web.
2. Business will play a greater and greater role in shaping the web and its capabilities. While the web was initially dominated by academics and altruistic information sharing, the web of the future will be increasingly devoted to commerce in a wide variety of forms.
3. The amount of information on the web continues to grow. The reality is that individual link traversal is a lousy way to find information. Classificatory libraries (Yahoo) and full-text indexing (search engines) have emerged as the dominant mechanisms for locating resources. Both of these approaches, expert classification and text indexing, have significant limitations. Much of the semantic web effort is directed at designing systems that are not subject to these limitations.
4. The resources that are available on the web are increasingly opaque and/or transitory. Opaque resources are programs that generate output based on some input. It is meaningless to full-text index such a resource: the real content is opaque to search engines. Similarly, some data generated as part of a resource is highly transitory, so even if it is indexed, it is not likely to be the same when accessed at a later time. It is believed that “semanticizing” these resources would help to overcome this limitation.
5. Distributed computing in its many forms is growing in popularity, and it is only natural to expect that some form of distributed computing will also make sense for the future of the web. There is a desire to create information stores on the web that are standard enough to be processed by programs.
6. The development of the XML family of standards has resulted in a fabulously rich conceptual infrastructure for the development of new technologies, tools, and capabilities.
The seminar will explore how these various goals and trends might be realized. Within this context, the participants will be expected to develop well-reasoned position papers and prototype implementations.
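To make the idea of information that enables “algorithmic as well as human processing” more concrete, the following is a minimal sketch in Python of resources described by simple subject-predicate-object statements that a program can query. It is an illustration only, not RDF or any standard API: the `Triple` type, the `match` and `resources_about` functions, the predicate names, and the example.org URLs are all hypothetical. Note that the first resource is a program, which a full-text index could not describe, yet a statement about it can still be queried.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement about a web resource (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical statements describing two resources on the same topic.
# The first is a program that generates output; the second is a static file.
STATEMENTS = [
    Triple("http://example.org/catalog", "type", "DynamicProgram"),
    Triple("http://example.org/catalog", "topic", "books"),
    Triple("http://example.org/essay.html", "type", "StaticDocument"),
    Triple("http://example.org/essay.html", "topic", "books"),
]


def match(statements: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the statements that agree with every field that is not None."""
    return [
        t for t in statements
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def resources_about(statements: List[Triple], topic: str) -> List[str]:
    """Return the sorted URLs of all resources stated to be about a topic."""
    return sorted({t.subject for t in match(statements, predicate="topic", obj=topic)})


if __name__ == "__main__":
    # An agent can find resources by stated topic, whatever their form.
    for url in resources_about(STATEMENTS, "books"):
        kinds = [t.obj for t in match(STATEMENTS, subject=url, predicate="type")]
        print(url, kinds)
```

The design choice here is deliberately small: statements live apart from the resources they describe, so a dynamic program and a static file can be located in the same way.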
Goals
The goals of the seminar are:
- to provide an introduction to the literature on the topic.
- to provide an introduction to the relevant technologies and languages.
- to provide an opportunity for the participants to engage in a discussion.
- to establish a framework for future research in this area.
- to develop a meaningful set of implementations that demonstrate a more semantic web.
Organization
The seminar will be broken down into three parts:
- Orientation and Proposal (weeks 1-4)
  - The participants will read “Weaving the Web”, “What Will Be”, and a series of articles on various aspects of the Semantic Web. By the end of the third week, the participants in the seminar should have formed some more operational definitions and some requirements for the next generation web.
  - During this same period, the seminar leader will introduce the notion of an information marketplace and will lead an effort to refine an existing proposal to NSF to develop a collaboration infrastructure.
- Technologies and Methodologies (weeks 4-8)
  - Key technologies that will underlie the next generation web will be explored. This will focus on a thorough review of XML, certificates, and distributed computing. The seminar leader will be responsible for presenting this material, and the participants will be expected to wade through the various standards and specifications.
  - The participants will work with the instructor to identify the nature of the design problem for the next generation or semantic web. Is it an enterprise application, a business framework, an information marketplace, an API? How do these various systems differ? What are the specifics of the design suggested by a collaboration infrastructure?
- Positions, Prototypes, Plans, and Problems (weeks 9-15)
  - The remainder of the seminar will be devoted to development of prototype designs and participant-led discussions related to these prototypes. It is expected that each participating PhD student will have selected a narrow focus within the area for investigation and presentation.
  - Advanced master’s students taking the special topics course will work with the PhD student of their choice in implementation of a prototype.
  - PhD students will be responsible for collecting and distributing additional readings related to the focus of their work and for guiding a discussion.
  - The participants will use this period to demonstrate and discuss the software modules developed. These sessions will focus on walkthroughs of the projects and suggestions for improvement and testing.
Outcomes
There are four expected outcomes of this seminar for PhD students:
- Each participant is expected to develop a contract that defines their goals for the seminar. This will include some preliminary statement about the specifics of the next three points in their case. It will also serve as a preliminary statement by them about how in general they will contribute to the seminar, what they expect the seminar to accomplish for them, and what kinds of expertise they are able to offer the other participants.
- Each participant is expected to develop an overview of the literature in the area to the point where they will be able to identify several additional papers in the area, read and digest those papers, and guide a class discussion of the papers.
- Each participant is expected to write a review of the literature that begins with the papers discussed in the class and continues on to other relevant papers in some specialized area of the participant’s choice. The focus of the review should be to raise an issue or make a point about what should be possible or might be done by way of further research in this area.
- Each participant is expected to develop a design for a software system.
There are two expected outcomes of this course for advanced master’s students:
- Each participant is expected to contribute to a website that will be developed during the seminar. Two contributions will be required. The first will be a condensation of the readings and the discussion in class based on those readings. This will run 5-15 pages and will both condense the readings and discussion and expand upon them. It is anticipated that students will explain how the readings are related, give examples, and develop simulations as necessary to explain the ideas.
- Each participant is expected to contribute to the development of a software system that provides a prototype example of some aspect of the next generation web.
Preliminary Reading List
· Tim Berners-Lee, James Hendler and Ora Lassila, "The Semantic Web." In Scientific American, May 2001, pages 35-44.
· Tim Berners-Lee, Weaving the Web. Harper, 1999.
· Michael Dertouzos, What Will Be. Harper, 1997.
· Steven M. Cherry, "Weaving A Web of Ideas. Engines that search for meaning rather than words will make the Web more manageable." In IEEE Spectrum Online (September 2002).
· Uche Ogbuji, "The Languages of the Semantic Web." In New Architect Volume 7, Issue 6 (June 2002), pages 30-33.
· Jim Hendler and Brian Parsia, "XML and the Semantic Web. It's Time to Stop Squabbling -- They're Not Incompatible." In XML Journal Volume 03, Issue 10 (October 2002).
· Sandro Hawke, How the Semantic Web Works. http://www.w3.org/2002/03/semweb/
· Aaron Swartz, The Semantic Web In Breadth. http://logicerror.com/semanticWeb-long
· The Semantic Web: An Introduction. http://infomesh.net/2001/swintro/