National needs for digital library technology:
images, languages, and data fusion.

The need to gather information about terrorism has highlighted
areas of application for digital library technology. Among
the key issues are image analysis, whether face scanning
at airports or analysis of satellite surveillance photographs,
and search and translation for languages not previously
given much study. But perhaps one of the most difficult
problems is merging information from many sources and
in many media.
Spoken language processing is a familiar area, but more
attention has been given to statistical techniques in
recent years, and we now see an interest in handling languages
for which we do not have large amounts of bilingual text.
Fortunately, statistical methods are improving, and there
are now projects (at CMU and JHU, for example) on transferring
information between languages and on statistically
based translation.
Image retrieval is an area of great activity right now,
including 3-D graphics software and video as well as static
2-D pictures. The basic paradigm for image retrieval is
to scan the image with some low-level feature extractor
and then use the resulting numbers as features for
classification and retrieval. The simplest feature to extract is the
color histogram, which is why Robert Wilensky always says
to be suspicious of any image retrieval demo which concentrates
on finding pictures of sunsets. Although we have moved
on from that, we are still well short of something a photo
or film librarian would use. About half the queries asked
of a film librarian require knowing the names of things
in a picture - not just "tower" or "river"
but "Eiffel Tower" or "River Seine".
Thus it may be as important to pursue projects based on
recognizing existing images from a dictionary (e.g. faces)
as to do basic feature extraction and classification.
Some interesting projects are the shape recognition and
image labeling work at Berkeley (Malik and Forsyth), scale
and viewpoint invariance search at Oxford (Zisserman),
and face recognition at CMU. Interesting applications
include the analysis of aerial photographs and of images
of geological structures, and (recently) an emphasis on tracking
individuals across collections of pictures. The most recent
research frontier is in 3-D models and searching, ranging
from historic building reconstruction to drug design.
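
The color-histogram retrieval described above can be illustrated
with a minimal sketch. Everything in it is an assumption made
for illustration, not a method taken from any of the projects
named here: the function names, the choice of four bins per
channel, the use of histogram intersection as the similarity
measure, and the toy "images" (plain lists of RGB tuples).

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized histogram over quantized (r, g, b) bins.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    """
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    counts = Counter(
        tuple(min(channel // step, top) for channel in pixel)
        for pixel in pixels
    )
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the sum of the bin-wise minima."""
    return sum(min(weight, h2.get(bin_id, 0.0)) for bin_id, weight in h1.items())


def rank_by_color(query_pixels, collection):
    """Rank the names in `collection` by color similarity to the query."""
    query = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query, color_histogram(pixels)), name)
        for name, pixels in collection.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical toy images: two orange-dominated scenes and one green one.
sunset = [(250, 120, 30)] * 60 + [(200, 60, 20)] * 40
beach_sunset = [(245, 110, 40)] * 50 + [(210, 50, 10)] * 30 + [(30, 30, 60)] * 20
forest = [(20, 120, 40)] * 80 + [(60, 40, 20)] * 20

ranking = rank_by_color(sunset, {"beach_sunset": beach_sunset, "forest": forest})
print(ranking)  # ['beach_sunset', 'forest']
```

The sketch also shows why sunsets make a flattering demo: the
match depends only on the overall distribution of colors, so any
image dominated by oranges and reds scores well, and nothing in
the histogram says what objects the picture contains.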
The need to integrate large amounts of information from
multiple sources, and to present it in more intelligible
ways, is also challenging our researchers. But again,
progress is being made: there is now a 100-terabyte public
text database (the Internet Archive) and we are learning
more techniques for summarization and filtering. We still
need better ways to display information and to exploit
it in practice.