Multilevel Representation and Query Processing in Multimedia Database Systems
Abstract: Multimedia documents of the future will be composed of several kinds of monomedia data, such as text, audio, image and video. Composing and searching such complex documents poses several database challenges. This proposal aims to define and address some of these issues.
Multimedia documents are already used in several application areas, including manufacturing, medicine, education, business and entertainment. For example, the semiconductor industry uses multimedia documents extensively, and the PINACLS group is currently attempting to develop standards for product information exchange via multimedia documents. Further, there is growing interest in the manufacturing industry in developing technology to support interactive electronic technical manuals (IETM).
A multimedia database should accommodate the heterogeneity that may exist among its users due to differences in their preconceived interpretation or intended use of the database. To accommodate such heterogeneity, it is essential to have a model for multimedia data and documents that can respond to the various types of queries posed by different users. Some example queries are: (i) an instructor in an aeronautical engineering class may search for a document with a video clip showing the take-off of a space shuttle; (ii) an investigative agency may request, "Use all available data to find every person who has spoken to this suspect in the last week"; (iii) a researcher may request, "Find documents on music containing a video appearance of the singer Michael Jackson". For this project, we propose to develop a framework for modeling multimedia data and to design mechanisms for managing and accessing relevant information and answering the types of queries mentioned above.
Recently, there have been some attempts to model multimedia documents. The PIs have been actively pursuing research in this area and have proposed a framework for modeling multimedia data. However, many challenges remain before the information available in a multimedia database can be fully utilized. Using results from our previous research, we propose to develop spatio-temporal models for multimedia objects that will facilitate a multilevel search for specific documents, events or objects in a multimedia database. This will allow a multimedia database system to process a large class of heterogeneous queries.
One of the main challenges is to accommodate the imprecision of automated low-level processing of raw multimedia data. Annotation of the database with textual descriptors is usually done via image processing techniques. However, such approaches have their limitations because currently known image processing algorithms are not robust enough to guarantee perfect results for every query. For multimedia data, the problems caused by this imprecision need to be identified and solved. The PIs are also currently pursuing research in various aspects of image processing. We propose to model the imprecision of image processing using the principles of fuzzy logic. Fuzzy parameters will be extracted from the search request and also passed back to the user as part of the query response. We plan to use such an approach to accommodate, and possibly overcome, the barriers currently faced by the research community in developing an automatic image processing and indexing system.
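As an illustration only, the following minimal sketch shows one way fuzzy parameters could enter and leave a query. It is not the proposed system. It assumes that each document carries descriptors with membership degrees in [0, 1] produced by some image processing stage, that the search request supplies a threshold as its fuzzy parameter, and that a conjunction of descriptors is scored with the standard fuzzy AND (the minimum). The names `Document`, `match_degree` and `fuzzy_query` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Document:
    """A multimedia document with fuzzy descriptors (hypothetical structure).

    Each descriptor maps to a membership degree in [0, 1] that expresses how
    strongly the low-level processing believes the label applies.
    """
    name: str
    descriptors: Dict[str, float] = field(default_factory=dict)


def match_degree(doc: Document, terms: List[str]) -> float:
    """Degree to which `doc` satisfies ALL terms, using min as the fuzzy AND.

    A descriptor that is absent from the document counts as degree 0.0.
    """
    if not terms:
        return 0.0
    return min(doc.descriptors.get(term, 0.0) for term in terms)


def fuzzy_query(
    docs: List[Document], terms: List[str], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, degree) pairs with degree >= threshold, best match first.

    The threshold is the fuzzy parameter taken from the request; the degree in
    each result is the fuzzy parameter passed back to the user.
    """
    scored = [(doc.name, match_degree(doc, terms)) for doc in docs]
    hits = [(name, degree) for name, degree in scored if degree >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative data only; the file names and degrees are made up.
    library = [
        Document("launch.mpg", {"space shuttle": 0.9, "take-off": 0.7}),
        Document("landing.mpg", {"space shuttle": 0.8, "take-off": 0.2}),
        Document("concert.mpg", {"singer": 0.95}),
    ]
    print(fuzzy_query(library, ["space shuttle", "take-off"], threshold=0.5))
```

Returning the degree with each hit, rather than a plain yes or no, lets the user judge how much to trust a match that rests on imperfect image processing. Other fuzzy conjunctions, such as the product, could be substituted for the minimum.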