Multimedia Information Retrieval

Mark David Dunlop

Being a thesis submitted for the degree of Doctor of Philosophy in the Department of Computing Science at the University of Glasgow.

7 October 1991

(c) Mark D. Dunlop 1991

Postscript & PDF full version by chapter.

Summary

With recent advances in screen and mass storage technology, together with the ongoing advances in computer power, many users of personal computers and low-end workstations are now regularly manipulating non-textual information. This information may be in the form of, for example, drawings, graphs, animations, sound, or video. Despite the increased usage of these media on computer systems, there has not been much work on the provision of access methods to non-textual computer-based information.

An increasingly common method for accessing large document bases of textual information is free text retrieval. In such systems users typically enter natural language queries. These are then matched against the textual documents in the system. It is often possible for the user to reformulate a query by providing relevance feedback; this usually takes the form of the user informing the system that certain documents are indeed relevant to the current search. This information, together with the original query, is then used by the retrieval engine to provide an improved list of matched documents. Although free text retrieval provides reasonably effective access to large document bases, it does not provide easy access to non-textual information. Various query-based access methods to non-textual document bases are presented, but these are all restricted to specific domains and cannot be used in mixed media systems.

Hypermedia and hypertext, on the other hand, provide an access method for document bases which is based on the user browsing through the document base rather than issuing queries. A set of interconnected paths is constructed through the base which the user may follow. Although it provides poorer access to large document bases, the browsing approach does provide very natural access to non-textual information. The recent explosion in hypermedia systems and discussion has been partly due to the requirement for access to mixed media document bases.

Some work is reported which presents an integration of free text retrieval based queries with hypermedia. This provides a solution to the scaling problem of browsing-based systems: these systems provide access to textual nodes by query or by browsing. Non-textual nodes are, however, still only accessible by browsing, either from the starting point of the document base or from a textual document which matched the query.

A model of retrieval for non-textual documents is developed. This model is based on a document's context within the hypermedia document base, as opposed to the document's content. If a non-textual document is connected to several textual documents by paths in the hypermedia, then it is likely that the non-textual document will match the query whenever a high enough proportion of the textual documents match. This model of retrieval uses clustering techniques to calculate a descriptor for non-textual nodes so that they may be retrieved directly in response to a query. To establish that this model of retrieval for non-textual documents is worthwhile, an experiment was run which used the text-only CACM collection. Each record within the collection was initially treated as if it were non-textual and had a cluster-based descriptor calculated from its citations. This cluster-based descriptor was then compared with the actual descriptor (calculated from the record's content) to establish how accurate the cluster descriptor was. As a base case the experiment was repeated using randomly created links in place of citations. The results showed that for citation-based links the cluster-based descriptors had a mean correlation of 0.230 with the content-based descriptors (on a range from 0 to 1, where 1 represents a perfect match) and performed approximately six times better than when random links were used (mean random correlation was 0.037). This shows that citation-based cluster descriptors of documents are significantly closer to the actual descriptors than those based on random links and that, although the correlation is quite low, the cluster approach provides a useful technique for describing documents.
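The idea of describing a non-textual node by its linked textual neighbours can be sketched as follows. This is only an illustration under stated assumptions: the summary does not give the weighting scheme or the clustering formula, so the unweighted mean of raw term counts, the cosine measure, and all names used here (term_vector, cluster_descriptor, the node identifiers and sample texts) are hypothetical choices and not the method used in the thesis.

```python
import math
from collections import Counter


def term_vector(text):
    """Raw term counts for a piece of text (assumed weighting: plain counts)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_descriptor(neighbour_vectors):
    """Assumed descriptor: the mean of the linked textual nodes' vectors."""
    total = Counter()
    for vec in neighbour_vectors:
        total.update(vec)
    n = len(neighbour_vectors)
    return {t: w / n for t, w in total.items()} if n else {}


# Hypothetical document base: three textual nodes and one image node
# that is linked to two of them.
texts = {
    "t1": "hypermedia browsing links",
    "t2": "hypermedia query retrieval",
    "t3": "compiler optimisation",
}
links = {"img": ["t1", "t2"]}

descriptor = cluster_descriptor([term_vector(texts[n]) for n in links["img"]])
query = term_vector("hypermedia retrieval")
score = cosine(descriptor, query)
print(round(score, 3))
```

With a descriptor of this kind the image node can be ranked against a query alongside textual nodes, even though it has no text of its own.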

The model of retrieval presented for non- textual documents relies upon a hypermedia structure existing in the document base, since the model cannot work if the documents are not linked together. A user interface to a document base which gives access to a retrieval engine and to hypermedia links can be based around three main categories:

* browsing only access, use the retrieval engine to support link creation

* query only access, use links to provide access to non-text

* query and browsing access

Although the last user interface may initially appear most suitable for a document base which can support queries and browsing, it is also potentially the most complex interface, and may require a more complex model of retrieval for users to successfully search the document base. A set of user tests was carried out to establish user behaviour and to consider interface issues concerning easy access to documents which are held on such document bases. These tests showed that, overall, no access method was clearly better or poorer than any other method. The traditional view that hypermedia is easier for novices to use, but free text querying is better for experts, was supported to a certain extent, but the differences were not large. The tests also raised several areas for consideration when building a user interface to a document base with some hypermedia structure and a retrieval engine.

The provision of query and browsing access within a user interface also raises an issue concerning relevance feedback: in a traditional retrieval engine the user could only give relevance feedback on documents which matched the last query (since these are the only documents which can be accessed). In a system which allows the user to browse the neighbourhood of a matched document it is possible that the user will view, and thus give relevance feedback on, documents which do not match the query. A discussion and experiments are presented which show that the effect of feedback follows intuition for positive feedback, but that negative feedback, under the vector space model, is not as intuitive and cannot be considered an inverse operation to positive feedback.
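One way such an asymmetry can arise is sketched below. It is not the experiment reported in the thesis. It assumes a Rocchio-style query update in which negative term weights are clipped to zero, which is a common convention but is not stated in this summary; the function name feedback, the weight of 0.5, and the sample vectors are all hypothetical.

```python
def feedback(query, doc, weight):
    """Rocchio-style update (an assumption): query + weight * doc.

    A positive weight models positive feedback and a negative weight
    models negative feedback. Term weights that would fall to zero or
    below are dropped, i.e. clipped to zero.
    """
    updated = {}
    for term in set(query) | set(doc):
        w = query.get(term, 0.0) + weight * doc.get(term, 0.0)
        if w > 0:
            updated[term] = w
    return updated


query = {"hypermedia": 1.0, "retrieval": 1.0}
# A document reached by browsing that shares no terms with the query.
unmatched = {"video": 1.0, "sound": 1.0}

after_pos = feedback(query, unmatched, +0.5)   # adds "video" and "sound"
after_neg = feedback(query, unmatched, -0.5)   # clipped: query is unchanged

# Negative feedback followed by positive feedback on the same document
# does not return to the original query.
round_trip = feedback(after_neg, unmatched, +0.5)

print(after_neg == query)
print(round_trip == query)
```

Under these assumptions positive feedback on the unmatched document changes the query, while negative feedback on it has no effect at all, so applying one after the other does not cancel out.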

Further details

The remainder of the thesis is not yet available on-line. For a printed version, please contact the Research Reports Secretary at Computing Science Department, University of Glasgow, Glasgow G12 8QQ, Scotland. You should quote the following details: M D Dunlop, Multimedia Information Retrieval, reference 1991/R21.