Visuals - Film - Discourse

Project Description

The interpretation and analysis of images, moving images and image-text combinations raise substantial scientific challenges and are increasingly required for dealing with today's media and styles of information dissemination. Such combinations of distinct modes of presentation have developed over time and have now become the most frequently used mode of communication of all. Communicative artefacts that rely on combinations of distinct presentational modes are already common in traditional documents, newspapers and documentaries, webpages, user interfaces of human-computer systems, teaching materials of all kinds, and many others. This growing predominance notwithstanding, our theoretical understanding of how exactly such complex communicative artefacts carry the meanings that they do is surprisingly underdeveloped and fragmentary. The same holds for the equally common and perhaps even more important question of why they sometimes fail to carry their intended meanings, for example in instruction manuals that fail to instruct or in film sequences that are difficult to interpret. The overarching goal of this project is to achieve a demonstrable and measurable improvement and extension both of our fundamental understanding of this multimodal meaning-making process and of our ability to process it automatically. To achieve this, we are bringing together technical-formal quantitative methods and hermeneutic, interpretative qualitative methods in a combined, mutually enriching architecture for analysis and interpretation.

Sub-project 1: Image Understanding

The goal of this sub-project is to develop techniques that enable the recognition of concepts in news images. The challenge is that such images are complex and subject to very high variation in their characteristics. Complexity here refers to the number of image structures that must be interpreted in order to distinguish a concept from depictions of other concepts; the larger this number, the harder the recognition task. The variance of characteristics refers to the variety of different arrangements of image structures that can occur in images of one concept.

To date, it has not been possible to automatically recognize complex concepts in images with a high variance of characteristics. Most present approaches can recognize concepts only when strong restrictions on characteristics such as background, lighting or object specification are fulfilled. For this reason, these algorithms are usually specialised for one particular area of application and cannot easily be transferred to other areas.

The sub-project aims to contribute to the task of finding and realising universal architectures for image understanding. The goal is to develop processes that can learn arbitrary concepts. This is indispensable for the analysis of news images, because the factors that determine their contents change continually.

Sub-project 2: Automatic interpretation of film

Event detection is one important component of the automatic analysis of film. Events in film may, on the one hand, be signalled by filmic technical devices, for example a sudden change in volume; on the other hand, they may be more abstract, for example 'the return of the protagonist'. These abstract events pose highly challenging problems. In general, however, automatically detected events of all kinds could support the indexing of film, thus enabling more effective and efficient archiving and retrieval with respect to the event classes considered.
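As an illustration of the simpler, technically signalled kind of event, the following minimal Python sketch flags a sudden volume change in an audio track. It is not the project's method: the function names (`rms_per_window`, `detect_volume_changes`), the window size and the ratio threshold are all assumptions chosen for this example, and the input is a synthetic tone rather than real film audio.

```python
import math


def rms_per_window(samples, window_size):
    """Return the root-mean-square level of each non-overlapping window."""
    levels = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        levels.append(math.sqrt(sum(x * x for x in window) / window_size))
    return levels


def detect_volume_changes(samples, window_size=1024, ratio_threshold=4.0, floor=1e-6):
    """Return indices of windows whose level differs sharply from the previous window.

    ratio_threshold and floor are illustrative assumptions, not tuned values.
    """
    levels = rms_per_window(samples, window_size)
    events = []
    for i in range(1, len(levels)):
        prev = max(levels[i - 1], floor)  # floor avoids division by zero on silence
        curr = max(levels[i], floor)
        ratio = curr / prev
        if ratio >= ratio_threshold or ratio <= 1.0 / ratio_threshold:
            events.append(i)
    return events


# Synthetic example: a quiet 440 Hz tone followed by a loud one (8 kHz sample rate).
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(4096)]
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(4096)]
events = detect_volume_changes(quiet + loud, window_size=1024)
print(events)  # [4]: the jump occurs at the start of the fifth window
```

A detector of this kind only covers events that are signalled directly in the signal; abstract events such as 'the return of the protagonist' would require far richer features and models.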

Sub-project 2 is concerned with the automatic detection of those events in film that are of interest for the abstract, more interpretative analyses required for high-level analysis of film. In close cooperation with sub-project 4, a set of event classes will be compiled. The events considered should, on the one hand, be automatically detectable and, on the other hand, be useful for a linguistic, narrative and interpretative discourse analysis of film. Assisted by automatic methods in this way, larger corpora of film will be made accessible for discourse analysis.

Sub-project 3: Image-Text in Press and News

At the center of the sub-project at Jacobs University (TP 3) is the Political-Iconographic Archive of Prof. Marion G. Müller, with approximately 15,000 visuals, mainly press photographs but also cartoons and art reproductions, collected over the past 14 years and annotated according to a system of categories pertinent to contemporary political communication. The archive, PIAV (Political Iconographic Archive of Vision), follows in the footsteps of the Warburg tradition (Kulturwissenschaftliche Bibliothek Warburg, Hamburg) of the cultural historian Aby Warburg (1866-1929) and the Hamburg art historian Martin Warnke.

Press photographs are "fast food for the eyes". Digital archives, for example those of press agencies and newspapers, keep the visuals in stock, but their archiving rationale follows a strictly journalistic logic. Visual research is interested in different aspects of the visual and needs different retrieval tools to search for those visuals. While for the news medium the name of the photographer, the date of publication and the location at which the photo was shot are of prime interest, the visual researcher is interested in the motifs depicted. To this day, there is no classification system for visual motifs of press photographs. Using Müller's PIAV-index, the collaborative research project will develop a novel typology to classify mass-mediated visuals, which TP 1 will use as the basis for the prototype of an image analysis and retrieval tool. Among the many results expected from this research project are improved accessibility of digital visual archives and a better understanding of the interaction between text and visuals when making sense of news that is conveyed in a 'multimodal' form, combining "Visual - Text - Discourse". The research findings have implications both for cultural research into visuals and for semantic image analysis, visualization and visual archiving.


Sub-project 4: Filmic narration: the discourse structure of film

The main goal of this sub-project is to set out a catalogue of filmic constructions that are predicted to play an important role in guiding viewers' interpretations of film as it unfolds and that can also serve as targets for automatic analysis. Accordingly, the method begins informally and hermeneutically, based on film literature and a multimodal linguistic analysis of a corpus of films selected for analysis. In the next step, the constructions identified are set out formally in terms of the precise filmic features that are necessary for their recognition. These features will be explored with respect to their automatic recognition based on the results coming from sub-project 2 (TP 2) and, to the greatest extent possible, mechanisms for their automatic recognition will be specified. Finally, potential meanings for the constructions will be proposed based on the viewings. These proposals will then stand as hypotheses to be empirically investigated and refined by examining retrieved cases from films that have been automatically segmented and analysed. In this way, the catalogue of filmic constructions can be extended and validated empirically. A similar collection of constructions will be made for news reports and videos. It is expected that constructional meaning may also depend on particular filmic genres. This expectation will be verified or refuted by the comparative study.