Through the Cosmoroe Travel Search Engine, one can browse examples of image-language associations in TV travel series. In particular, the examples come from two series, both of which have been translated into both Greek and English for the needs of this service:

License - Downloads

All annotation/analysis related to these examples belongs to ILSP/ATHENA R.C. and is licensed through an Open Commons Non Commercial ShareAlike license. The annotation files comprise the "Cosmoroe Annotated Data Corpus" (ISLRN: 668-823-721-622-8).

Ownership of the copyright of each video segment or static frame presented in each example remains with the original owners; this material is included here for illustration purposes only. See full details here. The analyses presented in these examples are the researchers' own and do not reflect the views of the video copyright owners.

Why TV Travel

TV travel series usually involve one or more presenters visiting different places, meeting the locals, interviewing people, and explaining the habits, traditions and way of life in specific locations. They are highly interactive and adventurous. As in everyday interaction, language is used to refer to a wide range of things, from tangible entities directly depicted in the programme to more abstract concepts. Specific terms are mixed with everyday colloquial language, and there are no strict restrictions on the vocabulary used, the length of the descriptions or the visual modalities.

Therefore, these audiovisual files include a variety of language modalities (speech, and text: subtitles, scene-text, graphic-text etc.) and visual modalities (natural image sequences/filming, graphics such as maps) that depict not only objects/entities but also gestures (e.g. deictic, emblems, iconic, metaphoric) and other body movements. In many cases, the files contain section-titles, i.e. captioned frames, in which one may observe modality interaction between an image and its caption, as one would with a static photograph and its accompanying caption. We have therefore selected these TV travel series for cross-media semantic annotation because of the richness of the interacting modalities available in this genre.

Some Statistics

The total number of multimodal relations annotated in the two travel series, the number of textual and visual arguments participating in them, and the way relations and arguments are connected are given here, here, and here respectively. Furthermore, for the lemmas found either as a textual argument or as a label for a visual argument, the following counts were obtained:

Annotation type                          English   Greek

Textual Arguments related annotations
  Anchor Text (lemmatised words)             669     742

Visual Arguments related annotations
  Body Movement (lemmatised labels)          113     112
  Gesture (lemmatised labels)                 15      14
  Shot (lemmatised labels)                   206     212
  Keyframe (lemmatised labels)               129     134

Textual And Visual Arguments related annotations
  Total Lemmas                               882     989

Related Publications


All research related to these annotations has been carried out in the framework of the FP7-ICT Project POETICON++ (Grant No: 288382) and its predecessor, POETICON (Grant No: 215843), by:

We also thank Maria Lada and Maria Koutsobogera for preliminary annotation of the files.
