1st International Workshop on Computer vision + ONTology Applied Cross-disciplinary Technologies

September 7th, 2014 in Zurich, Switzerland, in conjunction with ECCV 2014

Best Paper Award offered by IOS Press

Best Student Paper Award offered by IAOA

Image and video understanding is the process of converting elementary visual entities (pixels, voxels) into symbolic forms of knowledge (textual tags, predicates) by means of various kinds of models (statistical classifiers, neural networks, expert systems, etc.). It represents the highest processing level in a computer vision system, usually operating on top of a basic processing layer that extracts intermediate image representations (patches, volumes). Due to the unconstrained nature of photographic images and videos, and the lack of fully reliable low-level features, image understanding can be helped by grounding it in a prior semantic model that describes the domain knowledge and can operate during both learning and inference. This semantic layer is usually represented by means of an ontology, understood as a set of primitive concepts and relations expressed by axioms that provide an interpretation of the vocabulary chosen for the visual description of a domain.

After early steps in the eighties, the research area that cross-pollinates computer vision and formal ontology stagnated, probably limited by the lack of available domain ontologies. More recently, however, with the creation of shared resources such as ImageNet, TinyImage and LabelMe on the computer vision side, and WordNet on the formal ontology side, the area has taken off, leading to rapid growth of the scientific community working on it.

The aim of CONTACT 2014 is to bring together a wide range of researchers in computer vision and machine learning on one side, and formal ontology on the other, to share innovative ideas and solutions for exploiting the potential synergies that emerge from integrating the two domains for object and event recognition and for scene, image and video understanding. The long-term goals are to promote the development of a proper visual ontology and a better understanding of how such a visual ontology could be used for visual inference.

CONTACT 2014 will be the first of a series of events with a distinctive characteristic: in order to gradually bring the computer vision and formal ontology communities into a tight relationship, the CONTACT workshop will be hosted alternately at a major computer vision conference (such as ECCV) and at a major ontology conference (such as FOIS, Formal Ontology in Information Systems). In this way, the cross-fertilization will involve the best of the two communities, with alternating emphasis on one facet or the other.

Call for Papers

The topics of interest for the workshop include, but are not limited to, the following areas:

  • Semantic image/video understanding
  • Ontology-based cognitive vision
  • Semantic visual features
  • Visual concept ontology
  • Ontological engineering
  • Prototype theory
  • Ontology and probability theory
  • Automatic and semi-automatic ontology learning and inference
  • Web-based knowledge acquisition
  • Ontology representation for computer vision
  • Ontology- and knowledge-based vision systems
  • Ontologically driven scene understanding
  • Semantically-aided object identification
  • Semantically-aided action recognition
  • Expert systems for image and video processing
  • Video surveillance
  • Social signal processing
  • Medical imaging
  • Remote sensing


The workshop proceedings will be published post-conference by Springer, so that the final camera-ready version can incorporate comments received during the conference. PRE-papers, i.e. versions of the papers that account for the received reviews, will be loaded onto the conference USB stick so that they can be disseminated during the ECCV event.

Best Paper Award

A "Best Paper Award" of 500 euros granted by IOS Press will be conferred on the author(s) of a full paper presented at the workshop, selected by the Organizing Committee based on the best combined marks for paper reviewing, assessed by the Program Committee, and paper presentation quality, assessed by the Organizing Committee at the conference venue.

Best Student Paper Award

A "Best Student Paper Award" of 300 euros granted by IAOA will be conferred on a PhD (or Master or Bachelor) student who is among the authors of a full paper presented at the workshop, selected by the Organizing Committee based on the best combined marks for paper reviewing, assessed by the Program Committee, and paper presentation quality, assessed by the Organizing Committee at the conference venue. Moreover, free membership of the IAOA will be granted to three PhD students who are among the authors of papers selected following the same criteria.

Special Issue

A special issue of a top-class journal on the topics of the workshop is planned for next year (2015).


Organizing Committee:

Marco Cristani, University of Verona, Italy

Roberta Ferrario, ISTC-CNR Trento, Italy

Jason J. Corso, SUNY Buffalo, USA

Web & Publicity Chair:

Francesco Setti, ISTC-CNR Trento, Italy

Program Committee:

More focused on computer vision:

Francois Bremond, INRIA Sophia Antipolis, France

Rita Cucchiara, Università degli Studi di Modena, Italy

Trevor Darrell, University of California - Berkeley, USA

Jianping Fan, UNC-Charlotte, USA

Li Fei-Fei, Stanford University, USA

David Forsyth, University of Illinois at Urbana-Champaign, USA

Xian-Sheng Hua, Microsoft Research - Redmond, USA

Yu-Gang Jiang, Fudan University, Shanghai, China

Ioannis (Yiannis) Kompatsiaris, CERTH-ITI, Greece

Mirco Musolesi, University of Birmingham, UK

Bernd Neumann, Universität Hamburg, Germany

John R. Smith, IBM T. J. Watson Research Center, USA

Concetto Spampinato, Università di Catania, Italy

Rahul Sukthankar, Google Research, USA

Qi Tian, University of Texas at San Antonio, USA

Antonio Torralba, MIT, USA

Chris Town, University of Cambridge, UK

Andrea Vedaldi, University of Oxford, UK

Alessandro Vinciarelli, University of Glasgow & IDIAP Research Institute

Song-Chun Zhu, University of California - Los Angeles, USA

More focused on applied ontology:

Mehul Bhatt, University of Bremen, Germany

Antony Galton, University of Exeter, UK

Marcello Frixione, Università di Salerno, Italy

Céline Hudelot, Ecole Centrale Paris, France

Rongrong Ji, Xiamen University, China

José Manuel Molina López, Universidad Carlos III de Madrid, Spain

Alessandro Oltramari, Carnegie Mellon University, USA

Kate Saenko, UMass Lowell, USA

Mohan Sridharan, Texas Tech University, USA

Konstantin Todorov, University Montpellier 2, France

Shiqi Zhang, Texas Tech University, USA

Technical Program

09.00 – 09.05


09.05 – 10.05

Keynote Talk: Fei-Fei Li, Stanford University, USA
A Tale of Two Senses: Recognizing Pictures and Grounding Words
An ultimate goal of computer vision is to be able to generate a story describing a given image. We argue that one fruitful way to represent highly structured real-world images is to leverage the ontological structure of the semantic world. In this talk, we first briefly show how object labelling can be made more accurate and informative by exploiting the taxonomical structure of ImageNet. We then describe a recent, unpublished work on grounding fragments of descriptive sentences to their corresponding image fragments. This is done through a multi-modal embedding of visual and natural language data. Unlike previous models that directly map images or sentences into a common embedding space, our model works at a finer level and embeds fragments of images (objects) and fragments of sentences (typed dependency tree relations) into a common space. Our experimental evaluation shows that reasoning at both the global level of images and sentences and the finer level of their respective fragments significantly improves performance on image-sentence retrieval tasks.

10.05 – 10.30

Uncertainty Modeling Framework for Constraint-based Elementary Scenario Detection in Vision System
Crispim-Junior C. and Bremond F.

10.30 – 11.15

Coffee Break

11.15 – 11.40

Mixing Low-Level and Semantic Features for Image Interpretation
Donadello I. and Serafini L.

11.40 – 12.05

Events detection using a video-surveillance Ontology and a rule-based approach
Kazi Y., Lablack A., Ghomari A. and Bilasco I. M.

12.05 – 12.30

Semantic-Analysis Object Recognition: Automatic Training Set Generation Using Textual Tags
Abdulhak S. A., Riviera W., Zeni N., Cristani M., Ferrario R. and Cristani M.

12.30 – 14.00

Lunch Break

14.00 – 15.00

Keynote Talk: Werner Ceusters, New York State Center of Excellence in Bioinformatics & Life Sciences, USA
What can Ontological Realism and Referent Tracking contribute to computer vision?
The goal of Ontological Realism (OR) is the development of high-quality ontologies that faithfully represent what is general in reality. The goal of Referent Tracking (RT) is to obtain equally faithful representations of what is specific, for instance through the creation of datasets whose data items exclusively describe particulars and relationships between particulars in terms of the types and type-level relationships described in (ideally) OR-based ontologies. Whereas the principles of OR can contribute to the design of a universally applicable visual ontology able to integrate seamlessly with relevant domain ontologies, RT offers a framework for keeping track not only of what is described through visual recognition, but also of how the field of video recognition itself advances in terms of its theories, tools and applications.

15.00 – 15.25

Characterizing predicate arity and spatial structure for inductive learning of game rules
Dwibedi D. and Mukerjee A.

15.25 – 15.50

Perceptual Narratives of Space and Motion for Semantic Interpretation of Visual Data
Suchan J., Bhatt M. and Santos P. E.

15.50 – 16.00


16.00 – 16.45

Coffee Break

16.45 – 17.10

Multi-entity Bayesian networks for knowledge-driven analysis of ICH content
Nikolopoulos S., Chantas G., Kompatsiaris I., Grammalidis N., Dimitropoulos K., Kitsikidis A. and Douka S.

17.10 – 17.35

ALC(F): a new description logic for spatial reasoning in images
Hudelot C., Atif J. and Bloch I.

17.35 – 18.00


Keynote Speakers

Fei-Fei Li

Stanford University, USA

Fei-Fei Li is an associate professor in the Computer Science Dept. at Stanford University. Her main research interest is in vision, particularly high-level visual recognition. In computer vision, Fei-Fei’s interests span from object and natural scene categorization to human activity categorization in both videos and still images. In human vision, she has studied the interaction of attention with natural scene and object recognition, and the decoding of human brain fMRI activity involved in natural scene categorization by using pattern recognition algorithms. Fei-Fei received a Ph.D. degree from Caltech in 2005 and a B.S. degree in physics from Princeton University. From 2005 to August 2009, Fei-Fei was an assistant professor in the Electrical and Computer Engineering Department at University of Illinois Urbana-Champaign and the Computer Science Department at Princeton University, respectively. Fei-Fei is a recipient of a Microsoft Research New Faculty award, the Alfred Sloan Fellowship, a number of Google Research Awards and an NSF CAREER award, and a winner of a number of international visual computing competitions (AAAI-SVRC 2007, PASCAL VOC 2011).

Werner Ceusters

New York State Center of Excellence in Bioinformatics & Life Sciences, USA

Werner Ceusters studied medicine, neuro-psychiatry, informatics and knowledge engineering in Belgium. Since 1993, he has been involved in numerous national and European research projects in the area of Electronic Health Records, Natural Language Understanding and Ontology. Prior to moving to Buffalo, he was Executive Director of the European Centre for Ontological Research at Saarland University, Germany. He is currently Professor in the Psychiatry Department of the School of Medicine and Biomedical Sciences, SUNY at Buffalo NY, Director of the Ontology Research Group of the New York State Center of Excellence in Bioinformatics and Life Sciences, Director of Research of the UB Institute for Healthcare Informatics, and PhD Program Director of the UB Department of Biomedical Informatics. His research is focused on the application of Referent Tracking for data management and the requirements of ontologies and terminologies to be useful for annotation under this framework.
