Behaviour from Head Pose
The aim of the project is to automatically identify the direction in which
people are facing from a distant camera in a surveillance situation to
provide input to higher level reasoning systems. The direction in which
somebody is facing provides a good estimate of their gaze direction, which
can be used to infer familiarity between people or interest in surroundings.
It can be seen as closing the gap between a coarse description of humans
from a distance and a more detailed motion of limbs, usually obtained from a
closer view. The work is partly funded by the HERMES project, under work
packages 3 and 4.
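The link between facing direction and social interaction can be made concrete with a small geometric sketch. This is an illustration only, not the project's method: it assumes a head yaw angle has already been estimated for each person, and the function names and tolerance are hypothetical.

```python
import math

def gaze_vector(yaw_deg):
    """Convert an estimated head yaw (degrees, 0 = facing +x) into a
    2-D unit gaze direction on the ground plane."""
    yaw = math.radians(yaw_deg)
    return (math.cos(yaw), math.sin(yaw))

def looking_at(pos_a, yaw_a_deg, pos_b, tol_deg=15.0):
    """Rough test of whether person A's facing direction points toward
    person B, within an angular tolerance (a crude proxy for interest
    or familiarity)."""
    gx, gy = gaze_vector(yaw_a_deg)
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    angle = math.degrees(math.atan2(dy, dx) - math.atan2(gy, gx))
    angle = (angle + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return abs(angle) <= tol_deg

# Person A at the origin, facing 45 degrees, looks toward B at (1, 1)
print(looking_at((0.0, 0.0), 45.0, (1.0, 1.0)))  # True
```

A higher-level reasoning system could accumulate such pairwise tests over time to infer interactions between tracked people.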
Active Scene Exploration
Effective use of resources is an underlying theme of this project. The
resources in question are a set of cameras which overlook a common area from
varying viewing angles. These cameras are heterogeneous and expose
different control parameters: some are static, while others support pan,
tilt and zoom. Information-theoretic measures are used to choose the best
surveillance parameters for these cameras, where "best" can be defined by
higher-level reasoning or by human operators. Currently, the work concentrates
on objective functions from information-theory and the use of sensor data
fusion techniques to make informed decisions.
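One way to make the information-theoretic objective concrete is a minimal sketch of expected information gain over a discretized belief about a target's position. The setting names, coverage model, and the assumption that an observation only reports whether the target lies inside the covered region are all simplifications introduced here for illustration, not the project's actual objective function.

```python
import math

def binary_entropy(p):
    """Entropy (bits) of a binary outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def best_setting(belief, coverage):
    """belief: dict cell -> probability the target is in that cell.
    coverage: dict camera setting -> set of cells that setting observes.
    Under the idealized model that an observation only reveals whether
    the target is inside the covered region, the expected information
    gain is the entropy of that binary outcome.  Returns the setting
    with the highest expected gain."""
    def gain(setting):
        p_inside = sum(belief[c] for c in coverage[setting])
        return binary_entropy(p_inside)
    return max(coverage, key=gain)

belief = {"A": 0.5, "B": 0.3, "C": 0.2}
coverage = {"wide":   {"A", "B", "C"},  # covers everything: zero gain
            "zoom_A": {"A"},            # p = 0.5 -> 1 bit expected
            "zoom_C": {"C"}}            # p = 0.2 -> about 0.72 bits
print(best_setting(belief, coverage))   # zoom_A
```

The same pattern extends to richer observation models: replace the binary-entropy gain with the expected reduction in posterior entropy under each candidate setting.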
As part of the HERMES project, the goal is to establish a perception/action
cycle with specific consideration of varying zoom levels. The distributed
camera system can be interpreted as an abstract sensor that accepts
higher-level objectives as input.
At the coarsest scale of the agent representation, agents are tracked and
their trajectories recorded, together with other coarse-scale features
useful for action and intention recognition. The aim is then to generate
behavioural and conceptual descriptions of the agent itself and of its
relationship to other agents and to predefined objects in the scene.
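Coarse-scale trajectory tracking of this kind can be sketched with a simple predict-and-correct loop. The constant-velocity model and the single blending factor below are stand-ins for a full Kalman filter, and all names are hypothetical:

```python
def predict_and_update(state, measurement, alpha=0.5):
    """One step of a simple alpha filter for a 2-D agent track.
    state = (x, y, vx, vy); measurement = observed (x, y); alpha blends
    the constant-velocity prediction with the new measurement."""
    x, y, vx, vy = state
    px, py = x + vx, y + vy              # constant-velocity prediction
    mx, my = measurement
    nx = px + alpha * (mx - px)          # correct toward the measurement
    ny = py + alpha * (my - py)
    return (nx, ny, nx - x, ny - y)      # re-estimate velocity

# Track an agent through three noisy position measurements
track = (0.0, 0.0, 1.0, 0.0)
for z in [(1.1, 0.0), (2.0, 0.1), (3.1, 0.0)]:
    track = predict_and_update(track, z)
print(tuple(round(v, 2) for v in track[:2]))  # smoothed position estimate
```

The resulting trajectory (position plus velocity over time) is exactly the kind of coarse feature a higher-level module can consume for action and intention recognition.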
Cognitive Computer Vision
Recent work in visual tracking and camera control has looked at the issues
involved in activity recognition using parametric and non-parametric belief
propagation in Bayesian Networks, and begun to touch on the issues of
causality. The current research takes all of these areas forward. The
ultimate goal will be to combine these techniques to produce a pan/tilt/zoom
camera system, and/or network of cameras, that can allocate attention in an
intelligent fashion via an understanding of the scene, inferred
automatically from visual data.
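The kind of probabilistic inference mentioned above can be illustrated in its simplest form: a single message pass in a two-node Bayesian network linking a hidden activity to an observed motion feature. The states, probabilities, and function name are invented for illustration; real activity-recognition networks are far larger and use parametric or non-parametric belief propagation.

```python
def posterior(prior, likelihood, observation):
    """Exact Bayesian update on a discrete variable: one message pass
    in a two-node network (activity -> observed feature).
    prior: dict state -> P(state)
    likelihood: dict state -> dict obs -> P(obs | state)"""
    unnorm = {s: prior[s] * likelihood[s][observation] for s in prior}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

prior = {"walking": 0.7, "loitering": 0.3}
likelihood = {"walking":   {"moving": 0.9, "still": 0.1},
              "loitering": {"moving": 0.2, "still": 0.8}}

# Observing "still" shifts the belief strongly toward loitering
print(posterior(prior, likelihood, "still"))
```

In a camera-control loop, the resulting posterior over activities is what drives where the system allocates attention next.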
The topic is directly related to the EU project HERMES, which is in the
exciting and socially relevant area of intelligent visual surveillance. The
aim of the research is to develop camera systems that could be considered
to exhibit emergent cognitive behaviour, by developing algorithms and
ontologies for the understanding of visual scenes.
The video compares the monocular SLAM system running with and without object detection in a split-screen view. The system without object detection loses track due to insufficient features; at this point the video is slowed down to highlight the failure. The system with object detection continues, and by the end of the video it has successfully detected all five objects and accurately localized them in the world.