Research at CVAP

The Computer Vision and Active Perception Laboratory performs research in computer vision, robotics and machine learning.

Robotic systems that provide advanced services in industry, in search and rescue operations, in medical applications, or as assistants to the elderly will become an integral part of future society. Initial systems have been deployed in the industrial and service sectors over the past couple of decades, but there are still no systems ready for the consumer market. Robots are expected to act in environments built for humans, hence they should possess some human capabilities in terms of locomotion (e.g. to handle stairs), dexterity (e.g. to manipulate objects and tools made by and for humans) and reasoning (e.g. to exchange information with humans in a natural manner).

In the context of mobile robotics we work along two main directions. First, we study methods for adaptation to, and exploitation of, long-term experience. The aim is to have the robot run for a long time (months) and show that tasks can be executed better by making use of the gathered knowledge. We study how 3D space changes over time and extract quantitative and qualitative spatio-temporal models supporting navigation, task planning and reasoning. In a second thread of work we look at how we can model and leverage the structure that is inherent in indoor environments. Most research to date has actively avoided making assumptions about this structure, since the focus has been on the underlying algorithms and theory and their ability to deal with a completely unknown setting. We want to learn statistical models from, for example, floor-plan datasets, to form strong priors.
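As an illustration of the spatio-temporal modeling idea (a minimal sketch, not the actual system; the grid discretization, time-of-day bucketing, and update rule are all assumptions for the example), one can maintain per-cell occupancy statistics conditioned on time, yielding a simple prior on how space is used over the day:

```python
from collections import defaultdict

class SpatioTemporalOccupancy:
    """Toy spatio-temporal map: occupancy frequency per cell, bucketed by hour of day."""

    def __init__(self, n_buckets=24):
        self.n_buckets = n_buckets
        # (cell, hour bucket) -> [occupied observations, total observations]
        self.counts = defaultdict(lambda: [0, 0])

    def update(self, cell, hour, occupied):
        key = (cell, hour % self.n_buckets)
        occ, total = self.counts[key]
        self.counts[key] = [occ + int(occupied), total + 1]

    def prior(self, cell, hour):
        occ, total = self.counts[(cell, hour % self.n_buckets)]
        return occ / total if total else 0.5  # uninformed prior for unseen cells

m = SpatioTemporalOccupancy()
for day in range(30):
    m.update((3, 4), 9, occupied=True)    # corridor cell busy every day at 09:00
    m.update((3, 4), 2, occupied=False)   # and empty at 02:00
print(m.prior((3, 4), 9))   # -> 1.0
print(m.prior((3, 4), 2))   # -> 0.0
```

A robot accumulating such statistics over months could, for instance, prefer routes through cells that its experience says are free at the current time of day.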

We also work on robot interaction with, and manipulation of, objects. Compared to humans or primates, the sensing and dexterity of today's robotic hands and medical prostheses are extremely limited. The latter commonly have only a single degree of freedom, allowing them to grasp and manipulate only a limited set of objects. Replicating the effectiveness and flexibility of human hands requires a fundamental rethinking of how to exploit the available mechanical dexterity. To achieve this, our approach is to develop mathematical models and techniques for motion representation suitable for i) recognition and understanding of dexterous motion of human hands, ii) generation of hand control strategies in robots, and iii) evaluation of the mechanical dexterity of complex structures. The general idea is to marry topology-based representations with statistical graphical models supporting mappings between the symbolic level and low-level sensory control spaces.
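To make the symbolic-to-sensory mapping concrete, here is a minimal sketch (purely illustrative, not the lab's method) using a hidden Markov model: the hidden states are hypothetical symbolic grasp phases, the observations are discretized low-level sensor readings, and Viterbi decoding recovers the most likely symbolic sequence. All probabilities below are hand-picked for the example, not learned values:

```python
import math

# Hypothetical symbolic grasp phases and a discretized sensor alphabet.
states = ["approach", "close", "hold"]

# Illustrative, hand-picked model parameters (assumptions, not learned).
start = {"approach": 0.8, "close": 0.15, "hold": 0.05}
trans = {
    "approach": {"approach": 0.6, "close": 0.4, "hold": 0.0},
    "close":    {"approach": 0.0, "close": 0.5, "hold": 0.5},
    "hold":     {"approach": 0.1, "close": 0.0, "hold": 0.9},
}
emit = {
    "approach": {"far": 0.8, "contact": 0.2, "force": 0.0},
    "close":    {"far": 0.1, "contact": 0.7, "force": 0.2},
    "hold":     {"far": 0.0, "contact": 0.3, "force": 0.7},
}

def viterbi(observations):
    """Most likely symbolic phase sequence given a low-level observation stream."""
    # Log-probabilities; zero probabilities are floored to avoid log(0).
    V = [{s: math.log(start[s] or 1e-12) + math.log(emit[s][observations[0]] or 1e-12)
          for s in states}]
    path = {s: [s] for s in states}
    for o in observations[1:]:
        V.append({})
        new_path = {}
        for s in states:
            best_prev = max(states, key=lambda p: V[-2][p] + math.log(trans[p][s] or 1e-12))
            V[-1][s] = (V[-2][best_prev] + math.log(trans[best_prev][s] or 1e-12)
                        + math.log(emit[s][o] or 1e-12))
            new_path[s] = path[best_prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

print(viterbi(["far", "far", "contact", "force", "force"]))
# -> ['approach', 'approach', 'close', 'hold', 'hold']
```

The same machinery runs in both directions: decoding recognizes symbolic phases from sensing, while sampling from the model can drive phase-appropriate low-level control.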

Research in Computer Vision is directed towards technologies supporting human visual perception and memory. Visual perception is central to our ability to interact with the surrounding world, and vision is also central in forming memories of our daily life and of our interactions with other people. The degradation of our ability to perceive and memorize visual stimuli is therefore a general handicap, and the possibilities for compensating for it have historically been very limited. Developments in camera design, computing, and the understanding of how to build artificial visual systems that interpret and organize visual information are, however, changing this situation for the better. In the future, artificial systems may well capture and process visual information with the same performance as the human visual system, and organize the visual input into memories that can support a degrading human perception and memory. When fully developed, such systems will serve as "cognitive prostheses", in analogy to the way physical prostheses replace human body parts. Even today, systems can be built that aid in, e.g., reading and in interpreting essential visual information in the environment.

CVAP advances these systems towards fully autonomous visual information aids by:
1) developing the state of the art of automatic artificial visual processing with special emphasis on visual data from systems that can be worn in an unobtrusive manner, and
2) investigating methods of automatic visual memory selection from these systems in order to enhance failing human visual memory.

Our work on the analysis of human activity in video follows three directions. One is modeling human activity and human-object interactions for the purpose of robot learning from human demonstration. Another is modeling and recognizing chains of human activities, for mining large amounts of video during forensic investigations, or for recognizing activity and making decisions in real-time surveillance. Third, we are beginning to address the very challenging task of modeling subtle non-verbal cues, such as facial tensions or small shoulder movements. Working methods here would, for instance, provide powerful new tools for cognitive psychology studies, and enable more accurate synthesis of human motion and appearance in games and movies.