Financial support

Touchless interface by realtime depth video analysis

Members

Elias Ximenes, Odemir M. Bruno

Introduction

Touchless interfaces are being widely studied and developed for human-computer interaction. In recent years, video game interfaces based on human motion, captured by cameras of various kinds, have been released for several consoles and computing platforms.

Beyond video games, other applications stand to benefit from this technology, which can make electronic devices safer, simpler, and more immersive to use. Examples include data entry on mobile devices, navigation in virtual environments, modeling of three-dimensional objects, control of robots and avatars, and simulation of real-world tools. It also suits situations where the user should not touch the equipment, either to prevent the transmission of pathogens or to minimize possible physical damage to the equipment.

(a) Processes of a computer vision-based interface.

Approaches:

Depth sensors

We decided to use depth cameras based on infrared structured light (such as the Microsoft Kinect, Prime Sensor, and Asus XTion) as the input device for our experiment.

The main reasons were their ability to record depth matrices of a scene and the range of useful information these devices provide for a variety of tasks.


Contour Extraction with Smart Snakes

Given the three-dimensional coordinates of the hands, depth segmentation can be performed by extracting the pixels that lie within a given distance range around the detected position. In this way, we can segment the user's hands and any small objects they are carrying or handling. To mitigate the noise characteristic of this technology, we apply active contour algorithms to the corresponding color images; these deform a contour to match features of interest in an image, such as edges or boundaries. The name Smart Snakes comes from the fact that the contours deform during the iterative process, like snakes in motion.
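The depth-range step described above can be sketched as follows. This is only an illustration, not the project's implementation: the function name, the millimeter units, the circular image window, the tolerance values, and the convention that a depth of 0 means "no measurement" are all assumptions. The active contour refinement on the color image is not shown.

```python
import numpy as np


def segment_by_depth(depth_mm, hand_xy, hand_depth_mm,
                     depth_tol_mm=100, radius_px=60):
    """Boolean mask of pixels close to the hand in both image space and depth."""
    height, width = depth_mm.shape
    rows, cols = np.ogrid[:height, :width]
    hand_col, hand_row = hand_xy

    # Pixels inside a circular window around the detected hand position.
    in_window = (cols - hand_col) ** 2 + (rows - hand_row) ** 2 <= radius_px ** 2
    # Pixels whose depth is within the tolerance of the hand's depth.
    # Cast to a signed type so the subtraction cannot wrap around on uint16.
    in_range = np.abs(depth_mm.astype(np.int64) - hand_depth_mm) <= depth_tol_mm
    # Assumption: a depth of 0 marks pixels with no valid measurement.
    valid = depth_mm > 0
    return in_window & in_range & valid


# Synthetic frame: background at 2000 mm, a 20x20 "hand" patch at 800 mm.
depth = np.full((120, 160), 2000, dtype=np.uint16)
depth[40:60, 70:90] = 800
mask = segment_by_depth(depth, hand_xy=(80, 50), hand_depth_mm=800)
print(mask.sum())  # number of pixels kept for the hand
```

The resulting mask could then serve as the starting region for a contour-fitting step on the color image.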

 

Mahalanobis distance of Fourier signature

The frequencies of the contour point coordinates for different hand poses, and for other objects preselected by the user, are extracted using an FFT (Fast Fourier Transform). We use this information as a contour signature. Other possible approaches would be to compute the contour's fractal dimension, a distance-versus-angle graph, or projections of the contour.
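One common way to turn contour coordinates into a frequency-based signature is sketched below. The text does not say how the signature is normalized, so the choices here are assumptions made for illustration: each contour point is treated as a complex number, the zero-frequency term is dropped (removing the effect of translation), only magnitudes are kept (removing the effect of rotation and starting point), and the result is divided by the strongest first harmonic (removing the effect of scale). The name `fourier_signature` and the parameter `n_coeffs` are hypothetical.

```python
import numpy as np


def fourier_signature(contour_xy, n_coeffs=8):
    """Magnitudes of the lowest positive and negative contour frequencies."""
    points = np.asarray(contour_xy, dtype=float)
    if len(points) <= 2 * n_coeffs:
        raise ValueError("contour needs more than 2 * n_coeffs points")

    # Treat each (x, y) point as the complex number x + iy and transform.
    z = points[:, 0] + 1j * points[:, 1]
    magnitudes = np.abs(np.fft.fft(z))

    # Index 0 (the contour centroid) is skipped; indices 1..n hold the
    # positive frequencies and the last n entries the negative ones.
    low = np.concatenate([magnitudes[1:n_coeffs + 1], magnitudes[-n_coeffs:]])
    scale = max(magnitudes[1], magnitudes[-1])
    if scale == 0:
        raise ValueError("degenerate contour")
    return low / scale


# An ellipse, and the same ellipse rotated, scaled, and shifted.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
ellipse = np.column_stack([4.0 * np.cos(t), 2.0 * np.sin(t)])
c, s = np.cos(0.7), np.sin(0.7)
rotation = np.array([[c, -s], [s, c]])
moved = 2.5 * ellipse @ rotation.T + np.array([10.0, -4.0])

sig = fourier_signature(ellipse, n_coeffs=4)
sig_moved = fourier_signature(moved, n_coeffs=4)
print(np.allclose(sig, sig_moved))
```

Under these assumptions the two signatures agree up to floating-point error, which is the property that makes such a signature usable for comparing poses seen at different positions and distances.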

For each class, the average of each signature dimension is computed over all members of that class. The contour signature obtained in real time while the user operates the interface is then compared against these class averages using the Mahalanobis distance.
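The comparison step can be illustrated with the minimal sketch below. It assumes each class is summarized by the mean and covariance of its training signatures, and that a new signature is assigned to the class with the smallest Mahalanobis distance. The small ridge term added to the covariance (to keep it invertible), the function names, the class labels `open_hand` and `fist`, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_class(signatures, ridge=1e-6):
    """Mean and inverse covariance of one class's training signatures."""
    x = np.asarray(signatures, dtype=float)
    mean = x.mean(axis=0)
    # Assumption: a small ridge keeps the covariance matrix invertible.
    cov = np.atleast_2d(np.cov(x, rowvar=False)) + ridge * np.eye(x.shape[1])
    return mean, np.linalg.inv(cov)


def mahalanobis(signature, mean, inv_cov):
    """Mahalanobis distance from a signature to a class mean."""
    diff = np.asarray(signature, dtype=float) - mean
    return float(np.sqrt(max(diff @ inv_cov @ diff, 0.0)))


def classify(signature, models):
    """Label of the class whose mean is nearest in Mahalanobis distance."""
    return min(models, key=lambda name: mahalanobis(signature, *models[name]))


# Synthetic three-dimensional signatures for two hypothetical classes.
rng = np.random.default_rng(0)
models = {
    "open_hand": fit_class(rng.normal([1.0, 0.30, 0.10], 0.02, size=(30, 3))),
    "fist": fit_class(rng.normal([1.0, 0.05, 0.02], 0.02, size=(30, 3))),
}
label = classify([1.0, 0.29, 0.11], models)
print(label)
```

Unlike a plain Euclidean distance, this measure weights each signature dimension by how much it varies within the class, so dimensions that are naturally noisy count for less.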