the-midnight-paper

Robust Computer Vision-Based Detection of Pinching for One and Two-Handed Gesture Input

Authors: Andrew D Wilson

UIST 2006, Montreux, Switzerland

gesture, computer vision, bimanual interaction, hand tracking

Strength

Interesting use of connected components of the segmented background against the hand to detect a pinch gesture. By detecting the hole formed when the thumb touches the forefinger, the technique avoids complex hand shape analysis, sophisticated pattern recognition, and fingertip tracking.

Weakness

The technique assumes a camera mounted above a keyboard to detect gestures. Although this setup keeps the background static, it is uncommon and inconvenient for general use. The paper defines hole components as those background components of significant size that have no pixels on the border, but it does not explain whether this generalizes across different hand colors and sizes. It also provides no mechanism for object selection.

Critique

I would extend this pinch detection technique to non-static backgrounds with semi-controlled or uncontrolled viewing conditions. The technique should also be applied to create coherent interactions that stay consistent across user preferences (one- or two-handed input). For example, the single-handed and bimanual interaction techniques in the TAFFI prototype are not coherent with each other, but this can be fixed.


Usefulness of pinch gesture

In graph theory, a path exists between any two vertices of a “connected component”. In image processing, such components are connected groups of identically labeled pixels in a binary image; each component often corresponds to a distinct object that is subsequently analyzed.
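
To make the image-processing sense of a connected component concrete, here is a minimal sketch in plain Python. It labels groups of foreground pixels in a small binary grid using a breadth-first flood fill. The function name label_components, the use of 4-connectivity (up, down, left, right neighbours only), and the toy image are assumptions made for illustration; they are not taken from the paper.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected groups of truthy pixels.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for each component, numbered in scan order.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # background pixel, or already labeled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit the four direct neighbours (assumed 4-connectivity).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy binary image with two separate blobs of 1s.
image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_components(image)
```

Every pixel in a component can be reached from every other pixel in it by stepping between neighbours, which is the pixel-grid counterpart of the graph-theory definition.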

In computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as super-pixels). The goal of segmentation is to simplify or change the representation of an image into something more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images.
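
As a rough illustration of segmentation in its simplest form, the sketch below splits a grayscale image into foreground and background with a fixed intensity threshold. This is only an assumed stand-in: these notes do not record how the paper segments the hand, and the function name segment_by_threshold, the threshold value, and the assumption that the hand is brighter than the background are all hypothetical.

```python
def segment_by_threshold(gray, threshold):
    """Return a binary mask: 1 where a pixel is brighter than threshold, else 0."""
    return [[1 if px > threshold else 0 for px in row] for row in gray]


# Toy grayscale image (0-255): a bright region on a dark background.
gray = [
    [10, 12, 200, 210],
    [11, 190, 205, 14],
    [9, 13, 15, 12],
]
mask = segment_by_threshold(gray, threshold=128)
```

The resulting binary mask is the kind of input that connected component analysis then operates on.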

The pinch detection algorithm:

The largest background component usually corresponds to the background surrounding the hand, while any remaining background components of significant size usually correspond to holes formed by the hand. In practice, it is better to define hole components as those background components of significant size that have no pixels on the border, rather than relying on size alone.
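
The hole definition above can be sketched as follows: collect the connected components of the background pixels, then keep those that are large enough and have no pixel on the image border. This is a minimal sketch under stated assumptions, not the paper's implementation. The names background_components and find_holes, the min_size default, the choice of 4-connectivity, and the toy masks are all hypothetical.

```python
from collections import deque


def background_components(hand_mask):
    """Collect 4-connected components of background (0) pixels as (row, col) lists."""
    rows, cols = len(hand_mask), len(hand_mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if hand_mask[r][c] or seen[r][c]:
                continue  # hand pixel, or background already visited
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not hand_mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def find_holes(hand_mask, min_size=2):
    """Return background components of at least min_size pixels that never touch the border."""
    rows, cols = len(hand_mask), len(hand_mask[0])

    def touches_border(pixels):
        return any(y in (0, rows - 1) or x in (0, cols - 1) for y, x in pixels)

    return [p for p in background_components(hand_mask)
            if len(p) >= min_size and not touches_border(p)]


# 1 = hand pixel, 0 = background. Thumb and forefinger form a closed ring.
pinching = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
# Same shape with the ring opened: the inner region now leaks out to the border.
open_hand = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
is_pinching = len(find_holes(pinching)) > 0
is_open_pinching = len(find_holes(open_hand)) > 0
```

The border test is what separates a true hole from a large pocket of background that merely sits between the fingers: in the open-hand mask the inner region is just as large, but it connects to the surrounding background and so is not reported.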