Hand Pose Estimation using Multi-viewpoint Cameras

The complexity of the human hand makes it complicated to use as an input device. The most common approach is to fit the hand with a data glove that encodes the finger joint angles, coupled with an electromagnetic tracker that encodes the 3D position of the wrist. This set-up gives (almost) complete information on the hand, but it is also cumbersome and expensive.

An alternative is to use cameras to obtain hand pose and motion information. Our research uses a vision-based, model-based approach to estimate the hand pose: a skeletal model of the hand with 31 degrees of freedom (DOF), covered with quadric surfaces that act as its skin.
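As a rough illustration of how such a skeletal model maps joint angles to geometry, the sketch below computes planar forward kinematics for a single three-joint finger chain. The bone lengths and the restriction to in-plane flexion are illustrative assumptions, not details of the actual 31-DOF model.

```python
import math

# Planar forward kinematics for one finger chain (three flexion
# joints, e.g. MCP/PIP/DIP). Bone lengths are illustrative only.
def fingertip(angles, lengths=(4.0, 2.5, 2.0)):
    """Accumulate joint rotations along the chain and return the
    fingertip (x, y) position in the palm frame."""
    x = y = theta = 0.0
    for a, l in zip(angles, lengths):
        theta += a                    # each joint rotates the rest of the chain
        x += l * math.cos(theta)
        y += l * math.sin(theta)
    return x, y

# Straight finger: tip lies at the summed bone lengths along x.
tip = fingertip((0.0, 0.0, 0.0))
```

The full hand model concatenates such chains for all five fingers with the global wrist position and palm orientation into one parameter vector.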

Silhouette images of the actual hand pose, captured from multiple viewpoint cameras, are converted to voxel data, which becomes the observation data of the pose estimation system. The original work by Ueda et al. uses force-based deformation to fit the skeletal model to the voxel data.
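The silhouette-to-voxel step is a form of shape-from-silhouette (visual hull) carving: a voxel survives only if it projects inside every camera's silhouette. A minimal sketch, assuming two toy orthographic views rather than the calibrated perspective cameras of the real system:

```python
# Shape-from-silhouette carving on a toy 8x8x8 grid using two
# orthographic views: front sees the x-y plane, side sees the z-y plane.
def carve(front, side, n=8):
    """Keep voxel (x, y, z) only if its projection lands inside
    every silhouette; the result approximates the visual hull."""
    voxels = set()
    for x in range(n):
        for y in range(n):
            for z in range(n):
                if front[y][x] and side[y][z]:
                    voxels.add((x, y, z))
    return voxels

# A 4x4 square silhouette in both views carves out a 4x4x4 cube.
front = [[1 if 2 <= x < 6 and 2 <= y < 6 else 0 for x in range(8)] for y in range(8)]
side  = [[1 if 2 <= z < 6 and 2 <= y < 6 else 0 for z in range(8)] for y in range(8)]
hull = carve(front, side)
```

With more viewpoints the intersection tightens, so the voxel set approximates the true hand shape more closely.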

Instead of model-fitting, I used a non-linear filter to determine the complete pose of the human hand. An Unscented Kalman Filter (UKF) simultaneously estimates the local (finger joint angles) and global (wrist position and palm orientation) parameters. Results on virtually generated hand motion showed the feasibility of recovering the complete hand posture using multiple cameras and voxel data.
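To illustrate the sigma-point predict/update cycle, here is a textbook additive-noise UKF run on a toy scalar state; the identity process and measurement models stand in for the actual 31-DOF hand state and voxel-based observation model, which are not reproduced here.

```python
import numpy as np

class UKF:
    """Textbook additive-noise Unscented Kalman Filter (toy sketch)."""

    def __init__(self, f, h, Q, R, x0, P0, alpha=1.0, beta=2.0, kappa=0.0):
        self.f, self.h = f, h                    # process / measurement models
        self.Q, self.R = Q, R                    # process / measurement noise
        self.x, self.P = x0.astype(float), P0.astype(float)
        n = len(x0)
        self.c = alpha**2 * (n + kappa)          # scaling term n + lambda
        lam = self.c - n
        self.Wm = np.full(2 * n + 1, 1.0 / (2 * self.c))  # mean weights
        self.Wc = self.Wm.copy()                           # covariance weights
        self.Wm[0] = lam / self.c
        self.Wc[0] = lam / self.c + (1.0 - alpha**2 + beta)

    def _sigma_points(self, x, P):
        """2n+1 sigma points: the mean plus/minus scaled Cholesky columns."""
        S = np.linalg.cholesky(self.c * P)
        return np.vstack([x, x + S.T, x - S.T])

    def step(self, z):
        # Predict: push sigma points through the process model.
        X = np.array([self.f(p) for p in self._sigma_points(self.x, self.P)])
        x_pred = self.Wm @ X
        dX = X - x_pred
        P_pred = dX.T @ (self.Wc[:, None] * dX) + self.Q
        # Update: push fresh sigma points through the measurement model.
        Xs = self._sigma_points(x_pred, P_pred)
        Z = np.array([self.h(p) for p in Xs])
        z_pred = self.Wm @ Z
        dZ = Z - z_pred
        S = dZ.T @ (self.Wc[:, None] * dZ) + self.R
        K = (Xs - x_pred).T @ (self.Wc[:, None] * dZ) @ np.linalg.inv(S)
        self.x = x_pred + K @ (z - z_pred)
        self.P = P_pred - K @ S @ K.T
        return self.x

# Toy run: identity models, repeated noisy-free observation of 5.0.
ukf = UKF(f=lambda x: x, h=lambda x: x,
          Q=np.array([[1e-4]]), R=np.array([[0.1]]),
          x0=np.array([0.0]), P0=np.array([[1.0]]))
for _ in range(50):
    est = ukf.step(np.array([5.0]))
```

In the actual system the measurement model is non-linear (state to voxel observation), which is exactly where the UKF's sigma-point propagation pays off over a linearized EKF.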

Related Publications:

A Causo, E Ueda, K Takemura, Y Matsumoto, J Takamatsu and T Ogasawara, “User-adaptable Hand Pose Estimation Technique for Human Robot Interaction”, Journal of Robotics and Mechatronics, 21(6), pp. 739-748, Dec 2009.

A Causo, E Ueda, K Takemura, Y Matsumoto, J Takamatsu and T Ogasawara, “Predictive Tracking in Vision-based Hand Pose Estimation Using Unscented Kalman Filter and Multi-viewpoint Cameras”, In: Human-Robot Interaction, Daisuke Chugo (Ed.), ISBN: 978-953-307-051-3, pp. 155-170, I-Tech, Feb 2010.

A Causo, M Matsuo, E Ueda, K Takemura, Y Matsumoto, J Takamatsu and T Ogasawara, “Hand Pose Estimation using Voxel-based Individualized Hand Model”, Proc of the 2009 IEEE/ASME Int Conf on Advanced Intelligent Mechatronics (AIM 2009), pp. 451-456, Suntec Convention and Exhibition Center, Singapore, 14-17 July 2009.

A Causo, M Matsuo, E Ueda, Y Matsumoto, and T Ogasawara, “Individualization of voxel-based hand model”, Proc of the 4th ACM/IEEE Int Conf on Human Robot Interaction (HRI 2009), pp. 219-220, La Jolla, California, USA, 2009.

A Causo, E Ueda, Y Kurita, Y Matsumoto, T Ogasawara, “Model-based Hand Pose Estimation Using Multiple Viewpoint Silhouette Images and Unscented Kalman Filter”, Proc of the 17th Int Symp on Robot and Human Interactive Communication (RO-MAN 2008), pp. 291-296, Munich, Germany, 1-3 Aug 2008.

A Causo, E Ueda, Y Matsumoto, and T Ogasawara, “Simultaneous Estimation of Finger and Hand Posture Using the Unscented Kalman Filter” (in Japanese), Proc of the 25th Annual Conference of the Robotics Society of Japan, 1H24, 13-15 Sep 2007.

AJ Causo, E Ueda, Y Matsumoto, and T Ogasawara, “Simultaneous Estimation of Hand-Pose Parameters Using Multiviewpoint Silhouette Images”, Proc of the 6th SICE System Integration Division Annual Conference (SI2005), Dec 2005.