Posted by: kurtsh | February 8, 2010

ARTICLE: “Binary Body Double: Microsoft Reveals the Science Behind Project Natal for Xbox 360”

This was unique enough to highlight: an overview of our motion-capture controller, “Project Natal” for Xbox 360.

There really hasn’t been much of an explanation of how Natal works until now.  This interview with Scientific American goes over the components of the Natal “camera”, along with the technology that lets the device interpret movement in real time across multiple body types and differentiate between individuals even when they overlap.

Instead of trying to preprogram actions, Microsoft decided to teach its gaming technology to recognize gestures in real time just like a human does: by extrapolating from experience. Jamie Shotton, a researcher at Microsoft Research Cambridge in England, devised a machine learning algorithm for that purpose. It also recognizes poses and renders them in the game space on-screen at 30 frames per second, a rate that conveys smooth movement. Essentially, Natal-enhanced Xboxes will do motion capture on the fly, without the need for the mirror-studded spandex suit of conventional motion-capture approaches.

Training Natal for this task required Microsoft to amass a large amount of biometric data. The firm sent observers to homes around the globe, where they videotaped basic motions such as turning a steering wheel or catching a ball, Kipman says. Microsoft researchers later laboriously selected key frames within this footage and marked each joint on each person’s body. Kipman and his team also went into a Hollywood motion-capture studio to gather data on more acrobatic movements.

"During training, we need to provide the algorithm with two things: realistic-looking images that are synthesized and, for each pixel, the corresponding part of the body," Shotton says. The algorithm processes the data and changes the values of different elements to achieve the best performance.

To keep the amount of data manageable, the team needed to figure out which elements were most relevant for training. For example, the system doesn’t need to recognize the entire body mass, but only the spacing of skeletal joints. After whittling down the data to the essential motions, the researchers mapped each unique pose to 12 models representing different ages, genders and body types.
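
To make the "per-pixel body part" and "skeletal joints" ideas a bit more concrete, here is a minimal sketch of how a per-pixel label map could be reduced to one point per body part. This is only an illustration, not the method described in the article: the function name estimate_joints, the toy label values, and the use of a simple centroid are all assumptions made for the example.

```python
from collections import defaultdict


def estimate_joints(label_map, background=0):
    """Reduce a per-pixel body-part label map to one (row, col) point per part.

    Hypothetical helper: it takes the centroid of all pixels sharing a label.
    The article does not say how Natal turns pixel labels into joints.
    """
    # For each label, accumulate [row sum, column sum, pixel count].
    sums = defaultdict(lambda: [0, 0, 0])
    for r, row in enumerate(label_map):
        for c, label in enumerate(row):
            if label == background:
                continue  # skip pixels that belong to no body part
            acc = sums[label]
            acc[0] += r
            acc[1] += c
            acc[2] += 1
    # Centroid = mean row and mean column of the pixels with that label.
    return {label: (rs / n, cs / n) for label, (rs, cs, n) in sums.items()}


# Toy 4x4 label map with made-up labels: 0 = background, 1 = head, 2 = torso.
toy_labels = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
]

joints = estimate_joints(toy_labels)
for part, (row, col) in sorted(joints.items()):
    print(part, row, col)
```

The point of the sketch is the data reduction: a full image of labeled pixels collapses to a handful of coordinates, which is far less data to track from frame to frame than the whole body outline.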
