A Look At Challenges In Creating The Next Generation Kinect For The Xbox One

I think Playstation is going to seriously regret not bundling their version of Kinect with the PS4.  Maybe they knew it would not be as good as Kinect 2.0 anyways.

Cyrus Bamji, Microsoft partner hardware architect for Microsoft’s Silicon Valley-based Architecture and Silicon Management group, and members of his team were trying to incorporate a time-of-flight camera into Xbox One. 

A time-of-flight camera emits light signals and then measures how long it takes them to return. Because those signals travel at the speed of light, the measurement needs to be accurate to 1/10,000,000,000 of a second. With such measurements, the camera is able to differentiate light reflecting from objects in a room and the surrounding environment. That provides an accurate depth estimation that enables the shape of those objects to be computed.
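To put that timing figure in context, here is a minimal sketch of the basic time-of-flight relation: light covers the distance to the object twice, so depth is half the round-trip time multiplied by the speed of light. The function names are illustrative only, and this is the textbook relation, not a description of how the Kinect sensor actually measures or processes its signal.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def depth_from_round_trip(seconds):
    """Distance (metres) to a surface, given the round-trip time of a light signal."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2


def round_trip_time(depth_m):
    """Round-trip time (seconds) for light to reach a surface and come back."""
    return 2 * depth_m / SPEED_OF_LIGHT_M_PER_S


# Timing accuracy quoted in the article: 1/10,000,000,000 of a second.
timing_accuracy_s = 1 / 10_000_000_000

# Depth uncertainty that corresponds to that timing accuracy.
print(f"{depth_from_round_trip(timing_accuracy_s) * 100:.1f} cm")  # prints "1.5 cm"

# Round-trip time for an object four metres from the sensor, in nanoseconds.
print(f"{round_trip_time(4.0) * 1e9:.1f} ns")
```

In other words, a timing error of one ten-billionth of a second corresponds to roughly a centimeter and a half of depth, which is about the scale of a finger.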

That speed-of-light capability would be a major advancement for the Kinect sensor portion of Xbox One, being released to 13 launch markets next month. The new Kinect, a key differentiator for Xbox One against its competition, needed to capture a larger field of view with greater accuracy and higher resolution. An infrared sensor will enable object identification requiring little to no light, and improved hand-pose recognition, giving gamers and more casual users the ability to control the console with their hands.

“When we take a relatively new technology, such as time-of-flight, and put it into a commercial product, there are a whole bunch of things that happen,” he says. “There are things that we didn’t know how important they were until the product was made. For example, we know theoretically that motion blur in time of flight is a big problem, but just how important is only discoverable when you’re building a product with it and that product needs to deliver an excellent experience.”

Accurate depth measurement in diverse scenes also poses user-experience issues with the new camera’s high resolution and wider field of view: it is difficult, for instance, to keep small objects, such as a finger, from fading into the background. While those features delivered more versatile device performance, they also created issues of their own in real-life scenarios. Addressing them, along with the motion blur, required clean data, and quickly. Xbox One had to be ready for the 2013 holiday season.

The analog nature of the time-of-flight data posed challenges to delivering such a solution.

“The time-of-flight data coming out of our sensor is per pixel, per frame, and there is a lot more analog information,” Acharya says. “Another issue was that the foreground objects close to the background objects would melt into the background—again, due to the analog nature of how our sensor provides the depth data for pixels that land on edges.”

“This resulted in a lot of information, and to make it easier for foreground/background extraction and scene segmentation, use by software and game developers, the requirement was to clean up this data simultaneously by adding software algorithms in the pipe, yet without incurring a performance hit. This was crucial. We started with various work streams and, in the end, settled on making optimization to the parameters in the system to overcome the issue.”

The collaborators wanted to deliver a clear separation of foreground and background even if the objects are close to each other. That, too, proved difficult. And then there was motion blur.

“Motion blur,” Acharya explains, “is a parameter that needs to be minimized and is not technology-specific. The time-of-flight camera uses global shutter, which has helped reduce motion blur significantly—from 65 milliseconds in the original Kinect to fewer than 14 milliseconds now.”

Other challenges presented themselves. Processing time, for one, became an issue. It had not been one in the academic literature about time-of-flight systems; in the laboratory environment, the technology worked fine. But Xbox One needs to process a whopping 6.5 million pixels per second, and only a small part of its computing power could be harnessed for this task. The lion’s share is reserved, understandably, for essentials such as gaming, skeleton tracking, face recognition, and audio.
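The 6.5 million figure is consistent with a 512 × 424 depth image captured 30 times per second. Those numbers are the commonly cited specification for this sensor, but the article does not state them, so treat them as assumptions. The sketch below reproduces the arithmetic and estimates how much time is available per pixel; the 10 percent share of one processor core is a purely hypothetical value chosen for illustration.

```python
# Assumed depth-image size and frame rate (not stated in the article).
DEPTH_WIDTH = 512
DEPTH_HEIGHT = 424
FRAMES_PER_SECOND = 30


def pixels_per_second(width, height, fps):
    """Number of depth pixels the pipeline must handle each second."""
    return width * height * fps


def per_pixel_budget_ns(pixel_rate, cpu_fraction):
    """Nanoseconds available per pixel if only cpu_fraction of one core's time is used."""
    return cpu_fraction * 1e9 / pixel_rate


rate = pixels_per_second(DEPTH_WIDTH, DEPTH_HEIGHT, FRAMES_PER_SECOND)
print(rate)  # prints 6512640, about 6.5 million

# Hypothetical: 10% of one core reserved for depth clean-up.
print(f"{per_pixel_budget_ns(rate, 0.10):.1f} ns per pixel")
```

Under that hypothetical budget, each pixel gets roughly 15 nanoseconds, which leaves room for only a handful of simple operations per pixel.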

“You need to do very, very light computation for each pixel,” Krupka says, “and this is one of the things that made the problem challenging and different from the typical approach in the academic literature in this field.”

Remarkably, it all came together, and that means that while entertainment lovers worldwide will soon find themselves delighted by the Xbox One experience, so, too, will those eager to develop for the platform. Reducing that edge-data noise makes the data developer-ready, and being able to segment clearly between the foreground and the background solves a complex computational problem. The data is clean, and it can be absorbed more easily by game developers.

Another fascinating feature of the Kinect sensing device in Xbox One stems from its infrared sensor, which can identify objects in a completely darkened room. It can recognize people and track bodies even without any light visible to the naked eye. It can identify a hand pose from four meters away, see the fingers of a child, and remember your identity even without room illumination.

The wider field of view makes it possible for more players to play an Xbox One game at the same time. With the new console, as many as six players can crowd into one scene. A tall adult can play with a small child without either being squeezed out of the picture. Users get a better experience whether they’re standing close by, farther away, or on the periphery of the room.

And the improved hand-pose recognition enables users to interact with the Xbox One just by using their hands—no controller necessary. Thanks to the infrared camera, hand activities can be identified at any illumination or with none at all. Prior hand-pose solutions were able to deliver speed or accuracy, but not both. The hand-pose solution jointly devised by the Xbox team and Microsoft Research can do both.

Source: The Official Microsoft Blog

About Author

Suril is a scientist, journalist and obsessive Microsoft observer. He holds an advanced degree in Biotechnology with minors in Biochemistry, Microbiology, and Molecular Biology. Send him tips on Twitter: http://www.twitter.com/surilamin

  • koenshaku

    Co-Developed with the NSA…

    • cs

      That joke is getting old…

    • rjmlive

      Stop

    • Yuan Taizong

      I almost voted this up thinking he was talking about the N.A.S.A. but that’s a completely different branch of government.

    • nohone

      You are using the internet to post your comment, an internet where the NSA has the ability of tracking everything you do. You use a phone, which has a direct line into the NSA in Fort Meade. They are searching your garbage daily so they know what foods you like to eat, and that way they can slip the nano bots into your food so they can control you. There are currently black helicopters flying above you with infrared so they can track your every move, they also have Tomahawk missiles prepared in case you do something they do not like. There are police across the streets holding brain wave harvesting devices so they can not only read, but also control every action you take.
      If I were you, I would worry less about Kinect and worry more about what is happening today. I suggest you make yourself a Faraday cage (the tin foil hat is not good enough, you need one that can block higher energy devices) and start digging an underground shelter. Start today, there is no time left. We will miss you, but you need to do it now. We will let you know when it is safe to emerge, but don’t tell us where you build it in case “they” get to us, you are too important and we don’t want to be the one to give away your location.

      • auziez

        lol. this made me giggle.

      • donzebe

        Lol

      • Guest

        Achievement unlocked! Nohone wins the interwebs.

    • Termin8ed

      Don’t forget the SPF 30+ for your tinfoil hat

  • Yuan Taizong

    I can’t wait to see what video-game developers will do with this :-D
    This technology is really impressive.

  • http://www.lejournaldunumerique.com/ Beugré Jean-Augustin

    Actually Sony do have the PlayStation Camera but it’s a joke compared to Kinect 2.0 and it’s not included in the bundle.

    • techieg

      Yes, this is what I’m saying with the pricing game Sony is playing because when you add the lame camera to the PlayStation the bundle is basically the same price as the Xbox One. I hope the public figures out this sales gimmick for something of much less value.