Several ideas for 3D hand position tracking - please comment and answer questions

Hello,

I plan to build a sensing interface to track the position of one hand inside a virtual cube with 30 cm edges. The purpose is to control a musical setup, so precision and responsiveness are what I am after.

I have thought about four setups described below but I don't know which one to choose. Comments and also answers to the questions would be highly appreciated.

Setup #1

Tracking technique: two IR emitters on the palm of the hand and an array of IR receivers on the table. The vertical position would be found with the global strength of the received signal and the XY position would be found according to the receivers having the strongest signals.
Advantages:

  • should react very quickly
  • the presence of two emitters adds the possibility to track the orientation too
  • mounting the emitters on the hand removes the risk of the wrist interfering with the detection

Drawbacks: must wear a glove connected to the board
Questions: is there a risk of interference from ambient light? If so, would replacing IR with a specific visible color (blue, green) help?
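
To make the idea concrete, here is a rough Python sketch of the estimate described for this setup: XY as the intensity-weighted centroid of the receiver grid, and the summed intensity as a raw indicator of height. The function name `estimate_position`, the 5 cm receiver pitch and the grid layout are assumptions made for illustration only; the mapping from summed intensity to an actual height would have to be calibrated by measurement.

```python
def estimate_position(readings, pitch_cm=5.0):
    """Estimate hand position from a grid of IR receiver intensities.

    readings: list of rows (Y direction), each a list of intensities (X direction).
    pitch_cm: assumed spacing between neighbouring receivers.
    Returns (x_cm, y_cm, total_intensity), or None when nothing is detected.
    """
    total = sum(sum(row) for row in readings)
    if total <= 0:
        return None
    # Weighted sums of column and row indices give the centroid in grid units.
    x = sum(v * col for row in readings for col, v in enumerate(row))
    y = sum(v * r for r, row in enumerate(readings) for v in row)
    return x * pitch_cm / total, y * pitch_cm / total, total


# Hypothetical 3 x 3 reading with the hand above the middle receiver.
grid = [
    [0, 1, 0],
    [1, 4, 1],
    [0, 1, 0],
]
print(estimate_position(grid))
```

Using a centroid rather than only the single strongest receiver gives a resolution finer than the receiver spacing, at the cost of being pulled around by any stray light that reaches the other receivers.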

Setup #2

Tracking technique: array of IR Sharp optical triangulation sensors on the table. The sensors report the vertical position directly and the XY position can be calculated with the peaks in the sensed data.
Advantages:

  • should react quickly (at least with some models of Sharp sensors) but not as fast as regular IR sensors
  • nothing has to be attached to the hand
  • the Sharp sensors are known to be immune to interference from ambient light

Drawbacks: the wrist will be sensed too, so the reported Y position might be wrong.
Questions: can the wrist be made invisible by wearing black clothes?

Setup #3

Tracking technique: four ultrasonic distance sensors are placed in the four corners of a 30 × 30 cm square on the table, pointing upwards and towards the center of the virtual square.
Advantages:

  • only four sensors needed
  • nothing has to be attached to the hand

Drawbacks:

  • the wrist will certainly be sensed too, so the reported Y position might be wrong
  • the necessary synchronization of the four ultrasonic sensors makes the sensing period very long (around 40 ms), so this setup won't be as responsive as #1, #2 or even #4

Questions:

  • isn't there a risk of serious non-linearities at the sides and close to the sensors, even if they are oriented diagonally?
  • is it dangerous for the ears to stay close (circa 50 cm) to the ultrasonic transceivers for long periods?
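
As a rough check of the 40 ms figure, the arithmetic can be sketched in Python. The speed of sound (343 m/s in air at about 20 °C) is a physical constant; the 10 ms time slot per sensor is only an assumption about how long one would wait for echoes to die out before firing the next sensor, and the function names are made up for this example.

```python
SPEED_OF_SOUND_M_S = 343.0  # in air at roughly 20 degrees C


def round_trip_ms(distance_m):
    """Time for a pulse to reach a target and come back, in milliseconds."""
    return 2.0 * distance_m / SPEED_OF_SOUND_M_S * 1000.0


def sequential_cycle_ms(n_sensors, slot_ms):
    """Full update period when the sensors are fired one at a time."""
    return n_sensors * slot_ms


print(round(round_trip_ms(0.5), 2))   # echo time for a hand 50 cm away
print(sequential_cycle_ms(4, 10.0))   # four sensors, assumed 10 ms slot each
```

The echo itself only takes about 3 ms at these distances, so most of the period comes from the waiting time between sensors, not from the travel time of the sound.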

Setup #4

Tracking technique: one ultrasonic emitter is placed on the palm of the hand and four ultrasonic receivers are placed in the four corners of a 30 × 30 cm square on the table, pointing vertically. The travel time between the emitter and each receiver is used to calculate the XYZ position.
Advantages:

  • only four sensors needed
  • mounting the emitter on the hand removes the risk of the wrist interfering with the detection
  • the four ultrasonic receivers can receive simultaneously therefore the sensing period is much shorter than in setup #3

Drawbacks: must wear the emitter on the palm, which might be a problem if the hand has to operate other controllers during the concert.
Questions: is it dangerous for the ears to be about 50 cm from the ultrasonic emitter for long periods?
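
The position calculation for this setup could look roughly like the Python sketch below. It assumes the board triggers the emitter itself, so the emission time is known and each travel time converts directly to a distance, and it assumes the receivers sit exactly at the corners (0, 0), (0.3, 0), (0, 0.3) and (0.3, 0.3) of the square, in meters, at table height. The names `locate` and `simulate_times` are made up for this example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # in air at roughly 20 degrees C
SIDE_M = 0.30               # side of the receiver square


def locate(t0, t1, t2, t3, side=SIDE_M, c=SPEED_OF_SOUND_M_S):
    """Return (x, y, z) in meters from four travel times in seconds.

    Receivers are assumed at (0,0,0), (side,0,0), (0,side,0), (side,side,0).
    """
    d0, d1, d2, d3 = (c * t for t in (t0, t1, t2, t3))
    # Differences of squared distances are linear in x and y. The fourth
    # receiver gives a second estimate of each, so the two are averaged.
    x = (d0**2 - d1**2 + d2**2 - d3**2 + 2 * side**2) / (4 * side)
    y = (d0**2 - d2**2 + d1**2 - d3**2 + 2 * side**2) / (4 * side)
    # The hand is above the table, so the positive root is taken.
    z = math.sqrt(max(d0**2 - x**2 - y**2, 0.0))
    return x, y, z


def simulate_times(x, y, z, side=SIDE_M, c=SPEED_OF_SOUND_M_S):
    """Ideal, noise-free travel times for an emitter at (x, y, z)."""
    corners = [(0.0, 0.0), (side, 0.0), (0.0, side), (side, side)]
    return [math.sqrt((x - cx)**2 + (y - cy)**2 + z**2) / c for cx, cy in corners]


print(locate(*simulate_times(0.10, 0.20, 0.15)))
```

Three receivers would be enough in principle; the fourth is used here only to average out some of the measurement noise. Real measurements would also need temperature compensation for the speed of sound and some filtering.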

At first sight, setups #1 and #4 look like the best solutions to me. I know setup #2 has been used in an existing device, so it's feasible.

Thank you for helping.

What about Kinect?

Laser finger tracking. A pair of galvanometers scan the laser in a small circle. Light sensors detect the light bouncing off the fingertip. The galvanometers are adjusted if the little circle takes it off the fingertip.

http://www.k2.t.u-tokyo.ac.jp/members/alvaro/Publications/LaserTracking.pdf

See also: Finger tracking - Wikipedia

Isn't Kinect made for bigger distances (at least one meter)? Also, I can't find how often the data are updated. Do you have any idea?

OK, I now have the information about Kinect: 30 fps frame rate and 1.2 m minimum distance. So it is not appropriate for a tabletop sensing device.

1.2 m minimum distance with unmodified optics. What prevents you from modifying the lens?

Err... nice idea for sure, but how do you do it? I'm curious to know, even if it won't change the sampling rate anyway.

Are you trying to track the hand, or the fingers? People working on Augmented Reality (AR) have made big strides in markerless tracking recently. It's something that calls for a decent spec PC though, not an Arduino.

Looking for low-tech brute force solutions, depending what environment you plan to work in, would any sort of ultrasonic reflection or light beam breaking be feasible?

Of all your ideas, setup #4 is likely to be the most practical - if you structure it somewhat akin to how the Mattel PowerGlove (Power Glove - Wikipedia) was built (three ultrasonic receiver elements in a vertical plane, and a single ultrasonic transmitter element on the hand). Note that on the PowerGlove, there were two transmitters on the hand (to detect yaw/pitch/roll of the hand in a limited amount); since you don't indicate the need for yaw/pitch/roll detection, a single element should suffice.

There should be no danger from exposure to the ultrasound, even at the closer range. The greater issue will be accuracy, mainly because of reflections from hard surfaces (which is why, when people were experimenting with the PowerGlove for homebrew PC VR tracking back in the 1990s, it was suggested to mount it on a soft, cushion-covered surface instead of on a hard wall or similar). You'll have accuracy issues with all of your suggestions, though; note that for your 3rd suggestion, the hand is fairly soft and will -not- reflect ultrasound well, if at all.

You might find it easiest to simply find (try Ebay) a complete PowerGlove and hack it; there is still plenty of information out there on how to do it (if you need any, I have collected just about everything on the internet about it). At one time, there was a device you could plug it into called the "Minelli Box", which was a serial interface box that (IIRC) used an 8051 microcontroller with custom code on it to translate the serial stream from the PowerGlove into a more intelligible stream of serial data for interfacing with a PC. If you look hard enough, you can find the schematic of the box, plus the code (but given the amount of time that has passed, you will have to dig a lot - I have a copy, though).

The main information about hacking the PowerGlove, though, comes from an article that appeared in Byte magazine around the time (early 1990s - maybe 1991, but my brain is fuzzy on these details). All in all, hacking a PowerGlove to work with the Arduino would likely be a much simpler route than re-inventing the wheel (then again, maybe this particular wheel needs reinventing).

Note that regarding your other suggestions - there was another Nintendo hand tracking interface called the U-Force (U-Force - Wikipedia), which used infrared to track the hand in a limited area; it was deemed pretty terrible to use, though - but it was another device that was briefly hacked for homebrew VR use (at least, I believe it was - I think I have some information somewhere on it).

You may also be interested in this paper, which I found during some googling (it discusses using the wiimote for hand-tracking):

http://rrsg.uct.ac.za/theses/ug_projects/wronski_ugthesis.pdf

Another possibility would be to use a couple of web cameras in a frame-locked fashion (finding cameras that work this way isn't easy, though - quality cameras with a high enough frame rate that can do this tend to be expensive), plus some machine vision software (e.g., OpenCV or similar) to perform stereo image analysis and return the position of the hand and fingers (this is going to take a lot more effort on the software side than on the hardware side, of course).

Finally - if you want high-accuracy in small areas, your best bet will likely be using magnetic tracking; the leaders in this regard have long been the companies Polhemus (http://polhemus.com/) and Ascension Technology (http://www.ascension-tech.com/). Note, though, that their devices are anything but inexpensive (I own an older Ascension Flock of Birds that I managed to pick up cheap off of Ebay - to purchase it new today would run well over $6000.00 USD!).

Recently, there have been a couple of "new arrivals" on the scene of 3D tracking as well - XSens (http://www.xsens.com/) and PNI Sensor (http://www.pnicorp.com/products/spacepoint-gaming).

There are also a ton of papers and other information out there on doing 3D tracking, using optics, infrared, ultrasound, capacitive sensing, magnetically, etc - you'll literally find gigabytes of information out there on it. None of it is easy (and the so-called "easy" methods tend to be either noisy or inaccurate - or both), at least not in the way you are imagining. Do some research, and gather more information, before you make a decision on what to try.

PeterH:
Are you trying to track the hand, or the fingers? People working on Augmented Reality (AR) have made big strides in markerless tracking recently. It's something that calls for a decent spec PC though, not an Arduino.

Actually I'd like to track the center of the hand because the size of the hand is big compared to the size of the tracking space. That's another reason to put the emitter at the center of the palm.

cr0sh:
Of all your ideas, setup #4 is likely to be the most practical - if you structure it somewhat akin to how the Mattel PowerGlove (Power Glove - Wikipedia) was built (three ultrasonic receiver elements in a vertical plane, and a single ultrasonic transmitter element on the hand). Note that on the PowerGlove, there were two transmitters on the hand (to detect yaw/pitch/roll of the hand in a limited amount); since you don't indicate the need for yaw/pitch/roll detection, a single element should suffice.

Yes, I think I'll try this setup, also because it is the one with the fewest sensors. I'll check the prices for the PowerGlove on eBay, but haven't ultrasonic sensors evolved a bit since that time?

You may also be interested in this paper, which I found during some googling (it discusses using the wiimote for hand-tracking):

http://rrsg.uct.ac.za/theses/ug_projects/wronski_ugthesis.pdf

Interesting. I discovered this too: Wii IR camera as standalone sensor - Tutorials - RobotShop Community

The Wiimote is reported to have a 100 Hz sampling rate, which is much better than the Kinect.

Another possibility would be to use a couple of web cameras in a frame-locked fashion (finding cameras that work this way isn't easy, though - quality cameras with a high enough frame rate that can do this tend to be expensive), plus some machine vision software (e.g., OpenCV or similar) to perform stereo image analysis and return the position of the hand and fingers (this is going to take a lot more effort on the software side than on the hardware side, of course).

I try to avoid camera-based solutions because they require a lot of processing power on the computer side, and I want to keep that for sound processing. I have been building a multitouch table with only one camera, and yet it needs 50% of a recent processor to run (using CCV).

Finally - if you want high-accuracy in small areas, your best bet will likely be using magnetic tracking; the leaders in this regard have long been the companies Polhemus (http://polhemus.com/) and Ascension Technology (http://www.ascension-tech.com/). Note, though, that their devices are anything but inexpensive (I own an older Ascension Flock of Birds that I managed to pick up cheap off of Ebay - to purchase it new today would run well over $6000.00 USD!).

What's the difference between those systems and a 9DOF IMU containing a 3D magnetometer alongside a 3D accelerometer and a 3D gyroscope?

Recently, there have been a couple of "new arrivals" on the scene of 3D tracking as well - XSens (http://www.xsens.com/) and PNI Sensor (http://www.pnicorp.com/products/spacepoint-gaming).

Xsens seems to be a classical IMU-based solution, which is good for sensing orientation but can be problematic for sensing position, don't you think?

There are also a ton of papers and other information out there on doing 3D tracking, using optics, infrared, ultrasound, capacitive sensing, magnetically, etc - you'll literally find gigabytes of information out there on it. None of it is easy (and the so-called "easy" methods tend to be either noisy or inaccurate - or both), at least not in the way you are imagining. Do some research, and gather more information, before you make a decision on what to try.

Yes. That's what I have been doing for some time, but I don't have much experience in electronics, so I'm learning every day.

Thanks a lot.

Perhaps something capacitive? I'm not too sure how they work, but perhaps an array of sensitive capacitive sensors could detect x,y and calculate z.
I'm not sure at all about the precision, but it's just an idea.

winner10920:
Perhaps something capacitive? I'm not too sure how they work, but perhaps an array of sensitive capacitive sensors could detect x,y and calculate z.
I'm not sure at all about the precision, but it's just an idea.

I began with capacitive sensing after reading this tutorial on Instructables: DIY 3D Controller : 8 Steps (with Pictures) - Instructables. I built it, but it's not accurate at all. I read tons of articles on the subject afterwards, but getting sensing distances that match my project becomes a bit complex. There are issues with the shapes of the electrodes, you need a driven shield, etc. It's too complex for me at the moment, but it could be a solution for someone with a better grasp of the basics of electronics and electric fields.

amundsen:
Yes, I think I'll try this setup, also because it is the one with the fewest sensors. I'll check the prices for the PowerGlove on eBay, but haven't ultrasonic sensors evolved a bit since that time?

Not really. For instance, back in the 1980s, one of the most commonly used ultrasonic sensors for robotics was pulled from an old Polaroid camera; today, that same sensor is still being sold (it is one of the better ones out there), known as the SensComp/Polaroid 6500 series. PowerGloves are getting more difficult to find, though - especially with all the components (you can generally find the glove, but the L-bar might be missing).

amundsen:
I try to avoid camera-based solutions because they require a lot of processing power on the computer side, and I want to keep that for sound processing. I have been building a multitouch table with only one camera, and yet it needs 50% of a recent processor to run (using CCV).

You can use multiple machines, and read the data over the network or other connection; maybe I'm weird - I've got a ton of older PCs and components and other junk in my shop, just begging for being put to a purpose. Even if you don't, PC hardware is a commodity item and very cheap to acquire even brand new. Dedicate a machine to the vision processing and tracking task if you must.

amundsen:
What's the difference between those systems and a 9DOF IMU containing a 3D magnetometer alongside a 3D accelerometer and a 3D gyroscope?

IMU/magnetometer solutions rely on the earth's magnetic field and gravity to determine position and orientation, whereas magnetic trackers rely on a generated (and known) magnetic field output by the device and sensed by three small, mutually orthogonal sense coils mounted on the object being tracked (these sensors used to be quite large - a bit smaller than a golf ball - but recently they've gotten them down to smaller than a grain of rice). The sensing allows both position and orientation to be determined from the generated magnetic field - it tends to be much more accurate than other methods, from what I understand. It also tends to be much more expensive, unfortunately (you wouldn't believe the amount of signal processing needed to clean up noise in the received signals and then calculate the position/orientation).

amundsen:
Xsens seems to be a classical IMU-based solution, which is good for sensing orientation but can be problematic for sensing position, don't you think?

Perhaps - I would think both would have problems with drift over time; but I don't have any direct experience with either, so I can't comment on any specifics. I just wanted to point them out as new possibilities - I have also heard that both solutions are relatively inexpensive, compared to the "professional" offerings by Polhemus/Ascension.
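
To give a feeling for the drift question, here is a small Python sketch that double-integrates a constant accelerometer bias, which is how an IMU-only solution would have to derive position. The bias of 0.01 m/s² and the function name `integrate_bias` are assumptions chosen for illustration, not values from any particular sensor's datasheet.

```python
def integrate_bias(bias_m_s2, seconds, dt=0.01):
    """Double-integrate a constant acceleration bias.

    Returns the accumulated position error in meters after `seconds`.
    """
    velocity = 0.0
    position = 0.0
    for _ in range(int(round(seconds / dt))):
        velocity += bias_m_s2 * dt   # first integration: bias becomes velocity error
        position += velocity * dt    # second integration: velocity error becomes position error
    return position


print(integrate_bias(0.01, 10.0))
```

The error grows with the square of the elapsed time (roughly 0.5 × bias × t²), so with this assumed bias it reaches about half a meter after ten seconds, which is already larger than the 30 cm tracking cube. Orientation does not suffer in the same way because gravity and the magnetometer give an absolute reference.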