Augmented Reality Glasses + Depth Camera -> Stereo Vision


In our current project we use a depth camera mounted on top of the user's head to recognize fingers, hands, and touch events. This works quite well and can be used as a new type of input device.

Our next step is to use augmented reality glasses to display buttons/controls onto the user's palm. For this step we need a transformation of our recognized data (finger tip, corner points of the palm quadrangle) so that we can display them at the correct location on the augmented reality glasses. In the future we want to use a real 3D output scene; for now we are displaying a 2D image on our glasses. You can imagine the whole setup as a stereo view with the depth camera and the user's eyes as the cameras.

To get a transformation matrix we successively display a random point on the output image and the user has to hold a finger tip onto that location. This gives us point correspondences between the input image (depth camera) and the output image (augmented reality glasses). We collect 20 of these correspondences and use Emgu's FindHomography() method to compute the transformation matrix.
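A minimal sketch of that calibration step, written against OpenCV's Python bindings rather than Emgu (both wrap the same findHomography call); the correspondence arrays below are only placeholders for the 20 collected pairs:

```python
import numpy as np
import cv2

# 20 correspondences: finger-tip pixel in the depth image (320x240) paired with
# the pixel where the calibration dot was drawn on the glasses' display.
# A placeholder grid stands in for the measured data.
depth_pts = np.float32([[40 + 60 * (i % 5), 40 + 50 * (i // 5)] for i in range(20)])
glasses_pts = depth_pts * 2.0 + np.float32([50, 30])

# RANSAC rejects correspondences where the user missed the dot.
H, inliers = cv2.findHomography(depth_pts, glasses_pts, cv2.RANSAC, 5.0)

# Map a freshly detected finger tip into glasses coordinates.
tip = np.float32([[[160, 120]]])
tip_on_glasses = cv2.perspectiveTransform(tip, H)
```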

For a first effort this works OK, but it's not perfect. How should we proceed to get better results?

What we have:

  • 2D pixel coordinates in our input image (depth camera, 320x240)
  • 3D coordinates (relative to our depth camera)
  • (corresponding 2D pixel coordinates in the output image)

What we need:
A method that maps a 2D pixel coordinate, or a 3D coordinate relative to our depth camera, onto our output image (2D for now, maybe 3D later).

Question:
What type of transformation should we use here? FindHomography(), GetPerspectiveTransformation(), FundamentalMatrix, EssentialMatrix?

Any help/suggestion is appreciated. Thanks in advance!

First, FindHomography() and GetPerspectiveTransformation() are the same transformation, except that the former makes repetitive attempts at the latter in a RANSAC framework. They match points between planes and aren't suitable for a 3D task. FundamentalMatrix and EssentialMatrix aren't transformations either, just buzz words you have heard ;). If you are trying to re-project a virtual object from the camera system to the glasses' point of view, you have to apply a rotation and translation to the 3D object coordinates and then re-project them.
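To make that first point concrete, here is a small sketch using OpenCV's Python bindings (assumed to behave like the Emgu wrappers of the same functions): getPerspectiveTransform() solves the 3x3 plane-to-plane mapping exactly from 4 point pairs, while findHomography() fits the same planar model to many pairs, here inside a RANSAC loop.

```python
import numpy as np
import cv2

# Exactly four correspondences: direct solve of the 3x3 plane-to-plane mapping.
src4 = np.float32([[0, 0], [319, 0], [319, 239], [0, 239]])
dst4 = np.float32([[12, 8], [610, 20], [598, 465], [6, 455]])
H_exact = cv2.getPerspectiveTransform(src4, dst4)

# Many (possibly noisy) correspondences: the same planar model, estimated by
# repeated minimal fits inside a RANSAC loop that discards outliers.
src_n = np.float32(np.random.rand(20, 2)) * np.float32([320, 240])
dst_n = cv2.perspectiveTransform(src_n.reshape(-1, 1, 2), H_exact).reshape(-1, 2)
H_ransac, mask = cv2.findHomography(src_n, dst_n, cv2.RANSAC, 3.0)

# Both matrices describe a plane-to-plane warp only; they hold no depth (z),
# which is why they break down once the hand leaves the calibration plane.
```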

The sequence of steps is:
1. Find the 3D coordinates of a landmark (say a hand) using the stereo camera;
2. Place a control close to the landmark in 3D space (some virtual button?);
3. Calculate the relative translation and rotation of each of the goggles' viewpoints w.r.t. the stereo camera; for example, you may find that the right goggle's focal point is 3 cm to the right of the stereo camera and rotated 10 deg around the y axis, or something like that. Importantly, the left and right goggles' focal points are shifted in space, which creates image disparity during re-projection (the greater the depth, the smaller the disparity) that the brain interprets as a stereo cue for depth. Note that there are plenty of other cues for depth (for example blur, perspective, known sizes, vergence of the eyes, etc.) that may or may not be consistent with the disparity cue.
4. Apply the inverse of the viewpoint transformation to the virtual 3D object; for example, if the viewer's goggles move to the right (w.r.t. the stereo camera), the object is moved to the left;
5. Project these new 3D coordinates to the image as col = x*f/z + w/2 and row = h/2 - y*f/z; using OpenGL you can make the projection nicer.
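As a rough illustration of steps 3-5, here is a small NumPy sketch; the eye offset, rotation angle, focal length, and display resolution are made-up example values, not calibrated ones.

```python
import numpy as np

def project(points_3d, f, w, h):
    # Pinhole projection from step 5: col = x*f/z + w/2, row = h/2 - y*f/z.
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    return np.stack([x * f / z + w / 2, h / 2 - y * f / z], axis=1)

# Steps 1-2: a virtual button placed next to the detected hand,
# expressed in depth-camera coordinates (meters).
button_corners = np.array([[0.05, 0.00, 0.60],
                           [0.09, 0.00, 0.60],
                           [0.09, 0.04, 0.60],
                           [0.05, 0.04, 0.60]])

# Step 3 (example values): the right eye of the goggles sits 3 cm to the
# right of the depth camera and is rotated 10 degrees around the y axis.
t_eye = np.array([0.03, 0.0, 0.0])
a = np.deg2rad(10.0)
R_eye = np.array([[ np.cos(a), 0.0, np.sin(a)],
                  [ 0.0,       1.0, 0.0      ],
                  [-np.sin(a), 0.0, np.cos(a)]])

# Step 4: apply the inverse of the viewpoint transform to the object
# (if the eye moves right, the object moves left in eye coordinates).
button_in_eye = (button_corners - t_eye) @ R_eye   # same as R_eye.T per point

# Step 5: project into the right eye's image (focal length and size are guesses);
# repeating this with the left eye's pose gives the disparity between the views.
pixels_right_eye = project(button_in_eye, f=500.0, w=640, h=480)
print(pixels_right_eye)
```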

