2009-06-16

Xbox Project Natal

A little less than a year ago I stumbled across the ZCam from 3DV Systems. The company promised a two-orders-of-magnitude decrease in the cost of flash array lidar through mass production; the trick is to market it as a device anyone can use, not just as a robotics or general automation tool. The company promised to be on the market by the end of 2008, and after my emails went unanswered I assumed it was vaporware.

The closest competitor would be the Mesa Imaging SwissRanger, which I think goes for $5000-$10000. Beyond that there are very expensive products from Advanced Scientific Concepts or Ball Aerospace that cost hundreds of thousands of dollars at least. ASC made a deal with iRobot that might bring the price down through economies of scale, though they probably aren't going to put it on the Roomba anytime soon. More likely it would go on the PackBot, which already costs $170K; why not round that up to half a million?

In late 2008 and early 2009 rumors surfaced that Microsoft was going to buy 3DV Systems, and now we have the official announcements about Natal. And of course there is no mention of 3DV Systems (whose webpage hasn't been updated in over a year), or even of how it measures the phase shift or time of flight of light pulses in a sensor array to produce depth images. Given enough processing power, the right software, and good lighting, it would be possible to do everything seen in the Natal videos with a single camera. The next step up would be stereo vision to get depth images; it's possible that's what Natal is, but it seems like they would have mentioned it, since that technology is so conventional.

But that won't stop me from speculating:

Natal is probably a 0.5-2 megapixel webcam combined with a flash lidar with a resolution of 64x64 or 128x128 pixels, and maybe a few dozen discrete depth bins.

The low resolution means there is a ton of software operating on the video image and the depth information to derive the skeletal structure for full body motion capture. All that processing means the speed and precision are going to be somewhat low. It would be great to buy one of these and be able to record body movements for use in 3D animation software, machinima, independent games, or full body 3D chat (there's no intuitive way to handle intersections or collisions with other people, so don't get too excited), but I doubt it will capture a lot of nuance.

The lidar might be continuous wave (CW) like the SwissRanger. This has an interesting property: beyond the maximum range of the sensor, objects appear closer again. If the range were 10 feet, an object 12 feet away would be indistinguishable from one 2 feet away, or 22 feet away.
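
A quick sketch of that wrap-around, assuming the sensor reports distance modulo its unambiguous range (the function and numbers here are my own illustration, not anything published about Natal):

#include <cstdio>
#include <cmath>

// A CW lidar infers range from phase, so the reported distance wraps
// around at the sensor's maximum unambiguous range.
float reported_range(float true_range_ft, float max_range_ft)
{
    return fmodf(true_range_ft, max_range_ft);
}

int main()
{
    // with a 10 ft range, objects at 2, 12, and 22 ft all read as 2 ft
    printf("%.1f %.1f %.1f\n",
           reported_range(2.f, 10.f),
           reported_range(12.f, 10.f),
           reported_range(22.f, 10.f));
    return 0;
}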

Beyond that, hopefully MS sees the potential for this beyond an Xbox peripheral. It would be criminal not to be able to plug this into a PC and have at least Windows drivers, an SDK, and DirectX support. The next most obvious thing would be to use it to promote MS Robotics Studio, and offer a module for that software to use the Natal. If it just has a USB connection then it could be placed on a moderately small mobile robot, and software could use the depth maps for collision avoidance and, with some processing power, compute 3D or 2D grid maps (maybe like this) and figure out when it has returned to the same location.
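
Here's a minimal sketch of what that collision-avoidance step might look like, assuming a hypothetical depth image from such a sensor on a ground robot; every name and number below is made up for illustration:

#include <cmath>
#include <cstring>

// Collapse a depth image into a 2D occupancy grid: each image column
// contributes its nearest return, and that range plus the column's
// bearing give a ground-plane cell to mark as occupied.
const int GRID_W = 64, GRID_H = 64; // grid cells
const float CELL = 0.1f;            // meters per cell; sensor at (GRID_W/2, 0)

void depth_to_grid(const float* depth, int img_w, int img_h,
                   float fov_rad, unsigned char* grid)
{
    memset(grid, 0, GRID_W * GRID_H);
    for (int u = 0; u < img_w; ++u) {
        // nearest return in this column, in meters
        float r = 1e9f;
        for (int v = 0; v < img_h; ++v)
            r = fminf(r, depth[v * img_w + u]);
        // bearing of this column relative to the optical axis
        float ang = ((float)u / (img_w - 1) - 0.5f) * fov_rad;
        int gx = GRID_W / 2 + (int)(r * sinf(ang) / CELL);
        int gy = (int)(r * cosf(ang) / CELL);
        if (gx >= 0 && gx < GRID_W && gy >= 0 && gy < GRID_H)
            grid[gy * GRID_W + gx] = 255; // occupied
    }
}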

The next step is to make a portable camera that takes a high-megapixel normal image along with a depth image. Even with the low resolution and limited range (or range that rolls over), the depth information could be passed on to Photosynth to reduce the number of pictures needed to make a good synth. MS doesn't make cameras, but why not license the technology to Nikon or Canon? Once in dedicated cameras, it's on to cell phone integration...

The one downside is that its weakest application seems to be as a gaming device, which is bad because I'd like it to be very successful in order to inspire competing products and later generations of the same technology. It is certainly not going to have the precision of a Wii MotionPlus, and maybe not even of a standard Wii controller (granted, it can do some interesting things that a Wii controller can't).

But even if it isn't a huge success, it should be possible to get a device from the first generation, and it's only a matter of time before someone hacks it and produces Linux drivers, right?

2009-04-06

OpenCV example, and why does Google do so poorly?

Take searching for cvGetSpatialMoment:
http://www.google.com/search?hl=en&q=cvGetSpatialMoment&btnG=Google+Search&aq=f&oq=

All the top results are nearly useless: just code, which doesn't help much if you don't know what cvGetSpatialMoment does.

The "CV Reference Manual" that comes with an install of OpenCV probably should come up first (the local html files of course aren't Google-searchable), or any real text explanation or tutorial of the function. Scrolling down further, there are some odd but useful sites like http://www.ieeta.pt/~jmadeira/OpenCV/OpenCVdocs/ref/opencvref_cv.htm. I guess the official Willow Garage docs here haven't been linked to enough.

The official OpenCV book on Google Books is highly searchable; some pages are restricted, but many are not.

Through all that frustration I did manage to learn the basics: loading an image, processing a portion of it to look for a certain color, and then finding the center of the region that has that color.

IplImage* image = cvLoadImage( base_filename, CV_LOAD_IMAGE_COLOR );


split it into halves for separate processing (only the left half is shown here)
IplImage* image_left = cvCreateImage( cvSize( image->width/2, image->height), IPL_DEPTH_8U, 3 );
cvSetImageROI( image, cvRect( 0, 0, image->width/2, image->height ) );
cvCopy( image, image_left );
cvResetImageROI( image ); // clear the ROI so later operations see the whole image


convert it to hsv color space
IplImage* image_left_hsv = cvCreateImage( cvSize(image_left->width, image_left->height), IPL_DEPTH_8U, 3 );
cvCvtColor(image_left,image_left_hsv,CV_BGR2HSV);


get only the hue component using the COI (Channel Of Interest) setting
IplImage* image_left_hue = cvCreateImage( cvSize(image_left->width, image_left->height), IPL_DEPTH_8U, 1 );
cvSetImageCOI( image_left_hsv, 1); // channel 1 is hue
cvCopy(image_left_hsv, image_left_hue);
cvSetImageCOI( image_left_hsv, 0); // clear the COI again


find only the parts of the image within a certain hue range (huemin and huemax are the chosen bounds, and the mask image has to be created first)
IplImage* image_msk = cvCreateImage( cvSize(image_left_hue->width, image_left_hue->height), IPL_DEPTH_8U, 1 );
cvInRangeS(image_left_hue, cvScalarAll(huemin), cvScalarAll(huemax), image_msk); // 8-bit hue in OpenCV runs 0-179


erode it down to get rid of noise
cvErode(image_msk,image_msk,NULL, 3); // 3 iterations with the default 3x3 element


and then find the centers of mass of the found regions
CvMoments moments;
cvMoments(image_msk, &moments, 1); // 1 = treat the mask as binary
double m00, m10, m01;

m00 = cvGetSpatialMoment(&moments, 0,0);
m10 = cvGetSpatialMoment(&moments, 1,0);
m01 = cvGetSpatialMoment(&moments, 0,1);

float center_x = 0.f;
float center_y = 0.f;
if (m00 != 0) { // m00 is the blob area; zero means nothing matched the hue range
    center_x = m10/m00;
    center_y = m01/m00;
}


Copy the single-channel mask back into a three-channel RGB image

IplImage* image_rgb = cvCreateImage( cvSize(image_msk->width, image_msk->height), IPL_DEPTH_8U, 3 );
cvSetZero( image_rgb ); // cvCreateImage doesn't zero out the new image
cvSetImageCOI( image_rgb, 2); // channel 2 is green
cvCopy(image_msk,image_rgb);
cvSetImageCOI( image_rgb, 0);


and then draw a circle on the temp image at the found center of mass
cvCircle(image_rgb, cvPoint(cvRound(center_x), cvRound(center_y)), 10, CV_RGB(200,50,50), 3);


All the work of setting channels of interest and regions of interest was new to me. I could have operated on the images in place rather than creating many new ones that take up more memory (and that I have to remember to free), but for debugging it's nice to keep the intermediate steps around.
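
For reference, that cleanup would just be one cvReleaseImage per intermediate image (this isn't in the snippet above):

cvReleaseImage( &image_rgb );
cvReleaseImage( &image_msk );
cvReleaseImage( &image_left_hue );
cvReleaseImage( &image_left_hsv );
cvReleaseImage( &image_left );
cvReleaseImage( &image );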

2009-03-29

mewantee example

I've made enough fixes to mewantee to open it up and allow most of it to be viewed without logging in, and creating a user no longer requires activation.

There isn't much on there right now, but I have a good example: there's a project called crossephex I was working on a few months ago, and I'll probably start on it again soon. It's supposed to be a vj/visuals-generating tool for Processing, similar to gephex. I need a bunch of basic graphics to use as primitives to mix with each other to create interesting effects, so on mewantee I have this request, which asks for help from other people generating those graphics. Each one shouldn't take more than a few minutes to make; of course I could do them myself, but I think it's a good example of what the site might be good for.

2009-03-25

mewantee!

I created a website called mewantee using Google App Engine. It's closed to the public right now, but I need some users to try it out and tell me if they run into any problems using it normally, or give any feedback at all. If you log in with a gmail account (Google handles the login; I won't know anything except your email address, and even that will be hidden from other users), I'll be sent a notification email and can then activate your account.

What is it about? Mainly I'd like it to incentivize the creation of Creative Commons and open source content, and it uses a sort of economic model to do that. Even if it is too strange, or the kind of users needed to make it work don't show up, it was a good exercise for learning Python and App Engine.

Something else to figure out: I have mewantee.com pointing to mewantee.appspot.com; is there any way to make it stay mewantee.com to everyone else, the way this blog is really on blogspot.com but is seen as binarymillenium.com?

2009-02-21

Gephex 0.4.3 updated for Ubuntu 8.10

Since there hasn't been a better version of Gephex since 0.4.3 (though I haven't tried compiling from the repository recently; my last attempt was not successful), I've downloaded the source and hacked at it until it built on a fully updated Ubuntu 8.10:

http://binarymillenium.googlecode.com/files/gephex-0.4.3updated.tgz

I haven't tested it all the way, especially the video input modules, but it probably works.

Most of the changes have to do with updates to gcc: newer versions treat redundant classname:: qualifications on method declarations as errors, and some files needed to include stdlib.h or string.h that didn't before. Also, a structure definition in libavcodec had to be messed with: its static declaration removed.
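
For reference, this is the kind of construct newer gcc rejects (a made-up class for illustration, not actual gephex code):

// Older gcc tolerated an in-class declaration written as
//   void Mixer::process();
// but newer versions fail with
//   error: extra qualification 'Mixer::' on member 'process'
// The fix is simply to drop the redundant qualification:
class Mixer
{
public:
    void process();
};

// the qualification is still required on the out-of-class definition
void Mixer::process() { }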

nasm, qt3 (in the form of libqt3-headers), and libxv-dev had to be installed (plus other things non-standard for 8.10 that I already had installed for other purposes). For qt3, flags for the include, bin, and lib dirs needed to be passed to configure.

I had to run configure in the ffmpeg library and disable mmx with the --disable-mmx flag; putting that flag in the top-level makefile didn't work. My configuration-specific makefiles are in the tarball, so you would definitely have to rerun configure to override them.

Next I'll be creating a new custom gephex module for my ARToolkit multimarker UI project.

----

Update

I've tested this build more extensively and have discovered that the Ubuntu visual effects that are on by default cause the gephex output window to flicker. To disable them, go to System | Preferences | Appearance | Visual Effects and select None. It's possible that if I built gephex with OpenGL support, these options would co-exist better.

Also, the screencap frei0r module that I've depended on extensively in the past updates extremely slowly on the laptop I'm currently using. It may be an ATI thing (I originally developed it on an Nvidia system).