
2009-09-12

Instructions for rendering with Processing on Amazon EC2

There are detailed instructions elsewhere on how to get started with EC2 in general; here are the high-level steps for my headless rendering project:

Get a unix command-line environment that has python and ssh. I use cygwin under Windows; other times I dual-boot into Ubuntu.

Get an Amazon EC2 account, create a ~/username.pem key file, and set environment variables for the keys (follow the boto instructions).
Make sure the pem file permissions are set to 700.

Edit ssh_config so that StrictHostKeyChecking is set to no, otherwise ssh sessions started by the scripts will ask whether it's okay to connect to every newly created instance - I could probably automate that response instead.

Make sure there are no carriage returns (\r) in the pem file in Linux.

Get Elasticfox, put your credentials in.

Get boto

Get trajectorset

Create a security group called http that at least allows your IP to access a webserver on any EC2 instance that uses it (a boto sketch of this follows).
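That last step might look roughly like this with classic boto, assuming the keys are already in the environment; the IP address below is just a placeholder for your own:

import boto
from boto.exception import EC2ResponseError

conn = boto.connect_ec2()  # picks up the AWS keys from the environment variables

try:
    group = conn.create_security_group('http', 'web access to render results')
except EC2ResponseError:
    group = conn.get_all_security_groups(['http'])[0]  # group already exists

# allow port 80 from your own address (placeholder IP - substitute yours)
group.authorize('tcp', 80, 80, '203.0.113.5/32')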

At this point it should be possible to run ec2start.py, visit the IP address of the head node, and watch the results come in. The ec2start script launches a few instances: one head node that creates noise seeds, sends them to the worker nodes via SQS, and then waits for the workers to process the seeds and send SQS messages back. The head node then copies the result files, renders the graphics, and copies the latest output to a folder that index.html can display on the web.
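For reference, the SQS round trip that pattern relies on looks roughly like this with classic boto; this is only a sketch of the pattern, not the actual ec2start.py code, and the queue names are made up:

import boto
from boto.sqs.message import Message

conn = boto.connect_sqs()
seeds = conn.create_queue('noise_seeds')       # head -> workers
results = conn.create_queue('render_results')  # workers -> head

# head node: hand a noise seed to a worker
m = Message()
m.set_body('12345')
seeds.write(m)

# worker node: pick up a seed, do the work, report back
msg = seeds.read(visibility_timeout=120)
if msg is not None:
    seed = int(msg.get_body())
    # ... run the exported Processing app with this seed ...
    seeds.delete_message(msg)
    done = Message()
    done.set_body('results_%d.csv' % seed)  # name of the results file to copy
    results.write(done)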

My code is mainly for demonstration, so here are the key things I did that should help with adapting it to other applications:

Custom AMI

You can use the AMI I created, with the id 'ami-2bfd1d42'. I started from one of the Alestic Ubuntu AMIs and added Java, Xvfb, Boto, and a webserver like lighttpd (I forget if Xvfb was already installed or not).

Headless rendering

EC2 instances lack a graphics context at first, so trying to run a graphical application like an exported Processing project will not work (TBD: did I ever try that?). Xvfb creates a virtual frame buffer that Processing can render to after running these commands:

Xvfb :2
export DISPLAY=:2


Launching processes and detaching from them

I use python subprocess.Popen frequently to execute commands on the instances like this:

cmd = "Xvfb :2"
whole_cmd = "ssh -i ~/lucasw.pem root@" + dns_name + " \"" + cmd + "\""
proc = subprocess.Popen(whole_cmd, shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(stdout,stderr) = proc.communicate()

The problem comes when one wants to run something, close the connection, and leave it running - like Xvfb above, it needs to start and stay running. One method is to leave the ssh connection open, but there is a limit of about 20 concurrent ssh sessions.

The trick is to use nohup:
cmd = "nohup Xvfb :2"


Don't put extra quotes around the command to execute, which brings me to the next topic.

Quote escaping

There are a few bash commands that require parts to be in quotes - but in python the bash command is already in quotes, and python will not understand the inner set of quotes unless they are escaped with a backslash:
cmd = "echo \"blah\" > temp.txt";

Then at other times an additional level of quote escaping is required:
cmd = "echo \\\"blah\\\" > temp.txt";
(I do this when I pass the whole cmd variable to ssh to be executed, since ssh wants it in quotes.)

One backslash escapes one level of quoting; why do three escape two levels? It's because the escaping backslash itself needs to be escaped. This gets confusing fast, and some experimentation with python in interactive mode is required to get it right.
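A quick way to see what each level of escaping actually produces is to print the strings; this is just an illustration, with a placeholder host name:

cmd = "echo \\\"blah\\\" > temp.txt"
print(cmd)        # echo \"blah\" > temp.txt  - one level left for the remote shell

whole_cmd = "ssh somehost \"" + cmd + "\""
print(whole_cmd)  # ssh somehost "echo \"blah\" > temp.txt"

# the local shell strips the outer quotes and the backslashes, so the remote
# shell finally runs: echo "blah" > temp.txt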

Config file driven

It isn't currently, at least not as much as it needs to be, which makes the project brittle: changing the plots requires about three separate edits, when a single config file ought to specify everything.

2009-08-27

Computing Cloud Rendering with Processing and Amazon EC2

This project is my first experiment with using Amazon EC2 for cloud rendering. The source code is all there and I'll post detailed instructions on how to use it later, but here is a sped-up video of the output:

Computing Cloud Rendering from binarymillenium on Vimeo.



It looks kind of cool, but not too exciting; still, there's potential for better things.

What I've done is launch several compute instances on EC2: worker nodes create the individual lines seen in the plots and pass data back to a head node, which creates the plots, puts them on a web page for real-time feedback, and stores all the frames for retrieval at the end of the run.

The plots are aggregations of all the results: blue marks the presence of any line, white a high density of lines, and a greenish tinge marks lines from a recently aggregated set. It's interesting because the more lines are aggregated, the less the plot changes, so it becomes increasingly boring.

All the plotting and data generation is done using java applications exported from Processing. 3D graphics are also possible, and something like this earlier video could be ported to the scripts I've made. There is no graphics card accessible on the EC2 machines, but virtual frame buffer software like Xvfb and software rendering (either Processing's P3D or software opengl) make it possible to trick the application into thinking there is.

It's not distributed rendering since all the rendering is on one computer, but I think I need to distribute the rendering in order to speed it up.

There is potential for more dynamic applications, involving user interaction through webpages, or simulations that interact with the results of previous simulations, and communicate with other nodes to alter what they are doing.

2009-08-05

Quick jmatio in Processing example

1. Download jmatio from the Mathworks file exchange
2. Unzip and put the contents in a folder called jmatio
3. Rename the lib dir to library
4. Rename library/jamtio.jar to library/jmatio.jar
5. Create a mat file in the sketch data dir called veh_x.mat which contains an array called veh_x
6. Run the following code:


import com.jmatio.io.*;
import com.jmatio.types.*;

MatFileReader mfr = null;
try {
  // read the .mat file from the sketch's data folder
  mfr = new MatFileReader(sketchPath + "/data/veh_x.mat");
} catch (IOException e) {
  e.printStackTrace();
  exit();
}

if (mfr != null) {
  // pull out the array named veh_x and print its dimensions and first element
  double[][] data = ((MLDouble)mfr.getMLArray("veh_x")).getArray();

  println(data.length + " " + data[0].length + " " + data[0][0]);
}



TBD: use getContents instead of requiring that the mat file name and the array name be the same.

2009-02-18

Marker Tracking as Visualization Interface

My idea is that I could do an ARToolkit-based visualization performance by using a clear table with markers I can slide, rotate, add, and remove, with all those movements corresponding to events on screen. Unlike other AR videos, the source video wouldn't necessarily be incorporated into the output; the markers provide an almost infinitely expressive set of UI knobs and sliders.

So far I have this:


AR User Interface from binarymillenium on Vimeo.

The lighting is difficult: the markers need to be white and black pixels, but the plexiglass tends to produce reflections. Also, if the light source itself is visible, a marker can't sit right on top of it. I need a completely black backdrop under the plexiglass so there are no reflections to obscure the markers, and also more numerous and softer diffuse lights.

One way to solve the reflection problem is to have the camera looking down at a table, though it's a little harder to get the camera up high enough, and I didn't want my hands or body to obscure the markers- the clear table idea is more elegant and self-contained.

The frame rate isn't very high; I need to work on making it all more real-time and responsive. It may be that one computer has to capture video and find marker positions, sending them to another computer that is completely free to do the visualization. More interpolation and position prediction could also smooth things out and cover gaps when a marker isn't recognized in a frame, though that could add lag.

2009-01-29

Bundler - the Photosynth core algorithms GPLed

bundler 212009 65922 AM.bmp
[update- the output of bundler is less misaligned looking than this, I was incorrectly displaying the results here and in the video]

Bundler (http://phototour.cs.washington.edu/bundler) takes photographs and can create 3D point clouds and camera positions derived from them, similar to what Photosynth does - this is called structure from motion. It's hard to believe this has been out as long as the publicly available Photosynth and I haven't heard about it; it seems to be in stealth mode.


Bundler - GPLed Photosynth - Car from binarymillenium on Vimeo.

From that video it is apparent that highly textured flat surfaces do best. The car is reflective and dull grey and so generates few correspondences, but the hubcaps, license plate, parking strip lines, and grass and trees work well. I wonder if this could be combined with a space carving technique to get a better car out of it.

It's a lot rougher around the edges, lacking the Microsoft Live Labs contribution. A few sets I've tried have crashed with messages like "RunBundler.sh: line 60: 2404 Segmentation fault (core dumped) $MATCHKEYS list_keys.txt matches.init.txt", and sometimes individual images trigger "This application has requested the Runtime to terminate it...", but it appears to plow through (until it reaches that former error).

Images without good EXIF data trip it up. The other day I was trying to search flickr for only images that have EXIF data and allow full view, but I haven't been successful so far. Some search strings supposedly limit results by focal length, which seems like it would limit results to images with EXIF data, but that wasn't the case.

Bundler outputs ply files, which can be read in Meshlab with the modification that these two lines be added to the ply header:

element face 0
property list uchar int vertex_index

Without these two lines Meshlab will give an error about there being no faces, and give up.
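If editing the header by hand gets tedious, a small Python sketch like this can do the patch, assuming an ascii ply; the file names are placeholders:

lines = open('bundle.ply').read().splitlines()
out = []
for line in lines:
    if line.strip() == 'end_header':
        # declare an empty face element so Meshlab is satisfied
        out.append('element face 0')
        out.append('property list uchar int vertex_index')
    out.append(line)
open('bundle_fixed.ply', 'w').write('\n'.join(out) + '\n')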

Also I have some Processing software that is a little less user friendly but doesn't require the editing:

http://code.google.com/p/binarymillenium/source/browse/trunk/processing/bundler/


Bundler can't handle filenames with spaces right now. I think I can fix this myself without too much work; it's mostly a matter of making sure names are passed everywhere with quotes around them.

Multi-megapixel files bog sift down until it crashes after taking a couple of gigabytes of memory (and probably not being able to get more from Windows):

...
[Found in EXIF tags]
[CCD width = 5.720mm]
[Resolution = 3072 x 2304]
[Focal length (pixels) = 3114.965
[Found 18 good images]
[- Extracting keypoints -]

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.


Resizing them to 1600x1200 worked without crashing and took only a few hundred megabytes of memory per image, so somewhat more megapixels may work as well.

The most intriguing feature is the incremental option. I haven't tested it yet, but it promises to be able to take new images and incorporate them into existing bundles. Unfortunately each new image has a matching time proportional to the number of previous images; maybe it would be possible to incrementally remove images as well, or remove found points in regions that already have high point densities?

2008-11-21

Depth buffer to 3d coordinates?

I'm having trouble transforming screen coordinates back to 3d, which this post describes- can anyone help me?





---
Update: I've got it figured out now; I should have been using gluUnProject:



FloatBuffer fb = BufferUtil.newFloatBuffer(width*height);

// read the whole depth buffer into fb
gl.glReadPixels(0, 0, width, height, GL.GL_DEPTH_COMPONENT, GL.GL_FLOAT, fb);
fb.rewind();

// grab the viewport and the projection/modelview matrices for gluUnProject
int[] viewport = new int[4];
double[] proj = new double[16];
double[] model = new double[16];
gl.glGetIntegerv(GL.GL_VIEWPORT, viewport, 0);
gl.glGetDoublev(GL.GL_PROJECTION_MATRIX, proj, 0);
gl.glGetDoublev(GL.GL_MODELVIEW_MATRIX, model, 0);

double[] pos = new double[3];
...
for (int i...   // loop over every pixel column
  for (int j... // and every row
    ...
    // rawd is the raw depth sample for this pixel; note the flipped y
    glu.gluUnProject(i, height-j, rawd, model, 0, proj, 0, viewport, 0, pos, 0);
    float d = (float)-pos[2];


After all that depth d will be linear and in proper world coordinates.

2008-09-08

Increased Dynamic Range For Depth Maps, and Collages in Picasa 3


360 Vision + 3rd Person Composite from binarymillenium on Vimeo.

After I compressed the above video into a WMV I was dissatisfied with how little depth detail there is in the 360-degree vision part - it's the top strip. I initially liked the cleaner single-shade look, but now I realize the utility of using a range of colors for depth fields (or IR/thermal imaging as well): it increases the number of colors available to represent different depths beyond 256. Earlier I tried using one color channel for higher-order bits and another for lower-order bits (so the depth could be computed like red*256+green) for a total of 256*256 depth levels (or even 256^3 or 256^4 using alpha), but visually it's a mess.

But visual integrity can be maintained while multiplying those 256 levels by five or a bit more, with some additional work.

Take six colors: three of them are pure red, green, and blue, and in between there are (255,255,0) for yellow and the other two pure combinations of two channels. Between each subsequent pair there can be 256 interpolated values, and in the end a color bar like the following is generated with 1280 different values:





The bottom color bar shows the differences between adjacent values; if the difference were zero anywhere it would be black in spots, so my interpolation is verified.
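A sketch of how such a ramp can be built: six anchor colors with 256 interpolated steps between each adjacent pair gives 5*256 = 1280 levels. The anchor ordering here is an assumption, and the endpoint handling is glossed over:

anchors = [(255, 0, 0), (255, 255, 0), (0, 255, 0),
           (0, 255, 255), (0, 0, 255), (255, 0, 255)]

ramp = []
for (r0, g0, b0), (r1, g1, b1) in zip(anchors, anchors[1:]):
    for t in range(256):
        ramp.append((r0 + (r1 - r0) * t // 255,
                     g0 + (g1 - g0) * t // 255,
                     b0 + (b1 - b0) * t // 255))

print(len(ramp))  # 1280 depth levels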

Applying this to the lidar data, I produced a series of images with a Processing project:



After making all the images I tried out Picasa 3 to produce a collage- the straightforward grid makes the most sense here. Picasa 3 crashed a few times in the collage editor before I was able to get this exported.

2008-08-28

Color Correction



I have the colors figured out now: I was forgetting to byteswap the two color bytes, and after that the rgb elements line up nicely. And it's 5:6:5 bits per color channel rather than 4 as I thought previously, thanks to Marvin who commented below.

The sphinx above looks right, but earlier the boxer shown below looked so wrong I colored it falsely to make the video:



The Boxer - Photosynth Export from binarymillenium on Vimeo.

But I've fixed the boxer now:



The python script is updated with this code:


bin.byteswap()                # swap the two color bytes first, as noted above
red = (bin[0] >> 11) & 0x1f   # top 5 bits
green = (bin[0] >> 5) & 0x3f  # middle 6 bits
blue = (bin[0] >> 0) & 0x1f   # bottom 5 bits

2008-08-27

Exporting Point Clouds From Photosynth


Since my last post about photosynth I've revisited the site and discovered that the pictures can be toggled off with the 'p' key, and the viewing experience is much improved when there is a good point cloud underneath. But what use is a point cloud inside a browser window if it can't be exported to be manipulated into random videos (like all the lidar videos I've made), or turned into 3D meshes and used in Maya or any other program?

Supposedly export will be added in the future, but I'm impatient like one of the posters on that thread so I've gone forward and figured out my own export method without any deep hacking that might violate the terms of use.

Using one of those programs that intercept 3D api calls might work, though maybe not with DirectX or however the photosynth browser window works. What I found with Wireshark is that http requests are made for a series of points_m_n.bin files. The m is the group number; if the photosynth is 100% synthy then there will only be one group, labeled 0. The n splits the point cloud into smaller files; for a small synth there could be just points_0_0.bin.

Inside each bin file is raw binary data. There is a variable-length header which I have no idea how to interpret; sometimes it is 15 bytes long and sometimes hundreds or thousands of bytes long (though it seems to be shorter in smaller synths).

But after the header there is a regular series of position and color records, each 14 bytes long. The first 3 sets of 4 bytes are the xyz position as floating point values. In python I had to do a byteswap on those bytes (presumably from network order) to get them to be read in correctly.

The last 2 bytes are the color of the point. It's only 4 bits per color channel, which is strange. The first four bits I don't know about; the last three sets of 4 bits are red, blue, and green. Why not 8 bits per channel? Does the photosynth process not produce that level of precision because it is only loosely matching the color of corresponding points in photos? Anyway, as the picture above shows, I'm doing the color wrong: if I have a pure red or green synth it looks right, but maybe a different color model than standard rgb is at work.
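Here is a sketch of reading the records as just described, assuming big-endian (network order) values and a header length found by inspection; the color word is left raw here, since the color correction post above (2008-08-28) worked out that it's actually 5:6:5 bits across the two bytes rather than 4 per channel:

import struct

header_len = 15  # varies per synth - find it by inspection
record = struct.Struct('>fffH')  # 3 big-endian floats + a 16-bit color word

data = open('points_0_0.bin', 'rb').read()[header_len:]
points = []
for offset in range(0, len(data) - record.size + 1, record.size):
    x, y, z, color = record.unpack_from(data, offset)
    points.append((x, y, z, color))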

I tried making a photosynth of photos that were masked to be blue only, and zero synthiness resulted. Is it ignoring blue because it doesn't want to synth up the sky in photos?

Anyway here is the python script for interpreting the bin files.

The sceneviewer (taken from the Radiohead sceneviewer) in that source dir works well for displaying them also.

Anyway, to repeat this for any synth: Wireshark is needed to figure out where the bin files are served from (filter with http.request), then they can be downloaded in firefox or with wget or curl, then my script can be run on them, and processing can view the result. The terms of use don't clearly specify how the point clouds are covered, so redistribution of point clouds, especially from synths that aren't your own or that someone didn't CC license, may not be kosher.

2008-08-20

Makeavi


Discovered a neat windows (and vista) tool for turning image sequences into videos: http://makeavi.sourceforge.net/


1280x720 in the 'Microsoft Video 1' format worked well, though 57 MB of pngs turned into 135 MB of video. 'Uncompressed' didn't produce a video, just a small 23 KB file. 'Intel IYUV' sort of produced a video, but not correctly. 'Cinepak' only output a single frame. 'VP60 Simple profile' and 'VP61 Advanced Profile' with the default settings worked, and actually produce video smaller than the source images, though quicktime player didn't like those files. Vimeo seems to think VP61 is okay:


More Velodyne Lidar - overhead view from binarymillenium on Vimeo.

This new video is similar to the animated gifs I was producing earlier, but uses a new set of data. Vimeo seems to be acting up this morning: I got 75% through an upload of the entire file (the above is just a subset) and it locked up. I may try a shorter test video to see if it works.

I have around 10 gigs of lidar data from Velodyne, and of course no way to host it.

My process for taking pcap files and exporting the raw data has run into a hitch- wireshark crashes when trying to 'follow udp stream' for pcap files larger than a couple hundred megabytes. Maybe there is another tool that can do the conversion to raw?

2008-08-03

Animated gif of height map

animated gif of height map

Source code is here:

http://code.google.com/p/binarymillenium/source/browse/#svn/trunk/processing/velodyne

One interesting thing I discovered is that animated gifs with an alpha channel don't just let the background of the web page show through, they also don't clear the last frame of the gif, which was confusing for this gif before I blackened the background with ImageMagick:

for i in *png; do convert $i -background black -flatten +matte flat_$i; done
convert flat*png velodyne_hgt.gif

Velodyne Lidar Sample Data

Applying the db.xml calibration file

As with the pcap parsing, I originally thought I'd look into Python xml parsing. I'm sure that if I were really interested in parsing xml I would have tried some of the libraries out, but the interface I was hoping to find would look like this:


import pyxmlthing
a = pyxmlthing.load("some.xml")
some_array= a.item('9').rotCorrection


And I would have an array of all the rotCorrections of all the items of type 9. Instead I found several xml parsers that required tons of code just to get to a point where I'm still not sure whether I could get at the rotCorrections or not. Which may be why I don't care for xml; flat files are it for me.

So I just used vim to strip all the xml out and leave me with a nice text file that looks like this


0, -3.8, -7.0046468, 20, 21.560343, -2.5999999
1, -1.5, -6.7674689, 26, 21.516994, 2.5999999
2, 5, 0.44408101, 28, 20.617426, -2.5999999
3, 6.8000002, 0.78093398, 32, 20.574717, 2.5999999



Where the first column is the laser index (0-63), the next might be the rotCorrection, and so on.
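Reading that flat file back into a per-laser lookup is then only a few lines of Python; the file name is a placeholder, and which column is which correction is still a guess:

calib = {}
for line in open('db_flat.txt'):
    fields = [float(v) for v in line.split(',')]
    laser = int(fields[0])
    calib[laser] = fields[1:]  # rotCorrection first, then the other corrections

print(calib[0][0])  # -3.8 for laser 0 in the sample above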

To apply the calibration data, it's very important that the indexing derived from the raw data is correct: reading the 0xDDEE vs. the 0xDDFF (or similar) value that designates the upper or lower laser block matters.

The velodyne manual doesn't have a good diagram showing the xyz axes and the positive or negative directions for the original lidar angle and the correction angles and offsets, so some experimentation is necessary there. The manual did mention that in this case the rotCorrection has to be subtracted from the base rotation angle. The vertical and horizontal offsets are pretty minor for my visualization, but obviously important for accurate measurements.





Processing viewer

Using Aaron Koblin's House of Cards SceneViewer as a starting point, I wrote code to take the point cloud data and display it:


Monterey Full from binarymillenium on Vimeo.

The data was split into files each containing a million data points and spanning a second in time. With some testing I found the lidar was spinning at about 10 Hz (of the possible 5, 10, or 15 Hz), so I would bite off 1/10 of the file and display that in the current frame, then the next 1/10th for the next frame, and then load the next file after 10 frames. A more consistent approach would be to split the data into a new text file for each frame.

Next Steps

Right now I'm running a large job to process each point-cloud frame into png height-map files as I did for the Radiohead data. It doesn't work as well with 360 degrees of heights: some distances just have to be cut off and ignored to keep the resolution low (although with the png compression, large empty spaces don't really take up additional room). A lot of detail is lost on the nearby objects compared to the empty space kept between distant objects.

So either using that data or going back to processing the raw point cloud, I'd like to track features frame-to-frame and derive the vehicle motion from them. I suspect this will be very difficult. Once I have the vehicle motion, I could create per-frame transformations that could create one massive point cloud or height map where still objects are in their proper places and other vehicles probably become blurs.

After that, if I can get a dataset from Velodyne or another source where a moving ground lidar travels in a circle or otherwise intersects its own path somewhere, the proof that the algorithm works will be whether that point of intersection actually lines up in the big point cloud (though I suspect more advanced logic is needed to re-align the data once the algorithm determines that it has encountered features it has seen before).

2008-07-30

Velodyne Lidar- Monterey Data




There's something wrong with how I'm viewing the data: the streaks indicate that every revolution of the lidar results in an angular drift. Maybe I'm re-adding a rotational offset and allowing it to accumulate?

No, it turned out to be a combination of things, mainly applying the wrong calibration data (essentially swapping the lower and upper laser blocks). Now it's starting to look pretty good, though there is a lot of blurriness because the environment is changing over time; that's a car pulling out into an intersection.




Velodyne Lidar - Monterey from binarymillenium on Vimeo.

2008-07-29

Velodyne Lidar Sample Data: Getting a .pcap into Python

Velodyne has provided me with this sample data from their HDL-64E lidar.

Instead of data exported from their viewer software, it's a pcap captured with Wireshark or another network capture tool that uses the standard pcap format.

Initially I was trying to extract the lidar packets with libpcap and python, using pcapy or something similar, and went through a lot of trouble getting the pcap library to build in the cygwin environment. Python and libpcap were able to load the data from the pcap, but rendered the binary into a long string with escape codes like '\xff'.

I then discovered that Wireshark has a 'follow udp stream' function: right-click on the data part of a packet in Wireshark and then export as 'raw'. The exported data no longer preserves the division between packets, but since each packet has a consistent length (1206 bytes) it's easy to parse.



Python does work well for loading binary data from a file:


import array

f = open('data.raw', 'rb') # the data exported from wireshark in the raw format; 'rb' matters on Windows

bin = array.array('B') # set up an array of typecode unsigned byte

bin.fromfile(f, 1206) # each packet has 1206 bytes of data



The contents of bin are now 1206 bytes of data that look like this:


>>> bin
array('B', [255, 221, 33, 39, 5, 9, 67, 178, 8, 116, 160, 13, 126, 222, 13, 63, 217, 8, 162, 204, ...


The 'B' typecode doesn't prevent indexing into the real data with bin[0] and getting 255 and so on.

The first two bytes indicate whether the data is from the upper or lower laser array, and the 33 39 is interpreted to mean the reading was taken when the sensor head was rotated to (39*256+33)/100 or 100.17 degrees.

For 32 times after those first four bytes (block id plus rotation) there are pairs of distance bytes followed by a single intensity byte, and then the block starts over, for a total of 12 blocks... and there are 6 more bytes at the end with information that is unnecessary for now. See here for what I currently have for parsing the raw file; later I will get into adding the per-laser calibration data.
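Putting the layout together, a sketch of parsing one 1206-byte packet might look like this; the block id test and the rotation scale are inferences from the data above and should be checked against the manual:

import struct

def parse_packet(packet):
    # packet is one 1206-byte string sliced out of the raw file
    blocks = []
    for b in range(12):
        offset = b * 100  # 2 id bytes + 2 rotation bytes + 32 * 3 bytes
        block_id, rotation = struct.unpack_from('<HH', packet, offset)
        lower = (block_id == 0xDDFF)  # 0xDDFF marks the lower block in this capture
        angle = rotation / 100.0      # hundredths of a degree
        returns = []
        for laser in range(32):
            dist, intensity = struct.unpack_from('<HB', packet, offset + 4 + laser * 3)
            returns.append((dist, intensity))
        blocks.append((lower, angle, returns))
    return blocks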

2008-07-26

More HoC: Preprocessing into pngs

house of cards height
house of cards intensity

2008-07-21

Radiohead - House of Cards 2


Radiohead - House of Cards from binarymillenium on Vimeo.

This rendered slowly, less than 1 fps. One possible speedup would be to pre-process the csv data into binary heightmap files, rather than loading and processing each frame.

Processing code is here:
http://code.google.com/p/binarymillenium/source/browse/trunk/processing/hoc/hoc.pde

It would be nice if they uploaded the data from the woman singing, and the party scene.

2008-07-15

Radiohead - House of Cards



I've been playing with the data for a couple of hours in Processing. The main problem is that the points are not consistent across animation frames, so it is necessary to produce a new set of points in a regular grid that can then be tessellated easily. By the end of the week I ought to have a video up in the official group on youtube and in higher quality on vimeo.
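The regular-grid idea is roughly this: bin the scattered points into a fixed grid each frame, so every frame shares the same vertex layout and can be tessellated the same way. A rough sketch, with the grid size and extents as made-up values:

GRID_W, GRID_H = 256, 256
X_MIN, X_MAX = -100.0, 100.0
Y_MIN, Y_MAX = -100.0, 100.0

def to_heightmap(points):
    # points is a list of (x, y, z) tuples from one frame of the csv data
    heights = [[0.0] * GRID_W for _ in range(GRID_H)]
    counts = [[0] * GRID_W for _ in range(GRID_H)]
    for x, y, z in points:
        i = int((x - X_MIN) / (X_MAX - X_MIN) * (GRID_W - 1))
        j = int((y - Y_MIN) / (Y_MAX - Y_MIN) * (GRID_H - 1))
        if 0 <= i < GRID_W and 0 <= j < GRID_H:
            heights[j][i] += z
            counts[j][i] += 1
    for j in range(GRID_H):
        for i in range(GRID_W):
            if counts[j][i]:
                heights[j][i] /= counts[j][i]  # average z where a cell got multiple points
    return heights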

2008-06-23

openprocessing

This site is good for showcasing Processing projects, but it's really rough around the edges: it lacks almost every feature of sites for sharing photos or art, except for leaving comments and tagging. The 'viewed nn times' count increments when the page is reloaded, which is like a hit counter straight out of 1997. There's no really good way for good work to float to the top and be seen more, except for an unknown process by which editors select certain submissions for exhibition. There are a lot of applets that don't work but aren't hidden or demoted from view.

It's not as though there are no other options for hosting java applets for free: google code works pretty well, and by putting google analytics javascript into the html page it's easy to track the traffic to the page.

There do seem to be a number of people who visit the site and look at my submissions, but I think I get more unique views from the exhibition rollup on the official processing.org site.

2008-06-14

thingamajiggr, and 2 new Processing effects

I'm beginning to incorporate more Processing effects into my sets, so I developed two for thingamajiggr. They are less friendly to live alteration than gephex graphs, but if the bulk of the effect is done ahead of time it's easy to live-code simple keyboard controls or change parameters.

I need a way to get video or screen captures into Processing; on Linux this is not easy. If something doesn't already exist, I'm thinking of sending image data from gephex to Processing through an internal loopback path.

Audio input used to work for me but now it doesn't.

First effect: multicolored perlin lines. When shown live the colors dovetailed with led lighting of an artwork in the main theater room:





(click on the image to see the java applet)






The second effect is a standard gas/fluid simulation that uses springs. I spent a lot of time trying to get more complicated flow fields working, but I ended up with something moderately simple:



(click on the picture to see and play with the java applet)

Things that went wrong or I need to fix:

Holding my laptop on my lap is not ideal when it has an rgb output without locking screw holes; the connector fell out a few times. The other lesson is to always have a good surface so the laptop can be left still.

Not being able to eliminate window manager titles and borders around processing output. I should find a more minimalist window manager.

2008-06-08

Optical Flow in Processing


This is the standard Lucas-Kanade method as described in the wiki page entry on it here. One key thing not mentioned there (at least until I edit the page) is to use the Sobel operator to get the partial derivatives in the x and y directions.

link to applet

Glancing at the OpenCV implementation, I notice there is a check for certain kinds of matrices that aren't invertible by the primary method; I may need to do that for more robust operation.
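For reference, the per-window Lucas-Kanade step described above looks roughly like this in numpy; this is a sketch of the math, not the applet's Processing code, and the window size and singularity threshold are arbitrary choices:

import numpy as np
from scipy.ndimage import sobel

def lk_window_flow(prev, curr, x, y, half=7):
    # estimate (u, v) flow for the window centered at (x, y) between two grayscale frames
    prev = prev.astype(float)
    curr = curr.astype(float)
    Ix = sobel(prev, axis=1) / 8.0  # Sobel gives the x and y partial derivatives
    Iy = sobel(prev, axis=0) / 8.0
    It = curr - prev                # temporal derivative

    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()

    A = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    b = -np.array([np.sum(ix * it), np.sum(iy * it)])

    # the kind of check the OpenCV code does: skip windows where A is nearly singular
    if np.linalg.cond(A) > 1e6:
        return None
    return np.linalg.solve(A, b)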