As with the pcap parsing, I originally thought I'd look into Python xml parsing. I'm sure if I was really interested in parsing xml I would have tried some of them out, but the interface I was hoping to find would look like this
a = pyxmlthing.load("some.xml")
And I would have an array of all the rotCorrections of all the items of type 9. Instead I found a several xml parsers that required tons of code to get to a point where I'm still not sure if I could get at the rotCorrections or not. Which may be why I don't care for xml, flat-files are it.
So I just used vim to strip all the xml out and leave me with a nice text file that looks like this
0, -3.8, -7.0046468, 20, 21.560343, -2.5999999
1, -1.5, -6.7674689, 26, 21.516994, 2.5999999
2, 5, 0.44408101, 28, 20.617426, -2.5999999
3, 6.8000002, 0.78093398, 32, 20.574717, 2.5999999
Where the first column is the laser index (0-63), the next might be the rotCorrection and so on.
To apply the calibration data, it's very important that the indexing derived from the raw data is correct- reading the 0xDDEE vs. the 0xDDFF (or similar) that designates upper or lower laser block is important.
The velodyne manual doesn't have a good diagram that shows the xyz axes and what is positive or negative direction for both the original lidar angle and the correction angles and offsets, so some experimentation is necessary there. The manual did mention it in this case the rotCorrection had to be subtracted from the base rotation angle. The vertical and horizontal offset are pretty minor for my visualization but important for accurate measurements obviously.
Using Aaron Koblin's House of Cards SceneViewer as a starting point, I wrote code to take the point cloud data and display it:
Monterey Full from binarymillenium on Vimeo.
The data was split into files each containing a million data points and spanning a second in time. With some testing I found the lidar was spinning at about 10 Hz (of the possible 5, 10, or 15 Hz), so I would bite off 1/10 of the file and display that in the current frame, then the next 1/10th for the next frame, and then load the next file after 10 frames. A more consistent approach would be to split the data into a new text file for each frame.
Right now I'm running a large job to process each point-cloud frame into png height-map files as I did for the Radiohead data. It doesn't work as well with the 360 degrees of heights- some distances just have to be cut off and ignored to keep the resolutions low (although with the png compression having large empty spaces doesn't really take up any additional room). A lot of detail is lost on the nearby objects compared to a lot of empty space between distant objects.
So either using that data or going back to processing the raw point cloud, I'd like to track features frame-to-frame and derive the vehicle motion from them. I suspect this will be very difficult. Once I have the vehicle motion, I could create per-frame transformations that could create one massive point cloud or height map where still objects are in their proper places and other vehicles probably become blurs.
After that, if I can get a dataset from Velodyne or another source where a moving ground lidar moves in a circle or otherwise intersects its own path somewhere, then the proof that the algorithm works will be if in that big point cloud that point of intersection actually lines up. (though I suspect again that more advanced logic is need to re-align the data after the logic determines that it has encounter features that it has scene before).