#99: Parallel flux tensor analysis for efficient moving object detection


The flux tensor motion flow algorithm is a versatile computer vision technique for robustly detecting moving objects in cluttered scenes. Computing the flux tensor is expensive: it combines 3-D spatiotemporal filtering with 3-D weighted integration to estimate local averages of the trace of the flux tensor matrix. To achieve efficient real-time processing of high-bandwidth video streams, we developed a data-parallel multicore algorithm for the Cell Broadband Engine (Cell/B.E.) processor and evaluated its energy efficiency against a fast sequential CPU implementation. For HD video, our multicore implementation running on a single PlayStation 3 (PS3) Cell/B.E. processor is 12 to 40 times faster than the sequential version and is faster than real time for a range of filter configurations and video frame sizes. Measured as performance per watt, the power efficiency of the Cell/B.E. implementation is at least 50 times better than that of the sequential version.
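
To make the workload concrete, the following is a minimal sequential sketch of the two stages named above: derivative filtering in space and time, followed by local integration of the flux tensor trace. It assumes the commonly used form of the trace, a local average of I_xt² + I_yt² + I_tt², where I is the image intensity. The abstract does not specify the filters, so central differences and a uniform averaging window are stand-ins chosen for illustration. All function names (`flux_tensor_trace`, `box_average`, `detect_motion`, `moving_square_video`) are hypothetical, and nothing here models the Cell/B.E. implementation.

```python
import numpy as np


def box_average(volume, size):
    """Average over a size^3 neighbourhood using separable 1-D filters.

    np.convolve with mode="same" zero-pads, so values near the borders
    are attenuated; a real implementation may handle borders differently.
    """
    kernel = np.ones(size) / size
    out = volume
    for axis in range(volume.ndim):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out


def flux_tensor_trace(video, avg_size=3):
    """Locally averaged flux tensor trace of a (T, H, W) grayscale video.

    Assumed form: local average of I_xt^2 + I_yt^2 + I_tt^2.
    """
    video = np.asarray(video, dtype=np.float64)
    # Stage 1: spatiotemporal derivative filtering. Central differences
    # stand in for whatever derivative filters are actually configured.
    i_xt = np.gradient(np.gradient(video, axis=2), axis=0)
    i_yt = np.gradient(np.gradient(video, axis=1), axis=0)
    i_tt = np.gradient(np.gradient(video, axis=0), axis=0)
    energy = i_xt**2 + i_yt**2 + i_tt**2
    # Stage 2: 3-D integration. Uniform weights are an assumption here.
    return box_average(energy, avg_size)


def detect_motion(trace, threshold):
    """Mark voxels whose averaged trace exceeds a user-chosen threshold."""
    return trace > threshold


def moving_square_video(frames=9, height=16, width=16):
    """Synthetic test clip: a 3x3 bright square moving one pixel per frame."""
    video = np.zeros((frames, height, width))
    for t in range(frames):
        video[t, 6:9, t + 2 : t + 5] = 1.0
    return video


if __name__ == "__main__":
    clip = moving_square_video()
    trace = flux_tensor_trace(clip)
    mask = detect_motion(trace, 0.5 * trace.max())
    print("moving voxels flagged:", int(mask.sum()))
```

Every output value depends only on a small spatiotemporal neighbourhood of the input, which is why this kind of computation lends itself to data-parallel partitioning. A static scene yields a trace of zero everywhere, since all three terms contain a temporal derivative.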