Lecture Notes in Computer Science (ACIVS),
Volume 5807,
pgs. 320--332,
2009
The rapid increase in pixel density and frame rates of modern imaging sensors is accelerating the demand for fine-grained and embedded parallelization strategies to achieve real-time implementations for video analysis. The IBM Cell Broadband Engine (BE) processor has an appealing multi-core chip architecture with multiple programming models suitable for accelerating multimedia and vector processing applications. This paper describes two parallel algorithms for blob extraction in video sequences: binary morphological operations and connected components labeling (CCL), both optimized for the Cell-BE processor. Novel parallelization and explicit instruction level optimization techniques are described for fully exploiting the computational capacity of the Synergistic Processing Elements (SPEs) on the Cell processor. Experimental results show significant speedups ranging from a factor of nearly 300 for binary morphology to a factor of 8 for CCL in comparison to equivalent sequential implementations applied to High Definition (HD) video.