#86: A hierarchical framework for semantic scene classification in soccer sports video


M. H. Kolekar and K. Palaniappan

IEEE Region Ten (TENCON) Int. Conference, pgs. Online, 2008

fmv, features, fusion, visual events, data mining, cbir


Abstract

In this paper, we propose a novel hierarchical framework for soccer (football) video classification. Unlike most existing video classification approaches, which focus on shot detection followed by classification based on clustering using shot aggregation, the proposed scheme performs top-down video scene classification, which avoids shot clustering. This improves the classification accuracy and also maintains the temporal order of shots. In the hierarchy, at level-1, we use audio features to extract potentially interesting clips from the video. At level-2, we classify these clips into field view and non-field view using the dominant grass color ratio as a feature. At level-3a, we classify field view into three kinds of views using a motion mask. At level-3b, we classify non-field view into close-up and crowd using skin color information. At level-4, we classify close-ups into four frequently occurring classes (player of team-A, player of team-B, goalkeeper of team-A, goalkeeper of team-B) using jersey color information. We show promising results, with correctly classified soccer scenes, enabling structural and temporal analysis such as highlight extraction and video skimming.
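
The level-2 step described in the abstract (field view versus non-field view from the dominant grass color ratio) can be illustrated with a minimal sketch. This is not the authors' implementation: the hue range, the saturation and value floors, the 0.5 decision threshold, and the names `grass_ratio` and `classify_view` are all assumptions made for illustration, and pixels are assumed to be given as (hue in degrees, saturation, value) tuples.

```python
from typing import Sequence, Tuple

# A pixel as (hue in degrees [0, 360), saturation [0, 1], value [0, 1]).
Pixel = Tuple[float, float, float]

# Hypothetical parameters; the abstract does not state any of these values.
GRASS_HUE_RANGE = (60.0, 160.0)   # assumed "green" hue band
MIN_SATURATION = 0.2              # assumed floor to skip gray pixels
MIN_VALUE = 0.2                   # assumed floor to skip very dark pixels
FIELD_RATIO_THRESHOLD = 0.5       # assumed decision threshold


def grass_ratio(pixels: Sequence[Pixel]) -> float:
    """Return the fraction of pixels that fall in the assumed grass color band."""
    if not pixels:
        return 0.0
    lo, hi = GRASS_HUE_RANGE
    grass = sum(
        1
        for h, s, v in pixels
        if lo <= h <= hi and s >= MIN_SATURATION and v >= MIN_VALUE
    )
    return grass / len(pixels)


def classify_view(pixels: Sequence[Pixel],
                  threshold: float = FIELD_RATIO_THRESHOLD) -> str:
    """Label a frame as 'field view' or 'non-field view' from its grass ratio."""
    return "field view" if grass_ratio(pixels) >= threshold else "non-field view"


if __name__ == "__main__":
    # Toy frames: mostly green pixels versus mostly skin-toned pixels.
    wide_shot = [(100.0, 0.8, 0.6)] * 8 + [(10.0, 0.5, 0.9)] * 2
    close_up = [(20.0, 0.5, 0.8)] * 9 + [(100.0, 0.8, 0.6)]
    print(classify_view(wide_shot))
    print(classify_view(close_up))
```

In a fuller pipeline, frames labeled non-field view would be passed on to the skin color and jersey color stages, and field-view frames to the motion-mask stage; those stages are not sketched here because the abstract does not give enough detail about them.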