pyTAG: Python-based Interactive Training Data Generation for Visual Tracking Algorithms


Visual object tracking has always been an important topic in the computer vision community because of its wide range of applications in many domains. Although a considerable number of unsupervised and supervised visual tracking algorithms have been developed, the field continues to explore improved algorithms and challenging new applications such as multispectral object tracking, multi-object tracking, and tracking from moving platforms. Ground-truth-based evaluation of tracking algorithms is essential for measuring tracker performance. However, manually generating ground truth by tagging objects of interest in video sequences is an extremely tedious and error-prone task that can be subjective, especially in complex scenes. Common challenges in visual tracking, arising from environment characteristics and object movements, include occlusion of the objects of interest, splitting or merging of groups of objects, ID switches between objects of similar appearance, and target drift due to model update errors. These problems can cause a video object tracking algorithm to lose the object of interest or to start tracking the wrong object. To achieve longer, more persistent, and more accurate tracking for intelligent scene perception, the new generation of tracking algorithms incorporates machine-learning-based approaches. These supervised tracking algorithms, especially deep convolutional neural networks with millions of parameters, require large training sets. Therefore, it is important to generate accurate training data and ground truth for tracking. In this study, a training data and ground-truth generation tool called pyTAG is implemented for visual tracking. The proposed tool's plugin structure allows integration, testing, and validation of different trackers.
During training data generation, the tracker can be paused, resumed, forwarded, rewound, and re-initialized after it loses the object. Most importantly, pyTAG allows users to change the object tracking method during ground-truth generation. This feature provides the flexibility to adapt to the challenges that arise during ground-truth generation by switching to whichever object tracking method performs best in the current situation. The tool has been implemented to help researchers rapidly generate ground truth and training data, fix annotations, and run and visualize custom or existing object tracking techniques.
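
The plugin structure and the interactive controls described above can be illustrated with a minimal Python sketch. It is not pyTAG's actual code: the names `Tracker`, `register`, `TRACKERS`, and `Session`, the `(x, y, width, height)` box convention, and the two toy trackers are all assumptions introduced here for illustration. The sketch only shows one plausible way a tool of this kind could let a user step forward, rewind, re-initialize after a failure, and swap the tracking method in the middle of a sequence.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple, Type

Box = Tuple[int, int, int, int]  # assumed convention: (x, y, width, height)

# Hypothetical plugin registry: tracker name -> tracker class.
TRACKERS: Dict[str, Type["Tracker"]] = {}


def register(name: str):
    """Class decorator that adds a tracker plugin to the registry."""
    def wrap(cls):
        TRACKERS[name] = cls
        return cls
    return wrap


class Tracker(ABC):
    """Minimal interface assumed for every tracker plugin."""

    @abstractmethod
    def init(self, frame, box: Box) -> None: ...

    @abstractmethod
    def update(self, frame) -> Box: ...


@register("static")
class StaticTracker(Tracker):
    """Toy plugin: always returns the box it was initialized with."""
    def init(self, frame, box):
        self.box = box

    def update(self, frame):
        return self.box


@register("shift")
class ShiftTracker(Tracker):
    """Toy plugin: moves the box one pixel to the right per frame."""
    def init(self, frame, box):
        self.box = box

    def update(self, frame):
        x, y, w, h = self.box
        self.box = (x + 1, y, w, h)
        return self.box


class Session:
    """Holds per-frame annotations and the currently active tracker."""

    def __init__(self, frames, tracker_name: str, first_box: Box):
        self.frames = frames
        self.annotations: List[Optional[Box]] = [None] * len(frames)
        self.annotations[0] = first_box
        self.index = 0
        self.tracker = TRACKERS[tracker_name]()
        self.tracker.init(frames[0], first_box)

    def step(self) -> Optional[Box]:
        """Advance one frame and store the tracker's box as the annotation."""
        if self.index + 1 >= len(self.frames):
            return None  # end of sequence
        self.index += 1
        box = self.tracker.update(self.frames[self.index])
        self.annotations[self.index] = box
        return box

    def rewind(self, index: int) -> None:
        """Jump back to an annotated frame and restart the tracker there."""
        box = self.annotations[index]
        if box is None:
            raise ValueError("frame has no annotation to restart from")
        self.index = index
        self.tracker.init(self.frames[index], box)

    def reinitialize(self, box: Box) -> None:
        """Correct the current frame's box and restart the tracker from it."""
        self.annotations[self.index] = box
        self.tracker.init(self.frames[self.index], box)

    def switch_tracker(self, name: str) -> None:
        """Replace the active tracker, seeding it with the current annotation."""
        self.tracker = TRACKERS[name]()
        self.tracker.init(self.frames[self.index], self.annotations[self.index])
```

In this sketch, pausing simply means not calling `step()`, and a new tracking method becomes available by decorating another `Tracker` subclass with `@register`. Seeding the replacement tracker with the most recent annotation is a design choice of the sketch, made so that a switch does not discard work already done on earlier frames.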