This project presents a 3D analysis of tennis matches built on computer vision. It covers player detection, ball trajectory tracking, 3D understanding of the court, and player pose classification, with the goal of a reliable system for fault detection and accurate game statistics.
- Introduction
- Ball Tracking
- Efficient Tennis Broadcast (ETB)
- Player Detection and Tracking
- Detecting Tennis Court
- Pose Classification
- Motion Detection
- Deep Learning Framework
- Advertising
- References
This project aims to combine traditional computer vision techniques with state-of-the-art deep learning networks to create a reliable framework for 3D tennis analysis.
Detecting a tennis ball during matches is challenging because the ball moves fast and is often occluded. Both classical and deep learning methods have been used to detect it accurately, which is a prerequisite for any ball-based analytics.
- Fazio et al. (2018): Utilized stereo smartphone videos for ball trajectory estimation.
- Qazi et al. (2015): Developed an automated ball tracking system using machine learning and image processing techniques.
- Huang et al. (2019): Employed a convolutional neural network (CNN) to predict the tennis ball’s position.
We use TrackNet, a CNN that processes three sequential frames to produce a heatmap indicating the ball's location. The model achieves up to 99.7% accuracy in ball position prediction.
To address occlusion, we use:
- Interpolation: Estimates the ball’s position in the current frame using the last visible points.
- Multi-View Correspondence: Uses epipolar geometry to constrain the ball’s position with the help of additional cameras.
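The interpolation step can be sketched as follows. This is a minimal illustration, assuming a constant-velocity model built from the last two visible detections; the function name `predict_ball_position` and the `(frame, x, y)` tuple layout are hypothetical and not taken from the project code.

```python
def predict_ball_position(visible, frame_idx):
    """Estimate the ball position at `frame_idx` from past detections.

    `visible` is a list of (frame, x, y) detections sorted by frame.
    Assumption: the ball keeps the velocity observed between the last
    two visible detections.
    """
    if not visible:
        raise ValueError("at least one visible detection is required")
    if len(visible) == 1:
        # No velocity estimate yet: hold the last known position.
        _, x, y = visible[-1]
        return (float(x), float(y))
    (f0, x0, y0), (f1, x1, y1) = visible[-2], visible[-1]
    vx = (x1 - x0) / (f1 - f0)  # pixels per frame
    vy = (y1 - y0) / (f1 - f0)
    dt = frame_idx - f1
    return (x1 + vx * dt, y1 + vy * dt)
```

For example, with detections at frames 10 and 12 the function extrapolates along the same straight line to frame 14. A real rally would need a better motion model around bounces and hits.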
ETB is an approach to broadcasting tennis matches under limited internet bandwidth that focuses on the key elements of the scene: the players and the ball.
ETB uses a fixed background image and dynamically updates and transmits only the regions surrounding the players and the ball, significantly reducing data transmission requirements.
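The idea can be illustrated with a small NumPy sketch. The function names (`crop_regions`, `rebuild_frame`, `payload_ratio`) and the `(x, y, w, h)` box format are assumptions made for this example, and the actual ETB implementation may encode and transmit patches differently.

```python
import numpy as np

def crop_regions(frame, boxes):
    """Sender side: keep only the boxes (x, y, w, h) around players and ball."""
    return [(x, y, frame[y:y + h, x:x + w].copy()) for (x, y, w, h) in boxes]

def rebuild_frame(background, patches):
    """Receiver side: paste the received patches onto the fixed background."""
    out = background.copy()
    for x, y, patch in patches:
        h, w = patch.shape[:2]
        out[y:y + h, x:x + w] = patch
    return out

def payload_ratio(frame, patches):
    """Fraction of the frame's raw pixel values that is actually sent."""
    return sum(patch.size for _, _, patch in patches) / frame.size
```

With a 200x100 frame and a single 20x20 box, only 2% of the raw pixel data is sent per frame, before any video compression is applied.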
For a comprehensive understanding and visual representation of ETB, visit our GitHub repository, where a demonstrative video is available.
We use MOG2 background subtraction and the CSRT tracking algorithm to detect and track players. Morphological operations and connected-component analysis clean up the foreground mask and make player tracking more reliable.
Implemented using OpenCV, the system integrates MOG2 background subtraction and CSRT tracking to monitor player movement throughout the game.
We apply a mask to remove non-white pixels and then use RANSAC to fit lines to the remaining white pixels corresponding to the court lines.
RANSAC identifies lines corresponding to the tennis court lines and the net line.
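As an illustration, a single-line RANSAC fit over 2D pixel coordinates might look like the sketch below. The function name `ransac_line`, the iteration count, and the inlier tolerance are assumptions for this example, not the project's actual settings.

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=2.0, seed=0):
    """Fit one line a*x + b*y + c = 0 (with a^2 + b^2 = 1) to 2D points.

    Returns the line with the most inliers and a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_line = None
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)
        direction = pts[j] - pts[i]
        norm = np.linalg.norm(direction)
        if norm == 0:
            continue  # duplicate points do not define a line
        a, b = -direction[1] / norm, direction[0] / norm  # unit normal
        c = -(a * pts[i, 0] + b * pts[i, 1])
        distances = np.abs(pts @ np.array([a, b]) + c)
        inliers = distances < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers
```

To recover several court lines, one option is to remove the inliers of the best line and run the fit again on the remaining white pixels.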
Keypoints are identified by systematically finding the intersection of the detected lines.
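One common way to compute such intersections is the cross product of the two lines in homogeneous coordinates. The sketch below assumes each line is stored as `(a, b, c)` with `a*x + b*y + c = 0`; the function name `intersect` is hypothetical.

```python
import numpy as np

def intersect(line1, line2, eps=1e-9):
    """Intersect two lines given as (a, b, c) with a*x + b*y + c = 0.

    Returns (x, y), or None when the lines are (near) parallel.
    """
    l1 = np.asarray(line1, dtype=float)
    l2 = np.asarray(line2, dtype=float)
    x, y, w = np.cross(l1, l2)  # homogeneous intersection point
    if abs(w) < eps:
        return None
    return (float(x / w), float(y / w))
```

For instance, the vertical line `x = 3` and the horizontal line `y = 5` meet at `(3, 5)`, while two vertical lines yield `None`.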
Using the identified keypoints, we calibrate the camera to map 3D coordinates to 2D pixel locations.
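One standard way to obtain such a mapping is the Direct Linear Transform (DLT), sketched below. This is an assumption about how the calibration could be done, not a description of the project's code. It needs at least six keypoints that do not all lie on the court plane (for example, court corners plus the tops of the net posts), and the function names are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(world_pts, image_pts):
    """DLT estimate of the 3x4 matrix P such that image ~ P @ [X, Y, Z, 1]."""
    world = np.asarray(world_pts, dtype=float)
    image = np.asarray(image_pts, dtype=float)
    if len(world) < 6:
        raise ValueError("need at least 6 non-coplanar correspondences")
    zero = np.zeros(4)
    rows = []
    for (X, Y, Z), (u, v) in zip(world, image):
        Xh = np.array([X, Y, Z, 1.0])
        # Each correspondence gives two linear equations in the entries of P.
        rows.append(np.concatenate([Xh, zero, -u * Xh]))
        rows.append(np.concatenate([zero, Xh, -v * Xh]))
    _, _, vt = np.linalg.svd(np.vstack(rows))
    return vt[-1].reshape(3, 4)  # right singular vector of the smallest value

def project(P, world_pts):
    """Map 3D points to 2D pixel locations with the projection matrix P."""
    world = np.asarray(world_pts, dtype=float)
    homog = np.hstack([world, np.ones((len(world), 1))])
    proj = homog @ P.T
    return proj[:, :2] / proj[:, 2:3]
```

The matrix is only defined up to scale, so it is best checked through reprojection error rather than by comparing entries.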
We use the Three Dimensional Tennis Shots human action dataset (THETIS) to classify tennis movements.
- Canny Edge Detection: Extracts edge maps from video frames, which serve as input to a neural network for classification.
Motion detection identifies changes in object position across the frames of a video sequence.
Implemented using OpenCV, this technique involves comparing each video frame against a background model.
Median filtering is used to reduce noise in the detected edges, enhancing the fidelity of the motion detection.
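The comparison and filtering steps can be sketched in plain NumPy as follows. The threshold value, the 3x3 window, and the function names are assumptions for illustration. Here the median filter is applied to the binary motion mask, and the background model is simply passed in as a grayscale image.

```python
import numpy as np

def median_filter3(mask):
    """3x3 median filter for a 2D array (borders padded by replication)."""
    padded = np.pad(mask, 1, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    return np.median(windows, axis=(-2, -1)).astype(mask.dtype)

def motion_mask(frame, background, threshold=25):
    """Flag pixels whose grayscale value differs from the background model."""
    # Cast to a signed type so the subtraction cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    raw = (diff > threshold).astype(np.uint8)
    return median_filter3(raw)
```

Isolated single-pixel detections are removed by the median filter, while larger moving regions survive with slightly rounded corners.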
Our approach combines a fine-tuned CNN for spatial feature extraction with an LSTM for temporal modeling. This setup processes tennis videos to predict action classes.
We train the model with categorical cross-entropy loss and evaluate it by accuracy on a dataset of tennis serve videos.
We use homography to integrate advertisements into the tennis court's surface, ensuring a natural appearance during live broadcasts.
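As a sketch of the underlying geometry, the homography that maps the four corners of an advertisement image onto four court points in the frame can be estimated with a DLT. The function names and the example coordinates are hypothetical. In an OpenCV pipeline, `cv2.getPerspectiveTransform` and `cv2.warpPerspective` would typically handle the estimation and the actual image warp.

```python
import numpy as np

def find_homography(src, dst):
    """DLT estimate of the 3x3 matrix H with dst ~ H @ src (>= 4 points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    # Normalise so H[2, 2] == 1 (assumes it is non-zero, as in typical setups).
    return H / H[2, 2]

def warp_points(H, pts):
    """Apply the homography H to an array of 2D points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Mapping the corners of a 300x100 advertisement to a quadrilateral on the court surface gives the placement region. Every advertisement pixel then follows the same transform, which is what keeps the overlay consistent with the court's perspective.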
- Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 679–698.
- Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions.
- Huang, Y., Liao, I., Chen, C., Ik, T., & Peng, W. (2019). TrackNet: A Deep Learning Network for Tracking High-speed and Tiny Objects in Sports Applications. CoRR, abs/1907.03698.
- Qazi, T., Mukherjee, P., Srivastava, S., Lall, B., & Chauhan, N. R. (2015). Automated ball tracking in tennis videos. 2015 Third International Conference on Image Information Processing (ICIIP), 236–240.