Gesture Once (p.k.a. Learn-ASL) is a machine learning project that recognizes American Sign Language (ASL) gestures and translates them into text, using the YOLOv8 object detection model together with MediaPipe hand landmark detection. The project aims to bridge communication gaps for ASL users by providing an educational and efficient sign-to-text conversion service for learning basic ASL, including the alphabet and many common phrases.
- Object Detection with YOLOv8: Recognizes ASL letters and gestures from a live video feed.
- Hand Landmarks with MediaPipe: Enhances gesture recognition by aligning bounding boxes to hand landmarks.
- Gesture Logging: Logs the highest predicted gesture with confidence scores to a text file for debugging and potential user interfaces.
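The landmark alignment and gesture logging ideas can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the `(label, confidence)` detection format, the padding value, and the tab-separated log layout are all assumptions made for illustration. The only library fact relied on is that MediaPipe hand landmarks carry `x` and `y` coordinates normalized to the image size.

```python
from pathlib import Path
from typing import Iterable, Optional, Sequence, Tuple

Landmark = Tuple[float, float]   # normalized (x, y), each in [0, 1]
Detection = Tuple[str, float]    # hypothetical (label, confidence) pair


def landmarks_to_box(landmarks: Sequence[Landmark], width: int, height: int,
                     pad: float = 0.1) -> Tuple[int, int, int, int]:
    """Return a padded pixel box (x1, y1, x2, y2) enclosing all landmarks."""
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Pad by a fraction of the hand's extent so fingertips are not clipped.
    dx = (x_max - x_min) * pad
    dy = (y_max - y_min) * pad
    # Convert to pixels and clamp to the frame boundaries.
    x1 = max(0, int((x_min - dx) * width))
    y1 = max(0, int((y_min - dy) * height))
    x2 = min(width, int((x_max + dx) * width))
    y2 = min(height, int((y_max + dy) * height))
    return x1, y1, x2, y2


def log_top_gesture(detections: Iterable[Detection],
                    log_path: Path) -> Optional[Detection]:
    """Append the highest-confidence detection to a text file and return it."""
    detections = list(detections)
    if not detections:
        return None  # nothing detected on this frame, so nothing is logged
    best = max(detections, key=lambda d: d[1])
    with log_path.open("a", encoding="utf-8") as f:
        f.write(f"{best[0]}\t{best[1]:.2f}\n")
    return best
```

In a real pipeline, the landmark list would come from MediaPipe's per-hand output and the detections from the YOLOv8 results for the same frame; how the project wires these together is not shown here.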
Although it'd be awesome to have this deployed so others can freely test the model, deploying it will be computationally expensive, and users may run into network issues regardless. However, setting this up locally is extremely easy! Instructions are available below.
The system can be engineered to detect ASL gestures through a camera and convert sign language into text in real time. This would enable deaf individuals to communicate more easily with those who do not understand ASL. Additionally, the system can be extended to translate ASL videos into text for users who do not know ASL.
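One way such real-time conversion could work is to turn noisy per-frame predictions into text by accepting a letter only after it has been seen on several consecutive frames. The sketch below assumes a hypothetical stream of per-frame labels (`None` when no gesture is detected) and an arbitrary `min_frames` threshold; it is an illustration, not the project's implementation.

```python
from typing import Iterable, Optional


def frames_to_text(labels: Iterable[Optional[str]], min_frames: int = 5) -> str:
    """Emit a label once it has appeared on min_frames consecutive frames."""
    text = []
    current = None  # label seen on the most recent frame
    run = 0         # number of consecutive frames showing `current`
    for label in labels:
        if label == current:
            run += 1
        else:
            current, run = label, 1
        # Emit exactly once, at the moment the run reaches the threshold.
        if current is not None and run == min_frames:
            text.append(current)
    return "".join(text)
```

With this scheme, a held sign produces a single letter no matter how long it is held, so signing the same letter twice in a row requires a brief gap or a different prediction in between.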
The system can serve as a learning platform for users who are practicing sign language. It can be developed to evaluate the accuracy of a user's signing and provide instant feedback.
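A feedback step of this kind could be as simple as comparing the model's prediction with the sign the learner was asked to make. The function below is a hypothetical sketch: its name, messages, and the `threshold` value are assumptions, not part of the project.

```python
from typing import Optional


def give_feedback(target: str, predicted: Optional[str], confidence: float,
                  threshold: float = 0.6) -> str:
    """Return a short feedback message for one practice attempt."""
    if predicted is None:
        return "No hand detected. Make sure your hand is in frame."
    if predicted != target:
        return f"That looks like '{predicted}', not '{target}'. Try again."
    if confidence < threshold:
        # Right sign, but the model is unsure: ask for a clearer hand shape.
        return f"Close to '{target}', but hold the shape more clearly."
    return f"Correct! '{target}' recognized."
```

For example, `give_feedback("A", "S", 0.8)` tells the learner that the sign was read as 'S' rather than 'A'.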
Make sure Python is installed.
- Install the necessary Python packages
pip install -r requirements.txt
- Change directory to the frontend and install necessary dependencies
cd frontend/
npm install
- Run the client
npm run dev
- Change directory to the backend and run the server that serves the YOLOv8 model
cd ..
cd backend/
python model_api.py
- Start signing!
- Interface Development: Build a GUI for real-time gesture-to-text translation.
- Dataset Expansion: Incorporate more ASL gestures for robust recognition.
- Performance Optimization: Optimize logging and frame processing speed.
Dataset Link: ASL Letters Dataset
Jay Noppone Pornpitaksuk, Claudio Perinuzzi, Loyd Flores, Kenneth Guillont
