# Faster Whisper - Audio Transcription Service
Audio transcription service using Faster Whisper with NVIDIA GPU acceleration.
## 📋 Prerequisites
- Windows with WSL2 (Ubuntu 24.04)
- Docker Desktop for Windows with WSL2 backend
- NVIDIA GPU with drivers installed on Windows
- NVIDIA Container Toolkit configured in WSL2
- Access to the mounted volumes (`/mnt/e/volumes/faster-whisper/`)
### WSL2 GPU Setup
Ensure your WSL2 Ubuntu has access to the NVIDIA GPU:
```bash
# Check GPU availability in WSL2
nvidia-smi

# If not available, install the NVIDIA Container Toolkit in WSL2
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
## 🚀 Quick Start
```bash
# Start the service
docker compose up -d faster-whisper

# Check logs
docker logs faster-whisper -f

# Stop the service
docker compose down
```
## ⚙️ Configuration

### Environment Variables
| Variable | Value | Description |
|---|---|---|
| `PUID` | `1000` | User ID for file permissions |
| `PGID` | `1000` | Group ID for file permissions |
| `TZ` | `Europe/Paris` | Timezone |
| `WHISPER_MODEL` | `turbo` | Model to use (`tiny`, `base`, `small`, `medium`, `large`, `turbo`) |
| `WHISPER_LANG` | `fr` | Transcription language |
| `WHISPER_BEAM` | `5` | Beam search size (1-10, accuracy vs. speed tradeoff) |
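For reference, here is a minimal `docker-compose.yml` sketch consistent with the settings above. The image name and exact service layout are assumptions; adapt it to your actual compose file.

```yaml
# Minimal sketch only -- image name and exact keys are assumptions;
# adjust to match your actual docker-compose.yml.
services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:latest  # assumed image
    container_name: faster-whisper
    restart: unless-stopped
    ports:
      - "10300:10300"
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Paris
      - WHISPER_MODEL=turbo
      - WHISPER_LANG=fr
      - WHISPER_BEAM=5
    volumes:
      - /mnt/e/volumes/faster-whisper/audio:/app
      - /mnt/e/volumes/faster-whisper/models:/root/.cache/whisper
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```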
### Available Models
| Model | Size | VRAM | Speed | Accuracy |
|---|---|---|---|---|
| `tiny` | ~75 MB | ~1 GB | Very fast | Low |
| `base` | ~142 MB | ~1 GB | Fast | Medium |
| `small` | ~466 MB | ~2 GB | Medium | Good |
| `medium` | ~1.5 GB | ~5 GB | Slow | Very good |
| `large` | ~2.9 GB | ~10 GB | Very slow | Excellent |
| `turbo` | ~809 MB | ~6 GB | Fast | Excellent |
**Note:** The `turbo` model is an excellent compromise for an RTX 4060 Ti (8 GB VRAM).
### Volumes
- `/mnt/e/volumes/faster-whisper/audio` → `/app`: audio files to transcribe
- `/mnt/e/volumes/faster-whisper/models` → `/root/.cache/whisper`: downloaded models cache
**Windows note:** In WSL2, the path `/mnt/e/` corresponds to the `E:\` drive on Windows.
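If the directories do not exist yet, create them from WSL2 before the first start:

```bash
# Create the host directories backing both volumes
mkdir -p /mnt/e/volumes/faster-whisper/{audio,models}
```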
## 🎯 Usage

### REST API

The service exposes a REST API on port 10300.
#### Transcribe an audio file

```bash
# Place the file in /mnt/e/volumes/faster-whisper/audio/
# Or on Windows: E:\volumes\faster-whisper\audio\

# From WSL2:
curl -X POST http://localhost:10300/transcribe \
  -F "file=@audio.mp3"

# From Windows PowerShell:
curl.exe -X POST http://localhost:10300/transcribe -F "file=@audio.mp3"
```
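To transcribe a whole directory in one pass, a small loop works. This is a sketch that assumes the endpoint returns the transcript in the response body:

```bash
# Hypothetical batch helper: POST every MP3 in the audio directory and
# save each response next to the source file (response format assumed).
for f in /mnt/e/volumes/faster-whisper/audio/*.mp3; do
  [ -e "$f" ] || continue  # skip if the glob matched nothing
  curl -s -X POST http://localhost:10300/transcribe \
    -F "file=@${f}" > "${f%.mp3}.txt"
done
```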
#### Check service status

```bash
curl http://localhost:10300/health
```
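In scripts, it can help to wait until the service answers before sending work (assuming `/health` returns HTTP 200 once the model is loaded):

```bash
# Block until the health endpoint responds successfully
until curl -sf http://localhost:10300/health > /dev/null; do
  echo "Waiting for faster-whisper..."
  sleep 5
done
```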
### Web Interface
Access the web interface: http://localhost:10300
The interface is accessible from both Windows and WSL2.
## 🔧 Administration

### Check GPU Usage
```bash
# From the WSL2 host
nvidia-smi

# From inside the container
docker exec faster-whisper nvidia-smi

# Monitor GPU in real time
watch -n 1 nvidia-smi
```
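For a compact view of just the memory figures, `nvidia-smi` also supports a query mode:

```bash
# Print used/total VRAM as CSV, refreshing every second
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1
```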
### Update the Image

```bash
docker compose pull faster-whisper
docker compose up -d faster-whisper
```
### Change Model

1. Edit `WHISPER_MODEL` in `docker-compose.yml`
2. Restart the container: `docker compose up -d faster-whisper`
The new model will be downloaded automatically on first startup.
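To confirm which settings the running container actually picked up:

```bash
# List the WHISPER_* variables inside the container
docker exec faster-whisper printenv | grep WHISPER
```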
### Performance Optimization

#### Adjust Beam Search

- `WHISPER_BEAM=1`: maximum speed, reduced accuracy
- `WHISPER_BEAM=5`: good compromise (default)
- `WHISPER_BEAM=10`: maximum accuracy, slower
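To compare settings on your own hardware, time the same file before and after changing `WHISPER_BEAM` (edit `docker-compose.yml`, then `docker compose up -d faster-whisper`). The file name here is a placeholder:

```bash
# Rough wall-clock benchmark of one transcription request
time curl -s -X POST http://localhost:10300/transcribe \
  -F "file=@sample.mp3" > /dev/null
```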
#### Monitor Memory Usage

```bash
docker stats faster-whisper
```
#### Clean Old Models

Models are stored in `/mnt/e/volumes/faster-whisper/models/` (WSL2) or `E:\volumes\faster-whisper\models\` (Windows).
```bash
# From WSL2 - list downloaded models
ls -lh /mnt/e/volumes/faster-whisper/models/

# Delete an unused model
rm -rf /mnt/e/volumes/faster-whisper/models/<model-name>
```

```powershell
# From Windows PowerShell
Get-ChildItem E:\volumes\faster-whisper\models\

# Delete an unused model
Remove-Item -Recurse E:\volumes\faster-whisper\models\<model-name>
```
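To see how much disk space each cached model occupies before deleting anything:

```bash
# Per-model disk usage, human-readable
du -sh /mnt/e/volumes/faster-whisper/models/*
```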
## 📊 Monitoring

### Real-time Logs

```bash
docker logs faster-whisper -f --tail 100
```

### Check Container Status

```bash
docker ps | grep faster-whisper
```

### Restart on Issues

```bash
docker restart faster-whisper
```
## 🐛 Troubleshooting

### Container Won't Start

1. Verify the NVIDIA Container Toolkit is installed in WSL2:

   ```bash
   docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
   ```

2. Check permissions on the volumes:

   ```bash
   ls -la /mnt/e/volumes/faster-whisper/
   ```

3. Ensure Docker Desktop WSL2 integration is enabled:
   - Open Docker Desktop → Settings → Resources → WSL Integration
   - Enable integration with Ubuntu-24.04
"Out of Memory" Error
- Reduce the model (e.g., from
turbotosmall) - Reduce
WHISPER_BEAMto 3 or 1 - Close other GPU-intensive applications on Windows
- Check GPU memory usage:
nvidia-smi
### Poor Transcription Quality

- Switch to a larger model (e.g., from `small` to `turbo`)
- Increase `WHISPER_BEAM` to 7 or 10
- Check the audio quality of the source file
- Verify the correct language is set in `WHISPER_LANG`
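Pre-processing the source can also help: Whisper models operate on 16 kHz mono audio internally, so normalizing a noisy or unusually encoded file with `ffmpeg` (if installed) before upload is a cheap first step:

```bash
# Convert any input to 16 kHz mono WAV, the format Whisper expects internally
ffmpeg -i input.m4a -ar 16000 -ac 1 audio.wav
```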
### WSL2-Specific Issues

#### GPU Not Detected

```powershell
# Check the Windows GPU driver version (from PowerShell)
nvidia-smi

# Update the WSL2 kernel
wsl --update

# Restart WSL2, then reopen Ubuntu
wsl --shutdown
```
#### Volume Access Issues

```bash
# Check if the drive is mounted in WSL2
ls /mnt/e/

# If not mounted, edit /etc/wsl.conf
sudo nano /etc/wsl.conf
```

Add these lines to `/etc/wsl.conf`:

```ini
[automount]
enabled = true
options = "metadata,uid=1000,gid=1000"
```

Then restart WSL2 from PowerShell with `wsl --shutdown`.
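If the drive still does not appear after a restart, it can usually be mounted manually (assuming `E:` is the drive letter):

```bash
# One-off manual mount of the Windows E: drive into WSL2
sudo mkdir -p /mnt/e
sudo mount -t drvfs E: /mnt/e
```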
## 📁 File Structure

```
Windows: E:\volumes\faster-whisper\
WSL2:    /mnt/e/volumes/faster-whisper/
├── audio/     # Audio files to transcribe
└── models/    # Whisper models cache
```
## 🪟 Windows Integration

### Access Files from Windows Explorer

- Navigate to `\\wsl$\Ubuntu-24.04\mnt\e\volumes\faster-whisper\`
- Or go directly to `E:\volumes\faster-whisper\`
### Copy Files to Transcribe

From Windows:

```powershell
Copy-Item "C:\path\to\audio.mp3" -Destination "E:\volumes\faster-whisper\audio\"
```

From WSL2:

```bash
cp /mnt/c/path/to/audio.mp3 /mnt/e/volumes/faster-whisper/audio/
```
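To process files automatically as they are dropped in, a simple polling loop from WSL2 works. This is a sketch: it assumes the endpoint returns the transcript in the response body, and it polls rather than using inotify, which is often unreliable on `/mnt/*` DrvFS mounts:

```bash
# Poll the audio directory and transcribe any MP3 without a matching .txt
while true; do
  for f in /mnt/e/volumes/faster-whisper/audio/*.mp3; do
    [ -e "$f" ] || continue             # glob matched nothing
    [ -e "${f%.mp3}.txt" ] && continue  # already transcribed
    curl -s -X POST http://localhost:10300/transcribe \
      -F "file=@${f}" > "${f%.mp3}.txt"
  done
  sleep 10
done
```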
## 🔗 Useful Links
- LinuxServer Docker Image Documentation
- Faster Whisper GitHub
- OpenAI Whisper Documentation
- WSL2 GPU Support
## 📝 Notes

- The service restarts automatically unless manually stopped (`restart: unless-stopped`)
- On first startup, the model is downloaded (this may take a few minutes)
- Supported audio formats: MP3, WAV, M4A, FLAC, OGG, etc.
- The service runs in WSL2 but is accessible from Windows
- GPU computations are performed on the Windows NVIDIA GPU