Faster Whisper - Audio Transcription Service

Audio transcription service using Faster Whisper with GPU acceleration (NVIDIA).

📋 Prerequisites

  • Windows with WSL2 (Ubuntu 24.04)
  • Docker Desktop for Windows with WSL2 backend
  • NVIDIA GPU with drivers installed on Windows
  • NVIDIA Container Toolkit configured in WSL2
  • Access to mounted volumes (/mnt/e/volumes/faster-whisper/)

WSL2 GPU Setup

Ensure your WSL2 Ubuntu has access to the NVIDIA GPU:

# Check GPU availability in WSL2
nvidia-smi

# If not available, install the NVIDIA Container Toolkit in WSL2
# (the old nvidia-docker repository and apt-key are deprecated)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Restart the Docker daemon (with the Docker Desktop backend, restart Docker Desktop from Windows instead)
sudo systemctl restart docker
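
To confirm Docker can actually reach the GPU from a container, run a quick smoke test (any CUDA base image works; the tag below is one example):

docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi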

🚀 Quick Start

# Start the service
docker compose up -d faster-whisper

# Check logs
docker logs faster-whisper -f

# Stop the service
docker compose down

⚙️ Configuration

Environment Variables

Variable       Value         Description
PUID           1000          User ID for file permissions
PGID           1000          Group ID for file permissions
TZ             Europe/Paris  Timezone
WHISPER_MODEL  turbo         Model to use (tiny, base, small, medium, large, turbo)
WHISPER_LANG   fr            Transcription language
WHISPER_BEAM   5             Beam search size (1-10, accuracy vs. speed tradeoff)

Available Models

Model   Size     VRAM    Speed      Accuracy
tiny    ~75 MB   ~1 GB   Very fast  Low
base    ~142 MB  ~1 GB   Fast       Medium
small   ~466 MB  ~2 GB   Medium     Good
medium  ~1.5 GB  ~5 GB   Slow       Very good
large   ~2.9 GB  ~10 GB  Very slow  Excellent
turbo   ~809 MB  ~6 GB   Fast       Excellent

Note: The turbo model is an excellent compromise for an RTX 4060 Ti (8 GB VRAM).
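
Before switching models, you can check how much VRAM is actually free; these are standard nvidia-smi query flags:

nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv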

Volumes

  • /mnt/e/volumes/faster-whisper/audio → /app : Audio files directory to transcribe
  • /mnt/e/volumes/faster-whisper/models → /root/.cache/whisper : Downloaded models cache

Windows Note: The path /mnt/e/ in WSL2 corresponds to the E:\ drive on Windows.
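
For reference, here is a minimal sketch of the corresponding docker-compose.yml service. The image name is an assumption (the linuxserver GPU build is one common choice), so match it to the image actually deployed:

services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu   # assumed image, adjust to your deployment
    container_name: faster-whisper
    restart: unless-stopped
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Paris
      - WHISPER_MODEL=turbo
      - WHISPER_LANG=fr
      - WHISPER_BEAM=5
    volumes:
      - /mnt/e/volumes/faster-whisper/audio:/app
      - /mnt/e/volumes/faster-whisper/models:/root/.cache/whisper
    ports:
      - "10300:10300"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]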

🎯 Usage

REST API

The service exposes a REST API on port 10300.

Transcribe an audio file

# Place the file in /mnt/e/volumes/faster-whisper/audio/
# Or on Windows: E:\volumes\faster-whisper\audio\

# From WSL2:
curl -X POST http://localhost:10300/transcribe \
  -F "file=@audio.mp3"

# From Windows PowerShell:
curl.exe -X POST http://localhost:10300/transcribe -F "file=@audio.mp3"
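
If the endpoint returns JSON (an assumption here, as is the .text field below; check the actual response once), the transcript can be extracted with jq:

curl -s -X POST http://localhost:10300/transcribe \
  -F "file=@audio.mp3" | jq -r '.text'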

Check service status

curl http://localhost:10300/health

Web Interface

Access the web interface: http://localhost:10300

The interface is accessible from both Windows and WSL2.

🔧 Administration

Check GPU Usage

# From WSL2 host
nvidia-smi

# From inside the container
docker exec faster-whisper nvidia-smi

# Monitor GPU in real-time
watch -n 1 nvidia-smi

Update the Image

docker compose pull faster-whisper
docker compose up -d faster-whisper
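
Optionally, reclaim the disk space held by the superseded image afterwards:

docker image prune -f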

Change Model

  1. Edit WHISPER_MODEL in docker-compose.yml
  2. Restart the container:
    docker compose up -d faster-whisper

The new model is downloaded automatically the first time the container starts with it.
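
The edit-and-restart cycle can also be scripted. A rough sketch, assuming WHISPER_MODEL=... occurs exactly once in docker-compose.yml:

sed -i 's/WHISPER_MODEL=.*/WHISPER_MODEL=small/' docker-compose.yml
docker compose up -d faster-whisper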

Performance Optimization

  • WHISPER_BEAM=1: Maximum speed, reduced accuracy
  • WHISPER_BEAM=5: Good compromise (default)
  • WHISPER_BEAM=10: Maximum accuracy, slower
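
To quantify the tradeoff on your own hardware, here is a rough benchmark loop. It is only a sketch: it assumes the /transcribe endpoint above, a local sample.mp3, that WHISPER_BEAM occurs exactly once in docker-compose.yml, and that the variable is only read at container startup (hence the recreate on each pass):

for beam in 1 5 10; do
  sed -i "s/WHISPER_BEAM=.*/WHISPER_BEAM=$beam/" docker-compose.yml
  docker compose up -d faster-whisper
  sleep 15   # crude wait for the model to finish loading
  echo "beam=$beam:"
  time curl -s -X POST http://localhost:10300/transcribe -F "file=@sample.mp3" > /dev/null
done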

Monitor Memory Usage

docker stats faster-whisper

Clean Old Models

Models are stored in /mnt/e/volumes/faster-whisper/models/ (WSL2) or E:\volumes\faster-whisper\models\ (Windows).

# From WSL2 - List downloaded models
ls -lh /mnt/e/volumes/faster-whisper/models/

# Delete an unused model
rm -rf /mnt/e/volumes/faster-whisper/models/<model-name>

# From Windows PowerShell
Get-ChildItem E:\volumes\faster-whisper\models\

# Delete an unused model
Remove-Item -Recurse E:\volumes\faster-whisper\models\<model-name>

📊 Monitoring

Real-time Logs

docker logs faster-whisper -f --tail 100

Check Container Status

docker ps | grep faster-whisper

Restart on Issues

docker restart faster-whisper
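
For unattended recovery, a crude watchdog can poll the /health endpoint and restart the container on failure. A sketch, assuming /health returns a non-2xx status when the service is unhealthy:

while true; do
  if ! curl -sf http://localhost:10300/health > /dev/null; then
    echo "$(date) health check failed, restarting faster-whisper"
    docker restart faster-whisper
  fi
  sleep 60
done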

🐛 Troubleshooting

Container Won't Start

  1. Verify NVIDIA Container Toolkit is installed in WSL2:

    docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
    
  2. Check permissions on volumes:

    ls -la /mnt/e/volumes/faster-whisper/
    
  3. Ensure Docker Desktop WSL2 integration is enabled:

    • Open Docker Desktop → Settings → Resources → WSL Integration
    • Enable integration with Ubuntu-24.04

"Out of Memory" Error

  • Switch to a smaller model (e.g., from turbo to small)
  • Reduce WHISPER_BEAM to 3 or 1
  • Close other GPU-intensive applications on Windows
  • Check GPU memory usage with nvidia-smi (see the query below)
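
For a continuously refreshing view of GPU memory, the standard nvidia-smi query flags work well:

nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1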

Poor Transcription Quality

  • Switch to a larger model (e.g., from small to turbo)
  • Increase WHISPER_BEAM to 7 or 10
  • Check the audio quality of the source file (see the ffmpeg example below)
  • Verify the correct language is set in WHISPER_LANG
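
Whisper resamples input to 16 kHz mono internally, so pre-converting a problematic file with ffmpeg (assumed to be installed) is a quick way to rule out decoding issues:

ffmpeg -i input.mp3 -ar 16000 -ac 1 clean.wav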

WSL2 Specific Issues

GPU Not Detected

# Check Windows GPU driver version (from PowerShell)
nvidia-smi

# Update WSL2 kernel
wsl --update

# Restart WSL2
wsl --shutdown
# Then reopen Ubuntu

Volume Access Issues

# Check if drive is mounted in WSL2
ls /mnt/e/

# If not mounted, add to /etc/wsl.conf
sudo nano /etc/wsl.conf

# Add these lines:
[automount]
enabled = true
options = "metadata,uid=1000,gid=1000"

# Restart WSL2
wsl --shutdown

📁 File Structure

Windows: E:\volumes\faster-whisper\
WSL2:    /mnt/e/volumes/faster-whisper/
├── audio/          # Audio files to transcribe
└── models/         # Whisper models cache

🪟 Windows Integration

Access Files from Windows Explorer

  • Navigate to \\wsl$\Ubuntu-24.04\mnt\e\volumes\faster-whisper\
  • Or directly to E:\volumes\faster-whisper\

Copy Files to Transcribe

From Windows:

Copy-Item "C:\path\to\audio.mp3" -Destination "E:\volumes\faster-whisper\audio\"

From WSL2:

cp /mnt/c/path/to/audio.mp3 /mnt/e/volumes/faster-whisper/audio/

📝 Notes

  • Service automatically restarts unless manually stopped (restart: unless-stopped)
  • On first startup, the model will be downloaded (may take a few minutes)
  • Supported audio formats: MP3, WAV, M4A, FLAC, OGG, etc.
  • The service runs in WSL2 but is accessible from Windows
  • GPU computations are performed on the Windows NVIDIA GPU