Adding GPU support for transcription
@@ -17,6 +17,22 @@ The worker reaches OpenWebUI at `$OPENWEBUI_URL` (default: http://host.docker.in
Note: `.env.example` includes placeholders for both **Meili** and **OpenWebUI** configuration. Be sure to set `OPENWEBUI_URL` to point to your OpenWebUI container accordingly.
## GPU (CUDA) Setup
To run Whisper on NVIDIA GPU:
- Install the NVIDIA driver on the host and the NVIDIA Container Toolkit.
- Copy `.env.example` to `.env` and set:
  - `DOCKER_GPU_RUNTIME=nvidia`
  - `NVIDIA_VISIBLE_DEVICES=all` (or a specific GPU index)
  - `WHISPER_DEVICE=cuda` (or `auto`)
  - `WHISPER_PRECISION=float16` (recommended for GPU)
- Rebuild and start:
  - `docker compose up -d --build`
- Check logs for `device='cuda'` when the transcribe worker loads the model.
This repo's app image is based on `nvidia/cuda:12.4.1-cudnn9-runtime-ubuntu22.04`, which includes the CUDA and cuDNN user-space libraries that faster-whisper requires. On non-GPU hosts it still runs on CPU.
## Components Overview
- **scanner**: Scans your media folders (`library` and `transcripts`) for new or updated files, triggering ingestion and processing workflows.
@@ -33,6 +49,8 @@ Note: `.env.example` includes placeholders for both **Meili** and **OpenWebUI**
- `WHISPER_MODEL`: Whisper model variant to use for transcription (e.g., `small`, `medium`, `large`).
- `WHISPER_PRECISION`: Precision setting for Whisper inference (`float32` or `float16`).
- `WHISPER_LANGUAGE`: Language code for Whisper to use during transcription (e.g., `en` for English).
- `WHISPER_DEVICE`: Device selection for faster-whisper (`cpu`, `cuda`, or `auto`). Default is `cpu` in docker-compose to avoid GPU lib issues on non-GPU hosts.
- `WHISPER_CPU_THREADS`: CPU threads used for Whisper when `WHISPER_DEVICE=cpu` (default `4`).
- `TRANSCRIBE_BACKEND` (default `local`): Set to `openai` to offload Whisper transcription to the OpenAI API instead of running locally.
- `OPENAI_API_KEY`: Required when `TRANSCRIBE_BACKEND=openai`; API key used for authenticated requests.
- `OPENAI_BASE_URL`, `OPENAI_TRANSCRIBE_MODEL`, `OPENAI_TRANSCRIBE_TIMEOUT`: Optional overrides for the OpenAI transcription endpoint, model, and request timeout.
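Taken together, the Whisper variables above map naturally onto the faster-whisper `WhisperModel` constructor. The sketch below shows one plausible mapping; the `whisper_model_kwargs` helper and its defaults are assumptions, though `device`, `compute_type`, and `cpu_threads` are real `WhisperModel` parameters.

```python
import os


def whisper_model_kwargs() -> dict:
    """Build WhisperModel keyword arguments from the env vars above (sketch)."""
    device = os.environ.get("WHISPER_DEVICE", "cpu")
    kwargs = {
        "device": device,
        "compute_type": os.environ.get("WHISPER_PRECISION", "float32"),
    }
    if device == "cpu":
        # Thread count only matters for CPU inference.
        kwargs["cpu_threads"] = int(os.environ.get("WHISPER_CPU_THREADS", "4"))
    return kwargs


# Usage (requires faster-whisper to be installed):
# from faster_whisper import WhisperModel
# model = WhisperModel(os.environ.get("WHISPER_MODEL", "small"), **whisper_model_kwargs())
```

With `WHISPER_DEVICE=cuda` and `WHISPER_PRECISION=float16`, this yields `{"device": "cuda", "compute_type": "float16"}`, matching the GPU recommendation above.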