Updated README
This commit is contained in:
README.md
@@ -14,6 +14,55 @@ docker compose up -d --build
```

The worker reaches OpenWebUI at `$OPENWEBUI_URL` (default: http://host.docker.internal:3003).

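If your OpenWebUI instance runs somewhere else, override the URL. A minimal sketch, assuming the compose stack reads `OPENWEBUI_URL` from `.env` (the host and port below are placeholders):

```bash
# .env (illustrative): point the worker at a non-default OpenWebUI instance
OPENWEBUI_URL=http://openwebui.lan:8080
```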
## Components Overview

- **scanner**: Scans your media folders (`library` and `transcripts`) for new or updated files, triggering ingestion and processing workflows.
- **worker**: Handles general background tasks such as metadata fetching, thumbnail generation, and indexing.
- **rss_ingest**: Periodically reads an RSS feed list, downloads new podcast episodes, and adds them to your library for processing.

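Once the stack is up, a quick way to see these components and follow their activity (a sketch; your compose service names may differ, e.g. they may carry a `podx-` prefix):

```bash
# List running services, then follow the RSS ingester's logs (adjust the service name if needed)
docker compose ps
docker compose logs -f --tail=50 rss_ingest
```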
## Environment Variables

- `REFRESH_EXISTING` (default `false`): If set to `true`, forces re-download of metadata, captions, and thumbnails for existing files during scanning.
- `REFRESH_TTL` (default `604800` seconds, i.e., 7 days): Time-to-live before metadata and related info are refreshed.
- `REFRESH_FAILURE_TTL` (default `86400` seconds, i.e., 1 day): Time-to-live before retrying failed refresh attempts.
- `LIBRARY_HOST_DIR`: Path on the host machine where your source media files reside (mounted into the container).
- `TRANSCRIPTS_HOST_DIR`: Path on the host machine where processed transcripts, subtitles, and metadata are stored.
- `WHISPER_MODEL`: Whisper model variant to use for transcription (e.g., `small`, `medium`, `large`).
- `WHISPER_PRECISION`: Precision setting for Whisper inference (`float32` or `float16`).
- `WHISPER_LANGUAGE`: Language code for Whisper to use during transcription (e.g., `en` for English).
- `YTDLP_COOKIES`: Path to a yt-dlp cookies file for accessing age-restricted or private videos.

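Putting these together, a hypothetical `.env` might look like the following; the paths and Whisper settings are placeholders, only the refresh values reflect the documented defaults:

```bash
# .env (illustrative values)
REFRESH_EXISTING=false
REFRESH_TTL=604800            # 7 days
REFRESH_FAILURE_TTL=86400     # 1 day
LIBRARY_HOST_DIR=/srv/podx/library
TRANSCRIPTS_HOST_DIR=/srv/podx/transcripts
WHISPER_MODEL=medium
WHISPER_PRECISION=float16
WHISPER_LANGUAGE=en
YTDLP_COOKIES=/srv/podx/cookies.txt
```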
## RSS Ingestion

PodX supports automated podcast ingestion via RSS feeds:

- Add your podcast RSS feed URLs to a `feeds.txt` file, one URL per line.
- The `rss_ingest` component reads this list periodically, downloads new episodes, and places them into the `library/podcasts` folder.
- Downloaded podcasts are then processed by the scanner and worker to generate transcripts, metadata, and thumbnails.

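`feeds.txt` is just a plain list of feed URLs, one per line. For example (the URLs below are placeholders):

```bash
# Create feeds.txt with one RSS feed URL per line
cat > feeds.txt <<'EOF'
https://example.com/podcasts/my-show/feed.xml
https://example.org/rss/another-show
EOF
```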
## Refresh Mechanism

PodX periodically refreshes metadata, captions, and thumbnails for media files based on the TTL settings:

- Files older than `REFRESH_TTL` are re-processed to keep metadata up-to-date.
- Failed refresh attempts are retried after `REFRESH_FAILURE_TTL`.
- Setting `REFRESH_EXISTING=true` in `.env` forces a refresh on every scan cycle.

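As a sketch of a one-off forced refresh (this assumes the compose file passes `REFRESH_EXISTING` from the shell or `.env` into the scanner service, and that the service is named `scanner`; adjust both to your setup):

```bash
# Force the next scan to refresh everything, recreating the scanner so it picks up the flag
REFRESH_EXISTING=true docker compose up -d --force-recreate scanner
```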
## Multi-Worker Setup

For improved performance and scalability, PodX supports running multiple workers with specialized roles:

- `podx-worker`: Handles general tasks such as scanning, metadata fetching, and indexing.
- `podx-worker-transcribe`: Dedicated to heavy Whisper transcription jobs, isolating resource-intensive audio processing.

This separation helps optimize resource usage and allows parallel processing of different workloads.

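If transcription becomes the bottleneck, extra replicas of the transcription worker can be started with Compose's `--scale` flag (a sketch; it assumes `podx-worker-transcribe` is the compose service name and that the service does not pin a fixed `container_name`):

```bash
# Run two transcription workers in parallel (illustrative)
docker compose up -d --scale podx-worker-transcribe=2
```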
## Plex Integration

- The **library** folder contains your source media and can be mounted directly into Plex or other media managers.
- PodX automatically generates NFO files and `.srt` subtitle sidecars per show and episode, enabling rich metadata and transcripts in Plex.
- This setup lets you browse, search, and play your media with synchronized transcripts and metadata seamlessly.

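Before pointing Plex at the library, a quick spot-check that sidecars are being written (a sketch; it assumes `LIBRARY_HOST_DIR` is set in your shell, and sidecars may instead live under `TRANSCRIPTS_HOST_DIR` depending on your mounts):

```bash
# Show a sample of generated NFO and subtitle sidecars
find "$LIBRARY_HOST_DIR" -type f \( -name '*.nfo' -o -name '*.srt' \) | head -n 20
```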
## Ingest helpers

```bash
MEILI_URL=http://localhost:7700 MEILI_KEY=$MEILI_MASTER_KEY ./ingest/ingest_pdfs.sh /path/*.pdf
@@ -21,7 +70,6 @@ MEILI_URL=http://localhost:7700 MEILI_KEY=$MEILI_MASTER_KEY ./ingest/ingest_epub
MEILI_URL=http://localhost:7700 MEILI_KEY=$MEILI_MASTER_KEY ./ingest/ingest_kiwix.sh /path/wiki.zim
```

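After running the helpers, you can sanity-check that documents landed in Meilisearch (assumes the stack's Meilisearch at `localhost:7700` and `jq` installed):

```bash
# Show per-index document counts
curl -s -H "Authorization: Bearer $MEILI_MASTER_KEY" http://localhost:7700/stats | jq '.indexes'
```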
## Backfill existing files into OpenWebUI

```bash
# From repo root: