Developer Guide
This guide explains the architecture, key modules, and how to extend SaLED.
Repository layout
- Backend:
backend/saledapp/main.py: FastAPI app, CORS, health, router registration.app/routers: API endpoints fortranscribe,summary,chat,retranslate.core/services: Business logic; stateless where possible, with in-memory job/result stores.core/models: Adapters likeWhisperTranscriberandLLMService.cli: Typer-based CLI for model management and local transcription.
- Frontend:
frontendNext.js UI.
Backend architecture
- FastAPI app exposes REST + WebSocket.
- Request-scoped headers:
X-Request-ID,X-Process-Time-Msadded by middleware. - Routers:
- Transcribe: start job, list jobs, get status, get result, live WebSocket stream.
- Summary: start job, get status, get result.
- Chat: synchronous responses to transcript-grounded prompts; list available models.
- Retranslate: reprocess segment from uploaded clip or from full audio registered by job.
Services
TranscriptionService
- Saves uploaded file, assigns
job_id, spawns background processing. - Converts to WAV; chunk-based processing for long audio (60s chunks, 10s overlap).
- Real-time progress via
_update_job_statusandget_segments_generator. - Diarization: tries PyAnnote; if unavailable, uses feature/rule-based fallbacks.
- Results in-memory:
jobs,results,segments_cache, andjob_file_pathsfor retranslation.
SummaryService
- Uses
LLMServiceto generate a summary and extract topics. - Tracks job status and results in-memory.
ChatService
- Wraps
LLMService.chatwith a transcript-aware system prompt. - Returns assistant message, model used, and token usage.
RetranslateService
- Keeps a map of
job_id→ audio file path for full-audio segment extraction. - Extracts a time range to WAV and runs
WhisperTranscriberon the slice.
Getting started (development)
Prerequisites
- Linux/macOS with Bash
- Git for version control
- NVM for Node.js and npm frontend environment
- UV for Python backend environment
1) Clone the repo
bash
git clone https://github.com/Sang-Buster/SaLED.git
cd SaLED2) Install dependencies
Use the Makefile helpers:
bash
make install
# or
./run.sh installThis will:
- Backend: create/refresh a
uvvirtual environment and sync Python deps inbackend/saled. - Frontend: install
npmpackages infrontend.
Optional: install pre-commit hooks
bash
uv add ruff pre-commit
pre-commit install3) Start development servers
bash
make dev
# or
./run.sh dev- Frontend: http://localhost:3000
- Backend (FastAPI docs): http://localhost:8000/docs
To run only one side:
bash
make backend # FastAPI only
make frontend # Next.js only4) Verify health
- Visit http://localhost:8000/health (expect {"status":"healthy", ...}).
- Visit http://localhost:8000/ (service metadata and endpoints).
- Check http://localhost:8000/docs for interactive OpenAPI.
5) First transcription
Use the API UI at http://localhost:8000/docs or curl:
bash
curl -X POST "http://localhost:8000/api/transcribe/" \
-F "file=@backend/saled/demo.wav" \
-F "model_name=tiny.en" \
-F "diarize=true"Response contains job_id. Poll status:
bash
curl "http://localhost:8000/api/transcribe/status/<job_id>"Fetch result:
bash
curl "http://localhost:8000/api/transcribe/result/<job_id>"6) Using the CLI (optional)
From backend/saled:
bash
./cli.py --help
./cli.py model list
./cli.py model download tiny.en
./cli.py transcribe ../../demo.wavTroubleshooting
- Port conflicts: run
make cleanto kill prior servers. - Missing models: run
./cli.py model listand./cli.py model download <name>. - CORS or 500s: check backend logs; every response includes
X-Request-IDto trace. - Long audio: progress is chunked; for live updates, connect to WebSocket
/api/transcribe/live/{job_id}.
Adding a new API route
- Create a router in
app/routers/<feature>.pywith Pydantic models. - Register it in
app/main.pywithapp.include_router. - Implement logic in
core/services/<feature>_service.py. - Add tests in
backend/saled/tests.
Extending transcription
- Models: integrate new Whisper variants by extending
WhisperTranscriberor mapping inMODEL_REPO_IDS. - Progress: emit granular messages via
_update_job_status. - Diarization: plug in better pipelines; ensure fallbacks remain robust.
Frontend integration notes
- The UI expects REST endpoints at
/api/<feature>and a WebSocket at/api/transcribe/live/{job_id}. - CORS is permissive by default; tighten for production.
Production considerations
- Persistence: move jobs/results from memory to a store (Redis/Postgres/S3 for blobs).
- Background workers: offload long jobs to a task queue (RQ/Celery/Arq) to isolate API.
- Rate limits/auth: introduce auth middleware and quotas.
- Observability: structure logs with
request_id; export metrics. - Model assets: pre-fetch and cache models using the CLI or at container build time.
Testing and linting
- Tests live under
backend/saled/tests. - Lint:
make lintruns frontend Prettier/ESLint and backend Ruff.