User Guide

This guide walks through common tasks in the SaLED UI and with the API.

Open the app at http://localhost:3000.
Use the transcription panel to select an audio file (MP3/WAV/M4A).
Choose a Whisper model (start with tiny.en for speed, larger models for quality).
Enable/disable speaker diarization as needed.
Start. Watch progress indicators; segments appear incrementally.

Tips:

If a region is inaccurate:

Note the start_time and end_time of the problematic span.
Reprocess just that portion with /api/retranslate/segment by uploading a clipped audio file; or
If the original full audio was uploaded for the same job, call /api/retranslate/segment-from-full?job_id=<job_id> with the time range and model.
Replace or merge the corrected text in your transcript.

When to escalate model size:

After transcription, send segments to /api/summary/ with an optional full_text.
Pick format: paragraph, bullet_points, or key_points.
Poll /api/summary/status/{job_id} and fetch /api/summary/{job_id} for the final summary and topics.

Good prompts for summaries:

Compose a messages array with chat history and send along transcript/full_text to /api/chat/.
Adjust temperature (0.0–1.0) for creativity vs. determinism.
Respect token limits with max_tokens for concise answers.

Best practices:

From backend/saled:

bash

./cli.py model list
./cli.py model download base.en
./cli.py model delete base.en
./cli.py model update

User Guide ​