API Reference
Base URL: http://localhost:8000
All responses include headers:
- X-Request-ID: unique ID for tracing
- X-Process-Time-Ms: end-to-end processing time for the request
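A minimal sketch of reading these two headers on the client side. The helper name `read_trace_headers` is hypothetical, and the code assumes the timing header carries a plain numeric string (presumably milliseconds, going by the header name); the reference does not state the exact value format.

```python
from typing import Mapping, Optional, Tuple


def read_trace_headers(headers: Mapping[str, str]) -> Tuple[Optional[str], Optional[float]]:
    """Return (request_id, process_time_ms) from a response header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id = lowered.get("x-request-id")
    raw_time = lowered.get("x-process-time-ms")
    try:
        # Assumption: the value is a numeric string such as "12.5".
        process_time_ms = float(raw_time) if raw_time is not None else None
    except ValueError:
        process_time_ms = None  # header present but not numeric
    return request_id, process_time_ms
```

Logging the request ID next to any client-side error makes it easier to match the failure to a server log entry.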
Health and root
- GET /health → { status, timestamp }
- GET / → service metadata: name, version, endpoints
Transcription API
Prefix: /api/transcribe
List available models
GET /api/transcribe/models
Response:
{
  "models": [
    {
      "name": "tiny.en",
      "full_name": "openai/whisper-tiny.en",
      "is_downloaded": true,
      "size_mb": 75.0,
      "description": "Fast, English optimized"
    }
  ]
}
Start transcription
POST /api/transcribe/
Form fields:
- file (required): audio file (mp3, wav, m4a, ...)
- model_name (default: tiny.en)
- language (optional ISO 639-1)
- diarize (bool, default: true)
Response:
{ "job_id": "<uuid>", "status": "processing", "message": "..." }
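The upload can be sketched with the Python standard library alone. The function name `build_transcribe_request` is hypothetical, and two details are assumptions rather than documented behavior: that `diarize` is sent as the string "true" or "false", and that the file part should carry an audio content type guessed from the filename. The function only builds the request; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import mimetypes
import urllib.request
import uuid
from typing import Optional


def build_transcribe_request(
    base_url: str,
    filename: str,
    audio_bytes: bytes,
    model_name: str = "tiny.en",
    language: Optional[str] = None,
    diarize: bool = True,
) -> urllib.request.Request:
    """Build a multipart/form-data POST for /api/transcribe/ without sending it."""
    boundary = uuid.uuid4().hex
    # Assumption: booleans are serialized as lowercase strings.
    fields = {"model_name": model_name, "diarize": "true" if diarize else "false"}
    if language:
        fields["language"] = language

    chunks = []
    for name, value in fields.items():
        chunks.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # Guess an audio/* type from the extension; fall back to a generic type.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    chunks.append(file_header + audio_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        base_url.rstrip("/") + "/api/transcribe/",
        data=b"".join(chunks),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Sending it would look like `urllib.request.urlopen(build_transcribe_request("http://localhost:8000", "clip.mp3", data))`, after which the JSON body yields the `job_id` used by the endpoints that follow.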
List jobs
GET /api/transcribe/jobs
Response:
{
  "jobs": [
    {
      "job_id": "...",
      "status": "processing",
      "progress": 25,
      "updated_at": 1712345678.9,
      "message": "..."
    }
  ],
  "count": 1
}
Check status
GET /api/transcribe/status/{job_id}
Response (examples):
{
  "job_id": "...",
  "status": "processing",
  "message": "Processing chunk 2/5 (40%)",
  "progress": 40,
  "updated_at": 1712345678.9
}
or if not found:
{
  "job_id": "...",
  "status": "not_found",
  "message": "Transcription job not found or expired",
  "progress": 0,
  "updated_at": 1712345678.9
}
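A client that does not use the WebSocket can poll this endpoint until the job settles. The sketch below is one way to do that; `wait_for_job` and its parameters are hypothetical names. It assumes the status endpoint reports the same "completed" and "failed" values that the live updates use, plus the "not_found" value shown above. The HTTP call is passed in as `fetch_status` so the loop stays independent of any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these are the statuses after which a job no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "not_found"}


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_status(job_id) until a terminal status or the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status.get('status')!r} after {timeout}s")
        sleep(poll_interval)
```

Here `fetch_status` would typically issue GET /api/transcribe/status/{job_id} and return the decoded JSON body.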
Get result
GET /api/transcribe/result/{job_id}
Response:
{
  "segments": [
    {
      "start": 0.12,
      "end": 2.34,
      "text": "hello",
      "speaker": "SPEAKER_1",
      "speaker_confidence": 0.78,
      "start_display": "00:00:00.120",
      "end_display": "00:00:02.340"
    }
  ],
  "text": "hello ...",
  "language": "en",
  "audio_duration": 123.45
}
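As an illustration of consuming this result, the sketch below merges consecutive segments from the same speaker into turns. The function name `group_by_speaker` and the "UNKNOWN" placeholder for a missing speaker are assumptions made for the example, not part of the API.

```python
from typing import Any, Dict, List


def group_by_speaker(segments: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge adjacent segments that share a speaker into single turns."""
    turns: List[Dict[str, Any]] = []
    for seg in segments:
        text = seg.get("text", "").strip()
        if not text:
            continue  # skip segments with no text
        # Assumption: speaker may be absent or null when diarization is off.
        speaker = seg.get("speaker") or "UNKNOWN"
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + text
            turns[-1]["end"] = seg["end"]
        else:
            turns.append(
                {"speaker": speaker, "start": seg["start"], "end": seg["end"], "text": text}
            )
    return turns
```

Each returned turn keeps the start of its first segment and the end of its last one, so it can still be mapped back to the audio timeline.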
Live updates (WebSocket)
WS /api/transcribe/live/{job_id}
Messages:
- Status updates:
{ "type":"status", "status":"processing|completed|failed", "message":"...", "progress": 0-100, "chunk_info": {"current_chunk":N,"total_chunks":M,"chunk_progress":0-100} }
- Segment objects as they are produced (silence segments are not sent)
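The two message kinds can be told apart by the `type` field. The handler below is a sketch under two assumptions: that each message arrives as a JSON text frame, and that segment messages do not carry `"type": "status"`. The name `handle_live_message` and the shape of the `state` dict are hypothetical. The WebSocket connection itself is left out so the logic works with any client library.

```python
import json
from typing import Any, Dict


def handle_live_message(raw: str, state: Dict[str, Any]) -> bool:
    """Update state from one message; return True when the job has finished."""
    msg = json.loads(raw)
    if msg.get("type") == "status":
        state["status"] = msg.get("status")
        state["progress"] = msg.get("progress", state.get("progress", 0))
        if "chunk_info" in msg:
            state["chunk_info"] = msg["chunk_info"]
        # "completed" and "failed" are the terminal values listed above.
        return msg.get("status") in ("completed", "failed")
    # Assumption: anything that is not a status update is a segment object.
    state.setdefault("segments", []).append(msg)
    return False
```

A receive loop would call this for every incoming frame and close the socket once it returns True.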
Debug a job
GET /api/transcribe/debug/{job_id}
→ diagnostic fields: cached segments, latest time, etc.
Summary API
Prefix: /api/summary
Start summary
POST /api/summary/
Body (JSON):
{
  "transcript": [{ "start": 0.0, "end": 1.2, "text": "..." }],
  "full_text": "optional full transcript text",
  "model_name": "gpt-3.5-turbo",
  "max_length": 500,
  "format": "paragraph" // paragraph | bullet_points | key_points
}
Response: { "job_id": "<uuid>", "status": "processing", "message": "..." }
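The request body can be derived from a transcription result. The sketch below shows one way; `build_summary_request` is a hypothetical helper, and it assumes the summary endpoint only needs `start`, `end`, and `text` per segment (whether extra fields such as `speaker` are accepted is not stated here).

```python
from typing import Any, Dict

ALLOWED_FORMATS = ("paragraph", "bullet_points", "key_points")


def build_summary_request(
    result: Dict[str, Any],
    model_name: str = "gpt-3.5-turbo",
    max_length: int = 500,
    fmt: str = "paragraph",
) -> Dict[str, Any]:
    """Turn a GET /api/transcribe/result/{job_id} body into a summary request body."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    # Keep only the three fields shown in the request example.
    transcript = [
        {"start": seg["start"], "end": seg["end"], "text": seg["text"]}
        for seg in result.get("segments", [])
    ]
    payload: Dict[str, Any] = {
        "transcript": transcript,
        "model_name": model_name,
        "max_length": max_length,
        "format": fmt,
    }
    if result.get("text"):
        payload["full_text"] = result["text"]  # optional field
    return payload
```

Validating `format` on the client avoids a round trip for a value the server would presumably reject.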
Get summary result
GET /api/summary/{job_id}
Response:
{
  "summary": "...",
  "topics": ["..."],
  "length": 123,
  "model_used": "gpt-3.5-turbo"
}
Check summary status
GET /api/summary/status/{job_id}
→ { status, message, progress }, or 404 if the job_id is unknown.
Chat API
Prefix: /api/chat
List available chat models
GET /api/chat/models
Response:
{
  "models": [{ "id": "gpt-3.5-turbo", "name": "GPT-3.5", "description": "..." }]
}
Chat with transcript context
POST /api/chat/
Body (JSON):
{
  "transcript": [{ "start": 0.0, "end": 1.2, "text": "..." }],
  "full_text": "optional full transcript text",
  "messages": [{ "role": "user", "content": "What did speaker 2 request?" }],
  "model_name": "gpt-3.5-turbo",
  "temperature": 0.7,
  "max_tokens": 500
}
Response:
{
  "message": { "role": "assistant", "content": "..." },
  "model_used": "gpt-3.5-turbo",
  "usage": { "prompt": 123, "completion": 45, "total": 168 }
}
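For a multi-turn conversation, a client might carry the message history itself. The sketch below assumes the endpoint is stateless, so every request resends the earlier messages; this is an assumption, since the reference does not say whether the server keeps any conversation state. The helper names `build_chat_request` and `extend_history` are hypothetical.

```python
from typing import Any, Dict, List


def build_chat_request(
    transcript: List[Dict[str, Any]],
    history: List[Dict[str, str]],
    question: str,
    model_name: str = "gpt-3.5-turbo",
    temperature: float = 0.7,
    max_tokens: int = 500,
) -> Dict[str, Any]:
    """Build a POST /api/chat/ body with prior messages plus a new user question."""
    messages = list(history) + [{"role": "user", "content": question}]
    return {
        "transcript": transcript,
        "messages": messages,
        "model_name": model_name,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def extend_history(request_body: Dict[str, Any], response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return the history to use for the next turn: sent messages plus the reply."""
    return list(request_body["messages"]) + [response["message"]]
```

After each response, `extend_history` yields the list to pass as `history` on the next call.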
Retranslation API
Prefix: /api/retranslate
Reprocess a provided audio segment
POST /api/retranslate/segment
Form fields:
- file (required): clipped audio for the segment
- segment_id (required)
- start_time (float seconds)
- end_time (float seconds)
- model_name (default: whisper-large-v3-turbo)
- language (optional)
Response:
{
  "segment_id": "seg-1",
  "text": "...",
  "confidence": 0.95,
  "speaker": null,
  "model_used": "whisper-large-v3-turbo"
}
Reprocess from previously uploaded full audio
POST /api/retranslate/segment-from-full?job_id=<job_id>
Body (JSON):
{
  "segment_id": "seg-1",
  "start_time": 30.0,
  "end_time": 44.0,
  "model_name": "whisper-large-v3-turbo",
  "language": "en"
}
Response: same shape as the POST /api/retranslate/segment response.
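This endpoint mixes a query parameter (`job_id`) with a JSON body, which is easy to get wrong. The sketch below builds such a request with the standard library; `build_segment_from_full_request` is a hypothetical name, the default model is taken from the example body rather than from a documented default, and the time-range check is a client-side assumption.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def build_segment_from_full_request(
    base_url: str,
    job_id: str,
    segment_id: str,
    start_time: float,
    end_time: float,
    model_name: str = "whisper-large-v3-turbo",
    language: Optional[str] = None,
) -> urllib.request.Request:
    """Build the POST for /api/retranslate/segment-from-full without sending it."""
    if end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")
    body = {
        "segment_id": segment_id,
        "start_time": start_time,
        "end_time": end_time,
        "model_name": model_name,
    }
    if language is not None:
        body["language"] = language
    # job_id travels in the query string, not in the JSON body.
    query = urlencode({"job_id": job_id})
    url = f"{base_url.rstrip('/')}/api/retranslate/segment-from-full?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; `urlencode` takes care of escaping any unusual characters in the job ID.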
Error handling
- 400: invalid input (e.g., file content-type not audio)
- 404: unknown job_id
- 500: unhandled errors; payload includes detail, and responses still carry CORS headers
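On the client, these status codes can be mapped to distinct exception types so callers can react differently to bad input and to a missing job. The class and function names below are hypothetical. The reference only states that 500 payloads include `detail`, so reading `detail` from 400 and 404 payloads is an assumption, with a fallback message when it is absent.

```python
from typing import Any


class ApiError(Exception):
    """Base class for errors returned by the API."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


class InvalidInputError(ApiError):
    """400: the request was rejected as invalid."""


class JobNotFoundError(ApiError):
    """404: the job_id is unknown."""


class ServerError(ApiError):
    """5xx: unhandled error on the server."""


def raise_for_status(status_code: int, payload: Any) -> None:
    """Raise a typed exception for error responses; do nothing on success."""
    if status_code < 400:
        return
    # Assumption: error payloads are JSON objects that may contain "detail".
    detail = payload.get("detail") if isinstance(payload, dict) else None
    detail = str(detail) if detail else "no detail provided"
    if status_code == 400:
        raise InvalidInputError(status_code, detail)
    if status_code == 404:
        raise JobNotFoundError(status_code, detail)
    if status_code >= 500:
        raise ServerError(status_code, detail)
    raise ApiError(status_code, detail)
```

Codes not listed above fall through to the generic `ApiError`.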