SpeakTrue

Speech Workflow Contract

Reader and action

This contract is for native and web engineers implementing the workflow that generates speech and saves it to Soundboard. After reading it, a caller should be able to generate speech, retain the returned artifact reference, save that artifact to Soundboard, and branch correctly on structured failures without consulting legacy planning artifacts.

Scope

The workflow has two HTTP operations:

  1. POST /api/text-to-speech generates a reusable speech artifact.
  2. POST /api/soundboard/save-clip saves that generated artifact into a Soundboard category.

Privileged storage writes remain server-side. Browser and native clients only forward artifact metadata returned by the generation operation; they do not write directly to object storage.
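As an illustration of the forwarding rule, the sketch below copies the artifact reference from a generation result into a save request without altering it. The field names come from the request and response examples in this contract; the type names and the helper buildSaveClipRequest are hypothetical. The contract does not state here which artifact fields are strictly required, so forwarding all four is an assumption.

```typescript
// Artifact reference fields as they appear in the generation success example.
interface GeneratedArtifactRef {
  artifact_id: string;
  artifact_path: string;
  audio_path: string;
  download_path: string;
}

// Save request body: the forwarded reference plus category and source text.
// Optional encoding fields (format, bitrate_kbps, normalize) are left out of this sketch.
interface SaveClipBody extends GeneratedArtifactRef {
  category: string;
  text: string;
}

// Hypothetical helper: forwards the reference verbatim. The client never
// rewrites paths and never touches object storage itself.
function buildSaveClipRequest(
  generated: GeneratedArtifactRef,
  category: string,
  text: string,
): SaveClipBody {
  return {
    category,
    text,
    artifact_id: generated.artifact_id,
    artifact_path: generated.artifact_path,
    audio_path: generated.audio_path,
    download_path: generated.download_path,
  };
}
```

A caller would pass the parsed generation response straight into this helper and send the result as the JSON body of POST /api/soundboard/save-clip.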

Generate speech request

POST /api/text-to-speech

{
  "text": "Text to speak",
  "voice": "voice-id",
  "model": "eleven_multilingual_v2",
  "stability": 0.8,
  "similarity": 0.7,
  "speed": 0.9,
  "style": 0.2,
  "speaker_boost": true
}

Required:

Optional fields fall back to server defaults when omitted. Numeric tuning fields must be parseable as numbers.
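One way a client might honor these two rules is sketched below: omitted optional fields are left out of the body so that server defaults apply, and numeric tuning fields are checked before sending. The helper buildSpeechRequest and the SpeechOptions type are hypothetical. Treating text as the only caller-supplied mandatory field is an assumption, and rejecting non-finite values is a client-side choice that is stricter than "parseable as a number".

```typescript
// Numeric tuning fields named in the request example.
const NUMERIC_FIELDS = ["stability", "similarity", "speed", "style"] as const;

// Hypothetical options type; numeric fields may arrive as strings from form inputs.
interface SpeechOptions {
  voice?: string;
  model?: string;
  speaker_boost?: boolean;
  stability?: number | string;
  similarity?: number | string;
  speed?: number | string;
  style?: number | string;
}

function buildSpeechRequest(
  text: string,
  options: SpeechOptions = {},
): Record<string, unknown> {
  const body: Record<string, unknown> = { text };
  if (options.voice !== undefined) body.voice = options.voice;
  if (options.model !== undefined) body.model = options.model;
  if (options.speaker_boost !== undefined) body.speaker_boost = options.speaker_boost;

  for (const field of NUMERIC_FIELDS) {
    const raw = options[field];
    if (raw === undefined) continue; // omitted: the server default applies
    // Number("") evaluates to 0, so blank strings are rejected explicitly.
    const blank = typeof raw === "string" && raw.trim() === "";
    const value = typeof raw === "number" ? raw : Number(raw);
    if (blank || !Number.isFinite(value)) {
      throw new TypeError(`${field} must be parseable as a number`);
    }
    body[field] = value;
  }
  return body;
}
```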

Generate speech success

HTTP status: 200

{
  "success": true,
  "status": 200,
  "artifact_id": "generated-id",
  "artifact_path": "speech/anonymous/generated-id.mp3",
  "audio_path": "/static/speech/anonymous/generated-id.mp3",
  "download_path": "/download-audio?path=speech/anonymous/generated-id.mp3"
}
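Because the save operation depends on the artifact reference, a client may want to confirm the success body has the expected shape before retaining it. The guard below is a minimal sketch based only on the example above; the names GenerateSpeechSuccess and isGenerateSpeechSuccess are hypothetical, and treating every field in the example as always present is an assumption.

```typescript
// Shape of the generation success body as shown in the example.
interface GenerateSpeechSuccess {
  success: true;
  status: number;
  artifact_id: string;
  artifact_path: string;
  audio_path: string;
  download_path: string;
}

// Runtime check for a parsed JSON body of unknown shape.
function isGenerateSpeechSuccess(value: unknown): value is GenerateSpeechSuccess {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.success === true &&
    typeof v.status === "number" &&
    typeof v.artifact_id === "string" &&
    typeof v.artifact_path === "string" &&
    typeof v.audio_path === "string" &&
    typeof v.download_path === "string"
  );
}
```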

Client rules:

Save to Soundboard request

POST /api/soundboard/save-clip

{
  "category": "Lecture1",
  "text": "Text used to generate the speech",
  "artifact_path": "speech/anonymous/generated-id.mp3",
  "artifact_id": "generated-id",
  "audio_path": "/static/speech/anonymous/generated-id.mp3",
  "download_path": "/download-audio?path=speech/anonymous/generated-id.mp3",
  "format": "mp3",
  "bitrate_kbps": 192,
  "normalize": true
}

Required:

Compatibility:

Save to Soundboard success

HTTP status: 200

{
  "success": true,
  "message": "Clip and text saved to Lecture1 category in storage",
  "clip": {
    "name": "Text_to_speak",
    "filename": "Text_to_speak_1712345678.mp3",
    "url": "https://storage.example/soundboard/Lecture1/Text_to_speak_1712345678.mp3",
    "text_url": "https://storage.example/soundboard/Lecture1/Text_to_speak_1712345678.txt",
    "text_filename": "Text_to_speak_1712345678.txt",
    "category": "Lecture1",
    "timestamp": 1712345678
  }
}

Client rules:

Failure envelope

Generation and save failures use the same structured envelope:

{
  "error": "Human-readable failure message",
  "error_code": "SPEECH_ARTIFACT_NOT_FOUND",
  "operation": "save-clip-to-soundboard",
  "status": 404,
  "retryable": false,
  "details": {
    "phase": "audio_upload",
    "backend": "supabase"
  }
}
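The envelope above can be modeled on the client so that branching uses error_code and retryable rather than the human-readable message. This is a sketch under stated assumptions: the names SpeechFailure, isSpeechFailure, and nextStep are hypothetical, treating details as optional and the other five fields as mandatory is an assumption, and the "regenerate" branch for SPEECH_ARTIFACT_NOT_FOUND is an illustrative client policy, not something this contract prescribes.

```typescript
// Failure envelope fields as shown in the example.
interface SpeechFailure {
  error: string;
  error_code: string;
  operation: string;
  status: number;
  retryable: boolean;
  details?: Record<string, unknown>; // assumed optional
}

// Runtime check for a parsed JSON body of unknown shape.
function isSpeechFailure(value: unknown): value is SpeechFailure {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.error === "string" &&
    typeof v.error_code === "string" &&
    typeof v.operation === "string" &&
    typeof v.status === "number" &&
    typeof v.retryable === "boolean"
  );
}

type NextStep = "retry" | "regenerate" | "report";

// Illustrative policy only: retry when the server marks the failure retryable,
// generate again when the artifact is gone, otherwise show the error to the user.
function nextStep(failure: SpeechFailure): NextStep {
  if (failure.retryable) return "retry";
  if (failure.error_code === "SPEECH_ARTIFACT_NOT_FOUND") return "regenerate";
  return "report";
}
```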

Required fields:

Details rules:

Status and retryability expectations

Artifact path rules

Browser diagnostics

The legacy web runtime exposes:

Diagnostics include generation/save phase, operation, error code, status, retryability, message, and bounded details where available.

Native consumer checklist

Executable proof

Run these checks before changing the contract:

node web/python-web-app/tests/js/tts_soundboard_workflow_runtime_check.mjs
web/python-web-app/venv/bin/pytest \
  web/python-web-app/tests/api/test_speech_contract_hardening.py \
  web/python-web-app/tests/api/test_tts_soundboard_workflow_target.py \
  web/python-web-app/tests/api/test_tts.py \
  web/python-web-app/tests/api/test_soundboard_supabase_mode.py
deno test backend/supabase/functions/tts-generate/handler_test.ts
deno test backend/supabase/functions/soundboard-save-generated/handler_test.ts

The graph rebuild command keeps code navigation artifacts current after implementation changes:

python3 -c "from graphify.watch import _rebuild_code; from pathlib import Path; _rebuild_code(Path('.'))"