This brief is for the S02 implementation agent landing cold on the product workflow. After reading it, the agent should implement and verify the first end-to-end path where generated speech is saved into Soundboard and can be reloaded for reuse without reopening discovery.
Use the legacy web main TTS screen plus the legacy web Soundboard as the initial implementation surface for R045 and R046. This surface already has the text-to-speech API, the save-to-Soundboard API, the browser save modal, Soundboard category loading, pytest coverage, and the storage-provider seam needed to prove the user-visible workflow fastest.
Native parity, storage-provider migration execution, and additional speech surfaces are downstream work. They should consume the acceptance contract proven here rather than expand this slice into another platform inventory.
Only these tracked files are needed for S02 implementation targeting:
- web/python-web-app/src/routes/tts.py exposes POST /api/text-to-speech and delegates generation to the service layer.
- web/python-web-app/src/services/tts_service.py allocates a generated speech artifact and returns artifact_id, audio_path, and download_path on successful generation.
- web/python-web-app/src/legacy_runtime.py handles POST /api/soundboard/save-clip, requires both text and an artifact reference, uploads the audio/text assets to the configured storage provider path, updates clip order metadata, and invalidates category cache.
- web/python-web-app/static/js/index_speech.js owns the browser generate-and-save modal flow. Its current save request sends category, text, format, bitrate, and normalize values, but does not yet submit the generated artifact reference returned by /api/text-to-speech.
- web/python-web-app/static/js/index_soundboard.js owns Soundboard category load, clip render, play, download, and reload/reuse behavior.
- web/python-web-app/tests/api/test_tts.py already asserts generated TTS responses expose artifact fields and that save-to-Soundboard rejects requests missing an explicit artifact reference or text.

- POST /api/text-to-speech.
- audio_path and related metadata.
- POST /api/soundboard/save-clip.
- audio_path or artifact_path.

Success criteria:
- POST /api/text-to-speech returns a structured success payload that includes artifact_id, audio_path, and download_path for the generated audio artifact.
- audio_path or equivalent artifact reference is stable enough for a subsequent POST /api/soundboard/save-clip request in the same browser workflow.
- artifact_path or audio_path and resolves it through the server-side artifact/download path rules.

Failure criteria:
- SPEECH_ARTIFACT_REFERENCE_REQUIRED.
- SPEECH_ARTIFACT_NOT_FOUND response.

Success criteria:
- POST /api/soundboard/save-clip uploads the generated audio to the selected Soundboard category using the configured storage provider path.

Failure criteria:
Success criteria:
- /api/text-to-speech artifact reference separate from the text input value.
- audio_path or artifact_path from the latest generated response, plus category, text, format, bitrate, and normalize options.

Failure criteria:
- audio_path or artifact_path.
- SPEECH_ARTIFACT_REFERENCE_REQUIRED, SPEECH_ARTIFACT_NOT_FOUND, or another structured error, the UI must show a clear failure toast and reset the Save button state.
- /api/text-to-speech failures.
- /api/soundboard/save-clip failures.
- SPEECH_RUNTIME_ERROR, SPEECH_ARTIFACT_REFERENCE_REQUIRED, and SPEECH_ARTIFACT_NOT_FOUND.

Future agents must be able to inspect the workflow through pytest failures and runtime diagnostics. Preserve these observable fields and signals:
- /api/text-to-speech success payload: artifact_id, audio_path, download_path.
- /api/soundboard/save-clip request payload: category, text, and audio_path or artifact_path.
- save-clip-to-soundboard, structured speech error codes, provider upload path, text sidecar path, clip-order update result, and cache invalidation log.
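
The payload contract above can be sketched as a few standalone checks that a pytest case or a server-side guard might reuse. This is a minimal illustration, not the code in legacy_runtime.py: the field names and error codes come from this brief, while the helper names (missing_tts_fields, artifact_reference, save_clip_error_code), the artifact_exists callback, the preference for audio_path over artifact_path, and all sample values are assumptions made for the example.

```python
from typing import Callable, List, Mapping, Optional

# Field names and error codes are taken from the brief; everything else
# in this sketch (helper names, lookup order) is hypothetical.
TTS_SUCCESS_FIELDS = ("artifact_id", "audio_path", "download_path")
ARTIFACT_REFERENCE_FIELDS = ("audio_path", "artifact_path")


def missing_tts_fields(payload: Mapping[str, object]) -> List[str]:
    """Return the success-payload fields that are absent or empty."""
    return [name for name in TTS_SUCCESS_FIELDS if not payload.get(name)]


def artifact_reference(payload: Mapping[str, object]) -> Optional[str]:
    """Return the first non-empty artifact reference in a save-clip payload.

    Assumption: audio_path is preferred when both fields are present.
    """
    for name in ARTIFACT_REFERENCE_FIELDS:
        value = payload.get(name)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None


def save_clip_error_code(
    payload: Mapping[str, object],
    artifact_exists: Callable[[str], bool],
) -> Optional[str]:
    """Return a structured error code for a save-clip payload, or None if it passes.

    artifact_exists is a hypothetical stand-in for the server-side
    artifact/download path resolution.
    """
    reference = artifact_reference(payload)
    if reference is None:
        return "SPEECH_ARTIFACT_REFERENCE_REQUIRED"
    if not artifact_exists(reference):
        return "SPEECH_ARTIFACT_NOT_FOUND"
    return None


# Illustrative values only; real paths depend on the artifact allocator.
tts_response = {
    "artifact_id": "a1",
    "audio_path": "generated/a1.mp3",
    "download_path": "/download/a1.mp3",
}
known_artifacts = {"generated/a1.mp3"}
save_request = {
    "category": "greetings",
    "text": "hello",
    "audio_path": tts_response["audio_path"],
}
result = save_clip_error_code(save_request, known_artifacts.__contains__)
```

Keeping the reference lookup separate from the existence check mirrors the two distinct error codes, so a failing test can tell a browser payload that omitted the reference apart from a reference that no longer resolves on the server.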