Both the text-to-speech and the speech-to-text models launched here suffer from ... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		simonw 9 months ago \| parent \| context \| favorite \| on: OpenAI Audio Models Both the text-to-speech and the speech-to-text models launched here suffer from reliability issues due to combining instructions and data in the same stream of tokens. I'm not yet sure how much of a problem this is for real-world applications. I wrote a few notes on this here: https://simonwillison.net/2025/Mar/20/new-openai-audio-model...

accrual 9 months ago [–]

Thanks for the write up. I've been writing assembly lately, so as soon as I read your comment, I thought "hmm reminds me of section .text and section .data".

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact