I personally love using Piper to make audiobooks offline. I made a CLI wrapper over it, and it works great for me [0]. Honestly, I'm surprised more people aren't using similar offline tools. I have listened to dozens of audiobooks this way in the past year.
There is a pitfall in thinking that the most natural voice is the most important metric (e.g. many blind users still prefer espeak). Piper has a great balance of natural voice, offline convenience/privacy, and intelligibility at high speeds. I still have not seen anything better as an overall audiobook solution.
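For anyone curious what a thin wrapper over Piper can look like, here is a minimal Python sketch. It is not how the linked tool is implemented; it only assumes the `piper` executable is on your PATH and that it reads text on stdin and takes `--model` and `--output_file`, as in Piper's README. The function names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess


def piper_command(model_path: str, wav_path: str) -> list[str]:
    """Build the argument list for one Piper run (the text is supplied on stdin)."""
    return ["piper", "--model", model_path, "--output_file", wav_path]


def synthesize(text: str, model_path: str, wav_path: str) -> None:
    """Pipe text into Piper and write a WAV file; raises if Piper is not installed."""
    if shutil.which("piper") is None:
        raise RuntimeError("piper executable not found on PATH")
    subprocess.run(
        piper_command(model_path, wav_path),
        input=text.encode("utf-8"),  # Piper reads the text to speak from stdin
        check=True,                  # surface a non-zero exit code as an exception
    )
```

Usage would be something like `synthesize(open("book.txt").read(), "voice.onnx", "book.wav")`, where `voice.onnx` stands in for whichever Piper voice model you have downloaded.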
I am excited to try it! Could you add a prebuilt release? Would it be possible to post a sample output so people can judge for themselves whether they would like the quality available?
Yes, I mainly read non-fiction, and I think your point is valid. That being said, I do listen to a fair number of books that include interviews or travel writing with multiple characters, and it still works well for me.
They’re missing the chance to be cheap and sweep as many developers as possible into their net. OpenAI's genius is not only being first with a truly great product but being so cheap it’s the easy and obvious choice for developers.
I've played with it a bit (I made a local nonsense social media thing, reading LLM content), and it's surprisingly easy to get text to display in sync with the audio. It works with voices that are installed on the user's system.
The bookmarklet just uses the Web Speech API. Voicegen does not need to be installed and can generate (on my system) WAV files of about 1 hour in length with good speech quality using PicoTTS.
I've used this and the main ElevenLabs service. The reader must be using a very cheap model; it sounds worse than the Azure service (which isn't bad).
It's better than traditional text-to-speech, but I can't use it to listen to long-form articles.
Ah, it's definitely not significantly cheaper than ElevenLabs. I was expecting ElevenLabs to be better because I hadn't read or seen anything suggesting it was a worse model, but it makes sense.
I haven't tested the app, but the main service sounds pretty close to natural speech.
If the 11labs app gets to the main service's quality and starts accepting ePubs, it will be the death of Audible.
> If the 11labs app gets to the main service's quality and starts accepting ePubs, it will be the death of Audible.
The app from this announcement does accept ePubs. I just tested a couple and had no issues. I haven't used 11labs before, but the quality was good, and an English voice handled some spot-checked fantasy names/terms and chemical names without any major problems.
I believe Audible is already using non-human speakers as well and just making up names for the narrators. I can tell because pronunciations are sometimes wrong in ways that sound like TTS mistakes.
Ah, I have probably listened to maybe 8-10 audiobooks, all of which were recommended to me as excellent. I guess I just assumed that’s what audiobooks are like, which doesn’t make sense.
My main reason for returning or not finishing audiobooks, be it with Audible or competitors, has been either a) a whiny or nasal voice, or b) a strong accent that I dislike (e.g. a strong British or Australian accent).
I have ended up finding more new books by checking which other books a specific narrator has voiced than by picking a book I want to listen to and hoping for a good narrator.
[0] https://github.com/C-Loftus/QuickPiperAudiobook