Free Text-to-Speech App with natural voices

C-Loftus · on Aug 23, 2024

I personally love using Piper to make audiobooks offline. I made a CLI wrapper over it and it works great for me [0]. Honestly, I'm surprised more people aren't using similar offline tools. I have listened to dozens of audiobooks in the past year this way.

There is a pitfall in thinking that the most natural voice is the most important metric (e.g. many blind users still prefer espeak). Piper has a great balance of natural voice, offline convenience/privacy, and intelligibility at high speeds. I still have not seen anything better as an overall audiobook solution.

[0] https://github.com/C-Loftus/QuickPiperAudiobook
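
For anyone curious what a thin wrapper over Piper might look like, here is a minimal Node.js sketch. It is not how QuickPiperAudiobook works; it only assumes a `piper` binary on the PATH that reads text from stdin and accepts `--model` and `--output_file`. The helper names (`chunkText`, `piperArgs`, `synthesize`) and any model or output paths you pass in are hypothetical.

```javascript
// Sketch only: chunk a long text and pipe each chunk to the Piper CLI.
// Assumes `piper` is installed and a voice model file has been downloaded.
const { spawn } = require("node:child_process");

// Split text into sentence-aligned chunks of roughly maxChars characters.
// A single sentence longer than maxChars is kept whole.
function chunkText(text, maxChars = 500) {
  const normalized = text.replace(/\s+/g, " ").trim();
  const sentences = normalized.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (current && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Build the Piper argument list for one output file.
function piperArgs(modelPath, outputPath) {
  return ["--model", modelPath, "--output_file", outputPath];
}

// Synthesize one chunk by writing it to Piper's stdin.
function synthesize(text, modelPath, outputPath) {
  return new Promise((resolve, reject) => {
    const proc = spawn("piper", piperArgs(modelPath, outputPath), {
      stdio: ["pipe", "ignore", "inherit"],
    });
    proc.on("error", reject);
    proc.on("close", (code) =>
      code === 0 ? resolve(outputPath) : reject(new Error(`piper exited with ${code}`))
    );
    proc.stdin.end(text);
  });
}
```

Chunking by sentence keeps each WAV file small and lets a failed chunk be retried on its own; joining the resulting files into one audiobook is left out of the sketch.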

_ddzr · on Aug 23, 2024

I am excited to try it! Could you add a prebuilt release? Would it be possible to post a sample output so people can judge for themselves whether they would like the quality available?

C-Loftus · on Aug 24, 2024

Yes I do have prebuilt rolling releases. Since they are categorized as draft releases, GitHub hides them by default, but they are there and should work fine: https://github.com/C-Loftus/QuickPiperAudiobook/releases/

Examples of Piper in general can be found at https://rhasspy.github.io/piper-samples/ , and examples from my program can be found at https://github.com/C-Loftus/QuickPiperAudiobook/tree/master/...

I added this info in the readme for you and anyone else that could benefit from it.

_ddzr · on Aug 25, 2024

I could not find anything on the releases page (https://github.com/C-Loftus/QuickPiperAudiobook/releases/)

This is what I see: https://imgur.com/screenshot-PqmyqhT

Do I need a GitHub account to see the draft releases?

C-Loftus · on Aug 26, 2024

Sorry, apparently draft releases are private. I didn't realize that. Added an auto release. Here it is: https://github.com/C-Loftus/QuickPiperAudiobook/releases/tag...

_ddzr · on Aug 26, 2024

Thank you so much!!!

nilsherzig · on Aug 24, 2024

Are your audiobooks mainly non-fiction books? I would really miss the personality a good speaker can give each character.

C-Loftus · on Aug 24, 2024

Yes, I mainly read non-fiction and I think your point is valid. That being said, I do listen to a fair number of books that include interviews or travel writing with multiple characters and it still works well for me.

andrewstuart · on Aug 23, 2024

Eleven Labs is too expensive.

They’re missing the chance to be cheap and sweep as many developers as possible into their net. OpenAI’s genius is not only being first with a truly great product but being so cheap it’s the easy and obvious choice for developers.

Something better will come along.

tkgally · on Aug 23, 2024

Eleven Labs announced a price reduction yesterday:

https://www.youtube.com/watch?v=wevDlfDIG9s

It still seems a bit expensive to me, though....

_ddzr · on Aug 23, 2024

I have found VoiceGen for Linux Mint to be very good (https://linux.softpedia.com/get/Utilities/VoiceGen-104295.sh...). It is available for download through the software manager.

Also, this bookmarklet will speak highlighted text in the browser regardless of platform:

javascript:void function(){ javascript:(function(){ var selection = window.getSelection().toString(); if (!selection) { alert("Please select some text on the page."); return; } var encodedSelection = document.createElement("div"); encodedSelection.textContent = selection; var processedContent = encodedSelection.innerHTML.replace(/\n/g, " "); var words = processedContent.split(" "); var formattedText = ""; var speechContent = ""; for (var i = 0; i < words.length; i++) { var word = words[i]; var chunkSize = Math.floor(word.length / 3) + 1; var boldPart = "" + word.substring(0, chunkSize) + ""; var lightPart = "" + word.substring(chunkSize, word.length) + ""; var formattedWord = boldPart + lightPart; if (word.endsWith(".")) { formattedWord += " *"; } formattedText += formattedWord + " "; speechContent += word + " "; } var newWindow = window.open("", "_blank"); newWindow.document.write("<html><head><title>Spoken Content</title></head><body><input type='range' min='0.1' max='10' value='1' step='0.1' id='rate-slider'>" + formattedText + "</body></html>"); var rateSlider = newWindow.document.getElementById("rate-slider"); var utterance = new SpeechSynthesisUtterance(speechContent); rateSlider.addEventListener("input", function() { utterance.rate = rateSlider.value; window.speechSynthesis.cancel(); window.speechSynthesis.speak(utterance); }); window.speechSynthesis.speak(utterance); })();}();

pogue · on Aug 23, 2024

I saved this to pastebin (mostly to see the syntax). Does this require Voicegen to be installed or how's it working? https://pastebin.com/zuRVpiVh

ortsa · on Aug 23, 2024

I believe it's using the web speech API https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...

I've played with it a bit (I made a local nonsense social media thing, reading LLM content), and it's surprisingly easy to get text to display in sync with the audio. It works with voices that are installed on the user's system.
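
One way to do that kind of syncing, as a rough sketch rather than the code from that project: listen for the utterance's `boundary` events and highlight the word starting at `event.charIndex`. The helper names `wordRangeAt` and `speakWithHighlight` are made up for illustration, and not every browser or voice fires boundary events, so treat the highlighting as best-effort.

```javascript
// Sketch: highlight the word currently being spoken via Web Speech API
// boundary events. Helper names are hypothetical.

// Return the [start, end) range of the word that begins at charIndex.
function wordRangeAt(text, charIndex) {
  const match = text.slice(charIndex).match(/^\S+/);
  const end = charIndex + (match ? match[0].length : 0);
  return [charIndex, end];
}

// Speak `text` and mirror progress into `element` (any DOM element).
function speakWithHighlight(text, element) {
  const utterance = new SpeechSynthesisUtterance(text);

  // Voices come from what is installed on the system; the list can be
  // empty until the browser fires `voiceschanged`, so only set one if present.
  const voices = window.speechSynthesis.getVoices();
  if (voices.length > 0) utterance.voice = voices[0];

  utterance.onboundary = (event) => {
    if (event.name !== "word") return; // skip sentence boundaries
    const [start, end] = wordRangeAt(text, event.charIndex);
    const mark = document.createElement("mark");
    mark.textContent = text.slice(start, end);
    element.replaceChildren(
      document.createTextNode(text.slice(0, start)),
      mark,
      document.createTextNode(text.slice(end))
    );
  };

  // Restore the plain text once speech finishes.
  utterance.onend = () => {
    element.textContent = text;
  };

  window.speechSynthesis.speak(utterance);
}
```

In a page, this could be called as `speakWithHighlight(someText, document.getElementById("reader"))`, where `reader` is a placeholder id.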

_ddzr · on Aug 23, 2024

The bookmarklet just uses the Web Speech API, so VoiceGen does not need to be installed for it. VoiceGen itself can generate (on my system) WAV files of about 1 hour in length with good speech quality using PicoTTS.

radicalriddler · on Aug 23, 2024

I've used this, and the main ElevenLabs service. The reader has to be using a very, very cheap model; it sounds worse than the Azure service (which isn't bad).

It's better than traditional text to speech, but I can't use it to listen to long form articles.

pogue · on Aug 23, 2024

I don't believe Azure's TTS is free anymore.

radicalriddler · on Aug 25, 2024

Ah, it's definitely not free (though it is significantly cheaper than ElevenLabs). I was expecting ElevenLabs to be better because I hadn't read or seen anything suggesting it was a worse model, but it makes sense.

dakial1 · on Aug 23, 2024

I haven't tested the app, but the main service is pretty close to natural speech. If the 11labs app gets to the main service's quality and starts accepting ePubs, it will be the death of Audible.

IPTN · on Aug 23, 2024

> If the 11labs app gets to the main service's quality and starts accepting ePubs, it will be the death of Audible.

The app from this announcement does accept epubs. I just tested a couple and had no issues. I haven't used 11labs before, but the quality was good, and an English voice had no major issues even with some spot-checked fantasy names/terms or chemical names.

signaru · on Aug 23, 2024

I believe Audible is already using non-human speakers as well and just making up names for the narrators. I can tell because pronunciations are sometimes wrong in ways that sound like TTS mistakes.

dmd · on Aug 23, 2024

You must be listening to some reaaaaly bad readings.

dotinvoke · on Aug 23, 2024

As a long-time Audible subscriber, I’d estimate that I skip 1 in 3 books because of the narrator’s voice or bad recording setup.

dmd · on Aug 23, 2024

Ah - I have probably listened to maybe 8-10 audiobooks, all of which were recommended to me as excellent. I guess I just assumed that that’s what audiobooks are like, which I guess doesn’t make sense.

mylastattempt · on Aug 23, 2024

My main reasons for returning or not listening to audiobooks, be it with Audible or competitors, have been either a) a whiny or nasal voice, or b) a strong accent that I dislike (e.g. a strong British or Australian accent).

I have ended up finding more new books by checking which other books a specific narrator has voiced, rather than finding a book I want to listen to and hoping for a good narrator.

/rant i guess

_aavaa_ · on Aug 23, 2024

Death is a bit extreme. I think acquisition is much more likely.

zuhsetaqi · on Aug 23, 2024

Account needed to use the app

simfree · on Aug 23, 2024

How does it compare to app.lmnt.com? lmnt sounds quite good to me, near natural.

NayamAmarshe · on Aug 23, 2024

That Ava voice is amazing!

roshankhan28 · on Aug 23, 2024

The quality is somewhere around 6/10. It still sounds robotic, like Alexa, but it's free, so it works. I have heard the same audio in YouTube videos as well.