Free Text-to-Speech App with natural voices (elevenlabs.io)
38 points by jslakro on Aug 22, 2024 | 29 comments


I personally love using Piper to make audiobooks offline. I made a CLI wrapper over it and it works great for me [0]. Honestly surprised more people aren't using similar offline tools. I have listened to dozens of audiobooks in the past year through this.

There is a pitfall in thinking that the most natural voice is the most important metric (e.g. many blind users still prefer espeak). Piper has a great balance of natural voice, offline convenience/privacy, and intelligibility at high speeds. Still have not seen anything better as an overall audiobook solution.

[0] https://github.com/C-Loftus/QuickPiperAudiobook
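
For anyone wondering what driving Piper from a script can look like, here is a minimal Node.js sketch that pipes text into the `piper` binary. It is not taken from QuickPiperAudiobook. The function names, the voice file name and the output path are placeholders, and it assumes `piper` is on your PATH and a voice model has already been downloaded. The `--model` and `--output_file` flags are the ones shown in Piper's README.

```javascript
// Build the argument list for one Piper run.
// modelPath and outputPath are placeholders supplied by the caller.
function buildPiperArgs(modelPath, outputPath) {
  return ["--model", modelPath, "--output_file", outputPath];
}

// Speak `text` into a WAV file by piping it to Piper's stdin.
// Assumes a `piper` executable is available on PATH.
async function synthesize(text, modelPath, outputPath) {
  const { spawn } = await import("node:child_process");
  return new Promise((resolve, reject) => {
    const child = spawn("piper", buildPiperArgs(modelPath, outputPath), {
      stdio: ["pipe", "ignore", "inherit"],
    });
    child.on("error", reject); // e.g. piper is not installed
    child.stdin.on("error", reject); // e.g. the process went away early
    child.on("close", (code) =>
      code === 0
        ? resolve(outputPath)
        : reject(new Error("piper exited with code " + code))
    );
    child.stdin.end(text); // Piper reads the text to speak from stdin
  });
}

// Example call (file names are illustrative only):
// synthesize("Hello from Piper.", "en_US-lessac-medium.onnx", "hello.wav")
//   .then((file) => console.log("wrote", file));
```

A real audiobook tool would also need to extract text from the book and probably split it into chunks; none of that is shown here.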


I am excited to try it! Could you add a prebuilt release? Would it be possible to post a sample output so people can judge for themselves whether they would like the quality available?


Yes I do have prebuilt rolling releases. Since they are categorized as draft releases, GitHub hides them by default, but they are there and should work fine: https://github.com/C-Loftus/QuickPiperAudiobook/releases/

Examples of Piper generally can be found at https://rhasspy.github.io/piper-samples/ , and examples from my program can be found at https://github.com/C-Loftus/QuickPiperAudiobook/tree/master/...

I added this info in the readme for you and anyone else that could benefit from it.


I could not find anything on the releases page (https://github.com/C-Loftus/QuickPiperAudiobook/releases/)

This is what I see: https://imgur.com/screenshot-PqmyqhT

Do I need a GitHub account to see the draft releases?


Sorry, apparently draft releases are private. I didn't realize that. Added an auto release. Here it is: https://github.com/C-Loftus/QuickPiperAudiobook/releases/tag...


Thank you so much!!!


Are your audiobooks mainly non-fiction books? I would really miss the personality a good speaker can give each character.


Yes I mainly read non-fiction and I think your point is valid. That being said, I do listen to a fair amount of books that include interviews or travel writing with multiple characters and it still works well for me.


Eleven Labs is too expensive.

They’re missing the chance to be cheap and sweep as many developers as possible into their net. OpenAI’s genius is not only being first with a truly great product but being so cheap it’s the easy and obvious choice for developers.

Something better will come along.


Eleven Labs announced a price reduction yesterday:

https://www.youtube.com/watch?v=wevDlfDIG9s

It still seems a bit expensive to me, though....


I have found VoiceGen for Linux Mint to be very good (https://linux.softpedia.com/get/Utilities/VoiceGen-104295.sh...). It is available for download through the software manager.

Also, this bookmarklet will speak highlighted text in the browser regardless of platform:

javascript:void function(){ javascript:(function(){ var selection = window.getSelection().toString(); if (!selection) { alert("Please select some text on the page."); return; } var encodedSelection = document.createElement("div"); encodedSelection.textContent = selection; var processedContent = encodedSelection.innerHTML.replace(/\n/g, " <br></br> "); var words = processedContent.split(" "); var formattedText = ""; var speechContent = ""; for (var i = 0; i < words.length; i++) { var word = words[i]; var chunkSize = Math.floor(word.length / 3) + 1; var boldPart = "<span style='font-weight:bolder'>" + word.substring(0, chunkSize) + "</span>"; var lightPart = "<span style='font-weight:lighter'>" + word.substring(chunkSize, word.length) + "</span>"; var formattedWord = boldPart + lightPart; if (word.endsWith(".")) { formattedWord += "<span style='color:red'> *</span>"; } formattedText += formattedWord + " "; speechContent += word + " "; } var newWindow = window.open("", "_blank"); newWindow.document.write("<html><head><title>Spoken Content</title></head><body><input type='range' min='0.1' max='10' value='1' step='0.1' id='rate-slider'><p id='content' style='background-color:#EDD1B0;font-size:40;line-height:200%25;font-family:Arial'>" + formattedText + "</p></body></html>"); var rateSlider = newWindow.document.getElementById("rate-slider"); var utterance = new SpeechSynthesisUtterance(speechContent); rateSlider.addEventListener("input", function() { utterance.rate = rateSlider.value; window.speechSynthesis.cancel(); window.speechSynthesis.speak(utterance); }); window.speechSynthesis.speak(utterance); })();}();
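
Since the one-liner is hard to read, here is the word-formatting step pulled out as a small sketch. The function names `emphasizeWord` and `emphasizeText` are mine, not part of the bookmarklet, and the sketch assumes the input has already been HTML-escaped (the bookmarklet does that through a temporary `div`). It covers only the bold/light split and the sentence marker, not the popup window or the speech part.

```javascript
// Wrap roughly the first third of a word in a bold span and the rest in a
// light span, using the bookmarklet's rule: chunkSize = floor(length / 3) + 1.
// Assumes `word` is already HTML-escaped.
function emphasizeWord(word) {
  const chunkSize = Math.floor(word.length / 3) + 1;
  const bold =
    "<span style='font-weight:bolder'>" + word.substring(0, chunkSize) + "</span>";
  const light =
    "<span style='font-weight:lighter'>" + word.substring(chunkSize) + "</span>";
  // Words ending in a period get a red asterisk, as in the bookmarklet.
  const marker = word.endsWith(".") ? "<span style='color:red'> *</span>" : "";
  return bold + light + marker;
}

// Apply the formatting to every space-separated word of a text.
function emphasizeText(text) {
  return text.split(" ").map(emphasizeWord).join(" ");
}
```

For a 7-letter word the bold part is the first 3 letters; for a 4-letter word it is the first 2.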


I saved this to pastebin (mostly to see the syntax). Does this require Voicegen to be installed or how's it working? https://pastebin.com/zuRVpiVh


I believe it's using the web speech API https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...

I've played with it a bit (I made a local nonsense social media thing, reading LLM content), and it's surprisingly easy to get text to display in sync with the audio. It works with voices that are installed on the user's system.
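
One way to do the syncing, as a rough sketch rather than the code I actually used: the Web Speech API fires `boundary` events on a `SpeechSynthesisUtterance`, and each event carries a `charIndex` into the spoken text. The helper names `wordRangeAt` and `speakWithHighlight` and the `onWord` callback are made up for this example. Whether boundary events fire at all depends on the browser and the voice, so treat this as best effort.

```javascript
// Given the full text and the character index from a `boundary` event,
// return the start and end offsets of the word beginning at that index.
function wordRangeAt(text, charIndex) {
  let end = charIndex;
  while (end < text.length && !/\s/.test(text[end])) {
    end++;
  }
  return { start: charIndex, end: end };
}

// Browser-only: speak `text` and call onWord(start, end) as each word starts,
// so the caller can highlight text.slice(start, end) on the page.
function speakWithHighlight(text, onWord) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.onboundary = function (event) {
    if (event.name === "word") {
      const range = wordRangeAt(text, event.charIndex);
      onWord(range.start, range.end);
    }
  };
  window.speechSynthesis.speak(utterance);
  return utterance;
}
```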


The bookmarklet just uses the web speech API. Voicegen does not need to be installed and can generate (on my system) wav files of about 1 hour in length with good speech quality using PicoTTS.


I've used this, and the main ElevenLabs service. The reader has to be using a very, very cheap model; it sounds worse than the Azure service (which isn't bad).

It's better than traditional text to speech, but I can't use it to listen to long form articles.


I don't believe Azure's TTS is free anymore.


Ah, it's definitely not free (though it is significantly cheaper than ElevenLabs). I was expecting the ElevenLabs reader to be better because I hadn't read or seen anything suggesting it was a worse model, but it makes sense.


I haven't tested the app, but the main service sounds pretty close to natural. If the 11labs app gets to the main service's quality and starts accepting ePubs, it will be the death of Audible.


> If the 11labs app gets to the main service's quality and starts accepting ePubs, it will be the death of Audible.

The app from this announcement does accept epubs. Just tested a couple and had no issues. Haven't used 11labs before, but the quality was good and didn't have any major issues with an English voice, even with some spot-checked fantasy names/terms or chemical names.


I believe Audible is already using non-human speakers as well and just making up names for the narrators. I can tell since sometimes pronunciations are wrong and sound similar to TTS mistakes.


You must be listening to some reaaaaly bad readings.


As a long-time Audible subscriber, I’d estimate that I skip 1 in 3 books because of the narrator’s voice or bad recording setup.


Ah - I have probably listened to maybe 8-10 audiobooks and all ones that have been recommended to me as excellent. I guess I just assumed that that’s what audiobooks are like, which I guess doesn’t make sense.


My main reason for returning or not listening to audiobooks, be it with Audible or competitors, has been either a) a whiny or nasal voice, or b) a strong accent that I dislike (e.g. strong British or Australian accent).

I have found myself finding more new books by checking which other books a specific narrator has voiced, rather than finding a book I want to listen to and hoping for a good narrator.

/rant I guess


Death is a bit extreme. I think acquisition is much more likely.


Account needed to use the app


How does it compare to app.lmnt.com ? lmnt sounds quite good to me, near natural.


That Ava voice is amazing!


The quality is somewhere around 6/10. It still sounds robotic, like Alexa, but it's free so it works. I have heard the same audio in YouTube videos as well.



