Yes. With VideoSDK's Real-Time AI Agents you can control the tone of the TTS output in two ways: through prompt engineering, if your TTS provider supports it (ElevenLabs, for example), or by integrating a custom model that supports tonal control directly. Our modular pipeline architecture makes it easy to plug in a provider such as ElevenLabs and pass tone/style prompts dynamically, per utterance.
So if you're building AI companions and want them to sound calm, excited, empathetic, etc., you can absolutely prompt for those tones in real time, or even switch voices or tones mid-conversation based on context or user emotion.
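To make the per-utterance idea concrete, here is a minimal Python sketch of the selection logic only. It does not call the VideoSDK or ElevenLabs APIs. Every name in it (`TONE_BY_EMOTION`, `pick_tone`, `build_utterance`, the `style_prompt` field) is hypothetical and for illustration, and it assumes you already have some upstream signal for the user's emotion.

```python
from dataclasses import dataclass

# Hypothetical mapping from a detected user emotion to the tone the
# companion should respond with. Adjust to your own product's needs.
TONE_BY_EMOTION = {
    "sad": "empathetic",
    "angry": "calm",
    "happy": "excited",
}
DEFAULT_TONE = "calm"


@dataclass
class Utterance:
    """One unit of speech plus the tone instruction that travels with it."""
    text: str
    tone: str
    style_prompt: str


def pick_tone(user_emotion: str) -> str:
    """Return the response tone for an emotion label, falling back to a default."""
    return TONE_BY_EMOTION.get(user_emotion.lower(), DEFAULT_TONE)


def build_utterance(text: str, user_emotion: str) -> Utterance:
    """Attach a tone and a plain-language style prompt to a single utterance."""
    tone = pick_tone(user_emotion)
    return Utterance(
        text=text,
        tone=tone,
        style_prompt=f"Use this speaking tone: {tone}.",
    )


if __name__ == "__main__":
    # The tone can change from one turn to the next as the emotion changes.
    for emotion, reply in [("happy", "That's great news!"), ("sad", "I'm here for you.")]:
        print(build_utterance(reply, emotion))
```

In a real pipeline, the `style_prompt` (or an equivalent setting) would be handed to whichever TTS provider you plug in. How that is passed depends on the provider, so treat that step as something to check against its documentation.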
Let us know what you're building. Happy to dive deeper into tone control setups or help debug a specific flow!
We are building AI companions, so tone prompting would be great.