GitHub - KarsonAlford/Sagen: Free, open-source TTS engine for .NET

Sagen (German for "to say") is my attempt at making a text-to-speech engine aimed at .NET developers who don't have thousands of dollars at their disposal to license a commercial speech synthesis solution. In many ways, it is an experiment and continual learning experience for me, as I am in no way well-versed in speech science, phonetics, or vocal acoustics; I simply want to see how far I can go with original research and lots of patience.

Rationale

Some might ask me why I'd bother. After all, there are tons of TTS engines out there already. In short, I don't feel like people have enough options.

Aside from often being prohibitively expensive, commercial TTS systems are frequently restrictive in the customizations they offer for voices, voice parameters, and context-sensitive vocal qualities (e.g. intonation, stress, and timbre). Such qualities are necessary to convey meaning in speech.

Concatenative synthesizers, as well as other similar "realistic" TTS technologies, tend to be CPU-heavy, leave a large memory footprint, and require each voice to be installed separately. Because they are based on databases of recorded speech samples, they are not very customizable at all.

There are also many free speech synthesizers, but they often have sparse, confusing, or convoluted documentation, or are locked down to one specific programming language (e.g. Java). While all TTS libraries have advantages and disadvantages, I feel like the .NET crowd would welcome a TTS solution made specifically for them.

My goal with Sagen is not necessarily to produce "something better", but to instead offer a user-friendly TTS engine with a respectable amount of configurability, flexibility, and performance. The best part? It's free.

What's planned

Here is a short list of major features that will be supported:

Text-to-speech based on formant synthesis and physically-based vocal filtering
Plentiful parameters for tuning how voices sound (age, sex, vocal force, hoarseness, etc.)
Support for direct playback, WAV exporting, and sending audio data via System.IO.Stream
Multiple options for sample format and rate (export only)
Support for X-SAMPA-based pronunciation lexicons
Multiple language support (English and German are currently prioritized)
Heteronym resolution
Singing?!

It is currently a heavy work-in-progress, and I welcome your input and/or contributions.

Licensing

This project is made available under the MIT License and is completely free for anyone to use, for any purpose, without the burdens of licensing costs or royalties.

Sagen.Languages.German
Sagen.Languages.USEnglish
Sagen.Playback.OpenAL
Sagen.Playback.XAudio2
Sagen
SagenConsole
.gitattributes
.gitignore
LICENSE
README.md
Rebracer.xml
Sagen.sln
