VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Marko Vidrih
4 min read · Jan 16, 2023

https://www.niftify.io/
https://creatus.ai/

Model Overview

VALL-E Model overview. Image source: Microsoft

Advantages

Use Cases

Conclusion

