
Nari Labs' Dia: A New AI Voice Model Competes with NotebookLM
The field of synthetic speech is growing rapidly, with numerous players vying for dominance. Among the latest entrants is Dia, an AI voice model developed by Nari Labs, a startup founded by two undergraduates. Inspired by Google's NotebookLM, Dia aims to give users more control over generated voices and more flexibility in the script.
Toby Kim, one of the co-founders, said the pair had begun learning about speech AI only three months earlier. Using Google's TPU Research Cloud program, which gives researchers free access to Google's TPU AI chips, they trained Dia, a 1.6-billion-parameter model that generates dialogue from a script. Users can adjust each speaker's tone and insert nonverbal cues such as coughs and laughs.
Accessibility and Functionality
Dia is available on Hugging Face and GitHub, and it can run on most modern PCs with at least 10 GB of VRAM. By default it generates a random voice, but users can steer it with a description of the intended style or have it clone a specific person's voice.
In TechCrunch's early tests, Dia worked well, readily generating two-way conversations on a range of topics. The voice quality is competitive with existing tools, and the voice cloning feature stands out for its ease of use.
Ethical Considerations
Like many voice generators, Dia offers little in the way of safeguards against misuse, which raises concerns that it could be used to create disinformation or scam recordings. Nari Labs acknowledges these risks and discourages harmful use, but disclaims responsibility for misuse. Nari has also not disclosed the data used to train Dia, leaving open whether it includes copyrighted material. Training on such content is common in AI development, but its legality remains unsettled.
Future Plans
Nari Labs envisions building a synthetic voice platform with social features on top of Dia and future, larger models. They also plan to release a technical report and expand language support beyond English.
Source: TechCrunch