We ❤️ Open Source
A community education resource
How I used Pinokio to run OpenAudio and clone a voice in seconds
This open source combo turns your laptop into a text-to-speech, AI voice-cloning lab.
Are you a scientist, a developer, or just a tinkerer like me, fascinated by the power of AI to generate and clone human voices for your projects? Then OpenAudio might be exactly what you’re looking for.
Thanks to Pinokio, it’s easy to download and install OpenAudio directly on your computer. In this brief walkthrough, I’ll show you how I set things up on an M3 MacBook Air with 16 GB of RAM, and just how quickly you can get started with AI-generated speech.
Pinokio is a browser that lets you install, run, and automate AI models on your local machine. Once you’ve installed it, click the Discover button in the top-right corner of the app and search for OpenAudio. It’s usually one of the first options listed under the Apps section.
Pinokio is open source under the MIT license. OpenAudio, which was recently rebranded from Fish Speech, releases its code under Apache 2.0.

The OpenAudio project has 77 contributors and describes itself as follows:
“We are incredibly excited to unveil OpenAudio S1, a cutting-edge text-to-speech (TTS) model that redefines the boundaries of voice generation. Trained on an extensive dataset of over 2 million hours of audio, OpenAudio S1 delivers unparalleled naturalness, expressiveness, and instruction-following capabilities.”
Installing the model via Pinokio was fast and straightforward. Once it’s set up, you can start generating AI-powered speech right away. (Performance may vary depending on your system specs.)

After launching OpenAudio, you’re greeted with an intuitive interface.

In my case, I entered just four lines of text, and generation took 77 seconds. The result was 8 seconds of audio in a 684 KB WAV file. A Download button is located at the top-right of the playback window.
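To put those numbers in perspective, here is a minimal Python sketch that works out how much compute time each second of speech cost and what file size to expect. It assumes the output is uncompressed 16-bit mono PCM at 44.1 kHz, which the interface does not state, and the function names are hypothetical helpers written for this illustration, not part of OpenAudio or Pinokio.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute spent per second of audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def pcm_wav_bytes(
    audio_seconds: float,
    sample_rate: int = 44_100,   # assumption: 44.1 kHz output
    channels: int = 1,           # assumption: mono
    bytes_per_sample: int = 2,   # assumption: 16-bit samples
) -> int:
    """Approximate size of an uncompressed PCM WAV file, including a 44-byte header."""
    return 44 + round(audio_seconds * sample_rate * channels * bytes_per_sample)


if __name__ == "__main__":
    rtf = real_time_factor(77, 8)
    size = pcm_wav_bytes(8)
    print(f"real-time factor: {rtf:.1f}x")
    print(f"estimated size: {size / 1024:.0f} KiB")
```

With the figures above, that is roughly 9.6 seconds of compute per second of speech. The estimate of about 689 KiB is close to the 684 KB file, which fits the assumed format.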
Give it a listen and judge for yourself.
In addition to standard text-to-speech, OpenAudio also supports voice cloning. You can upload a sample, ideally 5 to 10 seconds of reference audio, to generate a synthetic voice that sounds like you. Controls for cloning and fine-tuning appear in a dialog box at the bottom-left of the interface.
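If you want to check a reference clip before uploading it, the following Python sketch measures a WAV file's length using only the standard library and tests it against the 5 to 10 second range mentioned above. It assumes the clip is an uncompressed PCM WAV file. The function names and the silent demo clip are hypothetical and exist only for this illustration; they are not part of OpenAudio.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference(path: str, min_seconds: float = 5.0, max_seconds: float = 10.0) -> bool:
    """True if the clip falls inside the suggested reference length range."""
    return min_seconds <= wav_duration_seconds(path) <= max_seconds


def write_silence(path: str, seconds: float, sample_rate: int = 16_000) -> None:
    """Write a silent 16-bit mono WAV; used here only to demo the check."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        clip = os.path.join(tmp, "reference.wav")
        write_silence(clip, 7)
        print(wav_duration_seconds(clip), is_good_reference(clip))
```

To test a real recording, pass its path to `is_good_reference` instead of the generated demo clip.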
The OpenAudio model weights are released under Creative Commons BY-NC-SA 4.0, which limits them to non-commercial use. The team also includes a helpful legal note:
“We do not hold any responsibility for any illegal usage of the codebase. Please refer to your local laws about DMCA and other related laws.”
Under the hood, OpenAudio is built on VQ-GAN and LLaMA and developed by the Fish Audio team. Source code and models are publicly available, and the project stays active on Discord. Visit the OpenAudio blog for the latest research and updates.
Have some fun experimenting with OpenAudio and Pinokio on your machine. Leverage the power of open source and AI in your own projects, and if you’re inspired, consider contributing to the community.
This article is adapted from “AI voice generation made easy with Pinokio and OpenAudio” by Don Watkins, and is republished with permission from the author.
The opinions expressed on this website are those of each author, not of the author's employer or All Things Open/We Love Open Source.