If you’ve spent any time on TikTok, Instagram Reels, or YouTube Shorts lately, you’ve probably noticed something: AI voiceovers are everywhere. They’re “human enough” that most people don’t stop to question them.

Tools like ElevenLabs offer big libraries of ready-made voices in different accents and languages, and creators are using them to narrate everything from fact videos to mini-docs.

In this newsletter I want to run through something that isn’t quite so normalised: voice cloning - making a replica of your own voice and using it in production. I’ve been experimenting with this for 2 years now, but only recently had enough demand to justify the time and cost of making a new one. I can say with certainty that those costs were justified.

My latest YouTube video runs through why voice cloning is useful, and how to do it.

Library AI voices vs. cloned voices

Let’s separate two things that often get blurred together:

  • Library voices are prebuilt “AI people.” They sound real, but they’re not you. Great for quick narration and character inserts.

  • Voice clones are trained on your recordings, so they mimic your tone, cadence, and vocal quirks.

Both are useful. But cloning has a few specific use cases that make it worth talking about on its own.

The big question: should you even clone your voice?

Voice cloning sits in a weird cultural and ethical zone. Some people find it creepy - the uncanny valley effect where something feels almost human but not quite. Others see it as a natural extension of digital production.

There are also hard legal and professional realities: a person’s voice is part of their identity and often their intellectual property. In TV, podcasting, and other talent-based industries, contracts frequently forbid cloning a presenter’s voice without explicit permission, and usually compensation. So the rule is simple: never clone someone else without them fully understanding and approving it.

Where cloning gets genuinely effective is when a creator chooses to clone their own voice as a tool to speed up their workflow.

Production use cases

1. Fixes and last-minute additions

Anyone who edits knows the moment: you’ve finished the cut, then realize you need a missing line, a correction, or a cleaner take. Normally that means re-recording, coordinating schedules, setting up mics again, and hoping the tone matches.

With a voice clone, you just type the new line into ElevenLabs and drop it into your timeline. This can make an annoying fix with many moving parts relatively painless.
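If you’d rather script these pickup lines than paste them into the web app, here is a minimal sketch. It assumes ElevenLabs’ public REST text-to-speech endpoint, the `xi-api-key` header, and the `eleven_multilingual_v2` model ID, so check all three against the current API docs before relying on it. The function names, voice ID, and file name are placeholders of my own.

```python
import json
import urllib.request

# Assumption: ElevenLabs' REST text-to-speech endpoint. Verify against current docs.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text: str, voice_id: str, api_key: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Build (but do not send) a POST request asking for `text` in the given voice."""
    payload = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{voice_id}",
        data=payload,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def save_pickup_line(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to disk."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)


# Example usage (needs network access and a valid key, so it is left commented out):
# req = build_tts_request("The bridge opened in 1932.", "YOUR_VOICE_ID", "YOUR_API_KEY")
# save_pickup_line(req, "pickup_01.mp3")
```

Building the request separately from sending it means you can inspect exactly what will be sent before you spend any credits.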

2. Fast timeline visualization (the real “hack”)

This is the one that changed my process.

My videos lean heavily on images, footage, and documentary-style pacing. If I use a clone, I can take my script and generate a full narration track immediately, then start building the edit around it - without filming or recording anything first.

That means:

  1. Script finished → clone narration generated

  2. Timeline blocked out visually

  3. Only record the on-camera pieces or emphasis lines later
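To get a feel for the timeline before generating any audio, you can estimate where each paragraph of the script will land. This is a rough sketch, not part of my actual tooling: the 150 words-per-minute pace is an assumption (measure your own), and `Block` and `block_out` are hypothetical names.

```python
from dataclasses import dataclass

WORDS_PER_MINUTE = 150  # assumed narration pace; measure your own delivery


@dataclass
class Block:
    index: int
    start_s: float      # estimated start time on the timeline, in seconds
    duration_s: float   # estimated narration length, in seconds
    text: str


def block_out(script: str, wpm: int = WORDS_PER_MINUTE) -> list[Block]:
    """Split a script on blank lines and estimate each paragraph's timeline slot."""
    blocks = []
    cursor = 0.0
    paragraphs = [p.strip() for p in script.split("\n\n") if p.strip()]
    for i, para in enumerate(paragraphs, start=1):
        duration = len(para.split()) / wpm * 60
        blocks.append(Block(i, cursor, duration, para))
        cursor += duration
    return blocks
```

Each block gives you a start time and length to drop placeholder visuals against; the real clone narration will differ a little, so treat the numbers as a first pass.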

It massively reduces the time spent recording and also cuts down narration editing in post. In my own workflow, editing time in Premiere for the presenting sections dropped from about 2 hours to 15 minutes, and recording time fell from ~50 minutes to ~15 minutes. That’s roughly 2 hours 20 minutes saved per video - which compounds into weeks over a year.
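As a quick check on those figures, the arithmetic looks like this. The per-video numbers are the ones quoted above; the one-video-per-week schedule is purely an assumption for illustration, so swap in your own cadence.

```python
# Figures quoted above, in minutes per video.
editing_before, editing_after = 120, 15
recording_before, recording_after = 50, 15

saved_per_video = (editing_before - editing_after) + (recording_before - recording_after)

# Assumption: one video per week. Change this to match your own schedule.
videos_per_year = 52
saved_per_year_hours = saved_per_video * videos_per_year / 60

print(f"Saved per video: {saved_per_video} min ({saved_per_video / 60:.1f} h)")  # 140 min (2.3 h)
print(f"Saved per year:  {saved_per_year_hours:.0f} h")                          # 121 h
```

At that assumed cadence, the saving comes to about three 40-hour working weeks a year.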

How to get a high-quality clone

ElevenLabs recommends at least one hour of clean audio, ideally closer to three hours, for professional cloning. You can get usable results with less, but quality scales with the amount and consistency of the input.

Key prep rules:

  • One speaker only. If other voices appear in your samples, they can bleed into the clone.

  • Consistent mic + settings. Different microphones or rooms can create tonal drift. I excluded older audio recorded on a lower-quality Bluetooth mic and only used tracks from my shotgun mic sessions for consistency.

  • Clean audio. Remove background noise or obvious mistakes if you can. ElevenLabs has cleanup tools, but the better your source, the better your clone.

If you don’t already have an audio archive, the fallback is simple: record 1–2 hours of clean reading with a decent mic in a quiet room. It can be a novel, articles, anything — you just want steady, natural speech.
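Before uploading, it can help to check how much audio you actually have and whether it was all recorded in the same format. Here is a small sketch using Python’s standard `wave` module. It assumes your samples are PCM `.wav` files in one folder, and `audit_samples` is a hypothetical helper name. It only checks duration and file format; it can’t detect a second speaker or background noise.

```python
import wave
from pathlib import Path


def audit_samples(folder: str) -> dict:
    """Sum the duration of the WAV files in a folder and note their formats."""
    total_seconds = 0.0
    formats = set()
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            total_seconds += wav.getnframes() / wav.getframerate()
            # (sample rate, channel count, bytes per sample)
            formats.add((wav.getframerate(), wav.getnchannels(), wav.getsampwidth()))
    return {
        "hours": total_seconds / 3600,
        "formats": formats,
        "consistent": len(formats) <= 1,
    }
```

If `consistent` comes back `False`, the files were exported with different sample rates, channel counts, or bit depths, which is often a hint that they came from different recording setups.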

The cloning process in ElevenLabs (high level)

  1. Go to Professional Voice Cloning (requires a paid Creator plan or above).

  2. Upload your samples.

  3. Run optional enhancement tools if needed.

  4. Complete voice verification by reading a short script.

  5. Wait for processing (now typically hours, not days).

  6. Your clone appears in your voice library ready for text-to-speech.

Once it’s ready, test it with casual lines, then with a real script segment. I treat clone output as a guide track for editing — like a painter’s sketch before the final layer. Then I re-record only what genuinely needs human presence.

Bottom line

AI voice cloning is way beyond a gimmick now. It’s a tool for:

  • shaving hours off your edit schedule,

  • reducing re-recording friction,

  • and letting you prototype documentary timelines at speed.

If you produce consistently, the subscription pays for itself quickly in time saved. And if you want to take the next step, pairing voice cloning with a visual avatar can unlock even more automated production workflows.
