I Went Viral with Stick Figure Animations Using ONLY ChatGPT & ElevenLabs

Let me say this right now: discipline isn’t just a habit—it’s transformation. You keep thinking you need more motivation, but what you really need is to silence the voice that says, “You’ve done enough today. Rest. You can start tomorrow.” That voice is what holds you back. Until you shut it down, you’ll never meet the stronger, tougher version of yourself that’s waiting on the other side of struggle.

That mindset applies directly to content creation. If you want to stand out on YouTube, you can’t just chase shortcuts or copy trends. You need to commit to growing your skills, learning the process, and building something sustainable.

Recently, I discovered one of the most underrated content formats on YouTube: stick figure videos. At first glance, they look ridiculously simple—minimal drawings, text, and a voiceover. But the numbers tell a different story:

  • One channel uploaded a basic stick figure video just two months ago and already has over 432,000 views.
  • Another channel with only 49 uploads has gained more than 53,000 subscribers.
  • A brand-new channel posted a single video just a month ago—and that one video has already crossed 2.8 million views.

No facecam. No fancy editing. No expensive setup. Just raw storytelling with stick figures.

I tried the same format myself, and within 24 hours my video was outperforming everything else I’d ever posted. So yes—this works. And in this guide, I’ll show you step by step how to do it, from coming up with the idea to writing the script, generating the visuals, adding the voiceover, and editing everything into a polished video.

Why Canva Isn’t the Best Option

If you’ve watched other tutorials, you’ve probably been told: “Just use Canva.” But here’s the problem. Canva’s library of stick figures is tiny. Most of what you find there has already been used by hundreds of other creators. That makes it nearly impossible to stand out.

I also tried purchasing stick figure bundles—most come with 15 to 25 assets, which is nowhere near enough if you’re building a channel that needs dozens of videos. Worse, you’ll spend hours searching for assets and still end up with visuals that look generic.

Instead of relying on Canva, I use ChatGPT + Leonardo AI to generate original stick figure images on demand. This way, I never run out of material, and my channel maintains a unique look. Even better, the whole process can be automated with a simple script, reducing the workload by up to 70%.

What Makes This Format So Powerful

Before diving into the workflow, let’s talk about why this format works so well.

Stick figure videos go viral because they’re:

  • Relatable — everyone recognizes simple stick drawings, so they don’t feel intimidating or distant.
  • Calm — in a feed full of flashy edits and loud soundtracks, these videos stand out precisely because they’re minimal.
  • Message-driven — the focus isn’t on effects; it’s on the story and emotion.

They work not because they’re complicated, but because they’re simple.

Reviewing a Real Example

Take the channel The Toothpick Guys. At the time of writing, they have uploaded seven videos and have nine subscribers and just over 300 views. Their visuals are consistent: they use two main colors, black and green, which helps create a brand identity. That’s a smart move.

But they also made some mistakes:

  • Too many fast transitions and constant motion. This format works best when visuals are steady and calm.
  • Weak voiceovers with flat delivery that don’t hold attention.
  • Scripts that lack energy or emotional punch.

The takeaway? Simple doesn’t mean lazy. You still need a strong script and engaging narration.

Step 1: Writing the Script with ChatGPT

Start with ChatGPT. My initial prompt looks like this:

“Give me 15 long-form YouTube video ideas that would work well with stick figure animation. The tone should be emotional, calm, and relatable. Focus on topics like overthinking, life lessons, self-doubt, motivation, or reflection.”

From there, I refine the results with:

“Make them more personal and story-based. Include subtle emotional twists. Focus on things people think about but don’t talk about.”

That’s how I got ideas like “The day I stopped trying to be perfect” or “When I stopped apologizing for existing.”

Once I choose a topic, I ask:

“Write a long-form YouTube script for [title]. Use a calm, emotional, and reflective tone. Structure it like a story. Keep sentences short and easy to follow.”

Then, if the draft feels flat, I add:

“Rewrite this in a casual, humorous style with quick, punchy sentences. Start with a relatable everyday scenario, exaggerate it for comedic effect, and wrap up with an insightful twist.”

This process gives me scripts that are ready to use with little editing.

Step 2: Generating Stick Figure Images

Here’s where ChatGPT helps again. Once I have the script, I ask:

“Break this script into short segments and create descriptive prompts for each one to generate stick figure images.”

For example:

  • “A stick figure sits at a desk, staring at a pile of papers.”
  • “A stick figure looks in the mirror with a heavy expression.”
  • “Two stick figures walk side by side, one with a slouched posture, the other standing tall.”

This way, every sentence or idea in the script has a visual to match.

I can then feed these prompts into Leonardo AI (or ChatGPT’s image model) to generate consistent, original drawings.
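If you would rather control the segmenting yourself, here is a minimal sketch of that step. It assumes a naive split on sentence-ending punctuation and groups two sentences per segment; the function names and the wording of the request are my own placeholders, not anything ChatGPT or Leonardo requires. Numbering the segments makes it easy to match each generated image back to its line in the script.

```python
import re


def split_into_segments(script: str, max_sentences: int = 2) -> list[str]:
    """Split a script into short segments of at most `max_sentences` sentences."""
    # Naive split: break on whitespace that follows ., ! or ?
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", script.strip())
        if s.strip()
    ]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def build_request(segments: list[str]) -> str:
    """Build one message to paste into ChatGPT (the wording is an assumption)."""
    numbered = "\n".join(f"{n}. {text}" for n, text in enumerate(segments, start=1))
    return (
        "For each numbered segment below, write one descriptive prompt "
        "for a stick figure image. Keep the same numbering.\n\n" + numbered
    )


script = "I woke up late. I stared at the ceiling. Nothing felt right! Then I got up."
segments = split_into_segments(script)
print(build_request(segments))
```

The split will trip over abbreviations like "e.g." or "Dr.", so skim the segments before sending them off.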

Step 3: Automating the Process

Doing this manually works for a single video. But if you’re serious about building a channel, automation saves hours.

With a simple Python script, you can:

  1. Feed in your script.
  2. Automatically generate prompts.
  3. Batch-create all images overnight.

That’s how you scale from one video a week to multiple uploads without burning out.
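The three steps above could be wired together roughly like this. Treat it as a sketch, not my exact script: `generate_image` is a hypothetical function you would write around whichever image service you use, and `placeholder_generator` only stands in for it so the example runs without an API key. The file naming (`scene_001.png` and so on) is also just an assumption. The useful part is the loop: it skips scenes that already exist, so an overnight run that dies halfway can be restarted without regenerating everything.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def batch_generate(
    prompts: Iterable[str],
    out_dir: Path,
    generate_image: Callable[[str], bytes],
) -> list[Path]:
    """Write one image file per prompt, skipping scenes that already exist."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for number, prompt in enumerate(prompts, start=1):
        target = out_dir / f"scene_{number:03d}.png"
        if target.exists():
            continue  # finished on an earlier run, so do not generate it twice
        try:
            data = generate_image(prompt)
        except Exception as exc:  # one failed scene should not stop the batch
            print(f"scene {number} failed: {exc}")
            continue
        target.write_bytes(data)
        written.append(target)
    return written


def placeholder_generator(prompt: str) -> bytes:
    """Hypothetical stand-in: returns the prompt as bytes, not a real image."""
    return prompt.encode("utf-8")


# Demo in a throwaway folder: the second run finds both files and does nothing.
with tempfile.TemporaryDirectory() as tmp:
    prompts = [
        "A stick figure sits at a desk, staring at a pile of papers.",
        "A stick figure looks in the mirror with a heavy expression.",
    ]
    first = batch_generate(prompts, Path(tmp), placeholder_generator)
    second = batch_generate(prompts, Path(tmp), placeholder_generator)
    print(len(first), len(second))  # prints: 2 0
```

To use it for real, replace `placeholder_generator` with a function that sends the prompt to your image tool and returns the image bytes.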

Step 4: Choosing the Right Voiceover

Voiceovers are the glue that holds this format together. I use ElevenLabs for narration.

Here’s my process:

  1. Go to Text to Speech.
  2. Test both professional and default voices. (Tip: the default voices often sound more natural.)
  3. Choose something calm, youthful, and relatable.

The voice I ended up with is the same one used by the channel Wise Joe, which hit 54,000 subscribers in just three months. It’s warm, conversational, and hooks the listener immediately.

Step 5: Editing Everything Together

For editing, I use CapCut because it’s fast and simple. My project usually has six layers:

  1. Captions at the top.
  2. Extra text for emphasis.
  3. Image layer 1.
  4. Image layer 2 (cropped duplicates for subtle effects).
  5. Background layer (to fix aspect ratio).
  6. Voiceover.

Important note: ChatGPT and Leonardo usually generate images in 3:2, not 16:9. To fix this, I create a solid-color background by sampling a color from the image, then place the image on top. This keeps every frame consistent and full-screen.
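If you want to know how much background will show, the arithmetic is simple. The sketch below assumes a 1920x1080 canvas and a 1536x1024 source image (one common 3:2 size; yours may differ), and the function name is just a placeholder. It scales the image to fit the canvas without cropping and returns where to place it.

```python
def fit_on_canvas(
    img_w: int,
    img_h: int,
    canvas_w: int = 1920,
    canvas_h: int = 1080,
) -> tuple[int, int, int, int]:
    """Return (new_w, new_h, x, y) for an image centered on the canvas.

    The image is scaled to the largest size that fits without cropping,
    keeping its aspect ratio.
    """
    scale = min(canvas_w / img_w, canvas_h / img_h)
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)
    x = (canvas_w - new_w) // 2  # left offset; the same gap appears on the right
    y = (canvas_h - new_h) // 2  # top offset
    return new_w, new_h, x, y


print(fit_on_canvas(1536, 1024))  # prints: (1620, 1080, 150, 0)
```

So a 3:2 image on a 1080p timeline leaves a 150-pixel strip on each side, which is the area the sampled background color has to cover.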

For this type of video, I skip background music. It’s not necessary—the voice and message do the heavy lifting.

Final Thoughts: Why This Format Works

Stick figure videos prove something important: you don’t need complex visuals to connect with people. In fact, simplicity is often what cuts through the noise.

If you want to succeed with this style:

  • Focus on better scripts, not better effects.
  • Use automation smartly to save time.
  • Don’t copy—add your own perspective.
  • Commit for the long term.

The creators who win aren’t the ones chasing shortcuts. They’re the ones refining their craft, one video at a time.

So next time that voice in your head says, “Rest. Start tomorrow.”—shut it down. Get to work. Because your breakthrough might be one stick figure video away.