To write an explainer video script, work in seven steps: define one goal and one audience, choose a proven structure, open with a hook on the viewer's problem, draft to about 150 words per 60 seconds, trim to a single message, end on one specific call to action, and read it aloud before you lock it. A good explainer script is short — 150 words for a 60-second video — and disciplined: one idea, one hook, one CTA. This guide walks each step with examples, then shows how to turn the finished script into a video with an AI agent like Pexo, where Script-to-Video segments the narration and returns a scored, captioned explainer.
Most scripts fail before step one because the writer starts typing without deciding what the video is for. Nail the goal and the audience first, and the words almost write themselves.
What Makes an Explainer Video Script Work
An explainer script does one job: it makes a viewer understand and want one thing in 60 to 90 seconds. Three properties separate a script that lands from one that rambles. It is single-minded — one message, not three. It is viewer-first — it opens on the audience's problem, not your product. And it is spoken, not written — it sounds natural read aloud, because that is how it will be heard. Hold those three in mind through every step below and you will avoid the mistakes that sink most drafts.
The Five Script Structures You Can Start From
Before step one, know your options. Almost every explainer reuses one of five structures; pick the one that fits your goal and you skip the blank page.
| Structure | Hook opens on… | Best for |
|---|---|---|
| Problem–Solution | The viewer's pain | SaaS, apps, services |
| How-It-Works | A question or curiosity | Technical products, APIs |
| Product Demo | A relatable moment | E-commerce, hardware |
| Founder Story | A belief or frustration | Brand, crowdfunding |
| FAQ / Onboarding | A real user question | Support, activation |
If you want full copy-ready scripts for each of these, see our explainer video script examples. This guide is about the process of writing your own.
Step 1: Define One Goal and One Audience
Write the goal and audience in a single sentence before anything else: "Convince busy founders to start a free trial of our project tool." That sentence decides your hook, your tone, and your CTA. If you can't name one audience and one action, you'll write a script that tries to reach everyone and moves no one. Resist the urge to list three audiences — make a separate video for each.
Step 2: Choose a Structure
Match your goal to one of the five structures above. A SaaS trial signup is almost always Problem–Solution. A technical API explainer is How-It-Works. A physical product is a Product Demo. Commit to one structure and let it dictate your beats; mixing structures mid-script is how explainers lose their spine.
Step 3: Write the Hook on the Viewer's Pain
The first two lines decide whether anyone watches the rest. Open on the viewer's problem or a relatable moment, never your product name. Compare:
❌ "Streakly is a habit-tracking app with streak protection."
✅ "You start a new habit on Monday. By Thursday, the streak's broken, and so is your motivation."
The second version makes the viewer feel seen in five seconds. Name the product after you've named the pain.
Step 4: Draft the Body to Length
Now write the middle: the fix, how it works, and one proof point. Size it as you go. Narration read aloud runs at about 150 words per minute, so use the script's word count as a stopwatch.
| Video length | Script word count | Typical use |
|---|---|---|
| 15 seconds | ~35–40 words | Social teaser, ad hook |
| 30 seconds | ~75 words | Feature highlight, Reels |
| 60 seconds | ~150 words | Homepage explainer |
| 90 seconds | ~220 words | Product / how-it-works |
| 2 minutes | ~300 words | Detailed onboarding |
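If you'd rather compute the budget than look it up, the table reduces to one line of arithmetic. Here is a minimal Python sketch, assuming the 150 words-per-minute pace used in this guide; the function names `word_budget` and `estimated_seconds` are made up for illustration, and a faster or slower voiceover needs a different `wpm`.

```python
WORDS_PER_MINUTE = 150  # pace quoted in this guide; an assumption, adjust per voiceover


def word_budget(seconds: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Approximate script word count for a target runtime in seconds."""
    return round(seconds * wpm / 60)


def estimated_seconds(word_count: int, wpm: int = WORDS_PER_MINUTE) -> float:
    """Approximate narration time in seconds for a given word count."""
    return word_count * 60 / wpm


if __name__ == "__main__":
    for seconds in (15, 30, 60, 90, 120):
        print(f"{seconds:>3} s -> about {word_budget(seconds)} words")
```

At 90 seconds the formula gives 225 words; the table's ~220 is the same figure, rounded.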
Carry exactly one concrete proof point, with a number: "users stuck with a habit 3x longer," not "loved by thousands." One specific number beats five adjectives.
Step 5: Trim to One Message
Your first draft is always too long and says too much. Cut every sentence that isn't the single clearest message. If a 60-second script needs 220 words, you're carrying a second message — delete it or make a second video. Trimming is where average scripts become strong ones; protect the hook and the CTA, and cut from the middle.
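To see where the extra words live, you can count them sentence by sentence. The sketch below is one possible way to do it, not a fixed method: it assumes the hook is the first two sentences and the CTA is the last one, splits sentences naively on end punctuation (abbreviations will confuse it), and uses the hypothetical helper name `trim_plan`.

```python
import re


def sentence_word_counts(script: str) -> list[tuple[str, int]]:
    """Split a script into sentences and count the words in each."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [(s, len(s.split())) for s in parts if s]


def trim_plan(script: str, target_seconds: float, wpm: int = 150,
              hook_sentences: int = 2, cta_sentences: int = 1) -> dict:
    """Report how many words to cut and which middle sentences to cut from."""
    counted = sentence_word_counts(script)
    total = sum(count for _, count in counted)
    budget = round(target_seconds * wpm / 60)
    # Protect the hook (start) and the CTA (end); only the middle is up for cuts.
    end = max(hook_sentences, len(counted) - cta_sentences)
    return {
        "words_over": max(0, total - budget),
        "cut_from": counted[hook_sentences:end],
    }


if __name__ == "__main__":
    # Hypothetical draft, loosely based on the Streakly example in this guide.
    draft = (
        "You start a habit Monday. By Thursday, the streak's broken. "
        "Streakly protects your streak with a grace token and nudges you "
        "back the next morning. It also has reminders, analytics, and themes. "
        "Download Streakly free."
    )
    plan = trim_plan(draft, target_seconds=10)
    print("Words over budget:", plan["words_over"])
    for sentence, count in plan["cut_from"]:
        print(f"{count:>3} words: {sentence}")
```

The script only points at the middle; deciding which sentence carries the second message is still your call.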
Step 6: End on One Specific CTA
Close with a single, concrete action: "Download Streakly free," not "Learn more." A soft or split CTA ("visit our site, follow us, and sign up") wastes the attention you earned. One verb, one destination.
Step 7: Read It Aloud and Lock It
Scripts are heard, not read. Read your final draft out loud, or have a text-to-speech voice read it, and you'll catch the long clauses, tongue-twisters, and jargon that look fine on the page but trip a voiceover. Fix anything you stumble over, then lock the script. This one-minute step catches more problems than any amount of silent editing.
A Worked Example: Before and After
Here's a weak first draft and the rewrite after the seven steps:
Before: "Our app, Streakly, is a powerful habit-tracking solution with many features including streak protection, reminders, and analytics to help you build better habits over time."
After: "You start a habit Monday. By Thursday, the streak's broken. Most apps reset you to zero; that red mark is where people quit. Streakly protects your streak with a grace token and nudges you back the next morning. In our beta, users lasted 3x longer. Download Streakly free."
Same product, but the rewrite hooks on the pain, carries one number, and ends on one action, in about 50 words, or roughly 20 seconds of narration at 150 words per minute.
A Pre-Production Checklist
Before you hand the script to an editor or an AI agent, run it against this checklist. Most items map to one of the seven steps, and together they catch the failures that survive a first draft:
- One goal, one audience — named in a single sentence at the top.
- Hook on the pain — the first two lines name the viewer's problem, not your product.
- One message — you can state the video's single takeaway in five words.
- On length — the word count matches the target runtime at ~150 words/minute.
- One number — exactly one concrete proof point, not a pile of adjectives.
- One CTA — a single verb and destination at the end.
- Reads aloud clean — you got through it once with no stumbles.
- A style line — one note on tone and visual style for whoever produces it.
If every box is checked, the script is production-ready. A script that fails two or more of these will produce a video that feels unfocused no matter how good the visuals are, so fix the words first — it is far cheaper than re-rendering the video.
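A few of these boxes are mechanical enough to check with a script before a human reads the draft. The sketch below covers only three of them (hook, length, one number). Everything in it is an illustrative assumption rather than a rule from this guide: the name `check_script`, the 15% length tolerance, treating the first two sentences as the hook, and counting any run of digits as a proof point.

```python
import re


def check_script(script: str, product: str, target_seconds: float,
                 wpm: int = 150, tolerance: float = 0.15) -> dict[str, bool]:
    """Run the mechanical checklist items; the rest still need a human read."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    word_count = len(script.split())
    budget = target_seconds * wpm / 60
    hook = " ".join(sentences[:2]).lower()
    numbers = re.findall(r"\d+(?:\.\d+)?", script)
    return {
        # Hook on the pain: the product name should not appear in the opening.
        "hook_avoids_product": product.lower() not in hook,
        # On length: word count within the tolerance of the runtime budget.
        "on_length": abs(word_count - budget) <= tolerance * budget,
        # One number: exactly one numeric proof point in the whole script.
        "one_number": len(numbers) == 1,
    }


if __name__ == "__main__":
    # Hypothetical short script for an 11-second teaser.
    script = (
        "You start a habit Monday. By Thursday, the streak's broken. "
        "Streakly protects your streak with a grace token. "
        "In our beta, users lasted 3x longer. Download Streakly free."
    )
    for item, passed in check_script(script, "Streakly", 11).items():
        print(f"{'ok  ' if passed else 'FAIL'} {item}")
```

The remaining items (one message, one CTA, reads aloud clean, style line) are judgment calls that a word counter cannot make for you.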
Turning Your Script Into a Video
A finished script is only half the job; production is the slow half. Script-to-Video collapses it. Paste your script into Pexo and it segments the narration into shots, generates matching visuals through auto model selection across 10+ models (Seedance 2.0, Kling 3.0, Veo 3.1, Sora 2, Runway Gen-4.5, and more), sequences the cuts, and composes a three-layer soundtrack of voiceover, music, and Foley sound effects before adding clean titles and subtitles. A 15-second 3-shot explainer comes back in about 8–10 minutes, exported in 16:9, 9:16, or 1:1.
Example prompt:

> Here's my 60-second problem–solution script for a habit app. Make a stylized 2.5D explainer from it, upbeat tone, end on the download CTA. Vertical 9:16.
>
> [paste script]
Script is one of Pexo's five input types; you can also start from a plain idea, images, or a landing-page URL, among others. New to the format? See what an explainer video is and the broader how to make an explainer video walkthrough. Script locked? Paste it into Pexo and get a finished, scored explainer back.
When You Shouldn't Auto-Generate From a Script
Writing the script is universal; generating the video isn't always the right next step:
- A to-camera monologue for a named presenter is an avatar job — use HeyGen or Synthesia.
- Narration over footage you filmed is a timeline-editor task in CapCut or Descript.
- A walkthrough of your live product UI belongs in Loom or Screen Studio.
For a scripted, narrated, animated explainer built from words, generating from the script is the fastest path to a finished video.
Related reading
- Explainer Video Script Examples: 5 Templates You Can Copy
- Explainer Video Templates: The Best Types and Where to Get Them
- How to Make an Explainer Video
- What Is an Explainer Video?
- The Best AI Video Agents for Full Video Creation
- The Best AI Video Generators, Compared
Resources
| Resource | URL | Use it for |
|---|---|---|
| Pexo | pexo.ai | Script-to-Video: paste a script → finished, scored explainer |
| HeyGen | heygen.com | Avatar / talking-head presenter from a script |
| Synthesia | synthesia.io | Avatar explainer, 100+ languages |
| Descript | descript.com | Text-based editing for narrated footage |
| CapCut | capcut.com | Timeline editing of footage you filmed |
| Loom | loom.com | Screen-recording product walkthroughs |





