April 8, 2026

AI ad creative: the tools we actually use (and why none of them work alone)

Rupert Mason
Author

There Is No Silver Bullet

I will save you six months of experimentation. There is no single AI tool that takes a brief and spits out a finished ad. Not Sora. Not Veo. Not Kling. Not anything launching next quarter.

What actually works is a stack of specialist tools, stitched together by a human who knows what a good ad looks like. That is the unsexy truth every "Top 10 AI Video Tools" listicle leaves out.

At Sidekick, we have been running AI-assisted production on real client campaigns for over a year. Our fully AI-generated commercial for Stratiphy is a good example of what the full stack can produce. But the path to that finished ad was anything but press-a-button simple. Here is the exact stack we use, what each tool is genuinely good at, and where every single one of them falls over without a person in the chair.

Our AI Video Production Stack

Google Veo: The Visual Engine

Veo is our go-to for generating video footage from prompts. It handles photorealism better than anything else we have tested, particularly for product shots and environmental scenes where you need the output to feel cinematic rather than synthetic.

Where it shines: consistent lighting, smooth camera movement, and scenes where the subject is not a human face. Give it a brief like "slow dolly across a co-working space, morning light, shallow depth of field" and the output is genuinely usable.

Where it breaks: the moment you need a person to talk, gesture naturally, or interact with a physical object. Hands are still a problem. Lip sync is not there yet. And if your brand relies on a real founder speaking to camera, Veo cannot replace that shoot.

Kling AI: Motion and Animation

Kling is where we go for anything that needs controlled motion, particularly transitions, animated typography, and scenes where we need to direct exactly how an element moves through frame.

Where it shines: stylised content, motion graphics sequences, and creative transitions between scenes. For social-first content where the visual language is more graphic than photographic, Kling consistently outperforms Veo.

Where it breaks: when you need the output to match live-action footage seamlessly. The aesthetic is distinctive. That is a strength for brand films and a weakness for anything that needs to cut against real camera work without the audience noticing.

ElevenLabs: Voice and Audio

Every ad needs a voice. ElevenLabs handles voiceover, narration, and audio production. Its output is now close enough to a professional VO artist's that we use it as the primary audio layer for certain formats, particularly explainer videos and social cutdowns.

Where it shines: speed. We can generate 20 voiceover variations in the time it takes to book a studio session. For A/B testing different tones, pacing, or scripts across paid social, that speed is the actual value, not cost savings.
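
For anyone planning a test like this, here is a minimal sketch of how the variation matrix could be laid out before any audio is generated. It is an illustration, not our production tooling: the tone, pacing, and script names are hypothetical placeholders, and the code deliberately makes no calls to ElevenLabs or any other service. It only builds the list of combinations you would then brief.

```python
from itertools import product

# Hypothetical test dimensions; swap in your own tones, pacings, and scripts.
TONES = ["warm", "authoritative", "playful", "urgent", "deadpan"]
PACINGS = ["measured", "brisk"]
SCRIPTS = ["script_a", "script_b"]


def build_variation_matrix(tones, pacings, scripts):
    """Return one dict per voiceover variation, each with a readable label."""
    return [
        {
            "label": f"{script}-{tone}-{pacing}",
            "script": script,
            "tone": tone,
            "pacing": pacing,
        }
        for script, tone, pacing in product(scripts, tones, pacings)
    ]


variations = build_variation_matrix(TONES, PACINGS, SCRIPTS)
print(len(variations))  # 5 tones x 2 pacings x 2 scripts = 20
```

Labelling every variation up front means each file maps cleanly back to the tone, pacing, and script it was testing once the paid social results come in.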

Where it breaks: emotional range. A human voice actor can take a single line and give you fury, warmth, dry wit, or quiet authority. ElevenLabs gives you competent. For hero brand films or anything where the voice is the emotional centrepiece, we still book a real person.

The 80% Nobody Talks About

Here is the part that every AI tool vendor skips in their demo reel.

The tools generate raw material. That is it. They do not write the brief. They do not know your brand guidelines. They do not understand that the last three seconds of a 15-second Instagram ad need to resolve on the logo at exactly the right moment. They do not catch the uncanny-valley micro-expression that will make your audience scroll past instead of stopping.

On a typical AI-assisted production at Sidekick, the breakdown looks roughly like this:

  • 20% AI generation. Prompting, generating, selecting the best outputs from dozens of variations.
  • 80% human craft. Writing the script. Directing the AI (yes, you direct it the same way you direct a camera operator, with specific, opinionated instructions). Editing. Colour grading. Sound design. Quality control. Making the hundred small taste decisions that turn raw footage into an ad that actually works.

That 80% is where the value is. Anyone can access these tools. The difference is knowing what to ask for, recognising when the output is good enough, and having the production experience to finish it to broadcast standard.

Why "Just Use AI" Is Bad Advice

We see founders try to skip the human layer every week. They download Kling, generate 30 clips, stitch them together in CapCut, and wonder why it looks like a tech demo instead of an ad.

The problem is never the tools. The tools are extraordinary. The problem is that an ad is not a collection of pretty shots. It is a story told in a specific time frame, for a specific audience, with a specific emotional arc that ends on a specific action. That requires a director, not a prompt engineer.

Think of it this way: cameras have been able to shoot cinema-quality footage for years. That did not make everyone a filmmaker. AI video tools have made generation cheap. They have not made storytelling cheap.

When AI Production Makes Sense (And When It Does Not)

We recommend AI-assisted production when:

  • You need volume. 20+ ad variations for paid social testing, where speed matters more than pixel-perfect craft.
  • You need environmental footage that would cost a fortune to shoot on location. Cityscapes, abstract product worlds, seasonal scenes.
  • Your budget is under £10k and you need broadcast-quality motion graphics without a full post-production team.
  • You are iterating fast on creative concepts and need to visualise ideas before committing to a full production day.

We do not recommend it when:

  • The ad depends on a real human performance. Founder-led content, testimonials, anything where authenticity is the point. Compare our Sage Mentors campaign, which relied on real founders telling their stories on camera. No AI tool could have replaced those performances.
  • You are building long-term brand equity where every frame needs to feel unmistakably yours. AI has a look. Your audience will learn to spot it.
  • You need precise product interaction. Hands holding your product, someone using your app on a real device, physical demos.

This Is a Work in Progress

I want to be honest about something. This stack will look different in six months. Veo will improve. Kling will improve. Something new will launch that makes one of these tools redundant. We are updating our workflow constantly.

What will not change is the fundamental architecture: specialist tools, human direction, production experience filling the gap between what AI generates and what an audience will actually watch.

The contrast between a project like Stratiphy (fully AI-generated) and Sage Mentors (fully human performance) is the point. The best agencies will know when to reach for which. If you are a founder considering AI video for your next campaign, the question is not "which tool should I use?" The question is "who is going to direct it?"

Work With Us

Sidekick Studios runs AI-assisted video production for startups and scale-ups across TV, social, and out-of-home (OOH). We handle the full stack so you get the speed and cost advantage of AI without sacrificing the craft that makes an ad actually convert.

See our AI video production services or book a free 30-minute consultation to talk through your next campaign.
