The Will Smith AI Crowd Is Not What You Think - But It Points to a Bigger Problem

The reality sandwich

Kyt, a creative research head at KaiberAI who broke down the video for Vice, explained it like this:

The base footage is real - real Will Smith, real concert moments, real fan signs.

But AI tools were layered in to pad out the edit. Short, uncanny clips of crowds were generated from key frames of real shows, then spliced between actual footage.

Those AI shots were clipped just before they “descended into mush” (a common artifact when video models can’t maintain coherence for more than a few seconds).

So what we’re seeing isn’t a deepfake concert. It’s an edit made from 60% reality, 40% AI mush - a highlight reel stitched together to meet the one-minute Instagram/TikTok sweet spot.

In other words: a reality sandwich.

Why crowd shots?

Crowds are one of the hardest things for AI to generate believably. One person is hard enough - hands melt, limbs warp, movements stutter. A crowd is chaos: phones, hats, shirts, signs, lighting. Every detail has to sync. Which is why it’s incredibly bold (or sloppy) to use AI specifically for crowd shots.

And of course, people noticed. Any celebrity is under a microscope, and Will Smith in particular is stuck in the middle of a cultural cringe cycle. If anything feels off, it gets magnified.

Photoshop déjà vu

But the bigger story here isn’t Will Smith. It’s what this video represents.

Remember when Photoshop first hit magazines? People were outraged. “You’re lying to us!” Then, slowly, it became normalized. We all know celebrity photos are touched up. Every billboard, every ad, every Instagram thirst trap is filtered, cropped, smoothed, and reshaped.

We don’t call it a scandal anymore. We just know.

AI is about to do the same thing to video.

The new normal of video fakery

We’re heading into an era where almost every video you see will have some AI touch-up. Not full deepfake replacements, but subtle enhancements:

Resolution boosted by AI upscaling.

Frames expanded to widen the shot.

A distracting object erased.

A missing moment patched in with generated footage.

A crowd shot filled out for a more impressive edit.

Ninety-nine percent real, one percent synthetic. Just enough to smooth the rough edges, to cover the gaps.

And unless you’re actively looking for it, you’ll never notice.

Will Smith is just the preview

The backlash to Will Smith’s AI crowd shows how much people still bristle at AI fakery. But soon this won’t even be worth arguing about. Just like Photoshop, AI in video will become invisible background noise.

We’ll miss these awkward, garbled faces. We’ll miss the misspelled signs and the jerky movements. Because in a few years (or months, at this pace), the tech won’t glitch - it’ll blend perfectly. Reality sandwiches will become the default recipe.

Final thought

The Will Smith crowd wasn’t fake. It wasn’t real. It was both - a glimpse of the new media landscape we’re walking into.

The era of video as evidence is ending. From now on, every clip could be a reality sandwich, layered with just enough fakery to make it work. And all we can really do is adjust. Trust our eyes a little less. Stay suspicious. And accept that reality, at least on video, will never be 100% real again.
