How Santos Dioses gets to WOW

The plan to stop making mediocre content · 2026-06-13 · grounded in 12 real viral case studies

The one truth I was missing: the product is the reward, not the subject. WOW videos bury the bottle until the final 3 seconds and spend the first 20 on something that has nothing to do with the product: a question, a surprise, a human, a tension, a cultural moment. I lead with the bottle every single time. That is why my content is polished and dead.

Proof from the brands that win: Don Julio's "Does height matter?", Espolòn's Ken Jeong short-king joke (May 2026), Poppi's unfiltered founder (26M views, $100k in 24h), Liquid Death ($1.4B on rebellion, not water), Grey Goose hijacking The Devil Wears Prada 2. Not one of them opens on the bottle.

1. Why our content has been mediocre (honest)

Trap 1 — the bottle is the hero from frame 1. Slow zoom on a bottle. The scroll is gone by frame 3. (This is literally what I shipped today.)
Trap 2 — polish instead of idea. I optimize for "premium and clean" and skip the creative insight. The algorithm reads it as an ad and buries it. Authentic-and-rough outperforms cinematic-and-empty by 3 to 5x.
Trap 3 — one-shot, no iteration. I generate once and ship. Liquid Death ships 5 variations a week; we ship one "perfect" thing and hope. No quality loop.
Trap 4 — recycling. I reused a photo we posted hours earlier. Recycled content can't be wow.
Trap 5 — no shared definition of "wow." I keep guessing your bar and missing. I have no calibrated reference of what you consider great.

2. The rules that separate WOW from polished-mediocre

Hook in the first 1-2 seconds, never the product. A question, a surprise, a person, a tension. Espolòn: "Quality has nothing to do with height."
Pattern-break by frame 2. If it looks like every other beverage ad, the scroll continues. A pattern interrupt in the first 5 seconds is worth roughly 23% higher retention.
Idea over polish. Spend the budget on the creative insight, not the color grade. Poppi's best content was never cinematic.
Sound is the hook; visuals support it. Trending or satisfying ASMR audio stops the scroll. e.l.f. and Celsius win on sound, not visuals.
Lead with a cultural moment or a tribe, not a product attribute. Not "ultra-premium, small-batch." Don Julio = a height debate. Liquid Death = anti-corporate identity.
Make it repeatable so people remake it. A format others can participate in beats a bespoke hero shot. The Batanga pour-and-clink trend.
Pay off the curiosity in the last 3 seconds. A resolution worth screenshotting and sharing.
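
The two timing rules above (hook first, product last) can be checked mechanically against a shot list before anything is generated. What follows is a minimal sketch, not existing tooling: the `Shot` record, its field names, and the `bottle_is_buried` helper are all hypothetical, and the only number taken from the rules is the 3-second payoff window.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start: float          # seconds from the start of the video
    end: float            # seconds from the start of the video
    label: str            # free-text description of the shot
    shows_product: bool   # True if the bottle is visible in this shot


def bottle_is_buried(shots: list[Shot], payoff_window: float = 3.0) -> bool:
    """Return True if the product is only visible in the final payoff window.

    Assumption: a shot list with no product shot at all also returns True,
    because nothing breaks the "never open on the bottle" rule.
    """
    if not shots:
        return False
    video_end = max(s.end for s in shots)
    payoff_start = video_end - payoff_window
    return all(s.start >= payoff_start for s in shots if s.shows_product)


# A 23-second cut that holds the bottle back until the last 3 seconds.
buried = [
    Shot(0, 2, "hook: question on screen", False),
    Shot(2, 20, "build: hands, oven, pour", False),
    Shot(20, 23, "reveal: bottle", True),
]

# A 10-second slow zoom on the bottle from frame 1.
bottle_first = [Shot(0, 10, "slow zoom on bottle", True)]

print(bottle_is_buried(buried))        # True
print(bottle_is_buried(bottle_first))  # False
```

A check like this only covers the timing rules. It says nothing about whether the hook is any good, which still needs a human or a pre-test score.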

3. The quality engine — how we guarantee it's good before you see it

This is the process change. We stop shipping the first thing. Every piece runs this gauntlet:

1. Concept-first. Start from a hook + a mechanic (surprise / tension / emotion / cultural), never a product shot. Generate 8-10 concepts a week, score them, keep the top 3.

2. Generate many, not one. Multiple takes per concept (Higgsfield, real footage, Sloane anchors).

3. Pre-test every take with Higgsfield's Virality Predictor. Below the bar, it gets reworked or killed. Never shipped.

4. The WOW checklist (hard gate). A piece does not reach you unless it passes all of:

Today's bottle clip fails 5 of 6. That is the gate doing its job.
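
The gauntlet above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual pipeline: the `Concept` and `Take` types, the 0-100 pre-test scale, and the `PRETEST_BAR` value of 70 are invented for the example. The pre-test score is typed in by hand rather than fetched from any Higgsfield API, and the checklist keys are placeholders to be swapped for the real gate items.

```python
from dataclasses import dataclass, field

PRETEST_BAR = 70  # hypothetical pass mark on a hypothetical 0-100 scale


@dataclass
class Take:
    name: str
    pretest_score: float  # entered by hand from the pre-test tool
    checks: dict[str, bool] = field(default_factory=dict)  # checklist results


@dataclass
class Concept:
    title: str
    score: float  # weekly concept score, any consistent scale
    takes: list[Take] = field(default_factory=list)


def top_concepts(concepts: list[Concept], keep: int = 3) -> list[Concept]:
    """Step 1: rank the week's concepts and keep the best few."""
    return sorted(concepts, key=lambda c: c.score, reverse=True)[:keep]


def passes_gate(take: Take, bar: float = PRETEST_BAR) -> bool:
    """Steps 3-4: clear the pre-test bar and pass every checklist item.

    A take with no recorded checks fails, so nothing slips through unchecked.
    """
    return (
        take.pretest_score >= bar
        and bool(take.checks)
        and all(take.checks.values())
    )


def shippable(concepts: list[Concept]) -> list[tuple[str, str]]:
    """Run the whole gauntlet and return the (concept, take) pairs that survive."""
    return [
        (c.title, t.name)
        for c in top_concepts(concepts)
        for t in c.takes
        if passes_gate(t)
    ]


# Placeholder check names; the real gate items would go here.
ok = {"hook_is_not_product": True, "bottle_only_at_end": True}
bad = {"hook_is_not_product": False, "bottle_only_at_end": False}

week = [
    Concept("The Cut", 8.5, [Take("take-a", 82, ok), Take("take-b", 64, ok)]),
    Concept("Bottle slow zoom", 3.0, [Take("take-a", 75, bad)]),
]

print(shippable(week))  # [('The Cut', 'take-a')]
```

The gate is deliberately an AND of every condition: a high pre-test score cannot rescue a take that fails a checklist item, and a clean checklist cannot rescue a take that scores below the bar.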

4. Six WOW-tier concepts (built on our real assets + AI)

PRODUCIBLE NOW = real Jalisco footage + Higgsfield, no new filming. NEEDS A PERSON = one real shoot or a bartender/creator.

Concept: The Invisible Hand (PRODUCIBLE NOW)
Hook (the first thing on screen): "This whole thing is made by one pair of hands. Watch."
The mechanic + build: Transformation arc: real hands harvesting → cutting → the oven → the pour. The bottle only appears in the final 2 seconds. Real footage + Higgsfield detail shots. Human craft, not product.

Concept: Does Expensive Even Taste Better? (NEEDS A PERSON)
Hook (the first thing on screen): "We put our bottle next to a $400 tequila. Watch his face."
The mechanic + build: Blind taste, real reaction (gasp, "wait, which one?"), reveal. Contrarian price-vs-quality tension. Needs a bartender or Angel to film once.

Concept: The Cut (PRODUCIBLE NOW)
Hook (the first thing on screen): The sound of a coa splitting a piña, before any picture resolves.
The mechanic + build: ASMR craft spectacle. Slow-mo real jimador footage, sound-designed (the chop, the crack, the breath). Satisfying, rewatchable. Bottle as the final beat.

Concept: Nine Years. Nine Seconds. (PRODUCIBLE NOW)
Hook (the first thing on screen): "You waited nine minutes for a table. This waited nine years."
The mechanic + build: Time-compression of the agave's life (real field footage + AI growth motion), resolving into the bottle. Patience as the emotion. Educational without lecturing.

Concept: The Roar (PRODUCIBLE NOW)
Hook (the first thing on screen): A dark screen and a stadium roar building, then a goal.
The mechanic + build: Reactive cultural moment: Mexico plays in Jalisco Thu Jun 18. Crowd-roar audio is the hook. Premium, same-day. No team or tournament marks.

Concept: Texas Doesn't Know Yet (NEEDS A PERSON)
Hook (the first thing on screen): "Austin doesn't know yet. Houston's about to."
The mechanic + build: POV Texas-lifestyle cuts (rooftop, music, skyline) with the bottle at the edge of frame. Positions us as a Texas brand arriving, a tribe-identity hook. Needs some real Texas footage.

Four of these are producible now with what we have. Two need one real person or shoot, which is a resourcing decision for you.

5. What I need from you (this is the real answer to "what do I have to do")

1. Send me 3 to 5 videos you think are genuinely WOW. Any brand, any category. This is the single highest-leverage thing. I reverse-engineer them into a concrete rubric and stop guessing your bar. Without it I keep missing.

2. Approve the iterate-to-score process. Excellent is not one-shot. It means generating several, pre-testing, killing the mediocre, and only showing you what cleared the bar. That takes a little longer per piece and it is the whole difference.

3. Decide on the two "needs a person" concepts. A bartender or Angel for one filmed reaction unlocks a whole tier of content (the taste-test, the Texas POV) that we cannot fake well with AI. Your call on whether to resource it.

Give me #1 and #2 and the next thing I bring you is built this way: concept-first, pre-tested, bottle buried, nothing mediocre getting through the gate.