What building KaiNet taught us about ad quality at scale

We started building KaiNet thinking the hard part would be generation. It turned out to be a much smaller portion of the problem than we expected.

The hard part was quality control.

Generation is easy

Asking a model to write headlines and descriptions for a Google Ads campaign, given a product, an audience, and a goal, is not a hard problem in 2026. The model will produce them. They'll mostly be syntactically correct. They'll often be reasonable.

This is the part that's easy to demo, and the part most likely to lead a builder astray: if the demo looks impressive, you start to think the product is mostly built. It isn't. You've built the easy 30%.

Quality is plural

The harder thing was figuring out what "quality" actually meant for an ad system that had to produce campaigns we'd be willing to put into a live account.

Quality, it turns out, isn't one property. It's the intersection of:

Structural sanity. The right number of ad groups, sensible bid strategies, asset diversity that the platform actually wants to see.
Brand coherence. Copy that sounds like it came from this company, not a generic version of any company in the category.
Compliance. No claims that would trip Google's policy enforcement, no headlines that exceed character limits, and sensitive categories handled correctly.
Strategic alignment. A campaign that does what the brief asked for, instead of drifting into a different campaign somewhere along the way.
Account fit. A campaign that makes sense given what's already in the account, instead of overlap masquerading as a new launch.

Each is a different problem. Each needs its own validation pass. None are caught by a model that just generates good-looking text.
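
To make "each needs its own validation pass" concrete, here is a minimal sketch of independent checks run over one generated ad. It is not KaiNet's actual code: the Ad and Issue types, the check names, and the run_checks function are all hypothetical, and the 30 and 90 character limits are assumed values for responsive search ads that should be confirmed against current platform documentation.

```python
from dataclasses import dataclass

# Assumed limits for illustration only; verify against platform docs.
MAX_HEADLINE_CHARS = 30
MAX_DESCRIPTION_CHARS = 90


@dataclass
class Ad:
    """Hypothetical container for one generated ad's text assets."""
    headlines: list[str]
    descriptions: list[str]


@dataclass
class Issue:
    """One finding from one validation pass."""
    check: str
    message: str


def check_lengths(ad: Ad) -> list[Issue]:
    """Compliance pass: flag assets that exceed the assumed limits."""
    issues = []
    for text in ad.headlines:
        if len(text) > MAX_HEADLINE_CHARS:
            issues.append(Issue("compliance", f"headline too long: {text!r}"))
    for text in ad.descriptions:
        if len(text) > MAX_DESCRIPTION_CHARS:
            issues.append(Issue("compliance", f"description too long: {text!r}"))
    return issues


def check_asset_diversity(ad: Ad) -> list[Issue]:
    """Structural pass: flag headlines that repeat, ignoring case."""
    issues = []
    seen = set()
    for text in ad.headlines:
        key = text.strip().lower()
        if key in seen:
            issues.append(Issue("structure", f"duplicate headline: {text!r}"))
        seen.add(key)
    return issues


# Each quality dimension gets its own pass; new passes are appended here.
CHECKS = [check_lengths, check_asset_diversity]


def run_checks(ad: Ad) -> list[Issue]:
    """Run every pass and collect all findings instead of stopping early."""
    return [issue for check in CHECKS for issue in check(ad)]
```

Keeping each pass as a separate function means a brand, strategy, or account-fit check can be added without touching the others, and every finding carries the name of the dimension that raised it.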

The unsexy work

Most of the engineering hours on KaiNet went into the inspection layers, not the generation. Validators. Schema enforcers. Policy checkers. Account-state awareness. The "before this gets shown to a human, does it actually meet the bar?" infrastructure.

This is unsexy. It doesn't demo well. There's no "wow" moment when a structural validator catches a near-duplicate campaign before it ships. The output of a good validator is a campaign that doesn't have a problem, and you can't show "the problem that didn't happen" in a slide.

But it's the work that decides whether the system is something you'd hand to an account or something you'd keep behind a "for fun" wall.

There was a stretch in early development when we had a beautiful generation pipeline and almost no inspection. Demos felt great. Internal use felt fine. The first time we tried it on a real account that wasn't ours, we caught three structural issues, two compliance flags, and a near-duplicate campaign in the first ten minutes. None of them were "AI hallucinations." All of them were just things a careful human would have caught. The system needed to catch them too, if humans weren't going to be in the loop on every detail.
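
A near-duplicate campaign is the kind of thing a simple account-state check can flag. The sketch below is one possible approach, not a description of how KaiNet does it: it compares the proposed campaign's keywords with each existing campaign using Jaccard similarity. The function names, the dictionary shape for existing campaigns, and the 0.6 threshold are assumptions chosen for illustration.

```python
def normalize(keywords: list[str]) -> set[str]:
    """Lowercase and trim keywords so trivial variations compare equal."""
    return {k.strip().lower() for k in keywords if k.strip()}


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared keywords divided by total distinct keywords."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_near_duplicates(
    proposed: list[str],
    existing: dict[str, list[str]],
    threshold: float = 0.6,  # assumed cutoff; tune against real accounts
) -> list[tuple[str, float]]:
    """Return (campaign name, similarity) pairs at or above the threshold."""
    proposed_set = normalize(proposed)
    hits = []
    for name, keywords in existing.items():
        score = jaccard(proposed_set, normalize(keywords))
        if score >= threshold:
            hits.append((name, score))
    # Most similar campaign first, so a reviewer sees the worst overlap on top.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Keyword overlap is a crude signal. A real check would likely also compare targeting, landing pages, and match types, but even this much would surface an obvious overlap before it reaches a reviewer.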

What we'd tell another builder

At scale, quality control becomes product design. Most of the meaningful product decisions we made in the last six months were inspection decisions, not generation decisions.

If you're building anything in ad ops automation, budget more for inspection than for generation. We did the opposite at first and lost a quarter to relearning it.

KaiNet · Builder diary
