
There’s a version of this story you’ve probably heard: a team uses an AI design tool, generates a full visual system in an afternoon, ships it, and everyone is happy. That version is a demo.
The real version is more interesting and more useful if you’re trying to figure out where these tools actually fit in a serious design process.
We brought AI design tools into an actual client engagement to find out. Real work, with a real brief, real users, and a real launch date.
Here’s what we learned about where the tools help and where design expertise still drives the outcome.
The brief
Arlow was preparing to launch a care coordination platform for older adults and caregivers, helping users track medications, stay connected, and manage day-to-day care. The product was functional. But it felt generic. Stock imagery, no distinctive visual identity, nothing that made the experience feel like Arlow specifically.
For their audience, that gap mattered more than it would for most.
Their users aren’t browsing casually. They’re often dealing with stressful, emotionally loaded situations. A product that feels clinical creates friction at exactly the wrong moment. And in a consumer-driven market, that friction is a positioning problem, not just a UX problem.
The Arlow team was heads-down preparing for launch. So we ran a design exploration in parallel to push their brand forward without pulling their team off course.
Early wins and fails
Part of that exploration involved testing AI design tools, specifically whether they could help build a consistent, scalable visual system faster than traditional methods.
Kristin led the work using a tool that chains different types of generation together: extract prompts from images, blend with mood boards, generate new visuals, extend into motion.
Early outputs were exciting: initial versions of icon sets and illustrations looked fantastic and gave the team styles and directions that felt like they would work system-wide.
The problem came when we tried to produce a scalable system. Consistency collapsed the further we went. The same prompt produced different results each time. Styles drifted. Details shifted. Every attempt cost tokens, and the economics get punishing quickly on real projects. The people making AI tool tutorials often have credits in the hundreds of thousands. Roughly one in five attempts produced something usable. The rest burned through budget.
To compensate, Kristin moved into manual cleanup: vectorizing outputs in Illustrator, correcting irregular shapes, and adjusting proportions to match earlier work. The time savings had evaporated.
We had the answer: AI tools can produce polished ideas and concepts, but they can’t create scalable, consistent, production-ready systems. And consistency is what fuels usability and efficiency.
The pivot that produced something real
Rather than force the tools to do something they couldn’t, Kristin narrowed the scope.
She pulled a single element from Arlow’s existing brand, the sparkle in their logo, and started exploring what it could become as a character. This is where AI proved genuinely useful. By combining rough sketches with AI-generated variations and reference styles, she moved through a wide range of directions quickly.
Not all of them worked—even explicit prompts produced extra limbs, inconsistent shapes, and unexpected features. But that unpredictability helped surface decisions faster: keep the form simple, limit facial detail, prioritize readability. No legs. No nose. Arms only in specific contexts.
The tool didn’t produce the answer. It compressed the time it took to find one. That’s the right job for it.
When the Arlow team saw the character, it was love at first sight. That kind of clarity, with a concept that lands without explanation, is the signal to stop generating and start building.

Where expertise took over
Once the concept was validated, Jeremiah came in as brand designer. What happened next is the part that doesn’t show up in AI tool demos.
His first pass caught what the vectorization process had introduced: a corner that hadn’t rounded correctly, subtle curve inconsistencies that wouldn’t be visible at small sizes but would compound across a product. Small things a trained designer sees immediately and a generative tool doesn’t know to flag.
Then came the system work: standardizing proportions, defining rules for when and how arms appear, building a full expression library. The addition of eyebrows unlocked a much wider emotional range without adding visual complexity. Even animation required hands-on problem-solving. Creating natural arm movement meant building custom pivot points using bounding boxes, because no tool handled rotation from a natural shoulder joint automatically.
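For readers curious about the pivot-point problem, here is a minimal sketch of the geometry behind it. It is not the setup Jeremiah built, and it is not tied to any particular animation tool. The function name and the coordinates are hypothetical, and treating the bounding-box center as the default pivot is an assumption about how many tools behave.

```python
import math


def rotate_about_pivot(point, pivot, degrees):
    """Rotate `point` around `pivot` by `degrees`.

    Positive angles turn counterclockwise when y points up. On screens,
    where y usually points down, the same angle appears clockwise.
    """
    angle = math.radians(degrees)
    dx = point[0] - pivot[0]
    dy = point[1] - pivot[1]
    return (
        pivot[0] + dx * math.cos(angle) - dy * math.sin(angle),
        pivot[1] + dx * math.sin(angle) + dy * math.cos(angle),
    )


# Hypothetical arm: shoulder at (10, 20), hand at (30, 20).
shoulder = (10.0, 20.0)
hand = (30.0, 20.0)
bbox_center = (20.0, 20.0)  # assumed default pivot: the middle of the shape

# Pivoting at the shoulder keeps the joint attached and swings the hand.
hand_from_shoulder = rotate_about_pivot(hand, shoulder, 45)

# Pivoting at the bounding-box center moves the shoulder itself,
# so the arm appears to detach from the body.
shoulder_from_center = rotate_about_pivot(shoulder, bbox_center, 45)
```

The point of the sketch is the second call: with a centered pivot, the shoulder does not stay put, which is why a custom pivot is needed for a natural swing.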
What’s worth noting is how the work moved between the exploration and the execution.
Kristin’s exploration gave the project direction and momentum. Jeremiah’s craft made it production-ready.
Kristin took the refined system Jeremiah established and extended it—new illustrations, new scenes, built on the standards he’d set.
It was a relay. Each handoff produced something neither could have reached alone as quickly, or with as much quality control built in.

The decisions that made it usable
The design work was only part of it.
We ran a workshop with the Arlow team to define how the character should actually behave in the product: where it appears, where it doesn’t, what emotional register is appropriate in which contexts. This wasn’t aesthetic work. It was dozens of product decisions built into a framework.
A character that celebrates completing a medication log is doing something different from a character that appears when a user is trying to find emergency contact information. For a product serving people managing their parents’ care, getting that wrong isn’t a minor UX issue. It’s a trust issue.
Those decisions required understanding the users. They required product context. They required someone who could hold the brand strategy and the UX implications in the same conversation.
None of that came from the tools. It came from the work of knowing what you’re building and who you’re building it for.
What we shipped
The character came to life across Arlow’s product and marketing even before their launch. It gave the product a recognizable presence, something distinctive that competitors with generic interfaces don’t have. It gave the product team a system they can extend rather than an asset they’ll have to reinvent. And because it was built with clear usage rules from the start, it won’t sprawl into places it shouldn’t be.
What started as a parallel exploration became a meaningful part of how Arlow shows up in the market.

What this means for how you think about AI in design
If you’re a design leader navigating pressure from leadership to ‘just use AI tools’ or trying to figure out where they genuinely help versus where they create more work, here’s what we’d take from this:
AI is genuinely useful for the exploration phase.
It compresses the time it takes to move through a wide range of directions, surface what you don’t want, and get to a concept faster. If your team is stuck early or needs to cover creative ground quickly, it earns its place.
But it can’t maintain systems.
Consistency degrades the further you push. Budget for cleanup time and token costs before you start—the economics of real work look nothing like the demo economics.
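If you want a rough way to budget for that, here is a back-of-the-envelope sketch. The one-in-five usable rate is what we saw on this project. The asset count and the cost per attempt are made-up placeholders, and the function name is ours. It assumes each attempt is independent, and it leaves out cleanup time, which you should budget separately.

```python
def generation_budget(assets_needed, usable_rate, cost_per_attempt):
    """Estimate the attempts and spend needed to get `assets_needed` usable outputs.

    Assumes each attempt is independent and usable with probability
    `usable_rate`, so the expected attempts per usable asset is 1 / usable_rate.
    """
    if not 0 < usable_rate <= 1:
        raise ValueError("usable_rate must be greater than 0 and at most 1")
    expected_attempts = assets_needed / usable_rate
    return expected_attempts, expected_attempts * cost_per_attempt


# Hypothetical inputs: a 24-icon set at 50 credits per attempt.
# Only the 1-in-5 usable rate comes from this project.
attempts, credits = generation_budget(
    assets_needed=24, usable_rate=0.2, cost_per_attempt=50
)
print(f"about {attempts:.0f} attempts, about {credits:.0f} credits")
```

At a one-in-five rate, every usable asset costs about five attempts on average, so a demo that shows one good result is showing a fifth of the bill.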
The more important point: polished output isn’t differentiated output.
A brand character or icon set built entirely from AI generation looks like what it is: something your competitors could generate from the same starting point. What makes it yours is the research, the decisions, the craft, and the usage guidelines that come from actually understanding who you’re building for.
The teams getting real value from these tools treat them as one input into a process that still requires design expertise at every critical decision point. They know exactly when to stop generating and hand off to someone who can build.
That handoff is where the outcome gets decided.
But it’s also where most teams underinvest because the exploration phase feels like most of the work. It isn’t.
Closing the gap between a product that works and one that feels right is still a human job. It’s faster when brand strategy, product design, and execution expertise are working from the same brief. That’s what Arlow got. That’s what made the difference.
If you’re ready to build something AI can’t replicate, we’d love to help.
