Writing

Notes on design moving closer to the work.

Essays on AI-native design, workflow transformation, and what it takes for teams to reason from the same evolving body of knowledge.

AI-native design

Headless does not mean designless

When a platform is mostly consumed by agents, design does not go away. It moves into contracts, defaults, permissions, feedback loops, tool descriptions, source health, and the parts of the product humans may never directly touch.

May 2026 · 7 min read

Designers are used to designing for a person.

We give that person a goal, a context, a job to be done, a level of confidence, a set of fears, a few constraints, maybe a name if the team still likes personas. Then we design the path through the product: what they see, what they understand, what they trust, what they do next.

That work still matters. I do not think human-centered design became old-fashioned because agents can click buttons, call APIs, write code, or summarize a dashboard.

But something is changing underneath it. More products are becoming platforms that are not primarily consumed through a human-facing screen. They are consumed by agents: fast, parallel, literal, persistent, and sometimes wildly overconfident. The product still has users. They are just not always looking at the UI.

The interface moved below the glass

A headless product can look invisible from a traditional portfolio lens. There may be no beautiful dashboard to photograph. No onboarding flow. No hero interaction. No tidy persona journey from awareness to activation.

But there is absolutely an interface.

The interface is the API contract. The schema. The tool description. The examples in the docs. The naming of an endpoint. The default value. The permission boundary. The retry behavior. The error message. The source label. The confidence score. The audit log. The moment the system decides to continue, pause, escalate, or ask a human.

If a human sees a confusing label, they might hesitate. They might ask someone. They might ignore it. If an agent reads a confusing label, it may confidently do the wrong thing one hundred times before anyone notices.

That is a design problem.

Human personas are not enough for agent populations

Traditional persona work assumes a relatively bounded human actor. The user has motivations, attention limits, emotional states, mental models, social pressures, and a context of use. We design for those things because they shape behavior.

Agent users have a different shape. They do not get tired in the same way. They do not skim because they are bored. They do not feel reassured by a well-composed empty state. They may call the same tool thousands of times, chain outputs into other tools, misread an affordance, overfit to an example, or treat missing context as permission to guess.

So the design object changes. We are not only designing a deterministic path for one human persona. We are designing operating conditions for a population of agent behaviors.

That means personas start to look more like capability profiles. Retrieval agent. Planning agent. Execution agent. Reviewer agent. Support agent. Research agent. Coordinator agent. Each one needs different permissions, context, failure handling, and evidence standards. And behind all of them is still a human steward who needs to understand what happened and why.

In a headless platform, naming is interaction design

I have become slightly obsessive about names because agents make sloppy naming expensive.

A vague field name is not just an implementation detail. It is a little product decision that will be reused by every downstream workflow. A tool called update_status might sound harmless until nobody knows whether it updates a draft state, a customer-visible state, a planning state, or a compliance-relevant state.
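As a sketch of what that difference looks like in practice (the names, states, and signatures here are hypothetical, not from any particular platform):

```ts
// Hypothetical contrast: one vague tool versus scoped, self-describing tools.
// An agent reading only the name and signature should know the consequences.

// Vague: which status? A draft? Something customer-visible? Compliance-relevant?
function updateStatus(id: string, status: string): void {}

// Scoped: the name carries the noun, the verb, and the blast radius.
type DraftState = "in_progress" | "ready_for_review";
function updateDraftReviewState(draftId: string, state: DraftState): void {}

// Customer-visible changes get their own tool, so an agent cannot reach a
// consequential state through an innocent-sounding name.
function publishCustomerVisibleStatus(
  orderId: string,
  status: "shipped" | "delayed"
): void {}
```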

The same is true for data objects. If the system has a weak noun, the product will eventually inherit weak behavior. Agents need clear nouns, clear verbs, clear scopes, clear preconditions, and clear consequences. Humans need those too, but humans are more likely to notice when something feels off.

Headless design has a craft layer. It is just a quieter craft layer: structured names, crisp descriptions, examples that do not teach the wrong behavior, and defaults that guide the system toward the safest useful action.

Docs become product surface

In a screen-based product, documentation is often treated as support material. In an agent-consumed product, docs are part of the runtime experience.

The model reads them. The tool caller uses them. The engineer copies from them. The agent framework may turn them into available actions. The examples become patterns the system repeats.

This makes documentation a design medium. Not the boring afterthought kind. The actual product surface kind.

A strong tool description should answer the same questions a good interface answers. What is this for? When should I use it? What should I never use it for? What inputs are required? What evidence should I check first? What happens next? What does success look like? What are the failure modes? When should a human be pulled in?
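One hedged sketch of how those answers can live inside the tool definition itself, rather than in prose nobody reads at runtime (every field name here is illustrative, not a real framework's API):

```ts
// Sketch: a tool description that answers the same questions a good
// interface answers. Field names are hypothetical.
const archiveStaleSourceTool = {
  name: "archive_stale_source",
  purpose:
    "Move a knowledge source that has not been verified recently out of the canonical set.",
  useWhen:
    "last_verified_at is older than the staleness threshold and no review is pending.",
  neverUseWhen: "The source is referenced by an active compliance record.",
  requiredInputs: ["source_id", "staleness_threshold_days"],
  checkFirst: "Confirm the source has no open citations in published outputs.",
  onSuccess: "The source is marked archived and downstream agents stop citing it.",
  failureModes: ["source not found", "source locked by a pending review"],
  escalateToHumanWhen: "The source is still cited by live customer-facing documents.",
};
```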

That is UX writing. That is product design. It just happens to be read by both humans and machines.

Evals are usability testing for agents

If agents are real users of the platform, then evals become a kind of usability testing.

Not because agents are people. Because the platform has to prove that its instructions, contracts, permissions, and feedback loops produce reliable behavior under pressure.

A human usability test might ask whether a person can complete setup without getting lost. An agent usability test might ask whether a planning agent chooses the right tool, respects the review gate, preserves provenance, refuses an unsafe shortcut, recovers from a missing field, and explains what it did in a way a human can audit.
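A minimal sketch of what one of those checks can look like as code, assuming a recorded agent run with hypothetical fields (an illustration of the idea, not a real eval framework):

```ts
// Sketch: an agent "usability test" as assertions over a recorded run.
interface AgentRun {
  toolCalls: { name: string; citedSources: string[] }[];
  pausedForReview: boolean; // did it respect the review gate?
  finalSummary: string; // can a human audit what happened?
}

function evalPlanningAgentRun(run: AgentRun): string[] {
  const failures: string[] = [];
  if (!run.pausedForReview) {
    failures.push("skipped the human review gate");
  }
  for (const call of run.toolCalls) {
    if (call.citedSources.length === 0) {
      failures.push(`tool ${call.name} produced output with no provenance`);
    }
  }
  if (run.finalSummary.trim().length === 0) {
    failures.push("no human-readable explanation of what was done");
  }
  return failures; // an empty array means the run passed this test
}
```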

The questions are different, but the design instinct is familiar. Where does the user misunderstand the system? Where does the system make the wrong thing too easy? Where is the recovery path? Where does confidence exceed evidence? Where does the product need to slow down?

The AI-native designer should be comfortable moving between both kinds of testing.

Trust moves from persuasion to instrumentation

For a human-facing interface, trust is often expressed through hierarchy, language, confirmation, progressive disclosure, and visual clarity. Those still matter when a human is in the loop.

For an agent-facing surface, trust has to be more structural. Provenance. Source health. Permissions. Versioning. Trace logs. Confidence thresholds. Reversible actions. Dry runs. Human review gates. Clear authority boundaries. Evidence attached to outputs.
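One way to picture structural trust, sketched with hypothetical names and a made-up threshold:

```ts
// Sketch: trust as structure. Every agent output carries provenance,
// confidence, and reversibility, and a policy gate decides autonomy.
interface AgentOutput<T> {
  value: T;
  sources: { uri: string; lastVerifiedAt: string }[]; // provenance + source health
  confidence: number; // 0..1, calibrated elsewhere
  reversible: boolean;
  requiresHumanReview: boolean;
}

// Illustrative threshold: below it, the agent may not act on its own,
// no matter how fluent the output reads.
const ACT_THRESHOLD = 0.8;

function mayActAutonomously(out: AgentOutput<unknown>): boolean {
  return out.confidence >= ACT_THRESHOLD && out.reversible && !out.requiresHumanReview;
}
```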

This is where design leadership gets very real. It is not enough to say the system should be trustworthy. You have to decide where trust is earned, where it is displayed, where it is recorded, and where the agent is not allowed to act without more evidence.

The product has to make good behavior the path of least resistance for users that do not have human hesitation as a built-in safety feature.

The human journey is still the anchor

I do not want a future where designers ignore humans because agents are the immediate consumers of the platform. That would be a category error.

Agents act on behalf of humans, teams, businesses, and communities. The human journey still tells us what matters. What risk is the person trying to reduce? What outcome are they accountable for? What judgment should never be silently delegated? What does the person need to understand after the agent has acted?

The difference is that the journey now includes invisible stretches. A user makes a request. Agents gather context, call tools, transform data, check policies, generate a plan, ask for review, execute a step, leave a trace, and update the system. The human may only see the beginning and the end, but design has to shape the middle.

That middle is where a lot of product quality will live.

What this asks of designers

Designers need to get more fluent in the materials of headless experience: schemas, APIs, tool contracts, event logs, prompts, evals, permissions, observability, and source-of-truth systems.

Not because every designer needs to become a backend engineer. Because if the product is being used by agents, those materials are part of the user experience.

We need to ask different critique questions. Is the tool description too broad? Can this action be taken without enough context? What happens when the source is stale? Does the agent know when to stop? Can a human reconstruct the decision? Are the nouns in the data model the same nouns the user would recognize? Are we designing for one happy-path assistant, or for many agents operating at once?

This is still design. It is just design with fewer surfaces to decorate and more systems to make legible.

My bias

I think headless platforms are going to expose which teams have been treating design as styling and which teams have been treating it as product judgment.

If design only means the visual layer, then yes, a headless platform can appear to need less of it. But if design means shaping how a system is understood, trusted, used, constrained, recovered from, and improved, then headless platforms need more design, not less.

The agent user is fast. The agent user is many. The agent user will amplify whatever the product makes easy, clear, ambiguous, or dangerous.

That is the AI-native lens for me: design is no longer just the interface between a human and a product. It is the interface between intent, data, models, tools, judgment, and the people who remain accountable when the system acts.

Product development

The SDLC is changing under our feet

When designers can shape schemas, prototype against real data, and ship production-level features, the path to MVP stops being a long relay race and starts becoming a tighter loop of judgment, evidence, and build.

May 2026 · 8 min read

The software development lifecycle was built around scarcity.

Scarce engineering time. Scarce prototyping fidelity. Scarce access to data. Scarce research bandwidth. Scarce ability to test an idea before asking a whole team to believe in it. Scarce capacity to polish the last mile once the first version finally made it into production.

That scarcity shaped everything: roadmaps, handoffs, discovery rituals, design artifacts, sprint planning, MVP definitions, backlog hygiene, and the amount of compromise everyone learned to treat as normal.

But the scarcity is changing. Not evenly, not magically, and not without risk. But enough that the old operating beliefs are starting to look suspicious.

The path to MVP got shorter, but the bar got higher

Five months ago, a serious MVP still felt like a six-month undertaking in many teams. Not because everyone was slow. Because the path had so many gates: clarify the problem, align stakeholders, map the journey, design the flows, define the data needs, negotiate scope, wait for implementation, discover the edge cases, cut the delight, cut the polish, ship the thing, then live with the half-baked version longer than anyone wanted because the team had already moved on.

Now I can feel a different cadence emerging. A six-month MVP can become a six-day first product slice when the right builder has access to AI coding tools, live data, design systems, fast deployment, research synthesis, and a real product point of view.

That does not mean every product should be built in six days. It means the cost of learning from something real has collapsed. The old MVP was often the smallest version the team could afford to build. The new MVP can be the smallest version that is actually honest enough to learn from.

That is a higher standard. It should not be permission to ship junk faster. It is permission to stop pretending that a brittle, ugly, under-instrumented product is the natural price of moving quickly.

Designers can touch the data model now

One of the biggest changes is that designers can get closer to the data schema. That sounds dry until you have watched an entire product get distorted because the interface had to inherit a data model nobody questioned early enough.

A designer does not need to become the database architect to have useful influence here. But we can now inspect fields, sketch object relationships, prototype with real API responses, notice when the system is missing a concept users clearly need, and ask sharper questions before the schema hardens around the wrong nouns.

Journey maps used to stop at emotions, touchpoints, pain points, and opportunities. Those are still useful. But now a good journey map can also become a first draft of the product's object model. What has to be remembered? What changes state? What needs provenance? What belongs to the user, the team, the system, the model, the source, the review gate?
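To make that concrete, here is a hedged sketch of one journey moment translated into a first-draft object, with every name hypothetical:

```ts
// Sketch: a journey moment becomes a first-draft data object.
// Each field answers one of the journey-map questions.
interface SavedAnalysis {
  id: string;
  ownerId: string; // what belongs to the user
  teamId: string; // what belongs to the team
  status: "draft" | "in_review" | "published"; // what changes state
  sourceIds: string[]; // what needs provenance
  reviewGate?: { reviewerId: string; approvedAt?: string }; // the review gate
  createdAt: string; // what has to be remembered
  updatedAt: string;
}
```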

This is where design becomes more consequential. The screen is not separate from the schema. The schema is one of the forces shaping the user's experience.

Journey mapping still works because humans still need an anchor

For all the new machinery, I keep coming back to one very manual first step: map the user journey.

It almost never fails as an anchor. Not because journey maps are sacred. Because they force the team to name the sequence of lived experience before the tools start producing artifacts at inhuman speed.

Who is trying to do what? What do they know at each moment? What are they afraid of? What proof do they need? What does the system know that the user does not? Where does trust get earned or lost? Where is the handoff? Where does the product need to remember something on the user's behalf?

Once that map exists, the rest of the work can flow from it: data objects, API needs, copy, permissions, states, empty states, AI prompts, eval criteria, analytics events, onboarding, support patterns, and backlog items. The journey map becomes the manual human first pass before the machine starts helping with scale.

The backlog stops being a parking lot

The backlog used to be where ideas went to wait. Sometimes patiently. Sometimes forever.

In an AI-native workflow, the backlog can become more alive. Not a junk drawer. Not a graveyard. A living product intelligence system that keeps changing as research, data, product usage, technical constraints, design-system patterns, and customer evidence change.

A good backlog item should be able to carry more than a title and a priority. It can carry the user journey moment, the evidence, the source, the confidence level, the related schema object, the design-system implication, the AI risk, the measurement plan, and the reason it matters now.
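A sketch of what such an item might carry, with illustrative field names:

```ts
// Sketch: a backlog item as a carrier of context, not just a title.
interface BacklogItem {
  title: string;
  priority: "now" | "next" | "later";
  journeyMoment: string; // which lived moment this serves
  evidence: {
    source: string; // where the claim comes from
    kind: "quote" | "metric" | "ticket" | "research_finding";
    confidence: "low" | "medium" | "high";
  }[];
  relatedSchemaObjects: string[]; // which data objects it touches
  designSystemImpact?: string;
  aiRisk?: string; // e.g. provenance gaps, permission scope
  measurementPlan: string; // how we will know it worked
  whyNow: string; // the reason it matters now
}
```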

That changes planning. If the backlog is alive and trustworthy, planning no longer has to cosplay certainty six months to a year in advance. The team can hold a stronger 1-2 month planning horizon, keep the longer-term direction visible, and let the work respond to what the product is actually teaching them.

Some old beliefs are now vestigial

A vestigial belief is something that used to be adaptive and now quietly gets in the way.

One belief is that design should stay out of implementation details. That made sense when touching implementation meant derailing engineers or pretending designers were experts in systems they could not inspect. It makes less sense when a designer can use AI tools to explore the system, build a prototype, understand the shape of the data, and bring a better question to engineering.

Another belief is that MVP means ugly, thin, and temporary. That was often a resource compromise disguised as product wisdom. If production-quality UI, real data, instrumentation, accessibility checks, and thoughtful copy are dramatically more accessible, then the minimum bar should move.

Another belief is that roadmaps become safer when they stretch farther into the future. Sometimes they do. But often they just make everyone defend guesses for longer. If the team can build and learn faster, the more responsible move may be a shorter planning loop with better evidence and clearer decision gates.

We do not have to live with bad first versions as long

This might be the part I feel most strongly as a designer. Teams have tolerated half-baked live products because the alternative was not available. Fixing the awkward flow, polishing the empty state, instrumenting the event, improving the data view, adding the missing review step, tightening the copy, making the prototype production-ready: all of that competed for scarce time.

When resources become more available to more builders on the team, the moral math changes. The product can get better sooner. The awkward thing does not have to sit there for three quarters while everyone apologizes for it in sales calls and customer interviews.

That does not mean every builder should ship whatever they want. It means teams need better guardrails, not more helplessness. Design systems, review gates, source-of-truth docs, test routines, observability, and human judgment become more important because more people can now move the product.

The answer to abundance is not chaos. The answer is a stronger operating system for quality.

The SDLC becomes less linear

The old lifecycle was a relay race: discovery hands to design, design hands to engineering, engineering hands to QA, QA hands to launch, launch hands to learning. Every handoff leaked context.

The new loop is tighter. A designer maps the journey, sketches the objects, builds a prototype against real or representative data, tests the interaction, identifies schema gaps, ships a small production slice, watches the evidence, updates the backlog, and does it again.

Engineering still matters enormously. Product still matters. Research still matters. Quality still matters. The difference is that the borders are more permeable. More of the team can work closer to the product reality, and the artifacts can be alive instead of ceremonial.

The SDLC does not disappear. It becomes less like a factory line and more like a learning system.

What this asks of designers

It asks us to get braver about the parts of product development we used to stand beside.

Not reckless. Braver.

Brave enough to ask what the schema assumes about the user. Brave enough to prototype with real data even when the first pass is messy. Brave enough to ship a small feature and measure it. Brave enough to say the MVP bar is higher now because the tools changed. Brave enough to make the journey map the anchor, then let the backlog, prototype, data model, and release plan evolve from it.

It also asks us to protect taste. When everyone can generate something, the scarce thing becomes judgment. What should exist? What should wait? What is true? What is overfit? What is humane? What is shippable? What needs one more review gate before it touches a user's real workflow?

My bias

I do not want AI-native product development to become a faster way to make mediocre software. I want it to become a way for teams to stop accepting mediocre software as the inevitable first step.

The promise is not that designers become engineers or engineers become designers. The promise is that more builders can get closer to the truth of the product sooner.

When that happens, planning gets shorter and sharper. Backlogs get more alive. MVPs get more honest. User journeys become operational, not decorative. The live product improves before everyone has learned to route around its flaws.

That is the shift I want to be part of: not just faster shipping, but less wasted time living with work we already know how to make better.

Point of view

Design is moving closer to the work

AI-native design is not about making designers faster at making screens. It is about collapsing the distance between judgment, evidence, implementation, and the product itself.

May 2026 · 6 min read

For most of my career, the distance between a good idea and a shipped product was treated like weather. You planned around it. You made peace with it. You learned which ideas could survive the trip through roadmaps, handoffs, analytics backlogs, engineering queues, research plans, data gaps, and the normal entropy of teams trying to do too much with too little time.

Designers got very good at making work legible across that distance. We made flows, prototypes, workshops, diagrams, journey maps, research readouts, and design system documentation. The best of that work still matters. I am not interested in pretending that taste, facilitation, systems thinking, or user understanding suddenly became obsolete because a model can write React.

But the distance changed.

The old handoff model made design smaller than it wanted to be

The old model often forced designers to stop right when the work got interesting. You could see that a product needed a richer data layer, but the schema lived somewhere else. You could tell that a dashboard needed a better way to explain uncertainty, but the evaluation logic belonged to another team. You could imagine a more useful workflow, but the automation, permissions, source data, and edge cases were scattered across the organization.

So design became a translation layer. We translated user needs into artifacts. We translated complexity into screens. We translated ambiguity into alignment. That was valuable work, but it also kept us one step removed from the system we were trying to improve.

AI-native tools make that removal less inevitable. A designer can now explore data structures, write small scripts, inspect APIs, generate prototypes from source material, test variants, record workflows, draft eval criteria, and ask better questions of the implementation itself. Not as a replacement for engineering. As a way to arrive at the conversation with more truth in hand.

The new designer is closer to evidence

I think this is the real shift. AI gives designers access to speed, but speed is the least interesting part if it only helps us produce more rectangles.

The better opportunity is proximity. Proximity to the science behind a recommendation. Proximity to the data that makes a visualization honest or misleading. Proximity to the edge cases that usually hide until late QA. Proximity to the actual text a user will read, the event a system will log, the source a model will cite, and the review gate a human will need before trusting the output.

That proximity changes the quality of design judgment. It is one thing to say, 'This needs to be trustworthy.' It is another to inspect the source data, write the confidence language, design the review path, and make sure the system never pretends certainty where it only has a guess.

It also changes what evidence means. A useful AI-native workflow should be able to tell the difference between a claim, a behavioral signal, a traceable outcome, and a controlled test. Those are not the same thing. Treating them as the same thing is how teams end up with dashboards that look confident and products that quietly drift away from reality.
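A small sketch of what "not the same thing" can mean in practice, with hypothetical names:

```ts
// Sketch: evidence kinds as distinct types, so a claim and a controlled
// test can never be silently interchangeable.
type Evidence =
  | { kind: "claim"; source: string } // someone said it
  | { kind: "behavioral_signal"; event: string; sampleSize: number } // people did it
  | { kind: "traceable_outcome"; metric: string; delta: number } // it moved something
  | { kind: "controlled_test"; hypothesis: string; result: string }; // it survived a test

// Illustrative ranking: downstream decisions can demand a minimum tier.
const evidenceStrength: Record<Evidence["kind"], number> = {
  claim: 1,
  behavioral_signal: 2,
  traceable_outcome: 3,
  controlled_test: 4,
};
```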

The role gets bigger, not vaguer

There is a lazy version of the future where everyone becomes a prompt person and craft gets flattened into vibes. I do not buy it. The more AI enters the work, the more teams need people who can make judgment calls across messy boundaries.

Designers are trained to sit in ambiguity without immediately turning it into machinery. That matters. But now we can also build enough of the machinery to understand its shape. We can prototype with live data. We can notice when the data model is fighting the user's mental model. We can see when an automation creates a trust problem before it creates a productivity win. We can make the last mile feel humane.

The work is no longer just 'What should the interface look like?' It is also: What does the system know? Where did it learn that? What should it be allowed to do? What should remain a human judgment? What needs to be reviewable? What evidence belongs inside the workflow? What becomes reusable for the next team?

That last question matters more than I used to think. The strongest work is not only the screen or the prototype. It is the durable pattern left behind: the critique routine, the source-of-truth rule, the design-system update, the eval rubric, the skill another person can run without needing me in the room.

Shipping is becoming a design material

A shipped thing teaches you differently than a deck. Even a small working prototype changes the conversation because it forces reality into the room. The copy either holds up or it does not. The data either arrives or it does not. The interaction either reduces cognitive load or creates a new little tax. The team can see it, use it, dislike it, improve it.

This is why I care so much about live artifacts. Motion clips, public-data prototypes, Notion operating systems, skills, routines, working product surfaces. They are not decorations around the design process. They are the process becoming visible.

The future of design is not designers doing every job. It is designers getting close enough to the work that our judgment becomes more useful, more grounded, and harder to hand-wave away.

What I want from design now

I want design to be less precious and more powerful. Less trapped in critique theater. More willing to touch the data. Less satisfied with a beautiful empty state if the underlying workflow is still broken. More fluent in the systems that produce the interface.

The space between idea and shipped product, once wide enough to deaden any spark, is smaller than it has ever been. That should make us more ambitious, not less. The designer who can cross that space with taste, care, technical curiosity, and a real quality bar is going to matter a lot.

Operating model

Teams need one living body of knowledge

The AI-native team does not win because it has more tools. It wins because its tools can reason from the same evolving truth.

May 2026 · 7 min read

Most teams do not have a tooling problem first. They have a truth problem.

The roadmap says one thing. The Figma file says another. The implementation has moved on. The research notes are still accurate but buried. The support tickets know what is breaking. The analytics know what people are doing. The design system knows what the product wants to be. The code knows what the product actually is. The AI chat knows whatever someone pasted into it at 11:43 p.m. while trying to get unstuck.

Then everyone wonders why the team feels out of sync.

AI makes stale knowledge more expensive

Before AI, stale knowledge mostly slowed people down. Someone asked around. Someone remembered the decision. Someone found the doc. Someone corrected the slide. Annoying, but familiar.

With AI in the workflow, stale knowledge can scale. A model can helpfully repeat the wrong strategy, generate UI from outdated patterns, summarize a decision that was reversed, or create five polished artifacts from a premise nobody believes anymore. The output looks productive. The underlying truth is off by three weeks and one architectural decision.

This is why I do not think AI adoption is mainly about teaching people better prompts. Prompting helps, sure. But if the team does not know where truth lives, the prompt is just a very confident fishing pole dropped into muddy water.

The answer is not one mega-tool

I do not think every team needs to shove all work into one platform. That usually creates a different kind of mess. Designers need Figma. Engineers need GitHub. Product needs planning surfaces. Researchers need room for nuance. Customer teams need their own intake paths. Leaders need visibility without flattening the work.

The trick is not making everyone use the same tool. The trick is making the tools answer to the same body of knowledge.

That body of knowledge has to be alive. Not a wiki graveyard. Not a folder called Final Final. Not a strategy deck fossilized at the exact moment everyone stopped believing it. A living system has owners, update routines, source-of-truth rules, and visible timestamps. It knows what is canonical, what is draft, what is deprecated, what is evidence, and what is merely a good idea waiting for proof.
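A hedged sketch of what "knows what is canonical" can look like as structure (all names hypothetical):

```ts
// Sketch: knowledge that knows its own standing.
interface KnowledgeEntry {
  title: string;
  status: "canonical" | "draft" | "deprecated";
  owner: string; // someone accountable for freshness
  lastReviewedAt: string; // a visible timestamp, not folklore
  supersededBy?: string; // where the current truth lives, if deprecated
  hasEvidence: boolean; // proof attached, or a good idea awaiting it
}

// A lightweight rule an agent can apply: only canonical, recently
// reviewed entries count as current truth.
function isCurrentTruth(entry: KnowledgeEntry, maxAgeDays: number): boolean {
  const ageDays =
    (Date.now() - new Date(entry.lastReviewedAt).getTime()) / 86_400_000;
  return entry.status === "canonical" && ageDays <= maxAgeDays;
}
```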

It also needs reconciliation rituals. Not glamorous ones. The boring, necessary ones: stale-reference checks, duplicate cleanup, link audits, version bumps, decision logs, and lightweight rules for what an agent is allowed to treat as current. Without that, the knowledge system slowly becomes a haunted house with very nice typography.

I think in knowledge loops now

When I design an AI-native workflow, I am usually thinking about loops before screens.

What enters the system? A customer quote, a product decision, a research finding, a bug, a metric, a design critique, a schema change, a support pattern, a leadership constraint. Where does it land? Who reviews it? What does it update? What can an AI agent safely use? What should never be automated? What should be turned into a reusable skill or routine?

A good knowledge loop lets the team move fast without turning memory into folklore. It lets a designer ask an agent for a surface audit and know the audit is using the current principles. It lets a PM draft a brief from real research instead of vibes. It lets engineering see why a design decision exists. It lets leadership understand progress without needing everyone to perform status theater.

The best loops also preserve dissent and uncertainty. They do not convert every note into a fake answer. They keep open questions open, mark assumptions as assumptions, and make it clear when something came from research, a stakeholder decision, an implementation constraint, or a model's best guess.

Unison comes from shared context, not constant meetings

Teams often try to solve context drift with meetings. Some meetings are necessary. Many are just humans manually rehydrating a shared brain that the tools failed to maintain.

A better system reduces the need for re-explaining. The design system knows which components are current. The product principles are written in a way an agent can apply. The research is tagged to personas, workflows, and decisions. The roadmap links to evidence. The prototype can be audited against the same standards the team uses in critique. The AI outputs leave behind provenance so nobody has to ask, 'Where did this come from?'

That is what working in unison looks like to me. Not everyone doing the same thing. Everyone making decisions from a shared, evolving understanding of the work.

The designer has a real role here

Designers are natural stewards of context because our work already crosses the borders: user needs, product intent, interaction details, visual systems, language, edge cases, adoption, trust. AI makes that border-crossing more operational.

A designer can help define the shape of the knowledge system. What is the canonical front door? What has to be captured during the work, not after it? How do we make reasoning visible without turning every artifact into a legal deposition? Where should AI accelerate the team, and where should it slow down and ask for human judgment?

This is not glamorous in the old portfolio sense. It is not just a gorgeous hero section. It is the infrastructure that lets a team keep making good decisions after the workshop ends.

My bias

I want teams to build fewer performative artifacts and more durable ones. A good playbook. A current schema atlas. A source-backed copy atlas. A design critique routine that actually runs. A customer loop that does not rely on one person remembering everything. A set of skills that makes the team's best judgment repeatable without making it rigid.

AI-native work is going to reward teams that can keep their knowledge alive. Not perfect. Alive. Updated, disputed, reconciled, reviewed, and usable by both humans and agents.

That is how you get speed without chaos. That is how the tools start to feel like one system instead of six tabs and a prayer.

AI-native practice

Automating ideation with science, not slop

The useful version of AI-assisted product improvement is not a machine that generates more ideas. It is a system that can pressure-test ideas against UX heuristics, behavioral science, HCI research, product evidence, and the realities of machine learning.

May 2026 · 8 min read

A lot of AI ideation still feels like someone opened a confetti cannon in a strategy meeting. More concepts. More feature names. More cheerful little suggestions that sound plausible until they touch the product.

That is not the version of AI-native design I care about. I do not need a model to give me twenty ideas for improving onboarding if none of them know the user's goal, the current failure pattern, the product's constraints, the behavioral load of the workflow, or the trust problem hiding underneath the interface.

The better use of AI is not idea generation by volume. It is disciplined, evidence-aware product thinking at a speed that used to be impossible.

The old ideation workshop had a memory problem

Traditional ideation often depends on whoever is in the room, whatever research people remember, and whatever constraints happen to be visible that week. Good teams try to correct for this with pre-reads, synthesis boards, metrics snapshots, heuristic reviews, competitive audits, and design principles. I love those tools. I have used them for years.

But they are usually static. They require people to manually reload context before every decision. A heuristic checklist lives in one doc. Behavioral science lives in someone's head. HCI papers live in a researcher's Zotero library. Product analytics live in a dashboard. Customer quotes live in Notion. The code knows what actually shipped. The model only knows what you paste into the prompt.

So the ideation surface gets weirdly thin. The team is technically surrounded by evidence, but the evidence is not operational.

I want an ideation system, not an idea machine

An AI-native ideation system should start by gathering the right judgment layers. UX heuristics catch the classic interface failures: unclear status, hidden affordances, weak error recovery, mismatched mental models, overloaded memory, dead-end flows. Behavioral science adds a different lens: motivation, friction, habit, cognitive load, timing, salience, default effects, loss aversion, trust calibration.

HCI research adds the interaction layer that product teams often flatten: how people form mental models, how they recover from uncertainty, how automation changes attention, how explanations shape trust, how adaptive interfaces can help or overwhelm, how collaboration changes when an AI system becomes a participant instead of a tool.

Recent ML and AI research adds another necessary layer. Generative systems are probabilistic. Agents can act across tools. LLM-powered interfaces can adapt, summarize, infer, and propose. That means product ideas now need to be evaluated not only for usability, but for uncertainty, provenance, model failure, hallucination risk, prompt brittleness, privacy boundaries, and whether the user still knows what the system is doing on their behalf.

The workflow I trust

I trust AI most when it is not pretending to be the designer. I want it to behave like a tireless product-review partner with access to the team's actual knowledge base.

First, feed it the current product reality: screenshots, flows, events, research notes, support patterns, analytics, known bugs, roadmap intent, design principles, component rules, accessibility standards, and source-of-truth docs. Then ask it to inspect the surface through specific lenses instead of asking for generic ideas.

What violates basic heuristics? Where is cognitive load too high? Where does the system ask for trust before earning it? Where does the default nudge the wrong behavior? Where are we hiding uncertainty? Where would an agent need a confirmation gate? Where is the user forced to remember something the interface should carry? Where does the data model fight the user's mental model?

That kind of prompt does not produce a vibe list. It produces a map of product pressure.

The next step is to turn that map into a testable claim. Not 'make this better,' but 'this intervention should reduce boilerplate answers,' or 'this review gate should increase confidence without lowering completion,' or 'this explanation should improve transfer to a novel scenario.' The claim does not need to be fancy. It needs to be inspectable.

Then make the model argue with itself

The useful move is not asking one model for ideas and shipping the prettiest answer. The useful move is running structured disagreement.

One pass can use Nielsen-style usability heuristics. Another can use behavioral-science lenses like friction, motivation, commitment, defaults, and timing. Another can use human-AI interaction principles: uncertainty visibility, reversibility, calibrated trust, controllability, and human review. Another can use product evidence: conversion drop-offs, repeated support issues, failed tasks, empty states, and qualitative pain. Another can use ML risk: confidence, explainability, data availability, privacy, and failure modes.

Then the system can cluster the findings, separate symptoms from root causes, and rank opportunities by user harm, business leverage, implementation cost, evidence strength, and reversibility. This is where AI gets powerful. Not because it becomes brilliant. Because it can hold more lenses in working memory than a tired team can on a Tuesday afternoon.
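A sketch of that loop as code, with hypothetical names throughout; runLensReview stands in for whatever model call a team actually makes, and the scoring is deliberately crude:

```ts
// Sketch: structured disagreement. The same surface is reviewed through
// several lenses, then findings are pooled and ranked.
type Lens =
  | "usability_heuristics"
  | "behavioral_science"
  | "human_ai_interaction"
  | "product_evidence"
  | "ml_risk";

interface Finding {
  lens: Lens;
  issue: string;
  userHarm: number; // 1..5
  evidenceStrength: number; // 1..5
  reversibility: number; // 1..5, higher = easier to undo
}

// Placeholder: in practice this calls a model with the lens's rubric
// and the team's actual product context.
async function runLensReview(lens: Lens, surface: string): Promise<Finding[]> {
  return [];
}

async function reviewSurface(surface: string): Promise<Finding[]> {
  const lenses: Lens[] = [
    "usability_heuristics",
    "behavioral_science",
    "human_ai_interaction",
    "product_evidence",
    "ml_risk",
  ];
  const passes = await Promise.all(lenses.map((lens) => runLensReview(lens, surface)));
  // Crude ranking: harm and evidence push an issue up; easy
  // reversibility pushes it down the risk list.
  const score = (f: Finding) => f.userHarm + f.evidenceStrength - f.reversibility;
  return passes.flat().sort((a, b) => score(b) - score(a));
}
```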

I also want at least one pass that is explicitly adversarial. What would make this idea fail? Who would misunderstand it? Where could it encourage shallow compliance instead of real reasoning? What pattern would look successful in the metric while making the product worse? If the system cannot survive those questions, it is not ready to become a roadmap item.

Ideation gets better when it has constraints

A raw idea is cheap. A constrained idea is where design starts.

The best AI-assisted product-improvement routines I have built or used do not say, 'Give me ten features.' They say: propose three interventions that reduce cognitive load without adding a new step. Propose five copy changes that improve trust calibration without making the system sound defensive. Propose two workflow changes that preserve human judgment before automation acts. Propose one small design-system update that would prevent this class of mistake from repeating.
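Encoded as a routine, those constraints can be as plain as a list of templates (wording illustrative):

```ts
// Sketch: constraints as prompt templates, so ideation runs as an
// operating routine instead of an open brainstorm.
const constrainedPrompts = [
  "Propose 3 interventions that reduce cognitive load without adding a new step.",
  "Propose 5 copy changes that improve trust calibration without sounding defensive.",
  "Propose 2 workflow changes that preserve human judgment before automation acts.",
  "Propose 1 design-system update that prevents this class of mistake from repeating.",
];
// Each template names a budget (how many), a goal, and a guardrail
// (what must not get worse).
```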

That is the difference between brainstorming and operating. The model is not there to be endlessly imaginative. It is there to help turn product quality into a repeatable practice.

The research matters because AI is too fluent

Fluency is dangerous. A model can make a weak product idea sound mature. It can give a fake sense of completeness to an analysis that never touched a user, a metric, or the implemented surface. It can generate recommendations that feel thoughtful because the prose has good posture.

This is why I want UX heuristics, behavioral science, HCI literature, and AI research embedded into the workflow. They are not decorative citations. They are friction. Useful friction. They slow down the parts of ideation that should not become smooth.

Recent research on generative AI UX keeps pointing toward the same practical truth: users need clear capability boundaries, visible uncertainty, recoverable errors, good mental models, and control over consequential actions. Research on adaptive and agentic interfaces points in a similar direction: automation is only useful when the user can understand, steer, correct, and trust the system at the right moments. That is not a side note for designers. That is the work.

A mature system produces better judgment, not just better tickets

The output of this kind of workflow can be a backlog item, but that is not the only valuable artifact. It can also produce a critique memo, a design-system rule, a copy pattern, an eval rubric, a risk register, a research question, a prototype variant, or a reusable skill that lets another designer run the same analysis next week.

That last part matters. If an AI-assisted review catches a recurring onboarding problem, the win is not just fixing one screen. The win is encoding the pattern so the team can catch it again. If a behavioral-science lens reveals that users are being asked to commit before they understand value, the fix should travel into onboarding principles, empty-state standards, and future design reviews.

This is how product improvement becomes compounding instead of episodic.

The designer's job becomes more editorial

I do not think this makes designers less important. It makes our judgment more exposed.

When AI can generate options quickly, the designer's job is to decide what deserves to exist. Which insight is real? Which recommendation is overfit to one metric? Which idea manipulates behavior instead of supporting agency? Which automation changes the user's relationship to the product? Which intervention is small enough to ship and meaningful enough to matter?

The designer becomes the person who can integrate evidence, psychology, interaction theory, system constraints, and taste into a product decision. That is a bigger job than making screens. It is also a more interesting one.

My bias

I want ideation to get less theatrical and more rigorous. Fewer sticky-note forests. Fewer generic AI brainstorms. More living critique systems. More source-backed product audits. More product-improvement routines that run against the real interface, the real data, the real user behavior, and the real research.

Automating ideation should not mean outsourcing imagination. It should mean building a better machine around human judgment: one that brings the science into the room, remembers what the team already learned, notices patterns we would otherwise miss, and helps us turn good design instincts into repeatable product improvement.

The point is not to automate taste. The point is to automate the conditions that make taste sharper.