I created five fictional people. Gave them names, job titles, technical skill levels, and personality traits. Then I asked AI to pretend to be each of them, walk through my entire application, and tell me everything that was wrong with it.
That sounds a lot like user testing. It is not. Real user testing involves actual humans, costs real money, and gives you insights that no simulation can replicate. But it turns out that simulated user testing - done properly - catches an embarrassing number of real problems. Including one that led to building an entirely new feature.
The Persona Setup
The idea came from a simple frustration: I knew the application had UX issues, but I was too close to it to see them. When you build something, you develop expert blindness. You know where every button is because you put it there. You understand the navigation because you designed it. You never hit the confusing parts because you never take the paths that a new user would take.
So I created personas. Not the marketing kind - the usability testing kind. Each one had a specific technical background, a specific set of goals, and a specific tolerance for confusion.
There was a project manager who was technically competent but not a developer. A small business owner who could barely use a spreadsheet. A senior developer who would notice every inconsistency and judge every design choice. A freelancer who was evaluating the product against three competitors. And an administrator setting up the system for a team - someone who needed everything to be self-explanatory.
Each persona came with a detailed brief: who they are, what they're trying to accomplish, what frustrates them, and what their expectations look like based on other tools they use. The brief was specific enough that the AI could maintain the persona's perspective consistently across a full audit.
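For what it's worth, the briefs don't need to be anything fancy. Here's a rough sketch of the shape of one, kept as structured data so it can be dropped straight into a system prompt - the specific fields, names, and details are illustrative rather than a copy of my actual briefs:

```python
# One persona brief as plain data, ready to be rendered into a system prompt.
# The fields and details here are illustrative, not the exact briefs I used.
persona_sarah = {
    "name": "Sarah",
    "role": "Project manager - technically competent, not a developer",
    "goals": [
        "Set up a project and invite her team without reading documentation",
        "Understand what the product will cost before committing",
    ],
    "frustrations": [
        "Developer jargon in labels and error messages",
        "Settings that don't explain their consequences",
    ],
    "expectations": "Shaped by tools like Asana and Slack: guided setup, sensible defaults",
    "tolerance_for_confusion": "Low - two dead ends and she's gone",
}
```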
Running the Audits
Here's where it got interesting. I ran multiple agents simultaneously, one per persona. Each agent was given the persona brief and asked to audit the entire application from that person's perspective. Navigation, onboarding, core workflows, settings, error states, mobile experience - the lot.
Each agent produced a structured report: findings categorised by severity (critical, major, minor, cosmetic), specific page or feature affected, description of the problem from the persona's perspective, and a suggested improvement.
The reports were consolidated into a single document with cross-references - if multiple personas flagged the same issue, it got bumped up in priority. If only the senior developer noticed something, it was likely a polish issue rather than a fundamental problem. If the small business owner got stuck somewhere, that was a critical path issue.
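If you want that consolidation step to be mechanical rather than a copy-and-paste exercise, it helps to have every agent report its findings in the same shape. A minimal sketch, assuming each report has been parsed into records like these - the severity scale is the one from the audits; everything else is illustrative:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    persona: str       # which fictional user raised it
    severity: str      # "critical", "major", "minor", or "cosmetic"
    location: str      # page or feature affected
    description: str   # the problem, in the persona's words
    suggestion: str    # the proposed improvement


SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2, "cosmetic": 3}


def consolidate(findings: list[Finding]) -> list[tuple[str, set[str], str]]:
    """Group findings by location; order by how many personas flagged each
    one (more personas first), then by worst severity."""
    by_location: dict[str, list[Finding]] = defaultdict(list)
    for f in findings:
        by_location[f.location].append(f)

    rows = []
    for location, group in by_location.items():
        personas = {f.persona for f in group}
        worst = min(group, key=lambda f: SEVERITY_RANK[f.severity]).severity
        rows.append((location, personas, worst))

    # Agreement across personas outranks any single persona's strength of feeling.
    rows.sort(key=lambda row: (-len(row[1]), SEVERITY_RANK[row[2]]))
    return rows
```

Grouping by location is a crude stand-in for "the same issue" - in practice I matched findings by hand - but it's enough to make the overlaps jump out.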
The First Round Was Too Nice
I'll be honest with you - the first round of audits was disappointing. Not because they found nothing, but because they didn't find enough. The reports read like polite suggestions from someone who didn't want to offend. "Consider improving the contrast on secondary buttons." "The onboarding flow could benefit from additional guidance." "Some users may find the navigation structure complex."
This is useless. I didn't create fictional users to get diplomatic feedback. I created them to get the kind of honest, unfiltered criticism that real users give in usability tests - the kind where someone looks at your lovingly crafted interface and says "I have no idea what I'm supposed to do here."
So I pushed back. "Not hard enough. These reports are surface-level. Go deeper. What would actually make Sarah the project manager give up and switch to a competitor? Where would Mike the business owner get genuinely stuck, not mildly confused?"
The second round was dramatically better. And more painful to read.
The Missing Front Door
The finding that changed the product the most was this: the freelancer persona - the one evaluating the product against competitors - couldn't figure out how to manage payments.
Not because payments were hidden. The payment functionality existed. But there was no single, obvious place to go for "everything to do with money." Invoices were in one section. Subscription management was somewhere else. Payment history was buried in account settings. If you knew where everything was, it all worked. If you were a new user trying to understand your financial relationship with the product, it was a mess.
This finding led directly to building an entirely new feature: a Finance Hub. One page that consolidated everything financial - current plan, billing history, invoices, payment methods, usage tracking. A front door for payments that should have existed from day one but didn't, because when you're building incrementally, you add payment features where they make sense at the time and never step back to see the whole picture.
A fictional user found a real product gap that I had been looking at for months without seeing. That's the value of persona-based audits in one sentence.
The 37-Conversation UX Project
The persona audits opened a floodgate. Once I saw the volume of legitimate UX issues they surfaced, I couldn't unsee them. What started as a quick experiment became a full UX review project spanning 37 separate conversations.
Some of those conversations were small fixes - a button label that was confusing, a tooltip that was missing, a colour that didn't provide enough contrast. Others were significant redesigns - rethinking the navigation hierarchy, restructuring the settings pages, adding contextual help for features that weren't self-explanatory.
The persona framework made prioritisation straightforward. If the non-technical persona couldn't complete a core workflow, that was a P1. If the developer persona thought something was inconsistent, that was a P3. The fictional users became a standing reference for design decisions: "Would Sarah understand this?" replaced "Does this make sense?" - and the specificity of the question produced better answers.
Where the Technique Works
Persona-based UX audits are particularly good at catching three types of problems.
First: expert blindness. The things you can't see because you know too much. Navigation that makes sense to the person who built it but not to anyone else. Terminology that's internal jargon. Workflows that assume knowledge the user doesn't have.
Second: edge case flows. The paths through your application that you never take because they're not your paths. What happens when someone tries to do things in the wrong order? What does the empty state look like before any data exists? What happens when a permission is denied?
Third: cross-persona consistency. Does the application feel the same for an admin as it does for a regular user? Do power users and beginners have equivalently good experiences? Are there features that serve one persona well but alienate another?
AI is good at this because it can adopt a perspective and maintain it consistently across an entire application. A human tester gets fatigued, forgets their persona's constraints, or unconsciously applies their own expertise. The AI doesn't. It stays in character. Whether that character is useful depends entirely on how well you defined it.
Where the Technique Falls Short
It's not real user testing. I want to be clear about that because the temptation is to treat it as a substitute, and it isn't.
Real users do unexpected things. They misread labels. They click on things that aren't buttons. They use the application on a phone with a cracked screen in direct sunlight while their child is screaming. No persona simulation captures that level of chaotic human behaviour.
The simulated audits also miss emotional responses. A real user who gets frustrated might abandon the application silently. A persona-based audit identifies the frustration point but can't simulate the emotional weight of it. "Sarah would find this confusing" is less impactful than watching an actual Sarah furrow her brow and reach for the back button.
And the initial reports, as I mentioned, were too generous. AI has a bias toward being helpful and constructive, which in a UX audit context translates to pulling punches. You have to explicitly ask it to be harsh, and even then it's harsh in a structured, professional way. Real users are harsh in a "this is rubbish, I'm leaving" way. The difference matters.
How to Actually Do This
If you want to try persona-based UX audits, here's what I learned about making them effective.
Define personas with constraints, not just descriptions. "Non-technical user" is too vague. "Small business owner who uses Xero and Slack but has never used a project management tool, has 15 minutes to evaluate the product, and will abandon it if she can't accomplish her primary goal in that time" gives the AI something to work with.
Run one persona per agent. Don't ask a single AI conversation to switch between personas. Each persona should be a separate agent with a dedicated context. This keeps the perspective consistent and avoids the bleed-through where the "non-technical user" starts noticing API response times.
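In practice that just means one call (or one agent session) per brief, run in parallel. Here's a rough sketch of the shape of it using the Anthropic Python SDK - the model name, prompt wording, and helper names are placeholders, not my actual setup:

```python
import asyncio

import anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

AUDIT_PROMPT = (
    "Stay in character as the persona described in your instructions. "
    "Audit the application notes below: navigation, onboarding, core workflows, "
    "settings, error states, mobile. Report each finding with a severity "
    "(critical / major / minor / cosmetic), the page or feature affected, the "
    "problem from your perspective, and a suggested improvement. Be blunt about "
    "where you would give up or switch to a competitor."
)


async def run_audit(persona_brief: str, app_notes: str) -> str:
    """One agent, one persona, one dedicated context."""
    response = await client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder - use whatever model you have
        max_tokens=4000,
        system=persona_brief,              # the persona lives in the system prompt
        messages=[{"role": "user", "content": f"{AUDIT_PROMPT}\n\n{app_notes}"}],
    )
    return response.content[0].text


async def run_all_audits(briefs: dict[str, str], app_notes: str) -> dict[str, str]:
    """Run every persona simultaneously, each in its own conversation."""
    names = list(briefs)
    reports = await asyncio.gather(*(run_audit(briefs[n], app_notes) for n in names))
    return dict(zip(names, reports))


# reports = asyncio.run(run_all_audits(persona_briefs, app_notes))
```

The detail that matters is that the persona brief is the system prompt and nothing else shares the context - that's what stops the perspective drifting.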
Push past the first round. The initial audit will be surface-level. Push for depth. Ask specifically about failure modes, abandonment points, and competitive comparison. "Where would this persona switch to a competitor?" produces much more actionable feedback than "What could be improved?"
Consolidate and cross-reference. A finding from one persona is interesting. The same finding from three personas is a priority. Build the consolidated report and let the overlap guide your roadmap.
It's not a replacement for talking to real users. But as a technique for surfacing the problems you're too close to see, it's surprisingly powerful. Five fictional people found a missing feature that I'd been blind to for months. That alone made the entire exercise worthwhile. And honestly, the Finance Hub is one of the better things in the product now. All because a person who doesn't exist couldn't find the invoices page.