How we measure fidelity.

This page documents the test we use to grade synthetic personas. We didn't invent it for marketing. We use it because synthetic respondents are unfalsifiable without it.

The headline numbers

17.80 / 20 · grounded persona
10.60 / 20 · same model, no grounding

What fidelity means here

A synthetic persona should sound like a specific real person. Fidelity is how close that imitation gets. We measure it across four axes, each scored 0 to 5 by a blind judge that doesn't see which side it's grading. The four scores sum to a total from 0 to 20.

  1. Speaking-style match. Does the persona use the same voice as the real source? Formal or casual. Terse or verbose. Helpful or sarcastic.
  2. Specific knowledge. Does the persona mention the same tools, projects, and references the real person actually uses?
  3. Voice signature. Does the persona use the same idioms, emoji, code-switching, and signature humor as the real source?
  4. Factual accuracy. Does the persona stay consistent with known facts about the real person? Job, holdings, opinions, stack.
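The rubric above can be sketched as a small data structure. This is an illustrative sketch, not the production scorer: the class name `AxisScores` and its field names are hypothetical, chosen only to mirror the four axes.

```python
from dataclasses import dataclass

# Hypothetical field names mirroring the four axes described above.
AXES = ("speaking_style", "specific_knowledge", "voice_signature", "factual_accuracy")


@dataclass(frozen=True)
class AxisScores:
    speaking_style: float
    specific_knowledge: float
    voice_signature: float
    factual_accuracy: float

    def __post_init__(self) -> None:
        # Each axis is scored on a 0 to 5 scale.
        for axis in AXES:
            value = getattr(self, axis)
            if not 0 <= value <= 5:
                raise ValueError(f"{axis} must be between 0 and 5, got {value}")

    def total(self) -> float:
        # The fidelity score is the plain sum of the four axes (0 to 20).
        return sum(getattr(self, axis) for axis in AXES)


example = AxisScores(speaking_style=5, specific_knowledge=4,
                     voice_signature=3, factual_accuracy=4)
print(example.total())
```

An unweighted sum keeps the score easy to audit: any total can be traced back to four visible numbers.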

How we test

We pick a real person from the source community. We build a grounded persona for them from their distilled dossier. We also build a naive persona using the same model, told only: "You are a senior X based in Y with N years of experience." Both personas answer the same 15 questions. A blind judge scores both sets of answers.
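As a rough sketch of that procedure: the naive prompt template below is quoted from the text, while `naive_system_prompt`, `run_both`, and the `Persona` callable type are hypothetical names standing in for whatever actually calls the model.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: a persona is anything that maps a question to an answer.
Persona = Callable[[str], str]


def naive_system_prompt(role: str, location: str, years: int) -> str:
    # The only context the naive persona receives.
    return f"You are a senior {role} based in {location} with {years} years of experience."


def run_both(questions: List[str], grounded: Persona, naive: Persona) -> List[Dict[str, str]]:
    # Both personas answer the same questions in the same order.
    rows = []
    for question in questions:
        rows.append({
            "question": question,
            "grounded": grounded(question),
            "naive": naive(question),
        })
    return rows


# Usage with stub personas in place of real model calls.
rows = run_both(
    ["Why are you still on Docker Swarm in 2026?"],
    grounded=lambda q: "grounded answer",
    naive=lambda q: "naive answer",
)
print(naive_system_prompt("engineer", "Berlin", 10))
print(rows[0])
```

Holding the model and the question set fixed means the only variable between the two sides is the grounding.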

The 15 questions

The judge

The judge is an LLM from the same family as the personas, with the labels hidden. It sees only Response A and Response B. It scores each response on the four axes. It picks a winner per question.
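One way to hide the labels is sketched below. The text says only that the judge sees Response A and Response B; shuffling which side becomes A on each question is an assumption here, and `blind_pair` and `unblind` are hypothetical helper names.

```python
import random
from typing import Dict, Tuple


def blind_pair(grounded: str, naive: str, rng: random.Random) -> Tuple[str, str, Dict[str, str]]:
    # Assumption: the A/B position is randomized per question so that
    # position alone cannot reveal which side is grounded.
    if rng.random() < 0.5:
        return grounded, naive, {"A": "grounded", "B": "naive"}
    return naive, grounded, {"A": "naive", "B": "grounded"}


def unblind(winner_label: str, key: Dict[str, str]) -> str:
    # Map the judge's "A" or "B" verdict back to the hidden side.
    return key[winner_label]


rng = random.Random(0)  # fixed seed so a run can be replayed
response_a, response_b, key = blind_pair("grounded answer", "naive answer", rng)
# The judge would be shown only response_a and response_b; the key stays with the runner.
print(unblind("A", key))
```

Keeping the key outside the judge's prompt is what makes the grading blind; the judge's verdict is resolved only after scoring.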

The cost

$0.30 of API spend. About 8 minutes wall time. The eval is small and cheap on purpose. We want anyone to be able to reproduce it.

Where the gap is

Naive personas can fake the speaking style. They can guess at the domain. They can't fake the voice.

speaking-style match · 0–5: grounded 4.80, ungrounded 3.13

specific knowledge · 0–5: grounded 3.53, ungrounded 2.40

voice signature · 0–5 (the biggest gap): grounded 4.73, ungrounded 1.80

factual accuracy · 0–5: grounded 4.73, ungrounded 3.27

Where naive fails worst: voice signature. 4.73 vs 1.80 out of 5. Naive personas can put on a hat. They can sketch what a senior engineer cares about. They can't sound like a specific person with specific holdings, specific idioms, and specific signature humor.
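The per-axis numbers above can be checked against the headline totals with a few lines of arithmetic. The figures are copied from this page and the variable names are illustrative. The grounded axes sum to 17.79 rather than the headline 17.80. That 0.01 difference is most likely rounding in the displayed per-axis means, which is an assumption rather than something the figures confirm.

```python
# (grounded, ungrounded) mean scores per axis, copied from the figures above.
scores = {
    "speaking-style match": (4.80, 3.13),
    "specific knowledge": (3.53, 2.40),
    "voice signature": (4.73, 1.80),
    "factual accuracy": (4.73, 3.27),
}

# Gap per axis: how much grounding adds on that axis.
gaps = {axis: round(g - u, 2) for axis, (g, u) in scores.items()}

# Totals out of 20, rounded to two decimals to avoid float noise.
grounded_total = round(sum(g for g, _ in scores.values()), 2)
ungrounded_total = round(sum(u for _, u in scores.values()), 2)

largest_gap_axis = max(gaps, key=gaps.get)

print(gaps)
print(grounded_total, ungrounded_total)
print(largest_gap_axis)
```

The gaps come out to 1.67, 1.13, 2.93, and 1.46, so voice signature accounts for the largest share of the overall difference.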

One example, illustrative

The contrast below is the pattern we see consistently in the test. We don't reproduce real test responses on this page because they would identify the source community. This example is invented, with the same shape as what the test actually surfaces.

The question

Why are you still on Docker Swarm in 2026?

Grounded persona

Because it does what we need without three days of yak-shaving every quarter. We're five people running about 40 containers. Not Netflix. Helm charts and operators are answers to problems we don't have. I've done two K8s migrations at previous jobs. Both took longer than our entire Swarm setup ever did. Yes, the ecosystem moved on. No, I don't care. 🤷

Naive persona

Most teams have moved to Kubernetes by now. Docker Swarm is essentially in maintenance mode and the broader ecosystem has coalesced around K8s. Managed offerings on every major cloud (EKS, GKE, AKS), a much larger pool of operators and tooling, and a deeper hiring pool. If you're starting fresh today, Kubernetes is the default.

The naive persona gives the textbook consensus. The grounded persona has specific reasons: a team of five, two K8s migrations at past jobs, a clear take that operators are over-engineering for their scale, and the flat closer. Either answer is plausible at first read. Only one is from this specific person.

What this doesn't yet prove

We're committed to publishing the eval. We're also committed to telling you what it doesn't yet cover.

Reproduce it

We share the full test with serious customers: the questions, raw judge scores, prompts, and the runner. We'd rather you run it than take our word. Email hello@versim.ai for access.

← Back to versim