methodology · pipeline

How we build a persona.

Most "AI persona" tools guess from a job title. Ours are distilled from conversations a real person actually had. This page walks through the pipeline.

What a persona actually is, here

A versim persona is a typed role-play of a specific real person from a community we have permission to model. Not a generic "senior X" stereotype. One person's career, tools, tones, opinions, and signature phrasing, compressed into a structured profile and made queryable.

The real person never sees your question. You never see the real person. The anonymiser is the only path between you.

the pipeline

Six stages, archive to answer

  1. Archive in, anonymized when we load it. We start with a community's archive: messages, threads, timestamps, social context. We only work with archives we have permission to use. Identifiers are stripped at load time, so the rest of the pipeline never sees raw names, handles, or contact information. Communities can request deletion at any time.
  2. A dossier per person. For each real person in the archive, we build one structured profile. It covers their career, the tools they use and how deeply, their speaking style, their tone in conversation, rough seniority and current goals, who they listen to, personal details, and a short summary. Every claim points to the exact message that supports it, checked against the source text at build time. Made-up content is a bug. We reject it in tests.
  3. A team of agents builds it, not one big agent. Different aspects of a person need different lenses. We use a team of small specialised agents: one for career, one for tools, one for speaking style, one for tone, and so on. Each agent reads the archive and returns its specific slice with sources. A single monolithic agent tends to lose specificity and make things up. Small agents stay focused.
  4. The anonymiser is a one-way valve. Nothing crosses to you without going through it. The anonymiser strips real user IDs, message IDs, handles, display names, exact source-text spans of fifteen words or more, and any blocked phrases on the per-archive list. What passes through: anonymized response text, public labels like persona_07_of_30 that don't link back to anyone, and a count of how many spans were removed. You never see a dossier, a real name, or a long exact quote.
  5. A panel builder picks who answers. You don't ask one persona. You ask a panel. The builder takes typed filters (size, role tier, activity level, tone exclusions, tool requirements, location, current goals) and picks a balanced sample that matches them. The build is repeatable: the same filters with the same starting point give you the same panel twice.
  6. The panel answers, a summarizer combines. Each persona in the panel reads the question and answers in their own voice, grounded in their dossier plus a balanced sample of their real messages for cadence. Each claim is linked to evidence. The per-persona answers go through the anonymiser, then to a summarizer agent that builds the panel's collective answer, surfaces themes, and counts the sentiment split. That's what reaches you.
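
Stage 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions: each agent is modelled as a plain function from the anonymized archive to a list of sourced claims, and the names Claim, tools_agent, tone_agent, and build_dossier are hypothetical, not the real interfaces. The stand-in agents use keyword matching where a real agent would call a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Claim:
    text: str
    source_ids: tuple  # ids of the archive messages that back the claim


# Hypothetical agent shape: anonymized archive in, sourced claims out.
Agent = Callable[[dict], list]


def tools_agent(archive):
    # Stand-in for a model call: cite every message that mentions pandas.
    ids = tuple(mid for mid, text in archive.items() if "pandas" in text.lower())
    return [Claim("uses pandas", ids)] if ids else []


def tone_agent(archive):
    # Stand-in for a model call: a crude keyword cue for tone.
    ids = tuple(mid for mid, text in archive.items() if "sorry" in text.lower())
    return [Claim("self-deprecating", ids)] if ids else []


def build_dossier(archive, agents):
    """Run each specialised agent and keep its slice under its own field."""
    dossier = {}
    for field, agent in agents.items():
        claims = agent(archive)
        for claim in claims:
            # A claim with no sources, or with unknown ones, is treated as a bug.
            if not claim.source_ids or any(s not in archive for s in claim.source_ids):
                raise ValueError(f"unsourced claim in {field!r}: {claim.text!r}")
        dossier[field] = claims
    return dossier
```

Keeping one field per agent means a bad slice can be traced to the agent that produced it, and a build fails loudly instead of shipping an unsourced claim.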

stage 2 · zoomed in

What a dossier looks like

A dossier is the long-form output of the team of agents: a JSON document with typed fields. The current shape covers:

Career timeline

Roles, how long, transitions, role types.

Tools and depth

Which tools they actually use, and how deeply, based on how they talk about them.

Speaking style

Formal, casual, terse, verbose; how it shifts with topic.

Conversation tone

Friendly-casual, helpful-terse, sarcastic-but-helpful, complaining, self-deprecating, and so on.

Seniority + goals

Rough seniority and what they're currently trying to do: looking for info, switching jobs, hiring, investing, selling.

Who they listen to

Whose advice they take, who they push back on, who they ignore.

Personal details

Hobbies, opinions, location, signature humor, idioms, emoji.

Summary

A short summary that ties everything together.

Every entry in every field points to the exact messages that support it. If the dossier claims someone is a value investor, there is a real message backing that claim, with the exact text saved and checked against the source. The dossier itself stays private. It never leaves our infrastructure.
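
The build-time check described above might look roughly like this. It is a sketch, assuming each dossier entry stores the message id and the exact quoted text it relies on. Evidence, Entry, and unsupported are illustrative names, and the archive is modelled as a simple id-to-text mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    message_id: str
    quote: str  # exact text saved alongside the claim


@dataclass(frozen=True)
class Entry:
    field: str
    claim: str
    evidence: tuple  # one or more Evidence items


def unsupported(entries, archive):
    """Return the entries whose evidence is missing or does not match the source."""
    bad = []
    for entry in entries:
        ok = bool(entry.evidence) and all(
            # An empty quote would match any text, so reject it explicitly.
            ev.quote and ev.quote in archive.get(ev.message_id, "")
            for ev in entry.evidence
        )
        if not ok:
            bad.append(entry)
    return bad
```

A test suite can then assert that unsupported(...) is empty for every built dossier, which is one way to make made-up content fail the build.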

stage 5 · zoomed in

Building a panel

A panel is a typed sample drawn from the dossier set. The builder accepts typed filters: panel size, role tier, activity level, tone exclusions, tool requirements, location, and current goals.

The build is repeatable. Two runs with the same filters and the same starting point give you the same panel. Two runs with the same filters and different starting points give you different but statistically matched panels. Either way, you can inspect how a panel was drawn and reproduce it.
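
A repeatable build of this kind can be sketched with a seeded sampler. The assumptions are loud ones: the "starting point" is treated as an integer seed, only a subset of the filters is modelled, and the balancing step is left out entirely. Profile, PanelFilters, matches, and build_panel are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    label: str        # public label, never a real identifier
    role_tier: str
    tone: str
    tools: frozenset


@dataclass(frozen=True)
class PanelFilters:
    size: int
    role_tier: Optional[str] = None
    exclude_tones: frozenset = frozenset()
    required_tools: frozenset = frozenset()


def matches(profile, filters):
    return (
        (filters.role_tier is None or profile.role_tier == filters.role_tier)
        and profile.tone not in filters.exclude_tones
        and filters.required_tools <= profile.tools
    )


def build_panel(profiles, filters, seed):
    # Sort first so the result does not depend on the order profiles arrive in.
    pool = sorted((p for p in profiles if matches(p, filters)), key=lambda p: p.label)
    rng = random.Random(seed)  # private generator: same seed, same draw
    return rng.sample(pool, min(filters.size, len(pool)))
```

Using a private random.Random instance rather than the module-level functions keeps the draw independent of any other randomness in the process.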

stage 6 · zoomed in

How a persona answers

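When a question comes in, each persona in the panel reads it and answers in their own voice, grounded in their dossier plus a balanced sample of their real messages for cadence. Each claim in the answer is linked to evidence.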

The persona answer goes through the anonymiser before reaching the summarizer. The summarizer never sees the dossier or the raw answer. You never see them either.
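
The ordering matters more than the details, so here is a sketch of just the ordering. It assumes personas, the anonymiser, and the sentiment classifier can each be passed in as a function. PublicAnswer, run_panel, and summarise are illustrative names, and theme extraction is omitted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicAnswer:
    label: str  # public label such as "persona_07_of_30"
    text: str   # anonymized text only


def run_panel(question, personas, anonymise):
    """Collect one anonymized answer per persona."""
    public = []
    for label, answer in personas.items():
        raw = answer(question)  # raw, dossier-grounded answer stays in this scope
        public.append(PublicAnswer(label, anonymise(raw)))
    return public


def summarise(answers, classify):
    """Combine anonymized answers; the summarizer only ever receives PublicAnswer."""
    split = Counter(classify(a.text) for a in answers)
    return {"answers": answers, "sentiment_split": dict(split)}
```

Because summarise takes only PublicAnswer values, there is no code path by which a raw answer or a dossier could reach it in this sketch.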

the one-way valve

The trust boundary, in detail

The anonymiser is the only path from real-person-grounded text to anything you see. It runs on every response, every time, with no toggle to turn it off.

Removed at the boundary

  • User IDs and message IDs from the source archive
  • Real names, display names, and handles
  • Exact source-text spans of fifteen words or more
  • Blocked phrases on the per-archive list
  • Contact information of any form

Passes through to you

  • Anonymized answer text in the source community's voice
  • Public persona labels (e.g., persona_07_of_30) that don't link back to any real person
  • A count of how many spans were removed

You never receive a real name, a real handle, or a long exact quote. Per-account question limits prevent reconstruction by repeated querying. The anonymiser is enforced by code and tested on every release.
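
To make the boundary rules concrete, here is a simplified sketch of three of them: blocked phrases, handles, and exact spans of fifteen words or more. It assumes handles look like @name, compares words case-insensitively on whitespace boundaries, and collapses whitespace in the output. The function name anonymise and the "[removed]" placeholder are illustrative, and ID and contact-information stripping are not shown.

```python
import re

MIN_SPAN = 15  # exact source spans of this many words or more are removed


def _shingles(words, n):
    """All runs of n consecutive words, as a set of tuples."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def anonymise(text, sources, blocked=(), min_span=MIN_SPAN):
    """Return (clean_text, removed_count). A sketch, not a complete anonymiser."""
    removed = 0
    for phrase in blocked:
        if not phrase:
            continue  # an empty pattern would match everywhere
        text, n = re.subn(re.escape(phrase), "[removed]", text, flags=re.IGNORECASE)
        removed += n
    text, n = re.subn(r"@\w+", "[removed]", text)  # assumed handle format
    removed += n

    # Index every min_span-word run that appears in the source messages.
    known = set()
    for source in sources:
        known |= _shingles(source.lower().split(), min_span)

    words = text.split()
    lowered = [w.lower() for w in words]
    covered = [False] * len(words)
    for i in range(len(words) - min_span + 1):
        if tuple(lowered[i:i + min_span]) in known:
            for j in range(i, i + min_span):
                covered[j] = True

    # Replace each maximal covered run with a single placeholder.
    out, i = [], 0
    while i < len(words):
        if covered[i]:
            removed += 1
            out.append("[removed]")
            while i < len(words) and covered[i]:
                i += 1
        else:
            out.append(words[i])
            i += 1
    return " ".join(out), removed
```

Returning the count alongside the text mirrors the "count of how many spans were removed" that passes through the boundary.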

honest scope

What this doesn't yet cover

We're committed to publishing the methodology and to being clear about its current scope.
