How versim treats the data. Privacy and community rights.

Stays private: source archives + dossiers
Never sold: no resale, no third-party share
Opt out any time: deletion on request

where the data comes from

We only work with archives we have permission to use.

We don't scrape private channels. We don't buy or trade scraped archives. We don't work with data of unclear origin.

The archives we work with come from one of two paths:

Open community archives that the community has itself chosen to make publicly accessible.
Direct partnerships with community owners who have invited us in to model their community.

Either way, we establish a clear legal basis to use the archive before any other stage of the pipeline runs.

what stays private, what you see

You never see the source data.

The trust boundary is the anonymiser. It's the only path from real-person-grounded text to anything you ever receive.

Stays with us, always

The source archive (raw messages)
Dossiers (structured profiles per person)
Real names, handles, display names, email addresses
Contact information of any form
Anything that could identify a real person

Visible to you

The panel's combined answer (anonymized)
Per-persona answers, labeled persona_NN_of_MM
Themes the summarizer surfaced
Sentiment counts (positive, neutral, negative, mixed)
The fidelity score and the panel filters that were used

Nothing else reaches you. Not real names. Not handles. Not contact data. Not long exact quotes. Not the source archive. Not the dossiers. The anonymiser strips identifiable text on every response, every time, with no toggle to turn it off.
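As an illustration only, the items listed under "Visible to you" could be modelled as a single response object. The class name, field names, and the zero-padding in the label helper are assumptions made for this sketch, not versim's actual schema. Only the persona_NN_of_MM label shape comes from the list above.

```python
from dataclasses import dataclass, field


def persona_label(index: int, total: int) -> str:
    """Build a label of the form persona_NN_of_MM (zero-padding is an assumption)."""
    if not 1 <= index <= total:
        raise ValueError("index must be between 1 and total")
    width = max(2, len(str(total)))
    return f"persona_{index:0{width}d}_of_{total:0{width}d}"


@dataclass
class PanelResponse:
    """Hypothetical shape of what a customer receives; no source data appears here."""
    combined_answer: str
    persona_answers: dict[str, str]          # keyed by persona label
    themes: list[str] = field(default_factory=list)
    sentiment_counts: dict[str, int] = field(
        default_factory=lambda: {"positive": 0, "neutral": 0, "negative": 0, "mixed": 0}
    )
    fidelity_score: float = 0.0
    panel_filters: dict[str, str] = field(default_factory=dict)


response = PanelResponse(
    combined_answer="Most of the panel prefers the simpler onboarding flow.",
    persona_answers={persona_label(1, 12): "I would pick the simpler flow."},
)
```

The point of the shape is what it lacks: there is no field for names, handles, message IDs, or raw messages, so nothing of that kind has a place to travel in.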

how the anonymiser protects the source

The anonymiser is enforced by code, not policy.

What it removes from any output before it reaches you:

User IDs and message IDs from the source archive
Real names, display names, and handles
Exact source-text spans of fifteen words or more
Blocked phrases on the per-archive list
Contact information of any form

It runs on every persona answer and on the summarizer's output. It is tested on every release. A failure to anonymize is a hard release blocker, not a warning to be reviewed later.
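The removal rules above can be sketched as a single filter. This is a minimal illustration, not versim's implementation: the `ArchiveRules` structure, the `[removed]` placeholder, the `user_`/`msg_` ID format, and the email pattern are all assumptions. Only the categories removed and the fifteen-word threshold come from the list above.

```python
import re
from dataclasses import dataclass, field

SPAN_WORDS = 15  # exact source-text spans of this many words or more are removed
PLACEHOLDER = "[removed]"  # assumed replacement token

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # simplified contact pattern
ID_RE = re.compile(r"\b(?:user|msg)_\d+\b")         # assumed archive ID format


@dataclass
class ArchiveRules:
    """Hypothetical per-archive configuration."""
    names_and_handles: set[str] = field(default_factory=set)
    blocked_phrases: set[str] = field(default_factory=set)
    source_messages: list[str] = field(default_factory=list)


def _norm(word: str) -> str:
    return word.lower().strip(".,;:!?\"'()")


def redact_long_spans(text: str, sources: list[str], n: int = SPAN_WORDS) -> str:
    """Replace any run of n or more consecutive words copied from a source message."""
    source_ngrams = set()
    for message in sources:
        tokens = [_norm(w) for w in message.split()]
        for i in range(len(tokens) - n + 1):
            source_ngrams.add(tuple(tokens[i:i + n]))

    words = text.split()
    tokens = [_norm(w) for w in words]
    hit = [False] * len(words)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in source_ngrams:
            for j in range(i, i + n):
                hit[j] = True

    out = []
    for word, is_hit in zip(words, hit):
        if not is_hit:
            out.append(word)
        elif not out or out[-1] != PLACEHOLDER:
            out.append(PLACEHOLDER)  # collapse a copied run into one placeholder
    return " ".join(out)


def anonymise(text: str, rules: ArchiveRules) -> str:
    """Apply every removal rule; there is deliberately no flag to skip a step."""
    text = redact_long_spans(text, rules.source_messages)
    text = EMAIL_RE.sub(PLACEHOLDER, text)
    text = ID_RE.sub(PLACEHOLDER, text)
    literals = rules.names_and_handles | rules.blocked_phrases
    for phrase in sorted(literals, key=len, reverse=True):  # longest first
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        text = re.sub(pattern, PLACEHOLDER, text, flags=re.IGNORECASE)
    return text
```

A release test in this style would feed a synthetic panel through `anonymise` and fail the build if any known identifier or copied span survives.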

For more on the engineering, see the methodology page.

community rights

Communities can opt out at any time.

Right to be removed. A community can ask to be removed at any time, for any reason. We don't ask why. Email hello@versim.ai from an address connected to the community.
What gets deleted. The source archive. All dossiers built from it. Any cached panel output that draws on it. Any fidelity scores computed from it. We delete derived artefacts too, not just the raw data.
How fast. Promptly. Days, not months. We confirm completion in writing once the deletion is done.
Audit on request. We can show you exactly what was held and when it was deleted. The deletion runbook is exercised on a synthetic archive in our tests, so we know it works.
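To make the deletion scope concrete, here is a small sketch of a removal routine run against a synthetic archive. The in-memory `Store`, its field names, and the receipt format are hypothetical; the sketch only mirrors the four categories named above (source archive, dossiers, cached panel output, fidelity scores).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Store:
    """Hypothetical in-memory stand-in for wherever the data actually lives."""
    archives: dict[str, list[str]] = field(default_factory=dict)         # archive id -> raw messages
    dossiers: dict[str, list[dict]] = field(default_factory=dict)        # archive id -> profiles
    cached_outputs: dict[str, set[str]] = field(default_factory=dict)    # output id -> archive ids used
    fidelity_scores: dict[str, list[float]] = field(default_factory=dict)  # archive id -> scores


def delete_community(store: Store, archive_id: str) -> dict:
    """Remove the archive and everything derived from it; return an audit receipt."""
    # A cached output is stale if it draws on the archive at all, even partly.
    stale = [oid for oid, used in store.cached_outputs.items() if archive_id in used]
    receipt = {
        "archive_id": archive_id,
        "archive_deleted": store.archives.pop(archive_id, None) is not None,
        "dossiers_deleted": len(store.dossiers.pop(archive_id, [])),
        "cached_outputs_deleted": len(stale),
        "fidelity_scores_deleted": len(store.fidelity_scores.pop(archive_id, [])),
        "completed_at": datetime.now(timezone.utc).isoformat(),
    }
    for output_id in stale:
        del store.cached_outputs[output_id]
    return receipt


# Exercise the routine on a synthetic archive, alongside one that must survive.
store = Store(
    archives={"synthetic": ["hello", "world"], "other": ["kept"]},
    dossiers={"synthetic": [{"persona": 1}, {"persona": 2}], "other": [{"persona": 1}]},
    cached_outputs={"run1": {"synthetic"}, "run2": {"synthetic", "other"}, "run3": {"other"}},
    fidelity_scores={"synthetic": [0.8, 0.7], "other": [0.9]},
)
receipt = delete_community(store, "synthetic")
```

Note that `run2` is deleted even though it also draws on another archive, which matches the stated rule that any cached output drawing on the removed archive goes with it.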

commitments to the source community

What we will never do.

Hard "no"s

Sell, share, or license the underlying archive to anyone
Extract contact information from the archive
Enable advertising or marketing automation against the source community
Train other models on the source data
Publish the source archive or its raw contents
Surface real names, handles, or contact data in any output
Pretend a synthetic answer is from the real person

The source community's data is not the product. The product is the anonymized, scored panel output. The underlying data stays with us; only output that has passed through the anonymiser at the boundary ever reaches you.

safeguards in code

Enforcement, not policy.

Privacy commitments only matter if they're enforced. We enforce ours in the code path, not just in this document.

Anonymiser is a hard gate. Every persona answer and every summarizer output passes through it. Bypass is not a configurable setting.
Test on every release. A regression test runs the anonymiser against a synthetic-but-realistic test panel. If any identifier or long exact source-text span slips through, the build fails.
Per-account question limits. When the hosted product launches, every account will be capped on questions per panel and questions per day. The cap exists to prevent reconstruction-by-repeated-querying.
IP addresses are stored as salted hashes. We never log or persist the raw IP of anyone who visits our waitlist or, once it launches, the product. The salt rotates.
Audit logs of account activity. We keep them for abuse detection. They don't store account data beyond what is needed to enforce the question limits.
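Two of the safeguards above, salted IP hashing and question caps, can be sketched in a few lines. Everything specific here is an assumption for illustration: the daily rotation period, HMAC-SHA-256 as the keyed hash, the class names, and the cap values. The page states only that the salt rotates and that questions are capped per panel and per day.

```python
import hashlib
import hmac
import secrets
from collections import Counter
from datetime import date
from typing import Optional


class RotatingIPHasher:
    """Keyed hash of an IP with a salt that is replaced each day (assumed period)."""

    def __init__(self) -> None:
        self._salt_day: Optional[date] = None
        self._salt = b""

    def hash_ip(self, ip: str, today: Optional[date] = None) -> str:
        today = today or date.today()
        if today != self._salt_day:
            # The old salt is discarded, so earlier hashes cannot be recomputed.
            self._salt = secrets.token_bytes(32)
            self._salt_day = today
        return hmac.new(self._salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


class QuestionLimiter:
    """Caps questions per (account, panel) and per (account, day)."""

    def __init__(self, per_panel: int, per_day: int) -> None:
        self.per_panel = per_panel
        self.per_day = per_day
        self._panel_counts: Counter = Counter()
        self._day_counts: Counter = Counter()

    def allow(self, account: str, panel: str, today: date) -> bool:
        if self._panel_counts[(account, panel)] >= self.per_panel:
            return False
        if self._day_counts[(account, today)] >= self.per_day:
            return False
        self._panel_counts[(account, panel)] += 1
        self._day_counts[(account, today)] += 1
        return True


hasher = RotatingIPHasher()
digest = hasher.hash_ip("203.0.113.7", today=date(2025, 1, 1))
limiter = QuestionLimiter(per_panel=2, per_day=3)  # hypothetical cap values
```

Within one rotation period the same IP maps to the same digest, which is enough for rate limiting; after rotation the digests no longer link up. The limiter counts only accepted questions, so a rejected request does not consume quota.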

honest scope

What we're still figuring out.

We're committed to publishing the privacy posture and to being clear about what isn't yet locked down.

Per-archive variation. Different partnerships may have different terms. Some communities may ask for stricter rules than this baseline. We respect the strictest applicable rules per archive.
Cross-archive panels. A panel that draws from more than one community archive enforces the strictest per-archive rule across all included sources. We are working out the consent flow for partners who want their archive included in cross-archive panels.
Compliance certifications. SOC 2, ISO 27001, and similar certifications are on the roadmap once the hosted product launches. Today our claim is structural (anonymiser, code-enforced, testable), not certified.
Regulatory regimes. Specific compliance (GDPR, CCPA, sectoral laws) varies by jurisdiction. We are designing toward the strictest regimes by default. If you have a specific compliance requirement before signing on as a customer, get in touch.
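The "strictest rule wins" behaviour for cross-archive panels can be illustrated with a small merge function. This is a sketch under assumptions: the `ArchivePolicy` fields are hypothetical, and it covers only two rule types that the anonymiser section mentions (the exact-span threshold and the per-archive blocked-phrase list). It says nothing about the consent flow, which is still being worked out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivePolicy:
    """Hypothetical per-archive rules; the baseline threshold is fifteen words."""
    span_removal_threshold: int = 15            # lower means stricter
    blocked_phrases: frozenset = frozenset()    # more phrases means stricter


def strictest(policies: list) -> ArchivePolicy:
    """Combine policies so that each rule takes its strictest value across sources."""
    if not policies:
        raise ValueError("a panel must draw on at least one archive")
    return ArchivePolicy(
        span_removal_threshold=min(p.span_removal_threshold for p in policies),
        blocked_phrases=frozenset().union(*(p.blocked_phrases for p in policies)),
    )


baseline = ArchivePolicy()
stricter_partner = ArchivePolicy(
    span_removal_threshold=8,
    blocked_phrases=frozenset({"internal codename"}),
)
panel_policy = strictest([baseline, stricter_partner])
```

Taking the minimum threshold and the union of blocked phrases means adding an archive to a panel can only tighten the rules, never loosen them.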

How we build a persona — the end-to-end pipeline. The anonymiser sits between every persona's grounded answer and you.
How we measure fidelity — the fair test that scores a persona against real source messages. The score is a published number, but the source data behind it stays private.
Frequently asked questions — plain answers to common questions about versim, the data, pricing, and the waitlist.
Back to versim — the landing page, with the sample panel run and the demo.

Questions? Write to hello@versim.ai.

