Research Center · Methodology

The contract behind every finding.

A finding is only as honest as the pipeline that produced it. This page describes that pipeline end-to-end: where data comes from, what gets stripped before any model touches it, which models we run, and what conditions cause a result to be silently refused. Read it once; re-read it any time a finding feels too good to be true.

1. Where the data comes from

All research data originates in the Manifesting Tracker (our companion mobile app). Members log outcomes they're working toward, practice sessions, mental triggers they noticed, and "signs" or synchronicities they observed. Nothing is collected outside the Tracker. We never read browser history, location, contacts, or any passive signal.

Default off. A new member is not in the research pool until they say yes at first sign-in.
Reversible. Members can revoke consent any time. Revocation excludes them from the next extract; their prior contributions are also dropped.
No free-text. Only structured fields (categories, statuses, methods, dates) are extracted. Personal notes, scene descriptions, and journal entries never leave the operational database.

2. What gets stripped before analysis

Operational data and research data live in separate Supabase projects. The link between them is one-way: an extract job reads from operational, transforms, and writes to the warehouse. The research engine never sees operational rows. Three transformations happen at the boundary:

Pseudonymisation. Each consenting auth.uid is mapped to a random UUID (wh_user_id). The mapping is stored only on the operational side; the warehouse cannot reverse it. Keeping the mapping lets us find and drop a member's warehouse rows when they revoke, without ever exposing their identity to the warehouse.
Bucketing. Dates are coarsened to the month and stored as its first day (e.g. 2026-03-01); the day and hour are never kept. Free-text labels are mapped to a short controlled vocabulary; anything unrecognised collapses to other.
Whitelist of fields. Only the columns listed in the field whitelist are extracted. New fields require an explicit migration that we publish here before it runs.
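
The three boundary transformations above can be sketched as follows. This is a minimal illustration, not the extract job itself: the whitelist entries, the vocabulary, the row shape, and every function name are assumptions made for the example.

```python
import uuid
from datetime import date, datetime

# Hypothetical whitelist and vocabulary, for illustration only.
FIELD_WHITELIST = ("category", "status", "method", "logged_at")
METHOD_VOCAB = {"scripting", "visualisation", "affirmation"}


def pseudonymise(auth_uid: str, mapping: dict[str, str]) -> str:
    """Return a stable random UUID for auth_uid.

    `mapping` stands in for the operational-side table; it is never
    copied into the warehouse.
    """
    if auth_uid not in mapping:
        mapping[auth_uid] = str(uuid.uuid4())
    return mapping[auth_uid]


def bucket_month(ts: datetime) -> date:
    """Coarsen a timestamp to the first day of its month."""
    return date(ts.year, ts.month, 1)


def bucket_label(label: str, vocab: set[str]) -> str:
    """Map a label onto the controlled vocabulary, else 'other'."""
    norm = label.strip().lower()
    return norm if norm in vocab else "other"


def extract_row(row: dict, mapping: dict[str, str]) -> dict:
    """Build a warehouse row from an operational row.

    Only whitelisted fields are copied; anything else (notes, the raw
    auth_uid) is left behind.
    """
    out = {"wh_user_id": pseudonymise(row["auth_uid"], mapping)}
    for field in FIELD_WHITELIST:
        value = row[field]
        if field == "logged_at":
            value = bucket_month(value)
        elif field == "method":
            value = bucket_label(value, METHOD_VOCAB)
        out[field] = value
    return out
```

Building the output row from the whitelist, rather than deleting unwanted keys from the input, means a newly added operational column is excluded by default.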

3. The k ≥ 10 invariant

The single most important rule: no published number describes fewer than ten distinct members. If a slice (a category, a cluster, a filter combination) has nine or fewer contributors, the engine refuses to write it. This rule is enforced at three independent layers, so a bug at any single one cannot leak a small cell:

Model layer. Each model checks its own sample size in __post_init__ and raises before emitting findings.
Publisher layer. The runner that lifts draft findings into published_findings refuses any row with sample_size < 10 and logs a warning.
Database layer. A CHECK constraint on published_findings.sample_size rejects any insert below the threshold at the SQL level.
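
A hedged sketch of how the three layers might stack. The `Finding` class, the `publish` function, and the column names other than sample_size are hypothetical, and an in-memory SQLite database stands in for the Postgres database behind Supabase so that the example runs on its own.

```python
import logging
import sqlite3
from dataclasses import dataclass

K_MIN = 10  # the k >= 10 threshold from this section
log = logging.getLogger("publisher")


@dataclass(frozen=True)
class Finding:
    model_id: str
    slice_key: str
    sample_size: int

    def __post_init__(self) -> None:
        # Model layer: a finding below the threshold cannot be constructed.
        if self.sample_size < K_MIN:
            raise ValueError(f"sample_size {self.sample_size} < {K_MIN}")


def publish(drafts: list[dict], conn: sqlite3.Connection) -> int:
    """Publisher layer: skip and log any draft row below the threshold."""
    written = 0
    for d in drafts:
        if d["sample_size"] < K_MIN:
            log.warning("refusing %s: sample_size=%d", d["slice_key"], d["sample_size"])
            continue
        conn.execute(
            "INSERT INTO published_findings (model_id, slice_key, sample_size)"
            " VALUES (?, ?, ?)",
            (d["model_id"], d["slice_key"], d["sample_size"]),
        )
        written += 1
    return written


def make_db() -> sqlite3.Connection:
    """Database layer: the CHECK constraint rejects small cells in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE published_findings ("
        " model_id TEXT NOT NULL,"
        " slice_key TEXT NOT NULL,"
        " sample_size INTEGER NOT NULL CHECK (sample_size >= 10))"
    )
    return conn
```

Each layer repeats the same comparison on purpose: an insert that bypasses `publish` still hits the CHECK constraint, and a draft row built without `Finding` is still filtered by the publisher.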

Dashboards apply the same rule per filter slice. A category with too few members will simply not appear — not as a zero, not as a placeholder, just absent. Better to under-show than to under-protect.
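
Per-slice suppression on a dashboard could look like the sketch below. The function name and the (wh_user_id, slice_key) row shape are assumptions for the example. It counts distinct members rather than rows, because one member can contribute many rows to the same slice.

```python
from collections import defaultdict

K_MIN = 10  # same threshold as the publishing pipeline


def visible_slices(rows: list[tuple[str, str]]) -> dict[str, int]:
    """Map each slice to its distinct-member count, omitting small slices.

    `rows` are (wh_user_id, slice_key) pairs. Slices with fewer than
    K_MIN distinct members are left out of the result entirely, so the
    caller has nothing to render for them.
    """
    members: dict[str, set[str]] = defaultdict(set)
    for wh_user_id, slice_key in rows:
        members[slice_key].add(wh_user_id)
    return {key: len(ids) for key, ids in members.items() if len(ids) >= K_MIN}
```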

4. The models

Wave-1 ships three small, well-defined models. Each one has a single job, a documented input shape, and a documented output shape. Adding or replacing a model requires a migration and a published note here.

Trigger × session crosstab (model_trigger_session_crosstab_v1). For each trigger type (doubt, fear, comparison, …), what kinds of session followed within seven days, and how often did they precede a "manifested" outcome? It produces findings such as "impatience-triggered scripting sessions correlated with manifestation in X% of cases."
Practice-pattern clusters (model_practice_pattern_clusters_v1). K-means over per-user practice fingerprints (method mix, cadence, scene preferences). The engine picks the number of clusters by silhouette score and emits one finding per cluster. (This cluster count is unrelated to the k in the k ≥ 10 invariant.)
Cohort outcome comparison (model_cohort_outcome_comparison_v1). Compares manifestation rates between cohorts. It reports a difference only when both arms clear k ≥ 10, the difference is statistically significant, and the gap is large enough to matter.
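
To make the gating in the cohort comparison concrete, here is a minimal sketch. The page does not say which significance test or thresholds the engine uses, so the pooled two-proportion z-test, the 0.05 significance level, and the 0.05 minimum gap below are all assumptions chosen for illustration, as is the function name.

```python
import math
from typing import Optional

K_MIN = 10      # both arms must have at least this many members
ALPHA = 0.05    # assumed significance level
MIN_GAP = 0.05  # assumed smallest rate difference worth reporting


def compare_cohorts(hits_a: int, n_a: int, hits_b: int, n_b: int) -> Optional[dict]:
    """Return a difference record, or None when any gate fails."""
    if n_a < K_MIN or n_b < K_MIN:
        return None
    rate_a, rate_b = hits_a / n_a, hits_b / n_b
    gap = rate_a - rate_b
    # Pooled two-proportion z-test (normal approximation).
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # every member hit, or every member missed: nothing to test
        return None
    z = gap / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    if p_value >= ALPHA or abs(gap) < MIN_GAP:
        return None
    return {"rate_a": rate_a, "rate_b": rate_b, "gap": gap, "p_value": p_value}
```

The normal approximation is weak for arms close to the minimum size, so a production model would likely prefer an exact test for small cohorts.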

5. The consent flow

The first time a member signs in to either the Tracker or the University, a single calm prompt asks: include my de-identified practice data in the Research Center, or not? Both choices stick; both are recorded with a version string and a timestamp.

The choice is stored in research_consent with a consent_version tag, so we can re-prompt only when the consent text changes materially.
An audit table research_consent_events records every grant, revocation, and version bump.
The first-login flow is atomic: the marker that you've answered the prompt and the consent row are written in the same transaction. You cannot end up half-prompted.
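
The atomic first-login write can be sketched like this. The table names research_consent and research_consent_events come from the list above; the prompt_markers table, all column names, the event labels, and the version string are hypothetical, and SQLite stands in for the operational Postgres database.

```python
import sqlite3
from datetime import datetime, timezone

CONSENT_VERSION = "v1"  # hypothetical version tag


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE prompt_markers (
            user_id TEXT PRIMARY KEY, answered_at TEXT NOT NULL
        );
        CREATE TABLE research_consent (
            user_id TEXT PRIMARY KEY, granted INTEGER NOT NULL,
            consent_version TEXT NOT NULL, recorded_at TEXT NOT NULL
        );
        CREATE TABLE research_consent_events (
            user_id TEXT NOT NULL, event TEXT NOT NULL,
            consent_version TEXT NOT NULL, recorded_at TEXT NOT NULL
        );
        """
    )
    return conn


def record_first_answer(conn: sqlite3.Connection, user_id: str, granted: bool) -> None:
    """Write the prompt marker, consent row, and audit event together."""
    now = datetime.now(timezone.utc).isoformat()
    event = "grant" if granted else "decline"
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("INSERT INTO prompt_markers VALUES (?, ?)", (user_id, now))
        conn.execute(
            "INSERT INTO research_consent VALUES (?, ?, ?, ?)",
            (user_id, int(granted), CONSENT_VERSION, now),
        )
        conn.execute(
            "INSERT INTO research_consent_events VALUES (?, ?, ?, ?)",
            (user_id, event, CONSENT_VERSION, now),
        )
```

If the consent insert fails, the marker insert is rolled back with it, so the member is prompted again on the next sign-in rather than being left with a marker and no recorded choice.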

6. Reproducibility

Every finding lists its data window, sample size, model id, and model run id. The engine code is open source. Given a copy of the warehouse and a model id, the same window will produce the same finding.

Engine repo: the engine/ folder of manifesting_university on GitHub.
Schema: warehouse migrations in warehouse/supabase/migrations/; operational research-side migrations in supabase/migrations/2026091*.
Status: the most recent run of every model is visible at /research/engine.

7. What we don't claim

That a published finding establishes causation. Most findings are correlational by design.
That the pilot cohort represents all manifesters everywhere. Findings describe this community at this time.
That the models are the only valid models. Outside researchers can apply for warehouse access at /research/apply.
That this is finished. The methodology will evolve; every change appears in version history here and in the published findings' data_source tag.
