White paper · technical companion to “Context Is the Moat”

The Consultant Claude Can't Replace

Generic AI can already do the research, the synthesis, the first-draft deck. The one thing it can't do is know your clients the way your firm does: every decision, every relationship, every reversal, across years. That accumulated context is the only thing standing between your firm and being replaceable, and most firms let it evaporate. This is how you build and own it instead.

TL;DR

  • The fear is real but mis-aimed. Generic AI commoditizes the work — research, synthesis, first drafts. What it can't touch is your firm's accumulated, client-specific context. That's the only durable moat left in professional services.
  • The hard part isn't the AI — it's owning the context. Wiring a model to your tools is a weekend. Turning scattered work into structured, self-maintaining, owned institutional memory is the system this paper describes.
  • Four problems solved: capture context from where work happens → formalize it into a real model (client → engagement → decision → stakeholder) → retrieve it without drowning in stale history → own it (your model key, your isolation).
  • The differentiator is curation, not retrieval. A weekend RAG gets worse as you add history; a curation layer (staleness, supersedes, contradiction, retrieval-weight) makes it get better — old and contradicted context is down-ranked automatically.
  • You own it. Run it on your own frontier-model key, isolated per firm, with no training on your data. (One-click export / portability is on the roadmap.)

Figure 1: The context pipeline

Sources (Slack, Email, Drive / OneDrive, Task tools, Calendar, Uploads)
→ Capture (normalize to one corpus)
→ Formalize (client · engagement · decision · stakeholder)
→ Index (embeddings + curation weight)
→ Retrieve (semantic × curation)
→ Agents (reason + validated citations)
→ Grounded output (answer + cited sources)

Owned: your model key · per-firm isolation · no training on your data

Every source, whatever its shape, becomes one owned corpus that agents reason over — and the answer points back at real, traceable origins.

The part everyone underestimates

That context is also the hardest thing to build — which is why most firms that reach for it never finish. If you've spent a weekend wiring Claude to your Slack and your Drive, you already know the easy 80%: pull some documents, embed them, drop them in a vector database, retrieve the nearest matches, stuff them into a prompt. It works in a demo. It's also where most internal projects quietly stall — because a retrieval toy and an owned, compounding context layer are different things, and the difference is the 20% nobody demos.

That 20% is four problems: capturing context from where work actually happens, formalizing it into a structure that means something, retrieving it without drowning in your own stale history, and owning it so the asset is yours and not a vendor's. Here's how Aether solves each.

1. Capture: meet the work where it already lives

Context doesn't live in a knowledge base someone remembers to update. It lives in the firm's everyday work — Slack threads, email, documents, calendar, the task tracker, meeting debriefs. So Aether ingests from there directly.

Today that means connectors for Slack, Gmail, Google Drive, Outlook, OneDrive, Google/Outlook Calendar, and the major task tools (Asana, ClickUp, Todoist, Linear), plus manual upload (PDF, Word, Excel, PowerPoint, Markdown) and URLs. Where a provider supports it — Slack, Gmail, Drive, and the task tools — ingestion is push/webhook in near-real-time, backstopped by a scheduled poll; the rest sync on a schedule. Call transcripts enter today via upload or the debrief agent rather than a recording integration (a native recorder is on the roadmap, not shipped).
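When a webhook and a scheduled poll both watch the same source, the same item can arrive twice. A minimal sketch of one way to keep ingestion idempotent, assuming each item carries a stable source id and a version number (the `IngestLog` helper and its fields are hypothetical, not Aether's actual implementation):

```python
from dataclasses import dataclass, field


@dataclass
class IngestLog:
    """Tracks which source items were already ingested (hypothetical helper)."""

    # Maps (origin, source_id) to the latest version seen for that item.
    seen: dict[tuple[str, str], int] = field(default_factory=dict)

    def accept(self, origin: str, source_id: str, version: int) -> bool:
        """Return True if the item is new or newer than what is already held."""
        key = (origin, source_id)
        if self.seen.get(key, -1) >= version:
            return False  # duplicate or older delivery: skip it
        self.seen[key] = version
        return True


log = IngestLog()
webhook_first = log.accept("slack", "msg-101", version=1)  # push delivery arrives
poll_repeat = log.accept("slack", "msg-101", version=1)    # poll sees the same item
poll_edit = log.accept("slack", "msg-101", version=2)      # poll sees a later edit
```

With this shape the poll can run as often as needed: repeats are dropped, and only new or edited items reach the corpus.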

Every source, whatever its shape, normalizes into one document corpus tagged by origin. That normalization is unglamorous and it's the foundation: it's what lets a single question draw on an email, a Slack decision, and a deck at once.
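To make the normalization step concrete, here is a small sketch that maps two differently shaped payloads onto one origin-tagged record. The `Document` fields and the simplified payload keys are assumptions for illustration, not Aether's schema or any provider's full API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Document:
    """One normalized corpus record (hypothetical shape)."""

    origin: str          # where the content came from, e.g. "slack" or "email"
    source_id: str       # the id the source system uses for this item
    author: str
    created_at: datetime
    text: str


def from_slack(event: dict) -> Document:
    # Simplified, hypothetical payload: a timestamp string, a user, and text.
    return Document(
        origin="slack",
        source_id=event["ts"],
        author=event["user"],
        created_at=datetime.fromtimestamp(float(event["ts"]), tz=timezone.utc),
        text=event["text"],
    )


def from_email(msg: dict) -> Document:
    # Simplified, hypothetical payload with an ISO 8601 send time.
    return Document(
        origin="email",
        source_id=msg["message_id"],
        author=msg["from"],
        created_at=datetime.fromisoformat(msg["sent_at"]),
        text=f'{msg["subject"]}\n\n{msg["body"]}',
    )


corpus = [
    from_slack({"ts": "1700000000.000100", "user": "lw",
                "text": "Agreed: fixed fee for phase 2."}),
    from_email({"message_id": "m-42", "from": "dk@example.com",
                "sent_at": "2023-11-15T09:30:00+00:00",
                "subject": "Phase 2 pricing", "body": "Confirming the fixed fee."}),
]
origins = sorted(doc.origin for doc in corpus)
```

Once every source lands in the same record type, everything downstream (indexing, curation, retrieval) only has to understand one shape.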

Capture in the product: every connected source converging into one normalized, origin-tagged corpus.

2. Formalize: a consulting data model, not a document pile

This is the first real divergence from a generic RAG project. A pile of embedded documents has no idea what a client is, or an engagement, or a decision. Aether's data model does: firm → clients → engagements → contacts → decisions, with documents and their extracted signal hung off that spine.
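A rough sketch of what such a spine could look like in code. The class names, fields, and exact nesting are assumptions made for illustration (here contacts and decisions both hang off an engagement); the paper does not publish Aether's actual schema:

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    summary: str
    rationale: str
    status: str = "held"  # assumed values: "held", "reversed", "drifted"


@dataclass
class Contact:
    name: str
    role: str = "unknown"


@dataclass
class Engagement:
    name: str
    contacts: list[Contact] = field(default_factory=list)
    decisions: list[Decision] = field(default_factory=list)


@dataclass
class Client:
    name: str
    engagements: list[Engagement] = field(default_factory=list)


@dataclass
class Firm:
    name: str
    clients: list[Client] = field(default_factory=list)


def decisions_for(firm: Firm, client_name: str) -> list[Decision]:
    """Collect every decision recorded on any engagement for one client."""
    return [
        decision
        for client in firm.clients if client.name == client_name
        for engagement in client.engagements
        for decision in engagement.decisions
    ]


firm = Firm("Example Partners", clients=[
    Client("Acme", engagements=[
        Engagement(
            "Pricing review",
            contacts=[Contact("DK", role="sponsor")],
            decisions=[Decision("Move to fixed fee", "Scope is stable")],
        ),
    ]),
])
acme_decisions = decisions_for(firm, "Acme")
```

The point of the typed spine is that "all decisions for Acme" is a traversal, not a text search.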

Turning that raw material into structure is an active, model-driven loop, not a passive store:

  • Entity extraction pulls the people out of the corpus — client-side contacts and their mentions — so “who said this, about whom” is queryable, not buried in prose.
  • Stakeholder typing classifies each contact's role on an engagement — champion, sponsor, decision-maker, influencer, detractor, new arrival — with an influence score. Roles are AI-inferred with an explicit confidence level and a sticky manual override; low-signal contacts are left unknown rather than guessed at.
  • Decision capture records decisions with their rationale and, on a schedule, re-checks whether each one held, reversed, or drifted.

The point: the firm's memory becomes structured — you can ask about an engagement, a stakeholder, or a decision as first-class things, because the system has formalized them as first-class things.
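The stakeholder-typing rules described above (an inferred role with a confidence level, a sticky manual override, and "unknown" for low-signal contacts) can be sketched as a small resolution function. The threshold value and the names `RoleSignal` and `effective_role` are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

ROLES = {"champion", "sponsor", "decision-maker",
         "influencer", "detractor", "new-arrival"}
MIN_CONFIDENCE = 0.6  # hypothetical cutoff; a real system would tune this


@dataclass
class RoleSignal:
    inferred_role: str
    confidence: float                      # 0.0 to 1.0, reported with the inference
    manual_override: Optional[str] = None  # set by a person; always wins


def effective_role(signal: RoleSignal) -> str:
    """Resolve the role to show for a contact on an engagement."""
    if signal.manual_override is not None:
        return signal.manual_override  # sticky: never replaced by inference
    if signal.inferred_role in ROLES and signal.confidence >= MIN_CONFIDENCE:
        return signal.inferred_role
    return "unknown"  # low signal: leave it unknown rather than guess


confident = effective_role(RoleSignal("champion", 0.9))
low_signal = effective_role(RoleSignal("sponsor", 0.3))
overridden = effective_role(RoleSignal("influencer", 0.2, manual_override="detractor"))
```

Keeping the override separate from the inferred value means a later re-run of the model can update its guess without undoing a human correction.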

Stakeholders · Acme. Alert shown: champion silence > 21 days. Roles shown: Champion, Sponsor, Decision-Maker, Influencer, Detractor, New-Arrival.

Stakeholder typing in the product: each contact classified by role and influence, with low-signal people left 'unknown' rather than guessed.

3. Retrieve without rot: the layer that separates memory from a junk drawer

Retrieval is where the weekend project ages badly. A firm accumulates years of overlapping, contradicting, superseded material. Naive nearest-neighbor search over that surfaces the loudest match, not the current truth — and gets worse, not better, as you add history. The whole promise of context — that it compounds — inverts.

Aether's retrieval is built to get better with volume, not worse:

  • Semantic search over embeddings from a managed model (such as Voyage), stored in Postgres through a vector extension (such as pgvector) and served by an HNSW index using cosine distance. Documents are chunked along their structure, so a retrieved passage is a coherent unit, not a fragment.
  • A curation layer sits on top of every document: a retrieval weight that down-ranks stale or low-value material at query time, a staleness signal, supersedes links (this proposal replaced that one), and contradiction flags (this Slack message conflicts with that doc). Retrieval score = similarity × curation weight, so superseded and stale context fades instead of resurfacing.
  • Decisions get their own index, so “have we decided something like this before?” is a first-class query.
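The scoring rule in the list above (retrieval score = similarity × curation weight) can be illustrated with a toy ranking. The two-dimensional vectors and the penalty multipliers are invented for the example; only the multiplication itself comes from the text:

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    embedding: list[float]
    stale: bool = False
    superseded: bool = False
    contradicted: bool = False


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def curation_weight(doc: Candidate) -> float:
    """Hypothetical multipliers: each flag shrinks the document's weight."""
    weight = 1.0
    if doc.stale:
        weight *= 0.5
    if doc.superseded:
        weight *= 0.2
    if doc.contradicted:
        weight *= 0.3
    return weight


def rank(query: list[float], docs: list[Candidate]) -> list[str]:
    """Order titles by similarity multiplied by curation weight, best first."""
    scored = [(cosine(query, d.embedding) * curation_weight(d), d.title) for d in docs]
    return [title for _, title in sorted(scored, reverse=True)]


query = [1.0, 0.0]
docs = [
    Candidate("Old proposal", [1.0, 0.0], superseded=True),
    Candidate("Current pricing memo", [0.8, 0.6]),
    Candidate("2023 rate sheet", [0.9, 0.1], stale=True),
]
naive = [d.title for d in sorted(docs, key=lambda d: cosine(query, d.embedding), reverse=True)]
curated = rank(query, docs)
```

In this toy data the superseded proposal is the closest semantic match, so naive ranking puts it first; once the curation weight is applied, the current memo leads and the superseded proposal drops to last.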

Figure 2: Retrieval that improves with history

Items shown against the retrieval cutoff: Current pricing memo · Live SOW v3 · Old proposal (superseded) · 2023 rate sheet (stale) · Contradicted Slack note (contradicted)

Naive retrieval surfaces the loudest match; curated retrieval surfaces the current truth — the same query, with the junk drawer suppressed.

On top of the structured record, Aether computes a knowledge graph: clients, documents, firm members, and contacts as nodes; relationships like worked-on, similar-to, supersedes, contradicts, and mentioned-in as edges. It's a live projection over the structured tables, a reasoning and visualization layer over the record rather than a separate graph database. It's what turns “find me documents” into “show me how this client, these people, and these decisions actually connect.”
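One way to picture a graph that is a projection rather than a separate store: edges are derived on demand from relational rows. The sketch below uses in-memory lists as stand-ins for tables; the row contents and function names are hypothetical:

```python
from collections import defaultdict

# Hypothetical rows, standing in for structured tables.
worked_on = [("alice", "acme"), ("bob", "acme")]   # (firm member, client)
supersedes = [("sow-v3", "sow-v2")]                # (newer doc, older doc)
mentioned_in = [("dk", "sow-v3")]                  # (contact, document)


def project_edges() -> list[tuple[str, str, str]]:
    """Derive (source, relation, target) edges from the rows at call time."""
    edges = [(member, "worked-on", client) for member, client in worked_on]
    edges += [(new, "supersedes", old) for new, old in supersedes]
    edges += [(contact, "mentioned-in", doc) for contact, doc in mentioned_in]
    return edges


def adjacency(edges: list[tuple[str, str, str]]) -> dict[str, set[str]]:
    """Build an undirected neighbor map for traversal or visualization."""
    neighbors: dict[str, set[str]] = defaultdict(set)
    for source, _relation, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return neighbors


edges = project_edges()
graph = adjacency(edges)
```

Because the edges are recomputed from the rows, there is no second copy of the data to keep in sync: a new supersedes row shows up in the graph the next time it is projected.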

This curation layer is the unsexy core of the moat. It's the difference between a system that remembers more usefully every month and one that just remembers more.

Knowledge Graph

Every conversation, every decision, connected.

Aether builds a living graph of every client engagement: Slack threads, emails, Drive files, the people who touched them, and the relationships between them all.

Node types shown: Clients, Team, Contacts, Slack, Email, Drive, Transcripts.

4. Consume: agents that retrieve, reason, and cite

Context only matters if something uses it well. Aether's agents — recall, meeting prep, debrief, margin-leak detection, win/loss, scope-creep, status, and the rest — share one pattern: retrieve the relevant, current context → assemble it with provenance → reason over it → return an answer with citations.

The Recall agent answering a client question — every claim traced back to the Slack message, email, or document it came from.

Two details matter for trust. First, retrieved documents carry their headers (source, date, author, curation weight) into the model's context, and the agent's citations are validated against the corpus it was actually given — so an answer points at real, traceable sources rather than inventing them. Second, a firm-level context block — your methodology, rate cards, pricing norms, competitors — is injected into every agent, so outputs reason in your firm's terms, not generic best practice. Reasoning runs on a frontier model (such as Claude), with lighter tasks routed to a cheaper model to keep it economical at firm scale.
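The first of those trust details can be sketched in a few lines: render each retrieved document with its header, then check the answer's citations against the ids that were actually supplied. The bracketed-id citation format, the `Retrieved` fields, and the function names are assumptions for illustration:

```python
import re
from dataclasses import dataclass


@dataclass
class Retrieved:
    doc_id: str
    source: str
    date: str
    author: str
    curation_weight: float
    text: str


def render_context(docs: list[Retrieved]) -> str:
    """Prefix each passage with a header so provenance travels into the prompt."""
    blocks = []
    for d in docs:
        header = (f"[{d.doc_id}] source={d.source} date={d.date} "
                  f"author={d.author} weight={d.curation_weight:.2f}")
        blocks.append(f"{header}\n{d.text}")
    return "\n\n".join(blocks)


CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")  # assumed citation syntax: [doc-id]


def validate_citations(answer: str, docs: list[Retrieved]) -> tuple[list[str], list[str]]:
    """Split cited ids into those that were supplied and those that were not."""
    supplied = {d.doc_id for d in docs}
    cited = CITATION.findall(answer)
    valid = [c for c in cited if c in supplied]
    invalid = [c for c in cited if c not in supplied]
    return valid, invalid


docs = [
    Retrieved("slack-17", "slack", "2024-03-02", "LW", 1.0, "Agreed: fixed fee for phase 2."),
    Retrieved("email-4", "email", "2024-03-03", "DK", 0.9, "Confirming the fixed fee."),
]
context = render_context(docs)
answer = "Phase 2 moved to a fixed fee [slack-17], later confirmed [email-4], per [deck-9]."
valid, invalid = validate_citations(answer, docs)
```

Here `deck-9` was never part of the supplied context, so it lands in `invalid`; a caller could strip that claim, retry, or flag the answer.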

#pm-acme · @aether-margin-leak · just now

Acme: $0 of unscoped work this quarter.

Top driver: “just one more dashboard” (3 instances).

Same pattern, different agent: margin-leak detection reads the same owned context and flags an at-risk engagement with the driver behind it.

5. Own: your context, your model key, your firm's walls

The essay's hardest claim is that the moat only holds if you own the context rather than rent it inside someone else's silo. Architecturally, that means three things, stated honestly:

  • Bring your own model key. A firm can run Aether on its own frontier-model key (such as its own Anthropic key), stored encrypted (AES-256-GCM) and used only server-side. Your reasoning runs through your account, your terms. (Today this covers the language model; the embedding step still runs through Aether's provider — full bring-your-own across the stack is roadmap.)
  • Per-firm isolation at the database layer. Every record is scoped to a firm, and Postgres row-level security enforces that boundary on the application path — a firm sees only its own data, enforced by the database, not just by app code. (Background jobs run with elevated access and enforce the same firm boundary in code; isolation is RLS-backed on the user path and code-enforced on the service path.)
  • No training on your data. Aether's model providers don't train on data sent through their APIs — so your client context isn't feeding anyone's model. This is a vendor and policy guarantee, the same one that lets regulated firms use these APIs at all.
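The code-enforced half of the isolation story (the service path, where background jobs run with elevated access) can be sketched as a query helper that refuses to run without a firm id. This uses an in-memory SQLite database purely as a stand-in so the example runs anywhere; the table and function names are hypothetical, and Postgres row-level security itself is not shown:

```python
import sqlite3

# In-memory SQLite stands in for the real database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, firm_id TEXT NOT NULL, title TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO documents (firm_id, title) VALUES (?, ?)",
    [
        ("firm-a", "Acme SOW v3"),
        ("firm-a", "Acme pricing memo"),
        ("firm-b", "Globex proposal"),
    ],
)


def documents_for_firm(db: sqlite3.Connection, firm_id: str) -> list[str]:
    """Every service-path read goes through a helper that demands a firm id."""
    if not firm_id:
        raise ValueError("firm_id is required for every service-path query")
    rows = db.execute(
        "SELECT title FROM documents WHERE firm_id = ? ORDER BY id",
        (firm_id,),
    )
    return [title for (title,) in rows]


firm_a_titles = documents_for_firm(conn, "firm-a")
firm_b_titles = documents_for_firm(conn, "firm-b")
```

Funneling all background-job reads through helpers like this keeps the firm boundary in one reviewable place, which matters precisely because the database is not enforcing it on that path.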

The honest frontier: today your context lives in your firm's isolated partition and runs on your own model key, but a one-click export / full portability of the context layer is on the roadmap, not shipped. We'd rather tell you that than imply otherwise — because the firms that will care about this paper are exactly the ones who'd check.

Why this is hard to replicate (and why it compounds)

Step back and the weekend-project gap is obvious. Embeddings are a commodity; a vector search is an afternoon. What isn't an afternoon: a consulting-shaped data model that knows clients from engagements from decisions; an extraction loop that keeps it populated; a curation layer that makes retrieval improve with accumulated history instead of degrading; a typed view of who matters on every engagement; multi-source ingestion that actually stays in sync; and an agent layer that reasons over all of it with validated citations and your firm's own rules.

None of that is exotic individually. Together, maintained, it's a system — and it's the system that turns a year of your firm's work into an asset that's worth more every month and that no competitor and no model vendor can hand to someone else. That's the moat, built. The question the essay leaves you with isn't whether to believe context is the advantage. It's whether you start compounding yours now, or explain later why the firm down the street did.

Figure 3: Why it compounds

Horizontal axis: more years operating. Curves: “Context captured & owned” and “Context evaporates”.

The asset compounds — for whoever starts first. Two firms, identical talent: one captures and owns its context, the other lets it leak into dead threads and departing people.

FAQ

Isn't this just RAG? I could build it in a weekend.
The retrieval is; the rest isn't. A weekend gets you embeddings and nearest-neighbor search. It doesn't get you a consulting data model, an extraction loop that keeps it populated, a curation layer that stops retrieval rotting as history grows, typed stakeholders, multi-source sync that stays live, and an agent layer that cites its sources. Maintained, that's a system, and it's the maintenance, not the demo, that defeats most internal builds.
How is this different from Copilot, Gemini, or Notion AI?
Those are horizontal, single-organization, document-centric assistants. They have no model of a consulting engagement, no typed client / stakeholder / decision memory, and no notion of owned, cross-client firm context. They make your org's documents searchable; they don't build your firm a compounding client-memory asset.
Where does my client data live, and is it isolated?
Each firm's data is scoped and isolated at the database layer — row-level security on the user path, firm-scoped in code on background jobs. You're not in a shared pool other firms can reach.
My clients are wary of AI. Can I reassure them?
Yes. Your context runs on your own model key, isolated to your firm, and the model providers don't train on data sent through their APIs. The boundary is something you can show a client's security team.
Do we have to change how the team works?
No. It ingests from the tools you already use — Slack, email, Drive/OneDrive, calendar, your task tracker — so context accrues from normal work, not from a discipline nobody keeps.
What if we want to leave and take our context?
Honest answer: today your data is isolated to your firm and runs on your key, but a one-click export of the full context layer is on the roadmap, not shipped. We'd rather say so than imply otherwise.
Which sources does it connect to today?
Slack, Gmail, Google Drive, Outlook, OneDrive, Google/Outlook Calendar, and the major task tools (Asana, ClickUp, Todoist, Linear), plus document upload and URLs. Call transcripts come in via upload or the debrief step today; a native recorder is on the roadmap.

Aether is the owned context layer for boutique consulting firms.

Everything above is how it's built. The fastest way to see it is to put your own questions to it.

Prefer to read first? Start with the essay this paper accompanies, Context Is the Moat, free and ungated, with the weekly operator's brief if you want it.