Step 1, in the open.
Today, the first step of every campaign runs on one engineer's laptop. We want to put it where the GTM team can see it, run it, and trust it, and where the rest of the product can learn from it.
The proposal: a small operator page on infrastructure we already pay for, mirroring the shape Josh's scheduler uses, so Step 1 feeds the loop instead of inventing a parallel one.
What Step 1 is, in plain language
Before AICRO can write a cold email for a client, it has to know what the client sounds like, who they sell to, and what their proof points are. Step 1 is the act of writing that down.
The output is a short document, called a "brief," that every downstream agent reads on every email it composes. Voice rules. Disqualifiers. Proof points. The lines the writer can use. The lines that are off-limits.
If the brief is good, every email gets better. If the brief is stale or shallow, every email drifts.
One brief per client. About 4,000 prospects per active campaign. The brief is loaded into the writer's prompt on every single composition. One Step 1 refresh therefore affects thousands of outbound messages.
The Step 1 brief is one piece of the interactive playbook we are building per client. Voice, tone, ICP, signals, proof points, banned phrases, sender selection rules, the lessons we have learned campaign by campaign, all of it stored together in Supabase as one connected source of truth. Every downstream agent (writer, critic, scheduler, reply agent) reads from the same playbook, so when a tone rule changes here, the whole system shifts with it. Step 1 is where the playbook starts.
What Step 2 is, and why the scheduler covers it
Once we know what a client sounds like, the next question is which specific campaign to run for them this week. Step 2 is the act of answering that: turning the client's brief plus current market signals into one campaign-specific brief, with the targeting, the angle, the proof points, and the segments to write for.
Josh's scheduler product is the propose-then-promote surface for Step 2. A nightly cron scores per-client opportunities across nine factors (dormant pool, quarter-end promo, hiring signals, conference windows, and others). Operators see the high-scoring proposals at internal.aicro.co/gtme, click Accept on one, and the scheduler creates the Airtable campaign-creation task that kicks off the actual brief-writing work.
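A minimal sketch of what a weighted multi-factor score can look like, to make the scoring step concrete. It is not the scheduler's actual code. The type and function names, the example weights and scores, and the `hiring_signals` identifier are all assumptions for illustration; only `dormant_lead_pool` and `quarter_end_promotion` are factor names taken from this document, and the rule that weights sum to 1.00 is our reading of "weighted 1.00".

```typescript
// Hypothetical shape for one factor's contribution to a proposal score.
type FactorScore = {
  factor: string; // e.g. "dormant_lead_pool"
  weight: number; // share of the total; all weights are assumed to sum to 1.00
  score: number;  // normalized opportunity score in [0, 1]
};

// Combine per-factor scores into one weighted score for a client.
function weightedScore(factors: FactorScore[]): number {
  const totalWeight = factors.reduce((sum, f) => sum + f.weight, 0);
  if (Math.abs(totalWeight - 1) > 1e-9) {
    throw new Error(`weights must sum to 1.00, got ${totalWeight}`);
  }
  return factors.reduce((sum, f) => sum + f.weight * f.score, 0);
}

// Illustrative input with three factors instead of nine, for brevity.
const exampleFactors: FactorScore[] = [
  { factor: "dormant_lead_pool", weight: 0.5, score: 0.8 },
  { factor: "quarter_end_promotion", weight: 0.3, score: 0.5 },
  { factor: "hiring_signals", weight: 0.2, score: 0.0 }, // hypothetical identifier
];
const exampleScore = weightedScore(exampleFactors);
```

Rejecting weights that do not sum to 1.00 keeps scores comparable across clients, so a single "high-scoring" threshold can mean the same thing on every proposal card.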
What still needs operator-UX work sits downstream of Accept: the brief writing itself, mode selection (signal-led / segment-led / ICP-only), the multi-output dispatch to RevBase, Clay, and grading prompts, and the approval gate before downstream steps consume the brief. Those will get their own deep-dive write-up after Steps 3 and 4.
The Step 1 brief is one of the inputs the scheduler reads when scoring Step 2 proposals. When a client's voice or ICP changes in their playbook, the scheduler's per-client proposals shift accordingly on the next cron. The two steps are two halves of the same continuous loop: Step 1 keeps the playbook current, Step 2 turns it into a specific campaign to run this week. Treating them together makes that loop visible.
Today vs. Tomorrow
Same agent. Same data. Same brief output. What changes is who can run it, and how much of the work is visible.
Today: refresh runs in Sarah IDE
- In practice, only one engineer refreshes briefs today
- No staleness signal anywhere in the UI
- No feedback flow from replies, deals, or market shifts
- Cannot ship to an external customer
Tomorrow: refresh runs on /clients
- Any GTM teammate can refresh, no engineer needed
- Freshness, version, and doctrine checks visible inline
- Schema ready to accept feedback signals from the loop
- Same architecture customers can use directly
The new path uses the same primitives Steps 4A, 5, and 6 already run on. The work to bring Step 1 home is moving the agent's runner, not rewriting the agent.
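The freshness signal could be as simple as a threshold on the last-refresh date. The sketch below assumes a hypothetical `BriefMeta` shape and illustrative 30-day and 60-day thresholds; neither comes from the current codebase.

```typescript
// Hypothetical metadata the /clients page would show per brief.
type BriefMeta = { client: string; version: number; lastRefreshedAt: Date };
type Freshness = "fresh" | "aging" | "stale";

const DAY_MS = 24 * 60 * 60 * 1000;

// Classify a brief by age. Thresholds are illustrative defaults, not agreed values.
function freshness(
  brief: BriefMeta,
  now: Date,
  agingAfterDays: number = 30,
  staleAfterDays: number = 60,
): Freshness {
  const ageDays = (now.getTime() - brief.lastRefreshedAt.getTime()) / DAY_MS;
  if (ageDays >= staleAfterDays) return "stale";
  if (ageDays >= agingAfterDays) return "aging";
  return "fresh";
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.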
Why this matters strategically
Three forces are converging this year, and a fourth point below covers how we act on them. Each force points at the same conclusion: vertical agents that show their work and learn from their data are pulling away from horizontal AI tools.
1. The moat is the loop, not the model
Every credible 2026 analysis says the same thing. Closed-loop products that get smarter per customer beat horizontal AI tools, full stop. We have the data. We just have to close the loop.
2. Vertical depth beat horizontal width
Anthropic won 70% of enterprise picks in 2026. The winners were vertical. The losers were "AI for everyone." AICRO's doctrine is exactly the kind of vertical depth that compounds.
3. Doctrine is the product, not a feature
Copy.ai's "Brand Voice" was their first-class feature, not a polish item. We have 50x their doctrine. Hiding it behind a generic UI leaves the moat invisible.
4. Internal-first, externally-ready
We use ourselves as the test customer for 4-8 weeks. Validate the loop on AICRO clients. Productize once the pattern earns its keep on real data.
What "doctrine visible" looks like
The brief already has machine-readable validators. Voice and Identity checks. Em-dash count. Service-clarity flag. The §38b hard-fail. None of these are surfaced anywhere a GTM operator can see them.
This is the single highest-leverage UX change in the proposed PR. The cost is about an hour of work. The payoff is real and immediate: it telegraphs the positioning that wins in 2026.
A CSM seeing this page understands whether the brief is healthy, which rules are enforced, and where attention belongs next. A future customer seeing this page understands that AICRO is auditable. That is the trust mechanism for an opinionated AI tool.
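To make "machine-readable validators surfaced inline" concrete, here is a minimal sketch of one check result shape and a roll-up status. The type names, the `em_dash_count` identifier, the zero-em-dash limit, and the three status levels are assumptions for illustration. The real validators live in the doctrine library and are not reproduced here.

```typescript
// Hypothetical result shape the page could render as one row per check.
type CheckStatus = "pass" | "warn" | "hard_fail";
type DoctrineCheck = { name: string; status: CheckStatus; detail: string };

// Count em-dash characters (U+2014) in the brief text.
function countEmDashes(text: string): number {
  return (text.match(/\u2014/g) ?? []).length;
}

// Illustrative em-dash check; the allowed maximum is an assumed parameter.
function emDashCheck(briefText: string, maxAllowed: number = 0): DoctrineCheck {
  const count = countEmDashes(briefText);
  return {
    name: "em_dash_count",
    status: count <= maxAllowed ? "pass" : "warn",
    detail: `${count} em-dash(es) found, limit ${maxAllowed}`,
  };
}

// Roll individual checks up into one badge: any hard fail wins, then any warning.
function overallStatus(checks: DoctrineCheck[]): CheckStatus {
  if (checks.some((c) => c.status === "hard_fail")) return "hard_fail";
  if (checks.some((c) => c.status === "warn")) return "warn";
  return "pass";
}
```

A single roll-up badge plus the per-check detail strings is enough for a CSM to see both the overall health and which rule needs attention.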
How this connects to work already in flight
This is not new infrastructure. It is the missing surface on top of three things already shipping or shipped. Each piece below is real, owned, and either live or in progress.
Operator UI on /clients
The new surface. Reads briefs from Supabase, regenerates them via a server-side Trigger.dev task, and shows doctrine checks inline. The hub that connects to everything below.
Scheduler proposals (the eventual Step 2)
9-factor weighted scoring per client. Cron generates per-client proposal cards at internal.aicro.co/gtme. Operator clicks Accept, which fires POST /scheduler/accept and creates the Airtable Campaign Creation Request. Architecture is shipped; not yet in active daily use.
INTERNAL TESTING NEXT WEEK | propose-then-promote
Client Portal
Inbox, reply agent, follow-up automation, customer-facing surfaces. Reply agent and follow-up automations are already live. The eventual external read-view of the brief would live here. Tenant model and auth contract come from the portal.
PARTLY LIVE | BRIEF VIEW NOT YET BUILT
Closed-loop scheduler factors
dormant_lead_pool, quarter_end_promotion, plus seven others. Each reads real data from Supabase or Airtable and scores per-client campaign opportunities. The pattern Step 1 brief refresh should follow.
SHIPPED · AWAITING VALIDATION | 9 factors, weighted 1.00
Inbound message pipeline
Replies classified, warmth-scored, and stored in inbound_messages. The training signal that should feed Step 1 brief refresh proposals once the loop wires through.
LIVE | signal source
The implication: Step 1 should not invent a parallel propose-then-promote architecture. It should write proposed brief updates into the same Supabase tables the scheduler reads from, and surface them through the same operator-facing patterns Josh already built. One pattern, not two.
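As a rough sketch of what propose-then-promote could mean for briefs, the in-memory model below treats a brief update as a proposal that only becomes the live brief when an operator promotes it. Every type, field, and function name here is hypothetical, and the real version would read and write Supabase rows rather than plain objects.

```typescript
type ProposalStatus = "proposed" | "promoted" | "rejected";

// Hypothetical shapes; the actual table columns are not defined in this document.
type Brief = { client: string; version: number; text: string };
type BriefProposal = {
  client: string;
  baseVersion: number; // brief version the proposal was drafted against
  proposedText: string;
  source: "operator_refresh" | "inbound_messages";
  status: ProposalStatus;
};

// Promote a proposal: returns the next brief version and the updated proposal.
// Nothing is mutated, so the caller decides when to persist both records.
function promote(
  current: Brief,
  proposal: BriefProposal,
): { brief: Brief; proposal: BriefProposal } {
  if (proposal.client !== current.client) {
    throw new Error("proposal belongs to a different client");
  }
  if (proposal.status !== "proposed") {
    throw new Error(`cannot promote a proposal in status "${proposal.status}"`);
  }
  if (proposal.baseVersion !== current.version) {
    throw new Error("proposal is stale: the brief changed since it was drafted");
  }
  return {
    brief: { client: current.client, version: current.version + 1, text: proposal.proposedText },
    proposal: { ...proposal, status: "promoted" },
  };
}
```

The `baseVersion` guard is one way to stop an old proposal from overwriting a brief that was refreshed in the meantime; whether the scheduler's tables already carry an equivalent field is an open question.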
Anticipated concerns
Three reasonable pushbacks. Best addressed before they show up in a review thread.
"Doesn't moving Step 1 server-side make every refresh more expensive?"
No. The agent runs the same LLM calls today, just from a developer's laptop. Moving it to a Trigger.dev task changes where the call originates, not how many calls happen. A refresh fires when an operator clicks Refresh, not on every email composition. The actual cost trade is in the other direction: less engineer time per refresh, no idle Railway code-server compute, and per-call cost telemetry we can surface to operators.
"Aren't we just rebuilding Sarah IDE in a web wrapper?"
No. Sarah IDE is a developer environment for building and debugging agents. The proposed page is an operator surface for running them. Different audience, different gestures, different success criteria. Sarah IDE survives unchanged for the people building the agents. The operator UI replaces the launch path for the people running them.
"Doesn't this conflict with Josh's scheduler work?"
It complements it. The scheduler proposes which campaigns to run for a client. Step 1 produces the brief those campaigns write from. Two layers, complementary surfaces. The decision in this document is that Step 1's eventual feedback signals route through the same /gtme proposals UI Josh already built, rather than spawning a parallel one.
What to ship
Four ordered pieces. Two PRs and a validation run. The work itself is doable today; sequencing is the only constraint.
- step1Pipeline.ts. Same primitives Steps 4A, 5, 6 already use. Old Sarah-IDE path stays available in parallel for comparison runs on real clients.
- workspace_id column added on the relevant tables, defaulted to a single internal workspace.
What we are explicitly not changing
- The doctrine library stays in aicro-os. The brain that knows what a brief should look like remains where the team edits it. We are wrapping it with a runner, not rewriting it.
- The campaign-builder agent stays opinionated. We are not adding a workflow builder. Customers buy AICRO because we have the opinion, not because we make them write one.
- Sarah IDE survives for the people building the agents. It is a developer tool and it is good at being one. We are just removing it as the operator-facing launch path.
- The scheduler product stays Josh's. Step 1's propose-then-promote feed routes through Josh's existing /gtme surface, not a parallel UI.
Interim plan, until the scheduler lands
The scheduler is expected in internal testing next week. Until it is, the bridge state below is meaningfully better than the Sarah IDE path, while staying small enough to throw away when the full architecture takes over.
- Ship the /clients page read-only first. No refresh button yet. Surfaces every client's current brief, version, last-refresh date, and doctrine checks. CSMs immediately get the visibility they don't have today, even without the ability to trigger a refresh.
- Wrap Step 1 in a one-command refresh script. While the Trigger.dev migration is being built, expose npm run step1-refresh -- --client capitalize on the engineering machine so anyone with repo access can refresh without opening Sarah IDE. Removes the IDE friction without waiting for the full server-side migration.
- Schema additions land non-breaking. workspace_id columns nullable on day one, defaulted before any code reads them.
- Flip the refresh button on per-client as comparison runs come back clean. No big-bang cutover. CSMs gain self-service one client at a time.
Once the scheduler reaches active internal use next week, this interim collapses into the full operator UI naturally. No throwaway work that can't be repurposed.
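One way to read "nullable on day one, defaulted before any code reads them" from the application side is sketched below. The `INTERNAL_WORKSPACE_ID` value, the `BriefRow` shape, and both helper names are assumptions for illustration; only the `workspace_id` column name comes from this plan.

```typescript
// Hypothetical identifier for the single internal workspace.
const INTERNAL_WORKSPACE_ID = "internal";

// Rows written before the backfill may still carry a null workspace_id.
type BriefRow = { client: string; workspace_id: string | null };

// Treat a missing workspace as the internal one, so old rows keep working.
function resolveWorkspace(row: BriefRow): string {
  return row.workspace_id ?? INTERNAL_WORKSPACE_ID;
}

// Scope a result set to one workspace, tolerating rows not yet backfilled.
function rowsForWorkspace(rows: BriefRow[], workspaceId: string): BriefRow[] {
  return rows.filter((row) => resolveWorkspace(row) === workspaceId);
}
```

Once every row is backfilled, the null branch becomes dead code and the column can be tightened to non-null without touching callers.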
Open questions for sign-off
Each question is written for the person it belongs to. Yes-or-no answers move us forward; longer answers get a Slack thread.
Does our codebase already have a way to keep one customer's data separate from another's?
If yes, we plug into it. If not, we'll add a small column to track which customer a row belongs to. Either way the answer is fast and unblocks PR 1.
When the brief and the rest of the playbook eventually need to live alongside customer-portal data, can we agree on the shared customer-account shape now?
aicro-cortex, the playbook in Supabase, and the client portal all need to agree on what a "customer account" means. If we lock the shape now, neither side has to refactor later. A 30-minute call covers it.
Should Step 1's brief-update suggestions show up on the same /gtme page your scheduler uses?
One page for "the system is proposing something" feels cleaner than two. Curious if you see a reason to keep them separate.
Are you okay being the main person who uses this page each week?
The proposed flow is: open /clients, see which briefs are stale, click Refresh on the one you care about, look at what changed, promote it. Roughly 5 minutes per client.
Do we open a draft pull request for the team to review the code, or socialize this document first and get verbal sign-off?
Either path works. The document-first path tends to be cleaner when other teammates have adjacent work in flight, which is the case here.