Reply Loop.
The closed-loop product feature that turns reply data into sharper signal scoring. Sequencer webhooks ingest replies, an LLM classifier sorts them into 4 tiers, the positive band feeds archetype matching, and tomorrow's brief is sharper than today's. The operating mechanism behind why every Mama customer's predictions tighten month-over-month — and the surface that makes the defensibility argument operational, not theoretical.
01 What Reply Loop actually does
Most outbound tools care about replies at the moment of the reply — the SDR's inbox fills, the sequencer pauses the cadence, the rep classifies the reply and decides next steps. The signal lives for about 24 hours, then disappears into a CRM activity log nobody re-reads.
Reply Loop treats every reply as training data for the next prediction. The reply doesn't just inform what the rep does today; it sharpens what Mama recommends to every rep tomorrow.
The precise mechanism:
- An SDR works a brief Mama surfaced (because the account hit ICP threshold and matched archetype A-014).
- The SDR sends a sequence through their existing sequencer (Outreach, Salesloft, Smartlead, etc.).
- Days later, the prospect replies. The sequencer fires a webhook to Mama: reply received, sequence-id X, prospect Y, content Z.
- Mama's LLM classifier scores the reply into one of 4 tiers (covered in §4).
- The engaged + not-now classifications are tagged with the original brief's archetype, signal-mix, persona, and template choice.
- Overnight, the archetype matcher updates: archetype A-014's centroid pulls slightly toward this account's feature vector; its historical reply rate updates from 32% to 32.1%; its confidence interval tightens by a fraction of a point.
- Tomorrow's brief queue is generated against the updated archetype library. New accounts that match archetype A-014 get the updated predicted reply rate and the updated template recommendation.
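The overnight update step can be sketched in a few lines. This is an illustration only, not Mama's actual matcher: the `Archetype` fields, the incremental-mean centroid update, and the starting counts (217 positive replies on 678 sends, chosen so that one more positive reply moves the rate from roughly 32% to 32.1%) are all assumptions. The confidence-interval step is left out.

```python
from dataclasses import dataclass


@dataclass
class Archetype:
    # All field names are hypothetical, chosen for this sketch.
    archetype_id: str
    centroid: list[float]   # mean feature vector of positively replying accounts
    accounts: int           # number of accounts behind the centroid
    sends: int              # briefs worked against this archetype
    positive_replies: int   # replies in the positive band (engaged + not-now)

    @property
    def reply_rate(self) -> float:
        return self.positive_replies / self.sends if self.sends else 0.0


def apply_outcome(arch: Archetype, features: list[float], positive: bool) -> None:
    """Fold one worked account into the archetype's running statistics."""
    arch.sends += 1
    if positive:
        arch.positive_replies += 1
        # Incremental mean: the centroid moves 1/n of the way toward the new account.
        arch.accounts += 1
        arch.centroid = [
            c + (x - c) / arch.accounts for c, x in zip(arch.centroid, features)
        ]


a014 = Archetype("A-014", centroid=[0.5, 0.2], accounts=9, sends=678, positive_replies=217)
apply_outcome(a014, features=[1.0, 0.0], positive=True)
print(f"{a014.reply_rate:.1%}", a014.centroid)
```

With several hundred sends already behind the archetype, a single reply shifts the rate by about a tenth of a point, which is why the loop needs months of volume before predictions change noticeably.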
None of those individual steps is novel. The novelty is doing all of them in continuous operation, end-to-end, across the signal-detection layer AND the reply-outcome layer simultaneously. That combination is what other tools cannot replicate — covered in detail in the sibling essay on archetype matching.
Reply Loop is the operating system. Archetype matching is what the OS runs.
02 The flywheel diagram
The mental model is a 5-stage loop. Each stage hands off to the next; the last stage hands back to the first; every full revolution sharpens the predictions.
The flywheel framing matters strategically. Most B2B tools are levels — they help you reach a level of performance, then plateau. Reply Loop is a compounding system — each cycle makes the next cycle better. Customers who run Mama for 12 months get materially sharper predictions than customers who run it for 3, even with identical ICPs and signal-source configurations. The math of compounding favors patience.
03 Sequencer integrations
Reply Loop only works if reply data flows into Mama from wherever the SDR team actually sends. That means production integrations with the sequencer ecosystem. As of 2026-Q2:
The webhook contract
The ideal integration uses real-time webhooks — the sequencer fires an HTTP POST to Mama within ~5 seconds of a reply being received. The payload includes sequence ID, prospect identifier, full reply text, and metadata (which step in the sequence, which template variant). REST pull fallbacks exist for sequencers that don't support outbound webhooks, but they add 15-60 minutes of latency to the loop, which doesn't matter for archetype refinement (nightly cycle anyway) but does matter for the SDR's UI — they want to see "this prospect replied" without refreshing.
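To make the contract concrete, here is a minimal sketch of validating an incoming reply payload. The JSON field names (`sequence_id`, `prospect_id`, `reply_text`, `sequence_step`, `template_variant`) are assumptions that mirror the items listed above; the real payload spec lives in the API docs and may differ.

```python
import json
from dataclasses import dataclass

# Hypothetical field names mirroring the payload items described in the text.
REQUIRED_FIELDS = (
    "sequence_id",
    "prospect_id",
    "reply_text",
    "sequence_step",
    "template_variant",
)


@dataclass(frozen=True)
class ReplyEvent:
    sequence_id: str
    prospect_id: str
    reply_text: str
    sequence_step: int
    template_variant: str


def parse_reply_webhook(body: bytes) -> ReplyEvent:
    """Validate a webhook POST body and turn it into a typed event."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return ReplyEvent(
        sequence_id=str(data["sequence_id"]),
        prospect_id=str(data["prospect_id"]),
        reply_text=str(data["reply_text"]),
        sequence_step=int(data["sequence_step"]),
        template_variant=str(data["template_variant"]),
    )
```

Rejecting incomplete payloads at the edge keeps sequence-step context intact for the classifier; the REST pull fallback could feed the same `parse_reply_webhook` function with polled records.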
What if my sequencer isn't on the list?
Two options. (1) Use the generic Email-Channel integration: Mama monitors a shared inbox or CC'd address and parses replies from there. Works for any sequencer that can BCC a tracking address. Higher classification noise because we lose the sequence-step context, but functional. (2) Build a custom integration via the Mama API and webhook payload spec — most sequencers have a webhook capability buried somewhere; a sales engineer can usually wire it up in an afternoon. The full API docs live at /api.
04 The 4-tier reply classifier
Not all replies carry the same training signal. The classifier separates replies into 4 tiers based on their semantic content. Each tier has different downstream behavior in the flywheel.
The classifier mechanics
The classifier is a fine-tuned LLM (Claude Haiku tier: cheap, fast, accurate enough at this 4-class task). Each reply is sent to the model with the original outbound email as context, plus the prospect's brief metadata. The classifier outputs the tier plus a confidence score (0-1). Replies with confidence below 0.7 go to a human review queue (typical volume: ~5% of replies, reviewed within 24 hours by a Mama-side editor).
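The routing that follows classification can be sketched as below. The tier names and the 0.7 threshold come from this page; the destination labels (`human_review`, `archetype_training`, `record_only`) are placeholders invented for the sketch, and the handling of wrong-person and never replies is an assumption.

```python
from enum import Enum


class Tier(Enum):
    ENGAGED = "engaged"
    NOT_NOW = "not-now"
    WRONG_PERSON = "wrong-person"
    NEVER = "never"


REVIEW_THRESHOLD = 0.7                        # below this, a human checks the label
POSITIVE_BAND = {Tier.ENGAGED, Tier.NOT_NOW}  # tiers that feed archetype matching


def route(tier: Tier, confidence: float) -> str:
    """Return a (hypothetical) destination for one classified reply."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < REVIEW_THRESHOLD:
        return "human_review"
    if tier in POSITIVE_BAND:
        return "archetype_training"
    return "record_only"  # assumption: other tiers are stored but not used for training
```

Checking confidence before the tier means a low-confidence "engaged" label never reaches the archetype matcher without a human look.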
Accuracy on the test set (~3K hand-labeled replies as of 2026-Q1): 94% on engaged, 91% on not-now, 97% on wrong-person, 99% on never. The lower numbers on engaged/not-now reflect genuine human ambiguity ("interesting timing" can read both ways). The high numbers on wrong-person and never reflect that those tiers have stronger language signatures.
05 What Reply Loop produces
Reply Loop isn't a feature you "use" — it's a pipeline that quietly upgrades the rest of Mama. But it produces three user-visible artifacts that show up across the product surface:
1. Sharper predicted reply rates on every brief
The most-felt output. Every brief shows a predicted reply rate per template variant (cold / warm / curious). At customer-month 1, these are modeled-confidence (based on industry benchmarks + ICP fit). By customer-month 4, they're blended-confidence (industry + your team's actual reply behavior). By customer-month 12, they're measured-confidence (your team's data drives 80%+ of the prediction). The confidence badge upgrades visibly over time — customers notice and value the transition.
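One way to picture the modeled, blended, and measured stages is a weighted average between the industry baseline and the team's own reply rate. The linear blend and the exact badge cutoffs below are assumptions made for illustration; only the "80%+" figure for measured confidence comes from the text.

```python
def blended_prediction(industry_rate: float, team_rate: float, team_weight: float) -> float:
    """Weighted average of the industry baseline and the team's observed rate."""
    if not 0.0 <= team_weight <= 1.0:
        raise ValueError("team_weight must be between 0 and 1")
    return team_weight * team_rate + (1.0 - team_weight) * industry_rate


def confidence_badge(team_weight: float) -> str:
    """Map the share of team data in the prediction to a badge (cutoffs assumed)."""
    if team_weight == 0.0:
        return "modeled"
    if team_weight >= 0.8:
        return "measured"
    return "blended"


# Example: 40% team data at a 25% observed rate, 60% industry baseline at 30%.
print(round(blended_prediction(0.30, 0.25, 0.4), 3), confidence_badge(0.4))
```

As the team weight grows, the prediction converges on the team's own behavior, which is the visible badge upgrade customers notice.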
2. Archetype drift alerts in the RevOps console
When an archetype's reply rate moves significantly (e.g., archetype A-014's historical reply rate was 32% for 9 months and just dropped to 24% over the last 4 weeks), the system flags it. RevOps gets a weekly digest of drifting archetypes. The drift usually has a real-world explanation (the market shifted, a competitor launched, a previously-strong segment commoditized). Catching drift early lets the team react before the archetype-driven send patterns burn cycles.
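A drift check of this kind can be as simple as comparing a recent window against the long-run baseline. The thresholds below (5 percentage points, at least 50 recent sends) are assumptions for the sketch; the page does not state what counts as "significantly".

```python
def is_drifting(
    baseline_rate: float,
    recent_rate: float,
    recent_sends: int,
    min_points: float = 5.0,   # assumed threshold, in percentage points
    min_sends: int = 50,       # assumed floor so tiny samples don't trigger alerts
) -> bool:
    """Flag an archetype whose recent reply rate has moved away from its baseline."""
    if recent_sends < min_sends:
        return False
    return abs(recent_rate - baseline_rate) * 100 >= min_points


# The A-014 example from the text: 32% baseline, 24% over the last 4 weeks.
print(is_drifting(baseline_rate=0.32, recent_rate=0.24, recent_sends=200))
```

A production check would more likely use a statistical test on the two proportions; the fixed-threshold version is only meant to show the shape of the comparison.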
3. Template-archetype recommendations
The 200-template library at /templates/library is searchable by SDRs, but Reply Loop also learns which template variants work best for which archetypes — and surfaces that mapping in the brief. For archetype A-014 (post-Series-C data-stack migrators), the system might learn that tone-curious templates outperform tone-cold by 6 points; for archetype A-007 (new-VP-RevOps first-90-days), tone-warm wins by 4 points. This template-archetype matrix updates monthly and gets surfaced as "recommended template: tpl-007 · 87% match · +6pt lift vs default."
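The template-archetype matrix can be thought of as a lookup from archetype to per-tone reply rates, from which the recommendation and its lift are derived. The rates below are made-up illustration values chosen to reproduce the +6 and +4 point examples, and treating the cold tone as the "default" is an assumption.

```python
DEFAULT_TONE = "cold"  # assumption: the baseline the lift is measured against

# Illustrative numbers only, not measured data.
matrix = {
    "A-014": {"cold": 0.26, "warm": 0.29, "curious": 0.32},
    "A-007": {"cold": 0.20, "warm": 0.24, "curious": 0.22},
}


def recommend(rates: dict[str, float], default: str = DEFAULT_TONE) -> tuple[str, int]:
    """Return the best-performing tone and its lift over the default, in points."""
    best = max(rates, key=rates.get)
    lift_points = round((rates[best] - rates[default]) * 100)
    return best, lift_points


for archetype_id, rates in matrix.items():
    tone, lift = recommend(rates)
    print(f"{archetype_id}: recommended tone {tone} (+{lift}pt vs {DEFAULT_TONE})")
```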
These three outputs are how customers see Reply Loop. The system itself is invisible plumbing; the value shows up as better predictions, drift catches, and template-match suggestions.
06 The customer cohort journey
Reply Loop's value compounds over time, which means it looks different at month 1 than at month 12. Here's the typical customer trajectory.
- modeled from industry benchmarks.
- blended: 40% your data, 60% industry baseline.
- measured · early.
- measured · stable.
The trajectory matters for two business reasons. First, it's the case for annual contracts: customers who commit to 12 months unlock significantly more value than month-to-month customers, and the pricing model reflects this. Second, it's a moat against the customer churning to a competitor: leaving Mama at month 8 means restarting the loop from zero with the new tool, sacrificing the accumulated archetype data. The flywheel doesn't just compound value; it raises switching costs.
07 Privacy + data ownership
Reply Loop ingests reply content from your sequencer. That includes prospects' replies to your team's outbound — which means it includes personal data from people outside your company who didn't sign up for Mama. This is a real privacy concern and we treat it carefully.
What Mama stores
For each reply: the prospect's email address (hashed in our system, never re-displayed in plaintext outside the original brief), the reply content, the classifier's tier + confidence, and the linkage to the originating brief. We do not sell, share, or use reply content for any purpose outside refining the requesting customer's own archetype model. Reply content from customer A never trains or influences customer B's archetypes.
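As an illustration of what hashing the prospect's address might look like, here is a sketch using a keyed hash. The choice of HMAC-SHA256, the per-workspace key, and the normalization step are all assumptions; the page does not say which scheme Mama uses.

```python
import hashlib
import hmac


def hash_email(email: str, workspace_key: bytes) -> str:
    """Keyed hash of a normalized email address (scheme assumed for this sketch)."""
    normalized = email.strip().lower()
    return hmac.new(workspace_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


key_a = b"workspace-a-secret"  # hypothetical per-workspace keys
key_b = b"workspace-b-secret"
print(hash_email("Jane.Doe@Example.com", key_a) == hash_email("jane.doe@example.com ", key_a))
print(hash_email("jane.doe@example.com", key_a) == hash_email("jane.doe@example.com", key_b))
```

A per-workspace key has the side effect that the same prospect hashes differently for customer A and customer B, which fits the isolation promise above.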
Retention + deletion
Default reply retention: 24 months. Customers can configure shorter retention windows (12, 6, or 3 months) from the workspace settings. Individual deletion: any prospect can request deletion of their reply data via a DSAR submitted to [email protected]; we honor requests within 7 business days. Full GDPR Article 17 ("right to erasure") compliance is documented in the Privacy Policy.
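The retention windows translate into a purge date per reply. The sketch below assumes calendar-month arithmetic with the day clamped to the end of shorter months; how Mama actually computes the cutoff is not stated here, and the function names are hypothetical.

```python
import calendar
from datetime import date

ALLOWED_RETENTION_MONTHS = (24, 12, 6, 3)  # default first, then the shorter options


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day if needed."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def purge_date(received: date, retention_months: int = 24) -> date:
    """Date on which a reply received on `received` falls out of retention."""
    if retention_months not in ALLOWED_RETENTION_MONTHS:
        raise ValueError(f"retention must be one of {ALLOWED_RETENTION_MONTHS}")
    return add_months(received, retention_months)


print(purge_date(date(2026, 5, 15)))      # default 24-month window
print(purge_date(date(2026, 1, 31), 3))   # day clamped to the end of April
```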
Customer data ownership
Your reply data — and the archetype model trained on it — belongs to your workspace. If you churn from Mama, the archetype library + reply history can be exported as JSON for ingestion elsewhere. We don't lock the data behind proprietary formats. The compounding value of Reply Loop is real, but it's not extortion: customers stay because the system gets better, not because their data is hostage.
08 v1 today, v2 in Q4 2026
Reply Loop ships in two versions. v1 has been GA since 2025-Q4 and powers the archetype matching described above. v2 is in design partner testing now with full GA targeted for 2026-Q4.
v1 — what's live today
The pipeline described in §1-6. Sequencer webhooks ingest replies, 4-tier classifier sorts them, engaged + not-now feed nightly archetype refresh, predictions tighten over time. The classifier looks at the reply in isolation — it sees the response but not the broader thread or downstream conversation. This is sufficient for the 4-tier classification and works well in production.
v2 — what's coming in Q4 2026
v2 adds full conversation context to the archetype training signal. Instead of just "this reply was engaged," the system tracks the entire downstream conversation (replies 2, 3, 4 in the thread, eventual meeting booking, eventual deal closing) and propagates outcome data back to the original archetype. An "engaged" reply that led to a closed-won deal worth $80K ACV trains the archetype differently than an "engaged" reply that led nowhere.
The upgrade matters because it lets archetypes start to predict downstream outcomes (meeting rate, qualified-opportunity rate, closed-won rate) rather than just reply rate. One archetype with a 32% reply rate might have a 9% meeting rate but a 3% close rate; another might have a 24% reply rate but an 11% meeting rate and a 5% close rate. The second is more valuable despite the lower reply number. v2 surfaces those downstream metrics so the SDR queue can prioritize on what actually closes, not just what replies.
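The two archetypes in that comparison make the re-ranking easy to see. The sketch below uses the rates quoted above; the archetype IDs and the `ArchetypeFunnel` structure are hypothetical, and ranking purely on close rate is a simplification of whatever v2 ends up doing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchetypeFunnel:
    archetype_id: str      # hypothetical IDs for this sketch
    reply_rate: float
    meeting_rate: float
    close_rate: float


funnels = [
    ArchetypeFunnel("high-reply", reply_rate=0.32, meeting_rate=0.09, close_rate=0.03),
    ArchetypeFunnel("high-close", reply_rate=0.24, meeting_rate=0.11, close_rate=0.05),
]

by_reply = sorted(funnels, key=lambda f: f.reply_rate, reverse=True)
by_close = sorted(funnels, key=lambda f: f.close_rate, reverse=True)

print("v1-style order:", [f.archetype_id for f in by_reply])
print("v2-style order:", [f.archetype_id for f in by_close])
```

Sorting on reply rate puts the first archetype on top; sorting on close rate flips the order, which is the prioritization change v2 is meant to enable.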
The technical blockers between v1 and v2
- CRM round-trip integration. v2 needs to know which sent emails became opportunities and which closed. That requires bidirectional CRM sync (Salesforce, HubSpot) with stage-progression event handling — work that's in flight for Q3 2026.
- Conversation threading across channels. A prospect might reply by email, then switch to LinkedIn DM, then book a meeting via Calendly. The system needs to recognize all three as the same conversation. Identity stitching is the open engineering problem.
- Longer training windows. Closed-won outcome data takes 60-180 days to settle. v2 archetype refresh runs weekly (not nightly) because the outcome signal needs that window to mature.
09 Common mistakes
Five mistakes show up in customers' first 90 days that prevent Reply Loop from reaching its potential.
Early predicted reply rates are modeled: they're industry benchmarks adapted to the customer's ICP, not the customer's actual reply behavior. Some SDRs see "32% predicted reply" and treat it as a hard forecast. The number is directional, not committed. Real reply rates from the first 60 days are what calibrate the model. Plan capacity assuming ±8 points of accuracy, not the predicted number exactly.
The flywheel runs whether you watch it or not. The customers who win watch it.
Reply Loop ships with Pro tier and above. Sequencer integration in under 30 minutes. First archetype maturation by day 60. By month 12, your predicted reply rates are sharper than any competitor's static-ICP scoring will ever be. Start the 14-day trial and connect your sequencer in the first 10 minutes — that's where the loop begins.