Voice mining

Read their G2 reviews. Without reading their G2 reviews.

Mama runs daily NLP across G2, Trustpilot, Capterra, Reddit, HN, and X — clusters what users actually say about a company, then surfaces the recurring complaints your pitch should anchor on. Real quotes. Real sources. Synthesized into the brief.

6 voice sources · Daily NLP passes · Theme clustering
signalmama.com/voice/shopify.com
NLP refreshed 4h ago
Voice mining · shopify.com
1,247 mentions analyzed · 8 themes clustered · last 90 days · 6 sources
Sentiment · last 90 days
Positive 47% · 586
Neutral 22% · 274
Negative 31% · 387
▲ 12% Negative vs 90d ago
G2 · 412 Reddit · 287 Trustpilot · 231 Capterra · 164 HN · 98 X · 55
"Slow mobile checkout"
92% negative
47 mentions ▲ trending
"Mobile checkout takes 4-6 seconds to load on our store, lost about 12% of conversions this quarter."
G2 · verified review · 1w ago
"Anyone else seeing checkout latency on Shopify Plus? Mobile is 3x slower than desktop for us."
r/shopify · 5d ago
Pitch angle: performance, mobile CRO, page-speed tooling
"Hidden shipping cost surprises"
87% negative
31 mentions
"Customers keep abandoning at the shipping step. We can't show real-time rates until checkout."
Trustpilot · 2w ago
"Native shipping is a black box. Had to add 3 third-party apps just to estimate at the cart."
Capterra · 2w ago
Pitch angle: shipping APIs, rate transparency, cart-abandonment tools
"Plus pricing pressure"
71% negative
28 mentions
"Plus is solid but the price jumps don't match the feature gap from Advanced. Re-evaluating BigCommerce."
HN · 4d ago
Pitch angle: alternative platforms, cost-per-merchant calculators, migration tooling
"App ecosystem is unmatched"
84% positive
89 mentions
"Honestly the ecosystem is what keeps us. Found an app for every weird workflow we've thrown at it."
G2 · verified review · 3d ago
Pitch angle: avoid unless your product is genuinely better than the alternative app
Six voice sources

Where people actually complain.

Voice mining only works if you're mining the right ponds. We pull from six sources because each one captures a different kind of candor — from structured G2 reviews to anonymous Reddit threads where people say what they'd never say on G2.

G2
Structured reviews · verified user

The gold standard for B2B software. Long-form reviews with pros/cons, weighted by verified-user badge and review recency. Strongest signal for mid-market and enterprise SaaS evaluation.

Avg per company: ~340 reviews
Best signal: Pros/cons clusters
Reddit
Community pain · anonymous candor

Where users say what they won't say on G2. Subreddit-scoped scraping — r/sales, r/saas, r/shopify, r/dataengineering, etc. The complaints here are raw, often funny, and almost always honest.

Avg per company: ~180 mentions
Best signal: Switching threads
Trustpilot
Consumer-facing · complaint-skewed

The complaint vault for consumer brands. Skews negative by selection bias — useful for that exact reason. If you're selling to D2C, ecommerce, or any customer-facing SaaS, this is where the friction lives.

Avg per company: ~240 reviews
Best signal: Service / ops pain
Capterra
SMB software · comparison-driven

Where buyers compare side-by-side. Skews SMB and mid-market. Reviews are shorter than G2 but the platform itself shows what tools are being benchmarked against each other — useful for switching-intent signals.

Avg per company: ~120 reviews
Best signal: Comp benchmarks
Hacker News
Developer tools · technical critique

The most opinionated audience on the internet. If you're selling devtools, infra, or anything where engineers vote with their wallet, HN comments and Show HN threads are signal-dense. Selection bias toward early-adopter critique.

Avg per company: ~75 comments
Best signal: Technical objections
X / Twitter
Real-time · viral complaint or praise

The fastest-fire feedback loop. When something breaks or ships, founders and power users post here first. Volume per company is lower than G2/Reddit but recency is unbeatable — we surface mentions within hours.

Avg per company: ~60 mentions
Best signal: Real-time fire
Theme clustering

From 47 raw mentions to one theme your rep can name.

A pile of reviews is noise. The same complaint, said 47 different ways, is a theme — and a theme is the thing a rep can actually anchor a pitch on. Here's how Mama gets from one to the other.

Step 1 · Ingest
Raw mentions, six sources.
Every mention of Shopify Plus and mobile checkout is ingested daily and deduplicated by source URL. This topic alone produced 47 raw mentions in the last 30 days, all worded differently.
"Mobile checkout takes 4-6 seconds to load."
G2 · 1w
"Anyone else seeing checkout latency on Plus?"
r/shopify · 5d
"Slow buy button on mobile killing conversion."
Trustpilot · 9d
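A minimal sketch of what "deduplicated by source URL" could look like. The `Mention` record and the normalization rules (lowercase host, drop the fragment and trailing slash, keep the query string so HN-style `item?id=` links stay distinct) are assumptions for illustration, not Mama's actual pipeline.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for one ingested review or post."""
    source: str  # e.g. "G2", "Reddit"
    url: str     # link to the original review or post
    text: str


def url_key(url: str) -> tuple:
    """Assumed normalization: lowercase host, no trailing slash, no fragment."""
    parts = urlsplit(url.strip())
    return (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)


def dedupe_by_url(mentions):
    """Keep the first mention seen for each normalized source URL."""
    seen = set()
    unique = []
    for mention in mentions:
        key = url_key(mention.url)
        if key in seen:
            continue
        seen.add(key)
        unique.append(mention)
    return unique
```

Keeping the first occurrence preserves ingest order, so a re-crawled page does not displace the copy that was already attributed.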
Step 3 · Output
One named theme, in the brief.
The cluster becomes a single theme card with a name, sentiment, mention count, trend, source mix, and 2 representative quotes.
"Slow mobile checkout"
47 mentions · 92% negative · ▲ trending
G2 · 18 Reddit · 14 Trustpilot · 9 Capterra · 6
The cluster is the work. Anyone can scrape G2. Naming the pattern in a way a rep can use is where the synthesis matters. Mama's clustering accuracy across 10K validated themes: 94.2%.
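As a rough illustration of the output step, here is how a finished cluster might be folded into a theme card with the fields listed above (name, sentiment, mention count, trend, source mix, two quotes). Everything in it is an assumption: the `ClusteredMention` record, the sentiment figure as a plain share of negative mentions (the published score may be weighted differently), "trending" as a simple comparison with a previous count, and the first two mentions standing in for representative quotes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ClusteredMention:
    """Hypothetical mention that has already been assigned to a cluster."""
    source: str
    text: str
    sentiment: str  # "positive", "neutral", or "negative"


def build_theme_card(name, mentions, previous_count):
    """Summarize one cluster as a theme card (illustrative fields only)."""
    total = len(mentions)
    negative = sum(1 for m in mentions if m.sentiment == "negative")
    source_mix = Counter(m.source for m in mentions)
    return {
        "name": name,
        "mentions": total,
        # Assumed definition: unweighted share of negative mentions.
        "pct_negative": round(100 * negative / total) if total else 0,
        # Assumed definition: more mentions than in the previous window.
        "trending": total > previous_count,
        "source_mix": dict(source_mix.most_common()),
        # Assumes the cluster is already ordered by representativeness.
        "quotes": [m.text for m in mentions[:2]],
    }
```

Calling `build_theme_card("Slow mobile checkout", cluster, previous_count=30)` would return a dictionary that can be rendered as the card shown above.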
Voice → opener synthesis

From "they complain about X" to "Hey {firstname}…"

A theme is useful. An opener that references the theme by name, with a real quote and a real source, is what gets the reply. Three worked transformations — all from the same Shopify voice profile in the hero.

Voice theme
"Slow mobile checkout"
47 mentions · 92% negative · ▲ trending
"Mobile checkout takes 4-6 seconds to load on our store, lost about 12% of conversions this quarter."
G2 · verified review · 1w ago
Synthesized opener
To: Sara, Head of Engineering · Shopify Plus merchant
Hey Sara, noticed 47 mentions across G2, Reddit, Trustpilot, and Capterra in the last 30 days about slow mobile checkout on Plus, including one merchant who said they lost ~12% of conversions this quarter. We've been helping similar Plus shops cut TTI by 60% on the buy button. Happy to share the playbook if relevant.
Why this works: verifiable specifics (47 mentions, 12% loss), real source pattern (review sites plus Reddit), no generic "personalization-token" feel. Reply-rate lift vs generic: 3.4× in our customer data.
Voice theme
"Hidden shipping cost surprises"
31 mentions · 87% negative
"Native shipping is a black box. Had to add 3 third-party apps just to estimate at the cart."
Capterra · 2w ago
Synthesized opener
To: Marcus, COO · DTC apparel brand on Shopify Plus
Hey Marcus — was reading recent Shopify Plus reviews and saw a pattern: "3 third-party apps just to estimate shipping at the cart" keeps coming up across G2, Trustpilot, and Capterra. We've built a single-API shipping layer for Plus that consolidates that into one call — saw it land particularly well with DTC apparel teams scaling SKU count.
Why this works: references the actual quoted complaint, names the cross-source pattern, names the buyer's segment. Feels researched because it is.
Voice theme
"Plus pricing pressure"
28 mentions · 71% negative
"Plus is solid but the price jumps don't match the feature gap from Advanced. Re-evaluating BigCommerce."
HN · 4d ago
Synthesized opener
To: Priya, VP Operations · Shopify Plus merchant evaluating platforms
Hey Priya — saw chatter on HN and G2 this week from Plus merchants re-evaluating the price-to-feature jump from Advanced, with BigCommerce coming up specifically. If platform-cost modeling is on your roadmap this quarter, I built a side-by-side TCO calc for Plus vs BigCommerce vs headless that 14 brands used last quarter — happy to share.
Why this works: matches a buyer mid-evaluation with an asset they're already searching for. Time-bound ("this week", "this quarter") triggers urgency without manufactured scarcity.
Why nobody else does this

Voice mining is hard. That's the point.

BuiltWith doesn't do voice. Wappalyzer doesn't do voice. Bombora and 6sense don't surface customer voice the way an SDR can actually use it. Three honest reasons why.

01
Voice is text, not data.
Most intent platforms were built to count events — page views, form fills, content downloads. Voice mining requires NLP infrastructure that's expensive to build and harder to keep accurate. Easier to sell "intent scores" than to actually read the reviews.
Others: Sentiment score (0–100)
Mama: Named theme + 2 quotes + source mix
02
Built for marketers, not SDRs.
The "voice of customer" tools that do exist (Medallia, Qualtrics, Reputation) are aimed at marketers tracking their own brand. They're not built to surface a competitor's complaints in a format an SDR can paste into a cold email. Different buyer, different output.
Others: Brand health dashboards for CMOs
Mama: Pitch-ready voice for outbound reps
03
Six sources × daily refresh is hard.
Most who try voice stop at one source (usually G2) and call it sentiment. Going across six sources — each with its own TOS, rate limits, dedup, and source-quality weighting — is the kind of work that only matters if outbound is the use case. Otherwise it's overkill. Outbound is our use case.
Others: 1 source · weekly refresh
Mama: 6 sources · daily NLP · cited verbatim
The competitive moat isn't the data — it's the assembly. Anyone with budget can scrape G2. Clustering six sources into themes a rep can quote, refreshed daily, cited in the brief — that's the work nobody else bothers with because nobody else is building for the SDR.
Voice mining by plan

All 6 voice sources. From day one.

Source coverage and clustering quality are the same on every plan — Solo, Team, Pro. What scales is brief volume, theme-history depth, and how many companies you can watch.

Solo
$49 / month
100 briefs / month
  • All 6 voice sources
  • Theme clustering + pitch angles
  • 30-day theme history
  • Personal watchlist · 25 accounts
Start free
Pro
$599 / month
Unlimited
Briefs forever
  • Custom voice sources (ticket systems, NPS, etc.)
  • Full voice API + theme webhooks
  • Multilingual NLP (12 languages)
  • Unlimited watchlists & workspaces
Talk to us

Want a voice source we don't track yet? Tell us — we add new sources monthly.

Voice mining questions

What everyone asks about voice mining.

Specifically about how voice mining works — pricing details and brief mechanics live on their own pages.

Clustering accuracy across 10K validated themes runs at 94.2%, measured against a held-out test set of human-labeled themes. Sentiment accuracy at the per-mention level is 91.7% for English. Both numbers refresh quarterly as we expand the labeled set. When Mama's confidence on a theme is below the threshold, we flag it as "emerging · low-confidence" rather than hiding it, so the rep sees the signal and the caveat together. Customer-reported false-positive rate over the last 30 days: 0.9% of surfaced themes.
No. We use official APIs where they exist (Reddit, X, HN, Capterra partnerships) and licensed data partners or public-page reading with proper rate limiting where they don't (G2, Trustpilot). Every source has a documented retrieval method in our internal source catalog, reviewed quarterly by legal. We don't surface user PII: every quote is from publicly posted content with attribution. Full source-handling docs are available under NDA at [email protected].
Honestly, they're harder. If a company has fewer than 25 mentions across all six sources, Mama doesn't cluster them into themes, because clustering needs volume to be meaningful. Instead, the brief shows the raw mentions as-is, attributed, with a "low-volume voice" note so reps know not to anchor their pitch on a single complaint. The 25-mention figure is the floor, and the threshold scales with company size: a 50-person startup with 25 mentions is treated as data-rich, while a 5,000-person enterprise needs 100+ mentions to cluster.
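A sketch of how a size-scaled threshold like this might be computed. Only the two anchor points come from the answer above (25 mentions at 50 people, 100 mentions at 5,000 people); the log-scale interpolation between them, the clamping outside that range, and the function names are assumptions made for illustration.

```python
import math

# Anchor points taken from the text; everything between them is assumed.
SMALL_HEADCOUNT, SMALL_THRESHOLD = 50, 25
LARGE_HEADCOUNT, LARGE_THRESHOLD = 5_000, 100


def min_mentions_to_cluster(headcount: int) -> int:
    """Mentions needed before themes are clustered (hypothetical scaling)."""
    if headcount <= SMALL_HEADCOUNT:
        return SMALL_THRESHOLD
    if headcount >= LARGE_HEADCOUNT:
        return LARGE_THRESHOLD
    # Assumed: interpolate linearly in log10(headcount) between the anchors.
    span = math.log10(LARGE_HEADCOUNT / SMALL_HEADCOUNT)
    fraction = math.log10(headcount / SMALL_HEADCOUNT) / span
    return math.ceil(
        SMALL_THRESHOLD + fraction * (LARGE_THRESHOLD - SMALL_THRESHOLD)
    )


def should_cluster(mention_count: int, headcount: int) -> bool:
    """False means the brief would show raw mentions with a low-volume note."""
    return mention_count >= min_mentions_to_cluster(headcount)
```

Under these assumptions, a 500-person company sits halfway between the anchors on a log scale, so it would need roughly 63 mentions before clustering kicks in.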
On Pro, yes. Push customer voice into Mama via the webhook API — support tickets, NPS comments, churn-reason fields from your CRM, Gainsight notes, in-app feedback widgets. Mama runs the same NLP pipeline (cluster + sentiment + dedup) and surfaces internal themes alongside the public ones in the brief. Useful for CS-led-growth motions and expansion playbooks where the most predictive voice lives in your own data. Solo and Team are public sources only.
Solo and Team cluster English mentions only. Non-English content gets logged but is excluded from clustering. Pro includes multilingual NLP across 12 languages (Spanish, French, German, Portuguese, Italian, Dutch, Polish, Japanese, Korean, Chinese-Simplified, Indonesian, Arabic), with clustering accuracy ranging from 88% to 94% depending on the language and source. Mama auto-detects language per mention; the brief shows the original-language quote plus a one-line English summary so the rep gets the gist without translation friction.
Verbatim. Every quote in the brief is the exact text as posted, with a link to the source. We don't paraphrase, summarize, or "improve" the quote — the credibility of a voice-anchored opener depends entirely on the rep being able to point to the actual review if asked. Mama only trims when the original is over 280 characters; the brief shows the trimmed quote with a "…" marker and the full text expands on click.
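The trimming rule can be sketched in a few lines. The 280-character limit and the "…" marker come from the answer above; breaking at the last whitespace before the limit, and the function name, are assumptions rather than a description of Mama's implementation.

```python
MAX_QUOTE_CHARS = 280  # limit stated in the text


def display_quote(text: str, limit: int = MAX_QUOTE_CHARS) -> str:
    """Return the quote verbatim, or a trimmed prefix ending in an ellipsis."""
    if len(text) <= limit:
        return text  # short quotes are never altered
    cut = text[:limit]
    # Assumed nicety: avoid cutting mid-word when a space is available.
    head, sep, _ = cut.rpartition(" ")
    if sep:
        cut = head
    return cut.rstrip() + "…"
```

Because the function only ever returns a prefix of the original, the displayed text stays verbatim up to the marker, and the full quote can still be shown on expand.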
Several layers. We weight verified-user reviews higher than unverified ones (most platforms now provide this flag). We detect timing clusters: when fifty 5-star reviews land within 48 hours after a vendor-paid campaign, the cluster gets flagged "promotional event" and excluded from sentiment scoring. We deduplicate the same review text appearing across multiple sources (a common tell of paid amplification). And we down-weight reviews that read as templated; the clustering layer itself naturally surfaces these because they all phrase the same praise the same way. Customer-reported promotional false-positive rate: 0.4%.
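The timing check described above can be approximated with a sliding window over review timestamps. This is a sketch under assumptions: reviews arrive as `(timestamp, stars)` pairs, the 50-review and 48-hour figures from the answer are used as defaults, and the function name is hypothetical. It covers only the timing layer, not verified-user weighting or template detection.

```python
from datetime import datetime, timedelta


def flag_promotional_bursts(reviews, min_reviews=50, window=timedelta(hours=48)):
    """Return indices of 5-star reviews that fall inside a dense burst.

    reviews: list of (timestamp, stars) tuples, in any order.
    """
    # Keep original indices so callers can exclude flagged reviews later.
    five_star = sorted(
        (ts, i) for i, (ts, stars) in enumerate(reviews) if stars == 5
    )
    flagged = set()
    start = 0
    for end in range(len(five_star)):
        # Shrink the window until it spans at most `window` of time.
        while five_star[end][0] - five_star[start][0] > window:
            start += 1
        if end - start + 1 >= min_reviews:
            flagged.update(i for _, i in five_star[start:end + 1])
    return flagged
```

Flagged indices could then be dropped before sentiment scoring, which matches the "excluded from sentiment scoring" behavior described in the answer.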

More questions about briefs themselves? See the brief product page.

Let their words sell for you

Stop guessing at their pain. Quote it.

Run a free lookup on any domain. See the voice themes, the verbatim quotes, the synthesized pitch angles — same depth as a paid Mama brief, no signup. Reach for Solo when one domain stops being enough.

Free forever · No credit card · All 6 voice sources on Solo