Our Cold Email Deliverability Tanked. An Agent Fixed It.

Ibby Syed, Founder, Cotera
8 min read · March 7, 2026

On a Tuesday in November, Priya noticed that open rates on our three largest SmartLead campaigns had dropped from 47% to 19%. In two days. No changes to subject lines, no changes to copy, no changes to sending schedule. Just a cliff.

She checked bounce rates. They'd climbed from 2.8% to 14.3%. That's not a yellow flag. That's a five-alarm fire. At 14%, your sending reputation is actively deteriorating with every email you send. Google and Microsoft are watching, and they have long memories.

Priya spent the next six hours doing forensics. She exported the bounce list, cross-referenced it against the lead lists we'd uploaded the previous week, and found the source: a list of 2,400 leads from a new data vendor. The vendor had promised "verified" emails. About 340 of them bounced. Another 200+ were catch-all addresses that accepted everything but delivered nothing. The list was contaminated, and it had poisoned our sender reputation across two domains.
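The cross-referencing step is easy to sketch in code. The snippet below is a minimal illustration, not the process Priya actually followed: the function name `bounces_by_list`, the list names, and the addresses are all hypothetical. It attributes each bounced address to the first lead list that contained it and counts bounces per list.

```python
from collections import Counter


def bounces_by_list(bounced_emails, lead_lists):
    """Count bounced addresses per source list.

    lead_lists maps a list name to the addresses uploaded from it.
    """
    owner = {}
    for name, emails in lead_lists.items():
        for email in emails:
            # Keep the first list that introduced an address.
            owner.setdefault(email.strip().lower(), name)
    return Counter(owner.get(e.strip().lower(), "unknown") for e in bounced_emails)


# Hypothetical data for illustration only.
lead_lists = {
    "existing_crm": ["ana@acme.example", "bo@globex.example"],
    "new_vendor": ["cy@initech.example", "di@hooli.example", "ed@umbrella.example"],
}
bounced = ["CY@initech.example", "di@hooli.example", "zed@nowhere.example"]

counts = bounces_by_list(bounced, lead_lists)
print(counts.most_common())
```

A list that accounts for most of the bounces stands out immediately in the sorted counts.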

That six-hour forensic session saved our outbound program. It also made us realize we couldn't keep doing this manually.

How Deliverability Dies

Most people think deliverability is binary: either your emails land in the inbox or they go to spam. It's actually a gradient that shifts constantly based on signals you can and can't control.

The signals you control: bounce rate, spam complaint rate, sending volume consistency, authentication (SPF, DKIM, DMARC), and content quality. The signals you can't control: recipient engagement patterns, ISP algorithm changes, and whether some random person marks your email as spam because they're having a bad day.

What happened to us was a cascade. Bad leads caused bounces. Bounces damaged sender reputation. Damaged reputation caused emails to land in spam instead of the inbox. Spam placement killed open rates. Low open rates further damaged reputation. It's a death spiral, and it moves fast.

Tomás, who manages our sending domains, explained it to me like this: "Building sender reputation takes months of careful warmup. Destroying it takes about 48 hours of bad data."

We had three sending domains. Two were compromised by the time Priya caught the problem. The third survived only because its campaigns hadn't pulled from the contaminated list yet.

The Manual Recovery

Recovering deliverability manually is tedious, slow, and error-prone. Here's what Priya and Tomás did over the next two weeks.

They paused all campaigns on the affected domains immediately. Every day those domains kept sending was another day of reputation damage, so speed mattered. Then they exported every lead list from every active campaign and ran them through a third-party email verification service. The verification caught another 180 addresses that hadn't bounced yet but were flagged as risky — role-based addresses (info@, sales@, support@), disposable email providers, and domains with no MX records.

They removed every flagged address. Then they rebuilt the warmup schedule for both domains. Warmup means starting at very low volume — maybe 20 emails per day — and gradually increasing over four to six weeks while maintaining high engagement rates. Tomás set calendar reminders to check warmup progress every morning. He spent about 15 minutes per domain per day monitoring open rates, bounce rates, and spam placement.
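One way to picture a warmup ramp is as a schedule of daily send caps. The sketch below assumes a geometric ramp. Only the starting volume of 20 per day and the four-to-six-week window come from the text above; the target of 400 per day, the 35-day length, and the function name `warmup_schedule` are assumptions for illustration.

```python
def warmup_schedule(start=20, target=400, days=35):
    """Return daily send caps that grow geometrically from start to target."""
    if days < 2:
        raise ValueError("days must be at least 2")
    # Constant day-over-day growth factor that reaches target on the last day.
    ratio = (target / start) ** (1 / (days - 1))
    return [min(target, round(start * ratio ** day)) for day in range(days)]


schedule = warmup_schedule()
for week_start in range(0, len(schedule), 7):
    week = schedule[week_start:week_start + 7]
    print(f"week {week_start // 7 + 1}: {week}")
```

A geometric ramp keeps the percentage increase per day constant, so early days grow by a handful of emails while later days grow by dozens. A linear ramp would be an equally reasonable choice.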

After six weeks, domain reputation had recovered enough to resume normal sending volume. Six weeks. That's six weeks of reduced outbound capacity, which meant fewer meetings booked, which meant a pipeline gap that the sales team felt for the entire quarter.

The total cost of that one bad lead list: roughly $34,000 in lost pipeline, plus about 80 hours of Priya and Tomás's time on recovery.

What an Agent Does Differently

After the recovery, we set up a lead cleanup agent that monitors deliverability metrics across all SmartLead campaigns continuously. Not once a day. Not when someone remembers to check. Continuously.

The agent watches four things.

Bounce rates per campaign, per domain, per lead list. If any segment exceeds a 3% bounce rate, the agent pauses sending from that segment and flags it for review. Priya set the threshold. She wanted it aggressive because the cost of false positives (pausing a good campaign for an hour) is negligible compared to the cost of false negatives (letting a bad list burn a domain).
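The threshold rule can be sketched in a few lines. This is an illustrative outline rather than the agent's actual logic: the `SegmentStats` type, the `segments_to_pause` function, and the sample data are hypothetical, and the minimum-sample guard (`MIN_SENT`) is an added assumption so that one bounce out of five sends does not pause a campaign.

```python
from dataclasses import dataclass

BOUNCE_THRESHOLD = 0.03  # the 3% threshold described above
MIN_SENT = 50            # assumption: skip segments with too few sends to judge


@dataclass
class SegmentStats:
    campaign: str
    domain: str
    lead_list: str
    sent: int
    bounced: int

    @property
    def bounce_rate(self) -> float:
        return self.bounced / self.sent if self.sent else 0.0


def segments_to_pause(stats, threshold=BOUNCE_THRESHOLD, min_sent=MIN_SENT):
    """Return the segments whose bounce rate exceeds the threshold."""
    return [s for s in stats if s.sent >= min_sent and s.bounce_rate > threshold]


# Hypothetical segments using the 2.8% and 14.3% rates from the incident.
stats = [
    SegmentStats("q4-outbound", "mail-a.example", "crm_export", sent=1000, bounced=28),
    SegmentStats("q4-outbound", "mail-b.example", "new_vendor", sent=1000, bounced=143),
]
paused = segments_to_pause(stats)
for s in paused:
    print(f"pause {s.campaign}/{s.lead_list}: {s.bounce_rate:.1%}")
```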

New lead uploads. When anyone adds leads to a SmartLead campaign, the agent checks the list against known problem patterns — domains with no MX records, role-based addresses, disposable email providers, domains that have historically bounced. It quarantines suspect leads before a single email goes out. The contaminated vendor list that started this whole disaster? The agent would have caught 89% of the bad addresses before they entered a campaign.
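The pattern checks listed above can be sketched as a single screening function. Everything named here is hypothetical: `screen_lead`, the small lookup sets, and the `has_mx` callable, which stands in for a real DNS MX lookup so the example runs without network access.

```python
# The first three role names come from the text above; the rest are illustrative.
ROLE_LOCAL_PARTS = {"info", "sales", "support", "admin", "contact"}
# Placeholder set; a real deployment would load a maintained blocklist.
DISPOSABLE_DOMAINS = {"mailinator.com"}


def screen_lead(email, has_mx, bounced_domains=frozenset()):
    """Return the reasons to quarantine an address; an empty list means it passes."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return ["malformed address"]
    reasons = []
    if local in ROLE_LOCAL_PARTS:
        reasons.append("role-based address")
    if domain in DISPOSABLE_DOMAINS:
        reasons.append("disposable provider")
    if domain in bounced_domains:
        reasons.append("domain has bounced before")
    if not has_mx(domain):
        reasons.append("no MX record")
    return reasons


# Stand-in for a DNS lookup: only these domains "have" MX records.
known_mx = {"acme.example", "mailinator.com"}


def has_mx(domain):
    return domain in known_mx


for address in ["ana@acme.example", "info@acme.example",
                "x@mailinator.com", "joe@gone.example"]:
    print(address, screen_lead(address, has_mx) or "ok")
```

Returning every matching reason, rather than stopping at the first, makes a manual review of quarantined leads faster.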

Sending domain health. The agent monitors authentication records for each sending domain — SPF, DKIM, and DMARC configuration. When Tomás accidentally broke our DKIM record during a DNS migration in January, the agent flagged it within 20 minutes. The manual process would have caught it when open rates dropped, which could have been days later.

Warmup progress. For domains in warmup mode, the agent tracks daily sending volume against the warmup schedule and monitors engagement metrics. If engagement drops below expected thresholds during warmup, it slows the ramp rather than pushing through and damaging the domain.
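"Slowing the ramp" can be sketched as a rule for picking tomorrow's send cap. The function below is a hypothetical illustration: `next_daily_cap` is an invented name, the 30% open-rate floor and the `slow_factor` of 0.25 are assumptions, and the 3% bounce ceiling simply reuses the alert threshold mentioned earlier. When metrics look healthy it takes the full planned step; otherwise it takes only a fraction of it.

```python
def next_daily_cap(current_cap, planned_cap, open_rate, bounce_rate,
                   min_open_rate=0.30, max_bounce_rate=0.03, slow_factor=0.25):
    """Pick tomorrow's send cap for a domain in warmup.

    Healthy metrics take the full planned step. Unhealthy metrics take only
    slow_factor of the step; a slow_factor of 0 holds volume flat.
    """
    healthy = open_rate >= min_open_rate and bounce_rate <= max_bounce_rate
    step = max(0, planned_cap - current_cap)
    if not healthy:
        step = int(step * slow_factor)
    return current_cap + step


print(next_daily_cap(100, 120, open_rate=0.45, bounce_rate=0.01))
print(next_daily_cap(100, 120, open_rate=0.19, bounce_rate=0.01))
```

The first call has healthy engagement and moves to the planned cap of 120. The second has a 19% open rate and advances only a quarter of the way.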

The Numbers After Four Months

We've been running the agent for four months now. The results compared to our pre-agent baseline:

Average bounce rate dropped from 4.1% to 1.6%. The pre-agent number was already inflated by the November disaster, but even our "healthy" periods ran about 3.2%. The agent keeps it below 2% by catching problems before they propagate.

We've blocked 1,847 suspect email addresses from entering campaigns. That's 1,847 potential bounces that never happened. At our sending volume, those bounces would have been enough to trigger reputation damage on at least one domain.

Domain reputation incidents: zero in four months, compared to two in the three months before the agent. That's not a statistically massive sample, but Tomás sleeps better.

Time spent on deliverability monitoring: about 2 hours per week, down from roughly 12 hours per week during the manual period. Priya reviews the agent's weekly summary, spot-checks a few flagged leads, and moves on. She told me she actually enjoys reviewing deliverability data now because the agent has already done the boring part.

The most satisfying moment was in February. We onboarded a new data vendor and uploaded their first list — 1,100 leads. The agent flagged 67 addresses as suspect and quarantined them. Priya checked manually. Sixty-three of the 67 were genuinely bad. The other four were borderline but not worth the risk. Without the agent, those 63 addresses would have entered our campaigns and we'd have seen bounce rates spike within hours.

"It's like having a bouncer at the door," Rafael said during a team meeting. "Nobody gets into the campaign without getting checked first."

What We'd Do Differently

If I were starting from scratch, I'd set up the monitoring agent before sending a single cold email. Deliverability is one of those things where prevention costs almost nothing and recovery costs everything. We learned that lesson the expensive way.

I'd also set tighter thresholds earlier. Priya initially wanted to set the bounce rate alert at 5% because she thought 3% would generate too many false alarms. After seeing how fast things deteriorate past 5%, she dropped it to 3%. The false alarm rate has been fine — maybe one unnecessary pause per month, which is resolved in minutes.

One thing we haven't automated yet is vendor evaluation. When we get a new lead list from a data provider, the agent checks individual addresses but doesn't track vendor-level quality over time. We want to build a scorecard: Vendor A delivers 98% clean lists, Vendor B delivers 91%, Vendor C delivered that nightmare list in November and should never be used again. That's on the roadmap.
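A first pass at such a scorecard could be as simple as a clean rate per vendor, aggregated across uploads. The sketch below is hypothetical, since this part is not built yet: `vendor_scorecard` and the vendor labels are invented. The counts come from the two lists described in this article. The November count treats the roughly 340 bounces plus at least 200 catch-all addresses as bad, and the February count uses only the 63 addresses confirmed bad among those flagged, so both clean rates are upper bounds.

```python
from collections import defaultdict


def vendor_scorecard(uploads):
    """Aggregate (vendor, total_leads, bad_leads) rows into clean rates."""
    totals = defaultdict(lambda: [0, 0])
    for vendor, total, bad in uploads:
        totals[vendor][0] += total
        totals[vendor][1] += bad
    return {vendor: 1 - bad / total
            for vendor, (total, bad) in totals.items() if total}


# Vendor labels are placeholders; counts are taken from the article.
uploads = [
    ("november_vendor", 2400, 340 + 200),
    ("february_vendor", 1100, 63),
]
scores = vendor_scorecard(uploads)
for vendor, clean_rate in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{vendor}: {clean_rate:.1%} clean")
```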

Tomás summed it up well: "Deliverability isn't something you fix. It's something you maintain. And maintaining something 24/7 is exactly what an agent is for."

He's right. The worst part of the November incident wasn't the $34,000 in lost pipeline. It was knowing that the problem had been growing for 48 hours before anyone noticed. An agent doesn't sleep, doesn't get distracted, and doesn't assume that a 5% bounce rate is "probably fine." It checks, and it acts, and it keeps your domains alive.

