
Designing a Production-Grade Email System - From Architecture to Scale

Published:  at  08:50 PM

Email systems look deceptively simple.

At a surface level, it feels like:

In reality, email platforms sit at the intersection of distributed systems, data modeling, reliability, search, and business workflows. Small architectural mistakes compound quickly once volume, concurrency, and product requirements increase.

This article walks through how I approached designing a production-grade email system, from planning to execution, while balancing real-world constraints.


Problem Definition

The goal was to build an email platform that supports:

All while remaining:

This was not a toy project. It needed to behave like a real product.


High-Level Architecture

At a high level, the system was divided into three distinct layers:

  1. Email Transport & Storage
  2. Metadata & Indexing
  3. Product & Business Pipelines

Separating these early avoided tight coupling and allowed each layer to evolve independently.


Planning the Data Model (The Most Important Step)

Raw Emails vs Metadata

A critical early decision was to separate raw email storage from structured metadata.

Raw email storage:

Metadata:

Trying to store everything in a single system creates unnecessary cost and complexity.


Storage Strategy

Raw Email Storage

Raw emails were stored as .eml files in object storage (e.g. S3).

Why object storage:

Each email is written once and rarely modified.

Immutability was a feature, not a limitation.
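
One way to lean on that immutability is to derive the object key from the message bytes, so writing the same email twice lands on the same key. The sketch below is an illustration only: the `raw/` prefix, the date-partitioned layout, and the `raw_email_key` helper are assumptions, not the layout the system actually used.

```python
import hashlib
from datetime import datetime, timezone


def raw_email_key(raw: bytes, received_at: datetime, prefix: str = "raw") -> str:
    """Build a deterministic object key for a raw .eml payload.

    Hypothetical layout: <prefix>/<YYYY>/<MM>/<DD>/<sha256>.eml
    """
    # Hashing the content makes the key stable across retries of the same write.
    digest = hashlib.sha256(raw).hexdigest()
    # Partition by UTC day so keys do not depend on the server's local timezone.
    day = received_at.astimezone(timezone.utc).strftime("%Y/%m/%d")
    return f"{prefix}/{day}/{digest}.eml"


if __name__ == "__main__":
    when = datetime(2024, 1, 2, 15, 30, tzinfo=timezone.utc)
    print(raw_email_key(b"Subject: hi\r\n\r\nbody\r\n", when))
```

Because the key is a pure function of the payload and the receive time, a retried write overwrites identical bytes instead of creating a duplicate object.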


Metadata Storage

Metadata required:

This ruled out document-only approaches early.

A relational database (PostgreSQL) was used for:

This made threading, search, and filtering predictable and debuggable.
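
To make the relational shape concrete, here is a minimal sketch of a threads/messages schema. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs anywhere; the table names, columns, and the `add_message` and `inbox` helpers are all assumptions made for illustration, not the production schema.

```python
import sqlite3

# Hypothetical schema: threads are rows in their own table, and each message
# points at its thread and at the raw .eml object in storage.
SCHEMA = """
CREATE TABLE threads (
    id INTEGER PRIMARY KEY,
    subject TEXT NOT NULL,
    last_message_at TEXT NOT NULL
);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    thread_id INTEGER NOT NULL REFERENCES threads(id),
    message_id TEXT NOT NULL UNIQUE,
    sender TEXT NOT NULL,
    received_at TEXT NOT NULL,
    raw_key TEXT NOT NULL
);
CREATE INDEX idx_messages_thread ON messages(thread_id, received_at);
CREATE INDEX idx_threads_recent ON threads(last_message_at DESC);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_message(conn, thread_id, subject, message_id, sender, received_at, raw_key):
    """Insert a message, creating its thread when thread_id is None.

    Both writes share one transaction: a duplicate Message-ID rolls back
    the thread insert as well.
    """
    with conn:  # commits on success, rolls back on any exception
        if thread_id is None:
            cur = conn.execute(
                "INSERT INTO threads (subject, last_message_at) VALUES (?, ?)",
                (subject, received_at),
            )
            thread_id = cur.lastrowid
        else:
            # ISO-8601 UTC strings compare correctly as text.
            conn.execute(
                "UPDATE threads SET last_message_at = MAX(last_message_at, ?) WHERE id = ?",
                (received_at, thread_id),
            )
        conn.execute(
            "INSERT INTO messages (thread_id, message_id, sender, received_at, raw_key) "
            "VALUES (?, ?, ?, ?, ?)",
            (thread_id, message_id, sender, received_at, raw_key),
        )
    return thread_id


def inbox(conn, limit=20):
    """List threads newest-first using metadata only; raw bodies are never read."""
    return conn.execute(
        "SELECT t.id, t.subject, t.last_message_at, COUNT(m.id) "
        "FROM threads t JOIN messages m ON m.thread_id = t.id "
        "GROUP BY t.id ORDER BY t.last_message_at DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

The unique constraint on `message_id` is what keeps a re-delivered email from being stored twice, and the inbox query touches only the two metadata tables.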


Threading & Conversation Management

Threading is where many systems break.

Relying only on headers like In-Reply-To is not sufficient. Real-world email clients:

The solution combined:

Threads became first-class entities, not just inferred relationships.
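
As a rough illustration of resolving a message to a thread with more than one signal, the sketch below tries the `In-Reply-To` and `References` headers first and falls back to a normalized subject. The specific signals, the in-memory `ThreadIndex` class, and the subject fallback are assumptions for the example; a real resolver would likely also check participants and a time window before merging on subject alone.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Strips any run of leading reply/forward prefixes: "Re:", "RE :", "Fwd:", "FW:".
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    return _PREFIX.sub("", subject).strip().lower()


@dataclass
class ThreadIndex:
    """Hypothetical in-memory index; production would back this with the database."""

    by_message_id: dict[str, int] = field(default_factory=dict)
    by_subject: dict[str, int] = field(default_factory=dict)
    next_id: int = 1

    def resolve(
        self,
        message_id: str,
        in_reply_to: Optional[str],
        references: list[str],
        subject: str,
    ) -> int:
        thread_id = None
        # 1. Prefer explicit header links, checking the newest reference first.
        for ref in [in_reply_to, *reversed(references)]:
            if ref and ref in self.by_message_id:
                thread_id = self.by_message_id[ref]
                break
        # 2. Fall back to the normalized subject when headers are missing or unknown.
        key = normalize_subject(subject)
        if thread_id is None and key:
            thread_id = self.by_subject.get(key)
        # 3. Otherwise this message starts a new thread.
        if thread_id is None:
            thread_id = self.next_id
            self.next_id += 1
        self.by_message_id[message_id] = thread_id
        if key:
            self.by_subject.setdefault(key, thread_id)
        return thread_id


if __name__ == "__main__":
    idx = ThreadIndex()
    first = idx.resolve("<1@a>", None, [], "Quarterly report")
    reply = idx.resolve("<2@b>", "<1@a>", ["<1@a>"], "Re: Quarterly report")
    print(first == reply)
```

Every message records its own ID in the index, so later replies can attach to it even when the original root is missing from their headers.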

This made:


Inbound Email Flow

  1. Email received via SMTP (SES)
  2. Raw .eml stored in object storage
  3. Lambda triggered asynchronously
  4. Email parsed into structured metadata
  5. Thread resolved or created
  6. Metadata persisted transactionally
  7. Notifications dispatched
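
Step 4 above (parsing the raw message into structured metadata) can be sketched with Python's standard `email` package. The `parse_metadata` function and the field names in the returned dictionary are assumptions for illustration; the actual Lambda handler and its output shape are not shown in this article.

```python
from email import message_from_bytes, policy
from email.utils import getaddresses, parsedate_to_datetime


def _text(value):
    """Return a header value as a plain stripped string, or None when absent."""
    return str(value).strip() if value is not None else None


def parse_metadata(raw: bytes) -> dict:
    """Extract the headers needed for threading and inbox listing from a raw .eml."""
    msg = message_from_bytes(raw, policy=policy.default)

    def addresses(name):
        values = [str(v) for v in msg.get_all(name, [])]
        return [addr for _, addr in getaddresses(values) if addr]

    date = None
    if msg["Date"] is not None:
        try:
            date = parsedate_to_datetime(str(msg["Date"]))
        except (TypeError, ValueError):
            date = None  # malformed Date header; the caller can use receive time instead

    return {
        "message_id": _text(msg["Message-ID"]),
        "in_reply_to": _text(msg["In-Reply-To"]),
        "references": (_text(msg["References"]) or "").split(),
        "subject": _text(msg["Subject"]) or "",
        "from": addresses("From"),
        "to": addresses("To"),
        "date": date,
    }


if __name__ == "__main__":
    sample = (
        b"From: Alice <alice@example.com>\r\n"
        b"To: bob@example.com\r\n"
        b"Subject: Re: Hello\r\n"
        b"Message-ID: <2@example.com>\r\n"
        b"In-Reply-To: <1@example.com>\r\n"
        b"Date: Mon, 01 Jan 2024 10:00:00 +0000\r\n"
        b"\r\n"
        b"Body\r\n"
    )
    print(parse_metadata(sample))
```

Only headers are extracted here; the body stays in the stored `.eml`, which keeps the metadata rows small.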

Each step was:

Failures never blocked email ingestion.


Outbound Email Flow

Outbound email followed a mirrored but separate pipeline.

This separation mattered.

Outbound emails:

Sent emails were:

Inbound and outbound logic shared schemas — not execution paths.


Search & Indexing

Search is often underestimated.

The system avoided full-text indexing initially and focused on:

This covered:

Advanced search engines can always be added later.
Correct data modeling cannot.


Performance Constraints

Early constraints shaped the design:

Optimizations included:

Inbox rendering became a metadata problem, not a content problem.


Deployment Strategy

The system was deployed using:

Why serverless:

Cold starts were mitigated by:


Scaling Strategy

Scaling was planned in layers:

Storage Scaling

Read Scaling

Write Scaling

The system scaled by design, not by reaction.


Observability & Reliability

Every pipeline emitted:

Failures were expected and handled:

Silent failures are worse than loud ones.
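
One common way to keep failures loud is to retry a step a bounded number of times, log every failed attempt, and park the payload somewhere visible once retries run out. The sketch below assumes that pattern; the `run_with_retries` helper, the backoff values, and the in-memory dead-letter list are hypothetical and stand in for whatever queueing and alerting the real pipelines used.

```python
import logging
import time

log = logging.getLogger("pipeline")


def run_with_retries(step, payload, dead_letters, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run step(payload) with exponential backoff.

    Every failure is logged. After the last attempt the payload is appended to
    dead_letters (a stand-in for a real dead-letter queue) and None is returned.
    """
    name = getattr(step, "__name__", repr(step))
    for attempt in range(1, attempts + 1):
        try:
            return step(payload)
        except Exception as exc:
            log.warning("step %s failed (attempt %d/%d): %s", name, attempt, attempts, exc)
            if attempt == attempts:
                dead_letters.append({"step": name, "payload": payload, "error": repr(exc)})
                return None
            # Delays grow as 0.5s, 1s, 2s, ... with the default base_delay.
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    parked = []

    def always_fails(_payload):
        raise RuntimeError("parser crashed")

    run_with_retries(always_fails, {"raw_key": "raw/example.eml"}, parked, sleep=lambda s: None)
    print(parked)
```

The `sleep` parameter is injectable so the backoff can be skipped in tests; nothing fails without leaving either a log line or a dead-letter record.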


Marketing & Business Pipelines (Often Ignored)

Email platforms are not just communication tools — they are business engines.

The architecture allowed:

This enabled:

Building this later would have been painful.
Building hooks early was cheap.


Trade-Offs & Lessons Learned

No architecture is perfect.

Trade-offs made:

What was gained:

If I had to rebuild it:


Final Thoughts

Email systems punish shortcuts.

They reward:

This architecture wasn’t designed to impress —
it was designed to survive production.

That, in the end, is the real benchmark.


If you’re building systems that need to last,
design for clarity first — scale will follow.

