Why Email Parsing API capability matters for developers
Inbound email is a critical integration surface for modern SaaS. Product feedback loops, ticket ingestion, approvals, automated workflows, and user-generated content often arrive as emails. An effective email parsing API turns raw MIME into clean, structured JSON that your application can trust. Developer teams evaluating options often compare MailParse with Twilio SendGrid's inbound webhook, commonly referred to as SendGrid Inbound Parse. The differences are most obvious in payload format, setup complexity, and how each service handles edge cases like large attachments, non-ASCII headers, and retry semantics.
This article focuses specifically on email parsing API capabilities. We cover REST and webhook delivery models, payload structures, security, and developer experience. You will find code examples and a side-by-side comparison so you can choose the best path for your architecture and roadmap.
For broader email stack planning, see the Email Infrastructure Checklist for SaaS Platforms, and if you plan to embed email ingestion into product workflows, review Top Inbound Email Processing Ideas for SaaS Platforms.
How MailParse handles the Email Parsing API
The platform provides two developer-friendly delivery modes: a JSON webhook and a REST polling API. You can start with instant email addresses for rapid prototyping, then add a custom domain later. The service turns raw MIME into deterministic JSON designed to minimize string parsing in your code.
Webhook payload structure
Upon receipt of an email, the service posts a compact JSON body to your endpoint using UTF-8. The payload design aims to normalize variable email formats without losing fidelity.
- Top-level fields: id, received_at (ISO 8601), direction, spam_score, dkim_pass, spf_pass
- Envelope: from, to, rcpt_to, originating_ip
- Headers: a map of canonicalized header names to values with original casing preserved in a parallel map
- Addresses: structured arrays for from, to, cc, bcc with display_name and address components
- Content: subject, text, html with consistent newline normalization
- Attachments: filename, content_type, size, sha256, disposition, content_id, and a short-lived signed url for streaming
- Raw MIME: optional, off by default, retrievable via a signed url or included inline on request
Every delivery carries the message id plus a stable event id, so your handler can be made idempotent. That lets your app de-duplicate replays when a transient webhook failure triggers a retry.
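The de-duplication described above can be sketched as follows. This is a minimal illustration, not the service's own client code: it assumes the stable event id arrives in a field named `event_id` (the exact field name is not stated here), and it uses a hypothetical in-memory `SeenEvents` helper where a production system would more likely rely on a database unique constraint or a cache with expiry.

```javascript
// Hypothetical in-memory de-duplication keyed on message id + event id.
// `event_id` is an assumed field name; check the actual payload schema.
class SeenEvents {
  constructor(ttlMs = 6 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;    // keep keys at least as long as the retry window
    this.seen = new Map(); // key -> expiry time in ms
  }

  // Returns true the first time a key is seen, false for replays.
  markIfNew(event, now = Date.now()) {
    const key = `${event.id}:${event.event_id}`;
    // Drop expired entries so the map does not grow without bound.
    for (const [k, expiresAt] of this.seen) {
      if (expiresAt <= now) this.seen.delete(k);
    }
    if (this.seen.has(key)) return false;
    this.seen.set(key, now + this.ttlMs);
    return true;
  }
}

const dedupe = new SeenEvents();
const sample = { id: 'msg_24b9e8', event_id: 'evt_1' };
console.log(dedupe.markIfNew(sample)); // first delivery
console.log(dedupe.markIfNew(sample)); // replay of the same event
```

In a webhook handler you would call `markIfNew` after verifying the signature and return a 2xx immediately for replays, so the sender stops retrying.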
REST polling API
Some teams prefer pull over push, particularly for isolated networks or staged rollouts. The REST API supports cursor-based pagination and selective fields to control payload size.
GET /v1/messages?cursor=eyJvZmZzZXQiOjE3MDQxMzEwMDB9&limit=50&include=metadata,addresses,attachments
Authorization: Bearer <token>

200 OK
{
  "data": [
    {
      "id": "msg_24b9e8",
      "received_at": "2026-04-22T10:22:51Z",
      "subject": "Quarterly Report",
      "from": [{"display_name": "Ops", "address": "ops@example.com"}],
      "to": [{"display_name": "", "address": "reports@yourapp.mail"}],
      "attachments": [
        {
          "filename": "Q1.pdf",
          "size": 284120,
          "content_type": "application/pdf",
          "sha256": "ab3f...f12c",
          "url": "https://api.service/messages/msg_24b9e8/att/1?sig=..."
        }
      ]
    }
  ],
  "cursor": "eyJvZmZzZXQiOjE3MDQxMzEwMDB9"
}
Attachment URLs are time limited and require your API token. For high-volume ingestion, request only the fields you need and fetch attachments on demand to keep memory footprint small.
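A cursor loop over this endpoint might look like the sketch below. It assumes the response shape shown above (`data` plus `cursor`) and, because the end-of-list signal is not documented here, it stops when a page is empty or the cursor stops advancing. The helper names `iterateMessages` and `makeFetchPage` are hypothetical, and the global `fetch` requires Node 18 or later.

```javascript
// Hypothetical cursor loop. `fetchPage(cursor)` must resolve to an object
// shaped like the response above: { data: [...], cursor: "..." }.
async function* iterateMessages(fetchPage, startCursor = null) {
  let cursor = startCursor;
  while (true) {
    const page = await fetchPage(cursor);
    for (const message of page.data) yield message;
    // Assumption: stop when the page is empty or the cursor does not advance.
    if (page.data.length === 0 || !page.cursor || page.cursor === cursor) return;
    cursor = page.cursor;
  }
}

// One possible fetchPage built on the global fetch available in Node 18+.
function makeFetchPage(baseUrl, token, limit = 50) {
  return async (cursor) => {
    const url = new URL('/v1/messages', baseUrl);
    url.searchParams.set('limit', String(limit));
    url.searchParams.set('include', 'metadata,addresses');
    if (cursor) url.searchParams.set('cursor', cursor);
    const resp = await fetch(url, { headers: { Authorization: `Bearer ${token}` } });
    if (!resp.ok) throw new Error(`list messages failed: ${resp.status}`);
    return resp.json();
  };
}
```

Consume it with `for await (const message of iterateMessages(makeFetchPage('https://api.service', process.env.API_TOKEN)))`. Separating the loop from the HTTP call keeps the pagination logic easy to test with a fake `fetchPage`.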
Security and idempotency
- HMAC signatures: each webhook includes X-Timestamp and X-Signature headers. Compute HMAC-SHA256 over timestamp + body using your shared secret, then constant-time compare.
- Replay protection: reject requests with a timestamp older than your threshold, for example 5 minutes.
- Retries: exponential backoff with jitter on non-2xx responses, capped with a multi-hour delivery window. All retries carry the same event id.
- Allowlist: optional source IP ranges for inbound hooks plus mandatory HTTPS.
This approach lets you verify authenticity and safely handle transient incidents without losing messages.
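The signature and replay checks can be combined in one function. The sketch below rests on assumptions the list above does not settle: that `X-Timestamp` is Unix seconds, that `X-Signature` is a hex digest, and that the signed string is the timestamp immediately followed by the raw body. The Node.js receiver later in this article joins the two with a `.`, so confirm the exact signing string before relying on either form. `verifyWebhook` is a hypothetical helper name.

```javascript
const crypto = require('crypto');

// Assumptions: X-Timestamp is Unix seconds, X-Signature is a hex digest,
// and the signed string is timestamp + body as described in the list above.
function verifyWebhook({ secret, timestamp, signature, rawBody, nowMs = Date.now(), toleranceSec = 300 }) {
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || typeof signature !== 'string') return false;

  // Replay protection: reject anything outside the tolerance window.
  if (Math.abs(nowMs / 1000 - ts) > toleranceSec) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(String(timestamp) + rawBody)
    .digest();
  const provided = Buffer.from(signature, 'hex');

  // timingSafeEqual throws on length mismatch, so check lengths first.
  return provided.length === expected.length && crypto.timingSafeEqual(provided, expected);
}
```

Pass the body exactly as received, before any JSON parsing, because re-serialized JSON rarely matches the original bytes.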
How SendGrid Inbound Parse handles the Email Parsing API
SendGrid Inbound Parse is a webhook that posts inbound email data to your HTTP endpoint. It is tied to the Twilio SendGrid ecosystem and uses multipart/form-data. This is a reliable choice for teams already standardized on SendGrid's infrastructure, and it is commonly deployed for ticketing and CRM ingestion.
Setup and routing
- You configure an inbound parse setting that maps a domain or subdomain to a destination URL.
- MX records for that domain must point to SendGrid. This step requires DNS control and may affect your routing strategy for other mailboxes on the same domain.
- Once DNS is validated and the parse route is active, SendGrid will receive mail for that domain and POST data to your endpoint.
Payload format
The webhook payload is multipart/form-data. Standard text fields include from, to, cc, subject, html, text, headers, charsets, envelope, spam_report, and spam_score. Attachments are posted as file parts with filenames and content types. If you enable the raw MIME option, the full message is provided in an email field or attachment.
Key practical implications for developers:
- You need a multipart parser in your stack. In Node.js that usually means busboy, multer, or a similar library. In Python, use Werkzeug or Starlette form parsing.
- Address values in fields like to often include display names and angle brackets. You must parse these if you need structured objects.
- Character encodings differ by message. The charsets map indicates encoding per field, which you may need to apply when reading the parts.
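To illustrate the address-parsing burden, here is a deliberately simplified sketch that turns a header-style string into objects with `display_name` and `address` keys. It is not a full RFC 5322 parser (no comments, groups, or encoded words), and the function names are hypothetical; a maintained address-parsing library is the safer choice in production.

```javascript
// Simplified parser for a single "Display Name <user@host>" value.
// Not a full RFC 5322 implementation: no comments, groups, or encoded words.
function parseAddress(value) {
  const input = String(value).trim();
  const match = input.match(/^(.*)<([^<>]+)>$/);
  if (!match) return { display_name: '', address: input };
  // Strip optional surrounding quotes from the display name.
  const display_name = match[1].trim().replace(/^"(.*)"$/, '$1');
  return { display_name, address: match[2].trim() };
}

// Split a list on commas that sit outside quotes and angle brackets.
function parseAddressList(value) {
  const parts = [];
  let current = '';
  let inQuotes = false;
  let inAngle = false;
  for (const ch of String(value)) {
    if (ch === '"') inQuotes = !inQuotes;
    else if (ch === '<' && !inQuotes) inAngle = true;
    else if (ch === '>' && !inQuotes) inAngle = false;
    if (ch === ',' && !inQuotes && !inAngle) {
      if (current.trim()) parts.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim()) parts.push(current);
  return parts.map(parseAddress);
}

console.log(parseAddressList('"Doe, Jane" <jane@example.com>, ops@example.com'));
```

The quoted comma in the first address is the kind of case that breaks a naive `split(',')`.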
Security considerations
- There is no HMAC signature for Inbound Parse at this time. Recommended practices include allowlisting Twilio IP ranges, enforcing HTTPS, and using Basic Auth or mutual TLS on your endpoint.
- Delivery expects a quick 2xx response. If your endpoint is slow or unreachable, delivery can fail with limited retry behavior.
- Because the webhook is multipart, be mindful of memory limits and streaming strategy for large attachments.
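As one way to apply the Basic Auth recommendation, the sketch below checks credentials from a standard `Authorization: Basic ...` header. It assumes the credentials reach your endpoint in that header, and the function names `safeEqual` and `checkBasicAuth` are hypothetical.

```javascript
const crypto = require('crypto');

// Compare two strings without leaking where they differ. Hashing first
// gives equal-length buffers, which timingSafeEqual requires.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(a).digest();
  const hb = crypto.createHash('sha256').update(b).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Returns true when the Authorization header carries the expected Basic credentials.
function checkBasicAuth(authorizationHeader, expectedUser, expectedPass) {
  if (typeof authorizationHeader !== 'string') return false;
  const [scheme, encoded] = authorizationHeader.split(' ');
  if (!scheme || scheme.toLowerCase() !== 'basic' || !encoded) return false;
  const decoded = Buffer.from(encoded, 'base64').toString('utf8');
  const sep = decoded.indexOf(':'); // the password may itself contain ':'
  if (sep === -1) return false;
  const userOk = safeEqual(decoded.slice(0, sep), expectedUser);
  const passOk = safeEqual(decoded.slice(sep + 1), expectedPass);
  return userOk && passOk;
}
```

Run the check before handing the request to the multipart parser, so unauthenticated uploads are rejected without reading attachment bytes.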
Side-by-side Email Parsing API comparison
| Feature | MailParse | SendGrid Inbound Parse |
|---|---|---|
| Delivery models | JSON webhook and REST polling | Webhook only |
| Payload format | UTF-8 JSON, stable schema | multipart/form-data, per-field charsets, optional raw MIME |
| Setup | Instant addresses, optional custom domain | Requires DNS MX delegation to SendGrid |
| Attachments | Structured metadata plus signed streaming URLs | File parts inside multipart POST |
| Security | HMAC-signed webhooks, replay protection | IP allowlist and TLS, no built-in webhook signature |
| Idempotency | Stable event id and message id, explicit de-dup guidance | Derive from headers like Message-ID, custom logic required |
| Raw MIME access | Optional via inline field or signed URL | Optional via email field or attachment |
| Parsing burden | Addresses and headers normalized into objects | Addresses provided as strings, you must parse |
| Retries | Exponential backoff with multi-hour window | Requires fast 2xx, limited retry behavior |
| Testing & sandbox | Instant test inboxes, REST replay tools | Requires DNS or localhost tunneling to receive posts |
Code examples
JSON webhook receiver with Node.js
This example shows how to receive a signed JSON webhook, verify the HMAC, and stream-archive attachments only when needed.
// npm i express node-fetch@2
const express = require('express');
const crypto = require('crypto');
const fetch = require('node-fetch'); // v2 supports require(); v3 is ESM-only

const app = express();
const SHARED_SECRET = process.env.WEBHOOK_SECRET;

// The signing string here is timestamp + '.' + body; confirm the exact format for your account.
function verifySignature(rawBody, timestamp, signature) {
  if (!timestamp || !signature) return false;
  const hmac = crypto.createHmac('sha256', SHARED_SECRET);
  hmac.update(timestamp + '.' + rawBody);
  const digest = Buffer.from(hmac.digest('hex'));
  const provided = Buffer.from(signature);
  // timingSafeEqual throws if the lengths differ, so compare lengths first
  return digest.length === provided.length && crypto.timingSafeEqual(digest, provided);
}

// Use express.raw on this route only. A global express.json() would consume the
// body first, and the signature must be computed over the exact bytes received.
app.post('/inbound', express.raw({ type: '*/*', limit: '10mb' }), async (req, res) => {
  const rawBody = req.body.toString('utf8');
  const ts = req.header('X-Timestamp');
  const sig = req.header('X-Signature');
  if (!verifySignature(rawBody, ts, sig)) {
    return res.status(401).send('invalid signature');
  }

  const event = JSON.parse(rawBody);
  const { id, subject, from, to, attachments = [] } = event;

  // Persist metadata
  console.log('msg', id, subject, from, to);

  // Lazily fetch attachments only if they are needed
  for (const att of attachments) {
    if (att.size > 5 * 1024 * 1024) {
      console.log('deferring large attachment', att.filename);
      continue;
    }
    const resp = await fetch(att.url, {
      headers: { Authorization: 'Bearer ' + process.env.API_TOKEN }
    });
    // Stream to storage
    // await streamToS3(resp.body, `inbound/${id}/${att.filename}`);
    console.log('fetched', att.filename, resp.status);
  }

  res.status(204).end();
});

app.listen(3000, () => console.log('listening on 3000'));
Polling via REST
# Pull newest 50 messages that include metadata and addresses
curl -sS -H "Authorization: Bearer $API_TOKEN" \
"https://api.service/v1/messages?limit=50&include=metadata,addresses" | jq '.data[].id'
SendGrid Inbound Parse with Node.js and busboy
Inbound Parse posts multipart/form-data. The example below shows how to extract fields and save attachments using Busboy.
// npm i express busboy
const express = require('express');
const Busboy = require('busboy');
const fs = require('fs');
const path = require('path');

const app = express();

app.post('/sendgrid-inbound', (req, res) => {
  const busboy = Busboy({ headers: req.headers });
  const fields = {};
  const attachments = [];

  // Text parts: from, to, subject, text, html, headers, charsets, ...
  busboy.on('field', (name, val) => {
    fields[name] = val;
  });

  // File parts: stream each attachment to disk instead of buffering it
  busboy.on('file', (name, file, info) => {
    const { filename, mimeType } = info;
    // basename() keeps a sender-controlled filename from escaping /tmp
    const dest = path.join('/tmp', path.basename(filename || name));
    const stream = fs.createWriteStream(dest);
    file.pipe(stream);
    attachments.push({ field: name, filename, mimeType, path: dest });
  });

  busboy.on('close', () => {
    // Addresses arrive as raw strings - parse if you need objects
    const subject = fields.subject || '';
    const from = fields.from || '';
    const to = fields.to || '';
    console.log('subject', subject);
    console.log('from', from);
    console.log('to', to);
    console.log('attachments', attachments);

    // If raw MIME was enabled, fields.email contains the full message
    if (fields.email) {
      console.log('raw MIME size', fields.email.length);
    }
    res.status(204).end();
  });

  req.pipe(busboy);
});

app.listen(3001, () => console.log('listening on 3001'));
Performance and reliability in real-world workloads
When evaluating an email parsing API, consider not just the happy path but also traffic spikes, malformed mail, and the long tail of encodings. These are the practical differences you are likely to observe in production:
- Attachment handling: JSON plus signed URLs lets your app fetch attachments on demand and stream to object storage. Multipart form-data uploads push the entire file to your endpoint immediately, which increases ingress bandwidth and memory pressure. If your service processes only metadata for many messages, on-demand downloading will reduce costs.
- Character sets and IDNs: A normalizing layer that produces UTF-8 with decoded headers removes guesswork and reduces edge case bugs. With multipart posts, you often rely on the charsets map and must manually decode fields.
- Idempotency: A stable event id enables safe retries. Without that, you will need to construct your own keys from Message-ID, Date, and envelope to avoid duplicates.
- Retries and backoff: A delivery pipeline that queues and retries for hours helps survive brief outages. If your provider requires very fast 2xx responses with minimal retries, you must scale horizontally or implement buffering in front of the webhook.
- Observability: Consistent JSON is easier to log, index, and search. If you ingest multipart payloads, plan for a decoding layer before metrics and tracing.
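When no stable event id is available, a de-duplication key can be derived from the message itself, as the idempotency point above suggests. The sketch below hashes Message-ID, Date, and envelope values; the choice and ordering of inputs are illustrative assumptions, and `idempotencyKey` is a hypothetical helper name.

```javascript
const crypto = require('crypto');

// Derive a de-duplication key from values carried by the message.
// The inputs and their ordering are illustrative choices, not a standard.
function idempotencyKey({ messageId = '', date = '', envelopeFrom = '', envelopeTo = [] }) {
  const parts = [
    messageId.trim(),
    date.trim(),
    envelopeFrom.trim().toLowerCase(),
    // Sort recipients so ordering differences do not change the key.
    [...envelopeTo].map((r) => r.trim().toLowerCase()).sort().join(','),
  ];
  // JSON.stringify keeps field boundaries unambiguous before hashing.
  return crypto.createHash('sha256').update(JSON.stringify(parts)).digest('hex');
}

console.log(idempotencyKey({
  messageId: '<abc123@example.com>',
  date: 'Wed, 22 Apr 2026 10:22:51 +0000',
  envelopeFrom: 'ops@example.com',
  envelopeTo: ['reports@yourapp.mail'],
}));
```

Store the key under a unique constraint so a redelivered message is rejected at write time. Messages that lack a Message-ID header would need an extra input, such as a hash of the body.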
If you are building a customer-facing email channel, pair your ingestion strategy with deliverability best practices to keep replies flowing. The Email Deliverability Checklist for SaaS Platforms outlines the DNS, authentication, and monitoring steps many teams overlook.
Verdict: Which is better for Email Parsing API?
If you already run Twilio SendGrid for outbound and want a single vendor, SendGrid Inbound Parse is a sensible default. It is mature, stable, and well documented. Expect to spend extra time on multipart parsing, character decoding, and address normalization in your application code. You will also need to complete DNS setup and design for quick 2xx responses under load.
For teams that want instant setup, normalized JSON, streaming attachments, HMAC-signed webhooks, and a REST polling option, MailParse is often the faster path to production. The developer experience centers on predictable schemas and operational safeguards that minimize the work between "email received" and "business logic executed".
FAQ
Does an email parsing API need both REST and webhook delivery?
Not always. Webhooks are ideal for low-latency ingestion and event-driven architectures. REST polling helps when your network cannot accept inbound connections, when you want strict rate control, or when you prefer batch processing. Many teams use webhooks for real-time paths and REST as a safety net or for reprocessing.
What is the best way to handle large attachments?
Stream, do not buffer. For multipart payloads, ensure your framework writes directly to disk or object storage. With signed attachment URLs, fetch on demand and stream to your destination. Avoid loading entire files into memory, enforce size caps, and keep a clear retry policy for storage failures.
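A size cap can be enforced while streaming, without ever holding the full file in memory. The sketch below uses only Node's built-in `stream` module (`stream/promises` needs Node 15 or later); `sizeLimiter` and `streamWithCap` are hypothetical helper names, and the destination can be any writable stream, such as a file or an object-storage upload.

```javascript
const { Transform } = require('stream');
const { pipeline } = require('stream/promises');

// Pass-through stream that fails once more than maxBytes have flowed through.
function sizeLimiter(maxBytes) {
  let seen = 0;
  return new Transform({
    transform(chunk, _encoding, callback) {
      seen += chunk.length;
      if (seen > maxBytes) {
        callback(new Error(`attachment exceeds ${maxBytes} bytes`));
      } else {
        callback(null, chunk);
      }
    },
  });
}

// Streams source -> limiter -> destination; rejects if the cap is exceeded.
async function streamWithCap(source, destination, maxBytes) {
  await pipeline(source, sizeLimiter(maxBytes), destination);
}
```

`pipeline` destroys every stream in the chain on failure, so a rejected upload does not leave an open file handle behind. You still need to delete any partial file yourself.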
How do I verify webhook authenticity without built-in signatures?
If the provider does not sign payloads, require HTTPS and Basic Auth, allowlist the provider's IP ranges, and validate that the domain in the headers aligns with expected routes. Store minimal request logs to support incident response and consider mutual TLS for higher assurance.
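The IP allowlist part of that advice can be sketched with Node's built-in `net.BlockList` (Node 15 or later), which works as a general-purpose IP set despite its name. The CIDR ranges below are documentation placeholders, not any provider's real addresses, and `buildAllowlist` and `isAllowed` are hypothetical helper names.

```javascript
const net = require('net');

// Build an allowlist from CIDR strings. The ranges used in the example
// call are documentation placeholders, not any provider's real addresses.
function buildAllowlist(cidrs) {
  const list = new net.BlockList(); // despite the name, usable as a generic IP set
  for (const cidr of cidrs) {
    const [address, prefix] = cidr.split('/');
    const family = net.isIPv6(address) ? 'ipv6' : 'ipv4';
    list.addSubnet(address, Number(prefix), family);
  }
  return list;
}

function isAllowed(list, remoteAddress) {
  // Node reports IPv4 peers on dual-stack sockets as ::ffff:a.b.c.d.
  const ip = remoteAddress.startsWith('::ffff:') ? remoteAddress.slice(7) : remoteAddress;
  if (net.isIP(ip) === 0) return false; // not a valid IP literal
  const family = net.isIPv6(ip) ? 'ipv6' : 'ipv4';
  return list.check(ip, family);
}

const allowlist = buildAllowlist(['192.0.2.0/24', '2001:db8::/32']);
console.log(isAllowed(allowlist, '::ffff:192.0.2.10'));
console.log(isAllowed(allowlist, '198.51.100.7'));
```

If your endpoint sits behind a load balancer or reverse proxy, the socket's remote address is the proxy's, so the check has to use whatever trusted client-IP value your proxy forwards.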
Should I parse addresses from strings or use structured fields?
Prefer structured fields when available. If you receive strings, rely on a robust address parser that supports comments, encoded display names, and Internationalized Domain Names. Cache results to avoid reparsing the same headers during retries.
How do I test inbound email integrations locally?
Use instant test inboxes when provided, or configure a subdomain and point MX to your provider in a staging environment. For local endpoints, run a tunneling tool like ngrok and post sample payloads or raw MIME messages. Build a fixture library of real-world messages to validate edge cases such as calendar invites, forwarded chains, and signed or encrypted mail.