Validating Email Addresses
Why It’s More Complicated Than It Seems
At first glance, validating an email address sounds easy. You might assume there’s a definitive way to determine whether an address is valid before sending email to it. In reality, email infrastructure was never designed to provide perfect, real-time validation.
There are several layers of “validation,” each with different levels of confidence:
- Syntax validation
- Domain validation
- SMTP-level recipient validation
- Actual deliverability validation
Each method helps, but none is completely reliable on its own.
1. Syntax Validation
The simplest form of validation is checking whether the email address is syntactically correct.
For example:
john@example.com
is obviously better formed than:
john@@example
A syntax validator can check for:
- Presence of @
- Valid domain format
- Invalid characters
- Excessively long local parts or domains
- Consecutive dots
- Missing top-level domains
However, syntax validation only tells you whether the address looks valid. It says nothing about whether the mailbox actually exists.
For example:
this-user-does-not-exist@gmail.com
is perfectly valid syntactically.
Be Careful About Being Too Strict
One common mistake is rejecting legitimate addresses because the validator assumes email addresses are simpler than they actually are.
Real email addresses may contain:
"john smith"@example.com
user+tag@example.com
customer/department@example.com
Although many systems never encounter these formats, they are legal according to email standards.
In practice, many applications intentionally use a simplified rule set because full RFC compliance is surprisingly complex.
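A simplified rule set of this kind might look like the following Python sketch. It is an illustration, not a full RFC parser: the function name is_plausible_email is hypothetical, and it deliberately rejects quoted local parts such as "john smith"@example.com even though they are legal. The length limits of 64 and 255 characters follow RFC 5321.

```python
import re

# Unquoted local-part characters ("atext" plus dots); quoted strings are NOT supported.
_LOCAL_RE = re.compile(r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~.-]+")
# One DNS label: letters, digits, hyphens; no leading or trailing hyphen; max 63 chars.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_plausible_email(address: str) -> bool:
    """Return True if the address passes a deliberately simplified syntax check."""
    if address.count("@") != 1:
        return False
    local, domain = address.split("@")
    # Length limits from RFC 5321: 64 for the local part, 255 for the domain.
    if not local or len(local) > 64 or len(domain) > 255:
        return False
    # Dots may not lead, trail, or repeat in an unquoted local part.
    if local.startswith(".") or local.endswith(".") or ".." in local:
        return False
    if not _LOCAL_RE.fullmatch(local):
        return False
    labels = domain.split(".")
    # Require at least "name.tld"; empty labels (consecutive dots) fail the regex.
    if len(labels) < 2:
        return False
    return all(_LABEL_RE.fullmatch(label) for label in labels)
```

With these rules, user+tag@example.com and customer/department@example.com are accepted, while john@@example is rejected. Whether to reject quoted local parts is a product decision, not a statement about what the standards allow.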
2. Domain Validation via DNS
The next step is checking whether the domain exists and is configured to receive email.
This usually involves a DNS lookup for:
- MX records
- A/AAAA records, used as a fallback when no MX record exists
For example, the domain:
gmail.com
has MX records that point to Google’s mail servers.
If a domain has no MX records and no usable fallback host, it is very unlikely to receive email successfully.
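The decision logic can be sketched in Python as follows. Python's standard library has no MX lookup, so lookup_mx is a hypothetical callable you would supply (for example, a thin wrapper around a DNS library); the fallback uses socket.getaddrinfo from the standard library. The names check_mail_domain and lookup_address are illustrative assumptions.

```python
import socket
from typing import Callable, List


def lookup_address(domain: str) -> List[str]:
    """Resolve A/AAAA records via the system resolver; empty list on failure."""
    try:
        infos = socket.getaddrinfo(domain, 25, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def check_mail_domain(
    domain: str,
    lookup_mx: Callable[[str], List[str]],
    lookup_host: Callable[[str], List[str]] = lookup_address,
) -> str:
    """Return "mx", "fallback", or "none" for a domain.

    lookup_mx is a caller-supplied function returning MX host names
    (an assumption of this sketch, since the stdlib cannot query MX records).
    """
    if lookup_mx(domain):
        return "mx"
    if lookup_host(domain):
        return "fallback"
    return "none"
```

A result of "none" suggests the domain is very unlikely to receive mail; "mx" or "fallback" only says that some host might accept a connection.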
What This Tells You
DNS validation confirms:
- The domain exists
- The domain is configured for email
It does not confirm:
- The mailbox exists
- The recipient is valid
- The mailbox is active
So:
nobody123456@gmail.com
still passes DNS validation.
3. SMTP Recipient Verification
A more advanced technique is connecting directly to the destination SMTP server and attempting recipient verification.
The general process is:
- Look up the MX record
- Connect to the SMTP server
- Start an SMTP session
- Issue commands such as HELO or EHLO, MAIL FROM, and RCPT TO
If the server rejects the RCPT TO command with a permanent (5xx) error, the address is very likely invalid.
Example:
550 No such user
This is often called SMTP probing or SMTP verification.
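A minimal sketch of such a probe using Python's standard smtplib module is shown below. The function names, the "accepted"/"temporary"/"rejected"/"unknown" labels, and the ten-second timeout are assumptions made for illustration; the MX host and sender address are placeholders you would supply. The probe stops after RCPT TO and never sends a message body.

```python
import smtplib


def classify_rcpt_reply(code: int) -> str:
    """Map an SMTP reply code to a coarse verdict."""
    if 200 <= code < 300:
        return "accepted"
    if 400 <= code < 500:
        return "temporary"  # e.g. 451: try again later
    if 500 <= code < 600:
        return "rejected"   # e.g. 550: no such user
    return "unknown"


def probe_recipient(mx_host: str, address: str, sender: str,
                    timeout: float = 10.0) -> str:
    """Open an SMTP session and stop after RCPT TO; no message is sent."""
    result = "unknown"
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
            smtp.ehlo_or_helo_if_needed()      # EHLO, falling back to HELO
            code, _ = smtp.mail(sender)        # MAIL FROM
            if 200 <= code < 300:
                code, _ = smtp.rcpt(address)   # RCPT TO
                result = classify_rcpt_reply(code)
    except (smtplib.SMTPException, OSError):
        # Connection refused, timeout, dropped session: no conclusion possible.
        pass
    return result
```

Treat the verdict as a hint only. An "accepted" reply does not prove the mailbox exists, and "temporary" or "unknown" should never be recorded as invalid.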
Why SMTP Verification Is Unreliable
Years ago, this approach worked fairly well. Today, many mail servers intentionally make validation difficult because of spam prevention and privacy concerns.
Common issues include:
Catch-All Domains
Some domains accept all recipient addresses:
anything@example.com
doesnotexist@example.com
random123@example.com
All may appear valid even if no actual mailbox exists.
The server may silently discard unknown mail later.
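One common heuristic for spotting a catch-all domain is to probe an address that almost certainly does not exist and see whether it is accepted. The sketch below assumes a hypothetical probe callable that takes an address and returns a string such as "accepted" or "rejected"; the function name looks_like_catch_all is also illustrative.

```python
import secrets
from typing import Callable


def looks_like_catch_all(domain: str, probe: Callable[[str], str]) -> bool:
    """Probe a random, almost certainly nonexistent mailbox at the domain.

    probe is a caller-supplied function (an assumption of this sketch) that
    returns "accepted" when the server accepts the recipient.
    """
    random_local = "probe-" + secrets.token_hex(12)
    return probe(f"{random_local}@{domain}") == "accepted"
```

If this returns True, an "accepted" result for any other address at that domain carries little information.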
Anti-Spam Protections
Many servers:
- Reject verification attempts
- Rate-limit connections
- Require TLS
- Require authentication
- Detect probing behavior
- Temporarily accept recipients and reject later
Some systems intentionally return ambiguous responses to prevent mailbox harvesting.
Greylisting and Temporary Failures
A server may temporarily reject requests with responses such as:
451 Try again later
This does not necessarily mean the address is invalid.
Different Behavior Than Actual Delivery
Some SMTP servers behave differently during verification than during real email delivery.
For example:
- Verification may succeed
- Actual delivery later fails
- Or the reverse
This makes SMTP validation probabilistic rather than definitive.
4. Bounce Processing: The Most Reliable Real-World Method
In practice, the most accurate way to maintain a clean email list is simply to send email and process delivery failures (bounces).
This approach works because:
- The receiving mail system performs the actual delivery attempt
- The remote server has complete knowledge of valid recipients
- Final delivery status is known
Typical workflow:
- Send email normally
- Use a dedicated bounce/return-path address
- Receive non-delivery reports (NDRs)
- Parse bounce messages
- Mark addresses as invalid or problematic
Why Bounce Handling Is Difficult
Unfortunately, bounce messages are not standardized as cleanly as many developers expect.
Different mail systems produce different formats:
- Exchange
- Postfix
- Gmail
- Yahoo
- Office 365
- Legacy SMTP servers
- Foreign-language systems
Bounce parsing often requires heuristics and pattern matching.
Some bounce messages are structured as:
multipart/report
Others are just free-form human-readable text.
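For the structured case, Python's standard email package can read the message/delivery-status part of a multipart/report bounce. The following is a minimal sketch that assumes the bounce follows the standard delivery status notification layout; the function name extract_dsn_failures is hypothetical, and free-form bounces will simply produce an empty list and need separate heuristics.

```python
from email import message_from_string
from typing import List, Tuple


def extract_dsn_failures(raw_message: str) -> List[Tuple[str, str, str]]:
    """Return (address, action, status) for each recipient block in a DSN."""
    msg = message_from_string(raw_message)
    results = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        if not part.is_multipart():
            continue  # malformed report: payload was not parsed into blocks
        # The parser exposes each header block as its own Message object.
        for block in part.get_payload():
            recipient = block.get("Final-Recipient")
            if recipient is None:
                continue  # per-message block (Reporting-MTA etc.)
            # The value looks like "rfc822; user@example.com".
            address = recipient.split(";", 1)[-1].strip()
            action = block.get("Action", "").strip().lower()
            status = block.get("Status", "").strip()
            results.append((address, action, status))
    return results
```

The Status field holds an enhanced status code such as 5.1.1, which is usually easier to act on than the human-readable text.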
Soft Bounces vs Hard Bounces
Not all failures mean the address is invalid.
Hard Bounce
Usually permanent:
- Mailbox does not exist
- Domain does not exist
- Recipient permanently rejected
Example:
550 User unknown
Soft Bounce
Usually temporary:
- Mailbox full
- Server unavailable
- Temporary DNS issue
- Greylisting
Example:
452 Mailbox full
A good email system distinguishes between temporary and permanent failures.
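One way to express that distinction is to classify by the first digit of the reply or status code and to suppress an address only after a hard bounce or several soft bounces in a row. In this Python sketch the function names and the limit of three consecutive soft bounces are arbitrary assumptions, not established thresholds.

```python
from typing import Sequence


def classify_bounce(status: str) -> str:
    """Classify by the leading digit of an SMTP reply or enhanced status code."""
    first = status.strip()[:1]
    if first == "5":
        return "hard"   # permanent failure, e.g. "550 User unknown" or "5.1.1"
    if first == "4":
        return "soft"   # temporary failure, e.g. "452 Mailbox full" or "4.2.2"
    return "unknown"


def should_suppress(history: Sequence[str], soft_limit: int = 3) -> bool:
    """Decide whether to stop mailing an address.

    history is a chronological list of outcomes such as "hard", "soft",
    or "delivered". soft_limit is an assumed policy value.
    """
    if "hard" in history:
        return True
    # Count soft bounces at the end of the history; a delivery resets the streak.
    trailing = 0
    for outcome in reversed(history):
        if outcome != "soft":
            break
        trailing += 1
    return trailing >= soft_limit
```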
Modern Reality: No Perfect Validation Exists
There is no universally reliable way to determine in advance whether an email address is truly deliverable.
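Every method has weaknesses:
- Syntax validation says nothing about whether the mailbox exists
- DNS validation confirms the domain, not the recipient
- SMTP verification is undermined by catch-all domains, greylisting, and anti-spam protections
- Bounce processing only reports problems after a message has been sent, and bounce formats are inconsistent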
Best Practice Strategy
A practical email validation system usually combines multiple techniques:
- Perform syntax validation
- Verify domain MX records
- Optionally perform cautious SMTP checks
- Send confirmation/verification emails
- Process bounces automatically
- Track engagement and delivery history
For user registrations, the most reliable approach is often simply:
Send a confirmation email containing a verification link.
If the user receives the email and clicks the link, the address is valid enough for practical purposes.
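A confirmation link needs a token that cannot be guessed or forged. One possible approach, sketched here in Python, signs the address and issue time with HMAC so that no token has to be stored. The secret key, the base URL https://example.com/verify, the 24-hour lifetime, and all function names are placeholder assumptions.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder; load from configuration


def _sign(email: str, issued_at: int, key: bytes) -> str:
    payload = f"{email}|{issued_at}".encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def make_token(email: str, issued_at: int, key: bytes = SECRET_KEY) -> str:
    """Return "<issued_at>.<signature>" bound to this email address."""
    return f"{issued_at}.{_sign(email, issued_at, key)}"


def verify_token(email: str, token: str, now: int,
                 max_age_seconds: int = 86400, key: bytes = SECRET_KEY) -> bool:
    """Check the signature and that the token has not expired."""
    try:
        issued_str, signature = token.split(".", 1)
        issued_at = int(issued_str)
    except ValueError:
        return False  # malformed token
    expected = _sign(email, issued_at, key)
    # Constant-time comparison on bytes avoids timing leaks.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return False
    return 0 <= now - issued_at <= max_age_seconds


def build_link(email: str, token: str,
               base_url: str = "https://example.com/verify") -> str:
    """Build the URL to place in the confirmation email (base_url is hypothetical)."""
    return f"{base_url}?{urlencode({'email': email, 'token': token})}"
```

In use, the application would call make_token with the current Unix time when sending the email and verify_token when the link is clicked, marking the address as confirmed only on success.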
Final Thoughts
Email was designed decades ago in a much more trusting Internet environment. Modern anti-spam systems, privacy protections, and large-scale abuse prevention have made deterministic validation increasingly difficult.
As a result, email validation is less about proving an address is valid and more about reducing bad addresses while maintaining good deliverability practices.