Defeating Disposable Email Addresses

By rocketlaunchr.cloud

There are times when you want to prevent disposable email addresses from being used to create accounts. It may be because you want to make life harder for someone intending to exploit your product’s trial period. Alternatively, it could be because you want to reject accounts created by spam-robots so you can focus on your real customers. There are myriad reasons. In this article, I will highlight some of Gmail’s lesser-known features that you have to account for if you want to do it right.

Just bear in mind that these best-practices are not foolproof and a determined spammer will eventually succeed.

An email address is made up of 3 parts: the local-part, the “@” sign in the middle, and the domain after it. The most important fact you need to remember is that the domain is case-insensitive. This means that domain.com, DOMAIN.COM, and Domain.Com are all equivalent. The local-part is technically case-sensitive, but most (if not all) reputable email providers treat it as case-insensitive. Treating it as case-sensitive or case-insensitive is a business decision and I won’t recommend either way.
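As a rough illustration of the paragraph above, here is a minimal Go sketch that splits an address and lower-cases only the domain. The function name splitEmail is my own and purely illustrative. It splits on the last “@” and does not attempt full address validation, so treat it as a starting point rather than a complete parser.

```go
package main

import (
	"fmt"
	"strings"
)

// splitEmail (hypothetical helper) splits an address at the last "@" and
// lower-cases the domain, which is case-insensitive. The local-part is
// returned untouched because its case handling is a business decision.
// ok is false when either part would be empty.
func splitEmail(addr string) (local, domain string, ok bool) {
	at := strings.LastIndex(addr, "@")
	if at <= 0 || at == len(addr)-1 {
		return "", "", false
	}
	return addr[:at], strings.ToLower(addr[at+1:]), true
}

func main() {
	local, domain, ok := splitEmail("John.Smith@Domain.Com")
	fmt.Println(local, domain, ok) // prints: John.Smith domain.com true
}
```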

There are many disposable email address providers. There are legitimate use-cases for their services, but they are also easily exploitable by spammers.

These providers hand out temporary email addresses which spammers can use to create an account. Each address receives mail (allowing the spammer to “verify the email”) and then expires after a short period of time.

In order to reject disposable email addresses, you need to have a continually updated list of bad domains. My favorite source is here: https://github.com/martenson/disposable-email-domains

This list is regularly updated and easy to parse in any programming language. If you are using a language like Go, I recommend loading it into memory once and storing it in a global map[string]struct{}. With the domain as the map’s key, checking whether a domain is on the list is a fast lookup.

If you are using PHP (where each request is considered independent), my code snippet can be found in the repository’s readme.

It is a little-known fact that Gmail treats the local-part very liberally. Not only do they treat it as case-insensitive, but they also ignore all periods! This means that john.smith@gmail.com, johnsmith@gmail.com, and JohnSmith@gmail.com are all equivalent.

Many other popular email providers have similar policies so you must check their documentation.

If you are considering all users as unique based on their email address, then special care must be taken. It is recommended practice that you store all your users’ email addresses in a form that complies with their provider’s policies.

In the case of Gmail, you must remove all periods and then lower-case the text (preferably using a function that understands Unicode). When you are comparing the user’s typed-in email address during login, obviously you must also remove all periods and lower-case the text.
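Here is a minimal Go sketch of that normalization, assuming the address has already been split into local-part and domain. The function name normalizeForStorage is illustrative, and only gmail.com is handled; other providers would need their own rules. Go’s strings.ToLower is Unicode-aware.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeForStorage (hypothetical helper) lower-cases the domain and,
// for gmail.com only, removes periods from the local-part and lower-cases
// it. Local-parts for other domains are left exactly as typed.
func normalizeForStorage(local, domain string) (string, string) {
	domain = strings.ToLower(domain)
	if domain == "gmail.com" {
		local = strings.ToLower(strings.ReplaceAll(local, ".", ""))
	}
	return local, domain
}

func main() {
	fmt.Println(normalizeForStorage("John.Smith", "Gmail.com"))   // prints: johnsmith gmail.com
	fmt.Println(normalizeForStorage("SallyWatson", "noidea.com")) // prints: SallyWatson noidea.com
}
```

The same function should be applied to the address typed in at login, so both sides of the comparison are in the same form.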

+----+-------------+------------+-------------+
| id | email       | domain     | preferred   |
+----+-------------+------------+-------------+
| 1  | johnsmith   | gmail.com  | john.smith  |
+----+-------------+------------+-------------+
| 2  | SallyWatson | noidea.com | SallyWatson |
+----+-------------+------------+-------------+
Note: Add a unique composite index for (email, domain).
The preferred column is optional.

Gmail has another feature that is not well known. If you suffix the local-part with a + and any combination of words and numbers such as john.smith+junk@gmail.com, then Gmail will automatically ignore the suffix. This is used so you can set up filters which will automatically categorize your emails.

Unfortunately, this also makes it a very convenient system for generating disposable email addresses. If you choose to block Gmail suffixes for the purpose of email uniqueness, then remove the suffix before storing the address in your emails table above.
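Removing the suffix can be sketched in Go as below. The function name stripPlusSuffix is illustrative, and applying it only to gmail.com is my assumption based on the behavior described above; whether other providers treat “+” the same way is something to check in their documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPlusSuffix (hypothetical helper) drops everything from the first
// "+" onward. A local-part that starts with "+" is returned unchanged so
// the result is never empty.
func stripPlusSuffix(local string) string {
	if i := strings.Index(local, "+"); i > 0 {
		return local[:i]
	}
	return local
}

func main() {
	local, domain := "john.smith+junk", "gmail.com"
	if domain == "gmail.com" {
		local = stripPlusSuffix(local)
	}
	fmt.Println(local + "@" + domain) // prints: john.smith@gmail.com
}
```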