
How to Protect Your Email Address From Spam Crawlers

Email addresses are not always stolen from the footprints you leave on other sites. In most cases, they are stolen directly from your own website.

What are spam crawlers?

Spam crawlers are automated bots that collect your email address from the web by various mechanisms and use it for bulk email or other purposes usually referred to as spam. This process is also known as email address harvesting.


Contact details, terms and policies, about pages, and pretty much anywhere else on your site where you publicly expose your email address are great resources for crawlers.

The crawler, or so-called harvester, automatically searches the HTML content of your website, usually scanning link tags

<a></a>

and looking for the href="mailto:" attribute inside.

So what’s the idea?

There are a few workarounds for protecting email addresses, and you’ve probably seen solutions such as displaying the email in the format info (at) invoicebus (dot) com, embedding the email in an image, or implementing a contact form. However, none of these gives you a clickable email link, so here we’re going to use a little hack.

Besides the href="mailto:" attribute, a spam crawler scans any HTML that contains the “@” sign, so it can easily find and extract the actual email address. Our objective is to make the crawler’s task as difficult as possible.
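To see what such a scan might pick up on your own pages, here is a minimal sketch of a self-check. The function name findExposedEmails and both regular expressions are assumptions made for illustration; real harvesters vary and may be considerably smarter.

```javascript
// Hypothetical self-check (not from the original post): lists the addresses
// a naive harvester could read straight from raw HTML source.
function findExposedEmails(html) {
  const found = new Set();
  // Pattern 1: addresses inside href="mailto:..." attributes.
  const mailtoPattern = /href\s*=\s*["']mailto:([^"'?]+)/gi;
  // Pattern 2: anything in the source that looks like name@domain.tld.
  const plainPattern = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/gi;
  for (const match of html.matchAll(mailtoPattern)) {
    found.add(match[1].toLowerCase());
  }
  for (const match of html.matchAll(plainPattern)) {
    found.add(match[0].toLowerCase());
  }
  return [...found];
}

const clearLink = '<a href="mailto:yourname@domain.com">yourname@domain.com</a>';
const spelledOut = 'info (at) invoicebus (dot) com';

console.log(findExposedEmails(clearLink));  // the plain link is picked up
console.log(findExposedEmails(spelledOut)); // nothing matches these two patterns
```

Running it against the HTML your server actually sends gives a rough idea of what a simple text scan can see.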

To do that, we’ll hide the email address behind a variable and print it dynamically at run time, when the page is rendered by the browser. A crawler that only reads the raw HTML won’t find the complete address anywhere in the source.

How do we assign a value to the variable so that the crawler can’t see it?

If we divide the email address into three parts (username, at sign, and domain name), we can assign the value in two steps by concatenating strings.
Here’s the JavaScript snippet for it:

<script type="text/javascript">

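     // Start with the domain only, so the full address never appears in the source
     var emailE = 'invoicebus.com';
     // Join username, at sign, and domain at run time
     emailE = ('support' + '@' + emailE);
     // Write the clickable mailto link into the page
     document.write('<a href="mailto:' + emailE + '">' + emailE + '</a>');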

</script>

The code will display the following link in the browser:

support@invoicebus.com

You can even write your own JS function that transforms the letters using a custom pattern, but for now I’ll stick to the basics.
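As a rough idea of what such a custom pattern could look like, here is a minimal sketch that shifts every character code by one, so that even the “@” sign is absent from the stored string. The helper names shiftChars and buildMailtoLink and the choice of a one-step shift are assumptions for illustration, not part of the method described above.

```javascript
// Hypothetical helper (not from the original post): shifts every character
// code by `offset`. Assumes plain ASCII input such as an email address.
function shiftChars(text, offset) {
  return Array.from(text, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) + offset)
  ).join('');
}

// Done once, offline: produce the scrambled string to paste into the page.
var scrambled = shiftChars('support@invoicebus.com', 1);

// Done in the page: restore the address and build the link markup.
function buildMailtoLink(scrambledAddress) {
  var address = shiftChars(scrambledAddress, -1);
  return '<a href="mailto:' + address + '">' + address + '</a>';
}

console.log(scrambled.indexOf('@') === -1); // checks that no "@" is stored
console.log(buildMailtoLink(scrambled));
```

In a page, the string returned by buildMailtoLink could be passed to document.write in the same way as in the snippet above. Depending on the address, a shifted string may contain characters that need escaping inside a JavaScript string literal, so check the scrambled value before pasting it.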

Note: If a visitor has disabled JavaScript in their browser, the email address won’t be shown.

Of course, you can also skip JavaScript and directly encode the value of the href attribute with HTML character codes (ASCII encoding) or URL encoding.

Clear HTML:

<a href="mailto:yourname@domain.com">yourname@domain.com</a>



ASCII encoded HTML:

<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;&#121;&#111;&#117;&#114;&#110;&#97;&#109;&#101;&#64;&#100;&#111;&#109;&#97;&#105;&#110;&#46;&#99;&#111;&#109;">&#121;&#111;&#117;&#114;&#110;&#97;&#109;&#101;&#64;&#100;&#111;&#109;&#97;&#105;&#110;&#46;&#99;&#111;&#109;</a>

In both cases, the browser will display the following:

yourname@domain.com


Note: Some spam crawlers can extract the email address even when it is HTML encoded.
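To illustrate why the encoding offers only limited protection, here is a minimal sketch of how little work it takes to reverse it. The function name decodeNumericEntities is an assumption for illustration, and it handles only decimal codes of the form shown above.

```javascript
// Hypothetical decoder (not from the original post): turns decimal numeric
// character references such as &#64; back into the characters they stand for.
function decodeNumericEntities(html) {
  return html.replace(/&#(\d+);/g, function (wholeMatch, code) {
    return String.fromCodePoint(Number(code));
  });
}

var encoded = '&#121;&#111;&#117;&#114;&#110;&#97;&#109;&#101;&#64;' +
  '&#100;&#111;&#109;&#97;&#105;&#110;&#46;&#99;&#111;&#109;';

console.log(decodeNumericEntities(encoded));
```

A crawler that runs a step like this before searching for the “@” sign sees the same address the browser displays.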

I quickly wrote a simple HTML ASCII encoder that can be used to encode your email addresses or any text you want.
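If you would rather do the encoding yourself, a minimal sketch of such an encoder could look like the following. The function name encodeAsNumericEntities is an assumption for illustration and is not the tool mentioned above.

```javascript
// Hypothetical encoder (not the author's tool): replaces every character
// with its decimal numeric character reference, for example "@" with "&#64;".
function encodeAsNumericEntities(text) {
  return Array.from(text, function (ch) {
    return '&#' + ch.codePointAt(0) + ';';
  }).join('');
}

var address = 'yourname@domain.com';
var encodedHref = encodeAsNumericEntities('mailto:' + address);
var encodedText = encodeAsNumericEntities(address);

// Assemble the same kind of link as in the ASCII encoded HTML example.
console.log('<a href="' + encodedHref + '">' + encodedText + '</a>');
```

The output can be pasted directly into the HTML of the page, since the browser decodes the codes when it renders the link.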


There are a couple of other methods available, but I believe these two are the most effective ones so far. It’s up to you which one you choose. For us, the JS method works pretty well.

A little effort and a few lines of code on your site will save you from tons of unsolicited and junk email later.


Now let me hear your thoughts on it.
Do you have any suggestions on how to improve these methods, or know of others we haven't heard about?

Stefan Chachovski
Co-founder of Invoicebus. Huge lover of nature, science, and chocolate cherry cordials. He occasionally writes on this blog about Invoicebus' stuff. Hello him on Twitter or subscribe to his updates on Facebook.