Blocking Email Spam That Comes As Image Attachments, PDF or Excel

To block spam, companies often relied on keyword detection: they drew up a list of keywords that commonly appeared in spam email, such as ‘viagra’ or ‘bank’. However, this method often blocked genuine email, and adding more keywords simply produced more false positives. Spammers became smarter too, and they sidestepped keyword blocking by altering the keywords, for example replacing ‘viagra’ with ‘v1agra’.
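The weaknesses of keyword detection are easy to reproduce. The following is a minimal sketch, not any vendor's actual filter: the keyword list, the character substitution table and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical block list, using the two example keywords from the text.
BLOCKED_KEYWORDS = {"viagra", "bank"}

# Hypothetical table of common character swaps used to disguise keywords.
LOOKALIKES = str.maketrans({"1": "i", "0": "o", "3": "e", "4": "a",
                            "5": "s", "@": "a", "$": "s"})


def words_in(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_spam_by_keyword(text: str) -> bool:
    """Plain keyword detection: flag the message if any blocked word appears."""
    return any(word in BLOCKED_KEYWORDS for word in words_in(text))


def is_spam_normalized(text: str) -> bool:
    """Same check, but undo look-alike character swaps first."""
    return is_spam_by_keyword(text.lower().translate(LOOKALIKES))


if __name__ == "__main__":
    for message in ("Buy viagra now",
                    "Buy v1agra now",
                    "Your bank statement is attached"):
        print(message, "|", is_spam_by_keyword(message),
              "|", is_spam_normalized(message))
```

In this sketch the plain check misses ‘v1agra’ and flags a legitimate message that mentions a bank statement. Normalizing the text catches the disguised keyword but does nothing about the false positive, which is why keyword lists alone are not enough.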


Spammers then began making use of images to bypass text-based content filtering, simply by no longer using any text content.

In the space of two months, spammers have switched from image spam to using PDF, Excel and ZIP file attachments. By using these attachments to send images instead of embedding them in the body of the email message, spammers have taken the cat-and-mouse game with anti-spam software developers to a new level.

Instead of embedding the image within the email itself, they ‘repackaged’ it within a PDF attachment. This move is clever for a number of reasons:

1. Email users ‘expect’ spam to be an image or text within the body of the email and not an attachment.

2. Since most businesses today transfer documents using the PDF format, email users have to check each PDF document; otherwise they risk losing important documentation.

The use of PDF spam was short-lived, as anti-spam software vendors quickly came out with updates and filters that analyzed the body of every PDF file. Not to be defeated, spammers took less than a month to come out with a new option: Microsoft Excel files for pump-and-dump scams. This move was clever for reasons similar to those above for PDFs:

1. Email users ‘expect’ spam to be an image or text within the body of the email and not an attachment.

2. Excel is another extremely common file-type in use and users are very familiar with this format.

3. Since many businesses use Microsoft Excel for spreadsheets, databases and so on, email users have to check each document; otherwise they risk losing important documentation.

Solution - Using keyword detection methods alone will not solve the problem because new spamming techniques have overcome that hurdle. The solution lies in a product that deploys as many anti-spam techniques as possible, including Bayesian filtering and filtering for images and text embedded in different types of file attachment, while keeping false positives to a minimum. The full paper is available from GFI.
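To show what Bayesian filtering means in practice, here is a minimal naive Bayes sketch with add-one smoothing. It illustrates the general technique only and is not the method used by any particular product. The class name, the tiny training set and the 0.5 threshold are assumptions, and a real filter would be trained on thousands of messages.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy naive Bayes classifier; needs at least one spam and one ham example."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the prior: how common this class was in training.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                # Add-one smoothing so unseen words never give probability 0.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Turn the two log scores into a probability without overflow.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


def build_demo_filter() -> NaiveBayesSpamFilter:
    """Train on a tiny, made-up set of messages."""
    spam_filter = NaiveBayesSpamFilter()
    spam_filter.train("cheap pills buy now", "spam")
    spam_filter.train("hot stock tip buy now", "spam")
    spam_filter.train("meeting agenda attached", "ham")
    spam_filter.train("quarterly report spreadsheet attached", "ham")
    return spam_filter


if __name__ == "__main__":
    demo = build_demo_filter()
    for message in ("buy cheap stock now", "report attached for the meeting"):
        print(message, "| spam" if demo.spam_probability(message) > 0.5 else "| ham")
```

Unlike a keyword list, this approach weighs every word as evidence for or against spam, so a single word such as ‘bank’ does not condemn a message on its own. To cover attachment-based spam, the same scoring would be applied to text extracted from the attachment.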

Amit Agarwal


