Spam through Sourceforge.net (Update)

Today something happened that I thought was impossible: I received spam at the alias email address of my username registered at sourceforge.net.

Fig. 1: A simple but effective spam

Sourceforge.net is the world’s largest open source software development web site. I have had an account there since I was a student, when I started working as a volunteer for an open source project. I still do, although not with the same intensity as before. Sourceforge is known for its very aggressive anti-spam measures. The SpamAssassin software at sourceforge.net correctly detected the email as spam, so why didn’t it stop the message from being delivered?

Fig. 2: The mail is correctly flagged as spam.

The spam mail I saw this morning consists of just one line of text. The only things that allowed an anti-spam filter to detect the message as spam were that the link inside was blacklisted for hosting a spam website and that the IP address from the Received headers was already blacklisted.

So everything is OK, but why did I receive the email even though it was flagged correctly? The website does say that email aliases simply forward whatever arrives: “Any email sent to a user’s mail alias is automatically passed to the email address that is on file for a that account.”

Well, this is very nice, but very wrong. To test it, I sent an email from my email account at work (the domain is avira.com), but it was immediately whitelisted because of the many security features that our admins support (DKIM, signatures, reverse DNS, etc.). So it went through the filter.

I sent another email from a different address with the well-known GTUBE test file attached. Now everything was different: the email was blocked and I immediately received a nice email making fun of me:

Fig. 3: Spam that is not automatically forwarded.

So why did all this happen if Sourceforge doesn’t automatically forward every email sent to the users’ aliases? I don’t know, but I will surely ask Sourceforge, and I will blog again if I receive an answer from them. Oh, by the way, Avira Premium Security Suite also correctly marks this kind of email as spam.

Update:

After I wrote an email to Sourceforge Support, I received the answer below in less than an hour. I must say that I was pleasantly surprised by such a fast response, considering that Sourceforge provides all these services to programmers for free.

“At SourceForge, we do our best to prevent spam from reaching our users. However, it isn’t possible to prevent all spam from getting through, and you will occasionally see examples like the one you’ve provided. We are constantly updating our filters and anti-spam techniques, though, so you should see this problem resolve itself in the next day or so. If it persists, please let us know.

An additional step you can take is to filter based on the “X-VA-Spam-Flag: YES” header, which we apply to email we suspected of being spam. Finally, we recently added the ability to control what sorts of email you receive through your email alias; you can find this feature on your Account Options page.”
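As a rough illustration of the header-based filtering suggested in the reply, here is a minimal Python sketch. It assumes the raw message is available as text; the function name `is_flagged_as_spam` and the sample message are made up for this example, and only the header name and value come from the quoted answer.

```python
from email import message_from_string


def is_flagged_as_spam(raw_message: str) -> bool:
    """Return True if the message carries the header X-VA-Spam-Flag: YES."""
    msg = message_from_string(raw_message)
    # Header lookup in the email package is case-insensitive.
    flag = msg.get("X-VA-Spam-Flag", "")
    return flag.strip().upper() == "YES"


# Hypothetical sample message, for illustration only.
sample = (
    "From: someone@example.com\n"
    "To: user@example.org\n"
    "Subject: hello\n"
    "X-VA-Spam-Flag: YES\n"
    "\n"
    "One line of text with a link.\n"
)

if is_flagged_as_spam(sample):
    print("move to spam folder")
```

In practice the same rule is usually configured directly in the mail client or in a server-side filter rather than in a script.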

Sorin Mustaca
Manager International Software Development

Providing protection against malware and phishing URLs

Phishing, spam and malware have a couple of things in common: they have become a major problem for users, banks and online businesses, and they are delivered either as attachments or via URLs contained in emails. The AV industry is trying to protect its customers as well as it can by gathering and analysing the emails with dangerous attachments and by blocking the URLs of phishing and malware websites.

Because the emails are so well crafted, it is sometimes not possible to mark them as spam, so they reach users’ inboxes. Some of these spam emails spread malware. Malware is not the only threat to users nowadays: phishing emails and websites that sell fake products (pharmaceuticals, for example) can be dangerous as well.

The only solution to block access to the malware is to block the target URL in a generic way, without knowing for sure from the beginning why it is blocked. Such a powerful and dynamic system needs a very good control and monitoring center in order to be maintainable.

Avira developed a system to manage from a single point the malware and phishing URLs gathered from multiple sources, track the URLs to see whether they have been taken down, generate statistics for detecting outbreaks, and generate information to warn companies when they are targeted by phishing attacks.

Fig. 1: Architecture

The system was designed so that a new source of URLs can be added at any time (represented by the gray source with a „?“).
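One way to picture a design where a new source can be plugged in at any time is a small registry of feeds that all return the same kind of record. The sketch below is only an illustration of that idea, not a description of Avira's actual system: `UrlRecord`, `register_source`, `collect_all` and the two demo feeds are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class UrlRecord:
    url: str        # the URL to block
    source: str     # name of the feed that reported it
    category: str   # "malware" or "phishing"


# Registry of feeds: source name -> function returning that feed's records.
SOURCES: Dict[str, Callable[[], Iterable[UrlRecord]]] = {}


def register_source(name: str, fetch: Callable[[], Iterable[UrlRecord]]) -> None:
    """Add a new feed without touching the rest of the pipeline."""
    SOURCES[name] = fetch


def collect_all() -> List[UrlRecord]:
    """Pull the records from every registered feed into one list."""
    records: List[UrlRecord] = []
    for fetch in SOURCES.values():
        records.extend(fetch())
    return records


# Two made-up feeds standing in for real ones.
register_source("feed_a", lambda: [UrlRecord("http://bad.example/x.exe", "feed_a", "malware")])
register_source("feed_b", lambda: [UrlRecord("http://phish.example/login", "feed_b", "phishing")])

for record in collect_all():
    print(record.source, record.category, record.url)
```

Adding another source then means writing one more fetch function and registering it; the code that consumes the records stays the same.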

Fig. 2: Categories of URLs

As we can see, most of the URLs we block point to malware and only about a quarter point to phishing websites. These URLs are used to create updates for several of Avira’s web filtering products, such as Webguard, a module of the „Avira Premium Security Suite“ product.

Features

One of the most important features of the system is the ability to find the registrar hosting the phishing or malware page. Once we find the registrar, we can find its location and create a world map of the sites that host malware and phishing.

Fig. 3: World distribution of malware and phishing

As we can see in the figure, most of the threats are hosted in the U.S.A., followed by Europe. Other interesting statistics generated by the system are the ranking of the most attacked brands and the ranking of the providers that host the most files.

Fig. 4: Attacked brands (from September 2008)

In first place among the most attacked brands is eBay, with 3277 unique phishing websites. In second place is PayPal with 2606 websites, and in third place, very close behind, is American Express with 2464 websites.

Fig. 5: Number of threats

Challenges

Since the end of September 2008, when the system went live, we have encountered many challenges. They were caused by the differences between the sources we use: the URLs detected by our own Antiphishing product, Phishtank, LCheck (an internal system dealing only with malware URLs) and Clean-MX (a system that deals with both phishing and malware URLs). The only thing these sources have in common is that each provides a URL which should be blocked. Other challenges we faced are the errors and special situations these services produce: invalid data, lack of availability and false positives.

In the beginning the system recorded about 100 new URLs, which was not a great challenge for our hardware. The situation changed completely when we had to deal with almost 1000 unique URLs per day. These unique URLs are extracted from more than 20000 URLs which have to be verified and sorted. The server has to deal with these special situations and must also check the validity of the URLs by downloading each file in order to analyse and scan it.
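Reducing a large batch of reported URLs to the unique ones can be sketched as a normalization step followed by deduplication. The example below is an assumption about how such a step might look, not the system's real code; the normalization rules (lowercase scheme and host, drop default port, fragment and user info) and the names `normalize` and `unique_urls` are chosen only for illustration.

```python
from typing import Iterable, List, Optional
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(url: str) -> Optional[str]:
    """Return a canonical form of the URL, or None if it is not usable."""
    try:
        parts = urlsplit(url.strip())
        scheme = parts.scheme.lower()
        host = (parts.hostname or "").lower()   # hostname excludes user info
        port = parts.port                       # may raise ValueError
    except ValueError:
        return None                             # invalid data from a feed
    if not scheme or not host:
        return None
    # Keep the port only when it is not the default one for the scheme.
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def unique_urls(urls: Iterable[str]) -> List[str]:
    """Normalize every URL and keep the first occurrence of each one."""
    seen = set()
    result = []
    for url in urls:
        key = normalize(url)
        if key is not None and key not in seen:
            seen.add(key)
            result.append(key)
    return result


batch = ["HTTP://Example.com:80/a#x", "http://example.com/a", "http://example.com/b", "not a url"]
print(unique_urls(batch))
```

Entries that cannot be parsed are simply skipped here; a real pipeline would probably log them per source to track the invalid data mentioned above.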

A real challenge was removing irrelevant URLs, such as those pointing to websites and malware files that no longer exist. Usually, when a web resource is no longer available, the web server returns a special error (404). To be more user friendly, many websites no longer return this error but redirect to a special webpage informing the visitor that the requested resource is no longer there. Since the websites are very often hosted in non-English-speaking countries, parsing the webpage and looking for known content is not really a solution.

Fig. 6: Answers provided by various websites

Fortunately, by analysing some of these websites, we figured out that they use some common “keywords” and “key sentences” to explain what is happening. Many of these are international words. We filter about 60% of the pages with this empirical technique.
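The keyword approach can be sketched as a check that first trusts the HTTP status code and then searches the beginning of the page for typical "not found" phrases. The pattern list below is hypothetical, since the real keywords are not published here, and `looks_dead` is a made-up helper that takes an already downloaded status code and body.

```python
import re

# Hypothetical keyword list, for illustration only.
NOT_FOUND_PATTERNS = [
    r"\b404\b",
    r"not\s+found",
    r"no\s+longer\s+(available|exists?)",
    r"page\s+(was\s+)?removed",
]
_NOT_FOUND_RE = re.compile("|".join(NOT_FOUND_PATTERNS), re.IGNORECASE)


def looks_dead(status_code: int, body: str) -> bool:
    """Guess whether a fetched URL points to a resource that is gone."""
    # A proper error status is the easy case.
    if status_code in (404, 410):
        return True
    # "Friendly" error pages answer 200, so inspect the start of the page.
    return bool(_NOT_FOUND_RE.search(body[:2000]))


print(looks_dead(200, "<title>Page Not Found</title>"))
print(looks_dead(200, "<h1>Welcome to our shop</h1>"))
```

A heuristic like this will miss pages and occasionally misjudge live ones, which fits the observation that only part of the pages can be filtered this way.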

More details about various techniques for reaching the real content of a page are given in the article „Delivering reliable phishing protection“, published in Virus Bulletin Magazine, May 2008.

Sorin Mustaca
Manager International Software Development