Steganalysis in modern-day anti-malware systems

Steganalysis and steganography are exact opposites. Steganography, in a nutshell, is a way of hiding data so that no external observer can tell it’s there. Steganalysis is the reverse… trying to detect the presence of this hidden data.

It’s easier to understand steganography with an example. Consider the following title of a (real) database:

Database for Annotation, Visualization and Integrated Discovery

Outwardly it doesn’t show anything but the name of some complicated database. But take the first capitalised letter in each non-trivial word and you end up with my name: DAVID.
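The acrostic trick above is simple enough to sketch in a few lines of Python. This is only an illustration: the `acrostic` function and the `TRIVIAL_WORDS` list are names I made up, and which words count as "trivial" is an assumption.

```python
# Hypothetical stop-word list: which words count as "trivial" is an assumption.
TRIVIAL_WORDS = {"a", "an", "and", "for", "of", "the", "to", "in", "on"}


def acrostic(title: str) -> str:
    """Join the first letter of every non-trivial word in the title."""
    words = [w.strip(",.;:") for w in title.split()]
    return "".join(w[0].upper() for w in words if w and w.lower() not in TRIVIAL_WORDS)


hidden = acrostic("Database for Annotation, Visualization and Integrated Discovery")
print(hidden)  # DAVID
```

Anyone who knows the rule can read the message, and anyone who doesn't just sees a database name.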

Granted, the above is a trivial and simplistic example, but steganography has been around for ages… the Romans used it to hide generals’ orders to armies, spies used it in both world wars, and so on. Lately, it has expanded into IT security. Ever seen the movie “Along Came a Spider”? At one point in the movie, one of the characters receives an innocent-looking picture. She presses a key, the image flashes a bit and a hidden message is revealed, asking her to meet up with the baddies. That is one example of steganography in the IT world.

Obviously, the ability to hide data in such an innocuous way in electronic communications can prove invaluable to black hats. Most of the time, a network security expert relies on Intrusion Detection/Prevention Systems (IDS/IPS) to detect, track down, and remove malware or trojans on the network. These systems in turn rely almost exclusively on sniffing network traffic and matching it against well-known, predefined signatures. Now imagine a crafty black hat who really needs to get information to or from the infected machines under his control without a snooping white hat realising what’s going on. He hides the information in innocent-looking traffic that any legitimate user might have generated, and no one’s the wiser. There you have it… steganography.

Black hats could have quite a few uses for steganography. Extrapolating from my scenario above, steganography could be used to:

1. Send commands to zombies (infected PCs)

2. Retrieve personal information from PCs that have trojans installed on them

3. Increase the survivability and usefulness of botnets

4. Protect sensitive data in transit (for example, hiding the IPs, hostnames, and contact details of malware command and control centers)

There are several posts about the above on the internet. A quick search on Google turns up a lot of links. One of the more interesting ones was from irongeek:

http://www.irongeek.com/i.php?page=security/steganographic-command-and-control

This is quite an awesome article. irongeek discusses more practical implementations of steganography and argues that, with the advent of Web 2.0, it will be easier for black hats not only to hide data in pictures, like the example I mentioned earlier, but also to hide information in Facebook updates, tweets, and so on.

So what can the security-conscious do about all this? Well, this is where steganalysis comes into the picture. Steganalysis uses various methods, predominantly statistical analysis, to determine whether there is any hidden data. Unfortunately, steganalysis is still in its infancy in terms of IT. I am personally not aware of any vendors or programs that have IDS/IPS-like capability to detect and stop steganographic communications. Certainly some steganalysis programs exist out there (I wrote some myself for my undergrad thesis), but they mostly focus on pre-defined caches of data, for example crawling websites and analysing what they find for steganographic content.
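To give a flavour of what "statistical analysis" can mean here, below is a minimal sketch in the spirit of the well-known pair-of-values chi-square test against least-significant-bit (LSB) embedding. It is not the method from my thesis or from any particular tool. The data is synthetic, and `pair_chi_square`, `embed_lsb`, and the deliberately skewed "cover" are all assumptions made for illustration.

```python
import random
from collections import Counter


def pair_chi_square(samples):
    """Chi-square statistic over the value pairs (2k, 2k+1).

    Overwriting LSBs with random-looking bits tends to equalise the two
    counts inside each pair, so a small statistic relative to the number
    of pairs hints at LSB embedding.
    """
    counts = Counter(samples)
    statistic, pairs_used = 0.0, 0
    for even in range(0, 256, 2):
        expected = (counts[even] + counts[even + 1]) / 2
        if expected < 5:  # skip sparse pairs (usual chi-square rule of thumb)
            continue
        statistic += (counts[even] - expected) ** 2 / expected
        pairs_used += 1
    return statistic, pairs_used


def embed_lsb(samples, rng):
    """Overwrite every LSB with a random bit (stand-in for an encrypted payload)."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


rng = random.Random(0)
# Synthetic "cover": even byte values are deliberately more common than odd ones.
cover = [rng.choice([v, v, v, v + 1])
         for v in (rng.randrange(0, 256, 2) for _ in range(20000))]
stego = embed_lsb(cover, rng)

cover_stat, cover_pairs = pair_chi_square(cover)
stego_stat, stego_pairs = pair_chi_square(stego)
print(f"cover: {cover_stat:.1f} over {cover_pairs} pairs")
print(f"stego: {stego_stat:.1f} over {stego_pairs} pairs")
```

The "stego" data should score far lower than the "cover" because the embedding flattens each pair. Real images are nowhere near as conveniently skewed as this synthetic cover, which is part of why detection in practice is much harder.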

What would be awesome is a system that acts as a gateway/proxy, scans network traffic in real time and tries to decide whether that traffic carries steganographic data. So… I’d call it something like “Steganalysis of dynamic data”. This would be quite a major undertaking, but it could give network security a tool against something that can potentially play quite a large role in malware communication.
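Very roughly, the core loop of such a gateway might look like the sketch below. Everything in it is hypothetical: `scan_stream`, the per-flow window, the threshold, and especially `toy_detector`, which is only a placeholder for a real steganalysis test.

```python
from collections import defaultdict, deque


def scan_stream(packets, detector, window=4, threshold=0.8):
    """Yield (flow_id, score) whenever a flow's recent payloads look suspicious.

    packets  -- iterable of (flow_id, payload_bytes) pairs, in arrival order
    detector -- callable taking bytes and returning a suspicion score in [0, 1]
    window   -- how many recent payloads per flow are scored together
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    for flow_id, payload in packets:
        recent[flow_id].append(payload)
        score = detector(b"".join(recent[flow_id]))
        if score >= threshold:
            yield flow_id, score


def toy_detector(data: bytes) -> float:
    """Placeholder only: fraction of bytes equal to 'X'. Not a real steganalysis test."""
    return data.count(b"X") / len(data) if data else 0.0


traffic = [("a", b"hello"), ("b", b"XXXX"), ("a", b"world"), ("b", b"XXXY")]
alerts = list(scan_stream(traffic, toy_detector))
print(alerts)
```

The point of the shape is that the detector is pluggable and that each flow is scored over a sliding window, since hidden data may be spread thinly across many innocent-looking messages.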

I will post my thoughts about the challenges of such a system in a subsequent blog post…

Update 29/06/2010:

Just a follow-up to the above. Today a story hit the headlines involving Russian spies (allegedly) using steganography:

http://news.cnet.com/politics-and-law/?tag=rb_content;overviewHead