Stop Botnets by Knowing a Zombie From a User

Better profiles of ‘normal’ net activity can help pick botnet traffic from the background.

Identifying calls between one zombie PC and the botnet that owns it, from inside a company with thousands of computer systems, is like trying to pick out one goldfish in a giant fish tank: among thousands of others doing almost the same things, it’s hard to spot the one fish with evil on its mind.

But a half-century-old statistical analysis tool may offer more hope, by suggesting enough about the behavior of well-adjusted fish to make the behavior of the bad ones stand out.

Anti-malware and intrusion-detection applications already depend heavily on statistical analyses of various kinds, as well as signature scanning and other methods. Even so, it is extraordinarily difficult to tell whether a small bit of software that downloads, installs itself and contacts an outside server is malware connecting to its botnet controller, or a game or video controller launched by a user playing games or watching video.

Encrypting a malicious payload or changing the signature a bit is enough to defeat most analytical models, according to a team of researchers at PSG College of Technology in Coimbatore, India. For example, Zeus, the financial-data-stealing malware that security firm FortiGuard calls “The God of DIY Botnets,” is evading detection using a combination of encryption and misdirection.

The newest version of the GameOver Zeus variant slipped through 50 anti-virus filters at online anti-virus service VirusTotal by encrypting its malicious payload and changing the name to make it look inert, according to security researcher Gary Warner at Malcovery, who blogged about it Feb. 2. “Why? Well, because technically, it isn’t malware. It doesn’t actually execute!” Warner wrote. “All Windows EXE files start with the bytes ‘MZ’. These files start with ‘ZZP’. They aren’t executable, so how could they be malware? Except they are.”
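The header check Warner describes is easy to picture in code. What follows is only a minimal sketch in Python, not the logic of any real anti-virus filter: the function name and both sample buffers are hypothetical, and the only facts taken from his post are the “MZ” and “ZZP” prefixes.

```python
def looks_like_windows_exe(data: bytes) -> bool:
    """Return True if the buffer begins with the two-byte 'MZ' header."""
    return data[:2] == b"MZ"


# Hypothetical sample buffers, made up purely for illustration.
ordinary_exe = b"MZ" + b"\x00" * 16
disguised_payload = b"ZZP" + b"\x00" * 16

for label, sample in [("ordinary", ordinary_exe), ("disguised", disguised_payload)]:
    verdict = "executable header" if looks_like_windows_exe(sample) else "no executable header"
    print(f"{label}: {verdict}")
```

A filter that relied on this test alone would wave the second buffer through, which is the blind spot Warner is pointing at.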

Rather than launching its own malicious payload, the attachment downloads an encrypted file ending in .enc, then decrypts it, renames it and stores the new payload somewhere else on the infected machine – as an executable scheduled to launch sometime later.

It was easier when botnets used IRC to control malware-infected zombies, but the state of the art is now to use TCP and HTTP, which helps botnets hide their tracks among gigabytes of legitimate HTTP traffic, according to the team from the Department of Applied Mathematics and Computational Sciences at PSG College of Technology. Pattern matching and behavior analysis can pick out some malicious behavior, but it’s much easier to identify anomalous behavior if you have a detailed profile of “normal” network behavior.

It is possible to create a detailed-enough profile by applying a statistical analysis tool called a hidden semi-Markov model (HsMM) to data recorded in the management information bases (MIBs) that define and track specific actions for management apps using the Simple Network Management Protocol (SNMP), the team argued in a paper published in the current edition of the International Journal of Electronic Security and Digital Forensics.

HsMM is a double variant of a statistical analysis called the Markov model – a process of predictive analysis of a chain of events in which the next event can be predicted from the current state alone, even though the previous steps are unknown. A photo of a basketball poised on the rim of a basket at a Big Ten college basketball game, for example, could be used to predict whether the defending team is about to lose ground or stay even, depending on which direction the ball bounces. A semi-Markov model is the same, except that there are specific time spans associated with each step.
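For readers who think in code, the semi-Markov idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the three activity states, the transition probabilities and the duration ranges are invented, and none of it comes from the PSG team's paper. The point is only that the next state depends on the current one, and that each state lasts for its own span of time.

```python
import random

# Hypothetical states and probabilities, invented for illustration.
TRANSITIONS = {
    "idle": {"browsing": 0.8, "download": 0.2},
    "browsing": {"idle": 0.6, "download": 0.4},
    "download": {"idle": 0.7, "browsing": 0.3},
}

# The "semi" part: each state has its own range of durations, in seconds.
DURATION_RANGE = {
    "idle": (10.0, 60.0),
    "browsing": (5.0, 30.0),
    "download": (1.0, 10.0),
}


def next_state(state, rng):
    """Pick the next state using only the current state (the Markov property)."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(start, steps, rng):
    """Return a list of (state, duration) pairs for a simulated session."""
    path = []
    state = start
    for _ in range(steps):
        low, high = DURATION_RANGE[state]
        path.append((state, rng.uniform(low, high)))
        state = next_state(state, rng)
    return path


for state, seconds in simulate("idle", 5, random.Random(0)):
    print(f"{state:9s} {seconds:5.1f} s")
```

Drop the duration table and what remains is an ordinary Markov chain.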

A hidden Markov model is one in which the future is also predictable, but the previous steps are hidden and can only be inferred from indirect evidence – the expression on the defending coach’s face, for example. A hidden semi-Markov model can be used with either direct observations of a PC’s behavior, or inferences about it based on other information, and can be relied upon to make good guesses about events in both the future and the past.
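To make the “hidden” part concrete, here is a minimal Python sketch of the standard forward algorithm for a plain hidden Markov model. It is not the researchers' HsMM (it leaves out the per-state durations), and the two hidden states, the “low”/“high” traffic observations and every probability are hypothetical values chosen for illustration. The function returns how likely a sequence of observations is, given that the states producing them are never seen directly.

```python
# Hypothetical model parameters, invented for illustration.
STATES = ("idle", "active")
START = {"idle": 0.6, "active": 0.4}
TRANS = {
    "idle": {"idle": 0.7, "active": 0.3},
    "active": {"idle": 0.4, "active": 0.6},
}
EMIT = {
    "idle": {"low": 0.9, "high": 0.1},
    "active": {"low": 0.2, "high": 0.8},
}


def forward_likelihood(observations):
    """Probability of the observed sequence, summed over all hidden paths."""
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


print(forward_likelihood(["low", "low", "high", "low"]))
```

The same bookkeeping, extended with a duration distribution for each state, is roughly what turns a hidden Markov model into a hidden semi-Markov one.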

Using an HsMM analytical framework and data from SNMP MIBs, it is possible to establish quite accurate profiles of what constitutes “normal” network activity – a pattern in which the activity of zombies stands out like a sore thumb. The difference is that all botnet activity is driven by the machine and has nothing to do with the behavior of the user – a difference that can be identified and separated statistically from other results, according to the authors.
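The general approach (learn what normal looks like, then flag what does not fit) can be sketched with a much simpler stand-in than the authors' HsMM. The Python below fits a first-order Markov profile to traffic levels and scores new sequences against it. All of it is assumed for illustration: the L/M/H symbols (imagined here as discretized readings of some SNMP counter), the training strings, the threshold and the function names are hypothetical, not taken from the paper.

```python
import math
from collections import Counter, defaultdict

SYMBOLS = "LMH"  # hypothetical low / medium / high traffic levels


def fit_profile(sequences, smoothing=1.0):
    """Estimate transition probabilities from sequences assumed to be normal."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    profile = {}
    for prev in SYMBOLS:
        total = sum(counts[prev].values()) + smoothing * len(SYMBOLS)
        profile[prev] = {c: (counts[prev][c] + smoothing) / total for c in SYMBOLS}
    return profile


def avg_log_likelihood(profile, seq):
    """Average log-probability per transition; lower means less 'normal'."""
    logs = [math.log(profile[p][c]) for p, c in zip(seq, seq[1:])]
    return sum(logs) / len(logs)


# Made-up training data standing in for user-driven traffic.
normal_traffic = ["LLLMLLLLHLLLMLLL", "LLMLLLLLLLHLLLLM"]
profile = fit_profile(normal_traffic)

THRESHOLD = -1.0  # hypothetical cut-off, chosen by eye for this toy data


def is_anomalous(seq):
    return avg_log_likelihood(profile, seq) < THRESHOLD


user_like = "LLLLMLLLHLLL"   # irregular, mostly quiet
bot_like = "LHLHLHLHLHLH"    # rigidly periodic, machine-driven
print(is_anomalous(user_like), is_anomalous(bot_like))
```

In this toy setup the rigidly periodic sequence scores well below the irregular one, which is the kind of machine-versus-user gap the authors say their model separates statistically.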

In tests with machines turned into zombies by SpyEye or BlackEnergy malware, the model was able to identify 98.7 percent of infections, with a false-positive rate of only 1.3 percent, “which is really high with low false positive rate,” the authors concluded. “The proposed model is an efficient, light weight with high detection accuracy and less false positive rate.”
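For anyone unsure how figures like these are normally derived, the arithmetic is short. The Python below is a sketch using hypothetical counts (a made-up test of 1,000 infected and 1,000 clean machines, picked so the percentages come out round); the article does not give the size of the researchers' test set, and these numbers should not be read as theirs.

```python
def detection_rate(true_positives, false_negatives):
    """Share of infected machines that were correctly flagged."""
    return true_positives / (true_positives + false_negatives)


def false_positive_rate(false_positives, true_negatives):
    """Share of clean machines that were wrongly flagged."""
    return false_positives / (false_positives + true_negatives)


# Hypothetical counts, NOT taken from the paper.
infected_flagged, infected_missed = 987, 13
clean_flagged, clean_passed = 13, 987

print(f"detection rate:      {detection_rate(infected_flagged, infected_missed):.1%}")
print(f"false-positive rate: {false_positive_rate(clean_flagged, clean_passed):.1%}")
```

The two rates are measured on different groups of machines, infected and clean, so they do not have to add up to 100 percent.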

Image: Balefire