Posted on : Monday 9th March 2020 11:12 AM

A Machine Learning Classifier Can Spot Serial Hijackers Before They Strike

Posted by  Tronserve

How would you feel if, any time you had to send sensitive information somewhere, you relied on a chain of people playing the telephone game to get that information where it needs to go? Sounds like a terrible idea, right? Well, too bad, because that is how the Internet works.

Data is routed through the Internet’s various metaphorical tubes using what's called the Border Gateway Protocol (BGP). Any data traveling over the Internet needs a physical path of networks and routers to make it from A to B. BGP is the protocol that moves information along those paths. The downside, much like in a game of telephone, is that each junction in the path knows only what it has been told by its immediate neighbor.

Because a given junction in a route knows only where the data it is transmitting just came from and where it’s headed next, it is relatively simple for someone to step in and divert the data. At these junctions, autonomous systems establish BGP connections. Like a spoilsport deliberately ruining a game of telephone by whispering a completely different phrase than the one that was told to them, a hacker can insert their own autonomous system to reroute information. The worst offenders are serial hijackers, who repeatedly divert data to skim information or enable distributed denial-of-service (DDoS) attacks. In 1998, a group of hackers testified to the U.S. Congress that the Internet could be taken down by a dedicated hacker in 30 minutes using BGP hijacking.
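To make the rerouting concrete, here is a minimal sketch in Python of how a router that trusts its neighbors' announcements can be diverted. It assumes a heavily simplified selection rule (most specific matching prefix wins, then shortest AS path); real BGP route selection involves more steps. The function name `best_route`, the prefixes (taken from documentation address ranges), and the AS numbers are illustrative assumptions, not details from the paper.

```python
import ipaddress


def best_route(destination, announcements):
    """Return the announcement a router would follow for `destination`.

    Simplified rule (an assumption for this sketch): the most specific
    matching prefix wins, and ties go to the shortest AS path. Nothing
    here verifies that the origin actually owns the prefix.
    """
    addr = ipaddress.ip_address(destination)
    candidates = [
        a for a in announcements
        if addr in ipaddress.ip_network(a["prefix"])
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda a: (ipaddress.ip_network(a["prefix"]).prefixlen,
                       -len(a["as_path"])),
    )


# Hypothetical announcements; the last AS in each path is the origin.
legitimate = {"prefix": "198.51.100.0/24", "as_path": [64500, 64501]}
hijack = {"prefix": "198.51.100.0/25", "as_path": [64502, 64510]}

before = best_route("198.51.100.10", [legitimate])
after = best_route("198.51.100.10", [legitimate, hijack])

print("origin before hijack:", before["as_path"][-1])
print("origin after hijack:", after["as_path"][-1])
```

In this toy setup the hijacker does nothing more than announce a more specific prefix, and the router follows it because it has no way to check the claim.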

Over the years, serial hijackers have been tough to stop. One recent example was Bitcanal, a Portuguese web hosting firm that spent years helping serial hijackers carry out their attacks. It took years of coordinated effort from legitimate service providers to shut down Bitcanal, and in the meantime, several other serial hijackers still roam the Web. What’s worse, serial hijackers have to, as the name suggests, launch multiple attacks before it becomes clear that they are bad-faith actors.

“BGP [hacking] is one way to sniff at traffic, or steal traffic,” says Cecilia Testart, a graduate student at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). “Given that the Internet is becoming more and more critical, we should try and prevent these attacks.”

Testart is the lead author on a paper published today [PDF] by researchers at CSAIL and the Center for Applied Internet Data Analysis (CAIDA). They propose that machine learning can be used to proactively stop serial hijackers. Serial hijackers, the researchers suggest, show some characteristic traits that make them stand out from regular network providers. They show that machine learning could spot serial hijackers much faster than the standard method of identifying them only after numerous attacks.

The joint team used a machine learning technique called an extremely randomized trees (extra-trees) classifier. In a test, the classifier flagged 934 of the 19,103 autonomous systems it examined as potential serial hijackers. You can think of an extra-trees classifier as a forest of trees, where each tree casts a vote (for instance, on whether somebody is a serial hijacker) based on a randomized subset of the available information.

The resulting forest represents a consensus. If most trees have decided that someone is a serial hijacker using the limited information available to them, then you probably have one on your hands. Testart says extra-trees classifiers and other forest classifiers don’t have the same bias toward a set of training data that a machine learning technique such as deep learning might have. Because the available data on known serial hijackers is so limited, deep learning techniques might have skewed toward finding only hijackers most similar to known attackers and missed ones that differ more.
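The voting idea can be sketched in a few lines of plain Python. This is not the researchers' classifier: it is a much-simplified stand-in in which each "tree" is a single random split (a stump) on a randomly chosen feature with a randomly drawn threshold, which is the core randomization behind extra-trees. The two features (fraction of time online, number of separate address blocks), the training rows, and all function names are invented for illustration.

```python
import random
from collections import Counter


def majority(labels, default):
    """Most common label in `labels`, or `default` if the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def grow_stump(samples, labels, rng):
    """One randomized split: random feature, random threshold in its range."""
    feature = rng.randrange(len(samples[0]))
    values = [s[feature] for s in samples]
    threshold = rng.uniform(min(values), max(values))
    overall = majority(labels, None)
    left = [lab for s, lab in zip(samples, labels) if s[feature] <= threshold]
    right = [lab for s, lab in zip(samples, labels) if s[feature] > threshold]
    return feature, threshold, majority(left, overall), majority(right, overall)


def grow_forest(samples, labels, n_trees=101, seed=0):
    rng = random.Random(seed)
    return [grow_stump(samples, labels, rng) for _ in range(n_trees)]


def predict(forest, sample):
    """Return the winning label and the share of trees that voted for it."""
    votes = Counter(
        left if sample[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)


# Invented toy data: (fraction of time online, number of address blocks).
samples = [(0.95, 2), (0.90, 3), (0.99, 1), (0.85, 4),
           (0.20, 30), (0.30, 25), (0.10, 40), (0.40, 20)]
labels = ["legitimate"] * 4 + ["hijacker"] * 4

forest = grow_forest(samples, labels)
print(predict(forest, (0.97, 2)))
print(predict(forest, (0.15, 35)))
```

Each stump sees the data through one arbitrary cut, so any single vote can be wrong; the forest's answer comes from the majority. A production setup would more likely use a library implementation with full trees rather than stumps.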

Of course, for individual trees to cast a vote, they have to know what they're looking for. The research group identified a few ways in which serial hijackers differ from the legitimate network providers that routinely route Internet traffic. For example, legitimate providers tend to be online more consistently, as they are providing Internet service to real customers. Serial hijackers, on the other hand, tend to be online only while they are skimming data.
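As a rough illustration of how "being online consistently" could be turned into numbers a classifier can use, the sketch below computes two simple quantities from the set of days on which a network was seen announcing routes. The function names, the day-index representation, and the sample values are assumptions made for this example; the paper's actual features may be defined differently.

```python
def online_fraction(active_days, window_days):
    """Share of days in [0, window_days) on which the network was active."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    in_window = {d for d in active_days if 0 <= d < window_days}
    return len(in_window) / window_days


def activity_bursts(active_days):
    """Number of separate runs of consecutive active days."""
    days = set(active_days)
    # A burst starts on any active day whose previous day was inactive.
    return sum(1 for d in days if d - 1 not in days)


steady_provider = set(range(0, 30))   # hypothetical: active every day
sporadic_network = {0, 1, 2, 10, 11, 29}  # hypothetical: short bursts

print(online_fraction(steady_provider, 30), activity_bursts(steady_provider))
print(online_fraction(sporadic_network, 30), activity_bursts(sporadic_network))
```

A steady provider shows a high fraction and a single long run, while a network that appears only in short bursts shows a low fraction and several runs.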

Serial hijackers also tend to have more disparate Internet Protocol (IP) blocks, which are essentially the street addresses of the Internet. Testart explains that an institution like MIT typically has a block of consecutive IP addresses that it uses. Hijackers, however, tend to pick up small ranges of IP addresses as other users abandon them. A user with an odd assortment of IP blocks, therefore, is much more likely to be a serial hijacker.
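One way to quantify how scattered a set of address blocks is, sketched here with Python's standard `ipaddress` module: merge any adjacent prefixes and count how many separate blocks remain. The function name `address_footprint` and the example prefixes (private and documentation ranges) are assumptions for illustration only.

```python
import ipaddress


def address_footprint(prefixes):
    """Return (number of non-adjacent blocks, total addresses covered)."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses merges overlapping and adjacent prefixes.
    merged = list(ipaddress.collapse_addresses(networks))
    total = sum(net.num_addresses for net in merged)
    return len(merged), total


# Hypothetical institution holding one contiguous range, announced in halves.
contiguous = ["10.0.0.0/17", "10.0.128.0/17"]
# Hypothetical network holding small, unrelated ranges.
scattered = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]

print(address_footprint(contiguous))
print(address_footprint(scattered))
```

Many separate blocks covering few addresses in total would match the "odd assortment" pattern described above, though on its own it is only a hint, not proof.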

These rules aren’t set in stone. Testart notes there are situations in which a legitimate network provider could go offline, for example, during an earthquake or blackout. Fat-finger errors can also cause typos and misconfigurations that make a legitimate provider look suspicious at first glance. Testart says there’s still plenty to be done with the work the research team has published so far. She suggests that an extra-trees classifier like the one the group developed could give network operators a sort of reputation score, so that serial hijackers would see their reputations drop quickly as they went about their nefarious business.
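The article does not say how such a reputation score would be computed. Purely as a hypothetical sketch, one option is an exponentially weighted update in which each observation period blends the previous score with how suspicious the classifier found the network's latest behavior. The function name, the `weight` parameter, and the 0-to-1 scales are all assumptions.

```python
def update_reputation(reputation, suspicion, weight=0.5):
    """Blend the old reputation with the latest classifier output.

    `suspicion` is assumed to be in [0, 1], for example the share of
    trees that voted "hijacker". Higher `weight` makes the score react
    faster to recent behavior.
    """
    if not 0.0 <= suspicion <= 1.0:
        raise ValueError("suspicion must be between 0 and 1")
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must be in (0, 1]")
    return (1.0 - weight) * reputation + weight * (1.0 - suspicion)


score = 1.0  # hypothetical starting reputation
for suspicion in (1.0, 1.0, 1.0):  # three periods flagged as fully suspicious
    score = update_reputation(score, suspicion)
    print(score)
```

With this rule, repeated suspicious behavior halves the score each period, while a single false alarm (say, after an outage) can be recovered from over the following periods.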

Another alternative is to update BGP and turn the game of telephone into something more secure. But Testart doesn’t think that is likely. “The Internet is a huge network,” she says. “It’s running on infrastructure set up many years ago. If you update a major protocol, you need to update all that infrastructure.” Just imagine the headaches of trying to get every network provider in the world to agree to change a protocol. It is far easier to build a tool that can sniff out serial hijackers.

