commit d0fd49800ce25a5225b4483242ff86af51f4b399 Author: George Hughey Date: Fri Nov 15 02:38:40 2019 +0000 Initial commit Initial commit of Geneva diff --git a/README.md b/README.md new file mode 100644 index 0000000..592ef23 --- /dev/null +++ b/README.md @@ -0,0 +1,115 @@ +# Geneva + +Geneva is a artificial intelligence tool that defeats censorship by exploiting bugs in censors, such as those in China, India, and Kazakhstan. Unlike many other anti-censorship solutions which require assistance from outside the censoring regime (Tor, VPNs, etc.), Geneva runs strictly on the client. + +Under the hood, Geneva uses a genetic algorithm to evolve censorship evasion strategies and has found several previously unknown bugs in censors. Geneva's strategies manipulate the client's packets to confuse the censor without impacting the client/server communication. This makes Geneva effective against many types of in-network censorship (though it cannot be used against IP-blocking censorship). + +This code release specifically contains the strategy engine used by Geneva, its Python API, and a subset of published strategies, so users and researchers can test and deploy Geneva's strategies. To learn more about how Geneva works, visit [How it Works](#How-it-Works). We will be releasing the genetic algorithm at a later date. + +## Setup + +Geneva has been developed and tested for Centos or Debian-based systems. Due to limitations of +netfilter and raw sockets, Geneva does not work on OS X or Windows at this time. + +Install netfilterqueue dependencies: +``` +# sudo apt-get install build-essential python-dev libnetfilter-queue-dev libffi-dev libssl-dev +``` + +Install Python dependencies: +``` +# python3 -m pip install -r requirements.txt +``` + +## Running it + +``` +# python3 engine.py --server-port 80 --strategy "[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:R})-| \/" --log debug +2019-10-14 16:34:45 DEBUG:[ENGINE] Engine created with strategy \/ (ID bm3kdw3r) to port 80 +2019-10-14 16:34:45 DEBUG:[ENGINE] Configuring iptables rules +2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A OUTPUT -p tcp --sport 80 -j NFQUEUE --queue-num 1 +2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A INPUT -p tcp --dport 80 -j NFQUEUE --queue-num 2 +2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A OUTPUT -p udp --sport 80 -j NFQUEUE --queue-num 1 +2019-10-14 16:34:45 DEBUG:[ENGINE] iptables -A INPUT -p udp --dport 80 -j NFQUEUE --queue-num 2 +``` + +Note that if you have stale `iptables` rules or other rules that rely on Geneva's default queues, +this will fail. To fix this, remove those rules. + +## Strategy Library + +Geneva has found dozens of strategies that work against censors in China, Kazakhstan, and India. We include several of these strategies in [strategies.md](strategies.md). Note that this file contains success rates for each individual country; a strategy that works in one country may not work as well as other countries. + +Researchers have observed that strategies may have differing success rates based on your exact location. Although we have not observed this from our vantage points, you may find that some strategies may work differently in a country we have tested. If this is the case, don't be alarmed. However, please feel free to reach out to a member of the team directly or open an issue on this page so we can track how the strategies work from other geographic locations. + +## Disclaimer + +Running these strategies may place you at risk if you use it within a censoring regime. Geneva takes overt actions that interfere with the normal operations of a censor and its strategies are detectable on the network. Geneva is not an anonymity tool, nor does it encrypt any traffic. Understand the risks of running Geneva in your country before trying it. + +------- + +## How it Works + +See our paper for an in-depth read on how Geneva works. Below is a rundown of the format of Geneva's strategy DNA. + +### Strategy DNA + +Geneva's strategies can be arbitrarily complicated, and it defines a well-formatted syntax for +expressing strategies to the engine. + +A strategy is simply a _description of how network traffic should be modified_. A strategy is not +code, it is a description that tells the engine how it should operate over traffic. + +A strategy divides how it handles outbound and inbound packets: these are separated in the DNA by a +"\\/". Specifically, the strategy format is ` \/ `. If `\/` is not +present in a strategy, all of the action trees are in the outbound forest. + +Both forests are composed of action trees, and each forest is allowed an arbitrarily many trees. + +An action tree is comprised of a _trigger_ and a _tree_. The trigger describes _when_ the strategy +should run, and the tree describes what should happen when the trigger fires. Recall that Geneva +operates at the packet level, therefore all triggers are packet-level triggers. Action trees start +with a trigger, and always end with a `-|`. + +Triggers operate as exact-matches, are formatted as follows: `[::]`. For +example, the trigger: `[TCP:flags:S]` will run its corresponding tree whenever it sees a `SYN` +TCP packet. If the corresponding action tree is `[TCP:flags:S]-drop-|`, this action tree will cause +the engine to drop any `SYN` packets. `[TCP:flags:S]-duplicate-|` will cause the engine to +duplicate the SYN packet. + +Depending on the type of action, some actions can have up to two children. These are represented +with the following syntax: `[TCP:flags:S]-duplicate(,)-|`, where +`` and `` themselves are trees. If `(,)` is not specified, any packets +that emerge from the action will be sent on the wire. + +Any action that has parameters associated with it contain those parameters in `{}`. Consider the +following strategy with `tamper`. +``` +[TCP:flags:A]-duplicate(tamper{TCP:flags:replace:R},)-| \/ +``` +This strategy takes outbound `ACK` packets and duplicates them. To the first duplicate, it tampers +the packet by replacing the `TCP` `flags` field with `RST`, and does nothing to the second +duplicate. + +Note that due to NFQueue limitations, actions that introduce branching (fragment, duplicate) are +disabled for incoming action forests. + +------- + +## Citation + +If you like the work or plan to use it in your projects, please follow the guidelines in [citation.bib](https://github.com/Kkevsterrr/geneva/blob/master/citation.bib). + +## Paper + +See [our paper](http://geneva.cs.umd.edu/papers/geneva_ccs19.pdf) from CCS for an in-depth dive into how it works. + +## Contributors + +[Kevin Bock](https://github.com/Kkevsterrr) + +[George Hughey](https://github.com/ecthros) + +[Xiao Qiang](https://twitter.com/rockngo) + +[Dave Levin](https://www.cs.umd.edu/~dml/) diff --git a/actions/action.py b/actions/action.py new file mode 100644 index 0000000..03e6d4f --- /dev/null +++ b/actions/action.py @@ -0,0 +1,149 @@ +""" +Action + +Geneva object for defining a packet-level action. +""" + +import inspect +import importlib +import os +import sys + +import actions.utils + + +ACTION_CACHE = {} +ACTION_CACHE["in"] = {} +ACTION_CACHE["out"] = {} +BASEPATH = os.path.sep.join(os.path.dirname(os.path.abspath(__file__)).split(os.path.sep)[:-1]) + +class Action(): + """ + Defines the superclass for a Geneva Action. + """ + # Give each Action a unique ID - this is needed for graphing/visualization + ident = 0 + + def __init__(self, action_name, direction): + """ + Initializes this action object. + """ + self.enabled = True + self.action_name = action_name + self.direction = direction + self.requires_undo = False + self.num_seen = 0 + + self.left = None + self.right = None + self.branching = False + self.terminal = False + self.ident = Action.ident + Action.ident += 1 + + def applies(self, direction): + """ + Returns whether this action applies to the given direction, as + branching actions are not supported on inbound trees. + """ + if direction == self.direction or self.direction == "both": + return True + return False + + def mutate(self, environment_id=None): + """ + Mutates packet. + """ + + def __str__(self): + """ + Defines string representation of this action. + """ + return "%s" % (self.action_name) + + @staticmethod + def get_actions(direction, disabled=None, allow_terminal=True): + """ + Dynamically imports all of the Action classes in this directory. + + Will only return terminal actions if terminal is set to True. + """ + if disabled is None: + disabled = [] + # Recursively call this function again to enumerate in and out actions + if direction.lower() == "both": + return list(set(Action.get_actions("in", disabled=disabled, allow_terminal=allow_terminal) + \ + Action.get_actions("out", disabled=disabled, allow_terminal=allow_terminal))) + + terminal = "terminal" + if not allow_terminal: + terminal = "non-terminal" + + if terminal not in ACTION_CACHE[direction]: + ACTION_CACHE[direction][terminal] = {} + else: + return ACTION_CACHE[direction][terminal] + + + collected_actions = [] + # Get the base path for the project relative to this file + path = os.path.join(BASEPATH, "actions") + for action_file in os.listdir(path): + if not action_file.endswith(".py"): + continue + action = action_file.replace(".py", "") + if BASEPATH not in sys.path: + sys.path.append(BASEPATH) + + importlib.import_module("actions." + action) + def check_action(obj): + return inspect.isclass(obj) and \ + issubclass(obj, actions.action.Action) and \ + obj != actions.action.Action and \ + obj().applies(direction) and \ + obj().enabled and \ + not any([x in str(obj) for x in disabled]) and \ + (allow_terminal or not obj().terminal) + clsmembers = inspect.getmembers(sys.modules["actions."+action], predicate=check_action) + collected_actions += clsmembers + + collected_actions = list(set(collected_actions)) + + ACTION_CACHE[direction][terminal] = collected_actions + + return collected_actions + + @staticmethod + def parse_action(str_action, direction, logger): + """ + Parses a string action into the action object. + """ + # Collect all viable actions that can run for each respective direction + outs = Action.get_actions("out") + ins = Action.get_actions("in") + + # If we're currently parsing the OUT forest, only search the out-compatible actions + if direction == "out": + search = outs + # Otherwise only search in-compatible actions (no branching) + else: + search = ins + + action_obj = None + data = None + # If this action has parameters (defined within {} attached to the action), + # split off the data parameters from the raw action name + if "{" in str_action: + str_action, data = str_action.split("{") + data = data.replace("}", "") + + # Search through all of the actions available for this direction to find the right class + for action_name, action_cls in search: + if str_action.strip() and str_action.lower() in action_name.lower(): + # Define the action, and give it a reference to its parent strategy + action_obj = action_cls() + # If this action has data, ask the new module to parse & initialize itself to it + if data: + # Pass our logger to the action to alert us if it can't parse something + action_obj.parse(data, logger) + return action_obj diff --git a/actions/drop.py b/actions/drop.py new file mode 100644 index 0000000..a17436b --- /dev/null +++ b/actions/drop.py @@ -0,0 +1,15 @@ +from actions.action import Action + +class DropAction(Action): + def __init__(self, environment_id=None): + Action.__init__(self, "drop", "both") + self.terminal = True + self.branching = False + + def run(self, packet, logger): + """ + The drop action returns None for both it's left and right children, and + does not pass the packet along for continued use. + """ + logger.debug(" - Dropping given packet.") + return None, None diff --git a/actions/duplicate.py b/actions/duplicate.py new file mode 100644 index 0000000..9f9e75c --- /dev/null +++ b/actions/duplicate.py @@ -0,0 +1,21 @@ +from actions.action import Action + + +class DuplicateAction(Action): + def __init__(self, environment_id=None): + Action.__init__(self, "duplicate", "out") + self.branching = True + + def run(self, packet, logger): + """ + The duplicate action duplicates the given packet and returns one copy + for the left branch, and one for the right branch. + """ + logger.debug(" - Duplicating given packet %s" % str(packet)) + return packet, packet.copy() + + def mutate(self, environment_id=None): + """ + Swaps its left and right child + """ + self.left, self.right = self.right, self.left diff --git a/actions/fragment.py b/actions/fragment.py new file mode 100644 index 0000000..e16a94e --- /dev/null +++ b/actions/fragment.py @@ -0,0 +1,217 @@ +import random +from actions.action import Action +import actions.packet + +from scapy.all import IP, TCP, fragment + + +class FragmentAction(Action): + def __init__(self, environment_id=None, correct_order=None, fragsize=-1, segment=True): + ''' + correct_order specifies if the fragmented packets should come in the correct order + fragsize specifies how + ''' + Action.__init__(self, "fragment", "out") + self.enabled = True + self.branching = True + self.terminal = False + self.fragsize = fragsize + self.segment = segment + + if correct_order == None: + self.correct_order = self.get_rand_order() + else: + self.correct_order = correct_order + + def get_rand_order(self): + """ + Randomly decides if the fragments should be reversed. + """ + return random.choice([True, False]) + + def fragment(self, original, fragsize): + """ + Fragments a packet into two, given the size of the first packet (0:fragsize) + Always returns two packets + """ + if fragsize == 0: + frags = [original] + else: + frags = fragment(original, fragsize=fragsize) + # If there were more than 2 fragments, join the loads so we still have 2 packets + if len(frags) > 2: + for frag in frags[2:]: + frags[1]["IP"].load += frag["IP"].load + # After scapy fragmentation, the flags field is set to "MF+DF" + # In order for the packet to remain valid, strip out the "MF" + frags[1]["IP"].flags = "DF" + # If scapy tried to fragment but there were only enough bytes for 1 packet, just duplicate it + elif len(frags) == 1: + frags.append(frags[0].copy()) + + return frags[0], frags[1] + + def ip_fragment(self, packet, logger): + """ + Perform IP fragmentation. + """ + if not packet.haslayer("IP") or not hasattr(packet["IP"], "load"): + return packet, packet.copy() # duplicate if no TCP or no payload to segment + load = "" + if packet.haslayer("TCP"): + load = bytes(packet["TCP"]) + elif packet.haslayer("UDP"): + load = bytes(packet["UDP"]) + else: + load = bytes(packet["IP"].load) + + # If there is no load, duplicate the packet + if not load: + return packet, packet.copy() + + if self.fragsize == -1 or (self.fragsize * 8) > len(load) or len(load) <= 8: + fragsize = int(int(((int(len(load)/2))/8))*8) + frags = self.fragment(packet.copy().packet, fragsize=fragsize) + else: + # packet can be fragmented as requested + frags = self.fragment(packet.copy().packet, fragsize=self.fragsize*8) + packet1 = actions.packet.Packet(frags[0]) + packet2 = actions.packet.Packet(frags[1]) + if self.correct_order: + return packet1, packet2 + else: + return packet2, packet1 + + def tcp_segment(self, packet, logger): + """ + Segments a packet into two, given the size of the first packet (0:fragsize) + Always returns two packets, since fragment is a branching action, so if we + are unable to segment, it will duplicate the packet. + """ + if not packet.haslayer("TCP") or not hasattr(packet["TCP"], "load") or not packet["TCP"].load: + return packet, packet.copy() # duplicate if no TCP or no payload to segment + + # Get the original payload and delete it from the packet so it + # doesn't come along when copying the TCP layer + payload = packet["TCP"].load + del(packet["TCP"].load) + + fragsize = self.fragsize + if self.fragsize == -1 or self.fragsize > len(payload) - 1: + fragsize = int(len(payload)/2) + + # Craft new packets + pkt1 = IP(packet["IP"])/payload[:fragsize] + pkt2 = IP(packet["IP"])/payload[fragsize:] + + # We cannot rely on scapy's native parsing here - if a previous action has changed the + # fragment offset, scapy will not identify this as TCP, so we must do it for scapy + if not pkt1.haslayer("TCP"): + pkt1 = IP(packet["IP"])/TCP(bytes(pkt1["IP"].load)) + + if not pkt2.haslayer("TCP"): + pkt2 = IP(packet["IP"])/TCP(bytes(pkt2["IP"].load)) + + packet1 = actions.packet.Packet(pkt1) + packet2 = actions.packet.Packet(pkt2) + + # Reset packet2's SYN number + packet2["TCP"].seq += fragsize + + del packet1["IP"].chksum + del packet2["IP"].chksum + del packet1["IP"].len + del packet2["IP"].len + del packet1["TCP"].chksum + del packet2["TCP"].chksum + del packet1["TCP"].dataofs + del packet2["TCP"].dataofs + + if self.correct_order: + return [packet1, packet2] + else: + return [packet2, packet1] + + def run(self, packet, logger): + """ + The fragment action fragments each given packet. + """ + logger.debug(" - Fragmenting given packet %s" % str(packet)) + if self.segment: + return self.tcp_segment(packet, logger) + else: + return self.ip_fragment(packet, logger) + + def __str__(self): + """ + Returns a string representation with the fragsize + """ + s = Action.__str__(self) + if self.segment: + s += "{" + "tcp" + ":" + str(self.fragsize) + ":" + str(self.correct_order) + "}" + else: + s += "{" + "ip" + ":"+ str(self.fragsize) + ":" + str(self.correct_order) + "}" + return s + + def parse(self, string, logger): + """ + Parses a string representation of fragmentation. Nothing particularly special, + but it does check for a the fragsize. + + Note that the given logger is a DIFFERENT logger than the logger passed + to the other functions, and they cannot be used interchangeably. This logger + is attached to the main GA driver, and is run outside the evaluator. When the + action is actually run, it's run within the evaluator, which by necessity must + pass in a different logger. + """ + + # Count the number of params in this given string + num_parameters = string.count(":") + + # If num_parameters is greater than 2, it's not a valid fragment action + if num_parameters != 2: + msg = "Cannot parse fragment action %s" % string + logger.error(msg) + raise Exception(msg) + else: + params = string.split(":") + seg, fragsize, correct_order = params + if "tcp" in seg: + self.segment = True + else: + self.segment = False + + try: + # Try to convert to int + self.fragsize = int(fragsize) + except ValueError: + msg = "Cannot parse fragment action %s" % string + logger.error(msg) + raise Exception(msg) + + # Parse ordering + if correct_order.startswith('True'): + self.correct_order = True + else: + self.correct_order = False + + return True + + def mutate(self, environment_id=None): + """ + Mutates the fragment action - it either chooses a new segment offset, + switches the packet order, and/or changes whether it segments or fragments. + """ + self.correct_order = self.get_rand_order() + self.segment = random.choice([True, True, True, False]) + if self.segment: + if random.random() < 0.5: + self.fragsize = int(random.uniform(1, 60)) + else: + self.fragsize = -1 + else: + if random.random() < 0.2: + self.fragsize = int(random.uniform(1, 50)) + else: + self.fragsize = -1 + return self diff --git a/actions/layer.py b/actions/layer.py new file mode 100644 index 0000000..8b976c5 --- /dev/null +++ b/actions/layer.py @@ -0,0 +1,594 @@ +import binascii +import random +import string +import os +import urllib.parse + +from scapy.all import IP, RandIP, UDP, Raw, TCP, fuzz + +class Layer(): + """ + Base class defining a Geneva packet layer. + """ + protocol = None + + def __init__(self, layer): + """ + Initializes this layer. + """ + self.layer = layer + # No custom setter, getters, generators, or parsers are needed by default + self.setters = {} + self.getters = {} + self.generators = {} + self.parsers = {} + + @classmethod + def reset_restrictions(cls): + """ + Resets field restrictions placed on this layer. + """ + cls.fields = cls._fields + + def get_next_layer(self): + """ + Given the current layer returns the next layer beneath us. + """ + if len(self.layer.layers()) == 1: + return None + + return self.layer[1] + + def get_random(self): + """ + Retreives a random field and value. + """ + field = random.choice(self.fields) + return field, self.get(field) + + def gen_random(self): + """ + Generates a random field and value. + """ + assert self.fields, "Layer %s doesn't have any fields" % str(self) + field = random.choice(self.fields) + return field, self.gen(field) + + @classmethod + def name_matches(cls, name): + """ + Checks if given name matches this layer name. + """ + return name.upper() == cls.name.upper() + + def get(self, field): + """ + Retrieves the value from a given field. + """ + assert field in self.fields + if field in self.getters: + return self.getters[field](field) + + # Dual field accessors are fields that require two pieces of information + # to retrieve them (for example, "options-eol"). These are delimited by + # a dash "-". + base = field.split("-")[0] + if "-" in field and base in self.getters: + return self.getters[base](field) + + return getattr(self.layer, field) + + def set(self, packet, field, value): + """ + Sets the value for a given field. + """ + assert field in self.fields + base = field.split("-")[0] + if field in self.setters: + self.setters[field](packet, field, value) + + # Dual field accessors are fields that require two pieces of information + # to retrieve them (for example, "options-eol"). These are delimited by + # a dash "-". + elif "-" in field and base in self.setters: + self.setters[base](packet, field, value) + else: + setattr(self.layer, field, value) + + # Request the packet be reparsed to confirm the value is stable + # XXX Temporarily disabling the reconstitution check due to scapy bug (#2034) + #assert bytes(self.protocol(bytes(self.layer))) == bytes(self.layer) + + def gen(self, field): + """ + Generates a value for this field. + """ + assert field in self.fields + if field in self.generators: + return self.generators[field](field) + + # Dual field accessors are fields that require two pieces of information + # to retrieve them (for example, "options-eol"). These are delimited by + # a dash "-". + base = field.split("-")[0] + if "-" in field and base in self.generators: + return self.generators[base](field) + + sample = fuzz(self.protocol()) + + new_value = getattr(sample, field) + if new_value == None: + new_value = 0 + elif type(new_value) != int: + new_value = new_value._fix() + + return new_value + + def parse(self, field, value): + """ + Parses the given value for a given field. This is useful for fields whose + value cannot be represented in a string type easily - it lets us define + a common string representation for the strategy, and parse it back into + a real value here. + """ + assert field in self.fields + if field in self.parsers: + return self.parsers[field](field, value) + + # Dual field accessors are fields that require two pieces of information + # to retrieve them (for example, "options-eol"). These are delimited by + # a dash "-". + base = field.split("-")[0] + if "-" in field and base in self.parsers: + return self.parsers[base](field, value) + + try: + parsed = int(value) + except ValueError: + parsed = value + + return parsed + + def get_load(self, field): + """ + Helper method to retrieve load, as scapy doesn't recognize 'load' as + a regular field properly. + """ + try: + load = self.layer.payload.load + except AttributeError: + pass + try: + load = self.layer.load + except AttributeError: + return "" + + if not load: + return "" + + return urllib.parse.quote(load.decode('utf-8', 'ignore')) + + def set_load(self, packet, field, value): + """ + Helper method to retrieve load, as scapy doesn't recognize 'load' as + a field properly. + """ + if packet.haslayer("IP"): + del packet["IP"].len + + value = urllib.parse.unquote(value) + + value = value.encode('utf-8') + + self.layer.payload = Raw(value) + + def gen_load(self, field): + """ + Helper method to generate a random load, as scapy doesn't recognize 'load' + as a field properly. + """ + load = ''.join([random.choice(string.ascii_lowercase + string.digits) for k in range(10)]) + return urllib.parse.quote(load) + + +class IPLayer(Layer): + """ + Defines an interface to access IP header fields. + """ + name = "IP" + protocol = IP + _fields = [ + 'version', + 'ihl', + 'tos', + 'len', + 'id', + 'flags', + 'frag', + 'ttl', + 'proto', + 'chksum', + 'src', + 'dst', + 'load' + ] + fields = _fields + + def __init__(self, layer): + """ + Initializes the IP layer. + """ + Layer.__init__(self, layer) + self.getters = { + "flags" : self.get_flags, + "load" : self.get_load + } + self.setters = { + "flags" : self.set_flags, + "load" : self.set_load + } + self.generators = { + "src" : self.gen_ip, + "dst" : self.gen_ip, + "chksum" : self.gen_chksum, + "len" : self.gen_len, + "load" : self.gen_load, + "flags" : self.gen_flags + } + + def gen_len(self, field): + """ + Generates a valid IP length. Scapy breaks if the length is set to 0, so + return a random int starting at 1. + """ + return random.randint(1, 500) + + def gen_chksum(self, field): + """ + Generates a checksum. + """ + return random.randint(1, 65535) + + def gen_ip(self, field): + """ + Generates an IP address. + """ + return RandIP()._fix() + + def get_flags(self, field): + """ + Retrieves flags as a string. + """ + return str(self.layer.flags) + + def set_flags(self, packet, field, value): + """ + Sets the flags field. There is a bug in scapy, if you retrieve an empty + flags field, it will return "", but you cannot set this value back. + To reproduce this bug: + >>> setattr(IP(), "flags", str(IP().flags)) # raises a ValueError + To handle this case, this method converts empty string to zero so that + it can be safely stored. + """ + if value == "": + value = 0 + self.layer.flags = value + + def gen_flags(self, field): + """ + Generates random valid flags. + """ + sample = fuzz(self.protocol()) + + # Since scapy lazily evaluates fuzzing, we first must set a + # legitimate value for scapy to evaluate what combination of flags it is + sample.flags = sample.flags + + return str(sample.flags) + + +class TCPLayer(Layer): + """ + Defines an interface to access TCP header fields. + """ + name = "TCP" + protocol = TCP + _fields = [ + 'sport', + 'dport', + 'seq', + 'ack', + 'dataofs', + 'reserved', + 'flags', + 'window', + 'chksum', + 'urgptr', + 'load', + 'options-eol', + 'options-nop', + 'options-mss', + 'options-wscale', + 'options-sackok', + 'options-sack', + 'options-timestamp', + 'options-altchksum', + 'options-altchksumopt', + 'options-md5header', + 'options-uto' + ] + fields = _fields + + options_names = { + "eol": 0, + "nop": 1, + "mss": 2, + "wscale": 3, + "sackok": 4, + "sack": 5, + #"echo" : 6, + #"echo_reply" : 7, + "timestamp": 8, + "altchksum": 14, + "altchksumopt": 15, + "md5header": 19, + #"quick_start" : 27, + "uto": 28 + #"authentication": 29, + #"experiment": 254 + } + + # Each entry is Kind: length + options_length = { + 0: 0, # EOL + 1: 0, # NOP + 2: 2, # MSS + 3: 1, # WScale + 4: 0, # SAckOK + 5: 0, # SAck + 6: 4, # Echo + 7: 4, # Echo Reply + 8: 8, # Timestamp + 14: 3, # AltChkSum + 15: 0, # AltChkSumOpt + 19: 16, # MD5header Option + 27: 6, # Quick-Start response + 28: 2, # User Timeout Option + 29: 4, # TCP Authentication Option + 254: 8, # Experiment + + } + # Required by scapy + scapy_options = { + 0: "EOL", + 1: "NOP", + 2: "MSS", + 3: "WScale", + 4: "SAckOK", + 5: "SAck", + 8: "Timestamp", + 14: "AltChkSum", + 15: "AltChkSumOpt", + 28: "UTO", + # 254:"Experiment" # scapy has two versions of this, so it doesn't work + } + + def __init__(self, layer): + """ + Initializes the TCP layer. + """ + Layer.__init__(self, layer) + # Special methods to help access fields that cannot be accessed normally + self.getters = { + 'load' : self.get_load, + 'options' : self.get_options + } + self.setters = { + 'load' : self.set_load, + 'options' : self.set_options + } + # Special methods to help generate fields that cannot be generated normally + self.generators = { + 'load' : self.gen_load, + 'dataofs' : self.gen_dataofs, + 'flags' : self.gen_flags, + 'chksum' : self.gen_chksum, + 'options' : self.gen_options + } + + + def gen_chksum(self, field): + """ + Generates a checksum. + """ + return random.randint(1, 65535) + + def gen_dataofs(self, field): + """ + Generates a valid value for the data offset field. + """ + # Dataofs is a 4 bit header, so a max of 15 + return random.randint(1, 15) + + def gen_flags(self, field): + """ + Generates a random set of flags. 50% of the time it picks randomly from + a list of real flags, otherwise it returns fuzzed flags. + """ + if random.random() < 0.5: + return random.choice(['S', 'A', 'SA', 'PA', 'FA', 'R', 'P', 'F', 'RA', '']) + else: + sample = fuzz(self.protocol()) + # Since scapy lazily evaluates fuzzing, we first must set a + # legitimate value for scapy to evaluate what combination of flags it is + sample.flags = sample.flags + return str(sample.flags) + + def get_options(self, field): + """ + Helper method to retrieve options. + """ + base, req_option = field.split("-") + assert base == "options", "get_options can only be used to fetch options." + option_type = self.option_str_to_int(req_option) + i = 0 + # First, check if the option is already present in the packet + for option in self.layer.options: + # Scapy may try to be helpful and return the string of the option + next_option = self.option_str_to_int(option[0]) + if option_type == next_option: + _name, value = self.layer.options[i] + # Some options (timestamp, checksums, nop) store their value in a + # tuple. + if isinstance(value, tuple): + # Scapy returns values in any of these types + if value in [None, b'', ()]: + return '' + value = value[0] + if value in [None, b'', ()]: + return '' + if req_option == "md5header": + return binascii.hexlify(value).decode("utf-8") + + return value + i += 1 + return '' + + def set_options(self, packet, field, value): + """ + Helper method to set options. + """ + base, option = field.split("-") + assert base == "options", "Must use an options field with set_options" + + option_type = self.option_str_to_int(option) + if type(value) == str: + # Prepare the value for storage in the packet + value = binascii.unhexlify(value) + + # Scapy requires these options to be a tuple - since evaling this + # is not yet supported, for now, SAck will always be an empty tuple + if option in ["sack"]: + value = () + # These options must be set as integers - if they didn't exist, they can + # be added like this + if option in ["timestamp", "mss", "wscale", "altchksum", "uto"] and not value: + value = 0 + i = 0 + # First, check if the option is already present in the packet + for option in self.layer.options: + # Scapy may try to be helpful and return the string of the option + next_option = self.option_str_to_int(option[0]) + + if option_type == next_option: + packet["TCP"].options[i] = self.format_option(option_type, value) + break + i += 1 + # If we didn't break, the option doesn't exist in the packet currently. + else: + old_options_array = packet["TCP"].options + old_options_array.append(self.format_option(option_type, value)) + packet["TCP"].options = old_options_array + + # Let scapy recalculate the required values + del self.layer.chksum + del self.layer.dataofs + if packet.haslayer("IP"): + del packet["IP"].chksum + del packet["IP"].len + return True + + def gen_options(self, field): + """ + Helper method to set options. + """ + _, option = field.split("-") + option_num = self.options_names[option] + length = self.options_length[option_num] + + data = b'' + if length > 0: + data = os.urandom(length) + data = binascii.hexlify(data).decode() + # MSS must be a 2-byte int + if option_num == 2: + data = random.randint(0, 65535) + # WScale must be a 1-byte int + elif option_num == 3: + data = random.randint(0, 255) + # Timestamp must be an int + elif option_num == 8: + data = random.randint(0, 4294967294) + elif option_num == 14: + data = random.randint(0, 255) + elif option_num == 28: + data = random.randint(0, 255) + + return data + + def option_str_to_int(self, option): + """ + Takes a string representation of an option and returns the option integer code. + """ + if type(option) == int: + return option + + assert "-" not in option, "Must be given specific option: %s." % option + + for val in self.scapy_options: + if self.scapy_options[val].lower() == option.lower(): + return val + + if " " in option: + option = option.replace(" ", "_").lower() + + if option.lower() in self.options_names: + return self.options_names[option.lower()] + + def format_option(self, options_int, value): + """ + Formats the options so they will work with scapy. + """ + # NOPs + if options_int == 1: + return (self.scapy_options[options_int], ()) + elif options_int in [5]: + return (self.scapy_options[options_int], value) + # Timestamp + elif options_int in [8, 14]: + return (self.scapy_options[options_int], (value, 0)) + elif options_int in self.scapy_options: + return (self.scapy_options[options_int], value) + else: + return (options_int, value) + + +class UDPLayer(Layer): + """ + Defines an interface to access UDP header fields. + """ + name = "UDP" + protocol = UDP + _fields = [ + "sport", + "dport", + "chksum", + "len", + "load" + ] + fields = _fields + + def __init__(self, layer): + """ + Initializes the UDP layer. + """ + Layer.__init__(self, layer) + self.getters = { + 'load' : self.get_load, + } + self.setters = { + 'load' : self.set_load, + } + self.generators = { + 'load' : self.gen_load, + } diff --git a/actions/packet.py b/actions/packet.py new file mode 100644 index 0000000..ed1c575 --- /dev/null +++ b/actions/packet.py @@ -0,0 +1,240 @@ +import copy +import random + +import actions.layer + + +_SUPPORTED_LAYERS = [ + actions.layer.IPLayer, + actions.layer.TCPLayer, + actions.layer.UDPLayer +] +SUPPORTED_LAYERS = _SUPPORTED_LAYERS + + +class Packet(): + """ + Defines a Packet class, a convenience wrapper around + scapy packets for ease of use. + """ + def __init__(self, packet=None): + """ + Initializes the packet object. + """ + self.packet = packet + self.layers = self.setup_layers() + self.sleep = 0 + + def __str__(self): + """ + Defines string representation for the packet. + """ + return self._str_packet(self.packet) + + @staticmethod + def _str_packet(packet): + """ + Static method to print a scapy packet. + """ + if packet.haslayer("TCP"): + return "TCP %s:%d --> %s:%d [%s] %s: %s" % ( + packet["IP"].src, + packet["TCP"].sport, + packet["IP"].dst, + packet["TCP"].dport, + packet["TCP"].sprintf('%TCP.flags%'), + str(packet["TCP"].chksum), + Packet._str_load(packet["TCP"], "TCP")) + elif packet.haslayer("UDP"): + return "UDP %s:%d --> %s:%d %s: %s" % ( + packet["IP"].src, + packet["UDP"].sport, + packet["IP"].dst, + packet["UDP"].dport, + str(packet["UDP"].chksum), + Packet._str_load(packet["UDP"], "UDP")) + load = "" + if hasattr(packet["IP"], "load"): + load = str(bytes(packet["IP"].load)) + return "%s --> %s: %s" % ( + packet["IP"].src, + packet["IP"].dst, + load) + + @staticmethod + def _str_load(packet, protocol): + """ + Prints packet payload + """ + return str(packet[protocol].payload) + + def __bytes__(self): + """ + Returns packet's binary representation. + """ + return bytes(self.packet) + + def show(self, **kwargs): + """ + Calls scapy's show method. + """ + return self.packet.show(**kwargs) + + def show2(self, **kwargs): + """ + Calls scapy's show method. + """ + return self.packet.show2(**kwargs) + + def read_layers(self): + """ + Generator that yields parsed Layer objects from the protocols in the given packet. + """ + iter_packet = self.packet + while iter_packet: + if iter_packet.name.lower() == "raw": + return + parsed_layer = Packet.parse_layer(iter_packet) + if parsed_layer: + yield parsed_layer + iter_packet = parsed_layer.get_next_layer() + else: + iter_packet = iter_packet.payload + + def has_supported_layers(self): + """ + Checks if this packet contains supported layers. + """ + return bool(self.layers) + + def setup_layers(self): + """ + Sets up a lookup dictionary for the given layers in this packet. + """ + layers = {} + for layer in self.read_layers(): + layers[layer.name.upper()] = layer + return layers + + def copy(self): + """ + Deep copies this packet. This method is required because it is not safe + to use copy.deepcopy on this entire packet object, because the parsed layers + become disassociated with the underlying packet layers, which breaks layer + setting. + """ + return Packet(copy.deepcopy(self.packet)) + + @staticmethod + def parse_layer(to_parse): + """ + Takes a given scapy layer object and returns a Geneva Layer object. + """ + for layer in SUPPORTED_LAYERS: + if layer.name_matches(to_parse.name): + return layer(to_parse) + + def haslayer(self, layer): + """ + Checks if a given layer is in the packet. + """ + return self.packet.haslayer(layer) + + def __getitem__(self, item): + """ + Returns a layer. + """ + return self.packet[item] + + def set(self, str_protocol, field, value): + """ + Sets the given protocol field to the given value. + + Raises AssertionError if the protocol is not present. + """ + assert self.haslayer(str_protocol), "Given protocol %s is not in packet." % str_protocol + assert str_protocol in self.layers, "Given protocol %s is not permitted." % str_protocol + + # Recalculate the checksums + if self.haslayer("IP"): + del self.packet["IP"].chksum + if self.haslayer("TCP"): + del self.packet["TCP"].chksum + + return self.layers[str_protocol].set(self.packet, field, value) + + def get(self, str_protocol, field): + """ + Retrieves the value of a given field for a given protocol. + + Raises AssertionError if the protocol is not present. + """ + assert self.haslayer(str_protocol), "Given protocol %s is not in packet." % str_protocol + assert str_protocol in self.layers, "Given protocol %s is not permitted." % str_protocol + + return self.layers[str_protocol].get(field) + + def gen(self, str_protocol, field): + """ + Generates a value of a given field for a given protocol. + + Raises AssertionError if the protocol is not present. + """ + assert self.haslayer(str_protocol), "Given protocol %s is not in packet." % str_protocol + assert str_protocol in self.layers, "Given protocol %s is not permitted." % str_protocol + + return self.layers[str_protocol].gen(field) + + @classmethod + def parse(cls, str_protocol, field, value): + """ + Parses a given value for a given field of a given protocool. + + Raises AssertionError if the protocol is not present. + """ + parsing_layer = None + for layer in SUPPORTED_LAYERS: + if layer.name_matches(str_protocol): + parsing_layer = layer(None) + + assert parsing_layer, "Given protocol %s is not permitted." % str_protocol + + return parsing_layer.parse(field, value) + + def get_random_layer(self): + """ + Retrieves a random layer from this packet. + """ + return self.layers[random.choice(list(self.layers.keys()))] + + def get_random(self): + """ + Retrieves a random protocol, field, and value from this packet. + """ + layer = self.get_random_layer() + field, value = layer.get_random() + return layer.protocol, field, value + + @staticmethod + def gen_random(): + """ + Generates a possible random protocol, field, and value. + """ + # layer is a Geneva Layer class - to instantiate it, we must give it a layer + # to use. Every Geneva Layer stores the underlying scapy layer it wraps, + # so simply invoke that as a default. + layer = random.choice(SUPPORTED_LAYERS) + layer_obj = layer(layer.protocol()) + field, value = layer_obj.gen_random() + return layer.protocol, field, value + + @staticmethod + def get_supported_protocol(protocol): + """ + Checks if the given protocol exists in the SUPPORTED_LAYERS list. + """ + for layer in SUPPORTED_LAYERS: + if layer.name_matches(protocol.upper()): + return layer + + return None diff --git a/actions/sleep.py b/actions/sleep.py new file mode 100644 index 0000000..baf33f2 --- /dev/null +++ b/actions/sleep.py @@ -0,0 +1,37 @@ +from actions.action import Action + +class SleepAction(Action): + def __init__(self, time=1, environment_id=None): + Action.__init__(self, "sleep", "out") + self.terminal = False + self.branching = False + self.time = time + + def run(self, packet, logger): + """ + The sleep action simply passes along the packet it was given with an instruction for how long the engine should sleep before sending it. + """ + logger.debug(" - Adding %d sleep to given packet." % self.time) + packet.sleep = self.time + return packet, None + + def __str__(self): + """ + Returns a string representation. + """ + s = Action.__str__(self) + s += "{%d}" % self.time + return s + + def parse(self, string, logger): + """ + Parses a string representation for this object. + """ + try: + if string: + self.time = float(string) + except ValueError: + logger.exception("Cannot parse time %s" % string) + return False + + return True diff --git a/actions/sniffer.py b/actions/sniffer.py new file mode 100644 index 0000000..d4fc543 --- /dev/null +++ b/actions/sniffer.py @@ -0,0 +1,85 @@ +import threading +import os + +import actions.packet +from scapy.all import sniff +from scapy.utils import PcapWriter + + +class Sniffer(): + """ + The sniffer class lets the user begin and end sniffing whenever in a given location with a port to filter on. + Call start_sniffing to begin sniffing and stop_sniffing to stop sniffing. + """ + + def __init__(self, location, port, logger): + """ + Intializes a sniffer object. + Needs a location and a port to filter on. + """ + self.stop_sniffing_flag = False + self.location = location + self.port = port + self.pcap_thread = None + self.packet_dumper = None + self.logger = logger + full_path = os.path.dirname(location) + assert port, "Need to specify a port in order to launch a sniffer" + if not os.path.exists(full_path): + os.makedirs(full_path) + + def __packet_callback(self, scapy_packet): + """ + This callback is called whenever a packet is applied. + Returns true if it should finish, otherwise, returns false. + """ + packet = actions.packet.Packet(scapy_packet) + for proto in ["TCP", "UDP"]: + if(packet.haslayer(proto) and ((packet[proto].sport == self.port) or (packet[proto].dport == self.port))): + break + else: + return self.stop_sniffing_flag + + self.logger.debug(str(packet)) + self.packet_dumper.write(scapy_packet) + return self.stop_sniffing_flag + + def __spawn_sniffer(self): + """ + Saves pcaps to a file. Should be run as a thread. + Ends when the stop_sniffing_flag is set. Should not be called by user + """ + self.packet_dumper = PcapWriter(self.location, append=True, sync=True) + while(self.stop_sniffing_flag == False): + sniff(stop_filter=self.__packet_callback, timeout=1) + + def start_sniffing(self): + """ + Starts sniffing. Should be called by user. + """ + self.stop_sniffing_flag = False + self.pcap_thread = threading.Thread(target=self.__spawn_sniffer) + self.pcap_thread.start() + self.logger.debug("Sniffer starting to port %d" % self.port) + + def __enter__(self): + """ + Defines a context manager for this sniffer; simply starts sniffing. + """ + self.start_sniffing() + return self + + def __exit__(self, exc_type, exc_value, tb): + """ + Defines exit context manager behavior for this sniffer; simply stops sniffing. + """ + self.stop_sniffing() + + def stop_sniffing(self): + """ + Stops the sniffer by setting the flag and calling join + """ + if(self.pcap_thread): + self.stop_sniffing_flag = True + self.pcap_thread.join() + self.logger.debug("Sniffer stopping") diff --git a/actions/strategy.py b/actions/strategy.py new file mode 100644 index 0000000..613f029 --- /dev/null +++ b/actions/strategy.py @@ -0,0 +1,88 @@ +import random + +import actions.utils +import actions.tree + + +class Strategy(object): + def __init__(self, in_actions, out_actions, environment_id=None): + self.in_actions = in_actions + self.out_actions = out_actions + self.in_enabled = True + self.out_enabled = True + + self.environment_id = environment_id + self.fitness = -1000 + + def __str__(self): + """ + Builds a string describing the action trees for this strategy. + """ + return "%s \/ %s" % (self.str_forest(self.out_actions).strip(), self.str_forest(self.in_actions).strip()) + + def __len__(self): + """ + Returns the number of actions in this strategy. + """ + num = 0 + for tree in self.in_actions: + num += len(tree) + for tree in self.out_actions: + num += len(tree) + return num + + def str_forest(self, forest): + """ + Returns a string representation of a given forest (inbound or outbound) + """ + rep = "" + for action_tree in forest: + rep += "%s " % str(action_tree) + return rep + + def pretty_print(self): + return "%s \n \/ \n %s" % (self.pretty_str_forest(self.out_actions), self.pretty_str_forest(self.in_actions)) + + def pretty_str_forest(self, forest): + """ + Returns a string representation of a given forest (inbound or outbound) + """ + rep = "" + for action_tree in forest: + rep += "%s\n" % action_tree.pretty_print() + return rep + + def act_on_packet(self, packet, logger, direction="out"): + """ + Runs the strategy on a given scapy packet. + """ + # If there are no actions to run for this strategy, just send the packet + if (direction == "out" and not self.out_actions) or \ + (direction == "in" and not self.in_actions): + return [packet] + return self.run_on_packet(packet, logger, direction) + + def run_on_packet(self, packet, logger, direction): + """ + Runs the strategy on a given packet given the forest direction. + """ + forest = self.out_actions + if direction == "in": + forest = self.in_actions + + ran = False + original_packet = packet.copy() + packets_to_send = [] + for action_tree in forest: + if action_tree.check(original_packet, logger): + logger.debug(" + %s action tree triggered: %s", direction, str(action_tree)) + # If multiple trees run, the previous packet may have been tampered with. Ensure + # we're always acting on a fresh copy + fresh_packet = original_packet.copy() + packets_to_send += action_tree.run(fresh_packet, logger) + ran = True + + # If no action tree was applicable, send the packet unimpeded + if not ran: + packets_to_send = [packet] + return packets_to_send diff --git a/actions/tamper.py b/actions/tamper.py new file mode 100644 index 0000000..9deaf47 --- /dev/null +++ b/actions/tamper.py @@ -0,0 +1,114 @@ +""" +TamperAction + +One of the four packet-level primitives supported by Geneva. Responsible for any packet-level +modifications (particularly header modifications). It supports replace and corrupt mode - +in replace mode, it changes a packet field to a fixed value; in corrupt mode, it changes a packet +field to a randomly generated value each time it is run. +""" + +from actions.action import Action +import actions.utils + +import random + + +class TamperAction(Action): + """ + Defines the TamperAction for Geneva. + """ + def __init__(self, environment_id=None, field=None, tamper_type=None, tamper_value=None, tamper_proto="TCP"): + Action.__init__(self, "tamper", "both") + self.field = field + self.tamper_value = tamper_value + self.tamper_proto = actions.utils.string_to_protocol(tamper_proto) + self.tamper_proto_str = tamper_proto + + self.tamper_type = tamper_type + if not self.tamper_type: + self.tamper_type = random.choice(["corrupt", "replace"]) + + def tamper(self, packet, logger): + """ + Edits a given packet according to the action settings. + """ + # Return packet untouched if not applicable + if not packet.haslayer(self.tamper_proto_str): + return packet + + # Retrieve the old value of the field for logging purposes + old_value = packet.get(self.tamper_proto_str, self.field) + + new_value = self.tamper_value + # If corrupting the packet field, generate a value for it + if self.tamper_type == "corrupt": + new_value = packet.gen(self.tamper_proto_str, self.field) + + logger.debug(" - Tampering %s field `%s` (%s) by %s (to %s)" % + (self.tamper_proto_str, self.field, str(old_value), self.tamper_type, str(new_value))) + + packet.set(self.tamper_proto_str, self.field, new_value) + + return packet + + def run(self, packet, logger): + """ + The tamper action runs its tamper procedure on the given packet, and + returns the edited packet down the left branch. + + Nothing is returned to the right branch. + """ + return self.tamper(packet, logger), None + + def __str__(self): + """ + Defines string representation for this object. + """ + s = Action.__str__(self) + if self.tamper_type == "corrupt": + s += "{%s:%s:%s}" % (self.tamper_proto_str, self.field, self.tamper_type) + elif self.tamper_type in ["replace"]: + s += "{%s:%s:%s:%s}" % (self.tamper_proto_str, self.field, self.tamper_type, self.tamper_value) + + return s + + def parse(self, string, logger): + """ + Parse out a given string representation of this action and initialize + this action to those parameters. + + Note that the given logger is a DIFFERENT logger than the logger passed + to the other functions, and they cannot be used interchangeably. This logger + is attached to the main GA driver, and is run outside the evaluator. When the + action is actually run, it's run within the evaluator, which by necessity must + pass in a different logger. + """ + # Different tamper actions will have different number of parameters + # Count the number of params in this given string + num_parameters = string.count(":") + + # If num_parameters is greater than 3, it's not a valid tamper action + if num_parameters > 3 or num_parameters < 2: + msg = "Cannot parse tamper action %s" % string + logger.error(msg) + raise Exception(msg) + params = string.split(":") + if num_parameters == 3: + self.tamper_proto_str, self.field, self.tamper_type, self.tamper_value = params + self.tamper_proto = actions.utils.string_to_protocol(self.tamper_proto_str) + if "options" in self.field: + if not self.tamper_value: + self.tamper_value = '' # An empty string instead of an empty byte literal + + # tamper_value might be parsed as a string despite being an integer in most cases. + # Try to parse it out here + try: + if "load" not in self.field: + self.tamper_value = int(self.tamper_value) + except: + pass + else: + self.tamper_proto_str, self.field, self.tamper_type = params + self.tamper_proto = actions.utils.string_to_protocol(self.tamper_proto_str) + + return True diff --git a/actions/tree.py b/actions/tree.py new file mode 100644 index 0000000..6a87ac4 --- /dev/null +++ b/actions/tree.py @@ -0,0 +1,483 @@ +""" +Defines an action tree. Action trees are comprised of a trigger and a tree of actions. +""" + +import random +import re + +import anytree +from anytree.exporter import DotExporter + +import actions.utils +import actions.trigger + + +class ActionTreeParseError(Exception): + """ + Exception thrown when an action tree is malformed or cannot be parsed. + """ + + +class ActionTree(): + """ + Defines an ActionTree for the Geneva system. + """ + + def __init__(self, direction, trigger=None): + self.trigger = trigger + self.action_root = None + self.direction = direction + self.environment_id = None + self.ran = False + + def initialize(self, num_actions, environment_id, allow_terminal=True, disabled=None): + """ + Sets up this action tree with a given number of random actions. + Note that the returned action trees may have less actions than num_actions + if terminal actions are used. + """ + self.environment_id = environment_id + self.trigger = actions.trigger.Trigger(None, None, None, environment_id=environment_id) + if not allow_terminal or random.random() > 0.1: + allow_terminal = False + + for _ in range(num_actions): + new_action = self.get_rand_action(self.direction, disabled=disabled) + self.add_action(new_action) + return self + + def __iter__(self): + """ + Sets up a preoder iterator for the tree. + """ + for node in self.preorder(self.action_root): + yield node + + def __getitem__(self, index): + """ + Implements indexing + """ + if index > len(self): + return None + # Wrap around if given negative number to allow for negative indexing + if index < 0: + index = index + len(self) + idx = 0 + for action in self: + if idx == index: + return action + idx += 1 + + def __len__(self): + """ + Calculates the number of actions in the tree. + """ + num = 0 + for node in self: + if node: + num += 1 + return num + + def __str__(self): + """ + Returns a string representation for the action tree. + """ + rep = "[%s]-" % str(self.trigger) + for string in self.string_repr(self.action_root): + rep += string + if not rep.endswith("-"): + rep += "-" + rep += "|" + return rep + + def do_parse(self, node, string, logger): + """ + Handles the preorder recursive parsing. + """ + # If we're passed an empty string, return None + if not string.strip(): + return None + + # If there is no subtree defined here, the action string is the string we've been given, + # and there are no left or right actions left + if "(" not in string: + action_string = string + left_actions, right_actions = "", "" + else: + # Find the outermost (first) occurance of "(" - this defines the boundaries between + # this current action and it's subtree actions + subtree_idx = string.find("(") + # Split this string into the action and it's subtree string + action_string, rest = string[:subtree_idx], string[subtree_idx:] + + # We need to split the remaining string at the correct comma that splits this subtree's + # left and right actions. To find the correct comma to split on, we need to find the + # comma that splits the current tree 'in the middle' - where the depth is the number of + # splits. This occurs when we cound the same number of commas as open parenthesis "(". + depth = 0 + comma = 0 + idx = 0 + for char in rest: + if char == "(": + depth += 1 + if char == ",": + comma += 1 + if comma == depth and depth > 0: + break + idx += 1 + # If we did not break, we didn't find where to split. Raise an exception + else: + raise ActionTreeParseError("Given string %s is malformed" % string) + + # Split on this index, and ignore the first character "(" and last character ")" + left_actions, right_actions = rest[1:idx], rest[idx+1:-1] + + # Parse the action_string using action.utils + action_obj = actions.action.Action.parse_action(action_string, self.direction, logger) + if not action_obj: + raise ActionTreeParseError("Did not get a legitimate action object from %s" % + action_string) + + # Assign this action_obj to the node + node = action_obj + + # Sanity check - if this is not a branching action but it has right actions, raise + if not node.branching and right_actions: + raise ActionTreeParseError("Cannot have a non branching action with right subtree") + + # Sanity check = if this is a termainal action but it has sub actions, raise + if node.terminal and (right_actions or left_actions): + raise ActionTreeParseError("Cannot have a terminal action with children") + + # If we have a left action and were given a packet to pass on, run + # on the left packet + if left_actions: + node.left = self.do_parse(node.left, left_actions, logger) + + # If we have a left action and were given a packet to pass on, run + # on the left packet + if right_actions: + node.right = self.do_parse(node.right, right_actions, logger) + + return node + + def parse(self, string, logger): + """ + Parses a string representation of an action tree into this object. + """ + # Individual action trees always end in "|" to signify the end - refuse + # to parse if this is malformed + if not string.strip().endswith("|"): + msg = "Tree does not end with |. Was I given an entire strategy or part of a tree?" + logger.error(msg) + return False + + # The format of each action matches this regular expression. For example, given + # the action tree: [TCP:flags:SA]-tamper{TCP:flags:corrupt}-| + # it would parse out the trigger "TCP:flags:SA" and the tree as + # "-tamper{TCP:flags:corrupt}" + match = re.match(r"\[(\S*)\]-(\S*)|", string) + if not match or not match[0]: + logger.error("Could not identify trigger or tree") + return False + + # Ask the trigger class to parse this trigger to define a new object + trigger = actions.trigger.Trigger.parse(match.group(1)) + # If we couldn't parse the trigger, bail + if not trigger: + logger.error("Trigger failed to parse") + return False + tree = match.group(2) + + # Depending on the form of the action tree, there might be a hanging "-|" or "|" + # Remove them + if tree.endswith("-|"): + tree = tree.replace("-|", "") + if tree.endswith("|"): + tree = tree.replace("|", "") + + # Parse the rest of the tree and setup the action tree + try: + self.action_root = self.do_parse(self.action_root, tree, logger) + except ActionTreeParseError: + logger.exception("Exception caught from parser") + return False + + self.trigger = trigger + return self + + def check(self, packet, logger): + """ + Checks if this action tree should run on this packet. + """ + return self.trigger.is_applicable(packet, logger) + + def do_run(self, node, packet, logger): + """ + Handles recursively running a packet down the tree. + """ + # If there is no action here, yield None + if not node: + yield None + else: + # Run this current action against the given packet. + # It will give us a packet to pass to the left and right child + left_packet, right_packet = node.run(packet, logger) + + # If there is no left child, yield the left packet + if not node.left: + yield left_packet + + # If we have a left action and were given a packet to pass on, run + # on the left packet + if node.left and left_packet: + for lpacket in self.do_run(node.left, left_packet, logger): + yield lpacket + + # If there is no right child, yield the right packet + if not node.right: + yield right_packet + + # If we have a left action and were given a packet to pass on, run + # on the left packet + if node.right and right_packet: + for rpacket in self.do_run(node.right, right_packet, logger): + yield rpacket + + def run(self, packet, logger): + """ + Runs a packet through the action tree. + """ + self.ran = True + packets = [] + for processed in self.do_run(self.action_root, packet, logger): + if processed: + packets.append(processed) + return packets + + def preorder(self, node): + """ + Yields a preorder traversal of the tree. + """ + yield node + if node and node.left: + for lnode in self.preorder(node.left): + yield lnode + if node and node.right: + for rnode in self.preorder(node.right): + yield rnode + + def string_repr(self, node): + """ + Yields a preorder traversal of the tree to create a string representation. + """ + if not node: + yield "" + else: + + yield "%s" % node + # Only yield a subtree start representation if there is a subtree to build + if node.left or node.right: + yield "(" + + # Setup the left subtree representation + if node.left: + for lnode in self.string_repr(node.left): + yield str(lnode) + + # Only yield subtree representation if there is a subtree to build + if node.left or node.right: + yield "," + + # Setup the right subtree representation + if node and node.right: + for rnode in self.string_repr(node.right): + yield str(rnode) + + # Only yield subtree representation if there is a subtree to build + if node.left or node.right: + yield ")" + + def remove_action(self, action): + """ + Removes a given action from the tree. + """ + # If there is only an action root and no other actions, just delete the root + if action == self.action_root and not self.action_root.left and not self.action_root.right: + self.action_root = None + return True + + # If there is no tree at all, nothing to remove + if not self.action_root: + return False + + for node in self: + # If the node we're removing is the root of the tree, replace it with the left child + # if it exists; if not, the right child. + if node == action and action == self.action_root: + self.action_root = action.left or action.right + return True + if node.left == action: + node.left = action.left + return True + if node.right == action: + node.right = action.left + return True + return False + + def get_slots(self): + """ + Returns the number of locations a new action could be added. + """ + slots = 0 + for action in self: + # Terminal actions have no children + if action.terminal: + continue + if not action.left: + slots += 1 + if not action.right and action.branching: + slots += 1 + return slots + + def count_leaves(self): + """ + Counts the number of leaves. + """ + leaves = 0 + for action in self: + if not action.left and not action.right: + leaves += 1 + return leaves + + def contains(self, action): + """ + Checks if an action is contained in the tree. + """ + for node in self: + if node == action: + return True + return False + + def add_action(self, new_action): + """ + Adds an action to this action tree. + """ + # Refuse to add None actions + if not new_action: + return False + + # If no actions are in this tree yet, this given action is the new root + if not self.action_root: + self.action_root = new_action + return True + + # We cannot add an action if it is already in the tree, or we could recurse + # forever + if self.contains(new_action): + return False + # Count the open spaces that we could put a new action to. + # This is effectively counting the leaves we could have if all the leaves had children + slots = self.get_slots() + + # We will visit each available slot and add the action there with probability + # 1/slots. Since it's possible we could visit every slot without having hit that + # probability yet, keep iterating until we do. + action_added = False + while not action_added and slots > 0: + for action in self: + if not action.left and not action.terminal and random.random() < 1/float(slots): + action.left = new_action + action_added = True + break + # We can only add to the right child if this action can introduce branching, + # such as (duplicate, fragment) + if not action.right and not action.terminal and action.branching and \ + random.random() < 1/float(slots): + action.right = new_action + action_added = True + break + return action_added + + def remove_one(self): + """ + Removes a random leaf from the tree. + """ + if not self.action_root: + return False + action = random.choice(self) + return self.remove_action(action) + + def choose_one(self): + """ + Picks a random element in the tree. + """ + # If this is an empty tree, return None + if not self.action_root: + return None + return random.choice(self) + + def get_parent(self, node): + """ + Returns the parent of the given node and direction of the child. + """ + # If we're given None, bail with None, None + if not node: + return None, None + for action in self: + if action.left == node: + return action, "left" + if action.right == node: + return action, "right" + return None, None + + def pretty_print_help(self, root, visual=False, parent=None): + """ + Pretty prints the tree. + - root is the highest-level node you wish to start printing + - [visual] controls whether a png should be created, by default, this is false. + - [parent] is an optional parameter specifying the parent of a given node, should + only be used by this function. + + Returns the root with its children as an anytree node. + """ + if not root: + return None + if visual: + newroot = anytree.Node(str(root) + "(" + str(root.ident) + ")", parent=parent) + else: + newroot = anytree.Node(str(root), parent=parent) + + if root.left: + newroot.left = self.pretty_print_help(root.left, visual, parent=newroot) + else: + if not root.terminal: + # Drop never sends packets + newroot.left = anytree.Node(' ===> ', parent=newroot) + + if root.right: + newroot.right = self.pretty_print_help(root.right, visual, parent=newroot) + else: + if (not root.terminal and root.branching): + # Tamper only has one child + newroot.right = anytree.Node(' ===> ', parent=newroot) + + return newroot + + def pretty_print(self, visual=False): + """ + Pretty prints the tree. + """ + if visual: + newroot = self.pretty_print_help(self.action_root, visual=True) + if newroot: + DotExporter(newroot).to_picture("tree.png") + else: + newroot = self.pretty_print_help(self.action_root, visual=False) + if not newroot: + return "" + # use an array and join so there's never an issue with newlines at the end + pretty_string = [] + for pre, _fill, node in anytree.RenderTree(newroot): + pretty_string.append(("%s%s" % (pre, node.name))) + return "%s\n%s" % (str(self.trigger), '\n'.join(pretty_string)) diff --git a/actions/trigger.py b/actions/trigger.py new file mode 100644 index 0000000..5453270 --- /dev/null +++ b/actions/trigger.py @@ -0,0 +1,153 @@ +import actions.utils +import random +import re + +FIXED_TRIGGER = None +GAS_ENABLED = True + +class Trigger(object): + def __init__(self, trigger_type, trigger_field, trigger_proto, trigger_value=0, environment_id=None, gas=None): + """ + Params: + - trigger_type: the type of trigger. Only "field" (matching based on a field value) is currently supported. + - trigger_field: the field the trigger should check in a packet to trigger on + - trigger_proto: the protocol the trigger should look in to retrieve the trigger field + - trigger_value: the value in the trigger_field that, upon a match, will cause the trigger to fire + - environment_id: environment_id the current trigger is running under. Used to retrieve previously saved packets + - gas: how many times this trigger can fire before it stops triggering. gas=None disables gas (unlimited triggers.) + """ + self.trigger_type = trigger_type + self.trigger_field = trigger_field + self.trigger_proto = trigger_proto + self.trigger_value = trigger_value + self.environment_id = environment_id + self.num_seen = 0 + self.gas_remaining = gas + # Bomb triggers act like reverse triggers. They run the action only after the action has been triggered x times + self.bomb_trigger = bool(gas and gas < 0) + self.ran = False + + @staticmethod + def get_gas(): + """ + Returns a random value for gas for this trigger. + """ + if GAS_ENABLED and random.random() < 0.2: + # Use gas in 20% of scenarios + # Pick a number for gas between 0 - 5 + gas_remaining = int(random.random() * 5) + else: + # Do not use gas + gas_remaining = None + return gas_remaining + + def is_applicable(self, packet, logger): + """ + Checks if this trigger is applicable to a given packet. + """ + will_run = False + self.num_seen += 1 + if not packet.haslayer(self.trigger_proto): + return False + + packet_value = packet.get(self.trigger_proto, self.trigger_field) + will_run = (self.trigger_value == packet_value) + + # Track if this action is used + if (not GAS_ENABLED or self.gas_remaining is None) and will_run: + self.ran = True + # If this is a normal trigger and we are out of gas, do not run + elif not self.bomb_trigger and will_run and self.gas_remaining == 0: + will_run = False + # If this is a bomb trigger, run once gas hits zero + elif self.bomb_trigger and will_run and self.gas_remaining == 0: + self.ran = True + # If this is a normal trigger and we still have gas remaining, run and + # decrement the gas + elif will_run and self.gas_remaining > 0: + # Gas is enabled and we haven't run out yet, subtract one from our gas + self.gas_remaining -= 1 + self.ran = True + # A bomb trigger has negative gas - it does not allow the action to run until the trigger + # matches x times + elif will_run and self.gas_remaining < 0: + self.gas_remaining += 1 + will_run = False + + return will_run + + def __str__(self): + """ + Returns a string representation of this trigger in the form: + ::: + or + :: + """ + if self.gas_remaining is not None: + return str(self.trigger_proto)+":"+str(self.trigger_field)+":"+str(self.trigger_value)+":"+str(self.gas_remaining) + else: + return str(self.trigger_proto)+":"+str(self.trigger_field)+":"+str(self.trigger_value) + + def add_gas(self, gas): + """ + Adds gas to this trigger, gas is an integer + """ + if self.gas_remaining is not None: + self.gas_remaining += gas + + def set_gas(self, gas): + """ + Sets the gas to the specified value + """ + self.gas_remaining = gas + + def disable_gas(self): + """ + Disables the use of gas. + """ + self.gas_remaining = None + + def enable_gas(self): + """ + Sets gas to 0 + """ + self.gas_remaining = 0 + + @staticmethod + def parse(string): + """ + Given a string representation of a trigger, define a new Trigger represented + by this string. + """ + if string: + string = string.strip() + if string and string.startswith("["): + string = string[1:] + if string and string.endswith("]"): + string = string[:-1] + + # Trigger is currently a 4-way data tuple of pieces separated by a ":" + m = re.match("(\S*):(\S*):(\S*):(\S*)", string) + has_gas = True + if not m: + has_gas = False + m = re.match("(\S*):(\S*):(\S*)", string) + if not m: + return None + + trigger_type = "field" + proto = m.group(1) + field = m.group(2) + value = m.group(3) + + # Parse out the given value if necessary + value = actions.packet.Packet.parse(proto, field, value) + + # Trigger gas is set to None if it is disabled + trigger_gas = None + if has_gas: + trigger_gas = int(m.group(4)) + + # Define the new trigger with these parameters + t = Trigger(trigger_type, field, proto, value, gas=trigger_gas) + return t diff --git a/actions/utils.py b/actions/utils.py new file mode 100644 index 0000000..5bad6ee --- /dev/null +++ b/actions/utils.py @@ -0,0 +1,231 @@ +import copy +import datetime +import importlib +import inspect +import logging +import os +import string +import sys +import random +import urllib.parse + +import actions.action +import actions.trigger +import actions.packet + +from scapy.all import TCP, IP, UDP, rdpcap +import netifaces + + +RUN_DIRECTORY = datetime.datetime.now().strftime("%Y-%m-%d_%H:%M:%S") + +# Hard coded options +FLAGFOLDER = "flags" + +# Holds copy of console file handler's log level +CONSOLE_LOG_LEVEL = logging.DEBUG + + +BASEPATH = os.path.dirname(os.path.abspath(__file__)) +PROJECT_ROOT = os.path.dirname(BASEPATH) + + +def parse(requested_trees, logger): + """ + Parses a string representation of a solution into its object form. + """ + # First, strip off any hanging quotes at beginning/end of the strategy + if requested_trees.startswith("\""): + requested_trees = requested_trees[1:] + if requested_trees.endswith("\""): + requested_trees = requested_trees[:-1] + + # Define a blank strategy to initialize with the user specified string + strat = actions.strategy.Strategy([], []) + + # Actions for the in and out forest are separated by a "\/". + # Split the given string by this token + out_in_actions = requested_trees.split("\\/") + + # Specify that we're starting with the out forest before we parse the in forest + out = True + direction = "out" + # For each string representation of the action directions, in or out + for str_actions in out_in_actions: + # Individual action trees always end in "|" to signify the end - split the + # entire action sequence into individual trees + str_actions = str_actions.split("|") + + # For each string representation of each tree in the forest + for str_action in str_actions: + # If it's an empty action, skip it + if not str_action.strip(): + continue + + assert " " not in str_action.strip(), "Strategy includes a space - malformed!" + + # Get rid of hanging whitespace from the splitting + str_action = str_action.strip() + + # ActionTree uses the last "|" as a sanity check for well-formed + # strategies, so restore the "|" that was lost from the split + str_action = str_action + "|" + new_tree = actions.tree.ActionTree(direction) + new_tree.parse(str_action, logger) + + # Once all the actions are parsed, add this tree to the + # current direction of actions + if out: + strat.out_actions.append(new_tree) + else: + strat.in_actions.append(new_tree) + # Change the flag to tell it to parse the IN direction during the next loop iteration + out = False + direction = "in" + return strat + + +def get_logger(basepath, log_dir, logger_name, log_name, environment_id, log_level=logging.DEBUG): + """ + Configures and returns a logger. + """ + if type(log_level) == str: + log_level = log_level.upper() + global CONSOLE_LOG_LEVEL + full_path = os.path.join(basepath, log_dir, "logs") + if not os.path.exists(full_path): + os.makedirs(full_path) + flag_path = os.path.join(basepath, log_dir, "flags") + if not os.path.exists(flag_path): + os.makedirs(flag_path) + # Set up a client logger + logger = logging.getLogger(logger_name + environment_id) + logger.setLevel(logging.DEBUG) + # Disable the root logger to avoid double printing + logger.propagate = False + + # If we've already setup the handlers for this logger, just return it + if logger.handlers: + return logger + fh = logging.FileHandler(os.path.join(basepath, log_dir, "logs", "%s.%s.log" % (environment_id, log_name))) + fh.setLevel(logging.DEBUG) + + log_prefix = "[%s] " % log_name.upper() + formatter = logging.Formatter("%(asctime)s %(levelname)s:" + log_prefix + "%(message)s", datefmt="%Y-%m-%d %H:%M:%S") + file_formatter = logging.Formatter(log_prefix + "%(asctime)s %(message)s") + fh.setFormatter(file_formatter) + logger.addHandler(fh) + + ch = logging.StreamHandler() + ch.setFormatter(formatter) + ch.setLevel(log_level) + CONSOLE_LOG_LEVEL = log_level + logger.addHandler(ch) + return logger + + +def close_logger(logger): + """ + Closes open file handles for a given logger. + """ + # Close the file handles so we don't hold a ton of file descriptors open + handlers = logger.handlers[:] + for handler in handlers: + if isinstance(handler, logging.FileHandler): + handler.close() + + +class Logger(): + """ + Logging class context manager, as a thin wrapper around the logging class to help + handle closing open file descriptors. + """ + def __init__(self, log_dir, logger_name, log_name, environment_id, log_level=logging.DEBUG): + self.log_dir = log_dir + self.logger_name = logger_name + self.log_name = log_name + self.environment_id = environment_id + self.log_level = log_level + self.logger = None + + def __enter__(self): + """ + Sets up a logger. + """ + self.logger = get_logger(PROJECT_ROOT, self.log_dir, self.logger_name, self.log_name, self.environment_id, log_level=self.log_level) + return self.logger + + def __exit__(self, exc_type, exc_value, tb): + """ + Closes file handles. + """ + close_logger(self.logger) + + + +def get_console_log_level(): + """ + returns log level of console handler + """ + return CONSOLE_LOG_LEVEL + + +def string_to_protocol(protocol): + """ + Converts string representations of scapy protocol objects to + their actual objects. For example, "TCP" to the scapy TCP object. + """ + if protocol.upper() == "TCP": + return TCP + elif protocol.upper() == "IP": + return IP + elif protocol.upper() == "UDP": + return UDP + + +def get_id(): + """ + Returns a random ID + """ + return ''.join([random.choice(string.ascii_lowercase + string.digits) for k in range(8)]) + + +def setup_dirs(output_dir): + """ + Sets up Geneva folder structure. + """ + ga_log_dir = os.path.join(output_dir, "logs") + ga_flags_dir = os.path.join(output_dir, "flags") + ga_packets_dir = os.path.join(output_dir, "packets") + ga_generations_dir = os.path.join(output_dir, "generations") + ga_data_dir = os.path.join(output_dir, "data") + for directory in [ga_log_dir, ga_flags_dir, ga_packets_dir, ga_generations_dir, ga_data_dir]: + if not os.path.exists(directory): + os.makedirs(directory, exist_ok=True) + return ga_log_dir + + +def get_from_fuzzed_or_real_packet(environment_id, real_packet_probability, enable_options=True, enable_load=True): + """ + Retrieves a protocol, field, and value from a fuzzed or real packet, depending on + the given probability and if given packets is not None. + """ + packets = actions.utils.read_packets(environment_id) + if packets and random.random() < real_packet_probability: + packet = random.choice(packets) + return packet.get_random() + return actions.packet.Packet().gen_random() + + +def get_interface(): + """ + Chooses an interface on the machine to use for socket testing. + """ + ifaces = netifaces.interfaces() + for iface in ifaces: + if "lo" in iface: + continue + info = netifaces.ifaddresses(iface) + # Filter for IPv4 addresses + if netifaces.AF_INET in info: + return iface diff --git a/citation.bib b/citation.bib new file mode 100644 index 0000000..b3fddd6 --- /dev/null +++ b/citation.bib @@ -0,0 +1,11 @@ +K. Bock, G. Hughey, X. Qiang, and D. Levin, "Geneva: Evolving Censorship Evasion," in ACM Conference on Computer and Communications Security (CCS), 2019 + +Latex format: + +@inproceedings{geneva, + title = {{$\mathsf{Geneva}$: Evolving Censorship Evasion}}, + author = {Kevin Bock and George Hughey and Xiao Qiang and Dave Levin}, + booktitle=CCS, + year = {2019}, +} + diff --git a/engine.py b/engine.py new file mode 100644 index 0000000..7f45e08 --- /dev/null +++ b/engine.py @@ -0,0 +1,338 @@ +""" +Engine + +Given a strategy and a server port, the engine configures NFQueue +so the strategy can run on the underlying connection. +""" + +import argparse +import logging +logging.getLogger("scapy.runtime").setLevel(logging.ERROR) +import os +import socket +import subprocess +import threading +import time + +import netfilterqueue + +from scapy.layers.inet import IP +from scapy.utils import wrpcap +from scapy.config import conf + +socket.setdefaulttimeout(1) + +import actions.packet +import actions.strategy +import actions.utils + +BASEPATH = os.path.dirname(os.path.abspath(__file__)) + + +class Engine(): + def __init__(self, server_port, string_strategy, environment_id=None, output_directory="trials", log_level="info"): + self.server_port = server_port + self.seen_packets = [] + # Set up the directory and ID for logging + if not output_directory: + output_directory = "trials" + actions.utils.setup_dirs(output_directory) + if not environment_id: + environment_id = actions.utils.get_id() + + self.environment_id = environment_id + # Set up a logger + self.logger = actions.utils.get_logger(BASEPATH, + output_directory, + __name__, + "engine", + environment_id, + log_level=log_level) + self.output_directory = output_directory + + # Used for conditional context manager usage + self.strategy = actions.utils.parse(string_strategy, self.logger) + # Setup variables used by the NFQueue system + self.out_nfqueue_started = False + self.in_nfqueue_started = False + self.running_nfqueue = False + self.out_nfqueue = None + self.in_nfqueue = None + self.out_nfqueue_socket = None + self.in_nfqueue_socket = None + self.out_nfqueue_thread = None + self.in_nfqueue_thread = None + self.censorship_detected = False + # Specifically define an L3Socket to send our packets. This is an optimization + # for scapy to send packets more quickly than using just send(), as under the hood + # send() creates and then destroys a socket each time, imparting a large amount + # of overhead. + self.socket = conf.L3socket(iface=actions.utils.get_interface()) + + def __enter__(self): + """ + Allows the engine to be used as a context manager; simply launches the + engine. + """ + self.initialize_nfqueue() + return self + + def __exit__(self, exc_type, exc_value, tb): + """ + Allows the engine to be used as a context manager; simply stops the engine + """ + self.shutdown_nfqueue() + + def mysend(self, packet): + """ + Helper scapy sending method. Expects a Geneva Packet input. + """ + try: + self.logger.debug("Sending packet %s", str(packet)) + self.socket.send(packet.packet) + except Exception: + self.logger.exception("Error in engine mysend.") + + def delayed_send(self, packet, delay): + """ + Method to be started by a thread to delay the sending of a packet without blocking the main thread. + """ + self.logger.debug("Sleeping for %f seconds." % delay) + time.sleep(delay) + self.mysend(packet) + + def run_nfqueue(self, nfqueue, nfqueue_socket, direction): + """ + Handles running the outbound nfqueue socket with the socket timeout. + """ + try: + while self.running_nfqueue: + try: + if direction == "out": + self.out_nfqueue_started = True + else: + self.in_nfqueue_started = True + + nfqueue.run_socket(nfqueue_socket) + except socket.timeout: + pass + except Exception: + self.logger.exception("Exception out of run_nfqueue() (direction=%s)", direction) + + def configure_iptables(self, remove=False): + """ + Handles setting up ipables for this run + """ + self.logger.debug("Configuring iptables rules") + + port1, port2 = "dport", "sport" + + out_chain = "OUTPUT" + in_chain = "INPUT" + + # Switch whether the command should add or delete the rules + add_or_remove = "A" + if remove: + add_or_remove = "D" + cmds = [] + for proto in ["tcp", "udp"]: + cmds += ["iptables -%s %s -p %s --%s %d -j NFQUEUE --queue-num 1" % + (add_or_remove, out_chain, proto, port1, self.server_port), + "iptables -%s %s -p %s --%s %d -j NFQUEUE --queue-num 2" % + (add_or_remove, in_chain, proto, port2, self.server_port)] + + for cmd in cmds: + self.logger.debug(cmd) + # If we're logging at DEBUG mode, keep stderr/stdout piped to us + # Otherwise, pipe them both to DEVNULL + if actions.utils.get_console_log_level() == logging.DEBUG: + subprocess.check_call(cmd.split(), timeout=60) + else: + subprocess.check_call(cmd.split(), stderr=subprocess.DEVNULL, stdout=subprocess.DEVNULL, timeout=60) + return cmds + + def initialize_nfqueue(self): + """ + Initializes the nfqueue for input and output forests. + """ + self.logger.debug("Engine created with strategy %s (ID %s) to port %s", + str(self.strategy).strip(), self.environment_id, self.server_port) + self.configure_iptables() + + self.out_nfqueue_started = False + self.in_nfqueue_started = False + self.running_nfqueue = True + # Create our NFQueues + self.out_nfqueue = netfilterqueue.NetfilterQueue() + self.in_nfqueue = netfilterqueue.NetfilterQueue() + # Bind them + self.out_nfqueue.bind(1, self.out_callback) + self.in_nfqueue.bind(2, self.in_callback) + # Create our nfqueue sockets to allow for non-blocking usage + self.out_nfqueue_socket = socket.fromfd(self.out_nfqueue.get_fd(), + socket.AF_UNIX, + socket.SOCK_STREAM) + self.in_nfqueue_socket = socket.fromfd(self.in_nfqueue.get_fd(), + socket.AF_UNIX, + socket.SOCK_STREAM) + # Create our handling threads for packets + self.out_nfqueue_thread = threading.Thread(target=self.run_nfqueue, + args=(self.out_nfqueue, self.out_nfqueue_socket, "out")) + + self.in_nfqueue_thread = threading.Thread(target=self.run_nfqueue, + args=(self.in_nfqueue, self.in_nfqueue_socket, "in")) + # Start each thread + self.in_nfqueue_thread.start() + self.out_nfqueue_thread.start() + + maxwait = 100 # 100 time steps of 0.01 seconds for a max wait of 10 seconds + i = 0 + # Give NFQueue time to startup, since it's running in background threads + # Block the main thread until this is done + while (not self.in_nfqueue_started or not self.out_nfqueue_started) and i < maxwait: + time.sleep(0.1) + i += 1 + self.logger.debug("NFQueue Initialized after %d", int(i)) + + def shutdown_nfqueue(self): + """ + Shutdown nfqueue. + """ + self.logger.debug("Shutting down NFQueue") + self.out_nfqueue_started = False + self.in_nfqueue_started = False + self.running_nfqueue = False + # Give the handlers two seconds to leave the callbacks before we forcibly unbind + # the queues. + time.sleep(2) + if self.in_nfqueue: + self.in_nfqueue.unbind() + if self.out_nfqueue: + self.out_nfqueue.unbind() + self.configure_iptables(remove=True) + + packets_path = os.path.join(BASEPATH, + self.output_directory, + "packets", + "original_%s.pcap" % self.environment_id) + + # Write to disk the original packets we captured + wrpcap(packets_path, [p.packet for p in self.seen_packets]) + + # If the engine exits before it initializes for any reason, these threads may not be set + # Only join them if they are defined + if self.out_nfqueue_thread: + self.out_nfqueue_thread.join() + if self.in_nfqueue_thread: + self.in_nfqueue_thread.join() + + # Shutdown the logger + actions.utils.close_logger(self.logger) + + def out_callback(self, nfpacket): + """ + Callback bound to the outgoing nfqueue rule to run the outbound strategy. + """ + if not self.running_nfqueue: + return + + packet = actions.packet.Packet(IP(nfpacket.get_payload())) + self.logger.debug("Received outbound packet %s", str(packet)) + + # Record this packet for a .pacp later + self.seen_packets.append(packet) + + # Drop the packet in NFQueue so the strategy can handle it + nfpacket.drop() + + self.handle_packet(packet) + + def handle_packet(self, packet): + """ + Handles processing an outbound packet through the engine. + """ + packets_to_send = self.strategy.act_on_packet(packet, self.logger, direction="out") + + # Send all of the packets we've collected to send + for out_packet in packets_to_send: + # If the strategy requested us to sleep before sending this packet, do so here + if out_packet.sleep: + # We can't block the main sending thread, so instead spin off a new thread to handle sleeping + threading.Thread(target=self.delayed_send, args=(out_packet, out_packet.sleep)).start() + else: + self.mysend(out_packet) + + def in_callback(self, nfpacket): + """ + Callback bound to the incoming nfqueue rule. Since we can't + manually send packets to ourself, process the given packet here. + """ + if not self.running_nfqueue: + return + packet = actions.packet.Packet(IP(nfpacket.get_payload())) + + self.seen_packets.append(packet) + + self.logger.debug("Received packet: %s", str(packet)) + + # Run the given strategy + packets = self.strategy.act_on_packet(packet, self.logger, direction="in") + + # GFW will send RA packets to disrupt a TCP stream + if packet.haslayer("TCP") and packet.get("TCP", "flags") == "RA": + self.logger.debug("Detected GFW censorship - strategy failed.") + self.censorship_detected = True + + # Branching is disabled for the in direction, so we can only ever get + # back 1 or 0 packets. If zero, drop the packet. + if not packets: + nfpacket.drop() + return + + # Otherwise, overwrite this packet with the packet the action trees gave back + nfpacket.set_payload(bytes(packets[0])) + + # If the strategy requested us to sleep before accepting on this packet, do so here + if packets[0].sleep: + time.sleep(packets[0].sleep) + + # Accept the modified packet + nfpacket.accept() + + +def get_args(): + """ + Sets up argparse and collects arguments. + """ + parser = argparse.ArgumentParser(description='The engine that runs a given strategy.') + parser.add_argument('--server-port', type=int, action='store', required=True) + parser.add_argument('--environment-id', action='store', help="ID of the current strategy under test. If not provided, one will be generated.") + parser.add_argument('--strategy', action='store', help="Strategy to deploy") + parser.add_argument('--output-directory', default="trials", action='store', help="Where to output logs, captures, and results. Defaults to trials/.") + parser.add_argument('--log', action='store', default="debug", + choices=("debug", "info", "warning", "critical", "error"), + help="Sets the log level") + + args = parser.parse_args() + return args + + +def main(args): + """ + Kicks off the engine with the given arguments. + """ + try: + eng = Engine(args["server_port"], + args["strategy"], + environment_id=args.get("environment_id"), + output_directory = args.get("output_directory"), + log_level=args["log"]) + eng.initialize_nfqueue() + while True: + time.sleep(0.5) + finally: + eng.shutdown_nfqueue() + + +if __name__ == "__main__": + main(vars(get_args())) diff --git a/examples/context_manager.py b/examples/context_manager.py new file mode 100644 index 0000000..b1fcd72 --- /dev/null +++ b/examples/context_manager.py @@ -0,0 +1,17 @@ +import os +import sys + +# Add the path to the engine so we can import it +BASEPATH = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) +sys.path.append(BASEPATH) + +import engine + +# Port to run the engine on +port = 80 +# Strategy to use +strategy = "[TCP:flags:A]-duplicate(tamper{TCP:flags:replace:R}(tamper{TCP:chksum:corrupt},),)-| \/" + +# Create the engine in debug mode +with engine.Engine(port, strategy, log_level="debug") as eng: + os.system("curl http://example.com?q=ultrasurf") diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..256313d --- /dev/null +++ b/requirements.txt @@ -0,0 +1,6 @@ +scapy==2.4.3 +requests +netifaces +netfilterqueue +cryptography==2.5 +requests diff --git a/strategies.md b/strategies.md new file mode 100644 index 0000000..4ca519b --- /dev/null +++ b/strategies.md @@ -0,0 +1,29 @@ +The following is a library of strategies, the countries they work in, and their average success rates in those countries. See the readme or our paper for an explanation of the strategy DNA format. + + +| Strategy | China | Kazakhstan | India | +|------------------------------------------------------------------------------------------------------ |------- |------------ |------- | +| `[TCP:flags:PA]-duplicate(tamper{TCP:dataofs:replace:10}(tamper{TCP:chksum:corrupt},),)-\|` | 98% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:dataofs:replace:10}(tamper{IP:ttl:replace:10},),)-\|` | 98% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:dataofs:replace:10}(tamper{TCP:ack:corrupt},),)-\|` | 94% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:options-wscale:corrupt}(tamper{TCP:dataofs:replace:8},),)-\|` | 98% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:load:corrupt}(tamper{TCP:chksum:corrupt},),)-\|` | 80% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:load:corrupt}(tamper{IP:ttl:replace:8},),)-\|` | 98% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:load:corrupt}(tamper{TCP:ack:corrupt},),)-\|` | 87% | 100% | 0% | +| `[TCP:flags:S]-duplicate(,tamper{TCP:load:corrupt})-\|` | 3% | 100% | 0% | +| `[TCP:flags:PA]-duplicate(tamper{IP:len:replace:64},)-\|` | 3% | 0% | 100% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:R}(tamper{TCP:chksum:corrupt},))-\|` | 95% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:R}(tamper{IP:ttl:replace:10},))-\|` | 87% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:options-md5header:corrupt}(tamper{TCP:flags:replace:R},))-\|` | 86% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:RA}(tamper{TCP:chksum:corrupt},))-\|` | 80% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:RA}(tamper{IP:ttl:replace:10},))-\|` | 94% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:options-md5header:corrupt}(tamper{TCP:flags:replace:R},))-\|` | 94% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:FRAPUEN}(tamper{TCP:chksum:corrupt},))-\|` | 89% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:FREACN}(tamper{IP:ttl:replace:10},))-\|` | 96% | 0% | 0% | +| `[TCP:flags:A]-duplicate(,tamper{TCP:flags:replace:FRAPUN}(tamper{TCP:options-md5header:corrupt},))-\|` | 94% | 0% | 0% | +| `[TCP:flags:PA]-fragment{tcp:8:False}-\| [TCP:flags:A]-tamper{TCP:seq:corrupt}-\|` | 94% | 100% | 100% | +| `[TCP:flags:PA]-fragment{tcp:8:True}(,fragment{tcp:4:True})-\|` | 98% | 100% | 100% | +| `[TCP:flags:PA]-fragment{tcp:-1:True}-\| ` | 3% | 100% | 100% | +| `[TCP:flags:PA]-duplicate(tamper{TCP:flags:replace:F}(tamper{IP:len:replace:78},),)-\| ` | 53% | 0% | 100% | +| `[TCP:flags:S]-duplicate(tamper{TCP:flags:replace:SA},)-\|` | 3% | 100% | 0% | +| `[TCP:flags:PA]-tamper{TCP:options-uto:corrupt}-\| ` | 3% | 0% | 100% |