- Filename: 159-exit-scanning.txt
- Title: Exit Scanning
- Author: Mike Perry
- Created: 13-Feb-2009
- Status: Open
- Overview:
- This proposal describes the implementation and integration of an
- automated exit node scanner for scanning the Tor network for malicious,
- misconfigured, firewalled or filtered nodes.
- Motivation:
- Tor exit nodes can be run by anyone with an Internet connection. Often,
- these users aren't fully aware of limitations of their networking
- setup. Content filters, antivirus software, advertisements injected by
- their service providers, malicious upstream providers, and the resource
- limitations of their computer or networking equipment have all been
- observed on the current Tor network.
- It is also possible that some nodes exist purely for malicious
- purposes. In the past, there have been intermittent instances of
- nodes spoofing SSH keys, as well as nodes being used for purposes of
- plaintext surveillance.
- While it is not realistic to expect to catch extremely targeted or
- completely passive malicious adversaries, the goal is to prevent
- malicious adversaries from deploying dragnet attacks against large
- segments of the Tor userbase.
- Scanning methodology:
- The first scans to be implemented are HTTP, HTML, Javascript, and
- SSL scans.
- The HTTP scan scrapes Google for URLs of common file types such as exe,
- msi, doc, dmg, etc. It then fetches these URLs both directly (Non-Tor)
- and through Tor, and compares the SHA1 hashes of the resulting content.
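- The hash comparison at the core of the HTTP scan can be sketched as
- follows. This is a minimal illustration, not the scanner's actual code:
- the function names are hypothetical, and the two byte strings are assumed
- to have been fetched already, one directly and one through the exit
- under test.

```python
import hashlib


def sha1_hex(content: bytes) -> str:
    """Return the SHA1 digest of the fetched content as a hex string."""
    return hashlib.sha1(content).hexdigest()


def http_scan_matches(direct_content: bytes, tor_content: bytes) -> bool:
    """True if the content fetched through Tor hashes to the same value
    as the content fetched directly (Non-Tor)."""
    return sha1_hex(direct_content) == sha1_hex(tor_content)
```

- A mismatch here is only a candidate failure; the URL-based filter
- described later in this section may still discard it.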
- The SSL scan downloads certificates for all IPs a domain will locally
- resolve to and compares these certificates to those seen over Tor. In
- the results for each scan, the scanner also notes whether the domain
- rotated certificates locally.
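- One way to picture the SSL comparison is sketched below. The names are
- hypothetical, certificates are assumed to be available as raw DER bytes,
- and "rotated locally" is approximated here as "more than one distinct
- certificate was observed locally", which is an assumption rather than
- the scanner's documented definition.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SSLScanResult:
    matches: bool          # Tor-side certificate was also seen locally
    rotated_locally: bool  # more than one distinct certificate seen locally


def cert_fingerprint(der_cert: bytes) -> str:
    """Fingerprint a certificate by hashing its raw bytes."""
    return hashlib.sha1(der_cert).hexdigest()


def compare_certs(local_certs: Iterable[bytes], tor_cert: bytes) -> SSLScanResult:
    """Compare the certificate seen over Tor against every certificate
    collected locally for the IPs the domain resolves to."""
    local_fps = {cert_fingerprint(cert) for cert in local_certs}
    return SSLScanResult(
        matches=cert_fingerprint(tor_cert) in local_fps,
        rotated_locally=len(local_fps) > 1,
    )
```

- Recording the rotation flag next to the match result lets a reviewer
- discount mismatches on domains that serve several certificates.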
- The HTML scan checks HTML, Javascript, and plugin content for
- modifications. Because of the dynamic nature of most of the web, the
- scanner has a number of built-in mechanisms for filtering out false
- positives. These are applied whenever a change is noticed between Tor
- and Non-Tor.
- All tests also share a URL-based false positive filter that
- automatically removes results retroactively if the number of failures
- exceeds a certain percentage of nodes tested with the URL.
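- The retroactive URL filter can be sketched as follows. The record type,
- the function name, and the 10% default threshold are all assumptions
- made for illustration; the proposal does not specify the percentage.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class ScanResult(NamedTuple):
    url: str      # URL that was tested
    node: str     # identifier of the exit node tested
    failed: bool  # True if Tor and Non-Tor content differed


def filter_false_positives(results: Iterable[ScanResult],
                           max_failure_fraction: float = 0.1) -> List[ScanResult]:
    """Return only the failures whose URL failed on at most
    max_failure_fraction of the nodes it was tested with."""
    results = list(results)
    tested = defaultdict(set)
    failed = defaultdict(set)
    for r in results:
        tested[r.url].add(r.node)
        if r.failed:
            failed[r.url].add(r.node)
    # A URL that fails across many exits is more likely dynamic content
    # than evidence of many independently misbehaving exits.
    noisy_urls = {
        url for url, nodes in tested.items()
        if len(failed[url]) / len(nodes) > max_failure_fraction
    }
    return [r for r in results if r.failed and r.url not in noisy_urls]
```

- Because the filter looks at all results for a URL, it can only be
- applied once enough nodes have been tested with that URL.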
- Deployment Stages:
- To avoid instances where bugs cause us to mark exit nodes as BadExit
- improperly, it is proposed that we begin use of the scanner in stages.
- 1. Manual Review:
- In the first stage, basic scans will be run by a small number of
- people while we stabilize the scanner. The scanner has the ability
- to resume crashed scans, and to rescan nodes that fail various
- tests.
- 2. Human Review:
- In the second stage, results will be automatically mailed to
- an email list of interested parties for review. We will also begin
- classifying failure types into three to four different severity
- levels, based on both the reliability of the test and the nature of
- the failure.
- 3. Automatic BadExit Marking:
- In the final stage, the scanner will begin marking exits depending
- on the failure severity level in one of three different ways: by
- node idhex, by node IP, or by node IP mask. A potential fourth, less
- severe category of results may still be delivered via email only for
- review.
- BadExit markings will be delivered in batches upon completion
- of whole-network scans, so that the final false positive
- filter has an opportunity to filter out URLs that exhibit
- dynamic content beyond what we can filter.
- Specification of Exit Marking:
- Technically, BadExit could be marked via SETCONF AuthDirBadExit over
- the control port, but this would allow full access to the directory
- authority configuration and operation.
- The approved-routers file could also be used, but currently it only
- supports fingerprints, and it also contains other data unrelated to
- exit scanning that would be difficult to coordinate.
- Instead, we propose a new badexit-routers file with the following
- keywords:
- BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
- BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]
- BadExitNet lines would follow the codepaths used by AuthDirBadExit to
- set authdir_badexit_policy, and BadExitFP lines would follow the
- codepaths used for the !badexit lines in the approved-routers file.
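- As an illustration of the proposed file format, the sketch below parses
- badexit-routers text into its two kinds of entries. It is not Tor's
- parser. The function name is hypothetical, a hexdigest is assumed to be
- 40 hexadecimal characters, exit patterns are kept as opaque strings
- rather than validated, and skipping blank lines is an assumption.

```python
import re
from typing import List, Tuple

# Assumption: a hexdigest is a 40-character hex-encoded identity digest.
_HEXDIGEST = re.compile(r"^[0-9A-Fa-f]{40}$")


def parse_badexit_routers(text: str) -> Tuple[List[str], List[str]]:
    """Return (exit patterns, fingerprints) found in the file text."""
    nets: List[str] = []
    fps: List[str] = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split()
        if not parts:
            continue  # blank line
        keyword, args = parts[0], parts[1:]
        if not args:
            # "1*" in the grammar means one or more arguments.
            raise ValueError(f"line {lineno}: {keyword} needs an argument")
        if keyword == "BadExitNet":
            nets.extend(args)
        elif keyword == "BadExitFP":
            for fp in args:
                if not _HEXDIGEST.match(fp):
                    raise ValueError(f"line {lineno}: bad hexdigest {fp!r}")
                fps.append(fp.upper())
        else:
            raise ValueError(f"line {lineno}: unknown keyword {keyword!r}")
    return nets, fps
```

- Rejecting unknown keywords outright keeps a damaged or tampered file
- from being partially applied.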
- The scanner would have exclusive ability to write, append, rewrite,
- and modify this file. Prior to building a new consensus vote, a
- participating Tor authority would read in a fresh copy.
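- Since the authority re-reads the file while the scanner may be
- rewriting it, the scanner side could replace the file atomically so
- that a reader never sees a half-written copy. The sketch below shows
- one such approach; it is a suggestion rather than part of the
- proposal, and the function name is hypothetical.

```python
import os
import tempfile
from typing import Iterable


def write_badexit_file(path: str, lines: Iterable[str]) -> None:
    """Write the lines to a temporary file in the same directory, then
    rename it over the target so readers see either the old or the new
    file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".badexit-")
    try:
        with os.fdopen(fd, "w") as tmp:
            for line in lines:
                tmp.write(line + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

- The temporary file is created next to the target because a rename is
- only atomic when both paths are on the same filesystem.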
- Security Implications:
- Aside from evading the scanner's detection, there are two additional
- high-level security considerations:
- 1. Ensure nodes cannot be marked BadExit by an adversary at will
- It is possible individual website owners will be able to target certain
- Tor nodes, but once they begin to attempt to fail more than the URL
- filter percentage of the exits, their sites will be automatically
- discarded.
- Failing specific nodes is possible, but scanned results are fully
- reproducible, and BadExits should be rare enough that humans are never
- fully removed from the loop.
- State (cookies, cache, etc.) does not persist in the scanner between
- exit nodes, so one exit node cannot bias the results of a later one.
- 2. Ensure that scanner compromise does not yield authority compromise
- Having a separate file that is under the exclusive control of the
- scanner allows us to heavily isolate the scanner from the Tor
- authority, potentially even running them on separate machines.