@@ -0,0 +1,45 @@
+# Belarus 2020--2021
+
+This repo contains data analysis related to the [Tor blocking events in
+Belarus from
+2020-2021](https://gitlab.torproject.org/tpo/anti-censorship/censorship-analysis/-/blob/main/reports/2020/belarus/2020-belarus-report.md).
+In particular, in late February 2021, the censor apparently enumerated
+the set of obfs4 bridges that were distributed via email and blocked
+those bridges. Given the set of 1890 bridges that were distributed in
+February 2021 prior to the 22nd and the subset of 93 email-distributed
+obfs4 bridges, our goal was to detect, from low connection counts in
+Belarus, that the 93 email-distributed obfs4 bridges were blocked,
+without falsely flagging the other bridges as blocked.
+
+Our main finding is that the 93 email-distributed obfs4 bridges that
+were blocked were not commonly used in Belarus prior to the censorship
+event. Very low connection counts (e.g., 0) were common, so we were
+unable to infer censorship of these bridges based on this signal.
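
To make the detection problem concrete, here is a minimal sketch of a threshold test on per-bridge connection counts. The function name `evaluate_threshold`, the bridge labels, and the counts are all made up for illustration; they are not taken from the dataset or from the scripts in this repo. The point is only that when unblocked bridges also report a count of 0, a low-count rule flags them too.

```python
def evaluate_threshold(counts, blocked, threshold=0):
    """Flag bridges whose count is <= threshold.

    Returns (true_positives, false_positives) relative to the known
    set of blocked bridges.
    """
    flagged = {fp for fp, n in counts.items() if n <= threshold}
    return len(flagged & blocked), len(flagged - blocked)


# Toy, made-up counts (hypothetical bridges, not real data).
counts = {"A": 0, "B": 0, "C": 0, "D": 8, "E": 16}
blocked = {"A", "B"}  # pretend these two were the blocked ones

true_pos, false_pos = evaluate_threshold(counts, blocked)
print(true_pos, false_pos)
```

In this toy case both blocked bridges are flagged, but so is bridge `C`, which was never blocked and simply had no users from the country.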
+
+## Reproducing our results
+
+Dependencies:
+- bash
+- curl
+- python3
+- numpy
+
+To reproduce our results, just run `./run.sh`. **This will take a long
+time (about 12.5 hours on my device) and require a few GB of free
+space.** The reason is that it needs to download all extra-info records
+for all bridges from 2020-07 to 2021-04 from the Tor Project's CollecTor
+service (734 MB compressed, about 20 GB uncompressed), extract that
+data, and obtain the needed information (bridge fingerprint, record
+date, and connection count from Belarus) from each record. There are A
+LOT of these records.
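
As a rough illustration of the per-record extraction step, the sketch below pulls the three fields out of a single extra-info record. It assumes the record has an `extra-info` line carrying the fingerprint, a `bridge-stats-end` line carrying the date, and a `bridge-ips` line with per-country counts in which `by` is Belarus. The function name `parse_record` and the sample record are hypothetical, and this is not the code used in `get-bridge-data.sh`.

```python
def parse_record(text):
    """Return (fingerprint, date, by_count) for one record, or None.

    None is returned when the record lacks any of the three lines.
    A bridge-ips line without a "by" entry yields a count of 0.
    """
    fingerprint = None
    date = None
    by_count = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "extra-info" and len(parts) >= 3:
            fingerprint = parts[2]
        elif parts[0] == "bridge-stats-end" and len(parts) >= 2:
            date = parts[1]  # keep only the YYYY-MM-DD part
        elif parts[0] == "bridge-ips":
            by_count = 0
            if len(parts) >= 2:
                for item in parts[1].split(","):
                    country, _, number = item.partition("=")
                    if country == "by":
                        by_count = int(number)
    if fingerprint is None or date is None or by_count is None:
        return None
    return fingerprint, date, by_count


# Hypothetical record, written by hand for this example.
sample = """extra-info Unnamed 0123456789ABCDEF0123456789ABCDEF01234567
bridge-stats-end 2021-02-20 12:00:00 (86400 s)
bridge-ips by=8,ru=16
"""
record = parse_record(sample)
print(record)
```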
+
+To skip the download and extraction, run `./run.sh --fast`, which starts
+from a 6.7 MB archive (87 MB uncompressed) containing only the
+information we need.
+
+## Julian dates
+
+Note: The Julian date conversion in `get-bridge-data.sh` and
+`get-stats.py` yields values 1 day earlier than those reported by, e.g.,
+https://aa.usno.navy.mil/data/JulianDate. What matters is that these two
+files use the *same representation* of a date, which they do.
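
A small sketch of why a constant offset is harmless: the function below computes a Julian day number from a calendar date, with an optional `shift` that stands in for a convention that is off by a fixed number of days. The name `julian_day`, the `shift` parameter, and the constant are choices made for this example and are not taken from the two scripts. As long as both sides use the same `shift`, the same date maps to the same number and day differences are unchanged.

```python
from datetime import date

# Offset from Python's proleptic Gregorian ordinal to the Julian day
# number at noon UTC (2000-01-01 is JDN 2451545).
JDN_OFFSET = 1721425


def julian_day(d, shift=0):
    """Julian day number for date d, plus a constant shift in days."""
    return d.toordinal() + JDN_OFFSET + shift


start = date(2021, 2, 1)
end = date(2021, 2, 22)

standard_span = julian_day(end) - julian_day(start)
shifted_span = julian_day(end, shift=-1) - julian_day(start, shift=-1)
print(standard_span, shifted_span)
```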