README.geoip -- information on the IP-to-country-code file shipped with tor
===========================================================================
The IP-to-country-code file in src/config/geoip is based on MaxMind's
GeoLite Country database with the following modifications:

 - Those "A1" ("Anonymous Proxy") entries lying in between two entries
   with the same country code are automatically changed to that country
   code. These changes can be overridden by specifying a different
   country code in src/config/geoip-manual.

 - Other "A1" entries are replaced with country codes specified in
   src/config/geoip-manual, or are left as is if there is no corresponding
   entry in that file. Even non-"A1" entries can be modified by adding a
   replacement entry to src/config/geoip-manual. Handle with care.
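The two rules above can be sketched in a few lines of Python. This is
only an illustration, not the code in deanonymind.py: the function name
`resolve_a1`, the tuple layout, and the `manual` dict (standing in for
src/config/geoip-manual) are assumptions. The sketch also assumes that a
run of several consecutive "A1" entries is compared against the nearest
non-"A1" neighbor on each side, and it does not check that neighboring
ranges are contiguous.

```python
def resolve_a1(entries, manual=None):
    """Apply the automatic and manual A1 rules to a sorted list of ranges.

    entries: list of (start_num, end_num, country_code), sorted by start_num.
    manual:  hypothetical stand-in for geoip-manual, mapping
             (start_num, end_num) to a replacement country code.
    """
    manual = manual or {}
    codes = [code for _, _, code in entries]
    result = []
    for i, (start, end, code) in enumerate(entries):
        if (start, end) in manual:
            # A manual entry wins, even for non-A1 entries.
            code = manual[(start, end)]
        elif code == "A1":
            # Nearest non-A1 neighbor on each side, taken from the input data.
            before = next((c for c in reversed(codes[:i]) if c != "A1"), None)
            after = next((c for c in codes[i + 1:] if c != "A1"), None)
            if before is not None and before == after:
                code = before
        # A1 entries with differing neighbors and no manual entry stay A1.
        result.append((start, end, code))
    return result
```

For example, an "A1" range between two "US" ranges becomes "US", while an
"A1" range between a "US" and a "DE" range stays "A1" unless `manual`
names a replacement for it.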
1. Updating the geoip file from a MaxMind database file
-------------------------------------------------------

Download the most recent MaxMind GeoLite Country database:
http://geolite.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip

Run `python deanonymind.py` in the local directory. Review the output to
learn which automatic and manual changes were applied, and watch out for
any warnings.

If necessary, edit geoip-manual to make more, fewer, or different manual
changes, then re-run `python deanonymind.py`.

When done, prepend a comment like this to the new geoip file:

  # Last updated based on $DATE Maxmind GeoLite Country
  # See README.geoip for details on the conversion.
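Prepending the comment can be done in any editor. As a minimal sketch, the
same step in Python might look like the following. The helper name
`prepend_update_comment` is hypothetical, and the format of the date is
left to the caller because the text above only specifies `$DATE`.

```python
def prepend_update_comment(path, date):
    """Insert the two-line update comment at the top of the geoip file."""
    header = (
        "# Last updated based on %s Maxmind GeoLite Country\n"
        "# See README.geoip for details on the conversion.\n" % date
    )
    # Read the whole file first, then rewrite it with the header in front.
    with open(path, "r") as f:
        body = f.read()
    with open(path, "w") as f:
        f.write(header + body)
```

Calling `prepend_update_comment("geoip", date)` with the release date of
the downloaded database rewrites the file in place, so run it only once
per update.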
2. Verifying automatic and manual changes using diff
----------------------------------------------------

To unzip the original MaxMind file and look at the automatic changes, run:

  unzip GeoIPCountryCSV.zip
  diff -U1 GeoIPCountryWhois.csv AutomaticGeoIPCountryWhois.csv

To look at subsequent manual changes, run:

  diff -U1 AutomaticGeoIPCountryWhois.csv ManualGeoIPCountryWhois.csv

To manually generate the geoip file and compare it to the automatically
created one, run:

  cut -d, -f3-5 < ManualGeoIPCountryWhois.csv | sed 's/"//g' > mygeoip
  diff -U1 geoip mygeoip
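For readers less familiar with cut and sed, the pipeline above keeps
comma-separated fields 3 to 5 of each line and then removes every double
quote. A rough Python equivalent is sketched below. The function names are
made up for this example, and the sample row mentioned afterwards is an
illustrative line in the assumed MaxMind CSV layout (start IP, end IP,
start number, end number, country code, country name), not data taken
from a real database file.

```python
def to_geoip_line(csv_line):
    """Mirror `cut -d, -f3-5 | sed 's/"//g'` for a single line."""
    # Like cut, split on every comma without interpreting quotes.
    fields = csv_line.rstrip("\r\n").split(",")
    # Fields 3-5 (1-based) are indexes 2..4; then drop all double quotes.
    return ",".join(fields[2:5]).replace('"', "")


def convert(src_path, dst_path):
    """Write the reduced form of every line in src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(to_geoip_line(line) + "\n")
```

With this sketch, a row such as
`"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"` is reduced
to `16777216,16777471,AU`. Commas inside the country name do not matter
here, because the name only starts in field 6.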
3. Verifying automatic and manual changes using blockfinder
-----------------------------------------------------------

Blockfinder is a tool for handling multiple IP-to-country data sources.
Given a country code, it can compare conflicting country code assignments
across those data sources.

We can use blockfinder to compare A1 entries in the original MaxMind file
with the same or overlapping blocks in the file generated above and in the
RIR delegation files:

  git clone https://github.com/ioerror/blockfinder
  cd blockfinder/
  python blockfinder -i
  python blockfinder -r ../GeoIPCountryWhois.csv
  python blockfinder -r ../ManualGeoIPCountryWhois.csv
  python blockfinder -p A1 > A1-comparison.txt
The output marks conflicts between assignments using either '*' in case of
two different opinions or '#' for three or more different opinions about
the country code for a given block.
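The marker rule can be written down as a small function. This is not
blockfinder's code, only a sketch of the rule as described here; the name
`conflict_marker` is hypothetical, and returning an empty string when all
sources agree is an assumption of this example.

```python
def conflict_marker(opinions):
    """Return the conflict marker for one block.

    opinions: the country codes that the data sources assign to the block.
    """
    distinct = set(opinions)
    if len(distinct) >= 3:
        return "#"  # three or more different opinions
    if len(distinct) == 2:
        return "*"  # exactly two different opinions
    return ""       # all sources agree (assumed: no marker)
```

For a block that the original MaxMind file lists as "A1" while both other
sources say "US", `conflict_marker(["A1", "US", "US"])` gives '*'. If the
third source says "DE" instead, the result is '#'.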
The '*' conflicts are most likely harmless, because there will always be
at least two opinions: the original MaxMind file says A1, while the other
two sources say something more meaningful.

However, watch out for '#' conflicts. In these cases, the original
MaxMind file ("A1"), the updated MaxMind file (hopefully the correct
country code), and the RIR delegation files (some other country code) all
disagree.

There are perfectly valid cases where the updated MaxMind file and the RIR
delegation files don't agree, but each of those cases must be verified
manually.