Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
Version: $Revision$
Last-Modified: $Date$
Author: Nick Mathewson
Created:
Status: Open

Overview:

   This document proposes moving information that clients never use out of
   regular router descriptors and into a special "long form" router
   descriptor.

   It presents options; it is not yet a complete proposal.

Proposal:

   Some of the costliest fields in the current directory protocol are ones
   that no client actually uses.  In particular, the "read-history" and
   "write-history" fields are used only by the authorities for monitoring the
   status of the network.  If we took them out, the size of a compressed list
   of all the routers would fall by about 60%.  (No other disposable field
   would save more than 2%.)
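
   The size comparison above can be reproduced roughly as follows.  This is
   a minimal sketch, not the measurement the author ran: the SAMPLE
   descriptor is a made-up toy, the helper names strip_fields and
   compressed_savings are hypothetical, and zlib is assumed as the
   compressor.  Real savings depend on the actual router list.

```python
import zlib
from typing import Iterable, List

# Keywords the proposal identifies as unused by clients.
HISTORY_KEYWORDS = ("read-history", "write-history")

# Toy descriptor for illustration only; not a real router.
SAMPLE = (
    "router example 192.0.2.1 9001 0 0\n"
    "bandwidth 51200 102400 48231\n"
    "read-history 2007-01-30 12:00:00 (900 s) 81234,90211,77120,65002,93311\n"
    "write-history 2007-01-30 12:00:00 (900 s) 80111,91002,76554,64873,92870\n"
    "reject *:*\n"
)


def strip_fields(descriptor: str,
                 keywords: Iterable[str] = HISTORY_KEYWORDS) -> str:
    """Return the descriptor without lines whose keyword is in `keywords`."""
    drop = set(keywords)
    kept = [line for line in descriptor.splitlines()
            if line.split(" ", 1)[0] not in drop]
    return "\n".join(kept) + "\n"


def compressed_savings(descriptors: List[str]) -> float:
    """Fraction by which the compressed list shrinks once fields are stripped."""
    full = zlib.compress("".join(descriptors).encode("ascii"))
    short = zlib.compress(
        "".join(strip_fields(d) for d in descriptors).encode("ascii"))
    return 1.0 - len(short) / len(full)


if __name__ == "__main__":
    print("savings: {:.0%}".format(compressed_savings([SAMPLE])))
```

   Passing a different keyword tuple (for example ("contact",)) gives the
   per-field figures discussed under "Other disposable fields".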

   One possible solution is for routers to generate and upload both a
   short-form and a long-form descriptor.  Only the short-form descriptor
   should ever be used by anybody for routing.  The long-form descriptor
   should be used only for analytics and other tools.  (If we allowed people
   to route with long descriptors, we'd have to ensure that they stayed in
   sync with the short ones somehow.  So let's not do that.)  We can ensure
   that only the short descriptors are used for routing by recommending only
   those in the network statuses.

   Another possible solution would be to drop these fields from descriptors,
   and have them uploaded as part of a separate "bandwidth report" to the
   authorities.  This could help prevent the mistake of using long
   descriptors in place of short ones.

Other disposable fields:

   Clients don't need these fields, but removing them doesn't help bandwidth
   enough to be worthwhile.
     contact (save about 1%)
     fingerprint (save about 3%)

   We could represent these fields more succinctly, but removing them would
   only save 1%.  (!)
     reject
     accept
   (Apparently, exit policies are highly compressible.)

Issues:

   Indexing long descriptors or bandwidth reports presents an issue: right
   now, the way to make sure you have the same copy of a descriptor as
   everyone else is to request the descriptor by its digest, and to make sure
   that the digest you request is the one that the authorities recommend.

   Authorities should presumably list the digests of short descriptors, since
   that's what most everybody will be using.  Including a second digest for
   long descriptors/bandwidth reports in the networkstatus would only bloat it
   with information nobody wants.
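
   The fetch-by-digest check described above might look like the sketch
   below.  It makes simplifying assumptions: the digest is taken to be a
   SHA-1 over the entire document (the real digest may cover only part of
   the descriptor), and the names descriptor_digest and accept_descriptor
   are hypothetical.

```python
import hashlib
from typing import Set


def descriptor_digest(descriptor: bytes) -> str:
    """Hex digest of a descriptor.

    Assumption: the whole document is hashed with SHA-1.
    """
    return hashlib.sha1(descriptor).hexdigest().upper()


def accept_descriptor(descriptor: bytes, requested_digest: str,
                      status_digests: Set[str]) -> bool:
    """Accept a fetched descriptor only if the digest we asked for is one
    the authorities list, and the body we received really hashes to it."""
    requested = requested_digest.upper()
    return (requested in status_digests
            and descriptor_digest(descriptor) == requested)
```

   Because the network status lists only short-descriptor digests, this
   check has nothing to compare a long descriptor against; that gap is what
   the solutions below try to close.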

   Possible solutions are:
     - Drop the property that you can be sure of having the same long
       descriptor as others.  This seems suboptimal.
     - Have a separate extra-information-status that also gets generated by
       the authorities; use it to tell which long descriptors others have.
       Also a pain.
     - Have short descriptors include a hash of the corresponding long
       descriptor/extra-info.  This would keep the same order of magnitude
       performance increase (~59.2% savings as opposed to 61% savings).
       This would require longdesc/extra-info downloaders to fetch
       router data before they could know which longdescs/extra-info to
       fetch.
     - Have each authority make a signed concatenated "extra info" document,
       and hope we never need to reconcile them.
     - ????
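
   The third option above (a short descriptor that carries a hash of its
   long descriptor) can be sketched as follows.  The keyword
   "long-descriptor-digest" and all function names are hypothetical and
   chosen only for illustration; the proposal does not fix a line format,
   and SHA-1 over the whole long document is an assumption.

```python
import hashlib
from typing import Optional

# Hypothetical keyword; the proposal does not name one.
DIGEST_KEYWORD = "long-descriptor-digest"


def _digest(document: str) -> str:
    # Assumption: SHA-1 over the entire long descriptor.
    return hashlib.sha1(document.encode("ascii")).hexdigest().upper()


def add_long_digest(short_desc: str, long_desc: str) -> str:
    """Append a line to the short descriptor naming its long descriptor."""
    return short_desc + "{} {}\n".format(DIGEST_KEYWORD, _digest(long_desc))


def long_digest_from_short(short_desc: str) -> Optional[str]:
    """Read the long-descriptor digest back out, or None if absent."""
    for line in short_desc.splitlines():
        parts = line.split(" ", 1)
        if parts[0] == DIGEST_KEYWORD and len(parts) == 2:
            return parts[1].strip()
    return None


def long_matches_short(short_desc: str, long_desc: str) -> bool:
    """True if the long descriptor is the one the short descriptor names."""
    expected = long_digest_from_short(short_desc)
    return expected is not None and expected == _digest(long_desc)
```

   A downloader has to parse the short descriptor first to learn which
   digest to ask for, which is the extra round trip the option mentions.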

Migration:

   For long/short descriptor approach:
   * First:
     * Authorities should accept both, now, and silently drop short
       descriptors.
     * Routers should upload both once authorities accept them.
     * There should be a "long descriptor" URL and the current "normal" URL.
       Authorities should serve long descriptors from both URLs.
   * Once tools that want long descriptors support fetching them from the
     "long descriptor" URL:
     * Have authorities remember short descriptors, and serve them from the
       "normal" URL.
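
   The two migration phases for the long/short approach can be summarized
   as a small decision table.  This is only an illustration of the steps
   listed above; the Phase enum, the "short"/"long"/"normal" strings, and
   the function names are hypothetical and do not describe any existing
   authority code.

```python
from enum import Enum


class Phase(Enum):
    FIRST = 1   # tools still fetch long descriptors from the normal URL
    SECOND = 2  # tools have moved to the "long descriptor" URL


def should_store(phase: Phase, form: str) -> bool:
    """Authorities accept both forms in either phase, but silently drop
    short descriptors until the second phase."""
    if form == "long":
        return True
    if form == "short":
        return phase is Phase.SECOND
    raise ValueError("unknown descriptor form: {!r}".format(form))


def form_served(phase: Phase, url_kind: str) -> str:
    """Which descriptor form an authority serves from each URL."""
    if url_kind == "long":
        return "long"
    if url_kind == "normal":
        return "long" if phase is Phase.FIRST else "short"
    raise ValueError("unknown URL kind: {!r}".format(url_kind))
```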

   For bandwidth info approach:
   * First:
     * Rename it; it won't be just bandwidth forever.
     * Authorities should accept bandwidth info.
     * Routers should upload bandwidth info once authorities accept it.
     * There should be a way to download bandwidth info.
   * Once tools that want bandwidth info support fetching it:
     * Have routers stop including bandwidth info in their router
       descriptors.
  82. * Once tools that want bandwidth info support fetching it:
  83. * Have routers stop including bandwidth info in their router
  84. descriptors.