123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247 |
- Filename: xxx-what-uses-sha1.txt
- Title: Where does Tor use SHA-1 today?
- Authors: Nick Mathewson, Marian
- Created: 30-Dec-2008
- Status: Meta
- Introduction:
- Tor uses SHA-1 as a message digest. SHA-1 is showing its age:
- theoretical attacks for finding collisions against it get better
- every year or two, and it will likely be broken in practice before
- too long.
- According to smart crypto people, the SHA-2 functions (SHA-256, etc)
- share too much of SHA-1's structure to be very good. RIPEMD-160 is
- also based on flawed past hashes. Some people think other hash
- functions (e.g. Whirlpool and Tiger) are not as bad; most of these
- have not seen enough analysis to be used yet.
- Here is a 2006 paper about hash algorithms.
- http://www.sane.nl/sane2006/program/final-papers/R10.pdf
- (Todo: Ask smart crypto people.)
- By 2012, the NIST SHA-3 competition will be done, and with luck we'll
- have something good to switch too. But it's probably a bad idea to
- wait until 2012 to figure out _how_ to migrate to a new hash
- function, for two reasons:
- 1) It's not inconceivable we'll want to migrate in a hurry
- some time before then.
- 2) It's likely that migrating to a new hash function will
- require protocol changes, and it's easiest to make protocol
- changes backward compatible if we lay the groundwork in
- advance. It would suck to have to break compatibility with
- a big hard-to-test "flag day" protocol change.
- This document attempts to list everything Tor uses SHA-1 for today.
- This is the first step in getting all the design work done to switch
- to something else.
- This document SHOULD NOT be a clearinghouse of what to do about our
- use of SHA-1. That's better left for other individual proposals.
- Why now?
- The recent publication of "MD5 considered harmful today: Creating a
- rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob
- Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de
- Weger has reminded me that:
- * You can't rely on theoretical attacks to stay theoretical.
- * It's quite unpleasant when theoretical attacks become practical
- and public on days you were planning to leave for vacation.
- * Broken hash functions (which SHA-1 is not quite yet AFAIU)
- should be dropped like hot potatoes. Failure to do so can make
- one look silly.
- Triage
- How severe are these problems? Let's divide them into these
- categories, where H(x) is the SHA-1 hash of x:
- PREIMAGE -- find any x such that a H(x) has a chosen value
- -- A SHA-1 usage that only depends on preimage
- resistance
- * Also SECOND PREIMAGE. Given x, find a y not equal to
- x such that H(x) = H(y)
- COLLISION<role> -- A SHA-1 usage that depends on collision
- resistance, but the only party who could mount a
- collision-based attack is already in a trusted role
- (like a distribution signer or a directory authority).
- COLLISION -- find any x and y such that H(x) = H(y) -- A
- SHA-1 usage that depends on collision resistance
- and doesn't need the attacker to have any special keys.
- There is no need to put much effort into fixing PREIMAGE and SECOND
- PREIMAGE usages in the near-term: while there have been some
- theoretical results doing these attacks against SHA-1, they don't
- seem to be close to practical yet. To fix COLLISION<code-signing>
- usages is not too important either, since anyone who has the key to
- sign the code can mount far worse attacks. It would be good to fix
- COLLISION<authority> usages, since we try to resist bad authorities
- to a limited extent. The COLLISION usages are the most important
- to fix.
- Kelsey and Schneier published a theoretical second preimage attack
- against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE
- and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes
- require minimal effort.
- http://www.schneier.com/paper-preimages.html
- Additionally, we need to consider the impact of a successful attack
- in each of these cases. SHA-1 collisions are still expensive even
- if recent results are verified, and anybody with the resources to
- compute one also has the resources to mount a decent Sybil attack.
- Let's be pessimistic, and not assume that producing collisions of
- a given format is actually any harder than producing collisions at
- all.
- What Tor uses hashes for today:
- 1. Infrastructure.
- A. Our X.509 certificates are signed with SHA-1.
- COLLSION
- B. TLS uses SHA-1 (and MD5) internally to generate keys.
- PREIMAGE?
- * At least breaking SHA-1 and MD5 simultaneously is
- much more difficult than breaking either
- independently.
- C. Some of the TLS ciphersuites we allow use SHA-1.
- PREIMAGE?
- D. When we sign our code with GPG, it might be using SHA-1.
- COLLISION<code-signing>
- * GPG 1.4 and up have writing support for SHA-2 hashes.
- This blog has help for converting:
- http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/
- E. Our GPG keys might be authenticated with SHA-1.
- COLLISION<code-signing-key-signing>
- F. OpenSSL's random number generator uses SHA-1, I believe.
- PREIMAGE
- 2. The Tor protocol
- A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
- PREIMAGE?
- B. Our CREATE cell format uses SHA-1 for: OAEP padding.
- PREIMAGE?
- C. Our EXTEND cells use SHA-1 to hash the identity key of the
- target server.
- COLLISION
- D. Our CREATED cells use SHA-1 to hash the derived key data.
- ??
- E. The data we use in CREATE_FAST cells to generate a key is the
- length of a SHA-1.
- NONE
- F. The data we send back in a CREATED/CREATED_FAST cell is the length
- of a SHA-1.
- NONE
- G. We use SHA-1 to derive our circuit keys from the negotiated g^xy
- value.
- NONE
- H. We use SHA-1 to derive the digest field of each RELAY cell, but that's
- used more as a checksum than as a strong digest.
- NONE
- 3. Directory services
- [All are COLLISION or COLLISION<authority> ]
- A. All signatures are generated on the SHA-1 of their corresponding
- documents, using PKCS1 padding.
- * In dir-spec.txt, section 1.3, it states,
- "SIGNATURE" Object contains a signature (using the signing key)
- of the PKCS1-padded digest of the entire document, taken from
- the beginning of the Initial item, through the newline after
- the Signature Item's keyword and its arguments."
- So our attacker, Malcom, could generate a collision for the hash
- that is signed. Thus, a second pre-image attack is possible.
- Vulnerable to regular collision attack only if key is stolen.
- If the key is stolen, Malcom could distribute two different
- copies of the document which have the same hash. Maybe useful
- for a partitioning attack?
- B. Router descriptors identify their corresponding extra-info documents
- by their SHA-1 digest.
- * A third party might use a second pre-image attack to generate a
- false extra-info document that has the same hash. The router
- itself might use a regular collision attack to generate multiple
- extra-info documents with the same hash, which might be useful
- for a partitioning attack.
- C. Fingerprints in router descriptors are taken using SHA-1.
- * The fingerprint must match the public key. Not sure what would
- happen if two routers had different public keys but the same
- fingerprint. There could perhaps be unpredictable behaviour.
- D. In router descriptors, routers in the same "Family" may be listed
- by server nicknames or hexdigests.
- * Does not seem critical.
- E. Fingerprints in authority certs are taken using SHA-1.
- F. Fingerprints in dir-source lines of votes and consensuses are taken
- using SHA-1.
- G. Networkstatuses refer to routers identity keys and descriptors by their
- SHA-1 digests.
- H. Directory-signature lines identify which key is doing the signing by
- the SHA-1 digests of the authority's signing key and its identity key.
- I. The following items are downloaded by the SHA-1 of their contents:
- XXXX list them
- J. The following items are downloaded by the SHA-1 of an identity key:
- XXXX list them too.
- 4. The rendezvous protocol
- A. Hidden servers use SHA-1 to establish introduction points on relays,
- and relays use SHA-1 to check incoming introduction point
- establishment requests.
- B. Hidden servers use SHA-1 in multiple places when generating hidden
- service descriptors.
- * The permanent-id is the first 80 bits of the SHA-1 hash of the
- public key
- ** time-period performs caclulations using the permanent-id
- * The secret-id-part is the SHA-1 has of the time period, the
- descriptor-cookie, and replica.
- * Hash of introduction point's identity key.
- C. Hidden servers performing basic-type client authorization for their
- services use SHA-1 when encrypting introduction points contained in
- hidden service descriptors.
- D. Hidden service directories use SHA-1 to check whether a given hidden
- service descriptor may be published under a given descriptor
- identifier or not.
- E. Hidden servers use SHA-1 to derive .onion addresses of their
- services.
- * What's worse, it only uses the first 80 bits of the SHA-1 hash.
- However, the rend-spec.txt says we aren't worried about arbitrary
- collisons?
- F. Clients use SHA-1 to generate the current hidden service descriptor
- identifiers for a given .onion address.
- G. Hidden servers use SHA-1 to remember digests of the first parts of
- Diffie-Hellman handshakes contained in introduction requests in order
- to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be
- taking a hash of a hash here.
- H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with
- a connecting client.
- 5. The bridge protocol
- XXXX write me
-
- A. Client may attempt to query for bridges where he knows a digest
- (probably SHA-1) before a direct query.
- 6. The Tor user interface
- A. We log information about servers based on SHA-1 hashes of their
- identity keys.
- COLLISION
- B. The controller identifies servers based on SHA-1 hashes of their
- identity keys.
- COLLISION
- C. Nearly all of our configuration options that list servers allow SHA-1
- hashes of their identity keys.
- COLLISION
- E. The deprecated .exit notation uses SHA-1 hashes of identity keys
- COLLISION
|