Let's enforce a minimum set of hashes that an object upload must include to be accepted, so that downloaders can rely on finding them.

For the moment, we define sha256 and sha512, but don't say anything about when they must be present. Over the years, we'll probably want to add hash algorithms, and eventually older hash algorithms will be deprecated (as md5 and sha1 have been), so we should have a policy that allows some kind of evolution. Perhaps:

  • must include at least one of [algos still considered secure] (e.g., sha256, sha512)
  • should include [most secure algo] (e.g., sha512)
  • can also include [deprecated algos] (e.g., md5, sha1)

Downloaders should be expected to verify every recognized algorithm for which a hash is present, and to verify that at least one algorithm from the first set is present.
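A rough sketch of what that downloader-side check could look like, in Python. The set names, their membership, and the `verify_download` function are all hypothetical and only illustrate the policy above; nothing here reflects an existing implementation.

```python
import hashlib
from typing import Mapping

# Hypothetical policy sets; membership is illustrative only.
SECURE_ALGOS = {"sha256", "sha512"}    # "must include at least one of"
DEPRECATED_ALGOS = {"md5", "sha1"}     # "can also include"


def verify_download(data: bytes, hashes: Mapping[str, str]) -> None:
    """Raise ValueError if the hashes violate the policy or do not match."""
    # At least one algorithm from the secure set must be present.
    if not SECURE_ALGOS.intersection(hashes):
        raise ValueError("no hash from the secure set is present")

    # Verify every algorithm this downloader recognizes; skip the rest,
    # so that newer uploaders can add algorithms we don't know yet.
    for algo, expected in hashes.items():
        if algo not in SECURE_ALGOS | DEPRECATED_ALGOS:
            continue
        actual = hashlib.new(algo, data).hexdigest()
        if actual != expected.lower():
            raise ValueError(f"{algo} hash mismatch")
```

Skipping unrecognized algorithms rather than failing on them is what lets older downloaders keep working while a new algorithm is being rolled out.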

This gives enough time for some evolution: assuming sha1024 is the next best algorithm, uploaders can start adding it while still including sha256 and sha512. During that time, existing downloaders will validate sha256 and sha512, and new implementations will also validate sha1024. Then by the time sha256 is deprecated, uploaders and downloaders will already be using sha512 and sha1024, and can drop sha256.

I think this would be a docs-only fix, for now.



Oh, we should probably also update schemas so that the hash algorithms are named for downloads, too. Right now upload specifies the two algorithms, while download is just a string->string map.


I like the idea of making it strict and requiring users to provide secure hashes (plus possibly disallowing weak ones). The question is whether we want to allow objects to be uploaded without hashes, or with an incomplete set of hashes.


I think we should disallow uploads without hashes -- there's absolutely no reason not to have a hash for an uploaded file. Hashes can be provided after the upload is completed, so hashing doesn't even require reading a data stream twice.

I think incomplete sets are OK, since that allows us to slowly phase in new algorithms and phase out old ones. So we'll sort of have a sliding window of acceptable algorithms, with at least one required at all times.
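To make the sliding window concrete, here is a minimal upload-side sketch in Python. The names `REQUIRED_ANY`, `PREFERRED`, `DEPRECATED`, and `check_upload_hashes` are made up for illustration, and treating a missing preferred hash or a deprecated hash as a warning rather than an error is an assumption, not something we've decided.

```python
from typing import List, Mapping

# Hypothetical current window; these would change as algorithms age.
REQUIRED_ANY = ("sha256", "sha512")   # at least one must be present
PREFERRED = "sha512"                  # should be present
DEPRECATED = ("md5", "sha1")          # tolerated, but not sufficient


def check_upload_hashes(hashes: Mapping[str, str]) -> List[str]:
    """Return warnings for an upload; raise ValueError to reject it."""
    # Uploads without any hashes, or with only deprecated ones, are rejected.
    if not any(algo in hashes for algo in REQUIRED_ANY):
        raise ValueError(
            "upload must include at least one of: " + ", ".join(REQUIRED_ANY)
        )

    warnings = []
    # Incomplete sets are accepted, but nudge uploaders toward the best one.
    if PREFERRED not in hashes:
        warnings.append(f"missing preferred algorithm {PREFERRED}")
    for algo in DEPRECATED:
        if algo in hashes:
            warnings.append(f"{algo} is deprecated")
    return warnings
```

Moving the window would then just mean editing the three constants, for example adding a new algorithm to `REQUIRED_ANY` first and only later moving sha256 to `DEPRECATED`.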
