@@ -907,55 +907,51 @@ efficiently transfer consistent snapshots from PyPI.
Producing Consistent Snapshots
------------------------------

- When a new project release is uploaded to PyPI, PyPI MUST update the *bin-n*
- metadata responsible for the target files of the project release. Remember that
- target files are sorted into bins by their filename hashes. Consequentially,
- PyPI MUST update *snapshot* to account for the updated *bin-n* metadata, and
- *timestamp* to account for the updated *snapshot* metadata. These updates
- SHOULD be handled by automated processes, e.g. one or more *transaction
- processes* and one *snapshot process*.
-
- Each transaction process keeps track of a project upload, adds all new target
- files to the most recent, relevant *bin-n* metadata and informs the
- snapshot process to produce a consistent snapshot. Each project release SHOULD
- be handled in an atomic transaction, so that a given consistent snapshot
- contains all target files of a project release. However, transaction processes
- MAY be parallelized under the following constraints:
-
- - Pairs of transaction processes MUST NOT concurrently work on the same project.
- - Pairs of transaction processes MUST NOT concurrently work on projects that
-   belong to the same *bin-n* role.
-
- When a transaction process is finished updating the relevant *bin-n* metadata
- it informs the snapshot process to generate a new consistent snapshot. The
- snapshot process does so by taking the updated *bin-n* metadata, incrementing
- their respective version numbers, signing them with the *bin-n* role key(s),
- and writing them to *VERSION_NUMBER.bin-N.json*.
-
- Similarly, the snapshot process then takes the most recent *snapshot* metadata,
- updates its *bin-n* metadata version numbers, increments its own version
- number, signs it with the *snapshot* role key, and writes it to
- *VERSION_NUMBER.snapshot.json*.
+ When a new distribution file is uploaded to PyPI, PyPI MUST update the *bin-n*
+ metadata responsible for that file. Recall that all target files are sorted
+ into bins by their filename hashes. PyPI MUST also update *snapshot* to account
+ for the updated *bin-n* metadata, and *timestamp* to account for the updated
+ *snapshot* metadata. These updates SHOULD be handled by an automated *snapshot
+ process*.
+
+ File uploads MAY be handled in parallel; however, consistent snapshots MUST be
+ produced in a strictly sequential manner. Furthermore, as long as distribution
+ files are self-contained, a consistent snapshot MAY be produced for each
+ uploaded file. To do so, upload processes place new distribution files into a
+ concurrency-safe FIFO queue, and the snapshot process reads from that queue one
+ file at a time and performs the following tasks:
+
+ First, it adds the new file path to the relevant *bin-n* metadata, increments
+ that metadata's version number, signs it with the *bin-n* role key, and writes
+ it to *VERSION_NUMBER.bin-N.json*.
+
+ Then, it takes the most recent *snapshot* metadata, updates its *bin-n*
+ metadata version numbers, increments its own version number, signs it with the
+ *snapshot* role key, and writes it to *VERSION_NUMBER.snapshot.json*.

And finally, the snapshot process takes the most recent *timestamp* metadata,
updates its *snapshot* metadata hash and version number, increments its own
version number, sets a new expiration time, signs it with the *timestamp* role
key, and writes it to *timestamp.json*.

- The snapshot process MUST generate consistent snapshots sequentially, reading
- the notifications received from the transaction process(es) from a
- concurrency-safe FIFO queue. Fortunately, the operation of signing is fast
- enough that this may be done a thousand or more times per second.
+ When updating *bin-n* metadata for a consistent snapshot, the snapshot process
+ SHOULD also include any new or updated hashes of simple index pages in the
+ relevant *bin-n* metadata. Note that simple index pages may be generated
+ dynamically on API calls, so it is important that their output remains stable
+ throughout the validity of a consistent snapshot.

- If there are multiple files in a release, a project MAY release these files in
- separate transactions. For example, a project MAY release files for Windows in
- one transaction, and the files for Linux in another transaction. However, a project
- SHOULD release files that must belong together in order for everything to work
- in the same transaction.
+ Since the snapshot process MUST generate consistent snapshots in a strictly
+ sequential manner, it constitutes a bottleneck. Fortunately, the operation of
+ signing is fast enough that this may be done a thousand or more times per
+ second.

- At any rate, PyPI SHOULD use a `transaction log`__ to record project
- transaction processes and the snapshot queue for auditing and to recover from
- errors after a server failure.
+ Moreover, PyPI MAY serve distribution files to clients before the corresponding
+ consistent snapshot metadata is generated. In that case, the client software
+ SHOULD inform the user that full TUF protection is not yet available but will
+ be shortly.
+
+ PyPI SHOULD use a `transaction log`__ to record upload processes and the
+ snapshot queue for auditing and to recover from errors after a server failure.

__ https://en.wikipedia.org/wiki/Transaction_log
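
Review note: the hunk above relies on target files being "sorted into bins by their filename hashes". As a minimal sketch of what that could look like, and not the scheme PyPI actually uses, the following assigns a target path to a bin from a prefix of its hash. The hash function (SHA-256), the bin count ``NUM_BINS``, the prefix length ``PREFIX_LEN``, and the function name ``bin_for_target`` are all assumptions made for illustration.

```python
import hashlib

NUM_BINS = 16    # assumption: illustrative bin count, not a value from the text
PREFIX_LEN = 2   # assumption: number of leading hex digits used for binning


def bin_for_target(path: str) -> int:
    """Return the index of the bin responsible for a target path.

    Hypothetical helper: hashes the path and maps the leading hex digits
    of the digest onto NUM_BINS equally sized ranges.
    """
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    prefix = int(digest[:PREFIX_LEN], 16)   # value in [0, 16**PREFIX_LEN)
    total = 16 ** PREFIX_LEN
    return prefix * NUM_BINS // total       # value in [0, NUM_BINS)


# The same path always maps to the same bin, so the responsible
# bin-n metadata can be found without scanning every bin.
example_bin = bin_for_target("packages/example-1.0.tar.gz")
```

Because the mapping depends only on the path, upload processes and the snapshot process would agree on the responsible bin without coordinating.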
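
Review note: the sequential snapshot process described in the added text (read one file from a FIFO queue, then update *bin-n*, *snapshot*, and *timestamp* in that order) can be sketched as follows. This is a rough in-memory illustration under stated assumptions, not a TUF implementation: the class ``SnapshotProcess``, the ``sign`` helper (an HMAC standing in for a real role signature), the metadata layout, the one-day expiration, and the ``None`` shutdown sentinel are all hypothetical.

```python
import hashlib
import hmac
import json
import queue
from datetime import datetime, timedelta, timezone


def sign(role_key: bytes, payload: dict) -> str:
    # Placeholder only: an HMAC stands in for a real role signature.
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(role_key, data, hashlib.sha256).hexdigest()


class SnapshotProcess:
    """Hypothetical single consumer that produces one snapshot per file."""

    def __init__(self, keys: dict, num_bins: int = 4):
        self.keys = keys
        self.num_bins = num_bins
        self.bins = {n: {"version": 0, "targets": []} for n in range(num_bins)}
        self.snapshot = {"version": 0, "bins": {}}
        self.timestamp = {"version": 0}
        self.written = {}  # file name -> serialized signed metadata

    def _write(self, name: str, role: str, payload: dict) -> None:
        signed = {"signed": payload, "sig": sign(self.keys[role], payload)}
        self.written[name] = json.dumps(signed, sort_keys=True)

    def process(self, path: str) -> None:
        # 1. Add the file to its bin-n metadata, bump the version, sign, write.
        n = int(hashlib.sha256(path.encode("utf-8")).hexdigest(), 16) % self.num_bins
        bin_meta = self.bins[n]
        bin_meta["targets"].append(path)
        bin_meta["version"] += 1
        self._write(f"{bin_meta['version']}.bin-{n}.json", "bin-n", bin_meta)

        # 2. Record the new bin-n version in snapshot, bump, sign, write.
        self.snapshot["bins"][f"bin-{n}"] = bin_meta["version"]
        self.snapshot["version"] += 1
        snap_name = f"{self.snapshot['version']}.snapshot.json"
        self._write(snap_name, "snapshot", self.snapshot)

        # 3. Point timestamp at the new snapshot, set expiry, sign, write.
        snap_bytes = self.written[snap_name].encode("utf-8")
        self.timestamp["snapshot_version"] = self.snapshot["version"]
        self.timestamp["snapshot_hash"] = hashlib.sha256(snap_bytes).hexdigest()
        self.timestamp["version"] += 1
        expires = datetime.now(timezone.utc) + timedelta(days=1)  # assumed lifetime
        self.timestamp["expires"] = expires.isoformat()
        self._write("timestamp.json", "timestamp", self.timestamp)

    def run(self, uploads: queue.Queue) -> None:
        # Strictly sequential: one file, one consistent snapshot.
        while True:
            path = uploads.get()
            if path is None:  # assumed shutdown sentinel
                break
            self.process(path)


# Usage: upload processes enqueue files; a single snapshot process drains them.
uploads = queue.Queue()
for name in ["example-1.0.tar.gz", "example-1.0-py3-none-any.whl"]:
    uploads.put(name)
uploads.put(None)

proc = SnapshotProcess(keys={"bin-n": b"k1", "snapshot": b"k2", "timestamp": b"k3"})
proc.run(uploads)
```

``queue.Queue`` is thread-safe, so several upload threads could call ``put`` concurrently while the single consumer keeps snapshot generation strictly ordered.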