Sync now skips most steps if the feed metadata has not changed. #676
mhrivnak merged 1 commit into pulp:2.6-dev
The idea here is that if a previous sync has the skip_list set to skip a type, and then the user removes that from the skip list and does a sync, we don't want to skip the sync just because the metadata didn't change.
So if anything was removed from the previous skip list, we will do a full sync.
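A minimal sketch of that check, assuming the previous and current skip lists are available as plain lists of type names. The function name and the example type names are hypothetical, not taken from the PR's code.

```python
def removed_from_skip_list(previous_skip, current_skip):
    """Return True if any type skipped by the previous sync is no longer skipped.

    Hypothetical helper: both arguments are assumed to be iterables of
    type-name strings (an empty list means nothing is skipped).
    """
    return bool(set(previous_skip) - set(current_skip))


# A type was removed from the skip list, so a full sync is needed
# even if the feed metadata is unchanged.
force_full_sync = removed_from_skip_list(['drpm', 'erratum'], ['drpm'])
```

Adding a type to the skip list does not trigger a full sync in this sketch; only removals do.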
assigning back to @mhrivnak per IRC convo
@mhrivnak Would it make more sense to check for example:
<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo" xmlns:rpm="http://linux.duke.edu/metadata/rpm">
<revision>1429551055</revision>
<data type="filelists">
<checksum type="sha256">5cc15042c277bfbf8b04e8b1eadc9cf998016529663d504e93a79e0bff0246fa</checksum>
<open-checksum type="sha256">df79c6949d62ecaff4f53bd4758b02b445f7eddd3a5b92eb8e6bdccfcd6203dd</open-checksum>
<location href="repodata/5cc15042c277bfbf8b04e8b1eadc9cf998016529663d504e93a79e0bff0246fa-filelists.xml.gz"/>
<timestamp>1429551057</timestamp>
<size>312</size>
<open-size>522</open-size>
</data>
Another thought is that we could just compute a hash of the repomd.xml itself and save it in the scratch pad 🐨. If the hash doesn't change, it would be OK to skip the other steps. This would free us from having to worry about specifics of repomd.xml creation.
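A rough sketch of the hash idea, assuming the scratch pad behaves like a plain dict and the raw repomd.xml bytes are already downloaded. The key name `repomd_sha256` and both function names are hypothetical.

```python
import hashlib


def repomd_digest(repomd_bytes):
    """Return the SHA-256 hex digest of the raw repomd.xml content."""
    return hashlib.sha256(repomd_bytes).hexdigest()


def metadata_unchanged(repomd_bytes, scratchpad):
    """Compare the current digest to the stored one, then store the new one.

    `scratchpad` is assumed to be a dict-like object; the key name
    'repomd_sha256' is made up for this example.
    """
    digest = repomd_digest(repomd_bytes)
    unchanged = scratchpad.get('repomd_sha256') == digest
    scratchpad['repomd_sha256'] = digest
    return unchanged
```

The first sync against an empty scratch pad always reports a change, so it would run in full.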
Good thoughts. I didn't go with the timestamp on primary.xml because it only covers RPMs. In a case where there are drpms, errata, groups, or any other custom metadata files, we also need to know if any of those changed.
In any case, I can't find an example of a yum repo where the revision is not an integer. I think we should give that a try, and if we find a use case where someone is committed to doing something different, we can consider fall-back strategies, like using the checksum. We could additionally use cache-related HTTP headers to determine when this file, or others, hasn't changed, but that would be gravy.
👍 sounds good, I am OK with using the revision. I don't remember if we need more changes or not; just assign back to me when it's time to re-review.
Great. I am adding the improvement that it will default to full-sync if the revision can't be parsed as an int.
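One way that fallback could look, as a sketch only: parse `<revision>` from repomd.xml, map anything that is missing or not an integer to 0, and treat 0 as "always sync". The function names are hypothetical; the namespace is the one shown in the repomd.xml excerpt above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the repomd.xml excerpt quoted in this conversation.
REPO_NS = '{http://linux.duke.edu/metadata/repo}'


def parse_revision(repomd_xml):
    """Return the repomd revision as an int, or 0 if missing or not an integer."""
    root = ET.fromstring(repomd_xml)
    text = root.findtext(REPO_NS + 'revision')
    try:
        return int(text)
    except (TypeError, ValueError):
        # Missing element (None) or non-integer text: fall back to 0.
        return 0


def can_skip_sync(previous_revision, current_revision):
    """Skip only when the revision is usable (non-zero) and unchanged."""
    return current_revision != 0 and current_revision == previous_revision
```

With this shape, a feed whose revision cannot be parsed never matches a stored value, so it always gets a full sync.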
Force-pushed from db2169f to 9a140c7
I like the "if zero then always sync" approach. LGTM
Force-pushed from 9a140c7 to 183cb1e
Sync now skips most steps if the feed metadata has not changed.
closes #2
https://pulp.plan.io/issues/2