-
-
Notifications
You must be signed in to change notification settings - Fork 345
Invalidate stale cached files from GitHub in Netkan #2337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalidate stale cached files from GitHub in Netkan #2337
Conversation
I haven't reviewed this, but it's a great concept! It probably gets closer to achieving this KSP-CKAN/NetKAN-bot#2, I'll raise a feature for my idea. |
So we are allowing mod authors to destroy the integrity of their release version numbers?How do our users know which revision of a given mod release they have installed? Could we force an extra version level when we do this (similar to epochs, but automatic) somehow? |
I don't think "this is a new version but I used the same version number" is what happened here. The use case is that the original upload is in error (in GPP's case, the KSP-AVC file), and it is replaced to fix that error. The author wishes to retract the original version of the upload. And if they manage it before the netkan bot gets around to their mod, they are in fact able to do so. It's also worth noting that since Though I see your point about users who install the original version before the update. They should be prompted to upgrade to the fixed version, which I assume would not happen if the version number is identical. (Though again, this situation simply would have been ignored in versions of CKAN prior to 1.24.0—these users would have stayed on the original retracted version) What about putting the timestamp in the .ckan file, such as in a |
Yes, I agree that in this case, there would only be the new file being indexed, and no user would install the original zip file, but I'm thinking about future changes. In the past, we have tended to assume that a given URL is a specific file (and we negotiated with Sarbian in order to specifically achieve that for ModuleManager) As you say, users should see a replaced file as an upgrade (The mod author could have changed anything) How I'd like to see it handled is some kind of appended tag when the zip file has changed. If I understand the code you've added correctly, you're comparing the remote timestamp against the timestamp of the cached file, which we can't rely on for the client, since some users clear out their caches. We could have a new field "download_revision" that we introduce and increment and use for upgrade checking but not for compatibility checking? After all, the original version of the download is not available, so we don't really want extra CKAN files. Either that, or the CKAN client just compares the download_hash of the installed_modules against their repo download_hash and offer any changes as an "upgrade" without explanation. That would mean no spec change is required. I don't like this, but I guess we have to face reality. Mod authors should just push a new version when they make changes, rather than in-place modify a release, but this is not the first case, and if we can find a way to silently handle it, it saves the @KSP-CKAN/wranglers arguments with authors. |
Users clearing their caches won't affect this; these changes affect netkan.exe only. The cache that we're purging is on the netkan bot server, which is the part of the architecture that's in charge of incorporating new files into the repository. We can (and probably should) do a future change to handle this for the client, but this isn't that. |
Yes, I realise this PR is only handling the netkan side of things, but you're adding a new field into the .ckan files, which should be in the spec, and while this change is great for the specific case of an invalid download being replaced with a valid download, it's going to cause problems if a mod author changes a valid download to a new valid download. Any chance you could jump on IRC? I'm not sure I'm seeing this correctly. |
I may have misunderstood; you're talking about possible future functionality to prompt users to reinstall the same version? I would picture this working by comparing a new Heading to IRC in a moment... |
After-action summary, the first netkan bot run with these changes found no mods with replaced downloads, and it only took ~1 hour to run, going by the status page, so I don't think we accidentally caused it to re-download every mod. So far so good. |
Success! I purposely replaced SmartTank's download with one with a new .version file, since the current DLL seems to work fine on KSP 1.4.0, and the bot picked it up just fine: |
Background
Both CKAN and Netkan cache module downloads to save network bandwidth and time. Downloaded files are placed in a cache folder with names like
A1B2C3D4-Modname-v1.0.zip
, with the first eight characters being the beginning of the download URL's SHA1 in hexadecimal.In #2243, we added validation to make sure that invalid or corrupted files do not end up in the CKAN cache. This uses the
download_hash
property to check that the file we downloaded matches the file that Netkan indexed. Thanks to this change,download_hash
properties must be correct and up to date for CKAN to work properly.Problem
download_hash
property of the registry for the new version of GPP (it still matched the original upload)Cause
The module cache currently assumes that a given URL will always correspond to the exact same download file. Netkan will download each URL once, and then every time it tries to access the same URL after that, it will re-use the same downloaded file. If the upstream file has changed, then the older version is still used, and the
download_hash
values in CKAN's registry do not match the current download, so users can't install it.After thirty days, the file is removed from the cache, so it would be redownloaded at that time if needed.
Changes
Now for GitHub downloads specifically, we check when the download file was last updated via the API. This feeds into new properties
GithubRelease.AssetUpdated
andModel.Metadata.RemoteTimestamp
, which are passed to a new parameter ofCore.Net.NetFileCache.GetCachedFileName
, which is responsible for looking up cached files. If this function finds a file that looks like a match, it checks the local timestamp, and if it's older than the timestamp from GitHub, it deletes the file instead of using it, forcing a re-download. This will ensure that stale files do not remain in the cache if they are replaced on GitHub.We could probably do this for other download hosts as well, but that is not included in this pull request. You can consider GitHub downloads to be a sort of pilot program for this functionality.