New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PICARD-300: Dont save files if metadata is unchanged #535
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the inline comments, how does this handle the various settings on the Tags options page that influence how tags are written and which tags are removed from files, especially the ID3 version etc?
orig_values = orig_metadata.getall(name) | ||
if new_values != orig_values: | ||
return False | ||
if orig_metadata.length != new_metadata.length: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not correct, see #109 and PICARD-342.
new_metadata = self.new_metadata | ||
orig_metadata = self.orig_metadata | ||
tags = set(new_metadata.keys() + orig_metadata.keys()) | ||
for name in filter(lambda x: not x.startswith("~") and self.supports_tag(x), tags): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x.startswith("~")
is not a correct check. In some cases we do actually write ~
tags to files (I think ~rating
(which also depends on the email adress) and ~id3
-prefixed tags (look in ID3File._save
for those)).
Updated as per review :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the ID3 version, text encoding etc. settings? I gave this a try and if I change them, the tags are still not saved.
@@ -62,6 +62,13 @@ | |||
</widget> | |||
</item> | |||
<item> | |||
<widget class="QCheckBox" name="dont_save_on_unchanged"> | |||
<property name="text"> | |||
<string>Don't save files if the tags are uncahnged</string> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"unchanged"
Hmm, yes, for a file specific don't save on unchanged, I am thinking I will add the settings that affect saving as "~" tags in the file metadata when its loaded and compare them to determine if the settings have changed. Any suggestions? |
Updated as requested. Now saves if settings are changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This completely ignores image tags.
What if I save a file, quit Picard and a few months later want to retag that same file with different settings? If it's loaded after the preferences are changed, orig_metadata
will contain the wrong initial value and the file won't be saved.
(Unfortunately I don't have a complete list of settings and scenarios to cover in my head, I write them down as I think of them :-( )
orig_values = orig_metadata.getall(name) | ||
if new_values != orig_values: | ||
return False | ||
if abs(float(orig_metadata.length) - float(new_metadata.length)) > 2000: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2000 here (and in the tagdiff code) should be moved to a new constant in picard/const/__init__.py
, like MAX_INSIGNIFICANT_LENGTH_DIFFERENCE
or something like that.
if new_metadata[name] != orig_metadata[name]: | ||
return False | ||
for name in filter(lambda x: (not x.startswith("~") | ||
or x.startswith("~id3") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition is unique to formats using ID3 tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we need to add a check specifically for id3 tags since, we are already 'AND' the above result with self.supports_tag
def _load_preserved_config(self, metadata): | ||
# Adding additional config to metadata to prevent unecessary saving | ||
metadata['~config:rating_steps'] = config.setting['rating_steps'] | ||
metadata['~config:rating_user_email'] = config.setting['rating_user_email'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this embed user email in file tags ?
If yes, it leads to privacy issues.
I would make only one hash of config options, stored in ~config:hash
and recalculate/compare on load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding ~ metadata is not in itself a privacy issue as ~ data is not (with some exceptions) saved to media files.
However, email addresses are saved to the id3 rating tag - this is how the rating tag is defined in the id3 standard, not a choice made by Picard.
BUT, I counsel against storing config or indeed non-track related data in ~ metadata. Just because it is not visible in the standard UI, doesn't make this a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling here that we are over complicating this solution by trying to make it perfect for all combinations of changing option settings and saving combinations of files. e.g. if we change options and save one file and change them back again before saving the rest. This seems to be why we are saving configuration settings in file metadata.
Most people will have a standard set of settings that they use daily - and won't change them frequently. The easiest solution in my opinion is to preserve globally the settings when Picard is started, and if current settings have changed from these, then we save the files. If we want to get a bit cleverer, then we can have a list of settings that we check or ignore (so that e.g. script settings changed don't trigger global saves because script changes of note will have changed metadata which will trigger a save).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S. It seems to me that a list of settings which change media tags without changes to Picard's metadata
is quite small. Images, ratings, what else? So we create a list of these settings, and preserve the startup values and then compare the startup values to current values when user saves files.
What do people think of this idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want the best of both, I can just define a new variable called in the File class called config and store the above there?
Removes all the above associated problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we are always saving it for the first time a save is being called on it and then skipping on saving it afterwards if no settings have been changed.
Me and @mineo discussed it over irc that we should avoid saving any tags to the file and that a first save is done on a new file regardless is the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think the right approach is to compare old and new tags every time, but if @mineo still thinks your way is better after reviewing my most recent comment, then I will certainly defer to his greater knowledge and experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also currently in the above code, it only prevents writing metadata to files. I haven't implemented prevent_save for other operations.(Saving image to files, rename files, move files, move additional files, delete empty dirs)
I can include it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think this would be overkill. If you don't save a file because nothing has changed you don't rename files, delete directories etc.
BUT, if you previously saved images to files and now save them to tags (or vice versa) then the file has changed. We would be better concentrating on loading file metadata more comprehensively, and then checking whether it has changed - so for images we should load files and tags (noting their source at the same time) and then see if the images have changed c.f. downloaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, true. Will update the PR to reflect these changes as well :)
Updated PR. Not saving non metadata related things to self.metadata now. |
|
file_config['remove_ape_from_mp3'] = config.setting['remove_ape_from_mp3'] | ||
if self._File != mutagen.aiff.AIFF: | ||
file_config['write_id3v1'] = config.setting['write_id3v1'] | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As previously commented (and now hidden) IMO this is not the way to note that tags have changed because settings have changed.
The only reliable way to check whether tags have changed is to compare the original tag with the new tag - we do that for metadata, but not for special tags like rating and images.
So I think we should concentrate on:
a. Saving the existing ratings tag in hidden metadata and then comparing it to the new one at save time to see if it has changed.
b. Determining the coding for a file (which may not be possible) and comparing it at save time. If we cannot determine encoding from tag content, then we should save the encoding as new MUSICBRAINZ_ENCODING metadata in order to be able to see if it has changed.
c. I think Join_With may already be handled (because the existing ID3v23 tags are stored in Metadata as a string) and we already generate a string for multi-value tags when checking if the metadata has changed.
d. I am not certain whether we can determine whether file is ID3v23 or ID3v24 in all cases. certainly we should be able to tell if it e.g. has TMCL tags, but there is no guarantee that these tags will be used. Again, perhaps we need a MUSICBRAINZ_ID3V2x tag.
e. We should try to identify whether the existing file has APE or ID3v1 metadata and if so, set a hidden metadata flag.
f. The code does not handle changes to settings to how images are stored. As previously suggested, I think we should load images from both tags and files when we load the media file into Picard (noting their source) and then determine whether downloaded images are different when determine whether the file has changed.
Returns False if no related configs""" | ||
raise False | ||
|
||
def _test_same(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call this _is_metadata_same or (with opposite return values) _has_metadata_changed?
|
I think we are complicating it even more. Currently I will implement only the file tags related changes. Further caching of other methods(move files, rename, etc) can be implemented in a different PR after discussion as a lot of config settings come into play then. |
Saving config settings to files is complicating things - but more importantly it won't do what it is supposed to if you changed the settings before the file was loaded. This is NOT the right approach - the only approach which will work properly is to look at the existing tags and compare them to the new ones. |
@Sophist-UK it will actually. We are ALWAYS saving the file the first time it is saved after being loaded. We are skipping on a complicated implementation with the downside that we are saving it once regardless of the fact that the tags have been changed or not(or the related settings for that matter) |
@samj1912 said: "1. will require changing a lot of icons if the number of files loaded is large." If you changed settings before loading, then when the file is loaded the icon will be set. If you change settings after loading, then yes - you need to run through all albums and files and change those that now need saving as a result of changing the setting. But, that could be done as a queued background task (in the same way that loading and saving files and doing web service calls are handled in the background). |
@samj1912 said:
That in itself makes this enhancement MUCH less useful. When I load my library, I currently use the Remove Perfect Albums plugin to get rid of any which haven't changed, and then save the rest. If this PR doesn't work on first save (and avoid having to use that plugin), it probably doesn't give much benefit. I think you may need to bite the bullet and think about doing a more complicated but proper job. |
It is possible to include a much more complicated way to cache the saving process but I think it is out of the scope of this PR as it will involve looking at format specific ways to determine the file's existing configuration and comparing it with the settings. This PR just serves as a base to prevent useless saving when - for eg - You did a lot of small changes across albums and some of them are already saved. You select all and press 'Save'. I agree that it is not perfect but it is better than the current implementation and requires very less implementation. We can implement a much more elaborate save cache in v2.0 maybe but seeing as we are approaching v1.4 deadline, I think a simple augmentation like this might save on some unnecessary IO. |
Added image tag related settings and made the code more modular. |
@samj1912 said:
I am not the ultimate arbiter of whether PRs get merged or not - but IMO as it stands this PR should not be merged because: A. In the above use case you can already select which albums/tracks to save in the UI; B. In my use case refreshing metadata on my entire library, it does not eliminate the need for using Remove Perfect Albums; C. Does not handle the use case where settings were changed before files were loaded; D. This does not solve the general problem of properly identifying which files have non-metadata tags that have changed; E. Does not properly display in the UI files that now need saving because settings have changed Consequently I think that this introduces code which takes Picard in the wrong direction, and results in Picard doing different things in different circumstances in a way which will be confusing to users. |
Link to JIRA issue: PICARD-300 |
A few comments -
|
Also, if the option name in the ui seems confusing maybe we can change that? :) |
To address your points:
But the bottom line is that as currently scoped, IMO this PR does not address the full need, the bits it does address have an alternative existing way of being achieved, its only addressing the needs of a very minor use case but is hacking the Picard innards to achieve it in a somewhat unaesthetic way. I know its hard to hear someone say that they don't think your hard-work in coding this is useful - and I know because I have been there myself - but I have to be honest about this. And it is harder for me to do because I can see the effort you have put into all the other improvements and fixes you have recently done, and the last thing I want to do is to dent your enthusiasm. But, as I have said before, in the end it is not by decision - it is @mineo or @zas who need to merge this, and my comments are only my personal opinion. But I also have to be honest about whether the functionality is right and whether this is taking Picard in the right direction - and in this case IMO it isn't. |
Closing this PR as I don't see this working out. |
No description provided.