New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MBS-6378 & MBS-5736: Rework URLCleanup tests to support validation #352
Conversation
Some stats without validation tests (last commit):
|
About TAP tests structure:
Sub-tests are displayed nicely either running @mwiencek Is it possible to update Jenkins TAP plug-in to version 2.0.1 on https://ci.metabrainz.org/? |
We're actually not using the TAP plugin at all (it's not installed), we're using JUnit and running prove with |
FWIW, TAP plug-in depends on JUnit plug-in. |
Forced push to clean mismatch of headers, to improve failure labels, and to document the above information into commit messages (Use "View as plain text" to have fully-detailed test results.) |
i_url: 'http://www.allmusic.com/album/here-comes-the-sun-mw0002303439/releases', | ||
i_ent: 'release_group', | ||
match: 'allmusic', | ||
clean: 'http://www.allmusic.com/album/mw0002303439', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's confusing to use a different naming system for inputs and outputs. :( Why not o_rel
instead of match
and o_url
instead of clean
? Or using 'e_' (for expected). Having a comment in the source code (rather than just in the commit message) that explains what these mean would also be welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, use longer/more descriptive property names that are self-explanatory (input_url
, output_relationship
, etc.). I like that they're all 5 chars but you can always use whitespace to align them. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoided to use the term output because both match
and clean
are actually used as input by the two last tests.
I tried to keep tests conciseness and to add self-explanatory property names (but it looks like I failed that final point).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I propose the following (lengthy but self-explanatory) (might be less readable though) renaming:
i_url
->input_url
i_ent
->input_entity_type
match
->expected_relationship_type
clean
->expected_clean_url
i_rel
->input_relationship_type
valid
->only_valid_entity_types
wrong
->only_wrong_entity_types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those seem a lot clearer to me. Hopefully they don't hurt readability.
One more forced push to fix coding style and remove the wrong feature, thanks to @mwiencek review. |
Introduce support for new URL Cleanup tests that can check: - Guessed type of relationship for a given URL and entity type - Cleanup of a given URL - Guessed type of relationship after cleanup - Validation of an URL (after cleanup) for every entity types Tests specification is programmatically checked, to a certain extent, in order to prevent misleading results. Supported self-explanatory test properties: - `input_url`: mandatory - `input_entity_type`: required by `expected_relationship_type` - `expected_relationship_type`: optional, - `expected_clean_url`: optional - `input_relationship_type`: optional to `only_valid_entity_types` - `only_valid_entity_types`: optional, require either `expected_relationship_type` or `input_relationship_type` Allowed combinations of test properties: - `input_url input_entity_type expected_relationship_type` - `input_url [input_entity_type expected_relationship_type] expected_clean_url` - `input_url input_entity_type expected_relationship_type [expected_clean_url] only_valid_entity_types` - `input_url [input_entity_type expected_relationship_type] [expected_clean_url] input_relationship_type only_valid_entity_types` Usage (beyond equality tests) of test properties: - Match is performed w.r.t. `input_url` and `input_entity_type` - Clean is performed w.r.t. `input_url` - Validation is performed with respect to: - `input_relationship_type`, else `expected_relationship_type` - `expected_clean_url`, else actual clean URL Additional note for type-free and inappropriate URLs: - `expected_relationship_type` can be set to `undefined` - `only_valid_entity_types` can be set to `[]` As examples, two tests are provided within this initial commit: - AllMusic URL for release-groutp relationship replaces both 'Guess type' and 'Cleanup' tests and extends to validation test. - ResidentAdvisor review URL for release-group replaces former 'Guess type' test and extend to invalidation test (MBS-8148). To conclude with, these new tests are streamed as TAP subtests. Subtests labels are not redundant with assertions labels. Note that the current Jenkins setup (without Jenkins TAP plug-in) at ci.metabrainz.org only displays assertions labels in "Test Result", use "View as plain text" for full-featured TAP stream (including actual and expected values in the case of assertion failure).
'Guess type' tests conversion is straithforward: - 1st items are converted to `input_entity_type` properties - 2nd items are converted to `input_url` properties - 3rd items are converted to `expected_relationship_type` properties 'Cleanup' tests are not that easily converted: - 1st items are converted to `input_url` properties - 2nd items are converted to `expected_clean_url` properties - 3rd items are converted to `input_entity_type` properties and are used to deduce the `expected_relationship_type` properties However, these lattest optional 3rd items of 'Cleanup' tests were actually not reliable. Because they were used for display only, they were not checked and sometimes erroneous. Thus the resulting `input_entity_type` and `expected_relationship_type` properties can be misleading or erroneous too. All of these uncertainties and errors are marked with FIXME comments that are addressed and removed in the following commits of this pull request. Tests are reordered by target website, and grouped below its name in comment. All other comments have been removed since they were redundant with either test content or issue tracker. See preceding commit for comprehensive examples of test conversion. Additionally, this conversion of all tests in one commit allows: - to merge tests together (removing 78 test entries) - to check and fix the 3rd item (entity type) of 'Cleanup' tests - to automatically verify type consistency of cleaned-up URLs. Merges, checks and fixes take place in the eight following commits.
Prior to this pull request, 'Type guess' tests and 'Cleanup' tests were distinct although they were sometimes based on the same URLs. Tests using the same `input_url` (and `input_entity_type` if any) are merged into one. (This actually removes 20 tests.)
Prior to this pull request, clean URL was not checked to match the same relationship type as the input URL. This is now auto-checked. Match-only tests using the same `input_url` as the `expected_clean_url` of another test are merged into it. (This actually removes 47 tests.)
Prior to this pull request, entity type was used for display only. Now, it is used to check the auto-selected relationship type. This patch checks that no relationship type is auto-selected.
Prior to this pull request, entity type was used for display only. For some cleanup tests, it was wrongly set to the `streamingmusic` relationship type. Now that both types are available to cleanup tests as well, this patch fixes both entity and relationship types to these tests.
Previously, this test was duplicated (it goes back to MBS-6432) and erroneously set to nonexistent `lyrics.release` link type (which silently passed the test function prior to this pull request). This patch deduplicates and updates it to `lyrics.release_group`.
These entity types have been checked since.
Prior to this pull request, entity type was used for display only. For these two tests, it was erroneously set to another entity type. This patch just fixes consistency of these tests. Adding validation tests would require to fix the URL cleanup component first.
Prior to this pull request, clean URL was not checked to match the same relationship type as the input URL. This is now auto-checked. This patch removes and merges some tests which are redundant by their entity type, relationship type, and input/clean URL patterns, all at once. (This actually removes 10 tests.)
These comes as additional examples for MBS-6378 and MBS-5736. This commit ends the pull request that adresses these issues.
Thank you. :) Just a minor note for future PRs, for JavaScript we use camelCase for variable names EXCEPT_FOR_CONSTANTS. It's not a big deal but it's nice to have some consistency across files. |
Thanks for your time at review. 👍 I updated PR #354 to follow this naming convention. |
Previously, only
guessType
andcleanURL
functions of URLCleanup component were tested, separately.This patch merges both tests and adds support to test
validationRules
functions. It also tests that clean URL still matches the same relationship type as input URL.