Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid HTML #2314

Merged
merged 9 commits into from Apr 20, 2018
Merged

Fix invalid HTML #2314

merged 9 commits into from Apr 20, 2018

Conversation

selfthinker
Copy link
Collaborator

This fixes some invalid HTML which mostly sneaked in with #2286.
The semantics of the search results page are a bit tricky. I nearly changed the existing description list to a normal unordered list, but I think it still makes sense the way it is now, i.e. a dl with 1 dt and 2 dd. Semantically I think it might make more sense to move the "[number of] Hits" also into its own dd or add it to the 'last modified' line.

The search buttons from the search form on the top of the page
as well as the search page both produce invalid HTML
due to the name attribute being empty.
As the name is required for each of the PHP functions
producing the buttons and is passed as an empty string,
this makes sure that parameters with empty strings get skipped.
@selfthinker selfthinker mentioned this pull request Apr 15, 2018
21 tasks
@selfthinker selfthinker added this to the 🐱 Greebo milestone Apr 15, 2018
The advanced search options list links with an invalid empty target
attribute. This fixes that by skipping empty parameters.
Recent changes to the search results page added various divs
to the pre-existing description list, which is invalid.
This fixes that by a) removing a couple of unnecessary elements
and b) interpret the 'last modified' line as another 'dd'.

$eventData = [
'resultHeader' => $resultHeader,
'resultBody' => [$metaLine, $snippet],
'page' => $id,
];
trigger_event('SEARCH_RESULT_FULLPAGE', $eventData);
$html .= '<div class="search_fullpage_result">';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation behind removing this div? I find it helpful for styling purposes. And it is valid HTML per WHATWG Living Standard: https://html.spec.whatwg.org/multipage/grouping-content.html#the-dl-element

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this will be valid in the coming HTML 5.2 spec (and the validator already says it's fine as well). The main reason why I removed it was because I couldn't see it being used anywhere and generally like to keep code clear of redundant things
On the one hand I wouldn't mind moving it back as it can make styling a dl easier, but on the other hand this is a relative simple list and I'm not sure if anyone is ever going to use it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is currently not used in the default template, one use-case would be to give the search results a slightly different background color. Or some box-shadow in order to create a more card-like / material appearance. This might be possible without such a wrapping element, but it might be somewhat arcane/non-trivial usage

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we stay with a DL I would prefer to keep this div.

@micgro42
Copy link
Collaborator

Thank you for checking and fixing the invalid html. Having this as part of the CI would indeed be very helpful! 👍
I feel also a bit strange about this being an description list, both semantically and with the restriction in puts on the html that is allowed.
We should also modify the new SEARCH_RESULT_FULLPAGE event so it wraps the elements of the resultBody key in dd tags. Since they are basically the only valid element in this context, it doesn't make sense to leave that up to the plugin author.

@selfthinker
Copy link
Collaborator Author

Being a description list is fine-ish, although it's a bit of an edge case and some other semantics could fit as well or maybe even better. I think two things are important for the semantics here: 1. the results are some form of list and 2. the page name is highlighted as being the descriptor of each item. That relationship can be achieved either with dt and dd or a heading in an li. Just divs would also work but would be inferior.

The "[number of] Hits" and "Last modified" are sort of meta data, although the number of hits is more important as that informs the importance (and order) of the results. I guess that can only really be conveyed via context. Whichever the semantics, I think the number of hits and last modified date should be on the same line.

What do you think are the kind of things people would put into the resultBody of SEARCH_RESULT_FULLPAGE? I wonder if it makes more sense to only allow to add stuff to the line between the "heading" and the snippet?

@micgro42
Copy link
Collaborator

micgro42 commented Apr 16, 2018

The "[number of] Hits" and "Last modified" are sort of meta data, although the number of hits is more important as that informs the importance (and order) of the results. I guess that can only really be conveyed via context. Whichever the semantics, I think the number of hits and last modified date should be on the same line.

Currently, we still show the number of hits for all results but we drop the last modified when we drop the snippet (iirc the cutoff is 15). Do you think it would be better to show both last mod and number of hits for all results or drop both for the later results?

What do you think are the kind of things people would put into the resultBody of SEARCH_RESULT_FULLPAGE? I wonder if it makes more sense to only allow to add stuff to the line between the "heading" and the snippet?

One specific implementation would be the watchcycle maintenance plugin, which shows just an icon to indicate maintenace status on some pages.
Other small examples could be the rating of the rater plugin, quality score by the qc plugin or tags from a tagging plugin. However, some struct plugin might also add some struct data to the search results. This could look similar to the wikipedia result of the Avengers film:

screenshot-2018-4-16 avengers - google search
https://www.google.com/search?q=Avengers.

There is also precedence for even more added information/links:

screenshot-2018-4-16 dokuwiki - google search
https://www.google.com/search?q=DokuWiki

Now that I think of it, it might be sensible to add the search result's position to the event data, since some modifications are maybe only intended for the first n results? 🤔

@splitbrain
Copy link
Collaborator

👍 for the position in the event data

@selfthinker
Copy link
Collaborator Author

Because we can't be sure what to expect in the resultBody, I guess it makes more sense to change the dl to a ul or ol. (As the order of the results mean something, I'm leaning towards ol, although the numbers wouldn't be showing.)

Do you think it would be better to show both last mod and number of hits for all results or drop both for the later results?

I think the latter makes more sense. When you only have the page name (or first heading, depending on config) to go by because the snippet has been removed, the number of hits and last modified date help to decide which results might be relevant.

it might be sensible to add the search result's position to the event data, since some modifications are maybe only intended for the first n results?

Maybe that can be simplified by only passing if they will be part of the first, more prominent results before the cutoff or after?

If no-one objects or has a better idea I will make the following changes tomorrow:

  • change dl to ol
  • move number of hits and last modified date onto the same line (independent of snippet cutoff)
  • change dt to heading
  • fix heading hierarchy while I'm at it (current h3s should be h2, making the new item headings h3s)

@splitbrain
Copy link
Collaborator

  1. Personally, I think the DL fits the search results better. They are ordered, yes but this order is not necessarily correct (the correct hit for what the user was searching for might be the 5th result, not the 1st) and everything currently in DDs does describe the DT (the hit). With the change proposed by Michael to not pass the DD wrapper inside the event data but instead wrap the results coming from the event in DDs, we ensure the DL stays correct.
  2. I'm fine with moving the hits into the same line as the lastmod (I would rename the class from lastmod to meta)

$snippet = '<dd>' . ft_snippet($id, $highlight) . '</dd>';
$lastMod = '<span class="search_results__lastmod">' . $lang['lastmod'] . ' ';
$snippet = '<dd class="snippet">' . ft_snippet($id, $highlight) . '</dd>';
$lastMod = $lang['lastmod'] . ' ';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to wrap this label into a span, so template authors can hide it or replace it by an icon?


$eventData = [
'resultHeader' => $resultHeader,
'resultBody' => [$metaLine, $snippet],
'page' => $id,
];
trigger_event('SEARCH_RESULT_FULLPAGE', $eventData);
$html .= '<div class="search_fullpage_result">';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we stay with a DL I would prefer to keep this div.

This implements some of the changes discussed in #2314:

1. last mod and number of hits are on the same line and always shown
2. There is a class around the last mod string so it can be
hidden/replaced by an icon
3. All items in the resultBody returned by the event are wrapped in
`<dd>` tags to ensure validity. They get their array key as class
4. There is still a wrapper div around each result for styling purposes
In #2286 the 2nd level heading in the intro was removed, so the h3
should be promoted to h2.
splitbrain added a commit that referenced this pull request Apr 17, 2018
Checks are done against the w3.org validator API. Currently we have some
fails. One will be fixed in #2314, others still need to be fixed.

This integration test also reveals some code bugs because texts do not
suppress warnings and deprecation messages.
@splitbrain splitbrain mentioned this pull request Apr 17, 2018
4 tasks
@selfthinker
Copy link
Collaborator Author

Sorry, I thought I would have time today, but it looks more like I will make the changes Thursday or Friday. (I'm aware this is the last issue left in the milestone.)

@micgro42
Copy link
Collaborator

@selfthinker Just a quick heads-up: We have already made some changes to this pull request and we have partially implemented the above points. Please check the last commits on this branch so that no work is performed twice. 🙂

Punctuation is more natural, makes sense with CSS switched off
and makes screenreaders add a little pause.
$mtime = filemtime(wikiFN($id));
$lastMod = '<span class="lastmod">' . $lang['lastmod'] . '</span> ';
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is all of this outside of the if ($cnt !== 0)? Should anything be displayed if the count is 0?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are circumstances where $cnt === 0 but the page still matches the search. (Right now, I don't recall what causes them :/)
In these cases we display the pagename, but no hits and no snippet. I think it would make sense to at least show the last mod date as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might have to do with plugins that add additional content to the search (e.g. the discussion plugin does this) - though I think the discussion plugin should at least also provide snippets but I haven't checked.


if ($snippet) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here at all? Could it not be added to $resultBody['snippet'] in the same line in which the snippet gets created a few lines above?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original reason is that the order of the keys is important, but I refactored it to fix it.

@selfthinker
Copy link
Collaborator Author

I'm nearly happy with how it is now. I would prefer an ol or ul but don't feel very strongly about it.

There are only two things left: Apart from the comments I just left, I also changed the meta items to be separated by punctuation instead of just spacing. That is more natural, makes sense with CSS switched off and makes screenreaders add a little pause.

I don't think this is really important, but it looks weird that the "Last modified" starts with uppercase (although it already looked weird before) and I was tempted to add a lcfirst but then I thought it could make something worse in other languages and didn't do it.

@@ -562,12 +562,14 @@ protected function getFulltextResultsHTML($data, $highlight)
}

$html = '<div class="search_fulltextresult">';
$html .= '<h3>' . $lang['search_fullresults'] . ':</h3>';
$html .= '<h2>' . $lang['search_fullresults'] . ':</h2>';

$html .= '<dl class="search_results">';
$num = 1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the comparison in line 591, this should probably be initialized as 0.


$html .= '<dl class="search_results">';
$num = 1;
$position = 1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this initialization to not start with position 2.

Also, for consistency this initializes also $num with 0, while ensuring
that the first FT_SNIPPET_NUMBER results with a $cnt>0 have a snippet.
@splitbrain splitbrain merged commit 6ea61f3 into master Apr 20, 2018
@splitbrain splitbrain deleted the fix-html branch April 20, 2018 09:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants