-
Notifications
You must be signed in to change notification settings - Fork 460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sphinx-style permalinks filter #2743
Conversation
Permalinks are not supposed to change every time a HTML page is regenerated, right? But isn't that what's happening here because |
nikola/filters.py
Outdated
for h in ['h1', 'h2', 'h3', 'h4']: | ||
nodes = doc.findall('*//%s' % h) | ||
for node in nodes: | ||
new_node = lxml.html.fragment_fromstring('<a id="{0}" href="#{0}" class="headerlink" title="Permalink to this headline"> π</a>'.format(uuid.uuid4())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’re doing UUIDs too much. I would:
- check if the header has an
id
already, and use that - otherwise, create a slugified form of the header and use that, watching out for conflicts within the same page and appending numbers (example for Django)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what @felixfontein said about permalinks being, um, permanent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I don't care about this enough to do the extra work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can take over if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kwpolska that would be great, thanks!
@felixfontein that is a very good point. So, the id generation here is useless. |
LGTM with @Kwpolska's latest fix |
I’m not done yet, I want to at least implement the blacklist (or perhaps whitelist) |
Since the filter cannot/doesn't have persistent state, the link generation algorithm cannot be changed anymore after merging this (or at least from the next release on). Otherwise this filter doesn't generate permalinks, but code-dependent links which will change as soon as this code is touched in a non-trivial way! So we really shouldn't merge this one too early! |
The link generation algorithm I have in mind will be pretty stable, so no worries. (The links being permanent also depend on user actions) |
25f737c
to
7b6d73c
Compare
@ralsina, @felixfontein: this is ready for review. Header/id deduplication will be implemented in another PR, because it needs to be done for all IDs, not only the few headers we touch here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but kind of incomplete without that extra deduplication PR. Also, how permanent these permalinks will be depends a lot on the deduplication algorithm and what kind of page this is. (If it is an index where newer posts will appear before older ones, this will be a big problem.)
nikola/filters.py
Outdated
xpath_list = ['*//div[@class="e-content entry-content"]//{hx}'] | ||
for xpath_expr in xpath_list: | ||
for hx in ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']: | ||
nodes = doc.findall(xpath_expr.format(hx=hx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case xpath_expr
does not contain hx
, wouldn't this add a permalink six times for every matching node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Maybe first put the formatted expressions into a set, then convert the set to a sorted list, and iterate over it. That should fix this problem.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will. This sounds like an edge case, but are you aware of any good solutions for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made it a set. There’s still the case of the same header being returned by multiple expressions, should I also work around that?
Signed-off-by: Chris Warrick <kwpolska@gmail.com>
Signed-off-by: Chris Warrick <kwpolska@gmail.com>
This prevents permalinks being created where they shouldn’t be, eg. post or site titles. Signed-off-by: Chris Warrick <kwpolska@gmail.com>
Signed-off-by: Chris Warrick <kwpolska@gmail.com>
3b748b0
to
1298a14
Compare
The code is waiting on the
The deduplication algorithm will run primarily on indexes. We can discuss whether or not it should be changing links from the bottom-up in that PR. |
Bottom-up would be a good idea, I think. This will at least make the links permanent when |
+1 to bottom-up |
@ralsina: review welcome. @felixfontein: alright, done on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than the typo in the doc. I can't officially +1 because it's my PR :-)
.headerlink { opacity: 0.1; margin-left: 0.2em; } | ||
.headerlink:hover { opacity: 1; text-decoration: none; } | ||
|
||
Additionally, you can provide a custom list of XPath expressions which should be used for finding headers (``{hx}}`` is replaced by headers h1 through h6). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unbalanced braces
Merging, thanks for the reviews. |
Pull Request Checklist
Description
A filter implementing Sphinx-style header links. Bonus: it works for any compiler, since it's done as HTML post-processing.