
Sphinx-style permalinks filter #2743

Merged: 6 commits merged into master on May 14, 2017

Conversation

@ralsina (Member) commented May 4, 2017

Pull Request Checklist

  • I’ve read the guidelines for contributing.
  • I updated AUTHORS.txt and CHANGES.txt (if the change is non-trivial) and documentation (if applicable).
  • I tested my changes.

Description

A filter implementing Sphinx-style header links. Bonus: it works for any compiler, since it's done as HTML post-processing.

@felixfontein (Contributor):

Permalinks are not supposed to change every time an HTML page is regenerated, right? But isn't that exactly what happens here, since uuid.uuid4() generates a fresh value on every call?

for h in ['h1', 'h2', 'h3', 'h4']:
    nodes = doc.findall('*//%s' % h)
    for node in nodes:
        new_node = lxml.html.fragment_fromstring('<a id="{0}" href="#{0}" class="headerlink" title="Permalink to this headline">&nbsp;&pi;</a>'.format(uuid.uuid4()))
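To make the concern concrete, a tiny hypothetical sketch (not code from the PR): each rebuild calls uuid.uuid4() afresh, so the same header gets a different anchor every time.

```python
import uuid

# Simulated "builds": the same header gets a fresh random id each time
# the page is regenerated, so any saved #fragment link goes stale.
id_build_1 = str(uuid.uuid4())
id_build_2 = str(uuid.uuid4())
print(id_build_1 == id_build_2)  # False (with overwhelming probability)
```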
Review comment (Member):

You’re relying on UUIDs too heavily. I would:

  • check if the header has an id already, and use that
  • otherwise, create a slugified form of the header and use that, watching out for conflicts within the same page and appending numbers (example for Django)
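A minimal sketch of that suggestion (hypothetical helpers, not Nikola's actual implementation): reuse a slugified form of the header text and append a counter when the same slug already appears on the page.

```python
import re

def slugify(text):
    # Lower-case, replace runs of non-alphanumerics with hyphens.
    return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')

def unique_slug(text, seen):
    # Return a page-unique slug: reuse the slugified header, and on a
    # conflict within the same page append a counter (-2, -3, ...).
    base = slugify(text) or 'section'
    slug, n = base, 1
    while slug in seen:
        n += 1
        slug = '{0}-{1}'.format(base, n)
    seen.add(slug)
    return slug

used = set()
print(unique_slug('Getting Started', used))  # getting-started
print(unique_slug('Getting Started', used))  # getting-started-2
```

Unlike a random UUID, this scheme is deterministic, so the anchors survive a rebuild as long as the header text does.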

Review comment (Member):

Also, what @felixfontein said about permalinks being, um, permanent.

Reply (Member Author):

Honestly, I don't care about this enough to do the extra work.

Reply (Member):

I can take over if you want.

Reply (Member Author):

@Kwpolska that would be great, thanks!

@ralsina (Member Author) commented May 5, 2017

@felixfontein that is a very good point. So, the id generation here is useless.

@ralsina (Member Author) commented May 8, 2017

LGTM with @Kwpolska's latest fix

@Kwpolska (Member) commented May 8, 2017

I’m not done yet; I want to at least implement the blacklist (or perhaps a whitelist).

@felixfontein (Contributor):

Since the filter has no persistent state, the link generation algorithm cannot be changed after this is merged (or at least after the next release). Otherwise this filter generates not permalinks but code-dependent links, which will change as soon as the code is touched in any non-trivial way! So we really shouldn't merge this too early.

@Kwpolska (Member) commented May 9, 2017

The link generation algorithm I have in mind will be pretty stable, so no worries. (Whether the links stay permanent also depends on user actions.)

@Kwpolska (Member):

@ralsina, @felixfontein: this is ready for review. Header/id deduplication will be implemented in another PR, because it needs to be done for all IDs, not only the few headers we touch here.

@felixfontein (Contributor) left a review:

LGTM, but kind of incomplete without that extra deduplication PR. Also, how permanent these permalinks will be depends a lot on the deduplication algorithm and what kind of page this is. (If it is an index where newer posts will appear before older ones, this will be a big problem.)

xpath_list = ['*//div[@class="e-content entry-content"]//{hx}']
for xpath_expr in xpath_list:
    for hx in ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']:
        nodes = doc.findall(xpath_expr.format(hx=hx))
Review comment (Contributor):

If xpath_expr does not contain {hx}, wouldn't this add a permalink six times for every matching node?

Follow-up (Contributor):

(Maybe first put the formatted expressions into a set, then convert the set to a sorted list, and iterate over it. That should fix this problem.)
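A sketch of that set-based fix (the helper name is illustrative, not the PR's actual code): formatting every expression into a set first collapses expressions that ignore {hx} to a single entry, so each one is evaluated only once.

```python
def expand_xpath_exprs(xpath_list,
                       headers=('h1', 'h2', 'h3', 'h4', 'h5', 'h6')):
    # Format every expression once per header level; the set collapses
    # expressions that ignore {hx} to a single entry, so a node matched
    # by such an expression is visited once, not six times.
    exprs = {expr.format(hx=hx) for expr in xpath_list for hx in headers}
    return sorted(exprs)

# One expression uses {hx}, one hard-codes h2:
exprs = expand_xpath_exprs(['*//div//{hx}', '*//article//h2'])
print(len(exprs))  # 7: six expansions plus the single fixed expression
```

Sorting the set before iterating also keeps the traversal order deterministic across builds, which matters for link permanence.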

Reply (Member):

Yes, it will. This sounds like an edge case, but are you aware of any good solutions for this?

Reply (Member):

Made it a set. There’s still the case of the same header being matched by multiple expressions; should I also work around that?

Roberto Alsina and others added 6 commits on May 14, 2017.

From one commit message: "This prevents permalinks being created where they shouldn’t be, e.g. post or site titles."

Signed-off-by: Chris Warrick <kwpolska@gmail.com>
@Kwpolska (Member):

> LGTM, but kind of incomplete without that extra deduplication PR.

The code is waiting on the deduplicate-headers branch until this lands.

> Also, how permanent these permalinks will be depends a lot on the deduplication algorithm and what kind of page this is. (If it is an index where newer posts will appear before older ones, this will be a big problem.)

The deduplication algorithm will run primarily on indexes. We can discuss whether it should change links from the bottom up in that PR.

@felixfontein (Contributor):

Bottom-up would be a good idea, I think. This will at least make the links permanent when INDEXES_STATIC is True and you're not on the front page. If INDEXES_STATIC is False, all hope is lost anyway...

@ralsina (Member Author) commented May 14, 2017

+1 to bottom-up

@Kwpolska (Member):

@ralsina: review welcome.

@felixfontein: alright, done on header-deduplication branch.

@ralsina (Member Author) left a review:

LGTM other than the typo in the doc. I can't officially +1 because it's my PR :-)

.headerlink { opacity: 0.1; margin-left: 0.2em; }
.headerlink:hover { opacity: 1; text-decoration: none; }

Additionally, you can provide a custom list of XPath expressions which should be used for finding headers (``{hx}}`` is replaced by headers h1 through h6).
Review comment (Member Author):

Unbalanced braces

@Kwpolska (Member):

Merging, thanks for the reviews.

@Kwpolska Kwpolska merged commit 3270735 into master May 14, 2017
@Kwpolska Kwpolska deleted the add-title-permalinks branch May 14, 2017 17:56