Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extending and refactoring WordPress importer #1867

Merged
merged 23 commits into from Jul 12, 2015

Conversation

felixfontein
Copy link
Contributor

Beginning with some refactoring and extending of the WordPress importer.
Done so far:

  • Feeding all posts together with post format through transform_content, and let it decide extension.
  • No longer using extension .wp for WordPress-to-markdown converted posts.
  • Exporting wp-status and excerpt (if available).
  • Allowing to exclude private items and include empty items.
  • Exporting attachment-post/page relation as JSON file.
  • Exporting categories (if there's only one per post) including category hierarchy.
  • Exporting comments.
  • Add support for WordPress page compiler plugin (both to make new site use it, or to use it to convert WordPress posts to .html files).

Review on Reviewable

@Kwpolska
Copy link
Member

Kwpolska commented Jul 7, 2015

Add support for WordPress page compiler plugin (both to make new site use it, or to use it to convert WordPress posts to .html files).

Should be simple to implement by just talking to the plugin plugin and loading the new plugin (can you call collectPlugins twice without everything going down in flames?). Make sure to add a command-line option to pick the usage mode, and if there is no value specified, make sure to ask — please include a note about the compiler’s license for anti-GPL zealots.

if out_folder_slug:
self.posts_pages[post_id] = (post_type, out_folder_slug[0], out_folder_slug[1])

def write_attachments_info(self, path, attachments):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who would use those files? Would it actually be useful to give Nikola some real post attachments support, btw?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These files could be used by plugins for the WordPress page compiler. For example, if you include attachments into a post as a gallery, you have something like
[gallery columns="3" link="file" ids="52,53,54,55,56,57,58"]
inside the post -- here 52 up to 58 are IDs for attachments. If you have this .json file, you can map the IDs to actual images.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About giving Nikola proper post attachments functionality: I don't really care enough about that feature to implement/want it.

@Kwpolska
Copy link
Member

Kwpolska commented Jul 7, 2015

Reviewed 1 of 2 files at r1.
Review status: 1 of 2 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Comments from the review on Reviewable.io

@Kwpolska
Copy link
Member

Kwpolska commented Jul 7, 2015

nikola/plugins/command/import_wordpress.py, line 443 [r1] (raw file):
Is there any other post format available? Would that be HTML, or what?


Comments from the review on Reviewable.io

@felixfontein
Copy link
Contributor Author

Regarding the other post formats: I don't know. That was already implemented before I touched it; apparently there's a markdown plugin for WordPress. (I looked it up: #484)

@felixfontein
Copy link
Contributor Author

Ok, I now looked at the WordPress plugin which allows to use different text formats, and added more explicit support, including an error when an unsupported format is encountered.

@felixfontein
Copy link
Contributor Author

Regarding calling collectPlugins() more than once: the documentation doesn't explicitly disallow this, and the source seems to have no problem with it. Yapsy has a map for every category in which is indexed by the plugin info file's filename (PluginManager._category_file_mapping), so if you call the function again, it will go through all plugins again, load them, go through all categories to which they fit, and ignore them (assuming neither the plugin nor the categories changed since the last call of collectPlugins().

So installing a plugin and then calling collectPlugins() a second time will not hurt, but find the new plugin.

for cat, path in cat_map.items():
self._category_paths[cat] = utils.join_hierarchical_category_path(path)

def _adjust_config_template(self, channel, rendered_template):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huge 👎

Instead, you should change the input to the template and the template itself, just like I did in 0f4b1e2 (relevant lines: 57/216). Don’t mess with regex. (It was done before, this code is probably ancient)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I took the old code and modified it. I've removed it now and did it the way you wanted. (Actually I prefer it that way, too :) )

@felixfontein
Copy link
Contributor Author

Ok, I think I'm done for now.

@felixfontein felixfontein added this to the v7.6.1 milestone Jul 11, 2015
* Allowing to use WordPress page compiler on imported site instead of
converting posts to markdown with --use-wordpress-compiler
* Allowing to automatically install the WordPress page compiler when
needed with --install_wordpress_compiler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use - instead of _
also, you should advertise the compiler if the user does not ask for it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was a copy'n'paste typo. Fixed.

@Kwpolska
Copy link
Member

  1. Merge master (or rebase and force-push this branch).
  2. Put your changes at the top of the Features section.
  3. Test with some wordpress dumps if it works properly, if you haven’t already.

@@ -1908,13 +1908,23 @@ the following:

This is also useful for DISQUS thread migration!

* Allows you to export your comments with each post
* Exports information on attachments per post
* Will try to convert the content of your posts. This is *not* error free, because
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put info about your compiler here, drop this paragraph (you should really push the wordpress compiler)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote the documentation a bit. Can you take another look?

…, and let it decide extension.

No longer using extension .wp for WordPress-to-markdown converted posts.
Exporting wp-status and excerpt (if available).
Allowing to exclude private items and include empty items.
@felixfontein
Copy link
Contributor Author

Ok, all tests successfull (with two of my WordPress dumps and different settings, and automatic ones). Squashing a bit, and merging.

felixfontein added a commit that referenced this pull request Jul 12, 2015
Extending and refactoring WordPress importer.
@felixfontein felixfontein merged commit 1a7c983 into master Jul 12, 2015
@felixfontein felixfontein deleted the import-wordpress-refactoring branch July 12, 2015 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants