Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve blogger directory structure #157

Merged
merged 10 commits into from Jul 25, 2016
Merged

Conversation

ChillarAnand
Copy link
Contributor

No description provided.

self.write_content(
os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
regex = re.compile('/(?P<year>\d{4})/(?P<month>\d{2})/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other URL structures possible with Blogger? This looks like a hack nevertheless. Also, the slug is very likely to still be incorrect (YYYYMMrest-of-slug)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is default structure. Not sure if any other structures are possible.

regex = re.compile('/(?P<year>\d{4})/(?P<month>\d{2})/(?P<slug>.*)')
match = regex.search(link_path)
if match:
year, month, slug = match.group('year'), match.group('month'), match.group('slug')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don’t need to name your groups in this case and you could just use match.groups()

Please check if there are other structures available — and perhaps refrain from using regular expressions, instead basing yourself on splitting on /?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using split seems much simpler. Blogger allows custom slugs but url structure is fixed.

@ChillarAnand ChillarAnand force-pushed the master branch 2 times, most recently from f144d91 to 4251d62 Compare July 24, 2016 11:29
os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
link_fragments = link_path.split('/')
if len(link_fragments) == 4:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • this check is unnecessary; however, you should have slug pre-determined and set to the correct value much earlier (line 161)
  • remove the unnecessary re import

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about drafts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detecting drafts does not depend on this as far as I can see. You need to write the correct path to the URL map, for example.

self.write_content(
os.path.join(self.output_folder, out_folder, slug + '.html'),
content)
if is_draft:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

@@ -154,8 +156,11 @@ def import_item(self, item, out_folder=None):

if link_path.lower().endswith('.html'):
link_path = link_path[:-5]
link_path = link_path.lstrip('/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do this unconditionally.

@Kwpolska
Copy link
Member

LGTM, but I’d like a second opinion just in case (we went through so many changes here.) @punchagan , @ralsina?

(thanks for listening to all my complaints. In the future, it might be better not to amend and force-push)

@@ -182,21 +187,19 @@ def import_item(self, item, out_folder=None):
else:
is_draft = False

self.url_map[link] = self.context['SITE_URL'] + '/' + \
out_folder + '/' + slug + '.html'
self.url_map[link] = os.path.join(
Copy link
Member

@Kwpolska Kwpolska Jul 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can’t use os.path.join for URLs (that would give backslashes on Windows), just use slashes manually or do os.path.join().replace(os.sep, '/')

out_folder + '/' + slug + '.html'

self.url_map[link] = self.context['SITE_URL'] + out_folder + \
'/' + link_path + '.html'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better style (flake8 complains about indent):

        self.url_map[link] = (self.context['SITE_URL'] + out_folder +
                              '/' + link_path + '.html')

self.url_map[link] = self.context['SITE_URL'] + '/' + \
out_folder + '/' + slug + '.html'

self.url_map[link] = (self.context['SITE_URL'] + out_folder +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be safer to ensure that SITE_URL ends with a '/' when populating the context. Some blogs probably don't return a url ending in '/' and that may be why we had it in our code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Kwpolska
Copy link
Member

Merging, thanks for your contribution!

@Kwpolska Kwpolska merged commit ee13131 into getnikola:master Jul 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants