Preserve blogger directory structure #157

ChillarAnand · 2016-07-24T11:02:13Z

No description provided.

Kwpolska · 2016-07-24T11:11:44Z

v7/import_blogger/import_blogger.py

-            self.write_content(
-                os.path.join(self.output_folder, out_folder, slug + '.html'),
-                content)
+            regex = re.compile('/(?P<year>\d{4})/(?P<month>\d{2})/')


Are there any other URL structures possible with Blogger? This looks like a hack regardless. Also, the slug is very likely to still be incorrect (YYYYMMrest-of-slug).

This is the default structure. I'm not sure whether any other structures are possible.

Kwpolska · 2016-07-24T11:16:09Z

v7/import_blogger/import_blogger.py

+            regex = re.compile('/(?P<year>\d{4})/(?P<month>\d{2})/(?P<slug>.*)')
+            match = regex.search(link_path)
+            if match:
+                year, month, slug = match.group('year'), match.group('month'), match.group('slug')


You don’t need to name your groups in this case; you could just use match.groups().

Please check whether other structures are available, and perhaps avoid regular expressions altogether by splitting on / instead.

Using split seems much simpler. Blogger allows custom slugs, but the URL structure is fixed.
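For illustration, the split-based approach discussed above could look roughly like the sketch below. It is not the code that was merged; the helper name split_blogger_path is hypothetical, and it assumes the /YYYY/MM/slug.html layout described in this thread.

```python
def split_blogger_path(link_path):
    """Split a Blogger permalink path into (year, month, slug).

    Hypothetical helper: assumes the /YYYY/MM/slug.html layout and
    returns None for any path that does not match that shape.
    """
    # Drop a trailing '.html' (case-insensitively), then the outer slashes.
    if link_path.lower().endswith('.html'):
        link_path = link_path[:-5]
    fragments = link_path.strip('/').split('/')
    if len(fragments) != 3:
        return None
    year, month, slug = fragments
    # Require a four-digit year and a two-digit month.
    if not (year.isdigit() and len(year) == 4):
        return None
    if not (month.isdigit() and len(month) == 2):
        return None
    return year, month, slug
```

A path such as /2016/07/my-post.html yields the three fragments, while a static page path such as /p/about.html returns None and can be handled separately.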

Kwpolska · 2016-07-24T11:31:21Z

v7/import_blogger/import_blogger.py

-                os.path.join(self.output_folder, out_folder, slug + '.html'),
-                content)
+            link_fragments = link_path.split('/')
+            if len(link_fragments) == 4:


This check is unnecessary; however, you should have the slug pre-determined and set to the correct value much earlier (line 161).

Remove the unnecessary re import.

What about drafts?

Detecting drafts does not depend on this as far as I can see. You need to write the correct path to the URL map, for example.

Kwpolska · 2016-07-24T11:37:34Z

v7/import_blogger/import_blogger.py

-            self.write_content(
-                os.path.join(self.output_folder, out_folder, slug + '.html'),
-                content)
+            if is_draft:


Kwpolska · 2016-07-24T11:56:08Z

v7/import_blogger/import_blogger.py

@@ -154,8 +156,11 @@ def import_item(self, item, out_folder=None):

        if link_path.lower().endswith('.html'):
            link_path = link_path[:-5]
+            link_path = link_path.lstrip('/')


Do this unconditionally.
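A minimal sketch of what "unconditionally" means here, assuming the only normalisation needed is the suffix removal and the leading-slash strip shown in the hunk above. The function name normalize_link_path is hypothetical.

```python
def normalize_link_path(link_path):
    """Return link_path without a '.html' suffix or leading slashes."""
    # The suffix is removed only when it is actually present...
    if link_path.lower().endswith('.html'):
        link_path = link_path[:-5]
    # ...but leading slashes are stripped for every path, not just
    # for those that happened to end in '.html'.
    return link_path.lstrip('/')
```

With the strip inside the if block, a path without the .html suffix would keep its leading slash; moving it out makes both kinds of path come out the same way.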

Kwpolska · 2016-07-24T11:58:14Z

LGTM, but I’d like a second opinion just in case (we went through so many changes here.) @punchagan , @ralsina?

(Thanks for listening to all my complaints. In the future, it might be better not to amend and force-push.)

Kwpolska · 2016-07-24T12:19:19Z

v7/import_blogger/import_blogger.py

@@ -182,21 +187,19 @@ def import_item(self, item, out_folder=None):
        else:
            is_draft = False

-        self.url_map[link] = self.context['SITE_URL'] + '/' + \
-            out_folder + '/' + slug + '.html'
+        self.url_map[link] = os.path.join(


You can’t use os.path.join for URLs, because that would give backslashes on Windows. Just use slashes manually, or do os.path.join().replace(os.sep, '/').
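To make the Windows concern concrete, here is a small sketch. It uses ntpath, which is the module os.path resolves to on Windows, to show the backslash, and builds the URL with posixpath.join instead. The function name build_url and the example values are assumptions, and this is a third option alongside the two suggested above, not what the PR ended up using.

```python
import ntpath     # what os.path is on Windows
import posixpath  # always joins with '/'


def build_url(site_url, out_folder, link_path):
    """Join URL fragments with forward slashes on every platform.

    Assumes link_path has no leading slash; posixpath.join would
    otherwise discard out_folder.
    """
    path = posixpath.join(out_folder, link_path + '.html')
    return site_url.rstrip('/') + '/' + path


# On Windows, os.path.join inserts a backslash between fragments:
windows_style = ntpath.join('posts', 'my-post.html')

url = build_url('http://example.com/', 'posts', '2016/07/my-post')
```

Here windows_style is posts\my-post.html, which is not a valid URL path, while url is built with forward slashes regardless of the platform.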

Kwpolska · 2016-07-24T12:46:02Z

v7/import_blogger/import_blogger.py

-            out_folder + '/' + slug + '.html'
-
+        self.url_map[link] = self.context['SITE_URL'] + out_folder + \
+                             '/' + link_path + '.html'


Better style (flake8 complains about the indent):

self.url_map[link] = (self.context['SITE_URL'] + out_folder + '/' + link_path + '.html')

punchagan · 2016-07-25T04:28:41Z

v7/import_blogger/import_blogger.py

-        self.url_map[link] = self.context['SITE_URL'] + '/' + \
-            out_folder + '/' + slug + '.html'
-
+        self.url_map[link] = (self.context['SITE_URL'] + out_folder +


It might be safer to ensure that SITE_URL ends with a '/' when populating the context. Some blogs probably don't return a URL ending in '/', and that may be why we had the extra '/' in our code.
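One way to do that, sketched under the assumption that the context is a plain dict and the blog URL arrives as a string. The helper name ensure_trailing_slash and the sample values are hypothetical.

```python
def ensure_trailing_slash(url):
    """Return url with exactly one '/' appended if it lacks one."""
    return url if url.endswith('/') else url + '/'


# Hypothetical context population: normalise SITE_URL once, so later
# concatenations do not need to add a separating slash themselves.
context = {}
context['SITE_URL'] = ensure_trailing_slash('http://example.blogspot.com')

out_folder = 'posts'
link_path = '2016/07/my-post'
mapped = context['SITE_URL'] + out_folder + '/' + link_path + '.html'
```

Normalising at the point where the context is filled keeps the url_map expression simple and avoids a double slash when the blog already returns a URL ending in '/'.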

Kwpolska · 2016-07-25T11:23:09Z

Merging, thanks for your contribution!

Preserve blogger directory structure

52ea8aa

Kwpolska reviewed Jul 24, 2016

Update slug

9a89668

Kwpolska reviewed Jul 24, 2016

ChillarAnand force-pushed the master branch 2 times, most recently from f144d91 to 4251d62 Compare July 24, 2016 11:29

Kwpolska reviewed Jul 24, 2016

ChillarAnand force-pushed the master branch from 4251d62 to f37a5b1 Compare July 24, 2016 11:33

Use string split instead of regex

6892f22

ChillarAnand force-pushed the master branch from f37a5b1 to 6892f22 Compare July 24, 2016 11:35

Kwpolska reviewed Jul 24, 2016

Refactor slugs

6c574fe

Kwpolska reviewed Jul 24, 2016

ChillarAnand added 3 commits July 24, 2016 17:29

Fix url_map links

fceb22b

Always strip leading slash

ec79562

Add out_folder in url map links

b75cfcf

Kwpolska reviewed Jul 24, 2016

Manually join url fragments

9999ccd

Kwpolska reviewed Jul 24, 2016

Fix flake8 error

4450436

punchagan reviewed Jul 25, 2016

Ensure site url ends with leading slash

d4187c3

Kwpolska merged commit ee13131 into getnikola:master Jul 25, 2016
