Similarity plugin (WIP) #230

ralsina · 2017-05-22T19:55:23Z

Kwpolska · 2017-05-23T10:44:36Z

v7/similarity/similarity.py

+
+from __future__ import print_function, unicode_literals
+
+import gensim


This needs a requirements.txt file.

Kwpolska · 2017-05-23T10:45:37Z

v7/similarity/similarity.py

+            vec_lsi = lsi[vec_bow]
+            sims = index[vec_lsi]
+            sims = sorted(enumerate(sims), key=lambda item: -item[1])
+            print(i, sims[:10])


Hm, shouldn't this write to e.g. a JSON file, or otherwise make the results available to others?

Also, you have no doit tasks here.

That was very early code :)
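
For reference, a minimal sketch of what writing the results out instead of printing them could look like. The ranking step mirrors the `sorted(enumerate(sims), ...)` line from the diff; `top_related`, the `self_index` parameter, and the JSON layout are assumptions made for illustration, and the body of `write_similar` here is a guess, not the plugin's actual code.

```python
import json


def top_related(sims, self_index, limit=10):
    """Return the `limit` best (index, score) pairs, skipping the post itself."""
    # Same ranking as in the diff: highest similarity first.
    ranked = sorted(enumerate(sims), key=lambda item: -item[1])
    # float() keeps the scores JSON-serializable even if they are NumPy floats.
    return [(idx, float(score)) for idx, score in ranked if idx != self_index][:limit]


def write_similar(path, related):
    """Dump the related-post list as JSON so templates or other tools can read it."""
    # Hypothetical layout: a list of {"index": ..., "score": ...} objects.
    data = [{"index": idx, "score": score} for idx, score in related]
    with open(path, "w", encoding="utf-8") as out:
        json.dump(data, out, indent=2)
```

Something like `write_similar("similar.json", top_related(sims, i))` would then replace the `print` call, with the output path chosen by the plugin.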

Kwpolska · 2017-05-24T12:55:21Z

v7/similarity/similarity.py

+                    'name': out_name,
+                    'targets': [out_name],
+                    'actions': [(write_similar, (out_name, related))],
+                    # 'file_dep': ['####MAGIC####TIMELINE'],


I think you need to depend on the render_posts task for magic timeline to work.

ahhhh good point. I sort of need to do it differently, so I may have to extend that support.
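
A rough sketch of the suggestion above, assuming the task dicts are plain doit tasks: `task_dep` is doit's key for "run this other task first", and `render_posts` is the task named in the review. `gen_tasks`, the `posts` mapping, and the `similarity` basename are hypothetical, and whether this is enough for the magic timeline dependency to work is not verified here.

```python
def gen_tasks(posts, write_similar):
    """Yield one doit-style task dict per output file.

    `posts` is a hypothetical mapping of output path -> related-post list.
    """
    for out_name, related in sorted(posts.items()):
        yield {
            'basename': 'similarity',  # assumed task group name
            'name': out_name,
            'targets': [out_name],
            'actions': [(write_similar, (out_name, related))],
            # Make sure posts are rendered before this task runs.
            'task_dep': ['render_posts'],
        }
```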

Kwpolska · 2017-05-24T15:01:52Z

v7/similarity/README.md

+    similarity data for **all** posts.
+  * The more translations you have, the longer it takes.
+  * My test site contains 1300 posts of varied lengths in 2 languages,
+    and initialization takes ~90 seconds.


Is it really worth it?

How powerful is your machine?

Is there a way to save the startup data? Even with something like manual pickle/unpickle?

Yes, there is, and I can make this better using setup tasks, which is a feature in doit I did not know/recall.
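
The manual pickle/unpickle idea could look roughly like this. It is only a sketch: `load_or_build`, the cache path, and the `build` callable are hypothetical names, and it does not show the doit setup-task approach mentioned above.

```python
import os
import pickle


def load_or_build(cache_path, build):
    """Return cached startup data if present, otherwise build and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as cached:
            return pickle.load(cached)
    # Expensive step (e.g. building the similarity data) runs only on a cache miss.
    data = build()
    with open(cache_path, "wb") as out:
        pickle.dump(data, out, protocol=pickle.HIGHEST_PROTOCOL)
    return data
```

Note that this never invalidates the cache, so a real version would need to rebuild when posts are added or changed.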

Kwpolska · 2017-05-24T18:14:52Z

The $64,000 question is: how can WordPress plugins generate a list like that dynamically on every request? Perhaps this approach is a bit overdone?

ralsina · 2017-05-24T18:19:41Z

Oh, they have all the text in a database with full-text search (FTS). With that, it's easier.

Roberto Alsina added 2 commits May 22, 2017 16:49

Similarity plugin (WIP)

e7fb38e

lint

0516994

Kwpolska reviewed May 23, 2017

Roberto Alsina added 4 commits May 23, 2017 17:41

getting there

fdc0f59

lint

561fcf7

missing dep

ed41e64

dep handling (broken magictimeline)

32d8702

Kwpolska reviewed May 24, 2017

Roberto Alsina added 2 commits May 24, 2017 11:37

Added README

919ffcc

remove FIXME

10bc2cb

Kwpolska reviewed May 24, 2017

Roberto Alsina added 7 commits May 24, 2017 12:34

Startup hit down to 20 secs in most cases

fdd431c

doh

191d757

Fix obvious bug

967c6fc

tweaks

eb279cb

Don't do nonsense

659c3fd

Don't consider a post related to itself

565858e

This plugin has weird deps

b8fa6c8

Describe template changes

95ef884

ralsina merged commit d235b5e into master May 24, 2017

ralsina deleted the similarity branch May 24, 2017 18:23
