Similarity plugin (WIP) #230
v7/similarity/similarity.py
Outdated
from __future__ import print_function, unicode_literals

import gensim
needs a requirements.txt file
v7/similarity/similarity.py
Outdated
vec_lsi = lsi[vec_bow]
sims = index[vec_lsi]
sims = sorted(enumerate(sims), key=lambda item: -item[1])
print(i, sims[:10])
Hm, shouldn’t this write to, e.g., a JSON file? Or otherwise make the results available to others?
Also, you have no doit tasks here.
That was very early code :)
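One way to act on the JSON suggestion is to rank the scores and dump the top matches to a file instead of printing them. This is a minimal sketch, assuming `sims` is a sequence of similarity scores indexed by post position, as in the snippet under review. The helpers `top_similar` and `write_similar_json` and the output layout are hypothetical, not the plugin's actual code.

```python
import json


def top_similar(sims, limit=10):
    """Return the `limit` highest-scoring (index, score) pairs, best first."""
    ranked = sorted(enumerate(sims), key=lambda item: -item[1])
    # float() turns NumPy scalars into plain floats so json can serialize them.
    return [(idx, float(score)) for idx, score in ranked[:limit]]


def write_similar_json(path, post_id, sims, limit=10):
    """Write the ranked matches for one post to `path` (hypothetical layout)."""
    data = {
        "post": post_id,
        "similar": [
            {"index": idx, "score": score}
            for idx, score in top_similar(sims, limit)
        ],
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)
    return data
```

A template or another plugin could then read the file rather than recomputing the scores.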
v7/similarity/similarity.py
Outdated
'name': out_name,
'targets': [out_name],
'actions': [(write_similar, (out_name, related))],
# 'file_dep': ['####MAGIC####TIMELINE'],
I think you need to depend on the render_posts task for the magic timeline to work.
ahhhh good point. I sort of need to do it differently, so I may have to extend that support.
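The dependency discussed here could be declared with doit's `task_dep` key. This is a rough sketch, assuming a `write_similar` action like the one in the diff and a task named `render_posts`. The helper `make_similarity_task`, the `basename` value and the placeholder body of `write_similar` are hypothetical, and whether this alone is enough for the magic timeline dependency is not confirmed here.

```python
import json


def write_similar(out_name, related):
    """Placeholder action (assumption): dump the related posts as JSON."""
    with open(out_name, "w", encoding="utf-8") as fh:
        json.dump(related, fh)


def make_similarity_task(out_name, related):
    """Build a doit task dict that only runs after render_posts."""
    return {
        'basename': 'similarity',        # hypothetical task group name
        'name': out_name,
        'targets': [out_name],
        'actions': [(write_similar, (out_name, related))],
        'task_dep': ['render_posts'],    # run render_posts first
    }
```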
similarity data for **all** posts.
* The more translations you have, the longer it takes.
* My test site contains 1300 posts of varied lengths in 2 languages,
  and initialization takes ~90 seconds.
- Is it really worth it?
- How powerful is your machine?
- Is there a way to save the startup data? Even with something like manual pickle/unpickle?
Yes, there is, and I can make this better using setup tasks, which is a feature in doit I did not know/recall.
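The manual pickle/unpickle idea could look roughly like this. It is a sketch only: `load_or_build` and the cache path are hypothetical names, and `build` stands in for whatever expensive step produces the startup data. There is no invalidation, so the cache file would have to be deleted (or keyed on post timestamps) whenever posts change.

```python
import os
import pickle


def load_or_build(cache_path, build):
    """Return cached data if the cache file exists; otherwise build and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    data = build()  # the slow step, e.g. building the corpus and index
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return data
```

gensim models also provide their own `save()` and `load()` methods, which may be a better fit than raw pickle for the model and index objects.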
The $64,000 question is: how can WordPress plugins generate a list like that dynamically on every request? Perhaps this approach is a bit overdone?
Oh, they have all the text in a DB with full-text search (FTS). With that, it's easier.
getnikola/nikola#1702