WIP python37Packages.textacy: fixing broken package #65022

Closed
wants to merge 2 commits into from


mattmelling
Contributor

Motivation for this change

Added missing dependency and fixed a broken package version restriction.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

Contributor

@jonringer jonringer left a comment


This package is in an invalid state.

I tried enabling the tests and found that the package is broken by scikit-learn.

Issue submitted upstream here:
chartbeat-labs/textacy#260

@risicle
Contributor

risicle commented Jul 18, 2019

Builds for me on non-NixOS Linux x86_64.

> This package is in an invalid state.

What do you mean by this?

@jonringer
Contributor

scikit-learn 0.21 breaks how the package is used. If you enable the tests you will get this error:

Really long error msg
>           assert isinstance(TopicModel(model).model, expected)

tests/test_topic_model.py:85:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/nix/store/kzspdnr2wnah99s59d6fkb3zfaf48sb6-python3.7-textacy-0.8.0/lib/python3.7/site-packages/textacy/tm/topic_model.py:125: in __init__
    self.init_model(model, n_topics=n_topics, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'TopicModel' object has no attribute 'model'") raised in repr()] TopicModel object at 0x7fffdd7362d0>
model = 'lda', n_topics = 10, kwargs = {}

    def init_model(self, model, n_topics=10, **kwargs):
        if model == "nmf":
            self.model = NMF(
                n_components=n_topics,
                alpha=kwargs.get("alpha", 0.1),
                l1_ratio=kwargs.get("l1_ratio", 0.5),
                max_iter=kwargs.get("max_iter", 200),
                random_state=kwargs.get("random_state", 1),
                shuffle=kwargs.get("shuffle", False),
            )
        elif model == "lda":
            self.model = LatentDirichletAllocation(
                n_topics=n_topics,
                max_iter=kwargs.get("max_iter", 10),
                random_state=kwargs.get("random_state", 1),
                learning_method=kwargs.get("learning_method", "online"),
                learning_offset=kwargs.get("learning_offset", 10.0),
                batch_size=kwargs.get("batch_size", 128),
>               n_jobs=kwargs.get("n_jobs", 1),
E               TypeError: __init__() got an unexpected keyword argument 'n_topics'

It builds because we don't run any tests, but if you were to use the package, any use of the TopicModel class would break unless we pin scikit-learn to 0.20.
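For context on the traceback: scikit-learn renamed the `n_topics` argument of `LatentDirichletAllocation` to `n_components` and removed the old name in 0.21, which is why the call fails. Below is a minimal sketch of how a caller could pick whichever keyword the installed class accepts. It is not textacy's actual fix. `OldStyleLDA` and `NewStyleLDA` are hypothetical stand-ins that only mimic the two signatures, and `topic_count_kwarg` and `make_lda` are names invented for illustration.

```python
import inspect


def topic_count_kwarg(estimator_cls):
    """Return the keyword that estimator_cls.__init__ accepts for the topic count."""
    params = inspect.signature(estimator_cls.__init__).parameters
    if "n_components" in params:
        return "n_components"
    if "n_topics" in params:
        return "n_topics"
    raise TypeError(
        f"{estimator_cls.__name__} accepts neither n_components nor n_topics"
    )


# Hypothetical stand-ins, NOT scikit-learn classes: they only mimic
# the old (n_topics) and new (n_components) constructor signatures.
class OldStyleLDA:
    def __init__(self, n_topics=10, max_iter=10):
        self.n_topics = n_topics
        self.max_iter = max_iter


class NewStyleLDA:
    def __init__(self, n_components=10, max_iter=10):
        self.n_components = n_components
        self.max_iter = max_iter


def make_lda(estimator_cls, n_topics=10, **kwargs):
    """Build the estimator, passing the topic count under the accepted keyword."""
    count_kwarg = {topic_count_kwarg(estimator_cls): n_topics}
    return estimator_cls(max_iter=kwargs.get("max_iter", 10), **count_kwarg)


old_model = make_lda(OldStyleLDA, n_topics=5)
new_model = make_lda(NewStyleLDA, n_topics=5)
```

With the real class, the same helper would be called as `make_lda(LatentDirichletAllocation, n_topics=10)`. Simply switching the call to `n_components` is the smaller change if versions older than 0.19 do not need to be supported.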

@risicle
Contributor

risicle commented Jul 18, 2019

Ahh, good, well caught.

@jonringer
Contributor

This also means that textacy on master is broken; you just won't know it until you use it :)

@mattmelling
Contributor Author

@jonringer Thanks for the feedback.

I was poking around in the awscli package recently and noticed that it overrides some Python dependencies, such as pyyaml, which is a couple of minor versions behind. Does anyone have thoughts on whether the same approach would be appropriate here, to pin scikit-learn to <0.21?
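To make the proposed `<0.21` bound concrete, here is a rough sketch of the comparison it implies. The helper names `parse_version` and `below` are made up for illustration, and the parsing assumes purely numeric, dot-separated versions (no `rc` or `dev` suffixes), so it is not a substitute for a real version parser.

```python
def parse_version(text):
    """Turn '0.20.3' into (0, 20, 3); assumes numeric, dot-separated parts only."""
    return tuple(int(part) for part in text.split("."))


def below(version, upper_bound):
    """True if version is strictly lower than upper_bound."""
    return parse_version(version) < parse_version(upper_bound)


# Versions on either side of the proposed pin.
ok_with_pin = below("0.20.3", "0.21")
rejected_by_pin = below("0.21.0", "0.21")
```

Tuple comparison treats `(0, 21, 0)` as greater than `(0, 21)`, so every 0.21.x release falls outside the bound while every 0.20.x release stays inside it.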

@FRidh
Member

FRidh commented Jul 19, 2019

No, we cannot have multiple versions of a library in python-packages.nix.

@mattmelling
Contributor Author

Makes sense. I was hoping to get this package into a working state and then have a go at upgrading it to newer versions (they are up at 0.8.x now). I will leave this PR open and circle back once upstream supports scikit-learn 0.21.

@FRidh FRidh changed the title from "python37Packages.textacy: fixing broken package" to "WIP python37Packages.textacy: fixing broken package" on Jul 19, 2019
@mattmelling
Contributor Author

Closing as this has all been taken care of by @jonringer in #68093 :)

@mattmelling mattmelling closed this Sep 8, 2019