pythonPackages.spacy_models: add more models #91217

Merged
merged 3 commits into from Jun 28, 2020

danieldk
Motivation for this change

Add the following models:

da_core_news_lg, da_core_news_md, da_core_news_sm, de_core_news_lg, el_core_news_lg, es_core_news_lg, fr_core_news_lg, it_core_news_lg, it_core_news_md, lt_core_news_lg, lt_core_news_md, nb_core_news_lg, nb_core_news_md, nl_core_news_lg, nl_core_news_md, pl_core_news_lg, pl_core_news_md, pl_core_news_sm, pt_core_news_lg, pt_core_news_md, ro_core_news_sm, ro_core_news_md, ro_core_news_lg, zh_core_web_lg, zh_core_web_md, zh_core_web_sm
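To see at a glance which languages and sizes the list covers, the names can be split on underscores. This is only an illustrative sketch: it assumes the usual spaCy naming pattern of language, type, genre, and size, and the helper names `parse_model_name` and `sizes_by_language` are hypothetical, not part of spaCy or nixpkgs.

```python
from collections import defaultdict

# Model names copied from the list in this PR description.
MODELS = """
da_core_news_lg da_core_news_md da_core_news_sm de_core_news_lg
el_core_news_lg es_core_news_lg fr_core_news_lg it_core_news_lg
it_core_news_md lt_core_news_lg lt_core_news_md nb_core_news_lg
nb_core_news_md nl_core_news_lg nl_core_news_md pl_core_news_lg
pl_core_news_md pl_core_news_sm pt_core_news_lg pt_core_news_md
ro_core_news_sm ro_core_news_md ro_core_news_lg zh_core_web_lg
zh_core_web_md zh_core_web_sm
""".split()


def parse_model_name(name):
    """Split a name such as 'de_core_news_lg' into (lang, type, genre, size).

    Assumes exactly four underscore-separated parts, which holds for
    every name in MODELS.
    """
    lang, kind, genre, size = name.split("_")
    return lang, kind, genre, size


def sizes_by_language(names):
    """Map each language code to the model sizes listed for it, in list order."""
    sizes = defaultdict(list)
    for name in names:
        lang, _kind, _genre, size = parse_model_name(name)
        sizes[lang].append(size)
    return dict(sizes)


if __name__ == "__main__":
    for lang, sizes in sorted(sizes_by_language(MODELS).items()):
        print(f"{lang}: {', '.join(sizes)}")
```

Running the script prints one line per language code with the sizes added for it.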

I also checked the licenses and fixed them where necessary.

Some extra notes:

  • I had to add a derivation for pkuseg, which is used by the zh models. Unfortunately, it seems to be proprietary due to the included data.
  • I haven't included the Japanese models yet. They require sudachipy, which is its own can of worms to package, since it sets symlinks in its Python path at runtime to switch dictionaries. Maybe later...

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@nixos-discourse
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/202

@jonringer left a comment

LGTM

Result of nixpkgs-review pr 91217

2 packages built:
  • python37Packages.pkuseg
  • python38Packages.pkuseg
