Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/languagetool-http: added services.languageToolHttp option #84115

Closed
wants to merge 10 commits into from

Conversation

pasqui23
Copy link
Contributor

@pasqui23 pasqui23 commented Apr 2, 2020

Motivation for this change
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/179

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/331

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/465

"${pkgs.languagetool}/bin/languagetool-http-server ${cli.toGNUCommandLineShell {} cfg.args}";
};
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing final new line.

@stale
Copy link

stale bot commented Aug 13, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Aug 13, 2021
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/574

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Aug 14, 2021
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/641

@SuperSandro2000
Copy link
Member

Please rebase this PR on master.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-already-reviewed/2617/426

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-already-reviewed/2617/436

@bbigras
Copy link
Contributor

bbigras commented Mar 29, 2022

Any progress on this?

@pasqui23
Copy link
Contributor Author

Any progress on this?

It works and I'm using it on my system

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/827

allowBrowserPluginAccess = mkEnableOption "access from the browser plugin";
settings = mkOption {
description = ''
Contents of the configuration file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please link to specification. After a very quick look at the github project I didn't see anything about it.

in
{
options.services.languageToolHttp = {
enable = mkEnableOption "the LanguagTool http server";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems relevant to include this in the description here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
enable = mkEnableOption "the LanguagTool http server";
enable = mkEnableOption "the LanguageTool http server";

config = mkIf cfg.enable {
services.languageToolHttp.args = {
allow-origin = mkIf cfg.allowBrowserPluginAccess (mkDefault "*");
config = builtins.toString (format.generate "LanguageTool.cfg" cfg.settings);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toString shouldn't be needed: adjust args to take a path as well. Maybe?

@pasqui23 pasqui23 requested a review from aanderse April 30, 2022 15:45
@pasqui23
Copy link
Contributor Author

I have hardened the service

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/1049

@pasqui23
Copy link
Contributor Author

ok removed the confinement

Copy link
Contributor

@CRTified CRTified left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging, it would be great to have a (comfortable) way to allow access to language models.

in
{
options.services.languageToolHttp = {
enable = mkEnableOption "the LanguagTool http server";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
enable = mkEnableOption "the LanguagTool http server";
enable = mkEnableOption "the LanguageTool http server";

PrivateTmp = true;
NoNewPrivileges = true;
PrivateDevices = true;
PrivateMounts = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the hardening, it is not possible to just put a path into settings for e.g. pre-downloaded languageModel.

A solution should be to add a StateDirectory or to use DynamicUser (i.e. a less confined environment) to have a place for/access to these models.
ReadOnlyPaths didn't work for me with this config (Resulting in a directory not found):

  services.languageToolHttp = {
    enable = true;

    args = {
      port = 8081;
    };    
    
    settings = {
      fasttextBinary = "${pkgs.fasttext}/bin/fasttext";
      fasttextModel = "/var/lib/languagetool/lid.176.bin";

      # Approx. 10GB.
      # Data from:
      # https://languagetool.org/download/ngram-data/
      languageModel = "/var/lib/languagetool/languageModel";
      
      # Download all data here! Otherwise, it fails to start
      # Data from:
      # https://languagetool.org/download/word2vec/
      word2vecModel = "/var/lib/languagetool/word2vec";
      
      pipelinePrewarming = true;
      prometheusMonitoring = true;
      prometheusPort = 9301;
    };
  };

  systemd.services.languagetool-http.serviceConfig = {
    ReadOnlyPaths = [
      "/var/lib/languagetool/"
    ];
  };

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would ask you, as I do not use that functionality, what would be a good interface.

In the interim, have you tried BindReadOnlyPaths=?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BindReadOnlyPaths= did not work for me, with the same result:

Exception in thread "main" java.lang.RuntimeException: LanguageModel directory not found or is not a directory: "/var/lib/languagetool/languageModel"

But I now noticed the problem: TOML is the wrong format, the configuration is actually a Java properties file, similar to what minecraft server.properties does. Comparing the outputs configurations, the main difference is the absence of quotations. So obviously, the path "/var/lib/... will not exist, because it tries to open the (relative) directory ./"/var/lib/....

With this change in mind, BindReadOnlyPaths works - actually, no additional configuration is needed at all with the current state of this PR, i.e. with DynamicUser=true.

RestrictAddressFamilies = [ "AF_INET" "AF_INET6" ];
};
};
users.users.langtool = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference, but I'd rather call the user languagetool. It's not harmful to use the longer name (it's automated, after all) and it's clearer for what the user/group is intended, as lang is (at least for me) a common abbreviation for language - thus I don't associate the name langtool with languagetool.

Again - just my personal preference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

'';
settings = mkOption {
description = ''
Contents of the configuration file. ${confDoc}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Contents of the configuration file. ${confDoc}
Content of the configuration file. ${confDoc}

Comment on lines 44 to 46
allow-origin = mkIf cfg.allowBrowserPluginAccess (mkDefault "*");
config = format.generate "LanguageTool.cfg" cfg.settings;
public = mkIf cfg.public "";
Copy link
Contributor

@CRTified CRTified Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the documentation says the following:

  --allow-origin [ORIGIN] set the Access-Control-Allow-Origin header in the HTTP response,
                         used for direct (non-proxy) JavaScript-based access from browsers.
                         Example: --allow-origin "https://my-website.org"
                         Don't set a parameter for `*`, i.e. access from all websites.

But also note that using mkDefault "" will cause Allow-Origin to be empty.

toGNUCommandLineShell should translate bools to flags:

nix-repl> lib.cli.toGNUCommandLineShell {} {foo = true;}               
"'--foo'"

but I wasn't able to make it work here.

with lib;
let
cfg = config.services.languageToolHttp;
format = pkgs.formats.toml { };
Copy link
Contributor

@CRTified CRTified Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the config file is not TOML, but

a Java property file (one key=value entry per line)

(According to languagetool-http-server --help). This explains the problem with paths in the config, as described below.

@CRTified
Copy link
Contributor

#187147 got merged, so I don't know whether this needs to stay open.

@pasqui23
Copy link
Contributor Author

#187147 got merged, so I don't know whether this needs to stay open.

You're right, that is a better pr

@pasqui23 pasqui23 closed this Oct 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants