[RFC] Python library wrappers #53816

timokau · 2019-01-11T21:43:00Z

Motivation for this change

Some python libraries require specific environment variables to be set. One example are those that use gtk libraries. On most other distros, those environment variables would either be set in the global environment or the dependencies would be found in standard paths. Since that is not the case for nixos, we need to hardcode those environment variables into the libraries.

We have makeWrapper for binaries. Here I try (purely as a proof of concept, kept as simple as possible) to add a makePythonWrapper function that does the same for python libraries. In particular, this would make the gtk3 matplotlib backend usable without using nix-shell and without manually setting environment variables.

python3 -c 'import matplotlib.pyplot as p; p.switch_backend("gtk3agg")'

Should now succeed. This works because python always executes all parent modules before importing a module. So by adding some code to the top of the topmost module (taking care to leave module docstrings in tact), we can set environment variables as soon as a module is imported.

Some next steps/improvements would be:

factor out makePythonWrapper
provide the full functionality makeWrapper provides
automatically determine the right __init__.py to modify
provide a GTK setup hook or maybe adapt the existing one

What do you think?

CC @jtojnar @FRidh

(This is not about enabling the gtk backend in matplotlib by default, that is just for easier testing)

FRidh · 2019-01-12T15:46:35Z

Note it's not a wrapper, but a function that injects code.

wrapPythonPrograms both wraps and injects code in executables. That functionality should be separated. Should we, then, wrap, or inject code? Injecting code is not easy to do correctly (see e.g. the magicalSedExpression, we need to consider shebangs and __future__ imports that occur before the docstring) but is in-line with non-Python code.

I think the direction we should go is that packages declare their requirements on the environment and what they offer. E.g., in this case, matplotlib could declare it requires GI_TYPELIB_PATH, and gtk3 declares it provides GI types. We then have e.g. a buildEnv "on steroids" that knows how to handle this. The buildEnv would go through a list of functions, a function per requirement, thereby building up the wrappers or code that needs to be injected. These functions could also be used directly when constructing a derivation.

timokau · 2019-01-12T17:04:53Z

Note it's not a wrapper, but a function that injects code.

Yeah I'm playing a bit loose with the terminology here. Although it may not be accurate to the letter, I think it communicates the intention and is in line with the naming of similar functions in nixpkgs.

wrapPythonPrograms both wraps and injects code in executables. That functionality should be separated. Should we, then, wrap, or inject code? Injecting code is not easy to do correctly (see e.g. the magicalSedExpression, we need to consider shebangs and __future__ imports that occur before the docstring) but is in-line with non-Python code.

Is wrapping any easier? How does one wrap a library module without changing the way it is imported and without messing with automated tooling detecting docs etc.?

I forgot about shebangs. How do those interact with module docstrings? The python docs state that they have to occur first in the file, are shebangs exempt from this?

Why do __future__ imports have to be handled differently?

I think the direction we should go is that packages declare their requirements on the environment and what they offer. E.g., in this case, matplotlib could declare it requires GI_TYPELIB_PATH, and gtk3 declares it provides GI types. We then have e.g. a buildEnv "on steroids" that knows how to handle this. The buildEnv would go through a list of functions, a function per requirement, thereby building up the wrappers or code that needs to be injected. These functions could also be used directly when constructing a derivation.

I agree, I was thinking in a similar direction. This would play nice with your "declarative wrapper" idea.

FRidh · 2019-01-13T08:32:32Z

Is wrapping any easier?

Yes and no. In principle the utility we have for it makes it easy, however, before sys.argv[0] may change we can cause trouble (wrapper solution for this #25985).

How does one wrap a library module without changing the way it is imported and without messing with automated tooling detecting docs etc.?

A library can't be wrapped, we can only inject code in it. That's why we patch references to executables to make them absolute paths.

Why do __future__ imports have to be handled differently?

From the Python docs:

A future statement must appear near the top of the module. The only lines that can appear before a future statement are:

the module docstring (if any),
comments,
blank lines, and
other future statements.

The items are thus in the following order:

shebang
__future__ imports or docstrings
other imports

When injecting code, it needs to be done after a __future__ import, but before other imports, and ignoring docstrings. In your code you use a certain docstring delimiter, but there are multiple ones. And we have to consider they can be nested as well.

timokau · 2019-01-13T22:23:06Z

How does one wrap a library module without changing the way it is imported and without messing with automated tooling detecting docs etc.?

A library can't be wrapped, we can only inject code in it. That's why we patch references to executables to make them absolute paths.

For the purposes of this PR only working for executables is kind of a non-starter.

Why do __future__ imports have to be handled differently?

From the Python docs:

A future statement must appear near the top of the module. The only lines that can appear before a future statement are:
the module docstring (if any),
comments,
blank lines, and
other future statements.

The items are thus in the following order:
1. shebang

2. `__future__` imports or docstrings

3. other imports

Oh I didn't know that. Though those were just normal imports.

When injecting code, it needs to be done after a __future__ import, but before other imports, and ignoring docstrings. In your code you use a certain docstring delimiter, but there are multiple ones. And we have to consider they can be nested as well.

I think I managed to put together a good solution using pythons ast module. That way we can run the injection with the proper python and always use the right ast parser. In contrast to regex / string compare based solution it should also be robust enough to be used basically everywhere (which we'd need when we want to automatically propagate env vars).

I've run out of time for now, but I'll polish and test that when I have more time.

timokau · 2019-01-14T23:03:02Z

@FRidh what do you think? https://github.com/timokau/python-inject

It injects a call to exec(), which makes it possible to insert arbitrary python code anywhere and execute it without polluting the environment.

FRidh · 2019-01-15T08:52:51Z

@timokau that tool looks quite nice. I suppose we could put the code to inject in $out/nix-support, however, consider this can also be used for executables, I think it would be better to inject the actual code instead of a path, because there one might want to inject different things.

timokau · 2019-01-15T16:49:05Z

I don't understand the point about executables. Why wouldn't exec work there? Seems much more flexible to me in either case.

We could just use writeTextFile to store the file to exec in the nix store.

FRidh · 2019-01-15T16:54:43Z

Then you're going to need two components, a Nix function that creates the file, and a bash function that injects the code.

timokau · 2019-01-15T17:03:43Z

Right. The point is that it is agnostic to where the file is stored. exec has two big advantages

completely separates the environment, which means we can use variables and do imports without potentially messing with the execution of the model
is a simple statement, which means it can be injected anywhere (from __future__ import print_function; print(42)) without worrying about semicolons vs. newlines, indentation etc.

Why would that be problematic in the binary case? It could still be put in nix-support or similar (I don't know if nix-support has special propagation semantics).

timokau · 2019-01-15T21:59:26Z

In which of the nix phases are the .pyc files generated?

timokau · 2019-06-22T13:28:52Z

This library might be useful: https://github.com/google/pasta

nixos-discourse · 2020-04-11T15:58:13Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/declarative-wrappers-another-idea/6661/1

stale · 2021-06-06T18:57:33Z

I marked this as stale due to inactivity. → More info

nixos-discourse · 2022-08-23T09:26:16Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/low-effort-and-high-impact-tasks/21095/41

nixos-discourse · 2023-10-01T05:08:20Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/terminal-emulator-leaks-environment-variables-to-shell/33673/11

python3.pkgs.matplotlib: add gtk wrapper

305c7c7

timokau requested a review from FRidh as a code owner January 11, 2019 21:43

timokau changed the title ~~[RFC]~~ [RFC] Python library wrappers Jan 11, 2019

GrahamcOfBorg added the 6.topic: python label Jan 11, 2019

GrahamcOfBorg requested a review from lovek323 January 11, 2019 21:54

GrahamcOfBorg added 10.rebuild-darwin: 11-100 10.rebuild-linux: 11-100 labels Jan 11, 2019

FRidh mentioned this pull request Mar 23, 2019

Remove double wrapping from Python packaging #58130

Open

timokau mentioned this pull request Mar 24, 2019

[WIP] sage: make sage usable as a python library #58230

Closed

10 tasks

rnhmjoj mentioned this pull request Feb 21, 2020

pyplot not working in ipython #80147

Open

ryantm added 2.status: merge conflict and removed 2.status: merge conflict labels Oct 3, 2020

stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 6, 2021

stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Oct 1, 2023

wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 19, 2024

wegank marked this pull request as draft March 20, 2024 15:01

stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024

wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Python library wrappers #53816

[RFC] Python library wrappers #53816

timokau commented Jan 11, 2019

FRidh commented Jan 12, 2019

timokau commented Jan 12, 2019

FRidh commented Jan 13, 2019

timokau commented Jan 13, 2019

timokau commented Jan 14, 2019

FRidh commented Jan 15, 2019

timokau commented Jan 15, 2019

FRidh commented Jan 15, 2019

timokau commented Jan 15, 2019

timokau commented Jan 15, 2019

timokau commented Jun 22, 2019

nixos-discourse commented Apr 11, 2020

stale bot commented Jun 6, 2021

nixos-discourse commented Aug 23, 2022

nixos-discourse commented Oct 1, 2023

[RFC] Python library wrappers #53816

Are you sure you want to change the base?

[RFC] Python library wrappers #53816

Conversation

timokau commented Jan 11, 2019

Motivation for this change

FRidh commented Jan 12, 2019

timokau commented Jan 12, 2019

FRidh commented Jan 13, 2019

timokau commented Jan 13, 2019

timokau commented Jan 14, 2019

FRidh commented Jan 15, 2019

timokau commented Jan 15, 2019

FRidh commented Jan 15, 2019

timokau commented Jan 15, 2019

timokau commented Jan 15, 2019

timokau commented Jun 22, 2019

nixos-discourse commented Apr 11, 2020

stale bot commented Jun 6, 2021

nixos-discourse commented Aug 23, 2022

nixos-discourse commented Oct 1, 2023