
[RFC] Moving conda packages to a single source #154

Open

PiotrZierhoffer opened this issue Nov 30, 2020 · 14 comments

@PiotrZierhoffer commented Nov 30, 2020

This issue aims to discuss moving to the LiteX Hub packages. The move has already been executed in symbiflow-examples, and PRs are open for fpga-tool-perf and symbiflow-arch-defs.

The packages are hosted in the litex-hub/litex-conda-* repositories (e.g. litex-conda-eda, litex-conda-prog).

The goal of this change is to create a single space for Conda-based dependencies in the open FPGA environment, with simpler build procedures based on GitHub Actions, cross-platform support wherever possible, etc.

Packages are currently hosted on LiteX-Hub because our work originally started there. The name of the channel is less important than the fact that packages for many projects related to SymbiFlow, LiteX, LiteX BuildEnv or the Skywater PDK will be unified, and thus easier to maintain.

Currently all packages required by symbiflow-arch-defs, symbiflow-examples and fpga-tool-perf (and many more) are available on LiteX Hub.

Most of the packages in SymbiFlow/conda-packages can simply be moved to LiteX-Hub with minimal changes. Due to the low stability of Travis builds (the migration to GitHub Actions is in progress), we decided not to move the packages in one batch, but one by one, on a per-need basis.

Packages follow specific branches (just like in this repo), and each subsequent build can bump the package version. Some previous versions also remain available on the channel - wherever a user project required a specific older version.

The build procedure is transitioning to use the conda-build-prepare tool, allowing for easy and isolated local builds - more information on that will follow.

The litex-conda-* repositories organize packages in stages. This divides the packages according to the dependencies they rely on.

The first stage is "No dependencies" - this is, quite obviously, for packages without dependencies from the same repository. For example, litex-conda-eda/verilator does not depend on any other litex-conda-eda package.

"Has first level dependencies" covers packages that have dependencies from the same repo, but these dependencies are in the "No dependencies" section. E.g. symbiflow-yosys-plugins depends on symbiflow-yosys.

And so on.
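
To make the staging rule concrete, here is a toy sketch (not the actual litex-conda tooling): a package's stage is one more than the deepest stage among its in-repo dependencies. The dependency lists are illustrative, taken from the examples above.

```python
# Toy sketch of the staging idea: stage 0 is "No dependencies", stage 1 is
# "Has first level dependencies", and so on. Assumes an acyclic graph.

def assign_stages(deps):
    """Map each package name to its build stage."""
    stages = {}

    def stage_of(pkg):
        if pkg not in stages:
            # Only dependencies hosted in the same repo count.
            in_repo = [d for d in deps[pkg] if d in deps]
            stages[pkg] = 1 + max((stage_of(d) for d in in_repo), default=-1)
        return stages[pkg]

    for pkg in deps:
        stage_of(pkg)
    return stages

deps = {
    "verilator": [],
    "symbiflow-yosys": [],
    "symbiflow-yosys-plugins": ["symbiflow-yosys"],
}
print(assign_stages(deps))
# -> {'verilator': 0, 'symbiflow-yosys': 0, 'symbiflow-yosys-plugins': 1}
```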

Packages built by the CI running on a PR are uploaded to a dedicated conda label and made available to subsequent stages. This allows us to verify that a change does not break other builds, at least within a single repo. The current conda-packages scripting cannot detect such breaking changes until the package is uploaded and the next Travis run starts.

The packages are moved to the main label when the CI is green.
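
For illustration, a hedged sketch of this flow, shelling out to the anaconda-client CLI. The label name and package spec below are invented, and flag spellings may differ between anaconda-client versions.

```python
import subprocess

def upload_to_pr_label(package_file, label):
    # Upload a freshly built package under a PR-specific label, so that
    # subsequent stages of the same CI run can install and test it.
    subprocess.run(["anaconda", "upload", "--label", label, package_file],
                   check=True)

def promote_to_main(spec, from_label):
    # Once the CI is green, copy the package from the PR label to main.
    # `spec` is owner/package/version, e.g. "litex-hub/yosys/0.9_1234".
    subprocess.run(["anaconda", "copy",
                    "--from-label", from_label, "--to-label", "main", spec],
                   check=True)

# upload_to_pr_label("yosys-0.9_1234.tar.bz2", "ci-pr-154")
# promote_to_main("litex-hub/yosys/0.9_1234", "ci-pr-154")
```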

CC @mithro @kgugala @acomodi @tmichalak @mkurc-ant @ajelinski @tjurtsch

@litghost commented Dec 1, 2020

@HackerFoo This is, or will become, relevant to your PGO work.

@mithro commented Dec 1, 2020

Older design doc around this stuff -- https://docs.google.com/document/d/1BZcSzU-ur0J02uO5FSGHdJHYGnRfr4n4Cb7PMubXOD4/edit


@mithro commented Dec 1, 2020

Related doc from @umarcor - https://docs.google.com/document/d/10_MqFjTIYVVuOJlusJydsp4KOcmrrHk03__7ME5thOI/edit


@PiotrZierhoffer (Author)

Today's meeting discussion revolved around package pinning, versioning and marking things as stable.

A broad idea is to explore using Anaconda labels as indicators of package status.

There could be a bleeding-edge label (main?) that holds packages built from the master branches of the relevant projects.

There should also be a stable label that holds packages verified to actually work with user projects: arch-defs, tool-perf, examples, etc.

I believe the user repos could have a nightly job that builds everything with the bleeding-edge label.

There are several questions:

  • is it possible to have a GH action verifying results of such nightly builds (across GH organizations)?
  • do all the user repos rely on the same versions of packages?
  • probably more questions to come

One thing to mention is that the Anaconda client is... not perfect. Handling of labels requires a lot of manual parsing, and some commands simply don't work as expected/described, so the process ends up very manual. This feature was definitely not created with such workflows in mind, and we're abusing it a bit.
I am not sure yet, but it might be that we'd have to write a tool hooking into the anaconda client code to handle this.
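
For illustration, a rough sketch of what such a tool could look like, hooking into the anaconda-client Python internals instead of parsing CLI output. Both binstar_client.utils.get_server_api and the response shape assumed for api.package() are internals, not a stable public API - treat this as an assumption to verify, not a recipe.

```python
from binstar_client.utils import get_server_api

def files_under_label(owner, package, label):
    """Yield basenames of package files published under a given label."""
    api = get_server_api()  # reuses the token stored by `anaconda login`
    info = api.package(owner, package)
    for f in info.get("files", []):
        if label in f.get("labels", []):
            yield f["basename"]

for name in files_under_label("litex-hub", "yosys", "main"):
    print(name)
```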

@umarcor commented Dec 2, 2020

Some thoughts:

  • The list of tools in litex-hub/litex-conda-eda and litex-hub/litex-conda-prog is (not surprisingly) similar to the lists in open-tool-forge/fpga-toolchain, hdl/containers or hdl/MINGW-packages, and probably other repos too. We should have a table that shows and links tools provided by each project:

    tool      conda-eda     conda-prog    fpga-toolchain  fpga-toolchain-progtools  containers  MINGW-packages
    yosys     stable, head  -             head            -                         head        snapshot
    icestorm  stable, head  -             head            -                         head        snapshot (WIP)
    iceprog   -             stable, head  -               head                      head        snapshot (WIP)
    ...

    Users should be able to execute any design/example using any of the toolchain/package providers. Therefore, it would be useful for the community to be aware of the available solutions, in order to pick the one that best fits each use case.

    I would propose creating hdl/packages for this purpose. That'd be a location for building a (probably single page) site, which can be updated semi-automatically. However, that repo only makes sense if different agents/maintainers collaborate in keeping "their" lists updated.

    NOTE: AFAIAA, the conda-based approach is the only one which is expected to provide/maintain different versions of the packages/tools. open-tool-forge/fpga-toolchain and hdl/containers provide heads, even though pinning/retrieving a specific release is possible. MSYS2 is a rolling project, just like Arch Linux; hence, older releases can be retrieved, but are not guaranteed to work.

  • hdl/smoke-tests was created for sharing a minimal set of individual tests that each packager should run. Many packagers test several tools at the same time by, e.g., running actual HDL examples. By contrast, this repo is expected to contain tests for a single tool only. If required, "golden" sources and outputs are to be added, but no additional dependencies should be required besides the tool to be tested. Because of the nature of this repo, the list of tests is expected to match the list of tools in the table above.

  • @PiotrZierhoffer, regarding the dependency layers/levels/stages, see the graph in https://hdl.github.io/containers/#_graph. Each black box (subgraph) is a CI (GitHub Actions) workflow. I would expect diagrams for other packaging schemes to be very similar. The graph is currently handwritten, but I expect to generate it from a JSON/YAML file, similar to how the table of tools and images is generated (https://github.com/hdl/containers/blob/main/doc/gen_tool_table.py#L5-L101).

    Triggering workflows is possible from other workflows, across repos in an org, and across orgs (see the sketch at the end of this comment). For instance, at the end of the CI run in ghdl/ghdl, a workflow is triggered in ghdl/docker, which initiates a chain of workflow executions. That is not enabled in hdl/containers yet: each workflow has a repository_dispatch trigger and there is a trigger script, but the trigger step has not been added to the workflows yet. Nevertheless, it is to be noted that cross-triggering workflows across organisations might not be desirable from a security perspective, because the permission granularity is coarse.

  • As explained in the GoogleDocs document, containers are useful not only for final usage, but also for distributing portable development environments. That is, instead of installing all the conda dependencies and other external dependencies in each CI workflow of this org, it might be desirable to have a symbiflow/containers repo for maintaining the build environments. Then, in GitHub Actions, container jobs or container steps can be used. The containers of symbiflow/containers can be based on hdl/build:* images (Debian Buster), or can be completely independent.

    NOTE: we have the symbiflow namespace in dockerhub, but it's empty ATM.
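
A minimal sketch of cross-repo triggering via repository_dispatch, as mentioned above. POST /repos/{owner}/{repo}/dispatches is the documented GitHub REST endpoint; the repo names and event type below are placeholders. The token must have write access to the target repo - which is exactly the coarse-granularity security concern raised above.

```python
import os
import requests

def trigger_workflow(owner, repo, event_type):
    resp = requests.post(
        f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        headers={
            "Authorization": f"token {os.environ['GH_TOKEN']}",
            "Accept": "application/vnd.github.v3+json",
        },
        json={"event_type": event_type},
    )
    resp.raise_for_status()  # GitHub answers 204 No Content on success

# trigger_workflow("litex-hub", "litex-conda-eda", "upstream-package-built")
```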

@PiotrZierhoffer (Author)

@umarcor thanks for the input.

I agree with the general thought that this effort should converge.

Regarding smoke-tests, if I'm not mistaken, this is something that could probably be used at a very early stage of package building - instead of the per-package test described in meta.yaml.

Layers should, obviously, align with what you have in the containers repo.

Cross-action triggers are an idea that emerged yesterday - so that we can build everything as usual, but only move packages to the stable label once the actual users verify that they work. Of course, the questions are which users will take part, whether they use the same versions, etc. So this is just a concept right now.

@PiotrZierhoffer commented Dec 2, 2020

There is one issue with this approach that I can identify right now: the dependency versions are not recorded in the user repo's git history.

Whenever you build an older commit of symbiflow-arch-defs that is not compatible with the newest tools, the build fails.

One approach to address it would be to use specific labels for different users. The flow could be:

  • build packages, push to main
  • nightly run a CI on a user repo (e.g. arch-defs), changing the used label to main
  • if the test works and package versions changed recently, copy packages to a build-specific label, like litex-hub/label/symbiflow-arch-defs-{DATE}
  • create a PR to symbiflow-arch-defs changing the used label to the one just created

This might lead to inconsistencies between users, but I don't know if it's a problem or not.
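
A sketch of the third step of the flow above: after a green nightly run, copy the packages the user repo installed to a dated, repo-specific label. The package specs, the label scheme and the anaconda-client flags are assumptions for illustration.

```python
import subprocess
from datetime import date

def snapshot_packages(specs, user_repo):
    # e.g. "symbiflow-arch-defs-2020-12-02"
    label = f"{user_repo}-{date.today().isoformat()}"
    for spec in specs:  # each spec is owner/package/version
        subprocess.run(["anaconda", "copy",
                        "--from-label", "main", "--to-label", label, spec],
                       check=True)
    return label  # to be written into the PR against the user repo

# snapshot_packages(["litex-hub/yosys/0.9_1234"], "symbiflow-arch-defs")
```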

cc @HackerFoo @litghost @mithro

@umarcor commented Dec 2, 2020

Regarding smoke-tests, if I'm not mistaken, this is something that could probably be used at a very early stage of package building - instead of the per-package test described in meta.yaml.

Yes. The main point is that we are building tools with different compiler/linker options, and sometimes using compatible but not exactly equal dependencies to the ones used upstream. Smoke-tests are just for ensuring that nothing is fundamentally wrong with the binaries that were generated. Hence: checking that tool --help or tool --version produce acceptable results (don't segfault), that tools which are expected to read/write files can actually do so (and do not produce empty files), that tools have the expected built-in plugins/modules, etc. Other, more complete tests should be run afterwards (elsewhere), since those depend on how packagers decide to group the tools. See the plumbing in https://github.com/hdl/containers/tree/main/test and a "complete" test in https://github.com/antonblanchard/microwatt/blob/master/Makefile#L33-L36 (this one is roughly equivalent to the conda/symbiflow examples).
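
A minimal smoke test in this spirit: no design sources, no extra dependencies, just a check that the packaged binary starts and reports a version. The tool name and version flag are examples; the right flag varies per tool.

```python
import subprocess

def smoke_test(tool, version_flag="-V"):
    result = subprocess.run([tool, version_flag],
                            capture_output=True, text=True, timeout=30)
    # A segfault shows up as a negative return code on POSIX.
    assert result.returncode == 0, f"{tool} exited with {result.returncode}"
    assert result.stdout.strip(), f"{tool} printed no version banner"

smoke_test("yosys")  # yosys prints its version with -V
```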

Cross-action triggers are an idea that emerged yesterday - so that we can build everything as usual, but only move packages to the stable label once the actual users verify that they work. Of course, the questions are which users will take part, whether they use the same versions, etc. So this is just a concept right now.

The issue with cross-triggers is that the user/bot which triggers a workflow in a repo must have write access to it. I believe they should change that (if they haven't already), but last time I checked, a personal access token was required. Therefore, if a bot is allowed to trigger workflows in the whole organisation, it can also break everything...


@mithro commented Dec 2, 2020

FYI - @enjoy-digital

@mithro commented Dec 2, 2020

BTW, I don't think the conda approach is expected to maintain specific versions of things; we are aiming for a rolling head there too.

@mithro commented Dec 2, 2020

The primary idea is to make sure the rolling head is always in a usable state.

@PiotrZierhoffer commented Dec 3, 2020

Reiterating yesterday's discussion with @mithro 👍

  • we need to have specific dep versions for specific commits of user repos
  • the Anaconda tool is in no way fit to handle complex labeling scenarios (e.g. I don't think it's even possible to add a label to an existing package - you have to remove it and reupload)
  • https://github.com/conda-incubator/conda-lock <- this could be helpful
  • a bot, in this case, would be specific to the user repo (no cross-org communication, just a nightly job)
  • CI scenario:
    • drop the versions lock file
    • run a CI on current HEAD versions
    • if CI passes, create a new lock file
    • create a PR
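
A hedged sketch of that CI scenario, driving the conda-lock CLI from a script. The file names follow conda-lock's per-platform output convention; the exact flags and file layout are assumptions that may differ between conda-lock versions.

```python
import pathlib
import subprocess

def refresh_lock(env_file="environment.yml", platform="linux-64"):
    lock_file = pathlib.Path(f"conda-{platform}.lock")
    # Step 1: drop the old versions lock file.
    lock_file.unlink(missing_ok=True)
    # Step 2: re-solve the environment against the current package HEADs.
    subprocess.run(["conda-lock", "-f", env_file, "-p", platform], check=True)
    # Step 3: CI installs from the fresh lock file; if it passes, the file
    # is committed and a PR is opened (e.g. with a bot account).

refresh_lock()
```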

@mithro commented Dec 3, 2020

I would suggest the lock update is done via:

  • Step 1 - Generate a new lock file with the latest versions in the conda repo.
  • Step 2 - Send a pull request.
  • Step 3 - If CI passes, merge the updated lock file pull request.

@PiotrZierhoffer (Author)

Right, this is easier and yields the same results.
