Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a filter for maintainers in the jobset-eval view #743

Open
wants to merge 40 commits into
base: master
Choose a base branch
from

Conversation

Ma27
Copy link
Member

@Ma27 Ma27 commented Apr 27, 2020

Summary

Continuation of my efforts from #685.

During the release-managers' talk from last NixCon it has been suggested to add a search-feature to list all failing packages a user maintains. Theoretically this is already possible by introspecting the job-list in the dashboard of a Hydra-user, but not everyone owns a google-account (which is usually needed for the upstream-Hydra), also not everyone uses their google-email for NixOS stuff (I don't, for instance).

To summarize, the following things changed:

  • Instead of saving all maintainers' emails as a comma-separated string in the builds table (and therefore creating a lot of duplications), all maintainers are stored in a separate table and are associated to a build using a many-to-many relationship in DBIx. Whenever maintainers are needed in a controller-action, those can be fetched using an additional JOIN. For the REST-API, the maintainers are fetched by default when requesting details of a build.

    This has the additional benefit that if the GitHub-handle of a user changes, Hydra updates the existing maintainer entry.

  • The filter-form in the jobset-eval view now contains an additional select-field to choose whether to filter by name or by maintainer (for backwards-compatibility, this is optional and by default it's filtered by job-name).

For a detailed overview, I suggest to review the commits including their messages.

Screenshots
  • The jobset-eval overview has been slightly changed to display and filter maintainers:
    2020-04-28-000224_screenshot

  • The detail-view of a build now displays the github-handles by default:
    2020-04-27-233855_screenshot

  • If a maintainer doesn't have a github-handle, the email is shown as fallback:
    2020-04-27-234030_screenshot

Pending work
  • Provide a way to upgrade. The migration I hacked together just drops the maintainers field from all entries in the builds-table which is bad if you need to search older jobsets or want to receive notifications if a job is restarted. I had the following ideas (though at least @grahamc might have more experience here):
    • add a migration-approach as implemented for the jobset_id-fixes
    • re-add the old maintainers-column and merge the data from this column with the data from the maintainer-table
  • Investigate if there's a way to retrieve the symbols of a maintainer (e.g. lib.maintainers.ma27). In that case it's even possible for Hydra to keep track of changes to the GitHub handle and email-address of a maintainer.
  • (bonus-points) add support for this in hydra-check

cc @edolstra @grahamc

@Ma27 Ma27 mentioned this pull request Apr 27, 2020
@Ma27 Ma27 force-pushed the jobset-search-maintainer branch from 43ed0d7 to c314c86 Compare May 6, 2020 21:03
@gilligan
Copy link
Contributor

Are you still working on this?

I guess you still need help on the database migration matter? How about the part with retrieving maintainer info?

@Ma27
Copy link
Member Author

Ma27 commented May 15, 2020

Are you still working on this?

Yes.

I guess you still need help on the database migration matter?

Well, the question is much rather, how do we want to migrate? Should we still keep the maintainers-column for backwards-compatibility or should we do some sort of hydra-backfill-maintainers-thing just like it happened with the jobset_id migration?

Or do you have an even better idea? :)

How about the part with retrieving maintainer info?

I'm not even sure if that's possible tbh. Does anybody have anough experience with the C++ API to accomplish that? (or, as a dirty hack, one may fetch the key from lib.maintainers if email/github match, however this doesn't work if a 3rdparty-project has its own list of maintainers).

@Ma27
Copy link
Member Author

Ma27 commented May 20, 2020

@gilligan what do you think of my suggestions regarding the DB-migration? And who else should I also ping for a review?

While I'm happy to continue working on this, I'd prefer to get some feedback on this first :)

@grahamc
Copy link
Member

grahamc commented May 20, 2020

@Ma27 do you have a few hundred gigabytes of free space to try out the migration on the prod dataset? If so, I can get you a copy.

@Ma27
Copy link
Member Author

Ma27 commented May 22, 2020

@grahamc,

do you have a few hundred gigabytes of free space

will be a bit close, but should be possible :)


But first of all we should agree on how to implement the migration (I posted two ideas in the original comment).

@Ma27
Copy link
Member Author

Ma27 commented Jun 1, 2020

Just for the record, I'm happy to continue working on this (including a rebase to master :p), but first of all, I'd like to get some feedback and to get the migration-topic resolved.

@Ma27 Ma27 force-pushed the jobset-search-maintainer branch 2 times, most recently from da4beb1 to 34a4848 Compare June 4, 2020 11:08
@Ma27
Copy link
Member Author

Ma27 commented Jun 4, 2020

@grahamc in case you have time, could you please take a look at my comments about the migration needed for this change?

@Ma27
Copy link
Member Author

Ma27 commented Jun 19, 2020

ping @grahamc @gilligan

@Ma27
Copy link
Member Author

Ma27 commented Jul 4, 2020

ping @grahamc how should we proceed with the migration question?

Copy link
Member

@grahamc grahamc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me I like the idea of it. One point of caution is the github handle can change often. I notice you cover this already: nicely done.

For the migration, in my most humble opinion it would be really good to retain the historical data. I think we might not have chosen to if we wrote the code ourselves for hydra.nixos.org, but I think overall other users of Hydra would find it important to have the data preserved.

If you agree, to that end we could do this a bit more relaxed than #710, but still in two releases:

  1. Create the new tables and start writing data to them
  2. Use the old column for all of the queries
  3. Extend hydra-backfill-ids to operate over the builds table, reading and writing data to the maintainers and pivot table. I think the historical values of this field are all emails in the format of Name <email@domain> and can just be written directly in to the email field.

In a second release, swap reads over to the new table and then drop the old column.

A different route we could also take would be a single-release, but has a trade-off that the API and website would show null data for maintainers until the backfiller completes:

  1. Create the new tables and start writing data to them
  2. Use the new table for all of the queries, just as you've done here
  3. Extend hydra-backfill-ids to operate over the builds table, reading and writing data to the maintainers and pivot table. I think the historical values of this field are all emails in the format of Name <email@domain> and can just be written directly in to the email field.
  4. At the end of hydra-backfill-ids, drop the old column.

Do you have any opinions or ideas on this? I have a preference for the simplicity of the second one, but also am not so keen on the maintainer data being NULL. I'd also want to see the backfiller tested thoroughly, since that final drop would be inconvenient if it did the wrong thing.

Also, when I wrote #710 I wrote down these notes about migrations: https://www.notion.so/grahamc/How-to-make-a-database-change-123c16a7ea414b5eb45d27f8e11ca2d3

src/lib/Hydra/Schema/Builds.pm Outdated Show resolved Hide resolved
<em>[% maintainer(m) %]</em>[% UNLESS loop.last %], [% END %]
[% END %]
[% END %]
</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which page is this rendered on? I'm a bit unsure about how often we want to query and display this data, in particular I'm thinking tables which are very heavy with many rows of builds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which page is this rendered on?

  • Jobset evals (this are probably the biggest issue if ?full=1 is set).
  • Search page
  • Status page (running builds etc)
  • "My builds" (if your email matches the one from your Hydra account you can already see maintained derivation-builds).

I'm a bit unsure about how often we want to query and display this data, in particular I'm thinking tables which are very heavy with many rows of builds.

While the biggest problem will probably be the jobset-eval page, this is the one where I really want to have this feature. I suggest to test this with live-data and try to optimize the query (and caching) if necessary.

src/script/hydra-eval-jobset Outdated Show resolved Hide resolved
src/sql/hydra.sql Show resolved Hide resolved
@Ma27
Copy link
Member Author

Ma27 commented Jul 9, 2020

@grahamc Thanks a lot for the detailed feedback!

For the migration, in my most humble opinion it would be really good to retain the historical data. I think we might not have chosen to if we wrote the code ourselves for hydra.nixos.org, but I think overall other users of Hydra would find it important to have the data preserved.

Agreed. That's why I brought this up :)

Do you have any opinions or ideas on this? I have a preference for the simplicity of the second one, but also am not so keen on the maintainer data being NULL.

I agree. Rather than having to keep the upgrade-path code in nixpkgs (NixOS/nixpkgs#83600) and to add more complexity to this change, it's preferable to let users wait for a short amount of time until the migration is fully finished IMHO. In the end, this isn't a mission-critical thing anyway from my PoV.

I'd also want to see the backfiller tested thoroughly, since that final drop would be inconvenient if it did the wrong thing.

This is actually the part I'm mostly anxious about. As soon as there's a working draft for hydra-backfill-ids we could test this with live data from hydra.nixos.org for instance as you suggested a while ago.

@Ma27
Copy link
Member Author

Ma27 commented Jul 9, 2020

Btw I'll try to implement the migration during the next weeks (as soon as I have time to) as we seem to agree to go for the second option. As soon as this is done, I'll try to do some live-testing to make sure that this doesn't explode with real-world data.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rss-feed-for-newly-failing-builds-from-hydra/8514/7

@siraben
Copy link
Member

siraben commented Dec 5, 2020

Any updates on this? It would be nice to have and closes NixOS/nixpkgs#101741

@Ma27
Copy link
Member Author

Ma27 commented Dec 5, 2020

Just left a comment in the issue you've linked: I don't think that it should be blocked by this. When it's ready we can still adapt the script.

Regarding the updates: it's still missing a sane migration path and I'm sorry, but I didn't have the time yet to work on that. Hopefully, I'll be able to finish that in December :)

@Ma27
Copy link
Member Author

Ma27 commented Dec 28, 2020

Just pushed a small batch of changes related to the DB migration. Currently, the following things are tbd to consider this ready:

  • hydra-update-maintainers should perform DB operations in batches rather than everything at once.
  • I need to address the comments from @grahamc
  • @grahamc btw, how should we test a migration with productive data? E.g. via a subset of Hydra's live data?

@Ma27
Copy link
Member Author

Ma27 commented Dec 29, 2020

@grahamc would you mind taking a look at the migration so we can make a plan on how to test it? When I have more time & motivation for this again, I'd take care of your review comments from July :)

@Ma27 Ma27 requested a review from grahamc December 29, 2020 20:28
@Ma27
Copy link
Member Author

Ma27 commented Jan 4, 2021

@edolstra would you mind leaving a review here? :)

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/439

@Ma27
Copy link
Member Author

Ma27 commented Jan 17, 2021

cc @NixOS/hydra-triage any chance to get a review here? :)

@Ma27
Copy link
Member Author

Ma27 commented Feb 6, 2021

@grahamc did you have a chance to take a look? :)

In order to search for maintainers (emails or github handles), both
values need to be stored. Also, it's probably not desirable to do this
in a string-separated list which is fairly hard to query.

Instead, those values will be stored in a separate table named
`Maintainer` which is associated to the builds-table using a many-to-many[1]
relationship. Please note that I didn't implement a migration for older
builds yet, we probably want to implement an approach similar to the
`hydra-backfill-ids`-implementation[2] here.

Another benefit of this is that if a maintainer with a given email
changes their GitHub-handle, Hydra will recognize this and display
the new handle.

[1] https://metacpan.org/pod/DBIx::Class::Relationship
[2] f692601
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/calling-all-hydra-users/11851/7

@Ma27
Copy link
Member Author

Ma27 commented Apr 17, 2021

@grahamc may I ask if you'd have some time for a review / test of this? I guess I may have some more timeslots to in the near future to e.g. take care of the next rebase and respond to comments / change requests :)

 Conflicts:
        flake.lock
        flake.nix
        src/lib/Hydra/View/NixExprs.pm
        src/root/jobset-eval.tt
        src/script/hydra-eval-jobset
        t/maintainers-upgrade-test.nix

Decided against a rebase here and t resolve everything in one go.
…tine

Also explain which relation is resolved here. IMHO this DBIx DSL isn't
the nicest one for this job.
@Ma27
Copy link
Member Author

Ma27 commented Apr 26, 2021

This could use tests for at least the channel output and Email notifications btw. But the current testcase should cover most of the stuff that's relevant already :)

Ma27 added 3 commits May 8, 2021 16:56
This isn't needed since latest Hydra is quite recent and works fine with
latest nix.
@Ma27
Copy link
Member Author

Ma27 commented May 8, 2021

@cole-h @grahamc merged latest master into this :)

@Ma27
Copy link
Member Author

Ma27 commented May 28, 2021

@cole-h @grahamc @edolstra any chance to get a review? Would really like to proceed with this :)

@Ma27
Copy link
Member Author

Ma27 commented Jul 26, 2021

@cole-h @grahamc do you currently have the capacity to move this forward?

 Conflicts:
        flake.lock
        src/lib/Hydra/Controller/Build.pm
        src/lib/Hydra/Plugin/EmailNotification.pm
        src/lib/Hydra/Schema/Result/Builds.pm
        src/lib/Hydra/Schema/Result/Users.pm
        src/sql/update-dbix.pl
        src/sql/upgrade-75.sql
@Ma27
Copy link
Member Author

Ma27 commented Nov 6, 2021

Since I was rebasing other branches anyways, I decided to just update this one as well :)

cc @grahamc @cole-h if you'd like to review this and need some assistance, let me know!

Note: even if we're finding the time to get this ready, I'd suggest to not use it for this ZHF period since it requires some database migration tasks that I'd rather not do at the time of the year, where the hydra.nixos.org has its most users :)

@Ma27
Copy link
Member Author

Ma27 commented Feb 6, 2022

@grahamc should we try to get this into a mergable state at some point now? %)

 Conflicts:
        flake.lock
        flake.nix
        src/lib/Hydra/Controller/JobsetEval.pm
        src/lib/Hydra/Controller/Project.pm
        src/lib/Hydra/Schema/Result/Builds.pm
        src/sql/upgrade-79.sql
 Conflicts:
        src/hydra-eval-jobs/hydra-eval-jobs.cc
        src/sql/upgrade-82.sql
@Ma27
Copy link
Member Author

Ma27 commented Sep 19, 2022

@grahamc, so I just did the following things (as we discussed.... quite a while ago):

IIRC the next step is to test this somewhere against hno, correct? In that case I'd probably need a test-instance since I'm pretty sure that none of my servers has sufficient disk-space (though I don't know the exact numbers, but I'd expect the db to be >100G at least, right)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants