Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Labels #216

Closed
wants to merge 2 commits into from
Closed

[WIP] Labels #216

wants to merge 2 commits into from

Conversation

Ekleog
Copy link
Member

@Ekleog Ekleog commented Aug 10, 2018

Continuation of #164

This PR adds an @ofborg label [label-id] command that tags the PR/issue as label (after a label-id → label translation phase), via a dedicated worker so that response can be instantaneous even if the builders are all in use.

Still missing:

  • Authentication (ie. restricting access to only people who have contributed at least one commit or PR owner, see githubcommentfilter.rs:140)
  • Loading the label-id -> tag-for-github association from a file (see here)
  • Having someone who actually knows a bit about AMQP proofread the code

Review should be easier commit-by-commit :)

@timokau
Copy link
Member

timokau commented Aug 29, 2018

Can I somehow help push this forward (without knowing too much about ofBorgs interna)?

@LnL7
Copy link
Member

LnL7 commented Aug 29, 2018

Is this still relevant given that ofborg labels pull requests automatically now?

@timokau
Copy link
Member

timokau commented Aug 29, 2018

Yes, for something like NixOS/rfcs#30. The automatic labels are a different use case than the user-supplied ones.

@Ekleog
Copy link
Member Author

Ekleog commented Aug 30, 2018

@timokau If you know Rust, I think the missing thing is, just above githubcommentfilter.rs:140, to add a check verifying that the user is allowed by some ACL. An example can likely be seen at githubcommentfilter.rs:58. First having a check against some hard-coded ACL, then dynamically loading it from a file (in a similar way to known-users and trusted-users). Then it'll be possible to poke grahamc for review of esp. github-labeler.rs, I think :)

@Ekleog
Copy link
Member Author

Ekleog commented Aug 30, 2018

Issue is, AFAIR there's currently no testbed, so apart from verifying that the code compiles and looks good… :/

@timokau
Copy link
Member

timokau commented Aug 30, 2018

What is ACL? Assuming you mean some kind of access control: I think a lot of the value of this is actually in allowing everybody to set some labels. That seems to work pretty well for sage. In case that ever becomes an issue we could still think about access control (it wouldn't be a huge deal if somebody spammed some wrong labels). A reasonable policy would be

  • the PR author can set all labels
  • anybody that is a "Contributor" e.g. has at least one merged commit can set all labels

But I think starting out with no access control whatsoever and see how that goes would be best.

@Ekleog
Copy link
Member Author

Ekleog commented Aug 30, 2018

The “anybody that is a "Contributor" e.g. has at least one merged commit can set all labels” part would likely be quite hard to handle (unless GitHub API gives the “Contributor” information easily?)

For allowing everybody to set some labels, I had been operating under the idea that it would be on access control, and some labels could be unrestricted (eg. removing waiting-for-author and setting waiting-for-review). For instance, the 2.status: * are tags that we would likely not want anyone to be able to set? unless people then can no longer trust the labels.

Then, if other people think this PR should land without authentication, then the only thing missing is filling the map at https://github.com/NixOS/ofborg/pull/216/files#diff-5addf818f41ce779c0a0619e30b160f8R24 with the labels that should be set-able by anyone. I can do this quite quickly if that is confirmed, or even load it them from a file :)

@timokau
Copy link
Member

timokau commented Aug 30, 2018

Looking at the current status labels, they are

  • 2.status: duplicate
  • 2.status: invalid
  • 2.status: merge conflict
  • 2.status: wait-for-upstream
  • 2.status: wontfix
  • 2.status: work-in-progress

Which are exactly those which would be useful to be changeable by anyone. Worst-case somebody would spam PRs with status changes. That wouldn't be the end of the world and if its really bad we could write a quick one-off bot to revert those changes and implement access control or even just a blacklist.

As I said, this is working surprisingly well for sage. There everybody can make any change on a ticket and people tend to behave themselves. Since the worst case really isn't that bad I think its worth a try.

@Ekleog
Copy link
Member Author

Ekleog commented Aug 31, 2018

Comments from IRC:

gchristensen │ imo it should be a trivially low bar, but more than just "anybody can do it"
      ekleog │ can you comment on for example what “a trivially low bar” would be to you? […]
   infinisil │ ekleog: Maybe them being a contributor

It sounds like “restricting access to contributors” would currently gather the most voices, so I'm editing the top post to reflect the remaining things to be done :)

@timokau
Copy link
Member

timokau commented Aug 31, 2018

I think the "or the owner of the PR" part is important too. Otherwise first time contributors may have a hard time and be confused.

@Ekleog
Copy link
Member Author

Ekleog commented Aug 31, 2018

Sounds like not risking much, top post edited :)

@ryantm
Copy link
Member

ryantm commented Sep 12, 2018

There is an API endpoint for contributors, but hubcaps doesn't have support for it.

@ryantm
Copy link
Member

ryantm commented Sep 12, 2018

The GitHub API endpoint for contributors only supports showing information for the top 500 contributors. I think we should use the maintainers file of nixpkgs instead. This currently has 900 entries.

If we use the maintainers file from the PR and the requester included their name in the maintainers file, this would automatically support allowing the requester to edit the labels. (This could be exploited by adding a bunch of maintainers, but that seems like a high enough bar than spamming GitHub comments directly.

@timokau
Copy link
Member

timokau commented Sep 12, 2018

If we use the maintainers file from the PR and the requester included their name in the maintainers file, this would automatically support allowing the requester to edit the labels

So would ofBorg take the list from the PR? Or from master? Or from after merging the PR into master? The last one probably makes the most sense.

I would still prefer the "Contributor" variant, but this would be a good replacement if that is not possible. One thing to consider is that it would encourage people to add themselves to the maintainers list even though they don't actually maintain anything. Not much harm in that, but it does change the purpose of the list a bit.

@ryantm
Copy link
Member

ryantm commented Sep 12, 2018

I'm not sure how reliable it would be, but we could get the list of public email addresses that the label change requester has via the GitHub Users API. Then search for those email addresses in

git shortlog --summary --numbered

on the nixpkgs repo.

@ryantm
Copy link
Member

ryantm commented Sep 12, 2018

We could also search for their GitHub username in the maintainers list. The combination of both should be quite reliable.

@timokau
Copy link
Member

timokau commented Sep 12, 2018

Good idea! People can set arbitrary email addresses when commiting though, not sure if there are any security implications. Probably nothing relevant.

@Ekleog
Copy link
Member Author

Ekleog commented Sep 12, 2018

The idea of using the maintainer list will be hard to pull out: this would mean there would need to be an evaluation before accepting the label change. Which would need propagating information where it's not really propagated yet, and making the overall system much more complex (I think). And the added evaluation would put again more pressure on the master ofborg instance, unless refactoring everything to get the evaluations out of it.

I think it would be much easier to find another way to handle this actually using the GitHub API. For instance, using /search/commits?q=author:[authorname]. I hope that wouldn't exceed GitHub's rate-limiting, though.

@Ekleog
Copy link
Member Author

Ekleog commented Sep 12, 2018

Hmm, actually maybe https://developer.github.com/v3/repos/statistics/#get-contributors-list-with-additions-deletions-and-commit-counts may be more suited to getting the contributor list, and have ofborg cache itself the outcome of this command for something like 1hr, so as to be sure to not exceed the rate limiting?

@ryantm
Copy link
Member

ryantm commented Sep 12, 2018

The maintainer list idea doesn't have to work in the fancy way we were proposing. It could just be a cached hourly evaluation of the maintainer list from master + the PR requester.

Good to know about that statistics endpoint, but does it only show the statistics for the current week? It looks like hubcaps does not have support for statistics endpoints currently.

@ryantm
Copy link
Member

ryantm commented Sep 12, 2018

Using /search/commits?q=author:[authorname] probably would be fine, but I'm not sure what GitHub thinks about using non-API pages like this.

@Ekleog
Copy link
Member Author

Ekleog commented Sep 13, 2018

About hubcaps not having it, currently it looks much easier to me to add support for it to hubcaps than to fetch it from the maintainer list. However, you're right it looks like it only provides the last week worth of activity, and I can't find documentation about the api endpoint that appears to be used to generate eg. https://github.com/NixOS/ofborg/graphs/contributors (ie. /:org/:repo/graphs/contributors-data, if I guess correctly).

For the search API, it's documented here, and appears to be rate-limited at 30 requests / minute, so I think we shouldn't have to worry too much about the rate-limiting? If so I think this looks like the best way to go, currently.

@timokau
Copy link
Member

timokau commented Sep 13, 2018

I'm almost sure I've seen bots reacting to first-time contributions before, so the information must be available somehow right?

A quick search shows https://github.com/behaviorbot/welcome, although I can't find any actual implementation.

@Ekleog
Copy link
Member Author

Ekleog commented Sep 13, 2018

… ok, actually I think the information is already present in the webhook payload: this page puts "author_association": "OWNER",

Unfortunately there appears to be little-to-no documentation on what values author_association can actually take.

@nixos-discourse
Copy link

This pull request has been mentioned on Nix community. There might be relevant details there:

https://discourse.nixos.org/t/suggestion-github-label-for-prs-for-darwin-packages/3458/6

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/suggestion-to-help-maintainers-with-nixpkgs-prs-bloat/4794/14

@Ekleog
Copy link
Member Author

Ekleog commented Jun 13, 2020

Seeing as marvin seems to be making progress, I'm going to close this, as it look like both implement basically the same thing and it'll probably be easier to deploy it as a separate tool than integrated with ofborg.

@Ekleog Ekleog closed this Jun 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants