Consolidate / batch process multiple updates in a single PR #52

codificat · 2021-11-19T10:29:00Z

Is your feature request related to a problem? Please describe.

Some prescription refresh runs generate a considerable amount of updates, each of which comes in the form of a new Pull Request to the prescriptions repo.

This causes a lot of noise.

Also, our triage party application takes a long time to refresh data, because it has to get info about all these PRs.

High-level Goals

The proposal is to consolidate multiple updates from a job run into a single PR.

Describe the solution you'd like

1 PR per job run.

Describe alternatives you've considered

Keep things as they are and bear with the volume of PRs and associated notifications.

Additional context

At the time of this writing, there are over 18000 closed PRs in that repo.

Acceptance Criteria

multiple changes are consolidated in a single PR
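
As a rough illustration of that criterion, the sketch below folds every update from one job run into a single pull request plan instead of one PR per update. Everything in it is hypothetical: the `Update` and `PullRequestPlan` types, the `consolidate` function, the `refresh/<run id>` branch naming and the example file paths are assumptions for illustration, not part of the existing refresh job.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Update:
    """One prescription change produced by a refresh run (hypothetical type)."""
    path: str
    summary: str


@dataclass(frozen=True)
class PullRequestPlan:
    """Everything needed to open one PR (hypothetical type)."""
    branch: str
    title: str
    body: str
    paths: List[str]


def consolidate(job_run_id: str, updates: List[Update]) -> PullRequestPlan:
    """Fold all updates from a single job run into one PR plan."""
    if not updates:
        raise ValueError("nothing to consolidate")
    # Sort by path so the PR body is stable across reruns.
    ordered = sorted(updates, key=lambda u: u.path)
    body = "\n".join(f"- {u.path}: {u.summary}" for u in ordered)
    return PullRequestPlan(
        branch=f"refresh/{job_run_id}",  # assumed naming scheme
        title=f"Prescription refresh {job_run_id} ({len(ordered)} updates)",
        body=body,
        paths=[u.path for u in ordered],
    )


plan = consolidate(
    "2021-11-19",
    [
        Update("prescriptions/b.yaml", "bump b"),
        Update("prescriptions/a.yaml", "bump a"),
    ],
)
print(plan.title)
```

With a plan like this, the job would push one branch and open one PR whose body lists each change, so reviewers and tooling see a single notification per run.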

sesheta · 2022-05-17T10:15:42Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sesheta · 2022-06-16T12:48:41Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

sesheta · 2022-07-16T15:43:13Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

sesheta · 2022-07-16T15:43:15Z

@sesheta: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

codificat · 2022-08-09T12:32:35Z

/reopen
/sig stack-guidance

I believe this is still relevant

sesheta · 2022-08-09T12:32:51Z

@codificat: Reopened this issue.

In response to this:

/reopen
/sig stack-guidance

I believe this is still relevant

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

VannTen · 2022-08-09T13:17:22Z

Could the PRs be labelled, and the query/notifications use the label to exclude those PRs?

codificat · 2022-08-17T10:07:08Z

> Could the PRs be labelled, and the query/notifications use the label to exclude those PRs?

I am not sure I understand you here, sorry. Can you elaborate?

The goal of this request is to try to reduce (consolidate) the number of PRs created on the repo. Regardless of labels, e.g. the pre-commit checks should be run - and these do take some time each (lots of yaml files to check).

/remove-lifecycle rotten

VannTen · 2022-08-17T12:26:53Z

> I am not sure I understand you here, sorry. Can you elaborate?

Sure.

> The goal of this request is to try to reduce (consolidate) the number of PRs created on the repo. Regardless of labels, e.g. the pre-commit checks should be run - and these do take some time each (lots of yaml files to check).

Ok, I didn't catch that we were also concerned with the resources taken up by pre-commit hooks. So, to summarize, we have three problems caused by the high number of PRs:

1. High number of PR check jobs (pre-commit/other)
2. Lots of notifications -> high mental load
3. Long triage party refresh times

My suggestion/question was: can we add metadata (I'm thinking simply labels, but maybe something else) to all "auto" PRs, so that these 3 consumers can filter them out or treat them specially? In order:

1. pre-commit: if we can filter the PRs (not sure if prow.yaml supports that sort of thing, e.g. different args depending on labels?), we could use something like `--to-ref --from-ref` (pre-commit options) instead of `--all-files` to reduce the job cost (hooks are run only on modified files).
2. notifications: I don't think GitHub allows us to filter notifications at the source. We could have a standardised filter (mail or GitHub UI), but that's probably the hardest.
3. triage party: if it uses the REST or the GraphQL interface, the query might be configurable to exclude those PRs entirely.

(All of that is based on the assumption that we do get value from 'single-unit' PRs and intend to explore the space of what's possible. I'm not familiar enough with prescriptions to assess that value.)
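
To make the pre-commit point concrete, here is a minimal sketch of how a CI step might choose between the two invocations. `--from-ref`, `--to-ref` and `--all-files` are options of `pre-commit run`; the `pre_commit_command` helper and the `origin/master` base ref are assumptions for illustration, and nothing here reflects how the prow job is actually configured.

```python
import shlex
from typing import List, Optional


def pre_commit_command(base_ref: Optional[str] = None, head_ref: str = "HEAD") -> List[str]:
    """Build the argv for a pre-commit run.

    With a base ref, hooks only see the files changed between base_ref and
    head_ref; without one, fall back to checking every file in the repo.
    """
    if base_ref:
        return ["pre-commit", "run", "--from-ref", base_ref, "--to-ref", head_ref]
    return ["pre-commit", "run", "--all-files"]


# Incremental run for a PR branch (base ref is an assumed example).
incremental = shlex.join(pre_commit_command("origin/master"))
# Full run, as the existing check does.
full = shlex.join(pre_commit_command())
print(incremental)
print(full)
```

The resulting list can be passed to `subprocess.run` as is; for a PR that touches a handful of YAML files, the incremental form should only hand those files to the hooks.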

VannTen · 2022-08-18T16:18:33Z

Relevant: google/triage-party#10
Triage-party seems to be able to handle bigger repos than that, so maybe our configuration is not optimal?
Apparently it does pre-match filtering (I guess by incorporating match parameters in the query to the GitHub API?) and there is the possibility of having a cache, optionally persistent.

VannTen · 2022-08-19T11:11:27Z

Regarding pre-commit, we should also check whether all pre-commit runs share a cache. Pre-commit creates immutable environments (per hook, I think, hashing together url + version + additional_dependencies, I suppose) which are reusable if they're the same (one of the reasons I use types-all instead of more fine-grained stub packages for mypy hooks). At least the command line says so. I wanted to check this morning, but trying to get the job logs returns a 504 ^. -> Reminder to myself to check on it.
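
For reference, a minimal sketch of one way runs could be pointed at a common cache: pre-commit reads the `PRE_COMMIT_HOME` environment variable to locate its environment store. The `shared_cache_env` helper and the `/shared/pre-commit-cache` path are assumptions for illustration; whether the CI pods can actually mount a shared volume there is exactly what still needs checking.

```python
import os
from typing import Dict, Optional


def shared_cache_env(cache_dir: str, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with pre-commit's store redirected.

    PRE_COMMIT_HOME tells pre-commit where to keep its hook environments,
    so every run using the same directory can reuse what is already built.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PRE_COMMIT_HOME"] = cache_dir
    return env


# Hypothetical mount point for a volume shared between CI runs.
env = shared_cache_env("/shared/pre-commit-cache", base_env={"PATH": "/usr/bin"})
print(env["PRE_COMMIT_HOME"])
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching pre-commit.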

VannTen · 2022-08-19T14:18:28Z

Something seems weird. While checking aicoe-ci/tasks/pre-commit-check.yaml, I found:

    - name: run-pre-commit
      image: quay.io/thoth-station/thoth-precommit:v0.12.2
      workingDir: /workspace/repo
      script: |
        if [[ -f .pre-commit-config.yaml ]]; then
          set +e
          out=$(pre-commit run --all-files 2>&1)
          exit_code=$?
          set -e
          if [[ $exit_code -ne 0 ]]; then
            state="failure"
            desc="The pre-commit test failed!"
            cat <<EOF > /workspace/repo/pr-comment
        <details>
        <summary>Pre-Commit Test failed! Click here</summary>

        \`\`\`
        $out
        \`\`\`
        </details>
        EOF
          else
            state="success"
            desc="The pre-commit test succeeded!"
          fi
          cat <<EOF > /workspace/repo/pr-status.json
          {
            "state": "$state",
            "desc": "$desc"
          }
        EOF
        fi

Isn't part of that defined in .prow.yaml?
So which is deciding what images & cmd to run? Am I looking at the correct file?

codificat · 2022-08-19T15:00:08Z

> Isn't part of that defined in .prow.yaml?
> So which is deciding what images & cmd to run? Am I looking at the correct file?

I believe that this check was there before prow was deployed and implemented - it is probably deprecated by now, although there might be some repos that still use it. I remember removing some references in the past (basically, remove the pre-commit check from .aicoe-ci.yaml and add it to .prow.yaml instead)

@harshad16 might be able to confirm this.

Maybe worth creating an issue to hunt down any remaining references and deprecate that check?

codificat · 2022-08-19T15:10:39Z

Potentially related (as an improvement to the prescriptions CI): thoth-station/prescriptions#8

harshad16 · 2022-08-19T19:32:15Z

The pre-commit check and pytest in aicoe-ci are no longer used.
We can remove them now, as prow takes care of it.

VannTen · 2022-08-22T07:50:17Z

So the prow job is entirely defined in .prow.yaml? It's not quite clear from looking at .prow.yaml whether that depends on external stuff or not. (And the `context: aicoe-ci/prow/commit` might be confusing ^)

On Fri, Aug 19, 2022 at 08:00:20AM -0700, Pep Turró Mauri wrote:
> Maybe worth creating an issue to hunt down any remaining references and deprecate that check?

In aicoe-ci, you mean? That could be a good idea. I think we probably also need one for documenting the CI process in core. AFAICT, we don't use Zuul anymore ^^

VannTen · 2022-09-01T09:01:28Z

@codificat Regarding specifically the triage-party issue, should we create an issue for that? There was no lagginess this week when I used it, but probably because there isn't any active open PR in the prescriptions repo.

codificat · 2022-10-04T16:39:07Z

> @codificat Regarding specifically the triage-party issue, should we create an issue for that? There was no lagginess this week when I used it, but probably because there isn't any active open PR in the prescriptions repo.

As far as I can tell, the issue only affects triage-party upon startup, while it initially creates the cache. We don't restart triage-party that often in prod, so it's probably fine as it is.

More serious, I believe, is the time it takes to run `pre-commit run --all-files` on each of the small PRs. thoth-station/prescriptions#27401 was a nice idea, but it was reverted and not fixed. Giving it another try in thoth-station/prescriptions#30488

VannTen · 2022-10-17T11:47:11Z

So, did thoth-station/prescriptions#30488 help?

codificat · 2022-10-17T22:03:07Z

> So, did thoth-station/prescriptions#30488 help?

For the PRs coming from the refresh job, I'm not so sure. I see that @mayaCostantini merged more than 600 PRs from the last refresh batch, but AFAICT no pre-commit checks were performed on them - at least by prow.

The main topic of this issue, though, is the high volume of PRs to begin with. It is still a wishlist item for me to reduce that number 😇

I believe the incremental pre-commit helps significantly with PRs that are sent by humans. On one I saw the pre-commit check run go down from ~45 min to ~2 min (I can't find the reference now... I blame the sheer volume of PRs 😛 - although by now a job would have been cleaned up anyway).

VannTen · 2022-10-18T08:15:49Z

> The main topic of this issue, though, is the high volume of PRs to begin with. It is still a wishlist item for me to reduce that number 😇

I still don't have any opinion on that ^. What's the choking point exactly, still notifications/triage party then? @mayaCostantini I think you said there was some value in keeping separate PRs?

> I blame the sheer volume of PRs 😛

`-label:bot -bot` helps significantly with that ^
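
As an illustration of that kind of filter, here is a small sketch that composes a GitHub search string excluding labelled PRs. `repo:`, `is:pr` and the negated `-label:` qualifier are standard GitHub search syntax; the `pr_search_query` helper is hypothetical, it assumes the automated PRs carry the `bot` label, and it leaves out the free-text `-bot` term.

```python
from typing import Iterable


def pr_search_query(repo: str, exclude_labels: Iterable[str] = ()) -> str:
    """Compose a GitHub search query for PRs, excluding the given labels."""
    terms = [f"repo:{repo}", "is:pr"]
    for label in exclude_labels:
        # Labels containing spaces must be quoted in GitHub search.
        value = f'"{label}"' if " " in label else label
        terms.append(f"-label:{value}")
    return " ".join(terms)


query = pr_search_query("thoth-station/prescriptions", ["bot"])
print(query)
```

The same string can be pasted into the pull request search box or sent as the `q` parameter of a search API call.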

codificat · 2022-10-18T11:23:00Z

> What's the choking point exactly, still notifications/triage party then?

For me, yes: notification handling is broken for me. That might be affecting only me and can be safely ignored, though. Triage party does take a hit, but as far as I know it only happens during startup, which does not happen often. We might also hit API rate limits when this happens, but again this is rare.

All in all, you are right: this is not a big issue. And if you prefer to close it instead, so be it 😃. Especially if there is value in keeping (hundreds of) separate PRs. I don't see why, but I might be missing something.

codificat added the kind/feature label Nov 19, 2021

goern added the priority/important-longterm label Feb 16, 2022

sesheta added the lifecycle/stale label May 17, 2022

sesheta added the lifecycle/rotten label and removed the lifecycle/stale label Jun 16, 2022

sesheta closed this as completed Jul 16, 2022

sesheta reopened this Aug 9, 2022

sesheta added the sig/stack-guidance label Aug 9, 2022

codificat removed the lifecycle/rotten label Aug 18, 2022

VannTen mentioned this issue Aug 19, 2022

Run pre-commit only on modified files from PR thoth-station/prescriptions#27401

Merged

VannTen mentioned this issue Aug 19, 2022

Add shared cache to pre-commit pipelines AICoE/aicoe-ci#180

Closed

This was referenced Aug 22, 2022

Documentation updates on CI thoth-station/core#449

Closed

[3pt] create a pre-commit-update-manager thoth-station/core#462

Open

Gregory-Pereira mentioned this issue Aug 24, 2022

Operate-first prow / CI issues from kubeval check operate-first/apps#2288

Closed

codificat added this to Planning Board Sep 26, 2022

codificat moved this to 🆕 New in Planning Board Sep 26, 2022

codificat mentioned this issue Oct 4, 2022

Run pre-commit only on the files modified by the PR thoth-station/prescriptions#30488

Merged
