Filter Pull Requests Server-Side with GitHub GraphQL API #22617
Comments
Seems alright, but it is a bit of a band-aid. Is there no additional filter we could add to limit the size of the response?
For GitHub specifically, it is possible to use the Search issues and pull requests API to filter pull requests by label server-side, which is very fast, but it only returns the pull request numbers and would require N more API calls to get the necessary details (e.g. head SHA) about each PR:

```go
query := fmt.Sprintf("repo:%s/%s is:pr is:open", g.owner, g.repo)
for _, label := range g.labels {
	query += fmt.Sprintf(" label:\"%s\"", label)
}
opts := &github.SearchOptions{
	ListOptions: github.ListOptions{
		PerPage: 100,
	},
}
for {
	result, resp, err := g.client.Search.Issues(ctx, query, opts)
	...
}
```

For repos with a large number of irrelevant PRs and a small number of relevant PRs this will be much faster, but for smaller repos (which I assume are more common) the existing code will be much faster. There seems to be no clear general solution. Increasing the page size is definitely a band-aid, but it might be the only reasonable thing that can be tuned with the REST API. I will explore the GraphQL API instead.
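To make the N-extra-calls drawback concrete, here is a minimal sketch of the search-then-fetch approach with go-github. This is not the actual argo-cd code; the function name, parameters, and module version are illustrative placeholders.

```go
package example

import (
	"context"
	"fmt"

	"github.com/google/go-github/v69/github" // module version is illustrative
)

// listLabeledPRs sketches the search-then-fetch approach: the Search API
// filters by label server-side, but each matching PR still needs one extra
// API call to recover details such as the head SHA.
func listLabeledPRs(ctx context.Context, client *github.Client, owner, repo string, labels []string) ([]*github.PullRequest, error) {
	query := fmt.Sprintf("repo:%s/%s is:pr is:open", owner, repo)
	for _, label := range labels {
		query += fmt.Sprintf(" label:\"%s\"", label)
	}
	opts := &github.SearchOptions{ListOptions: github.ListOptions{PerPage: 100}}
	var prs []*github.PullRequest
	for {
		result, resp, err := client.Search.Issues(ctx, query, opts)
		if err != nil {
			return nil, err
		}
		for _, issue := range result.Issues {
			// One additional round trip per matching PR to get full details.
			pr, _, err := client.PullRequests.Get(ctx, owner, repo, issue.GetNumber())
			if err != nil {
				return nil, err
			}
			prs = append(prs, pr)
		}
		if resp.NextPage == 0 {
			break
		}
		opts.Page = resp.NextPage
	}
	return prs, nil
}
```

With label filtering done server-side the search itself is cheap, but each matching PR still costs one extra round trip, which is why this only pays off when the set of matching PRs is small.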
I tested the GraphQL API in place of the REST API to list pull requests, and it appears to be significantly faster with the following query using githubv4:

```go
var query struct {
	Search struct {
		Nodes []struct {
			PullRequest struct {
				Number      githubv4.Int
				Title       githubv4.String
				HeadRefName githubv4.String
				BaseRefName githubv4.String
				HeadRefOid  githubv4.String
				Labels      struct {
					Nodes []githubLabel
				} `graphql:"labels(first: 10)"`
				Author struct {
					Login githubv4.String
				}
			} `graphql:"... on PullRequest"`
		}
		PageInfo struct {
			EndCursor   githubv4.String
			HasNextPage bool
		}
	} `graphql:"search(query: $query, type: ISSUE, first: 100, after: $after)"`
}

queryString := fmt.Sprintf("repo:%s/%s is:pr is:open", g.owner, g.repo)
for _, label := range g.labels {
	queryString += fmt.Sprintf(" label:\"%s\"", label)
}
```

I ran this against our largest repo, which has over 4000 open PRs; with the GraphQL search query on my branch, listing pull requests was significantly faster than with the existing REST implementation.
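For completeness, a minimal sketch of how such a search query can be executed and paginated with shurcooL/githubv4. This is not the actual branch code: the function name and parameters are illustrative, the query struct is trimmed to a few fields, and githubLabel is a placeholder type.

```go
package example

import (
	"context"
	"fmt"

	"github.com/shurcooL/githubv4"
	"golang.org/x/oauth2"
)

// githubLabel is a placeholder for whatever label fields the generator needs.
type githubLabel struct {
	Name githubv4.String
}

// searchOpenPRs runs the label-filtered search and walks every page.
func searchOpenPRs(ctx context.Context, token, owner, repo string, labels []string) error {
	src := oauth2.StaticTokenSource(&oauth2.Token{AccessToken: token})
	client := githubv4.NewClient(oauth2.NewClient(ctx, src))

	var query struct {
		Search struct {
			Nodes []struct {
				PullRequest struct {
					Number     githubv4.Int
					HeadRefOid githubv4.String
					Labels     struct {
						Nodes []githubLabel
					} `graphql:"labels(first: 10)"`
				} `graphql:"... on PullRequest"`
			}
			PageInfo struct {
				EndCursor   githubv4.String
				HasNextPage bool
			}
		} `graphql:"search(query: $query, type: ISSUE, first: 100, after: $after)"`
	}

	// Same search string as above; labels are filtered server-side.
	queryString := fmt.Sprintf("repo:%s/%s is:pr is:open", owner, repo)
	for _, label := range labels {
		queryString += fmt.Sprintf(" label:\"%s\"", label)
	}

	variables := map[string]interface{}{
		"query": githubv4.String(queryString),
		"after": (*githubv4.String)(nil), // nil cursor fetches the first page
	}

	for {
		if err := client.Query(ctx, &query, variables); err != nil {
			return err
		}
		for _, node := range query.Search.Nodes {
			fmt.Println(node.PullRequest.Number, node.PullRequest.HeadRefOid)
		}
		if !query.Search.PageInfo.HasNextPage {
			break
		}
		variables["after"] = githubv4.NewString(query.Search.PageInfo.EndCursor)
	}
	return nil
}
```

The `$query` and `$after` variables referenced in the graphql struct tags are supplied through the variables map, and the nil `after` cursor fetches the first page.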
I have a PR I can put up with this change, and I am happy to test it internally at my company to validate it.
Summary
The page size for listing pull requests from GitHub is hardcoded, which makes refreshing ApplicationSets that list pull requests from large repos very slow when there are thousands of open PRs.
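For context, a rough sketch of the kind of paginated REST listing this refers to. This is not the actual GithubService code; it reuses the go-github imports from the earlier sketch, and the page size of 100 is inferred from the 40-round-trip figure below.

```go
// listOpenPRs sketches the paginated REST listing the summary refers to:
// with a fixed page size of 100, a repository with 4000 open PRs needs
// roughly 40 sequential round trips before the generator can do anything.
func listOpenPRs(ctx context.Context, client *github.Client, owner, repo string) ([]*github.PullRequest, error) {
	opts := &github.PullRequestListOptions{
		State:       "open",
		ListOptions: github.ListOptions{PerPage: 100}, // fixed page size
	}
	var all []*github.PullRequest
	for {
		prs, resp, err := client.PullRequests.List(ctx, owner, repo, opts)
		if err != nil {
			return nil, err
		}
		all = append(all, prs...)
		if resp.NextPage == 0 {
			break
		}
		opts.Page = resp.NextPage
	}
	return all, nil
}
```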
Motivation
We use the Argo CD pull request generator at my company to deploy preview environments for GitHub PRs, which we then use for integration tests. The repository we use is very large and has thousands of pull requests open from many different teams at any given time.
For example, a repository with large pull request volume might have over 4000 open PRs at any one time. At the hardcoded page size of 100, this requires 40 round trips to list all pull requests, which is noticeably slow for the developer waiting for their preview environment.
Proposal
Replace the REST API in the pull request GithubService with the GraphQL API in order to filter pull requests by label on the server side. This is significantly faster and requires fewer round trips, making the pull request generator usable for large repos with high pull request volume.