Add integration testsuite #1678
base: master
Conversation
This is a little incomplete as I wanted to get some feedback to see if this is something you would be willing to consider accepting. This only includes a few tests, so it doesn't cover very much. Some questions to consider:
I think the high-level feedback is that the overall direction is good, but I think we need to improve how the JSON files are kept and generated. One thought I had is that it seems like we ought to be able to support something like `--bless` to generate them automatically? Doing so on CI (with perhaps a tarball artifact produced, or some other easy-ish way of using results locally) would make iterating a little better.
Otherwise I'm worried that the bar for adding tests is pretty high if you need to figure out `gh api` and such.
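For illustration, here is a minimal sketch of what a `--bless`-style flow could look like, assuming an environment variable is used as the trigger; the variable name and helper function are hypothetical, not part of this PR:

```rust
use std::fs;
use std::path::Path;

/// Compare an actual JSON response against a stored fixture, regenerating
/// the fixture instead when blessing is requested.
/// `TRIAGEBOT_TEST_BLESS` is a hypothetical name used for illustration.
fn check_json_fixture(actual: &str, expected_path: &Path) {
    if std::env::var_os("TRIAGEBOT_TEST_BLESS").is_some() {
        // Regenerate the checked-in fixture from the live response.
        fs::write(expected_path, actual).unwrap();
        return;
    }
    let expected = fs::read_to_string(expected_path).unwrap();
    assert_eq!(actual, expected, "fixture out of date; rerun with blessing enabled");
}
```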
README.md
> --url=http://127.0.0.1:8000/github-hook --secret somelongsekrit
> ```
>
> Where the value in `--secret` is the secret value you place in `GITHUB_WEBHOOK_SECRET` described below, and `--repo` is the repo you want to test against.
Do you know what permissions are necessary for this? It seems like it might be a convenient way to test against actual GitHub from CI (too), but it's not clear whether the permissions model really makes that easy.
For CI at least we could probably set up our own forwarding if needed though.
IIRC, the default token generated by `gh auth login` will have full permissions. `gh webhook` just needs the ability to alter webhook settings on the repo, which that token should have. At least in my testing, it "just worked" without me needing to do anything special (other than signing up for the beta).
I'm not sure it would really be feasible to set it up on CI. I think it would need to start with creating a bot account, manually creating a token scoped to a single "test" repo, and adding that token to the workflow secrets. But then PR authors would have full write access to that repo, which could be dangerous.
I think it would be challenging to do in a secure way, but I haven't thought about it much. I don't know when it will be out of beta, either.
tests/common/mod.rs
"PUT" => Method::PUT, | ||
"DELETE" => Method::DELETE, | ||
"PATCH" => Method::PATCH, | ||
_ => panic!("unexpected HTTP method {s}"), |
FWIW it seems plausible that https://docs.rs/http/latest/http/method/struct.Method.html might work instead of this (and some other manually defined structs here). Though historically I've not been super happy with the http crate API for this kind of thing... so maybe hand rolling is better.
The reason I didn't use it is because:
- I wanted an enum (sketched below) to make it easy to `use Method::*`; `http` uses associated consts.
- I also wanted it to impl `Copy` to make it easy to use.
- I had a custom STOP verb, which is not standard, and the `http` Method extensions weren't easy to use. I could easily switch that to use a different method, though.
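A rough sketch of that hand-rolled shape, based on the points above (the PR's actual enum may differ):

```rust
/// Hand-rolled HTTP method enum: `Copy`, glob-importable variants, and a
/// custom non-standard STOP verb used to tell the test server to shut down.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Method {
    GET,
    POST,
    PUT,
    DELETE,
    PATCH,
    /// Custom verb, not a standard HTTP method.
    STOP,
}

impl Method {
    fn from_str(s: &str) -> Method {
        use Method::*;
        match s {
            "GET" => GET,
            "POST" => POST,
            "PUT" => PUT,
            "DELETE" => DELETE,
            "PATCH" => PATCH,
            "STOP" => STOP,
            _ => panic!("unexpected HTTP method {s}"),
        }
    }
}
```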
I understand that it might be nicer to not use a custom http server, but I suspect adding anything else would have a pretty large build time and complexity hit.
I could trim a few things down, like removing custom header support. It isn't being used here, and I'm not sure it will ever be needed (I copied this from Cargo which does use it). However, it only adds a few lines so I figured it didn't hurt to keep it.
```
@@ -0,0 +1,543 @@
//! `GithubClient` tests.
```
This seems like an improvement over the status quo, but I am worried about the amount of JSON we're checking in here.
I have a few thoughts on alternatives:
- Store the JSON elsewhere (at minimum: a separate directory would help to avoid losing track of the .rs files).
  - S3 seems like a possible candidate, but is annoying to update, etc.
- Don't try to make it possible to run this offline (basically: require CI or a GITHUB_TOKEN -- and "just" query the GitHub API).
  - The dynamic nature of replies seems like it makes this quite annoying to do well.

Ultimately I'm not opposed to just checking in a bunch of JSON, but I've always felt a little icky about that kind of thing. I'm not sure there are good alternatives, though.
(I guess the "big" option is getting rid of our own client and using octocrab... which IIRC might be auto-generated, at least in part? But that doesn't really solve the problem, just shifts it elsewhere.)
Yea, the JSON files are a mess. I'll look into organizing it. I think I would prefer to avoid keeping them external, since I think that would make it more difficult to manage.
I'm not sure how feasible it is to make it run online (particularly on CI). Some of the CI complications are authentication and handling concurrent jobs. I also wanted to ensure this is as easy as possible for contributors to run.
I've been thinking about this off and on, and I'm feeling like the server tests are going to be a pain to maintain. They aren't too difficult to create, but if someone modifies some behavior in triagebot that perturbs them in any way, they are going to need to figure out how to recreate the tests, which I think is just going to be too painful.
I think I'm going to need to figure out a different strategy.
If I dropped the server tests from this PR, and left just the GithubClient and database tests, would you be more willing to merge it? I think those should be much more maintainable, and at least provide some increase in testing.
In general, I'm happy to experiment -- the main "blocker" for this PR has (from my perspective) been that you feel happy with it. I would like to understand the tests as well, and feel OK updating them, but the main thing is that you're comfortable, since we can always just not run them / delete them.
So reducing scope seems fine.
tests/server_test/mod.rs
```rust
//! At the end of the test, you should call `ctx.events.assert_eq()` to
//! validate that the correct HTTP actions were actually performed by
//! triagebot. If you are uncertain about what to put in there, just start
//! with an empty list, and the error will tell you what to add.
```
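To make that workflow concrete, here is a minimal, self-contained stand-in for the kind of assertion `ctx.events.assert_eq()` performs; `RecordedEvents` and the string form of the events are assumptions for illustration, not the PR's actual types:

```rust
/// Stand-in for the recorded-actions list a test context might hold.
struct RecordedEvents {
    /// e.g. "PATCH /repos/rust-lang/rust/issues/123" (format is assumed).
    events: Vec<String>,
}

impl RecordedEvents {
    /// Compare recorded HTTP actions against the expected list. On
    /// mismatch, the panic output lists what actually happened, so a new
    /// test can start from `&[]` and copy in the reported actions.
    fn assert_eq(&self, expected: &[&str]) {
        let actual: Vec<&str> = self.events.iter().map(|s| s.as_str()).collect();
        assert_eq!(actual, expected, "recorded HTTP actions did not match");
    }
}
```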
In broad strokes this seems great -- I really like the assertions -- but I am (similarly to the github client) worried about maintaining and updating the JSON files.
Do you have a sense of how hard (time/complexity) adding new tests will be? Would it be reasonable to expect drive-by contributors to figure out the test framework, or will maintainers (e.g., me, you, etc.) need to write tests? Either seems workable, but it'll help gauge the level of complexity we can afford.
I agree it's a pain -- most triagebot features don't actually rely on the database, so I wonder if we could stub it out. Separately I'd personally expect that having an sqlite backend wouldn't actually be that hard (perf.rlo does something like this), but in production Postgres is much easier to manage (mostly because we can have an "instance" rather than worrying about the files, and can login to fix the database up concurrently with the service).
I've used `cargo insta` for recording and comparing JSON replies from API endpoints; it's quite useful and easy to use. Could it be an alternate way to handle these JSON files?
The GithubClient tests were relatively easy. Each one took just a few minutes to write. The server tests were quite a bit more challenging, and took maybe 10 minutes to write each one (and that's for an easy one). So I do fear it might be difficult for people unfamiliar with it to contribute easily.
I'll look into that, or maybe something like sqlite. I was reluctant to bring in another dependency. I would like to avoid some of the complexity here.
I'll look into a more automated way to record the behavior, possibly using something like insta.
`TRIAGEBOT_TEST_RECORD` can be used to record webhook events to disk. `TRIAGEBOT_TEST_DISABLE_JOBS` can be used to disable automatic jobs.
```rust
.configure(self)
.build()
.unwrap(),
)
.await?;
rate_resp.error_for_status_ref()?;
```
Note: this call to `error_for_status_ref()` is a small unrelated fix to fail early if the `/rate_limit` URL fails. Otherwise this was ignoring the status code, which can cause a confusing error.
OK, I pushed a new version that takes a different approach of just using recorded playbacks. It is substantially easier to write tests, though I'm a little concerned about the possible fragility. The PR is a little large, so let me recap the changes:
I wanted to do a snapshot at this point and see what you think of this new approach. I'm not sure how reliable these tests are going to be. If in the long run they end up being too fussy, then I think they can be removed. However, I should be around in case there are issues or people need help.
@ehuss left a few comments; I think nothing but nits.
I agree with @Mark-Simulacrum that it's worth investigating using a sqlite DB for testing.
src/main.rs
```rust
if env::var_os("TRIAGEBOT_TEST_DISABLE_JOBS").is_some() {
    return;
}
```
The purpose of `TRIAGEBOT_TEST_DISABLE_JOBS` is explained in `./tests/server_test/mod.rs`. Perhaps also write the same bit of documentation here? I would also add a short note in the README.md.

Also, would it change anything if the env var check stayed outside of the task spawning block (since tasks are disabled entirely when running tests)? Would that spare some cycles or change anything in the compiled code?
I shuffled the spawning code into separate functions to hopefully make the structure a little clearer (the `run_server` function was starting to get a little long in my opinion).

I'd rather not mention `TRIAGEBOT_TEST_DISABLE_JOBS` in README.md, since users should never need to know about it specifically. I did add some documentation to `server_test/mod.rs` discussing scheduled jobs.
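A hypothetical sketch of that shape, with the guard at the top of a dedicated spawning function; the function name is an assumption for illustration:

```rust
// Job spawning moved out of `run_server` into its own function, with the
// test guard at the top so scheduled jobs are disabled entirely under
// the test harness.
fn spawn_jobs() {
    if std::env::var_os("TRIAGEBOT_TEST_DISABLE_JOBS").is_some() {
        return;
    }
    // ... spawn the scheduled-job tasks here ...
}
```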
src/test_record.rs
```rust
///
/// Returns `None` if recording is disabled.
fn record_dir() -> Option<PathBuf> {
    let Some(test_record) = std::env::var_os("TRIAGEBOT_TEST_RECORD") else { return None };
```
It seems that `record_dir()` is only checked in the initial `main()`. Could checking for `TRIAGEBOT_TEST_RECORD` be moved there instead of checking it in `init()` and `record_event()`? Would that help make it a bit more visible?
Also, I would mention it in the README.md (for increased awareness when someone wants to run tests).
I'm not entirely clear on this suggestion. Each time an event comes in, it needs to record it to disk, and in order to know where to record it, it needs to read `TRIAGEBOT_TEST_RECORD`. It can't be checked in one place, unless this is perhaps suggesting storing the value somewhere and reading that directly? It seems like that would make it a bit more difficult to use, since there would need to be some global state tracked somewhere.
I added a brief mention of recording in README.md.
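For illustration, a minimal version of a `record_dir` that re-reads the variable on each call could look like this (the PR's actual helper may do more, e.g. validate the directory):

```rust
use std::path::PathBuf;

/// Directory to record webhook events into, or `None` if recording is
/// disabled. Reading the env var on each call avoids any global state.
fn record_dir() -> Option<PathBuf> {
    let test_record = std::env::var_os("TRIAGEBOT_TEST_RECORD")?;
    Some(PathBuf::from(test_record))
}
```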
tests/server_test/mod.rs
```rust
// TODO: This is a poor way to choose a TCP port, as it could already
// be in use by something else.
let triagebot_port = NEXT_TCP_PORT.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
```
Does it test whether the port is already taken?
Another approach could be to do what Wiremock does: allocate a pool of free ports at startup.
I'm not sure I'm clear on the suggestion about setting up a pool. Each triagebot process is independent, and wouldn't be able to easily "take" a port from a pool (sending file descriptors over sockets exists, but not something I would want to bother with).
I'm not too worried about using up a lot of ports. I would only expect for there to ever be a few dozen tests, and there should be over 15,000 ports available starting at 50,000.
I added some code to pick a free port, though it may not be very reliable.
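For reference, one common way to have the OS pick a likely-free port is to bind to port 0 and read back the assigned port. A minimal sketch (note there is still an inherent race between dropping the listener and triagebot re-binding the port):

```rust
use std::net::TcpListener;

/// Ask the OS for a free port by binding to port 0, then release it.
/// There is still a race window before the port is bound again.
fn pick_free_port() -> u16 {
    let listener = TcpListener::bind("127.0.0.1:0").expect("failed to bind");
    listener.local_addr().expect("no local addr").port()
}
```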
```
@@ -4,7 +4,8 @@ use rust_team_data::v1::{Teams, ZulipMapping, BASE_URL};
use serde::de::DeserializeOwned;

async fn by_url<T: DeserializeOwned>(client: &GithubClient, path: &str) -> anyhow::Result<T> {
    let url = format!("{}{}", BASE_URL, path);
    let base = std::env::var("TEAMS_API_URL").unwrap_or(BASE_URL.to_string());
```
Mention `TEAMS_API_URL` in the README with its purpose and a suggested default value?
Can you say what should be mentioned? These environment variables (`GITHUB_API_URL`, `GITHUB_RAW_URL`, etc.) are intended only for testing, and should only ever be set automatically. Users shouldn't ever need to know about them.
Yes, documenting for developers the env vars needed to run the triagebot (new one: `TEAMS_API_URL`). I only half agree, but these are probably just unimportant nits, I guess :-)
This helps keep the `run_server` function a little smaller, and makes the disabling code a little clearer. This also adds a check for whether test recording is enabled, to help avoid jobs interfering with recording.
I'd be happy to investigate adding sqlite support. What do you think about using something like sqlx?
I am not familiar with how rustc-perf handles sqlite DB connections, but in my experience …
Oh yea, I strongly agree with keeping deps at a minimum. I haven't used sqlx, and I was mostly curious about it, since it supports async and handles pooling and migrations among other things (and seems well maintained and often recommended). Copying from rustc-perf means pulling in the pooling code and the migration code, and needing to write every DB query twice. It's not huge to copy in, so I can take a crack at it to see how difficult it is.
I have no direct objections to sqlx, but when I wrote the rustc-perf logic, I believe they encouraged having a database around for schema checking or similar at build time, which I really didn't want. It may have also been the case that they had a stale version of rusqlite in dependencies and we needed something newer? In part, I also believe I ended up feeling like the pooling and other abstraction layer was less work than learning sqlx and then writing some amount of abstraction atop it (since there were still incompatibilities between the SQLite and Postgres backends, so raw sqlx wasn't enough). It's quite possible these problems have been solved since then.
Not all API responses are JSON, which made it awkward to differentiate which kinds of responses are JSON. This unifies the two Activity variants so that there is only one, which simplifies things a bit. Non-JSON responses are stored as a JSON string, under the assumption that GitHub never responds with a bare JSON string.
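A sketch of the storage rule this commit message describes, assuming `serde_json`; the function name is illustrative:

```rust
use serde_json::Value;

/// Store a response body: JSON bodies are kept as parsed JSON, anything
/// else is wrapped in a JSON string. This relies on GitHub never
/// responding with a bare JSON string, so a string value can be read
/// back unambiguously as a raw (non-JSON) body.
fn store_response(body: &[u8]) -> Value {
    serde_json::from_slice(body)
        .unwrap_or_else(|_| Value::String(String::from_utf8_lossy(body).into_owned()))
}
```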
I have pushed an update that adds sqlite support. I essentially copied the code from rustc-perf. The way sqlite handles concurrency and locking is a bit different from Postgres, but for testing it should be fine.
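For context, rustc-perf selects its backend by inspecting the connection string. A simplified, illustrative sketch of that kind of dispatch (the types are assumptions, not this PR's code):

```rust
/// Simplified illustration of choosing a database backend from the
/// connection string, similar in spirit to rustc-perf's approach.
enum Pool {
    Sqlite { path: String },
    Postgres { url: String },
}

fn open_pool(db: &str) -> Pool {
    if db.starts_with("postgres://") || db.starts_with("postgresql://") {
        Pool::Postgres { url: db.to_string() }
    } else {
        // Anything else is treated as a path to a SQLite database file.
        Pool::Sqlite { path: db.to_string() }
    }
}
```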
This can help avoid accidentally deleting something that hasn't been checked in yet.
Apparently some environments roundtrip datetimes in Postgres with different precision.
This reverts commit 5254631.
This adds an integration testsuite for testing triagebot's behavior. There are two parts of the testsuite:

- `github_client` tests the `GithubClient` by sending requests to a local server and checking the response.
- `server_test` actually launches the `triagebot` executable, sets up some servers, launches PostgreSQL, injects webhooks, and checks for the response.

An overview of the changes here:

- `GithubClient` was simplified a little since it was always used in the same way.

The comments and docs should give some overview of how things are set up and how to write and run tests.