-
Notifications
You must be signed in to change notification settings - Fork 152
Backup coding challenge fix #313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Backup coding challenge fix #313
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi thanks for the effort put into this. 🙌
please have a look the comments posted :)
case NoOp: | ||
// logger.Printf("Progress for team '%s' is in sync", team) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented that out because I thought it might get too noisy in the logs. Is it okay to leave it commented out, or should I remove the line entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave it in.
Haven't found it too noisy yet.
And yes, if you think it would be too noisy the better thing to do would be to delete the line.
Commented out code is really not great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let it be then
Will make it a line of code then
// Refetch after applying to ensure persistence uses the newly set state | ||
// Although, ideally Apply should make the instance match the last known state, | ||
// re-fetching adds a layer of verification before potentially overwriting annotations. | ||
// However, this adds latency and complexity. Let's trust the apply works for now | ||
// and persist the state we *intended* to apply. | ||
// If issues arise, re-fetch here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment. nothing is refetched here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the confusing comment! You are correct, the code currently doesn't refetch after applying. It persists the lastChallengeProgress, lastFindItCode, and lastFixItCode (the state we intended to apply) immediately after the apply* calls. The comment was kinda reflecting on a possible alternative (refetching) but doesn't match the implementation. I'll rephrase the comment to be accurate.
|
||
if errStandard != nil { | ||
logger.Println(fmt.Errorf("failed to fetch current Standard Challenge Progress for team '%s': %w", team, errStandard)) | ||
// We can skip this cycle too |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why "too" this is the first skip
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove it. Sorry
@@ -85,24 +95,37 @@ func createProgressUpdateJobs(progressUpdateJobs chan<- ProgressUpdateJobs, clie | |||
} | |||
juiceShops, err := clientset.AppsV1().Deployments(namespace).List(context.TODO(), opts) | |||
if err != nil { | |||
panic(err.Error()) | |||
// Log error instead of panicking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the panic here is intentional.
if it panics the progress watch dog is restarted.
if we just log it might get stuck in non recoverable state forever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it because I (ok not me but chatgpt/blogs which claim they teach you best practices lol) thought logging errors was preferred over panicking, but I understand the reasoning for wanting a restart if listing deployments (a core function) fails. I'll revert that change back to using panic.
AnnotationContinueCodeFindIt = "multi-juicer.owasp-juice.shop/continueCodeFindIt" | ||
AnnotationContinueCodeFixIt = "multi-juicer.owasp-juice.shop/continueCodeFixIt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally i don't really like that we are now persisting the coding challenge state differently than the hacking challenge state. would be way nicer if these two would also be array like the challenges annotation.
espeically if we want to use them later on in the MJ score board, this would make this hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented it this way primarily to address the backup/restore requirement using the existing continueCodeFindIt and continueCodeFixIt mechanisms that Juice Shop seems to use internally for state restoration, similar to the standard continueCode.
Storing the detailed progress for coding challenges (like which specific snippets are solved) would indeed be better long-term but seems more complex. It would likely require understanding Juice Shop's internal representation and potentially new API endpoints to expose that structured data, rather than just fetching/applying the opaque restore codes.
For the immediate goal of preventing progress loss on restart, storing the codes seemed like the most direct approach based on the current Juice Shop capabilities I could find.
I'll take my time, explore and then go with the preferred direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing is, doing it the same way as the hacking challenges would be a lot simpler than this is now though...
It would completely eliminate the need to do separate requests to the JuiceShop instances for either continueCode, the coding challenge status is already part of the response that JuiceShop sends, it's just not part of the golang type.
So instead of sending 3 requests per sync instance it would just be one request which response is then taken to fill the hacking, findit/fixit coding challenge arrays.
The continue codes use the same algorithm which is already present in the progress-watchdog.
Also if we do it this way now, we'll have a migration to deal with soon so then migrate the backed up continue codes the the proper format....
LastChallengeProgress []ChallengeStatus | ||
Team string | ||
LastChallengeProgress []ChallengeStatus | ||
LastContinueCodeFindIt string // Add FindIt code from annotation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of places in this PR have these code comments which only sense in context of a PR.
If somebody looks at this in a year they will be confused why this says "// Add FindIt code from annotation"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right. Those comments were just notes during development to track the changes. I'll clean them up and remove them before marking the PR as ready.
} | ||
|
||
// Generic function to apply a continue code via PUT request | ||
func applyCodeToEndpoint(url string, codeType string, team string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code style: would prefer if this would take in the team and the continue code here instead of the url.
the codeType should then also be used to generate the url and the format of the values should match the one from codeType in getContinueCode. ideally both would use a string enum / the clunky equivalent in go (https://www.sohamkamani.com/golang/enums/)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense for better encapsulation and consistency. I'll refactor applyCodeToEndpoint to take team, codeType, and code as arguments and generate the URL internally based on the codeType. I'll define constants (e.g., CodeTypeStandard) for the codeType parameter to improve readability and maintainability, similar to an enum approach.
} | ||
code, ok := response["continueCode"] | ||
if !ok { | ||
return "", fmt.Errorf("'continueCode' field not found in %s response", codeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a extremly common case. any team which haven't used coding challenges will get two of the log lines every minute.
failed to fetch current FindIt Code for team 'bar': 'continueCode' field not found in findIt response
failed to fetch current FixIt Code for team 'bar': 'continueCode' field not found in fixIt response
this might be somewhat confusing to people having a look at the logs, as there is no actual error or failure here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about the log spam for cases where coding challenges haven't been started. I agree that a 404 or missing continueCode field in the response likely just means "no progress yet" and isn't a true error.
I'll modify getContinueCode to specifically handle http.StatusNotFound by returning "", nil (empty code, no error). I'll also adjust the JSON decoding part to handle the case where the continueCode key might be missing in a 200 OK response (if that's possible) and return "", nil in that scenario too, removing the error logs for these expected "no progress" cases.
} | ||
} | ||
} | ||
|
||
// Helper to compare fetched code vs stored code, considering fetch errors | ||
func compareCodes(currentCode, lastCode string, fetchErr error) UpdateState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice if we could add tests for these very easily testable functions.
the test coverage for the progress watchdog right now isn't great but it would be good if it could at least slightly improve / remain stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I will add unit tests for the compareCodes function to cover the different scenarios (NoOp, ApplyCode, UpdateCache, handling fetch errors).
currentFixItCode, errFixIt := getCurrentFixItCode(team) | ||
|
||
if errStandard != nil { | ||
logger.Println(fmt.Errorf("failed to fetch current Standard Challenge Progress for team '%s': %w", team, errStandard)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicky. can you try to match the log style? all other log lines in the progress-watchdog start with a uppercase letter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, I'll update the log message capitalization to match the existing style (e.g., "Failed to fetch..." instead of "failed to fetch...").
Fixes issue #95
@J12934 I did some changes but I low key feel they are not enough. I am getting the findit/fixit codes here now but was that it?
BTW, I also changed some code which panicked to error logs as I heard it was the best practice (also, removed the bytes and replaced it with relevant substitutions (have a look in code changes)
Please suggest if I should change something else or revert stuff
Also, here's a full annotation copy paste. Have a look and see if there's something wrong.