Backup coding challenge fix #313

Draft: wants to merge 3 commits into base: main

AyushRajSinghParihar
Contributor

@AyushRajSinghParihar AyushRajSinghParihar commented Apr 21, 2025

Fixes issue #95

@J12934 I made some changes, but I low-key feel they are not enough. I am getting the FindIt/FixIt codes here now, but was that all that was needed?

BTW, I also changed some code that panicked to log errors instead, as I heard that was best practice. I also removed the bytes usage and replaced it with relevant substitutions (have a look at the code changes).

Please suggest if I should change something else or revert anything.

Also, here's a full copy-paste of the deployment, including its annotations. Have a look and see if anything looks wrong.


tenxcoder@100xmachine:~/Desktop/GSOC/multi-juicer$ kubectl get deployment juiceshop-testteam -n default -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    multi-juicer.owasp-juice.shop/challenges: '[{"key":"scoreBoardChallenge","solvedAt":"2025-04-21T19:13:09.813Z"}]'
    multi-juicer.owasp-juice.shop/challengesSolved: "1"
    multi-juicer.owasp-juice.shop/continueCodeFindIt: GDoMmq2QJVNkm9wKRp0WMBaGzD3nAj3VZY5Ee7OqjLPxdvXlyo61gbrdJQO6
    multi-juicer.owasp-juice.shop/continueCodeFixIt: 8bnEDMDmJKYlw6RZ50zLb92nN3gQawG2aGejO4WBpdXPqrM7kvE8yoVkPqjY
    multi-juicer.owasp-juice.shop/lastRequest: "1745262789788"
    multi-juicer.owasp-juice.shop/lastRequestReadable: 2025-04-21 19:13:09.788696042
      +0000 UTC m=+156.736885276
    multi-juicer.owasp-juice.shop/passcode: $2a$10$V1QVexSDqph6COBHhLwCIem5AsOrknvWoOYc/xUgWGw4Akzgm/5o.
  creationTimestamp: "2025-04-21T19:11:22Z"
  generation: 8
  labels:
    app.kubernetes.io/component: vulnerable-app
    app.kubernetes.io/instance: juice-shop-testteam
    app.kubernetes.io/name: juice-shop
    app.kubernetes.io/part-of: multi-juicer
    app.kubernetes.io/version: v17.2.0
    team: testteam
  name: juiceshop-testteam
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: balancer
    uid: bbb8b85a-5000-4a88-a7c9-b36647939f49
  resourceVersion: "864"
  uid: b567494b-1b2d-4ffc-af5c-9e6b34bac221
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/name: juice-shop
      app.kubernetes.io/part-of: multi-juicer
      team: testteam
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/name: juice-shop
        app.kubernetes.io/part-of: multi-juicer
        app.kubernetes.io/version: v17.2.0
        team: testteam
    spec:
      affinity: {}
      containers:
      - env:
        - name: NODE_ENV
          value: multi-juicer
        - name: CTF_KEY
          value: [email protected]!9uR_K!NfkkTr
        - name: SOLUTIONS_WEBHOOK
          value: http://progress-watchdog.default.svc/team/testteam/webhook
        image: bkimminich/juice-shop:v17.2.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /rest/admin/application-version
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 1
        name: juice-shop
        ports:
        - containerPort: 3000
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /rest/admin/application-version
            port: 3000
            scheme: HTTP
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 150m
            memory: 300Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        startupProbe:
          failureThreshold: 150
          httpGet:
            path: /rest/admin/application-version
            port: 3000
            scheme: HTTP
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /juice-shop/config/multi-juicer.yaml
          name: juice-shop-config
          readOnly: true
          subPath: multi-juicer.yaml
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsNonRoot: true
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: juice-shop-config
        name: juice-shop-config
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2025-04-21T19:12:28Z"
    lastUpdateTime: "2025-04-21T19:12:28Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2025-04-21T19:11:23Z"
    lastUpdateTime: "2025-04-21T19:12:28Z"
    message: ReplicaSet "juiceshop-testteam-865cc69587" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 8
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1


Member

@J12934 J12934 left a comment

Hi, thanks for the effort put into this. 🙌

Please have a look at the comments posted :)

case NoOp:
// logger.Printf("Progress for team '%s' is in sync", team)

Member

?

Contributor Author

I commented that out because I thought it might get too noisy in the logs. Is it okay to leave it commented out, or should I remove the line entirely?

Member

I would leave it in.
I haven't found it too noisy yet.

And yes, if you think it would be too noisy, the better thing to do would be to delete the line.
Commented-out code is really not great.

Contributor Author

Let it be, then.
I'll restore it as an active line of code.

Comment on lines +184 to +189
// Refetch after applying to ensure persistence uses the newly set state
// Although, ideally Apply should make the instance match the last known state,
// re-fetching adds a layer of verification before potentially overwriting annotations.
// However, this adds latency and complexity. Let's trust the apply works for now
// and persist the state we *intended* to apply.
// If issues arise, re-fetch here.

Member

I don't understand this comment. Nothing is refetched here.

Contributor Author

Apologies for the confusing comment! You are correct, the code currently doesn't refetch after applying. It persists the lastChallengeProgress, lastFindItCode, and lastFixItCode (the state we intended to apply) immediately after the apply* calls. The comment was kinda reflecting on a possible alternative (refetching) but doesn't match the implementation. I'll rephrase the comment to be accurate.


if errStandard != nil {
logger.Println(fmt.Errorf("failed to fetch current Standard Challenge Progress for team '%s': %w", team, errStandard))
// We can skip this cycle too

Member

Why "too"? This is the first skip.

Contributor Author

I'll remove it. Sorry

@@ -85,24 +95,37 @@ func createProgressUpdateJobs(progressUpdateJobs chan<- ProgressUpdateJobs, clie
}
juiceShops, err := clientset.AppsV1().Deployments(namespace).List(context.TODO(), opts)
if err != nil {
panic(err.Error())
// Log error instead of panicking

Member

The panic here is intentional.
If it panics, the progress-watchdog is restarted.

If we just log the error, it might get stuck in a non-recoverable state forever.

Contributor Author

I changed it because I (ok, not me, but ChatGPT and blogs which claim to teach you best practices, lol) thought logging errors was preferred over panicking, but I understand the reasoning for wanting a restart if listing deployments (a core function) fails. I'll revert that change back to using panic.

Comment on lines +23 to +24
AnnotationContinueCodeFindIt = "multi-juicer.owasp-juice.shop/continueCodeFindIt"
AnnotationContinueCodeFixIt = "multi-juicer.owasp-juice.shop/continueCodeFixIt"

Member

Generally, I don't really like that we are now persisting the coding challenge state differently from the hacking challenge state. It would be way nicer if these two were also arrays, like the challenges annotation.

Especially if we want to use them later on in the MJ score board, the current format would make that hard.
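
A minimal sketch of what array-style annotations could look like, modelled on the `challenges` annotation in the deployment dump in the PR description. The annotation keys `findItChallenges` and `fixItChallenges` and the helper name are hypothetical; the PR does not define them, and the `ChallengeStatus` fields are inferred from the annotation's JSON rather than copied from the progress-watchdog source.

```go
package main

import "encoding/json"

// ChallengeStatus mirrors one entry of the existing "challenges" annotation
// (fields inferred from the annotation value, not from the real Go type).
type ChallengeStatus struct {
	Key      string `json:"key"`
	SolvedAt string `json:"solvedAt"`
}

// Hypothetical annotation keys, used for illustration only.
const (
	AnnotationFindItChallenges = "multi-juicer.owasp-juice.shop/findItChallenges"
	AnnotationFixItChallenges  = "multi-juicer.owasp-juice.shop/fixItChallenges"
)

// codingChallengeAnnotations encodes both lists as JSON arrays, the same
// shape the "challenges" annotation already uses.
func codingChallengeAnnotations(findIt, fixIt []ChallengeStatus) (map[string]string, error) {
	annotations := map[string]string{}
	for key, statuses := range map[string][]ChallengeStatus{
		AnnotationFindItChallenges: findIt,
		AnnotationFixItChallenges:  fixIt,
	} {
		if statuses == nil {
			// Encode "no progress yet" as [] rather than null.
			statuses = []ChallengeStatus{}
		}
		encoded, err := json.Marshal(statuses)
		if err != nil {
			return nil, err
		}
		annotations[key] = string(encoded)
	}
	return annotations, nil
}
```

With this shape, a score board could read all three annotations with the same decoding logic.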

Contributor Author

I implemented it this way primarily to address the backup/restore requirement using the existing continueCodeFindIt and continueCodeFixIt mechanisms that Juice Shop seems to use internally for state restoration, similar to the standard continueCode.

Storing the detailed progress for coding challenges (like which specific snippets are solved) would indeed be better long-term but seems more complex. It would likely require understanding Juice Shop's internal representation and potentially new API endpoints to expose that structured data, rather than just fetching/applying the opaque restore codes.

For the immediate goal of preventing progress loss on restart, storing the codes seemed like the most direct approach based on the current Juice Shop capabilities I could find.

I'll take my time, explore and then go with the preferred direction.

Member

The thing is, doing it the same way as the hacking challenges would be a lot simpler than what this is now, though...

It would completely eliminate the need to send separate requests to the JuiceShop instances for either continue code. The coding challenge status is already part of the response that JuiceShop sends; it's just not part of the Go type.

So instead of sending 3 requests per instance on every sync, it would be just one request, whose response is then used to fill the hacking and FindIt/FixIt coding challenge arrays.

The continue codes use the same algorithm, which is already present in the progress-watchdog.

Also, if we do it this way now, we'll soon have a migration to deal with, to migrate the backed-up continue codes to the proper format...
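
The single-request idea might look roughly like the sketch below. Everything about the response shape is an assumption made for illustration: the `data` wrapper, the JSON field names, and the reading of `codingChallengeStatus` (1 meaning FindIt solved, 2 meaning FixIt solved as well) should be checked against the actual JuiceShop response. Using `updatedAt` as the solve time is only an approximation, and `Progress` and `splitProgress` are hypothetical names.

```go
package main

import "encoding/json"

// ChallengeStatus mirrors one entry of the "challenges" annotation.
type ChallengeStatus struct {
	Key      string `json:"key"`
	SolvedAt string `json:"solvedAt"`
}

// juiceShopChallenge lists only the fields this sketch reads.
// Field names and semantics are assumptions, see the note above.
type juiceShopChallenge struct {
	Key                   string `json:"key"`
	Solved                bool   `json:"solved"`
	CodingChallengeStatus int    `json:"codingChallengeStatus"`
	UpdatedAt             string `json:"updatedAt"`
}

// Progress groups the three arrays filled from one response.
type Progress struct {
	Hacking []ChallengeStatus
	FindIt  []ChallengeStatus
	FixIt   []ChallengeStatus
}

// splitProgress decodes one challenge list and sorts each entry into the
// hacking, FindIt and FixIt arrays.
func splitProgress(body []byte) (Progress, error) {
	var response struct {
		Data []juiceShopChallenge `json:"data"`
	}
	if err := json.Unmarshal(body, &response); err != nil {
		return Progress{}, err
	}
	var progress Progress
	for _, c := range response.Data {
		status := ChallengeStatus{Key: c.Key, SolvedAt: c.UpdatedAt}
		if c.Solved {
			progress.Hacking = append(progress.Hacking, status)
		}
		if c.CodingChallengeStatus >= 1 {
			progress.FindIt = append(progress.FindIt, status)
		}
		if c.CodingChallengeStatus >= 2 {
			progress.FixIt = append(progress.FixIt, status)
		}
	}
	return progress, nil
}
```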

LastChallengeProgress []ChallengeStatus
Team string
LastChallengeProgress []ChallengeStatus
LastContinueCodeFindIt string // Add FindIt code from annotation

Member

A lot of places in this PR have code comments like this, which only make sense in the context of a PR.
If somebody looks at this in a year, they will be confused about why it says "// Add FindIt code from annotation".

Contributor Author

You're absolutely right. Those comments were just notes during development to track the changes. I'll clean them up and remove them before marking the PR as ready.

}

// Generic function to apply a continue code via PUT request
func applyCodeToEndpoint(url string, codeType string, team string) error {

Member

Code style: I would prefer it if this took in the team and the continue code instead of the URL.
The codeType should then also be used to generate the URL, and the format of its values should match the codeType in getContinueCode. Ideally both would use a string enum / the clunky equivalent in Go (https://www.sohamkamani.com/golang/enums/).

Contributor Author

That makes sense for better encapsulation and consistency. I'll refactor applyCodeToEndpoint to take team, codeType, and code as arguments and generate the URL internally based on the codeType. I'll define constants (e.g., CodeTypeStandard) for the codeType parameter to improve readability and maintainability, similar to an enum approach.
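
A sketch of how the proposed refactoring could look, using the string-enum pattern from the linked article. The constant names, the `applyCodeURL` helper, the `juiceshop-<team>:3000` host and the REST path segments are all assumptions for illustration and should be checked against the existing progress-watchdog code and the JuiceShop routes.

```go
package main

import (
	"fmt"
	"net/url"
)

// CodeType is a string-based enum naming the kind of continue code.
type CodeType string

// Hypothetical values; "findIt" and "fixIt" follow the wording of the
// log lines quoted in this review.
const (
	CodeTypeStandard CodeType = "standard"
	CodeTypeFindIt   CodeType = "findIt"
	CodeTypeFixIt    CodeType = "fixIt"
)

// pathSegment maps a code type to its (assumed) REST path segment.
func (c CodeType) pathSegment() (string, error) {
	switch c {
	case CodeTypeStandard:
		return "continue-code", nil
	case CodeTypeFindIt:
		return "continue-code-findIt", nil
	case CodeTypeFixIt:
		return "continue-code-fixIt", nil
	}
	return "", fmt.Errorf("unknown code type %q", c)
}

// applyCodeURL builds the URL from team, code type and code, so callers
// never assemble URLs themselves.
func applyCodeURL(team string, codeType CodeType, code string) (string, error) {
	segment, err := codeType.pathSegment()
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("http://juiceshop-%s:3000/rest/%s/apply/%s",
		team, segment, url.PathEscape(code)), nil
}
```

`applyCodeToEndpoint(team, codeType, code)` could then call `applyCodeURL` internally, and `getContinueCode` could reuse the same `CodeType` values.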

}
code, ok := response["continueCode"]
if !ok {
return "", fmt.Errorf("'continueCode' field not found in %s response", codeType)

Member

This is an extremely common case. Any team which hasn't used coding challenges will get these two log lines every minute:

failed to fetch current FindIt Code for team 'bar': 'continueCode' field not found in findIt response
failed to fetch current FixIt Code for team 'bar': 'continueCode' field not found in fixIt response

This might be somewhat confusing to people looking at the logs, as there is no actual error or failure here.

Contributor Author

Good point about the log spam for cases where coding challenges haven't been started. I agree that a 404 or missing continueCode field in the response likely just means "no progress yet" and isn't a true error.

I'll modify getContinueCode to specifically handle http.StatusNotFound by returning "", nil (empty code, no error). I'll also adjust the JSON decoding part to handle the case where the continueCode key might be missing in a 200 OK response (if that's possible) and return "", nil in that scenario too, removing the error logs for these expected "no progress" cases.
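
The handling described above could be sketched like this. It is not the PR's actual code: the signature of `getContinueCode` is simplified to take a client and a URL, and `parseContinueCode` is a hypothetical helper split out so that the "no progress yet" cases can be checked without a running JuiceShop instance.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// parseContinueCode treats a 404 and a missing "continueCode" field as
// "no progress yet" and returns an empty code without an error.
func parseContinueCode(statusCode int, body []byte) (string, error) {
	if statusCode == http.StatusNotFound {
		return "", nil
	}
	if statusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status code %d", statusCode)
	}
	var response struct {
		ContinueCode string `json:"continueCode"`
	}
	if err := json.Unmarshal(body, &response); err != nil {
		return "", fmt.Errorf("failed to decode response: %w", err)
	}
	// A missing field leaves ContinueCode empty, which is not an error.
	return response.ContinueCode, nil
}

// getContinueCode fetches the code and delegates to parseContinueCode.
func getContinueCode(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return parseContinueCode(resp.StatusCode, body)
}
```

Callers can then treat an empty code with a nil error as "nothing to back up" and log only real failures.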

}
}
}

// Helper to compare fetched code vs stored code, considering fetch errors
func compareCodes(currentCode, lastCode string, fetchErr error) UpdateState {

Member

It would be nice if we could add tests for these very easily testable functions.
The test coverage for the progress-watchdog isn't great right now, but it would be good if it could at least slightly improve or remain stable.

Contributor Author

Okay, I will add unit tests for the compareCodes function to cover the different scenarios (NoOp, ApplyCode, UpdateCache, handling fetch errors).

currentFixItCode, errFixIt := getCurrentFixItCode(team)

if errStandard != nil {
logger.Println(fmt.Errorf("failed to fetch current Standard Challenge Progress for team '%s': %w", team, errStandard))

Member

Nitpicky: can you try to match the log style? All other log lines in the progress-watchdog start with an uppercase letter.

Contributor Author

Sure thing, I'll update the log message capitalization to match the existing style (e.g., "Failed to fetch..." instead of "failed to fetch...").
