Add files via upload #1

kashikamahajan · 2025-10-13T20:22:44Z

No description provided.

iross

I think it's in good shape overall! There are a few things worth addressing, and I think we should fix them as part of this PR so that you can try out that part of the PR workflow. Commits pushed to this branch are automatically included in the PR, and you can mark each comment as resolved as you push fixes.

iross · 2025-10-14T16:45:49Z

README.md

+
+## 📁 Repository Structure
+
+  - hold_classifer.py # Diagnoses held jobs and groups by hold reasons


It looks like the files have been renamed since this README was written, so this listing needs updating.

iross · 2025-10-14T16:52:23Z

analytics.py

+        if wall_time:
+            runtimes.append(wall_time)
+
+    from statistics import median


You've already imported statistics, so there's no need to import from it again here; you can just call statistics.median.
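
A minimal sketch of what I mean, assuming the module already has a top-level import of statistics. The job records and the wall_time key below are made up for illustration:

```python
import statistics

# Hypothetical job records; only the wall_time field matters here.
jobs = [
    {"wall_time": 120.0},
    {"wall_time": None},
    {"wall_time": 300.0},
    {"wall_time": 60.0},
]

runtimes = []
for job in jobs:
    wall_time = job["wall_time"]
    if wall_time:
        runtimes.append(wall_time)

# Use the module-level import instead of a second, function-local import.
median_runtime = statistics.median(runtimes) if runtimes else None
print(median_runtime)
```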

iross · 2025-10-14T16:56:26Z

analytics.py

+    print(f"{'Efficiency Notes':^80}")
+    print("=" * 80)
+
+    def warn(resource, efficiency):


I think the upper and lower bounds should be arguments. I can definitely see value in being able to set different thresholds for each resource type.
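
Something along these lines, as a rough sketch. The default bounds, the THRESHOLDS table, and the message wording are all placeholders I invented, and I've had warn return the note rather than print it just to keep the example easy to check:

```python
def warn(resource, efficiency, lower=0.2, upper=0.9):
    """Return a note when efficiency falls outside [lower, upper], else None.

    The default bounds are placeholders, not values taken from this PR.
    """
    if efficiency < lower:
        return f"{resource}: low efficiency ({efficiency:.0%}); consider requesting less."
    if efficiency > upper:
        return f"{resource}: close to the request ({efficiency:.0%}); consider requesting more."
    return None


# Hypothetical per-resource thresholds: (lower, upper).
THRESHOLDS = {"Memory": (0.3, 0.95), "Disk": (0.1, 0.9)}

for resource, efficiency in [("Memory", 0.25), ("Disk", 0.5)]:
    lower, upper = THRESHOLDS[resource]
    note = warn(resource, efficiency, lower=lower, upper=upper)
    if note:
        print(note)
```

With the bounds as keyword arguments, existing calls that pass only resource and efficiency keep working.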

iross · 2025-10-14T16:56:53Z

analytics.py

+
+if __name__ == "__main__":
+    if len(sys.argv) != 2:
+        print("Usage: python htcondor_cluster_summary.py <ClusterId>")


The script name in this usage message is wrong: the file is analytics.py, not htcondor_cluster_summary.py.

iross · 2025-10-14T16:57:53Z

analytics.py

+    filepath = os.path.join(data_dir, f"cluster_{cluster_id}_jobs.csv")
+
+    if not os.path.exists(filepath):
+        print(f"File not found: {filepath}")


This won't be an informative error message for a user: it reports a missing file path without saying that the ClusterId they passed is the likely cause.

iross · 2025-10-14T18:04:46Z

hold_bucket.py

+        code = ad.eval("HoldReasonCode")
+        subcode = ad.eval("HoldReasonSubCode")
+
+        reason = ad.eval("HoldReason").split('. ')[0]


Please add a comment here explaining the split. I'm guessing it drops a lot of the EP-specific detail that would otherwise muddy the similarities between hold reasons?
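
To illustrate my guess, here is a small sketch. The HoldReason strings are made up; real messages will look different:

```python
# Made-up HoldReason strings for illustration only.
reasons = [
    "Job exceeded its memory request. Peak usage was 2048 MB on slot1@ep-a",
    "Job exceeded its memory request. Peak usage was 4096 MB on slot2@ep-b",
]

# Keep only the first sentence so that trailing, machine-specific details
# do not split otherwise identical reasons into separate groups.
summaries = {reason.split(". ")[0] for reason in reasons}
print(summaries)
```

If that is the intent, a one-line comment above the split saying so would be enough.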

iross · 2025-10-14T18:07:30Z

hold_bucket.py

+        Dict[int, List[Tuple[str, int]]]: A dictionary mapping each HoldReasonCode to a list of (HoldReason, HoldReasonSubCode) tuples.
+"""
+def group_by_code(cluster_id):
+    global total_jobs


Why is this a global? It seems to only be used in this function.
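
A sketch of the local-variable version, if nothing else needs it. To keep it self-contained I pass in a list of (code, subcode, reason) tuples instead of querying by cluster_id, so the signature here is an assumption and not what the PR has:

```python
from collections import defaultdict


def group_by_code(held_jobs):
    """Group (reason, subcode) pairs by hold code and also return the job count."""
    total_jobs = 0  # local: nothing outside this function needs to mutate it
    groups = defaultdict(list)
    for code, subcode, reason in held_jobs:
        groups[code].append((reason, subcode))
        total_jobs += 1
    return dict(groups), total_jobs


# Hypothetical input records.
groups, total_jobs = group_by_code([(13, 2, "a"), (13, 0, "b"), (26, 1, "c")])
print(total_jobs, groups)
```

Returning the count alongside the groups lets callers use it without any shared module state.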

iross · 2025-10-14T18:14:44Z

query.py

+
+    while hits and len(all_hits) < MAX_RESULTS:
+        remaining = MAX_RESULTS - len(all_hits)
+        to_add = hits[:remaining]


I'm not seeing the reason for this. If hits is just the next page of results from ES, I don't see why we'd want to potentially truncate that collection.
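
Roughly what I'd expect instead, as a sketch. fetch_page is a stand-in I made up for the real ES request, and the MAX_RESULTS value is a placeholder:

```python
MAX_RESULTS = 5  # placeholder cap


def fetch_page(page, page_size=3, total=8):
    """Stand-in for the real ES request: returns the next page, or [] when done."""
    start = page * page_size
    return list(range(start, min(start + page_size, total)))


def collect_hits(max_results=MAX_RESULTS):
    all_hits = []
    page = 0
    hits = fetch_page(page)
    while hits and len(all_hits) < max_results:
        all_hits.extend(hits)  # keep the whole page; no per-page slicing
        page += 1
        hits = fetch_page(page)
    return all_hits


print(collect_hits())
```

This can overshoot the cap by up to one page. If MAX_RESULTS really is a hard limit, a single slice on the returned list would do, and a comment should say why the limit matters.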

iross · 2025-10-14T18:23:56Z

summarise.py

+def load_job_data(cluster_id, folder="cluster_data"):
+    filepath = os.path.join(folder, f"cluster_{cluster_id}_jobs.csv")
+    if not os.path.exists(filepath):
+        print(f"File not found: {filepath}")


Again, a user who sees this message won't necessarily understand that it resulted from a bad ClusterId (or at least from a cluster whose CSV isn't where the program expected).
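
One possible shape for the message, as a sketch. The wording and the missing_data_message helper are my own suggestions, and I'm assuming the function should return None when the file is missing:

```python
import csv
import os
import sys


def missing_data_message(cluster_id, filepath):
    """Build an error message that points at the ClusterId, not just the path."""
    return (
        f"No job data found for ClusterId {cluster_id}.\n"
        f"Expected a CSV at: {filepath}\n"
        "Check that the ClusterId is correct and that its CSV is in the expected folder."
    )


def load_job_data(cluster_id, folder="cluster_data"):
    filepath = os.path.join(folder, f"cluster_{cluster_id}_jobs.csv")
    if not os.path.exists(filepath):
        print(missing_data_message(cluster_id, filepath), file=sys.stderr)
        return None
    with open(filepath, newline="") as handle:
        return list(csv.DictReader(handle))
```

The same helper could serve the check in analytics.py, so both scripts report the problem the same way.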

iross · 2025-10-14T18:25:30Z

summarise.py

+    return jobs
+
+# safe conversion to float
+def safe_float(val):


Repeated patterns like this (and validate_params) can be moved into utils so that a change to the definition only needs to happen once.

Oh! It looks like you started to define things in utils, but the scripts don't import them from there yet.
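
For example, a sketch of what could live in utils. The default argument and the exact exception handling are my assumptions, so adjust them to match what the existing helper does:

```python
# utils.py (sketched; the real module in this PR may differ)
def safe_float(val, default=None):
    """Convert val to float, returning default when conversion is not possible."""
    try:
        return float(val)
    except (TypeError, ValueError):
        return default


# In summarise.py and analytics.py, the local copies would then be replaced by:
#     from utils import safe_float

print(safe_float("3.5"), safe_float("n/a"), safe_float(None))
```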

kashikamahajan and others added 2 commits October 13, 2025 15:22

Add files via upload

316217b

test

5cc2aa4

iross self-requested a review October 14, 2025 13:47

Delete test

7953ec7

iross reviewed Oct 14, 2025

Add note about internship

b390108


4 participants