Commit 7de9516

Add quick ref page
1 parent 918731d commit 7de9516

1 file changed

+74
-0
lines changed

@@ -0,0 +1,74 @@
1+
---
2+
layout: guide
3+
title: "Quick reference: HTCondor commands"
4+
alt_title: "Quick reference: HTCondor commands"
5+
guide:
6+
order: 5
7+
category: Basics and Policies
8+
tag: htc
9+
---
10+
11+
## Introduction
12+
13+
This page lists common HTCondor commands and options for jobs. Users familiar with HTCondor and job submission on CHTC's High Throughput Computing (HTC) system can use this page as a quick reference. If you are just starting out, we suggest reading the linked guides to understand the context of each command or option.
14+
15+
Note: Bracketed items (`<>`) denote where to place your input. Do not include the brackets in your command.
16+
17+
## Job submission
18+
19+
You can use these commands to submit, hold, or remove your jobs.
20+
21+
| Command | Use | Notes |
22+
| --- | --- | --- |
23+
| `condor_submit <submit_file>` | submits job(s) as specified by `submit_file` | See [job submission](htcondor-job-submission) |
24+
| `condor_submit -i <submit_file>` | submits an interactive job as specified by `submit_file` |
25+
| `condor_rm <username>` | removes all of your jobs |
26+
| `condor_rm <job_ID>` | removes the job(s) associated with `job_ID` |
27+
| `condor_hold <job_ID>` | holds the job(s) associated with `job_ID` |
28+
| `condor_release <job_ID>` | releases the held job(s) associated with `job_ID` |
29+
| `condor_ssh_to_job <job_ID>` | opens an SSH-like session on the execution point where the job associated with `job_ID` is running |
30+
{:.command-table}
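
As an illustration of replacing the bracketed placeholders, the sketch below walks one job through submit, hold, release, and remove. The submit file name `my_job.sub`, the `run` helper, and the `lifecycle` function are hypothetical; the job ID `12345.0` is only an example value. By default the script prints each command rather than running it, so it is safe to try anywhere.

```shell
#!/bin/sh
# Hypothetical values: replace these with your own submit file and job ID.
SUBMIT_FILE="my_job.sub"
JOB_ID="12345.0"

# Helper (not part of HTCondor): print the command when DRY_RUN is 1
# (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Note that the angle brackets from the table are not typed.
lifecycle() {
    run condor_submit "$SUBMIT_FILE"   # condor_submit <submit_file>
    run condor_hold "$JOB_ID"          # condor_hold <job_ID>
    run condor_release "$JOB_ID"       # condor_release <job_ID>
    run condor_rm "$JOB_ID"            # condor_rm <job_ID>
}

lifecycle
```

Setting `DRY_RUN=0` before calling `lifecycle` would run the commands for real, which only makes sense on an access point and with a job ID taken from your own `condor_q` output.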
31+
32+
## Monitor jobs
33+
34+
| Command | Use | Notes |
35+
| --- | --- | --- |
36+
| `condor_q` | displays status of your submitted jobs; jobs are batched by default | See [monitor your jobs](condor_q) |
37+
| `condor_q -nobatch` | displays status of your submitted jobs without the batched view |
38+
| `condor_q <job_ID>` | displays status of the job(s) associated with `job_ID` |
39+
| `condor_q -l <job_ID>` | lists all attributes of the job(s) associated with `job_ID` |
40+
| `condor_q -hold <job_ID>` | displays the hold reason for job(s) associated with `job_ID` |
41+
| `condor_q -better-analyze <job_ID>` | displays *simulated* results of the matching process associated with the job | This is a *starting point* for troubleshooting jobs sitting in the idle state. |
42+
| `watch_condor_q` | displays the "real-time" status of your jobs | Updated every 2 seconds. `CTRL + C` to exit. |
43+
| `condor_tail <job_ID>` | prints the end of the standard output (the last 1024 bytes by default) of the running job associated with `job_ID` |
44+
{:.command-table}
45+
46+
## Machine information
47+
48+
These commands display information about execution points, the machines that run jobs.
49+
50+
| Command | Use | Notes |
51+
| --- | --- | --- |
52+
| `condor_status` | lists all execution point slots |
53+
| `condor_status <execution_point>` | lists information about the specified `execution_point` |
54+
| `condor_status -l <execution_point>` | lists all attributes of `execution_point` |
55+
| `condor_status -compact` | lists all execution point slots in a compact view |
56+
| `condor_status -gpus` | lists all execution point slots with GPUs |
57+
{:.command-table}
58+
59+
## Glossary
60+
61+
| Term | Meaning |
62+
| --- | --- |
63+
| access point | The machine that you log in to when accessing CHTC's servers. You use this machine to prepare files and submit jobs. |
64+
| cluster ID | A unique number associated with a single job submission. |
65+
| error file / standard error | The file where your job typically prints any errors. |
66+
| execution point | The machine that executes or runs your job. |
67+
| idle | A job state in which your job has not yet matched to an execution point and so has not started running. |
68+
| job ID | The unique number associated with a job. This consists of a cluster ID, followed by a period and a process ID. For example, the job ID `12345.0` has a cluster ID of `12345` and a process ID of `0`. |
69+
| log file | The file where HTCondor prints messages about your job's execution and resource usage. |
70+
| output file / standard out | The file where your job typically prints output. Any messages printed to the "screen" in a job will be saved in this file. |
71+
| process ID | A unique number associated with each job within a job submission. See [submit multiple jobs](multiple-jobs). |
72+
| running | A job state in which your job has matched to an execution point and is currently running. |
73+
| held/hold | A job state in which your job is paused and will not run until it is released, typically because of an error or because it was placed on hold with `condor_hold`. |
74+
| submit file | A text-based file that tells HTCondor details about your job, including the commands to run, what files need to be transferred, where to save the outputs, and more. See [job submission](htcondor-job-submission). |
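
To make the job ID format from the glossary concrete, here is a minimal sketch that splits a job ID into its cluster ID and process ID using POSIX shell parameter expansion. The function names `cluster_id` and `process_id` are illustrative and are not part of HTCondor.

```shell
#!/bin/sh
# Split an HTCondor job ID of the form <cluster_ID>.<process_ID>.
# The function names below are illustrative, not HTCondor commands.

cluster_id() {
    # Remove the first period and everything after it.
    printf '%s\n' "${1%%.*}"
}

process_id() {
    # Remove everything up to and including the first period.
    printf '%s\n' "${1#*.}"
}

job_id="12345.0"
echo "cluster ID: $(cluster_id "$job_id")"   # cluster ID: 12345
echo "process ID: $(process_id "$job_id")"   # process ID: 0
```

This can be handy when a script needs to act on a whole submission (the cluster ID) rather than a single job within it.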
