| 1 | +--- |
| 2 | +layout: guide |
| 3 | +title: "Quick reference: HTCondor commands" |
| 4 | +alt_title: "Quick reference: HTCondor commands" |
| 5 | +guide: |
| 6 | + order: 5 |
| 7 | + category: Basics and Policies |
| 8 | + tag: htc |
| 9 | +--- |
| 10 | + |
| 11 | +## Introduction |
| 12 | + |
| 13 | +This page lists common HTCondor commands and options for jobs. Users familiar with HTCondor and job submission on CHTC's High Throughput Computing (HTC) system can use this page as a quick reference. If you are just starting out, we suggest reading the linked guides to understand the context of each command or option. |
| 14 | + |
| 15 | +Note: Bracketed items (`<>`) denote where to place your input. Do not include the brackets in your command. |
| 16 | + |
| 17 | +## Job submission |
| 18 | + |
| 19 | +You can use these commands to submit, hold, or remove your jobs. |
| 20 | + |
| 21 | +| Command | Use | Notes | |
| 22 | +| --- | --- | --- | |
| 23 | +| `condor_submit <submit_file>` | submits job(s) as specified by `submit_file` | See [job submission](htcondor-job-submission) | |
| 24 | +| `condor_submit -i <submit_file>` | submits an interactive job as specified by `submit_file` | |
| 25 | +| `condor_rm <username>` | removes all of your jobs | |
| 26 | +| `condor_rm <job_ID>` | removes the job(s) associated with `job_ID` | |
| 27 | +| `condor_hold <job_ID>` | holds the job(s) associated with `job_ID` | |
| 28 | +| `condor_release <job_ID>` | releases the held job(s) associated with `job_ID` | |
| 29 | +| `condor_ssh_to_job <job_ID>` | allows you to "ssh" to the execution point on which the job associated with `job_ID` is running | |
| 30 | +{:.command-table} |
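
As a minimal sketch of the submission workflow above, the shell snippet below writes a small submit file and lists the commands you would typically run next. The file names (`example.sub`, `my_script.sh`) and the resource requests are hypothetical placeholders, not CHTC recommendations; adjust them for your own job.

```shell
#!/bin/sh
# Sketch only: example.sub, my_script.sh, and the resource requests
# are hypothetical placeholders.

# Write a minimal submit file. Quoting 'EOF' keeps the shell from
# expanding $(Cluster) and $(Process); HTCondor fills those in itself.
cat > example.sub <<'EOF'
executable     = my_script.sh
log            = job_$(Cluster).log
output         = job_$(Cluster)_$(Process).out
error          = job_$(Cluster)_$(Process).err
request_cpus   = 1
request_memory = 1GB
request_disk   = 1GB
queue 1
EOF

echo "Wrote example.sub"

# Typical follow-up on an access point (not run by this sketch):
#   condor_submit example.sub   # submit the job(s)
#   condor_hold <job_ID>        # hold a job
#   condor_release <job_ID>     # release a held job
#   condor_rm <job_ID>          # remove a job
```

Running the snippet only creates `example.sub`; nothing is submitted until you run `condor_submit example.sub` yourself on an access point.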
| 31 | + |
| 32 | +## Monitor jobs |
| 33 | + |
| 34 | +| Command | Use | Notes | |
| 35 | +| --- | --- | --- | |
| 36 | +| `condor_q` | displays status of your submitted jobs; jobs are batched by default | See [monitor your jobs](condor_q) | |
| 37 | +| `condor_q -nobatch` | displays status of your submitted jobs without the batched view | |
| 38 | +| `condor_q <job_ID>` | displays status of the job(s) associated with `job_ID` | |
| 39 | +| `condor_q -l <job_ID>` | lists all attributes of the job(s) associated with `job_ID` | |
| 40 | +| `condor_q -hold <job_ID>` | displays the hold reason for job(s) associated with `job_ID` | |
| 41 | +| `condor_q -better-analyze <job_ID>` | displays *simulated* results of the matching process associated with the job | This is a *starting point* for troubleshooting jobs sitting in the idle state. | |
| 42 | +| `watch_condor_q` | displays the "real-time" status of your jobs | Updated every 2 seconds. Press `CTRL + C` to exit. | |
| 43 | +| `condor_tail <job_ID>` | prints the end of the standard output (the last 1024 bytes by default) of the running job associated with `job_ID` | |
| 44 | +{:.command-table} |
| 45 | + |
| 46 | +## Machine information |
| 47 | + |
| 48 | +These commands display information about the execution points, the machines that run jobs. |
| 49 | + |
| 50 | +| Command | Use | Notes | |
| 51 | +| --- | --- | --- | |
| 52 | +| `condor_status` | lists all execution point slots | |
| 53 | +| `condor_status <execution_point>` | lists information about the specified `execution_point` | |
| 54 | +| `condor_status -l <execution_point>` | lists all attributes of `execution_point` | |
| 55 | +| `condor_status -compact` | lists all execution point slots in a compact view | |
| 56 | +| `condor_status -gpus` | lists all execution point slots with GPUs | |
| 57 | +{:.command-table} |
| 58 | + |
| 59 | +## Glossary |
| 60 | + |
| 61 | +| Term | Meaning | |
| 62 | +| --- | --- | |
| 63 | +| access point | The machine that you log into to access CHTC's servers. This is the machine you use to prepare files for job submission and submit jobs. | |
| 64 | +| cluster ID | A unique number associated with a single job submission. | |
| 65 | +| error file / standard error | The file where your job typically prints any errors. | |
| 66 | +| execution point | The machine that executes or runs your job. | |
| 67 | +| idle | A job state in which your job has not yet matched to an execution point and has not started running. | |
| 68 | +| job ID | The unique number associated with a job. This consists of a cluster ID, followed by a period and a process ID. For example, the job ID `12345.0` has a cluster ID of `12345` and a process ID of `0`. | |
| 69 | +| log file | The file where HTCondor prints messages about your job's execution and resource usage. | |
| 70 | +| output file / standard out | The file where your job typically prints output. Any messages printed to the "screen" in a job will be saved in this file. | |
| 71 | +| process ID | A unique number associated with each job within a job submission. See [submit multiple jobs](multiple-jobs). | |
| 72 | +| running | A job state in which your job has matched to an execution point and is currently running. | |
| 73 | +| held/hold | A job state in which your job is paused, typically because of an error, and will not run again until it is released. | |
| 74 | +| submit file | A text-based file that tells HTCondor details about your job, including the commands to run, what files need to be transferred, where to save the outputs, and more. See [job submission](htcondor-job-submission). | |
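
To illustrate the job ID format described in the glossary, here is a small sketch that splits a job ID into its cluster ID and process ID using POSIX shell parameter expansion. The variable names are hypothetical and chosen only for this example.

```shell
#!/bin/sh
# Split a job ID of the form <cluster ID>.<process ID>.
job_id="12345.0"            # example value from the glossary

cluster_id="${job_id%%.*}"  # drop everything from the first period onward
process_id="${job_id#*.}"   # drop everything up to and including the first period

echo "cluster ID: $cluster_id"
echo "process ID: $process_id"
```

For the glossary's example `12345.0`, this prints a cluster ID of `12345` and a process ID of `0`.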