Commit 8653ea7

Merge pull request #933 from CHTC/shell-submit-option
Add `shell` option to job submission guide
2 parents 89e9cdb + 7902a1b commit 8653ea7

1 file changed

+80
-22
lines changed

_uw-research-computing/htcondor-job-submission.md

Lines changed: 80 additions & 22 deletions
@@ -35,6 +35,7 @@ We are going to run the traditional 'hello world' program with a CHTC twist. In
3535

3636
> You can follow along with the job submission tutorial outlined in this guide in video format.
3737
> <iframe width="560" height="315" src="https://www.youtube.com/embed/d5siupeu2kE?si=32FUkZyceV9ROfb1" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
38+
> You may notice that the example in the video is slightly different: it uses `executable` and `arguments` in the submit file instead of `shell`. This reflects an older submit convention; either approach still works!
3839
{:.tip}
3940

4041
### Prepare job executable and submit file on an Access Point
@@ -55,50 +56,76 @@ We are going to run the traditional 'hello world' program with a CHTC twist. In
5556
sleep 180
5657
```
5758

58-
This script would be run locally on our terminal by typing `hello-world.sh <FirstArgument>`.
59-
However, to run it on CHTC, we will use our HTCondor submit file to run the `hello-world.sh` executable and to automatically pass different arguments to our script.
59+
Let's test this script locally. First, let's add *executable* permissions to the script with the `chmod` command, which allows us to execute the code.
60+
61+
```
62+
chmod +x hello-world.sh
63+
```
64+
{:.term}
65+
66+
Test the code by typing the following line:
67+
```
68+
./hello-world.sh 0
69+
```
70+
{:.term}
71+
72+
You should see a message printed to the terminal, like so:
73+
74+
```
75+
[alice@ap2002 hello-world]$ ./hello-world.sh 0
76+
Hello CHTC from Job 0 running on alice@ap2002
77+
```
78+
{:.term}
79+
80+
The terminal will pause for 3 minutes, as specified by `sleep 180` in our script. Cancel the pause by pressing `CTRL + C`. Now we've successfully run the script locally!
81+
82+
However, to run it on CHTC, we will use our HTCondor submit file to run the `hello-world.sh` executable and to automatically pass different arguments to our script.
83+
84+
> ### ⚠️ Do not test your full workload directly on the Access Points!
85+
{:.tip-header}
86+
87+
> Simple scripts, such as this example, which use few compute resources, are safe to test, but **any script or executable that requires computing power or excessive memory should be tested inside of a job.**
88+
{:.tip}
6089

6190
3. Prepare your HTCondor submit file, which you will use to tell HTCondor what job to run and how to run it.
6291
Copy the text below, and paste it into a file called `hello-world.sub`.
6392
This is the file you will submit to HTCondor to describe your jobs (known as the submit file).
6493

6594
```
6695
# hello-world.sub
67-
# My HTCondor submit file
6896
6997
# Specify your executable (single binary or a script that runs several
7098
# commands) and arguments to be passed to jobs.
7199
# $(Process) will be an integer number for each job, starting with "0"
72100
# and increasing for the relevant number of jobs.
73-
executable = hello-world.sh
74-
arguments = $(Process)
101+
shell = ./hello-world.sh $(Process)
75102
76-
# Specify the name of the log, standard error, and standard output (or "screen output") files. Wherever you see $(Cluster), HTCondor will insert the
103+
# Specify the name of the log, standard error, and standard output (or
104+
# "screen output") files. Wherever you see $(Cluster), HTCondor will insert the
77105
# queue number assigned to this set of jobs at the time of submission.
78106
log = hello-world_$(Cluster)_$(Process).log
79107
error = hello-world_$(Cluster)_$(Process).err
80108
output = hello-world_$(Cluster)_$(Process).out
81109
82-
# This line *would* be used if there were any other files
83-
# needed for the executable to use.
84-
# transfer_input_files = file1,/absolute/pathto/file2,etc
110+
# Transfer our executable script
111+
transfer_input_files = hello-world.sh
85112
86-
# Tell HTCondor requirements (e.g., operating system) your job needs,
87-
# what amount of compute resources each job will need on the computer where it runs.
113+
# Requirements (e.g., operating system) your job needs, and the amount of
114+
# compute resources each job will need on the computer where it runs.
88115
request_cpus = 1
89116
request_memory = 1GB
90117
request_disk = 5GB
91118
92-
# Tell HTCondor to run 3 instances of our job:
119+
# Run 3 instances of our job:
93120
queue 3
94121
```
95122
{:.sub}
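To make the file naming pattern concrete, here is a minimal sketch of how the `log` line would expand for `queue 3`. The cluster ID `12345` and the function name `log_names` are made up for illustration; HTCondor assigns the real cluster ID at submission time.

```shell
#!/bin/bash
# Hypothetical cluster ID; HTCondor chooses the real value when you submit.
cluster=12345

# Print the log file name for each of the 3 jobs created by `queue 3`.
# $(Process) takes the values 0, 1, and 2.
log_names() {
    for process in 0 1 2; do
        echo "hello-world_${cluster}_${process}.log"
    done
}

log_names
```

The `error` and `output` lines follow the same pattern with the `.err` and `.out` extensions, so each job writes to its own set of files.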
96123

97-
By using the "$1" variable in our hello-world.sh executable, we are telling HTCondor to fetch the value of the argument in the first position in the submit file and to insert it in location of "$1" in our executable file.
124+
By using the `$1` variable in our `hello-world.sh` executable, we are telling HTCondor to fetch the value of the argument in the first position in the submit file and to insert it in place of `$1` in our executable file.
98125

99-
Therefore, when HTCondor runs this executable, it will pass the $(Process) value for each job and hello-world.sh will insert that value for "$1" in hello-world.sh.
126+
Therefore, when HTCondor runs this executable, it will pass the `$(Process)` value for each job, and `hello-world.sh` will substitute that value for `$1`.
100127

101-
More information on special variables like "$1", "$2", and "$@" can be found [here](https://swcarpentry.github.io/shell-novice/06-script.html).
128+
More information on special variables like `$1`, `$2`, and `$@` can be found [here](https://swcarpentry.github.io/shell-novice/06-script.html).
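As a rough illustration of how `$1` behaves, the sketch below uses a shell function as a simplified stand-in for `hello-world.sh` (the function name `hello_world` is hypothetical, and the message omits the user and host name that the real script prints). The loop imitates the three `$(Process)` values that `queue 3` would supply.

```shell
#!/bin/bash
# Simplified stand-in for hello-world.sh, written as a function so the
# example runs on its own. The name `hello_world` is hypothetical.
hello_world() {
    # "$1" expands to the first argument given to the function, just as it
    # expands to the first argument given to a script.
    echo "Hello CHTC from Job $1"
}

# Imitate `queue 3`: call once for each $(Process) value 0, 1, and 2.
for process in 0 1 2; do
    hello_world "$process"
done
```

Each call sees a different value in `$1`, which is the same mechanism that lets one submit file drive several jobs with different arguments.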
102129

103130
5. Now, submit your job to HTCondor’s queue using `condor_submit`:
104131

@@ -191,7 +218,7 @@ We are going to run the traditional 'hello world' program with a CHTC twist. In
191218

192219
## Important Workflow Elements
193220

194-
**A. Removing Jobs**
221+
### Removing Jobs
195222

196223
To remove a specific job, use `condor_rm <JobID, ClusterID, Username>`.
197224
Example:
@@ -201,13 +228,11 @@ Example:
201228
```
202229
{:.term}
203230

204-
**B. Importance of Testing & Resource Optimization**
231+
### Test and Optimize Resources
205232

206-
1. **Examine Job Success** Within the log file, you can see information about the completion of each job, including a system error code (as seen in "return value 0").
207-
You can use this code, as well as information in your ".err" file and other output files, to determine what issues your job(s) may have had, if any.
233+
1. **Examine Job Success**. Within the log file, you can see information about the completion of each job, including a system error code (as seen in "return value 0"). You can use this code, as well as information in your ".err" file and other output files, to determine what issues your job(s) may have had, if any.
208234

209-
2. **Improve Efficiency** Researchers with input and output files greater than 1GB, should store them in their `/staging` directory instead of `/home` to improve file transfer efficiency.
210-
See our data transfer guides to learn more.
235+
2. **Improve Efficiency**. Researchers with input and output files greater than 1GB should store them in their `/staging` directory instead of `/home` to improve file transfer efficiency. See our [data transfer guides](htc-job-file-transfer) to learn more.
211236

212237
3. **Get the Right Resource Requests**
213238
Be sure to always add or modify the following lines in your submit files, as appropriate, and after running a few tests.
@@ -237,4 +262,37 @@ Example:
237262
To learn more about why a job has gone on hold, use `condor_q -hold`.
238263
When you request too much, your jobs may not match to as many available "slots" as they could otherwise, and your overall throughput will suffer.
239264

240-
## You have the basics, now you are ready to run your OWN jobs!
265+
## Use `shell` or `executable`/`arguments` in your submit file
266+
267+
You can use either `shell` or the pair `executable` and `arguments` in your submit file to specify how to run your jobs.
268+
269+
### Option 1: Submit with `shell`
270+
271+
You can use `shell` to specify the whole command you want to run.
272+
273+
```
274+
shell = ./hello-world.sh $(Process)
275+
transfer_input_files = hello-world.sh
276+
```
277+
278+
When using `shell`, consider:
279+
280+
* **Do you need to transfer your executable?** You may need to add your executable script (e.g., `hello-world.sh`) to the `transfer_input_files` line, as HTCondor cannot autodetect scripts to be transferred.
281+
* If you are using `./` to execute your code, as in the example above, **ensure your shell script has executable permissions** with the `chmod +x <script>` command.
282+
* Alternatively, **you may use a shell like `bash` to execute your code** (e.g., `shell = bash hello-world.sh 0`). When you use this option, you do not have to give your shell script executable permissions.
283+
* **Keep your `shell` script simple**; quoting and special characters may throw errors. If you need complex scripting, we recommend writing a wrapper script.
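A wrapper script along the lines of the last point might look like the sketch below. The file name `wrapper.sh`, the function name `run_job`, and the uppercase step are all hypothetical, chosen only to show quoting and a pipe living inside a script rather than in the `shell` line.

```shell
#!/bin/bash
# wrapper.sh: hypothetical wrapper script; the name and steps are illustrative.

run_job() {
    local job_id="$1"
    # Quotes, variables, and pipes are interpreted by bash here, so none of
    # them need to appear in the submit file.
    local message="Hello CHTC from Job ${job_id}"
    echo "${message}" | tr '[:lower:]' '[:upper:]'
}

# Pass the script's first argument through; default to 0 if none is given.
run_job "${1:-0}"
```

Under these assumptions, the submit file would only need a simple line such as `shell = ./wrapper.sh $(Process)`, with `wrapper.sh` listed in `transfer_input_files` and made executable with `chmod +x wrapper.sh`.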
284+
285+
### Option 2: `executable` and `arguments`
286+
287+
In this convention, you break your command into two parts—the executable and the arguments.
288+
289+
```
290+
executable = hello-world.sh
291+
arguments = $(Process)
292+
```
293+
294+
When using this option:
295+
296+
* **HTCondor will transfer your executable by default.** You do not need to list your executable in `transfer_input_files`.
297+
* You do not have to add a `./` or `/bin/bash` to the beginning of your `executable` line.
298+
* You do not have to give your `executable` script executable permissions.
