_uw-research-computing/htc-job-file-transfer.md
Lines changed: 53 additions & 29 deletions
@@ -8,18 +8,24 @@ guide:
8
8
- htc
9
9
---
10
10
11
+
## Introduction
12
+
13
+
This guide covers general information on using and transferring data on the HTC system. We will introduce you to the two file systems, how to determine which one is the best place for your data, and how to edit your submit file to transfer input and output files.
-[Transferring Data to Jobs with `transfer_input_files`](#transferring-data-to-jobs-with-transfer_input_files)
14
-
*[Important Note: File Transfers and Caching with `osdf:///`](#important-note-file-transfers-and-caching-with-osdf)
15
-
-[Transferring Data Back from Jobs to `/home` or `/staging`](#transferring-data-back-from-jobs-to-home-or-staging)
16
-
*[Default Behavior for Transferring Output Files](#default-behavior-for-transferring-output-files)
17
-
*[Specify Which Output Files to Transfer with `transfer_output_files` and `transfer_output_remaps`](#specify-which-output-files-to-transfer-with-transfer_output_files-and-transfer_output_remaps)
@@ -43,17 +49,23 @@ The data management mechanisms behind `/home` and `/staging` are different and a
43
49
</div>
44
50
45
51
46
-
## Transferring Data to Jobs with `transfer_input_files`
52
+
## Transfer input data to jobs with `transfer_input_files`
47
53
48
-
In the HTCondor submit file, `transfer_input_files` should always be used to tell HTCondor what files to transfer to each job, regardless of if that file originates from your `/home` or `/staging` directory. However, the syntax you use to tell HTCondor to fetch files from `/home` and `/staging` and transfer to your job will change depending on the file size.
54
+
To transfer files to jobs, we must specify these files with `transfer_input_files` in the HTCondor job submit file. The syntax you use depends on each file's location and size.
49
55
50
-
| Input Sizes| File Location | Submit File Syntax to Transfer to Jobs |
56
+
| Input File Size (Per File)*| File Location | Submit File Syntax to Transfer to Jobs |
| 100 GB+ || Contact the facilitation team about the best strategy to stage your data |
63
+
64
+
<caption>
65
+
<sup>*</sup> If you are transferring many small files, we recommend <a href="transfer-files-computer#transfer-multiple-files-using-tarballs">compressing them into a single file (.zip, .tar.gz)</a> before transfer. Use the size of the compressed file to determine where to place it.<br>
66
+
<sup>†</sup> Only files in personal staging directories can be transferred to jobs with the <code>osdf:///</code> protocol. Files in shared directories (i.e. <code>/staging/groups</code>) currently cannot be transferred to jobs with <code>osdf:///</code> and should use <code>file:///</code>.<br>
67
+
<!--<sup>‡</sup> While available on external pools, file transfer performance may be limited.-->
68
+
</caption><br>
57
69
58
70
Multiple input files and file transfer protocols can be specified and delimited by commas, as shown below:
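A minimal sketch of a comma-delimited `transfer_input_files` line that mixes protocols. The file names, the `username` and `group_dir` placeholders, and the `/chtc/staging` prefix on the `osdf:///` path are assumptions for illustration; which protocol applies to a given file depends on its size and location, per the table above.

```
# Hypothetical files: one from the submit directory, one from a personal
# staging directory (osdf:///), and one from a shared group directory (file:///)
transfer_input_files = small_input.txt, osdf:///chtc/staging/username/input_data.tar.gz, file:///staging/groups/group_dir/shared_data.tar.gz
```

The shared group file uses `file:///` because files under `/staging/groups` cannot currently be transferred with `osdf:///`.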
Ensure you are using the correct file transfer protocol for efficiency. Failure to use the right protocol can result in slow file transfers or overloading the system.
70
82
71
-
### Important Note: File Transfers and Caching with `osdf:///`
72
-
The `osdf:///` file transfer protocol uses a [caching](https://en.wikipedia.org/wiki/Cache_(computing)) mechanism for input files to reduce file transfers over the network. This can affect users who refer to input files that are frequently modified.
83
+
> ### ⚠️ File transfers and caching with `osdf:///`
84
+
{:.tip-header}
73
85
74
-
*If you are changing the contents of the input files frequently, you should rename the file or change its path to ensure the new version is transferred.*
86
+
> The `osdf:///` file transfer protocol uses a [caching](https://en.wikipedia.org/wiki/Cache_(computing)) mechanism for input files to reduce file transfers over the network.
87
+
>
88
+
> The caching mechanism enables faster transfers for frequently used files/containers. However, older versions of frequently modified files may be transferred instead of the latest version.
89
+
>
90
+
> **If you are changing the contents of the input files frequently, you should rename the file or change its path to ensure the new version is transferred.**
91
+
{:.tip}
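As a sketch of the renaming advice, assuming a hypothetical input file `reference.tar.gz` in a personal staging directory (the `/chtc/staging/username` path is also an assumption): giving the updated file a new name means the submit file no longer points at a path that may already be cached.

```
# Before: an older cached copy of this file may be transferred
# even after its contents change
# transfer_input_files = osdf:///chtc/staging/username/reference.tar.gz

# After: the updated file is saved under a new (hypothetical) name,
# so the new version is transferred
transfer_input_files = osdf:///chtc/staging/username/reference_v2.tar.gz
```

A version number or date in the file name is one simple convention; changing the directory path works in the same way.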
75
92
76
-
## Transferring Data Back from Jobs to `/home` or `/staging`
93
+
## Transfer output data from jobs
77
94
78
-
### Default Behavior for Transferring Output Files
79
-
When a job completes, by default, HTCondor will return **newly created or edited files only in top-level directory** back to your `/home` directory. **Files in subdirectories are *not* transferred.** Ensure that the files you want are in the top-level directory by moving them, [creating tarballs](transfer-files-computer#transfer-multiple-files-using-tarballs), or specifying them in your submit file.
95
+
### Default behavior for transferring output files
96
+
When a job completes, by default, HTCondor will only return **newly created or edited files in the top-level directory** back to your `/home` directory. **Files in subdirectories are *not* transferred.** Ensure that the files you want are in the top-level directory by moving them, [creating tarballs](transfer-files-computer#transfer-multiple-files-using-tarballs), or specifying them in your submit file.
<caption>The directory structure of an example job on the execution point. In this example, according to its default behavior, HTCondor will only transfer the newly created "output_file" and will not transfer the subdirectory "output/".</caption>
83
100
84
-
### Specify Which Output Files to Transfer with `transfer_output_files` and `transfer_output_remaps`
101
+
### Specify which output files to transfer with `transfer_output_files`
85
102
If you don't want to transfer all files but only *specific files*, in your HTCondor submit file, use
In this example above, `file1.txt` is remapped to the staging directory using the `file:///` transfer protocol and simultaneously renamed `output1.txt`. In addition, `file2.txt` is renamed to `output2.txt` and will be transferred to a different directory on `/home`. Ensure you have the right file transfer syntax (`osdf:///` or `file:///` depending on the anticipated file size).
116
+
In the example above, `output_file` is remapped to the staging directory using the `file:///` transfer protocol and simultaneously renamed `output1.txt`. In addition, `output_file2` is transferred to a different directory on `/home`. The last output file, `output_file3`, is transferred back to the directory from which the job was submitted. Ensure you use the right file transfer syntax (`osdf:///` or `file:///`, depending on the anticipated file size).
117
+
118
+
Wrap the entire value passed to `transfer_output_remaps` in a single set of quotation marks.
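A sketch of a submit-file fragment consistent with the description above, assuming all three output files are written to the job's top-level directory. The `username` placeholder and the destination directories are hypothetical.

```
# List the specific files to transfer back
transfer_output_files = output_file, output_file2, output_file3

# One pair of quotation marks around the whole value; remaps separated by semicolons.
# output_file3 has no remap, so it returns to the submit directory.
transfer_output_remaps = "output_file = file:///staging/username/output1.txt; output_file2 = /home/username/results/output_file2"
```

Placing quotation marks around each individual remap, rather than around the full value, is a common source of submit errors.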
119
+
120
+
### Transfer files to other locations with `output_destination`
121
+
122
+
If you want to transfer *all* files to a specific destination, use `output_destination`:
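A minimal sketch of such a line, assuming a hypothetical personal staging directory and the `osdf:///` protocol; the `/chtc/staging/username` path is an assumption, and the choice between `osdf:///` and `file:///` follows the same size and location considerations as for input files.

```
# Send all output files from the job to one (hypothetical) destination
output_destination = osdf:///chtc/staging/username/job_results/
```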
98
123
99
-
If you have multiple files or folders to transfer back to `/staging`, use a semicolon (;) to separate each object: