## Introduction

**This page outlines strategies for requesting variable amounts of memory in jobs.** This guide is for users whose jobs' memory usage may spike unexpectedly or vary depending on inputs or other conditions.

{% capture content %}
{% include /components/directory.html title="Table of Contents" %}

If your job has ever gone on hold for exceeding memory use, you've probably solved it by increasing your `request_memory` attribute in your submit file. You might even always over-request memory, just to be on the safe side. But have you ever checked your HTCondor `.log` file to see how much memory you actually used?
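
When a job finishes, the termination event in the `.log` file includes a table comparing the resources the job actually used to what you requested. A sketch of what that excerpt can look like (the numbers here are purely illustrative):

```
Partitionable Resources :    Usage  Request Allocated
   Cpus                 :                 1         1
   Disk (KB)            :      75  1048576   1468671
   Memory (MB)          :     368     1024      1024
```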

## Why you should care about memory usage

* **Over-requesting memory** may cause your jobs to **wait in idle** for longer than needed, since HTCondor needs to find and allocate these larger resource requests for your jobs. Additionally, CHTC's HTC system is a shared resource, so we encourage you to be a good citizen and only request the resources you need for your jobs.

* **Under-requesting memory** may cause your jobs to **go on hold** when they exceed the memory allocated to them.

> But what if a fraction of your jobs needs more memory than the rest? How can you get the throughput you need without over-requesting memory?

## Use `retry_request_memory`

This submit file option is good for workloads where a **few of the jobs have unexpected spikes in memory usage**. To use this feature, add this line to your submit file:

```
retry_request_memory = <memory>
```

For example, if you use these lines in your submit file:
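
```
request_memory = 1 GB
retry_request_memory = 4 GB
```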

Each job generated in this submission will request 1 GB of memory. If the job is evicted because it uses more than 1 GB of memory, the job will be restarted with 4 GB of memory.
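
To see where these lines fit, here is a minimal submit file sketch; the executable, file names, and number of jobs are illustrative placeholders, not part of the feature itself:

```
# Start each job at 1 GB; if a job is evicted for exceeding
# its request, retry it with a 4 GB request instead.
request_memory       = 1 GB
retry_request_memory = 4 GB

# Illustrative job setup (replace with your own).
executable = my_analysis.sh
arguments  = input_$(Process).dat
log        = job_$(Cluster)_$(Process).log
output     = job_$(Cluster)_$(Process).out
error      = job_$(Cluster)_$(Process).err

queue 10
```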

You may also use expressions:

```
request_memory = 1 GB
retry_request_memory = RequestMemory*4
```

When using expressions:

* We recommend *only* multiplying by integers.
* Addition expressions and floating point numbers are not recommended.
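
For instance, a sketch contrasting the recommended and discouraged forms (the starting value and the multipliers are illustrative):

```
# Recommended: an integer multiplier on the previous request
request_memory       = 2 GB
retry_request_memory = RequestMemory*2

# Not recommended: floating point multipliers or addition, e.g.
#   retry_request_memory = RequestMemory*1.5
#   retry_request_memory = RequestMemory + 1024
```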