| 1 | +--- |
| 2 | +highlighter: none |
| 3 | +layout: guide |
| 4 | +title: Request variable memory |
| 5 | +guide: |
| 6 | + category: Troubleshooting |
| 7 | + tag: |
| 8 | + - htc |
| 9 | +--- |
| 10 | + |
| 11 | +## Introduction |
| 12 | + |
| 13 | +**This page outlines strategies for requesting variable amounts of memory in jobs.** This guide is for users whose jobs' memory usage may spike unexpectedly or vary depending on inputs or other conditions. |
| 14 | + |
| 15 | +{% capture content %} |
| 16 | +- [Introduction](#introduction) |
| 17 | +- [Why you should care about memory usage](#why-you-should-care-about-memory-usage) |
| 18 | +- [Use `retry_request_memory`](#use-retry_request_memory) |
| 19 | +- [Related pages](#related-pages) |
| 20 | +{% endcapture %} |
| 21 | +{% include /components/directory.html title="Table of Contents" %} |
| 22 | + |
| 23 | +If your job has ever gone on hold for exceeding its memory request, you've probably solved it by increasing the `request_memory` value in your submit file. You might even always over-request memory, just to be on the safe side. |
| 24 | + |
| 25 | +## Why you should care about memory usage |
| 26 | + |
| 27 | +Because CHTC is a shared resource, correctly requesting the resources that you require for your jobs to function ensures that both you and other users have a good experience on the system. |
| 28 | + |
| 29 | +* **Over-requesting memory** may cause your jobs to **sit idle** longer than needed, since HTCondor has to find machines that can satisfy the larger request. The memory your job reserves but does not use could otherwise run other users' jobs. |
| 30 | + |
| 31 | +* **Under-requesting memory** may cause your jobs to **go on hold** when they exceed the memory allocated to them. Any work the job has completed will be lost, but the computing time it used will still count against your priority. |
| 32 | + |
| 33 | +But what if only a **fraction** of your jobs need more memory than the rest? How can you get the throughput you need without over-requesting memory for every job? |
| 34 | + |
| 35 | +## Use `retry_request_memory` |
| 36 | + |
| 37 | +This submit file option works well for batches where **a few jobs have unexpected spikes in memory usage**. To use this feature, add this line to your submit file: |
| 38 | + |
| 39 | +``` |
| 40 | +retry_request_memory = <memory> |
| 41 | +``` |
| 42 | + |
| 43 | +If your job is evicted because it uses more memory than allocated, the `retry_request_memory` option tells HTCondor to retry the job with the specified increased memory. |
| 44 | + |
| 45 | +For example, if you use these lines in your submit file: |
| 46 | + |
| 47 | +``` |
| 48 | +request_memory = 1 GB |
| 49 | +retry_request_memory = 4 GB |
| 50 | +``` |
| 51 | + |
| 52 | +Each job generated in this submission will request 1 GB of memory. If the job is evicted because it uses more than 1 GB of memory, the job will be restarted with 4 GB of memory. |
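|  | + |
|  | +To show where these two lines might sit in a complete submit file, here is a minimal sketch. The executable name, log and output file names, input list, and the CPU and disk requests are hypothetical placeholders, not values from this guide: |
|  | + |
|  | +``` |
|  | +# Hypothetical file names; replace with your own |
|  | +executable = run_analysis.sh |
|  | +arguments  = $(input) |
|  | + |
|  | +log    = job_$(Cluster).log |
|  | +output = job_$(Cluster)_$(Process).out |
|  | +error  = job_$(Cluster)_$(Process).err |
|  | + |
|  | +request_cpus = 1 |
|  | +request_disk = 2 GB |
|  | +# Every job first asks for 1 GB |
|  | +request_memory = 1 GB |
|  | +# Only jobs evicted for exceeding 1 GB are retried with 4 GB |
|  | +retry_request_memory = 4 GB |
|  | + |
|  | +# One job per line of the (hypothetical) input list |
|  | +queue input from input_list.txt |
|  | +``` |
|  | + |
|  | +With a layout like this, most jobs only need to match a 1 GB slot, and only the outliers wait for a larger one. |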
| 53 | + |
| 54 | +You may also use expressions: |
| 55 | + |
| 56 | +``` |
| 57 | +request_memory = 1 GB |
| 58 | +retry_request_memory = RequestMemory*4 |
| 59 | +``` |
| 60 | + |
| 61 | +When using expressions: |
| 62 | + |
| 63 | +* We recommend *only* multiplying by integers. |
| 64 | +* Expressions using addition operators or floating point numbers are not recommended. |
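|  | + |
|  | +As a worked illustration of the integer-multiplier recommendation, the sketch below uses hypothetical values. It assumes that `RequestMemory` holds the `request_memory` value in megabytes, so that 2 GB is stored as 2048: |
|  | + |
|  | +``` |
|  | +# First attempt: 2 GB, assumed to be stored in RequestMemory as 2048 (MB) |
|  | +request_memory = 2 GB |
|  | +# Retry after a memory eviction: 2048 * 3 = 6144 MB (6 GB) |
|  | +retry_request_memory = RequestMemory*3 |
|  | +``` |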
| 65 | + |
| 66 | +## Related pages |
| 67 | + |
| 68 | +* [HTCondor manual reference](https://htcondor.readthedocs.io/en/main/man-pages/condor_submit.html#retry_request_memory) |
| 69 | +* [Job submission basics](htcondor-job-submission) |
| 70 | +* [Monitor your job](condor_q) |