
After first pass, RAM usage increases unexpectedly. Starts swapping and eventually crashes #1001

Open
Furious-George opened this issue Feb 26, 2022 · 0 comments


I'm using madmax 1.1.8-ecec17d.

I have a supermicro X9DRI motherboard, with 512 GB of ddr3.

Prior to starting plotting, less than 1% of the system's memory is in use, even before I stop superfluous services:

              total        used        free      shared  buff/cache   available
Mem:      528338236     4535696   523449388       62292      353152   521400696
Swap:      25161724           0    25161724

I am attempting to run madmax in parallel, one instance per NUMA node.

I create tmpfs partitions:

mount -t tmpfs -o size=245G,mpol=bind:0,defaults tmpfs /tmpfs0
mount -t tmpfs -o size=245G,mpol=bind:1,defaults tmpfs /tmpfs1
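For anyone reproducing this, the NUMA policy on the mounts can be sanity-checked before plotting. This is just a diagnostic sketch (`numastat` comes with the numactl package and may not be installed):

```shell
# Confirm both tmpfs mounts exist and carry the mpol=bind option
grep /tmpfs /proc/mounts || echo "no /tmpfs mounts found"

# Per-node memory breakdown; tmpfs pages are accounted as shmem, so each
# mount's usage should grow only on the node it is bound to
command -v numastat >/dev/null && numastat -m || echo "numastat not installed"
```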

and I begin to plot:

numactl --cpubind=0 --membind=0 /root/chia-blockchain/chia-plotter/build/chia_plot -r 22 -u 8 -n -1 -t /tmpfs0/ -s /nvme/ -d /chia00/ -f foo -p  bar
numactl --cpubind=1 --membind=1 /root/chia-blockchain/chia-plotter/build/chia_plot -r 22 -u 8 -n -1 -t /tmpfs1/ -s /nvme/ -d /chia01/  -f foo -p bar
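To verify that each plotter actually inherits the intended binding, the allowed CPU and memory masks can be read back from procfs. A minimal sketch (it reads `/proc/self` for illustration; substitute the plotter's PID):

```shell
# Cpus_allowed_list / Mems_allowed_list reflect what numactl applied;
# under "numactl --cpubind=0 --membind=0" both should list node 0's ranges
grep -E 'Cpus_allowed_list|Mems_allowed_list' /proc/self/status
```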

At this point I also set swappiness to zero:

sysctl vm.swappiness=0
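(A side note: the `sysctl` call above only lasts until reboot, and its effect can be confirmed by reading the value back from procfs. On kernels since 3.5, a value of 0 is commonly reported to avoid swapping anonymous pages almost entirely, with 1 being the gentlest nonzero setting.)

```shell
# Read back the current swappiness to confirm the sysctl call applied;
# persist it via a file under /etc/sysctl.d/ if it should survive reboots
swappiness=$(cat /proc/sys/vm/swappiness)
echo "vm.swappiness=${swappiness}"
```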

On the first pass, everything seems to go fine, and no swap gets used:

Network Port: 8444 [chia]
Final Directory: /chia37/
Stage Directory: /nvme/
Number of Plots: infinite
Crafting plot 1 out of -1 (2022/02/25 19:40:16)
Process ID: 30160
Number of Threads: 22
Number of Buckets P1:    2^8 (256)
Number of Buckets P3+P4: 2^8 (256)
Pool Public Key:   87360049e3f15e7b3b8d9f7605e5e6abfa44c254e183ffbf7f63a1c3a8a7d9265fedb3e80ad3f102402818a24a18e6f8
Farmer Public Key: 8934b2b0af6fb032944907fb0f38ee706f2a557ce152fb9f7e07a7f7978c6e90cb6681cd90b4390c664513a0dde89ab6
Working Directory:   /tmpfs0/
Working Directory 2: /tmpfs0/
Plot Name: plot-k32-2022-02-25-19-40-194f5119638f2c64f8808aabc4272e27b923a7bca63ecb8703fd24a96330c28d
[P1] Table 1 took 16.6896 sec
[P1] Table 2 took 135.187 sec, found 4294906657 matches
[P1] Table 3 took 147.339 sec, found 4294884563 matches
[P1] Table 4 took 166.244 sec, found 4294689520 matches
[P1] Table 5 took 164.188 sec, found 4294408554 matches
[P1] Table 6 took 159.424 sec, found 4293875369 matches
[P1] Table 7 took 127.028 sec, found 4292745068 matches
Phase 1 took 916.121 sec
[P2] max_table_size = 4294967296
[P2] Table 7 scan took 9.10817 sec
[P2] Table 7 rewrite took 36.413 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 34.2086 sec
[P2] Table 6 rewrite took 48.4351 sec, dropped 581421025 entries (13.5407 %)
[P2] Table 5 scan took 32.2651 sec
[P2] Table 5 rewrite took 46.5942 sec, dropped 762109002 entries (17.7465 %)
[P2] Table 4 scan took 31.7279 sec
[P2] Table 4 rewrite took 45.8766 sec, dropped 828989486 entries (19.3027 %)
[P2] Table 3 scan took 31.5959 sec
[P2] Table 3 rewrite took 45.481 sec, dropped 855164322 entries (19.9112 %)
[P2] Table 2 scan took 31.5307 sec
[P2] Table 2 rewrite took 45.4374 sec, dropped 865601145 entries (20.1541 %)
Phase 2 took 455.778 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 41.9283 sec, wrote 3429305512 right entries
[P3-2] Table 2 took 32.6918 sec, wrote 3429305512 left entries, 3429305512 final
[P3-1] Table 3 took 53.1187 sec, wrote 3439720241 right entries
[P3-2] Table 3 took 32.6833 sec, wrote 3439720241 left entries, 3439720241 final
[P3-1] Table 4 took 53.6664 sec, wrote 3465700034 right entries
[P3-2] Table 4 took 32.5607 sec, wrote 3465700034 left entries, 3465700034 final
[P3-1] Table 5 took 54.4015 sec, wrote 3532299552 right entries
[P3-2] Table 5 took 33.341 sec, wrote 3532299552 left entries, 3532299552 final
[P3-1] Table 6 took 56.2956 sec, wrote 3712454344 right entries
[P3-2] Table 6 took 35.0807 sec, wrote 3712454344 left entries, 3712454344 final
[P3-1] Table 7 took 43.3253 sec, wrote 4292745068 right entries
[P3-2] Table 7 took 40.3063 sec, wrote 4292745068 left entries, 4292745068 final
Phase 3 took 513.911 sec, wrote 21872224751 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 57.7683 sec, final plot size is 108805631481 bytes
Total plot creation time was 1943.65 sec (32.3941 min)
Started copy to /chia37/plot-k32-2022-02-25-19-40-194f5119638f2c64f8808aabc4272e27b923a7bca63ecb8703fd24a96330c28d.plot

The parallel job is similarly unremarkable. However, I start to notice memory in use that is unaccounted for. After the first pass, not all of the RAM is freed: some remains buffered, and is eventually released as the second pass starts to use more RAM. Very quickly, though, it starts swapping.

Crafting plot 2 out of -1 (2022/02/25 20:12:39)
Process ID: 30160
Number of Threads: 22
Number of Buckets P1:    2^8 (256)
Number of Buckets P3+P4: 2^8 (256)
Pool Public Key:   87360049e3f15e7b3b8d9f7605e5e6abfa44c254e183ffbf7f63a1c3a8a7d9265fedb3e80ad3f102402818a24a18e6f8
Farmer Public Key: 8934b2b0af6fb032944907fb0f38ee706f2a557ce152fb9f7e07a7f7978c6e90cb6681cd90b4390c664513a0dde89ab6
Working Directory:   /tmpfs0/
Working Directory 2: /tmpfs0/
Plot Name: plot-k32-2022-02-25-20-12-2fca26a0acbcef6ed2cb766af119b1607e39dd87e4e6bd9e13f2d60d09b9eef4
[P1] Table 1 took 17.2749 sec
[P1] Table 2 took 141.758 sec, found 4294798949 matches
[P1] Table 3 took 154.108 sec, found 4294613298 matches
[P1] Table 4 took 177.264 sec, found 4294121045 matches
[P1] Table 5 took 176.233 sec, found 4293306541 matches
Copy to /chia37/plot-k32-2022-02-25-19-40-194f5119638f2c64f8808aabc4272e27b923a7bca63ecb8703fd24a96330c28d.plot finished, took 674.963 sec, 153.735 MB/s avg.
[P1] Table 6 took 164.334 sec, found 4291669308 matches
^C
****************************************************************************************
**  The crafting of plots will stop after the creation and copy of the current plot.  **
**         !! If you want to force quit now, press Ctrl-C twice in series !!          **
****************************************************************************************
[P1] Table 7 took 129.035 sec, found 4288198668 matches
Phase 1 took 960.026 sec
[P2] max_table_size = 4294967296
[P2] Table 7 scan took 9.36747 sec
[P2] Table 7 rewrite took 80.4473 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 34.3109 sec
[P2] Table 6 rewrite took 49.7584 sec, dropped 581724842 entries (13.5547 %)
[P2] Table 5 scan took 49.8793 sec
[P2] Table 5 rewrite took 48.0592 sec, dropped 762463788 entries (17.7594 %)
[P2] Table 4 scan took 35.6396 sec
[P2] Table 4 rewrite took 134.517 sec, dropped 829251400 entries (19.3113 %)
[P2] Table 3 scan took 122.388 sec
[P2] Table 3 rewrite took 164.452 sec, dropped 855382896 entries (19.9176 %)
[P2] Table 2 scan took 47.5694 sec
[P2] Table 2 rewrite took 164.525 sec, dropped 865731775 entries (20.1577 %)
Phase 2 took 959.806 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 52.9561 sec, wrote 3429067174 right entries
[P3-2] Table 2 took 32.7323 sec, wrote 3429067174 left entries, 3429067174 final
[P3-1] Table 3 took 53.3638 sec, wrote 3439230402 right entries
[P3-2] Table 3 took 32.6586 sec, wrote 3439230402 left entries, 3439230402 final
[P3-1] Table 4 took 53.8672 sec, wrote 3464869645 right entries
[P3-2] Table 4 took 33.1071 sec, wrote 3464869645 left entries, 3464869645 final
[P3-1] Table 5 took 54.5899 sec, wrote 3530842753 right entries
[P3-2] Table 5 took 33.5783 sec, wrote 3530842753 left entries, 3530842753 final
[P3-1] Table 6 took 56.5318 sec, wrote 3709944466 right entries
[P3-2] Table 6 took 35.2192 sec, wrote 3709944466 left entries, 3709944466 final
[P3-1] Table 7 took 44.1606 sec, wrote 4288198668 right entries
[P3-2] Table 7 took 40.3744 sec, wrote 4288198668 left entries, 4288198668 final
Phase 3 took 528.386 sec, wrote 21862153108 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 57.5257 sec, final plot size is 108744530557 bytes
Total plot creation time was 2505.79 sec (41.7632 min)
Started copy to /chia37/plot-k32-2022-02-25-20-12-2fca26a0acbcef6ed2cb766af119b1607e39dd87e4e6bd9e13f2d60d09b9eef4.plot

As the completed plots are copied to disk and the tmpfs volumes are emptied, RAM usage does not decrease. Instead, after one of the copy jobs finishes, usage hits a low of about 100 GB and then actually increases, until about half my RAM is in use by the time the second transfer completes:

              total        used        free      shared  buff/cache   available
Mem:         515955      238948      276687          17         319      274698

Eventually, if I allow it to keep going, one of the jobs is killed for lack of memory once the swap is exhausted. Whether the process is killed or terminates normally, a large part of my RAM is no longer available, as shown above.

The only way I have found to free it is to reboot.
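Since tmpfs pages are accounted as shmem, one way to check whether the "lost" RAM is still tied up in tmpfs (as opposed to leaked anonymous memory) is to compare `Shmem` in /proc/meminfo against what the mounts report. This is just a diagnostic sketch, not a fix:

```shell
# If Shmem stays high after the plot files have been copied off and
# deleted, something is still holding references to the tmpfs pages
awk '/^(MemTotal|MemAvailable|Shmem):/ {printf "%-14s %12d kB\n", $1, $2}' /proc/meminfo

# Space actually consumed inside each plotting tmpfs (paths from above)
df -k /tmpfs0 /tmpfs1 2>/dev/null || true

# Unmounting a tmpfs should release its pages without a reboot,
# assuming no process still holds its files open:
# umount /tmpfs0 /tmpfs1
```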

I use ZFS on the destination disk, but I limit the ARC cache to 4 GB. Also, I have 24 GB of swap. Once again, none is used during the first pass, but it is progressively exhausted until one of the processes is killed; the other then continues normally.

In addition, the overhead incurred by swapping actually makes parallel jobs slower.

[screenshot]

UPDATE:

I tried running with no swap, and it nearly worked.

I probably could have freed up another 3 GB by using a lighter distro, but I estimate there was still ~7.6 GB of tmpfs data left to be written. I calculated that from the 8.7 GB shown free on tmpfs1 and the past observation that it bottoms out at ~1.1 GB free: 8.7 - 1.1 = 7.6.

[screenshot]

Shortly thereafter, the second job crashed at the same point. Both file transfers had completed, so something else is causing the additional memory usage.

Obviously, "MemAvailable", which I got from /proc/meminfo, was a bit optimistic. That pane and the one with /tmpfs/ space to the left are updated every second, and only a few seconds elapsed before I was able to get the screengrab.

[ 4272.941648] phase2/add/3 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 4272.941656] CPU: 29 PID: 2708 Comm: phase2/add/3 Tainted: P           O      5.4.166-1-pve #1
[ 4272.941657] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.4 11/20/2019
[ 4272.941659] Call Trace:
[ 4272.941683]  dump_stack+0x6d/0x8b
[ 4272.941693]  dump_header+0x4f/0x1e1
[ 4272.941696]  oom_kill_process.cold.33+0xb/0x10
[ 4272.941705]  out_of_memory+0x1bf/0x4e0
[ 4272.941709]  __alloc_pages_slowpath+0xd40/0xe30
[ 4272.941711]  __alloc_pages_nodemask+0x2df/0x330
[ 4272.941716]  alloc_pages_vma+0x7a/0x1f0
[ 4272.941721]  shmem_alloc_page+0x4b/0x90
[ 4272.941730]  ? __switch_to_asm+0x34/0x70
[ 4272.941732]  shmem_alloc_and_acct_page+0x7c/0x1c0
[ 4272.941735]  shmem_getpage_gfp+0x178/0x8e0
[ 4272.941738]  shmem_write_begin+0x36/0x60
[ 4272.941740]  generic_perform_write+0xf2/0x1b0
[ 4272.941747]  ? file_update_time+0xed/0x130
[ 4272.941750]  __generic_file_write_iter+0x101/0x1f0
[ 4272.941752]  generic_file_write_iter+0x106/0x190
[ 4272.941756]  new_sync_write+0x125/0x1c0
[ 4272.941757]  __vfs_write+0x29/0x40
[ 4272.941759]  vfs_write+0xab/0x1b0
[ 4272.941762]  ksys_write+0x61/0xe0
[ 4272.941764]  __x64_sys_write+0x1a/0x20
[ 4272.941770]  do_syscall_64+0x57/0x190
[ 4272.941772]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 4272.941776] RIP: 0033:0x7efe5b70c537
[ 4272.941782] Code: 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 2b c2 01 00 4c 89 e2 48 89 ee 89 df 41 89 c0 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 64 c2 01 00 48
[ 4272.941783] RSP: 002b:00007efe5ac64a80 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 4272.941786] RAX: ffffffffffffffda RBX: 0000000000000062 RCX: 00007efe5b70c537
[ 4272.941787] RDX: 0000000000001000 RSI: 00007efdbca4c800 RDI: 0000000000000062
[ 4272.941788] RBP: 00007efdbca4c800 R08: 0000000000000000 R09: 00007efe403b0190
[ 4272.941788] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000001000
[ 4272.941789] R13: 0000000000001000 R14: 00007efe5b7d9760 R15: 0000000000001000
[ 4272.941791] Mem-Info:
[ 4272.941807] active_anon:14242188 inactive_anon:99877257 isolated_anon:0
                active_file:1299 inactive_file:1761 isolated_file:0
                unevictable:6633 dirty:0 writeback:0 unstable:0
                slab_reclaimable:313616 slab_unreclaimable:276405
                mapped:6025 shmem:112459688 pagetables:7309 bounce:0
                free:12081162 free_pcp:1288 free_cma:0
[ 4272.941812] Node 0 active_anon:28700592kB inactive_anon:222951904kB active_file:576kB inactive_file:304kB unevictable:26532kB isolated(anon):0kB isolated(file):0kB mapped:16772kB dirty:0kB writeback:0kB shmem:248119140kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 4272.941814] Node 0 DMA free:15884kB min:0kB low:12kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15968kB managed:15884kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 4272.941817] lowmem_reserve[]: 0 1886 257880 257880 257880
[ 4272.941820] Node 0 DMA32 free:1024224kB min:328kB low:2256kB high:4184kB active_anon:148588kB inactive_anon:608496kB active_file:0kB inactive_file:20kB unevictable:0kB writepending:0kB present:2044880kB managed:1979344kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:2912kB local_pcp:144kB free_cma:0kB
[ 4272.941823] lowmem_reserve[]: 0 0 255994 255994 255994
[ 4272.941826] Node 0 Normal free:44292kB min:44712kB low:306848kB high:568984kB active_anon:28552004kB inactive_anon:222343812kB active_file:576kB inactive_file:284kB unevictable:26532kB writepending:0kB present:266338304kB managed:262137928kB mlocked:26532kB kernel_stack:83432kB pagetables:15536kB bounce:0kB free_pcp:2240kB local_pcp:248kB free_cma:0kB
[ 4272.941830] lowmem_reserve[]: 0 0 0 0 0
[ 4272.941832] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
[ 4272.941840] Node 0 DMA32: 16*4kB (U) 25*8kB (UME) 29*16kB (UE) 35*32kB (UME) 25*64kB (UE) 325*128kB (UE) 245*256kB (UME) 166*512kB (UE) 52*1024kB (U) 26*2048kB (UME) 177*4096kB (UM) = 1024248kB
[ 4272.941848] Node 0 Normal: 1*4kB (M) 67*8kB (U) 581*16kB (UE) 1044*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 43244kB
[ 4272.941856] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 4272.941857] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 4272.941858] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 4272.941859] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 4272.941860] 112465497 total pagecache pages
[ 4272.941863] 0 pages in swap cache
[ 4272.941864] Swap cache stats: add 0, delete 0, find 0/0
[ 4272.941864] Free swap  = 0kB
[ 4272.941865] Total swap = 0kB
[ 4272.941866] 134208652 pages RAM
[ 4272.941866] 0 pages HighMem/MovableOnly
[ 4272.941867] 2124093 pages reserved
[ 4272.941867] 0 pages cma reserved
[ 4272.941868] 0 pages hwpoisoned
[ 4272.941869] Tasks state (memory values in pages):
[ 4272.941869] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 4272.941955] [   1123]     0  1123     7439     2498   110592        0             0 systemd-journal
[ 4272.941959] [   1133]     0  1133     1921       53    57344        0             0 blkmapd
[ 4272.941964] [   1144]     0  1144     5809      814    65536        0         -1000 systemd-udevd
[ 4272.941971] [   1890]     0  1890    20263     5930    98304        0         -1000 dmeventd
[ 4272.943185] [  27324]   100 27324    23270     1144    86016        0             0 systemd-timesyn
[ 4272.943189] [  27327]   106 27327     1705      737    53248        0             0 rpcbind
[ 4272.943193] [  27329]     0 27329     2269       42    49152        0             0 rpc.idmapd
[ 4272.943197] [  27342]     0 27342    21333      400    53248        0             0 lxcfs
[ 4272.943200] [  27343]     0 27343    41547      799    77824        0             0 zed
[ 4272.943203] [  27344]     0 27344    56455      796    81920        0             0 rsyslogd
[ 4272.943207] [  27346]     0 27346      535      167    40960        0         -1000 watchdog-mux
[ 4272.943210] [  27348]     0 27348     1022       34    45056        0             0 qmeventd
[ 4272.943213] [  27360]     0 27360     4847      593    73728        0             0 systemd-logind
[ 4272.943215] [  27361]   104 27361     2242      806    53248        0          -900 dbus-daemon
[ 4272.943218] [  27367]     0 27367     1681      579    45056        0             0 ksmtuned
[ 4272.943223] [  27866]     0 27866     1823      180    49152        0             0 lxc-monitord
[ 4272.943225] [  27890]     0 27890      568      184    45056        0             0 none
[ 4272.943230] [  27894]     0 27894     3962      783    73728        0         -1000 sshd
[ 4272.943233] [  27897]     0 27897     7939     6287   102400        0             0 rpc.mountd
[ 4272.943235] [  27898]     0 27898     1722       62    49152        0             0 iscsid
[ 4272.943237] [  27899]     0 27899     1945     1427    57344        0           -17 iscsid
[ 4272.943241] [  28147]     0 28147    10867      522    77824        0             0 master
[ 4272.943243] [  28154]   107 28154    10957      398    77824        0             0 pickup
[ 4272.943246] [  28155]   107 28155    10970      583    81920        0             0 qmgr
[ 4272.943251] [  28324]     0 28324     1402      345    49152        0             0 agetty
[ 4272.943254] [  28358]     0 28358   111508      608   159744        0             0 rrdcached
[ 4272.943257] [  28384]     0 28384     2125      530    57344        0             0 cron
[ 4272.943262] [  28464]    33 28464    17661    12424   167936        0             0 spiceproxy
[ 4272.943264] [  28465]    33 28465    17694    12914   167936        0             0 spiceproxy work
[ 4272.943268] [  28510]     0 28510     4157     1290    73728        0             0 sshd
[ 4272.943272] [  28544]     0 28544     6036     1874    77824        0             0 systemd
[ 4272.943275] [  28547]     0 28547     6215     1059    86016        0             0 (sd-pam)
[ 4272.943278] [  28562]     0 28562     1978      786    53248        0             0 bash
[ 4272.943285] [  29650]     0 29650     2153      375    57344        0             0 screen
[ 4272.943287] [  29653]     0 29653     2405      814    57344        0             0 screen
[ 4272.943290] [  29659]     0 29659     1747      547    57344        0             0 bash
[ 4272.943295] [  29745]     0 29745     2886     1804    69632        0             0 htop
[ 4272.943298] [  29746]     0 29746     1747      567    49152        0             0 bash
[ 4272.943300] [  29750]     0 29750  1872912   857590 13975552        0             0 chia_plot
[ 4272.943303] [  29786]     0 29786     1747      535    57344        0             0 bash
[ 4272.943332] [  29833]     0 29833     1747      771    57344        0             0 bash
[ 4272.943334] [  30007]     0 30007     1781      785    49152        0             0 bash
[ 4272.943340] [  31007]     0 31007     1747      736    57344        0             0 bash
[ 4272.943352] [   3129]     0  3129     1314      172    45056        0             0 sleep
[ 4272.943355] [   3132]     0  3132     1314      188    53248        0             0 sleep
[ 4272.943359] [   3135]     0  3135     1314      172    49152        0             0 sleep
[ 4272.943361] [   3141]     0  3141     1314      188    49152        0             0 sleep
[ 4272.943365] oom-kill:constraint=CONSTRAINT_MEMORY_POLICY,nodemask=0,cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=chia_plot,pid=29750,uid=0
[ 4272.943481] Out of memory: Killed process 29750 (chia_plot) total-vm:7491648kB, anon-rss:3427324kB, file-rss:3036kB, shmem-rss:0kB, UID:0 pgtables:13648kB oom_score_adj:0
[ 4273.099859] oom_reaper: reaped process 29750 (chia_plot), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

I don't have a problem with some swapping around P2/T2; that also happens when I run a single job with 256 GB of RAM, and it only adds a minute or so to plot times. Unfortunately, if I let this run, it uses all 24 GB of my swap, and I'm sure it would grow much larger still.

tl;dr
When plotting parallel jobs in tmpfs with 512 GB of RAM, each pass needs an increasing amount of RAM, until one job crashes.
