Description
I'm using madmax 1.1.8-ecec17d.
I have a Supermicro X9DRi motherboard with 512 GB of DDR3.
Prior to starting plotting, less than 1% of the system's memory is in use, even before I stop superfluous services:
               total        used        free      shared  buff/cache   available
Mem:       528338236     4535696   523449388       62292      353152   521400696
Swap:       25161724           0    25161724
I am attempting to run madmax in parallel, per numa node.
I create tmpfs partitions:
mount -t tmpfs -o size=245G,mpol=bind:0,defaults tmpfs /tmpfs0
mount -t tmpfs -o size=245G,mpol=bind:1,defaults tmpfs /tmpfs1
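As an aside, the combined ceiling of these two mounts comes close to the machine's full RAM. A quick headroom check, using the MemTotal figure from the `free` output above (tmpfs pages are allocated lazily, but at peak both mounts together can claim nearly everything):

```python
# Rough headroom check: two 245 GiB tmpfs ceilings vs. total RAM.
GIB = 2**30

mem_total = 528338236 * 1024      # MemTotal from `free` above, in bytes
tmpfs_cap = 2 * 245 * GIB         # combined size= limits of /tmpfs0 and /tmpfs1

headroom = (mem_total - tmpfs_cap) / GIB
print(f"RAM left if both tmpfs fill completely: {headroom:.1f} GiB")  # ~13.9 GiB
```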
and I begin to plot:
numactl --cpubind=0 --membind=0 /root/chia-blockchain/chia-plotter/build/chia_plot -r 22 -u 8 -n -1 -t /tmpfs0/ -s /nvme/ -d /chia00/ -f foo -p bar
numactl --cpubind=1 --membind=1 /root/chia-blockchain/chia-plotter/build/chia_plot -r 22 -u 8 -n -1 -t /tmpfs1/ -s /nvme/ -d /chia01/ -f foo -p bar
I also set swappiness to zero at this point:
sysctl vm.swappiness=0
On the first pass, everything seems to go fine. No swap gets used:
Network Port: 8444 [chia]
Final Directory: /chia37/
Stage Directory: /nvme/
Number of Plots: infinite
Crafting plot 1 out of -1 (2022/02/25 19:40:16)
Process ID: 30160
Number of Threads: 22
Number of Buckets P1: 2^8 (256)
Number of Buckets P3+P4: 2^8 (256)
Pool Public Key: 87360049e3f15e7b3b8d9f7605e5e6abfa44c254e183ffbf7f63a1c3a8a7d9265fedb3e80ad3f102402818a24a18e6f8
Farmer Public Key: 8934b2b0af6fb032944907fb0f38ee706f2a557ce152fb9f7e07a7f7978c6e90cb6681cd90b4390c664513a0dde89ab6
Working Directory: /tmpfs0/
Working Directory 2: /tmpfs0/
Plot Name: plot-k32-2022-02-25-19-40-194f5119638f2c64f8808aabc4272e27b923a7bca63ecb8703fd24a96330c28d
[P1] Table 1 took 16.6896 sec
[P1] Table 2 took 135.187 sec, found 4294906657 matches
[P1] Table 3 took 147.339 sec, found 4294884563 matches
[P1] Table 4 took 166.244 sec, found 4294689520 matches
[P1] Table 5 took 164.188 sec, found 4294408554 matches
[P1] Table 6 took 159.424 sec, found 4293875369 matches
[P1] Table 7 took 127.028 sec, found 4292745068 matches
Phase 1 took 916.121 sec
[P2] max_table_size = 4294967296
[P2] Table 7 scan took 9.10817 sec
[P2] Table 7 rewrite took 36.413 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 34.2086 sec
[P2] Table 6 rewrite took 48.4351 sec, dropped 581421025 entries (13.5407 %)
[P2] Table 5 scan took 32.2651 sec
[P2] Table 5 rewrite took 46.5942 sec, dropped 762109002 entries (17.7465 %)
[P2] Table 4 scan took 31.7279 sec
[P2] Table 4 rewrite took 45.8766 sec, dropped 828989486 entries (19.3027 %)
[P2] Table 3 scan took 31.5959 sec
[P2] Table 3 rewrite took 45.481 sec, dropped 855164322 entries (19.9112 %)
[P2] Table 2 scan took 31.5307 sec
[P2] Table 2 rewrite took 45.4374 sec, dropped 865601145 entries (20.1541 %)
Phase 2 took 455.778 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 41.9283 sec, wrote 3429305512 right entries
[P3-2] Table 2 took 32.6918 sec, wrote 3429305512 left entries, 3429305512 final
[P3-1] Table 3 took 53.1187 sec, wrote 3439720241 right entries
[P3-2] Table 3 took 32.6833 sec, wrote 3439720241 left entries, 3439720241 final
[P3-1] Table 4 took 53.6664 sec, wrote 3465700034 right entries
[P3-2] Table 4 took 32.5607 sec, wrote 3465700034 left entries, 3465700034 final
[P3-1] Table 5 took 54.4015 sec, wrote 3532299552 right entries
[P3-2] Table 5 took 33.341 sec, wrote 3532299552 left entries, 3532299552 final
[P3-1] Table 6 took 56.2956 sec, wrote 3712454344 right entries
[P3-2] Table 6 took 35.0807 sec, wrote 3712454344 left entries, 3712454344 final
[P3-1] Table 7 took 43.3253 sec, wrote 4292745068 right entries
[P3-2] Table 7 took 40.3063 sec, wrote 4292745068 left entries, 4292745068 final
Phase 3 took 513.911 sec, wrote 21872224751 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 57.7683 sec, final plot size is 108805631481 bytes
Total plot creation time was 1943.65 sec (32.3941 min)
Started copy to /chia37/plot-k32-2022-02-25-19-40-194f5119638f2c64f8808aabc4272e27b923a7bca63ecb8703fd24a96330c28d.plot
The parallel job is similarly unremarkable. However, I start to notice memory in use that is unaccounted for. After the first pass, not all of the RAM appears to be freed: some shows up as buffered and is eventually released as the second pass demands more RAM, but very quickly the system starts swapping.
Crafting plot 2 out of -1 (2022/02/25 20:12:39)
Process ID: 30160
Number of Threads: 22
Number of Buckets P1: 2^8 (256)
Number of Buckets P3+P4: 2^8 (256)
Pool Public Key: 87360049e3f15e7b3b8d9f7605e5e6abfa44c254e183ffbf7f63a1c3a8a7d9265fedb3e80ad3f102402818a24a18e6f8
Farmer Public Key: 8934b2b0af6fb032944907fb0f38ee706f2a557ce152fb9f7e07a7f7978c6e90cb6681cd90b4390c664513a0dde89ab6
Working Directory: /tmpfs0/
Working Directory 2: /tmpfs0/
Plot Name: plot-k32-2022-02-25-20-12-2fca26a0acbcef6ed2cb766af119b1607e39dd87e4e6bd9e13f2d60d09b9eef4
[P1] Table 1 took 17.2749 sec
[P1] Table 2 took 141.758 sec, found 4294798949 matches
[P1] Table 3 took 154.108 sec, found 4294613298 matches
[P1] Table 4 took 177.264 sec, found 4294121045 matches
[P1] Table 5 took 176.233 sec, found 4293306541 matches
Copy to /chia37/plot-k32-2022-02-25-19-40-194f5119638f2c64f8808aabc4272e27b923a7bca63ecb8703fd24a96330c28d.plot finished, took 674.963 sec, 153.735 MB/s avg.
[P1] Table 6 took 164.334 sec, found 4291669308 matches
^C
****************************************************************************************
** The crafting of plots will stop after the creation and copy of the current plot. **
** !! If you want to force quit now, press Ctrl-C twice in series !! **
****************************************************************************************
[P1] Table 7 took 129.035 sec, found 4288198668 matches
Phase 1 took 960.026 sec
[P2] max_table_size = 4294967296
[P2] Table 7 scan took 9.36747 sec
[P2] Table 7 rewrite took 80.4473 sec, dropped 0 entries (0 %)
[P2] Table 6 scan took 34.3109 sec
[P2] Table 6 rewrite took 49.7584 sec, dropped 581724842 entries (13.5547 %)
[P2] Table 5 scan took 49.8793 sec
[P2] Table 5 rewrite took 48.0592 sec, dropped 762463788 entries (17.7594 %)
[P2] Table 4 scan took 35.6396 sec
[P2] Table 4 rewrite took 134.517 sec, dropped 829251400 entries (19.3113 %)
[P2] Table 3 scan took 122.388 sec
[P2] Table 3 rewrite took 164.452 sec, dropped 855382896 entries (19.9176 %)
[P2] Table 2 scan took 47.5694 sec
[P2] Table 2 rewrite took 164.525 sec, dropped 865731775 entries (20.1577 %)
Phase 2 took 959.806 sec
Wrote plot header with 268 bytes
[P3-1] Table 2 took 52.9561 sec, wrote 3429067174 right entries
[P3-2] Table 2 took 32.7323 sec, wrote 3429067174 left entries, 3429067174 final
[P3-1] Table 3 took 53.3638 sec, wrote 3439230402 right entries
[P3-2] Table 3 took 32.6586 sec, wrote 3439230402 left entries, 3439230402 final
[P3-1] Table 4 took 53.8672 sec, wrote 3464869645 right entries
[P3-2] Table 4 took 33.1071 sec, wrote 3464869645 left entries, 3464869645 final
[P3-1] Table 5 took 54.5899 sec, wrote 3530842753 right entries
[P3-2] Table 5 took 33.5783 sec, wrote 3530842753 left entries, 3530842753 final
[P3-1] Table 6 took 56.5318 sec, wrote 3709944466 right entries
[P3-2] Table 6 took 35.2192 sec, wrote 3709944466 left entries, 3709944466 final
[P3-1] Table 7 took 44.1606 sec, wrote 4288198668 right entries
[P3-2] Table 7 took 40.3744 sec, wrote 4288198668 left entries, 4288198668 final
Phase 3 took 528.386 sec, wrote 21862153108 entries to final plot
[P4] Starting to write C1 and C3 tables
[P4] Finished writing C1 and C3 tables
[P4] Writing C2 table
[P4] Finished writing C2 table
Phase 4 took 57.5257 sec, final plot size is 108744530557 bytes
Total plot creation time was 2505.79 sec (41.7632 min)
Started copy to /chia37/plot-k32-2022-02-25-20-12-2fca26a0acbcef6ed2cb766af119b1607e39dd87e4e6bd9e13f2d60d09b9eef4.plot
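As a side note on the logged copy rate: the 153.735 "MB/s" reported for the first plot's copy matches the logged plot size and duration if it is interpreted as MiB/s. A quick check against the numbers in the logs above:

```python
# Sanity-check the reported copy rate for plot 1 against the logged numbers.
plot_bytes = 108805631481   # "final plot size is 108805631481 bytes"
copy_secs = 674.963         # "took 674.963 sec"

rate = plot_bytes / copy_secs / 2**20   # bytes/s -> MiB/s
print(f"{rate:.3f} MiB/s")              # ~153.735, i.e. the logged "MB/s" is MiB/s
```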
As the completed plots are copied to disk and the tmpfs volumes empty out, RAM usage does not decrease. Instead, after one of the copy jobs finishes, RAM usage bottoms out at about 100 GB and then actually increases, until about half my RAM is in use by the time the second transfer completes:
               total        used        free      shared  buff/cache   available
Mem:          515955      238948      276687          17         319      274698
Eventually, if I allow it to keep going, one of the jobs is killed for lack of memory once swap is exhausted. Whether the process is killed or terminates normally, a large portion of my RAM is no longer available, as shown above.
The only way I have found to free it is to reboot.
I use ZFS on the destination disks, but I limit the ARC cache to 4 GB. I also have 24 GB of swap; again, none is used on the first pass, but it is increasingly consumed until it is exhausted, one process is killed, and the other continues normally.
In addition, the overhead incurred by swapping actually makes the parallel jobs slower than sequential ones.
UPDATE:
I tried running with no swap, and it nearly worked.
I could probably have freed up another 3 GB by using a lighter distro, but I estimate there was still ~7.6 GB of tmpfs left to be written. I calculated that from the 8.7 GB shown free on tmpfs1 and the past observation that it bottoms out at ~1.1 GB free, so 8.7 - 1.1 = 7.6:
Shortly thereafter, the second job crashed at the same point. Both file transfers had already completed, so something else is causing the additional memory usage.
Obviously, "MemAvailable", which I got from /proc/meminfo, was a bit optimistic. That pane and the one showing /tmpfs/ space to its left are updated every second; only a few seconds elapsed before I grabbed the screenshot.
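(That optimism is expected: MemAvailable assumes most page cache is reclaimable, but tmpfs/shmem pages can only be reclaimed by pushing them to swap, so with no swap they are effectively pinned. A sketch of the distinction; the Shmem figure is the Node 0 value from the OOM dump below, the other sample values are illustrative:)

```python
# MemAvailable counts most page cache as reclaimable, but shmem/tmpfs pages
# (the Shmem field) can only be reclaimed by pushing them to swap. With no
# swap configured, memory held in tmpfs is effectively pinned.
sample = """\
MemTotal:       528338236 kB
MemFree:          9120000 kB
MemAvailable:     9400000 kB
Shmem:          248119140 kB
"""

fields = {}
for line in sample.splitlines():
    key, rest = line.split(":")
    fields[key] = int(rest.split()[0])  # values are in kB

pinned_gib = fields["Shmem"] * 1024 / 2**30
print(f"tmpfs/shmem pinned without swap: {pinned_gib:.1f} GiB")
```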
[ 4272.941648] phase2/add/3 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 4272.941656] CPU: 29 PID: 2708 Comm: phase2/add/3 Tainted: P O 5.4.166-1-pve #1
[ 4272.941657] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.4 11/20/2019
[ 4272.941659] Call Trace:
[ 4272.941683] dump_stack+0x6d/0x8b
[ 4272.941693] dump_header+0x4f/0x1e1
[ 4272.941696] oom_kill_process.cold.33+0xb/0x10
[ 4272.941705] out_of_memory+0x1bf/0x4e0
[ 4272.941709] __alloc_pages_slowpath+0xd40/0xe30
[ 4272.941711] __alloc_pages_nodemask+0x2df/0x330
[ 4272.941716] alloc_pages_vma+0x7a/0x1f0
[ 4272.941721] shmem_alloc_page+0x4b/0x90
[ 4272.941730] ? __switch_to_asm+0x34/0x70
[ 4272.941732] shmem_alloc_and_acct_page+0x7c/0x1c0
[ 4272.941735] shmem_getpage_gfp+0x178/0x8e0
[ 4272.941738] shmem_write_begin+0x36/0x60
[ 4272.941740] generic_perform_write+0xf2/0x1b0
[ 4272.941747] ? file_update_time+0xed/0x130
[ 4272.941750] __generic_file_write_iter+0x101/0x1f0
[ 4272.941752] generic_file_write_iter+0x106/0x190
[ 4272.941756] new_sync_write+0x125/0x1c0
[ 4272.941757] __vfs_write+0x29/0x40
[ 4272.941759] vfs_write+0xab/0x1b0
[ 4272.941762] ksys_write+0x61/0xe0
[ 4272.941764] __x64_sys_write+0x1a/0x20
[ 4272.941770] do_syscall_64+0x57/0x190
[ 4272.941772] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 4272.941776] RIP: 0033:0x7efe5b70c537
[ 4272.941782] Code: 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 2b c2 01 00 4c 89 e2 48 89 ee 89 df 41 89 c0 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 64 c2 01 00 48
[ 4272.941783] RSP: 002b:00007efe5ac64a80 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 4272.941786] RAX: ffffffffffffffda RBX: 0000000000000062 RCX: 00007efe5b70c537
[ 4272.941787] RDX: 0000000000001000 RSI: 00007efdbca4c800 RDI: 0000000000000062
[ 4272.941788] RBP: 00007efdbca4c800 R08: 0000000000000000 R09: 00007efe403b0190
[ 4272.941788] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000001000
[ 4272.941789] R13: 0000000000001000 R14: 00007efe5b7d9760 R15: 0000000000001000
[ 4272.941791] Mem-Info:
[ 4272.941807] active_anon:14242188 inactive_anon:99877257 isolated_anon:0
active_file:1299 inactive_file:1761 isolated_file:0
unevictable:6633 dirty:0 writeback:0 unstable:0
slab_reclaimable:313616 slab_unreclaimable:276405
mapped:6025 shmem:112459688 pagetables:7309 bounce:0
free:12081162 free_pcp:1288 free_cma:0
[ 4272.941812] Node 0 active_anon:28700592kB inactive_anon:222951904kB active_file:576kB inactive_file:304kB unevictable:26532kB isolated(anon):0kB isolated(file):0kB mapped:16772kB dirty:0kB writeback:0kB shmem:248119140kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 4272.941814] Node 0 DMA free:15884kB min:0kB low:12kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15968kB managed:15884kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 4272.941817] lowmem_reserve[]: 0 1886 257880 257880 257880
[ 4272.941820] Node 0 DMA32 free:1024224kB min:328kB low:2256kB high:4184kB active_anon:148588kB inactive_anon:608496kB active_file:0kB inactive_file:20kB unevictable:0kB writepending:0kB present:2044880kB managed:1979344kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:2912kB local_pcp:144kB free_cma:0kB
[ 4272.941823] lowmem_reserve[]: 0 0 255994 255994 255994
[ 4272.941826] Node 0 Normal free:44292kB min:44712kB low:306848kB high:568984kB active_anon:28552004kB inactive_anon:222343812kB active_file:576kB inactive_file:284kB unevictable:26532kB writepending:0kB present:266338304kB managed:262137928kB mlocked:26532kB kernel_stack:83432kB pagetables:15536kB bounce:0kB free_pcp:2240kB local_pcp:248kB free_cma:0kB
[ 4272.941830] lowmem_reserve[]: 0 0 0 0 0
[ 4272.941832] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
[ 4272.941840] Node 0 DMA32: 16*4kB (U) 25*8kB (UME) 29*16kB (UE) 35*32kB (UME) 25*64kB (UE) 325*128kB (UE) 245*256kB (UME) 166*512kB (UE) 52*1024kB (U) 26*2048kB (UME) 177*4096kB (UM) = 1024248kB
[ 4272.941848] Node 0 Normal: 1*4kB (M) 67*8kB (U) 581*16kB (UE) 1044*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 43244kB
[ 4272.941856] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 4272.941857] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 4272.941858] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 4272.941859] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 4272.941860] 112465497 total pagecache pages
[ 4272.941863] 0 pages in swap cache
[ 4272.941864] Swap cache stats: add 0, delete 0, find 0/0
[ 4272.941864] Free swap = 0kB
[ 4272.941865] Total swap = 0kB
[ 4272.941866] 134208652 pages RAM
[ 4272.941866] 0 pages HighMem/MovableOnly
[ 4272.941867] 2124093 pages reserved
[ 4272.941867] 0 pages cma reserved
[ 4272.941868] 0 pages hwpoisoned
[ 4272.941869] Tasks state (memory values in pages):
[ 4272.941869] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 4272.941955] [ 1123] 0 1123 7439 2498 110592 0 0 systemd-journal
[ 4272.941959] [ 1133] 0 1133 1921 53 57344 0 0 blkmapd
[ 4272.941964] [ 1144] 0 1144 5809 814 65536 0 -1000 systemd-udevd
[ 4272.941971] [ 1890] 0 1890 20263 5930 98304 0 -1000 dmeventd
[ 4272.943185] [ 27324] 100 27324 23270 1144 86016 0 0 systemd-timesyn
[ 4272.943189] [ 27327] 106 27327 1705 737 53248 0 0 rpcbind
[ 4272.943193] [ 27329] 0 27329 2269 42 49152 0 0 rpc.idmapd
[ 4272.943197] [ 27342] 0 27342 21333 400 53248 0 0 lxcfs
[ 4272.943200] [ 27343] 0 27343 41547 799 77824 0 0 zed
[ 4272.943203] [ 27344] 0 27344 56455 796 81920 0 0 rsyslogd
[ 4272.943207] [ 27346] 0 27346 535 167 40960 0 -1000 watchdog-mux
[ 4272.943210] [ 27348] 0 27348 1022 34 45056 0 0 qmeventd
[ 4272.943213] [ 27360] 0 27360 4847 593 73728 0 0 systemd-logind
[ 4272.943215] [ 27361] 104 27361 2242 806 53248 0 -900 dbus-daemon
[ 4272.943218] [ 27367] 0 27367 1681 579 45056 0 0 ksmtuned
[ 4272.943223] [ 27866] 0 27866 1823 180 49152 0 0 lxc-monitord
[ 4272.943225] [ 27890] 0 27890 568 184 45056 0 0 none
[ 4272.943230] [ 27894] 0 27894 3962 783 73728 0 -1000 sshd
[ 4272.943233] [ 27897] 0 27897 7939 6287 102400 0 0 rpc.mountd
[ 4272.943235] [ 27898] 0 27898 1722 62 49152 0 0 iscsid
[ 4272.943237] [ 27899] 0 27899 1945 1427 57344 0 -17 iscsid
[ 4272.943241] [ 28147] 0 28147 10867 522 77824 0 0 master
[ 4272.943243] [ 28154] 107 28154 10957 398 77824 0 0 pickup
[ 4272.943246] [ 28155] 107 28155 10970 583 81920 0 0 qmgr
[ 4272.943251] [ 28324] 0 28324 1402 345 49152 0 0 agetty
[ 4272.943254] [ 28358] 0 28358 111508 608 159744 0 0 rrdcached
[ 4272.943257] [ 28384] 0 28384 2125 530 57344 0 0 cron
[ 4272.943262] [ 28464] 33 28464 17661 12424 167936 0 0 spiceproxy
[ 4272.943264] [ 28465] 33 28465 17694 12914 167936 0 0 spiceproxy work
[ 4272.943268] [ 28510] 0 28510 4157 1290 73728 0 0 sshd
[ 4272.943272] [ 28544] 0 28544 6036 1874 77824 0 0 systemd
[ 4272.943275] [ 28547] 0 28547 6215 1059 86016 0 0 (sd-pam)
[ 4272.943278] [ 28562] 0 28562 1978 786 53248 0 0 bash
[ 4272.943285] [ 29650] 0 29650 2153 375 57344 0 0 screen
[ 4272.943287] [ 29653] 0 29653 2405 814 57344 0 0 screen
[ 4272.943290] [ 29659] 0 29659 1747 547 57344 0 0 bash
[ 4272.943295] [ 29745] 0 29745 2886 1804 69632 0 0 htop
[ 4272.943298] [ 29746] 0 29746 1747 567 49152 0 0 bash
[ 4272.943300] [ 29750] 0 29750 1872912 857590 13975552 0 0 chia_plot
[ 4272.943303] [ 29786] 0 29786 1747 535 57344 0 0 bash
[ 4272.943332] [ 29833] 0 29833 1747 771 57344 0 0 bash
[ 4272.943334] [ 30007] 0 30007 1781 785 49152 0 0 bash
[ 4272.943340] [ 31007] 0 31007 1747 736 57344 0 0 bash
[ 4272.943352] [ 3129] 0 3129 1314 172 45056 0 0 sleep
[ 4272.943355] [ 3132] 0 3132 1314 188 53248 0 0 sleep
[ 4272.943359] [ 3135] 0 3135 1314 172 49152 0 0 sleep
[ 4272.943361] [ 3141] 0 3141 1314 188 49152 0 0 sleep
[ 4272.943365] oom-kill:constraint=CONSTRAINT_MEMORY_POLICY,nodemask=0,cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=chia_plot,pid=29750,uid=0
[ 4272.943481] Out of memory: Killed process 29750 (chia_plot) total-vm:7491648kB, anon-rss:3427324kB, file-rss:3036kB, shmem-rss:0kB, UID:0 pgtables:13648kB oom_score_adj:0
[ 4273.099859] oom_reaper: reaped process 29750 (chia_plot), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
I don't have a problem with it swapping a little around P2/T2. That happens even when I run a single job with 256 GB of RAM, and it only adds a minute or so to plot times. Unfortunately, if I let this setup keep running, it will use all 24 GB of my swap, and I'm sure the usage would keep growing well beyond that.
tl;dr
When plotting parallel jobs in tmpfs with 512 GB of RAM, each pass needs an increasing amount of RAM, until one job is killed by the OOM killer.