OpenCL: Default to table based AES, now backed in local memory #5613
Conversation
ROR by 8, 16 or 24 bits can be done with the byte_perm instruction. I saw no gain, so it's left disabled, but it might be nice to have it sitting there for reference / future testing.
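A rough sketch of the idea, for reference (not code from this PR): rotations by a whole number of bytes are pure byte permutations, so on NVIDIA they can map to a single prmt.b32 instruction, which CUDA exposes as __byte_perm(x, x, sel). The helper below is plain OpenCL C, the macro name is made up for illustration, and the selector constants are the ones these rotations would correspond to.

/* OpenCL's rotate() builtin is a left-rotate, so ror32(x, n) == rotate(x, 32 - n) */
#define ror32(x, n) rotate((uint)(x), 32u - (n))

/* Byte-permutation selectors for __byte_perm(x, x, sel) / prmt.b32:
 *   ror32(x,  8)  ->  sel 0x0321
 *   ror32(x, 16)  ->  sel 0x1032
 *   ror32(x, 24)  ->  sel 0x2103
 */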
Force-pushed from 6154359 to dffd323.
Up to 6.6x boost seen on super's AMD, and a mere 3x on the nvidias. I was hoping to achieve 10x but it turns out the bitsliced AES we used was too good as baseline for that to happen :) Anyway we seem to be on par with hashcat now.
Cool stuff, but impossible to review for real without diving into it.
As a minor suggestion, maybe the moving of tables to a separate file can be a separate commit?
Also use same file for bitlocker format, which had another copy of them. The whole commit is effectively a no-op.
This revised version pushes 1.4 Tbps of AES-128 decryption (axcrypt) or 914 Gbps of AES-256 encryption (keepass) on a 4070ti. The bitsliced code we defaulted to before is really good but it's register hungry. It has some merits when two or more blocks are encrypted/decrypted at once (does two in parallel) but still is slower than table based now. We can still opt in to use it. Note: This commit switches all formats to table-based AES without actually enabling the copying to local until next commit where all formats are adapted to use it. This very commit thus makes for a performance regression. See openwall#5594
Enable local memory for table-based AES. Closes openwall#5594 Bitlocker format is not affected as it has its own implementation, but AES performance is insignificant for it anyway.
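For anyone skimming the PR, the gist of the local-memory backing is that each work-group copies the shared tables from __constant into __local once at kernel start, and all lookups then hit local memory. A minimal sketch with illustrative names, showing only one table (the real code in opencl_aes_plain.h covers all the tables behind macros):

typedef struct {
	uint Te0[256];
	/* Te1..Te3, Td0..Td3 and the S-box tables would follow in the real thing */
} aes_local_t;

void aes_tables_to_local(__local aes_local_t *lt, __constant uint *Te0_c)
{
	/* cooperative strided copy: the work-group fills local memory once */
	for (uint i = get_local_id(0); i < 256; i += get_local_size(0))
		lt->Te0[i] = Te0_c[i];

	barrier(CLK_LOCAL_MEM_FENCE);
}

A kernel then declares a __local aes_local_t lt; calls something like the above once, and points its key schedule at it, which is what the akey.lt = &lt; change further down does.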
Force-pushed from dffd323 to 493a6a5.
The important changes are to opencl_aes_plain.h and are surprisingly few due to macros.
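Roughly why the diff stays small (purely illustrative, the actual macro names differ): when every table read already goes through a macro, retargeting the backing store is mostly a matter of redefining those macros rather than touching the round code.

#ifdef AES_LOCAL_TABLES
#define TE0(akey, i) ((akey)->lt->Te0[(i)])
#else
#define TE0(akey, i) (Te0_c[(i)])
#endif

/* A round step then reads identically either way, e.g.
 *   t0 = TE0(akey, s0 >> 24) ^ TE1(akey, (s1 >> 16) & 0xff) ^ ...;
 */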
I fail to see the point of that, but did so now. BTW the bitlocker format was also changed to use that table file but was otherwise not changed - it has its own copy of a more or less identical (afaics) table-based AES, but did not gain anything from using local memory as the AES part of it is insignificant. So I did not commit any such changes to it.
Add dmg-opencl, rar-opencl to the list of problematic formats. Side effect of openwall/john#5613. Document all formats that fail and therefore need to be disabled during testing. Signed-off-by: Claudio André <[email protected]>
@@ -53,10 +53,13 @@ typedef struct {
	uint32_t cracked;
} result;

#define AES_MAXNR 14
FWIW, this addition of AES_MAXNR to opencl_keepass_fmt_plug.c looks unused.
Good catch, that's a remnant from the older state struct.
Thank you for doing it. Looks cleaner to me that way, and makes the actual changes (in other commits) stand out.
@@ -162,7 +163,7 @@ __kernel void sevenzip_aes(__constant sevenzip_salt *salt,
	/* Early rejection if possible (only decrypt last 16 bytes) */
	if (pad > 0 && salt->length >= 32) {
		uint8_t buf[16];
		AES_KEY akey;
		AES_KEY akey; akey.lt = &lt;
Oh BTW I first had this as AES_KEY akey = { .lt = &lt }; everywhere, but some device did not like that. Unfortunately I forget which. Good to know, perhaps I should start a wiki page listing knowledge like that. Another example would be the static vs inline vs static inline that we had to add a workaround for in opencl_misc.h.
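For reference, the two variants in question, as a sketch assuming a kernel-scope __local aes_local_t lt; as in the diff above:

#if 0
	AES_KEY akey = { .lt = &lt };  /* designated initializer: rejected by at least one OpenCL runtime */
#else
	AES_KEY akey;                  /* portable form used instead */
	akey.lt = &lt;
#endif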
Add raw-SHA512-free-opencl to the list of problematic formats. Side effect of openwall/john#5613 and openwall/john#5615. Document all formats that fail and therefore need to be disabled during testing. Signed-off-by: Claudio André <[email protected]>
Hi, can you use the inverse sbox to speed up the inverse-key MixColumns? I got a 40% speedup doing it in hashcat's code on AES-256.
Are we not doing that already?
I don't think so? Especially if your speed is only on par with hashcat.
I'm trying to context switch into this. My current guess is you mean we don't have the four T-tables for decryption, only a plain reverse sbox table - right? Is your code in hashcat already or are you planning a PR? Feel free to add a PR here as well!
Above is roughly what I'm doing; I have left out some unrelated things like hashcat's decrypt function. This was mostly ChatGPT's idea so take it with a grain of salt; regardless, it got me a 40% increase in speed on a certain hashcat mode which also has additional MD5 steps (and it works perfectly). The only thing that has actually changed is the invert-key step and the tables. And no, I haven't PR'd it there; from what I understand they don't care much about improving AES.
You are very welcome to make a PR for us, for fun and fame 😉 If not, I will look into it! And anyway thanks a lot for this suggestion!
I do not care for fame on things that were not my idea. I just randomly came across this issue and suggested it as it helped me. I won't be PRing it as I don't have the time to figure out your code as well. GL.
These are used for set_decrypt_key() only, so would mainly affect formats that decrypt a small amount per candidate. While a decent boost was reported for hashcat, we only got a regression (as tested on nvidia) so this is left disabled for now. Closes openwall#5800, see openwall#5613 (comment)
This would mostly affect formats that decrypt a small amount per key (several formats only decrypt one or two blocks). This should theoretically boost AES_set_decrypt_key() by halving the number of table lookups but the results were disappointing so it's left disabled for now. Closes openwall#5800, see openwall#5613 (comment)
This would mostly affect formats that decrypt a small amount per key (several formats only decrypt one or two blocks). This should theoretically boost AES_set_decrypt_key() by halving the number of table lookups but the results on nvidia are disappointing as of now. Others get more or less boost. Closes openwall#5800, see openwall#5613 (comment)
This boosts AES_set_decrypt_key() by halving the number of table lookups. It mostly affects formats that decrypt a small amount per key (several formats only decrypt one or two blocks). Closes openwall#5800, see openwall#5613 (comment)
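To make "halving the number of table lookups" concrete: the classic OpenSSL-style AES_set_decrypt_key() applies InvMixColumns to each round-key word by composing the forward S-box (the low byte of Te1) with the Td tables, which already fold in the inverse S-box, costing eight lookups per word. A table that holds the InvMixColumns multiplication directly, indexed by the raw key byte, needs only four. A sketch of both forms follows; Dk0..Dk3 are hypothetical names (such a table could be built as Dk0[x] == Td0[SBOX[x]]), not identifiers from the tree.

/* classic composed form: 8 lookups per round-key word */
rk[i] = Td0[Te1[(rk[i] >> 24)       ] & 0xff] ^
        Td1[Te1[(rk[i] >> 16) & 0xff] & 0xff] ^
        Td2[Te1[(rk[i] >>  8) & 0xff] & 0xff] ^
        Td3[Te1[(rk[i]      ) & 0xff] & 0xff];

/* halved form: 4 lookups per word via a dedicated InvMixColumns table */
uint w = rk[i];
rk[i] = Dk0[(w >> 24)       ] ^
        Dk1[(w >> 16) & 0xff] ^
        Dk2[(w >>  8) & 0xff] ^
        Dk3[(w      ) & 0xff];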
This revised version pushes 1.4 Tbps of AES-128 decryption (axcrypt) or 914 Gbps of AES-256 encryption (keepass) on a 4070ti.
The bitsliced code we defaulted to before is really good but it's register hungry. It has some merits when two or more blocks are encrypted/decrypted at once (does two in parallel) but still is slower than table based now. We can still opt in to use it.
Closes #5594