-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Un Auditted #2
Comments
There is an Audit Branch now I'm going to be placing platform specific asm dumps and the build scripts I used to generate them there |
Initial Review: The compare equality family: OKAY
Have no branches. They appear to be constructed out of nothing but The optional select family: OKAY
These have been compiled down to 4 instructions total (only register sizing changes between them). They all take the form of: test cl, cl #Is Arg0 true?
cmove rdx, r8 #CONDITIONALLY OVER WRITE Arg2
mov rax, rdx #Put Arg2 in return register
ret
So this boils down too a branch + conditional move. According to Agner Fog CMOV is a fixed latency instruction, albeit non-free (1-6 cycles depending on vendor/generation). The issue still stands this is a branch, and it can be miss-predicted. This is a ~20-40 tick delta for a prediction. Not necessarily a true or false. I think this is OKAY. I don't think and external hacker across ethernet can know my branch cache status. The compare buffer family: BROKE AS A JOKE
Well this sucks! sub rsp, 40
mov r10, rdx
cmp r10, r9 #length check
jnz ?_003 #Fail Length Check Branch
mov r9b, -1
test r10, r10
jz ?_002 #Dump optmization for array len == 1
xor edx, edx
?_001: cmp rdx, r10 #start loop, bounds check
jnc ?_005 #PANIC on bad memory access
movzx eax, byte [r8+rdx]
xor al, byte [rcx+rdx]
inc rdx
or r9b, al
cmp rdx, r10 #Are we at the end of the loop?
jc ?_001 #Loop again
not r9b
?_002: mov ecx, r9d #This _should_ be a call to ct_u8_eq. B
shr cl, 4 #But it is inlined
and cl, r9b
mov eax, ecx
shr al, 2
and al, cl
mov ecx, eax
shr cl, 1
test cl, al #ct_u8_eq is being inlined here
setne al #but ct_u8_eq DOESNT CALL TEST
jmp ?_004 #jump past zeroing eax to return TRUE
?_003: xor eax, eax
?_004: add rsp, 40
ret
?_005:
#Rust Panic
#Unwind the Stack I think changing the I am very concerned about that last branch. It really isn't do anything productive. The I'm assuming single list optimization seems stupid, but I really don't know how to combat it. Conditional Copying: BROKE AS A JOKE
The functions have 2 paths. Both will increment across the whole buffer. But one does A LOT more work. So it is pretty obvious these are not consistent time. ct_copy_u8:
sub rsp, 40
mov rax, rdx
cmp r8, qword [rsp+50H]
jnz ?_030 #If Array Len's aren't equal panic
test r8, r8
jz ?_028 #If array len is 0 short circuit to TRUE
xor edx, edx
test cl, cl
jz ?_027 #Jump to a loop that does nothing
?_026: cmp rdx, r8 #This loop does stuff
jnc ?_029
movzx ecx, byte [r9+rdx]
mov byte [rax+rdx], cl
inc rdx
cmp rdx, r8 #Determine to keep looping
jc ?_026 #LOOP
jmp ?_028 #RETURN
?_027: cmp rdx, r8 #This loop does N O T H I N G
jnc ?_029 #Still has a santy check b/c LMAO
inc rdx
cmp rdx, r8
jc ?_027 #Should we keep looping?
?_028: add rsp, 40
ret
?_029: #Rust Panic
?_030: #Rust panic Comments questions concerns? Am I not understanding the assembly ? I'm guess and reading docs at the same time so my confidence is a bit low. |
So this totally fixed everything ct_u8_slice_eq:; Function begin
sub rsp, 40
mov r10, rdx
cmp r10, r9
jnz ?_002 #these jump are based on
test r10, r10 #non equal lens
jz ?_003 #and zero len
xor r9d, r9d #so their A-OKAY
xor edx, edx
?_001: cmp rdx, r10
jnc ?_006 #sanity check
mov al, byte [r8+rdx]
xor al, byte [rcx+rdx]
inc rdx
or al, r9b
cmp rdx, r10
mov r9b, al
jc ?_001 #repeat loop
jmp ?_004 #exit loop
?_002: xor eax, eax
jmp ?_005 #weird non-equal list stuff
?_003: xor eax, eax
?_004: movzx ecx, al
xor edx, edx
call ct_u8_eq #10/10
?_005: nop
add rsp, 40
ret
?_006:
#Rust Panic
#Unwind the Stack Inspired by this fix I'm going to just call into the conditional comparison for the condition slice copy. I think that'll work since I'm just looping over a function call. The loop will be constant, and the function will be. So life will be amazing. |
And life is amazing 😎😎😎😎😎😎😎 ct_copy_u8:; Function begin
push r15
push r14
push rsi
push rdi
push rbp
push rbx
sub rsp, 40
mov r14, r9
mov rsi, r8
mov rbx, rdx
cmp rsi, qword [rsp+80H]
jnz ?_029 #Jump to explicit panic for non-equal args
test rsi, rsi
jz ?_027 #Skipping looping if array len == 0
xor edi, edi
movzx r15d, cl
?_026: cmp rdi, rsi
jnc ?_028 #Memory safety panic
lea rbp, [rdi+1H]
movzx r8d, byte [rbx+rdi]
movzx edx, byte [r14+rdi]
mov ecx, r15d
call ct_select_u8 #constant time swap
mov byte [rbx+rdi], al
cmp rbp, rsi
mov rdi, rbp
jc ?_026 #loop
?_027: add rsp, 40
pop rbx
pop rbp
pop rdi
pop rsi
pop r14
pop r15
ret
?_028:
#Rust Panic
?_029:
#Rust Panic The Audit Rev3 branch appears FULLY FUNCTIONAL on Windows MSVC AMD64. All advertised guarantees exist. Branching only occurs on 0 array lengths, and memory panics 💯 |
All previous changes have been merged with master as of ed6901c |
So I wrote a [timer crate](https://github.com/valarauca/amd64_timer0 to help me benchmark if Oh it isn't even close to consistent :( |
Have you considered using the |
Not a bad idea. Performance really isn't the first and foremost goal, the timing is. Also But yes putting up In reality I've mostly failed to make this consistent time, and I should pull it from |
There is another crate trying to do this now : https://github.com/isislovecruft/subtle/ I've no idea if it will succeed either though, but worth exploring. |
I don't know this is working as indented
I wouldn't trust a single rando online to tell me it was working as indented.
This'll be an open issue/thread until there is a consensus the code is correct. Please use it to raise concerns, or make requests, or yell me for doing something wrong.
Thank you.
The text was updated successfully, but these errors were encountered: