-
Notifications
You must be signed in to change notification settings - Fork 905
[crypto] Add consttime_memeq_byte()
function
#28501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This came after offline discussions, a super nice function to compare bytes in a constant time manner such that non-world aligned buffers can be matched! Thanks Pascal!
sw/device/lib/base/hardened_memory.c
Outdated
unsigned char a = ((unsigned char *)lhs)[it]; | ||
unsigned char b = ((unsigned char *)rhs)[it]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you compare the assembly generated by this array indexing construction vs. writing the loop with pointer arithmetic?
e.g.:
unsigned char *lh = (unsigned char*)lhs;
unsigned char *rh = (unsigned char*)rhs;
for(; it < len; ++it, ++lh, ++rh) {
unsigned char a = *lh, b= *rh;
/* the rest of the loop... */
}
This seems like something the optimizer would figure out, but I can't help but be curious about the generated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehe I asked the same question @cfrantz !
Pascal showed me assembly and it does exactly that: start pointers in reg and increment them in loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
Here the ASM output of the current implementation:
20012c92 <consttime_memeq_byte>:
20012c92: 4281 li t0,0
20012c94: 58fd li a7,-1
20012c96: 8832 mv a6,a2
20012c98: 02080263 beqz a6,20012cbc <consttime_memeq_byte+0x2a>
20012c9c: 00054703 lbu a4,0(a0)
20012ca0: 0005c683 lbu a3,0(a1)
20012ca4: 87ba mv a5,a4
20012ca6: 8fb5 xor a5,a5,a3
20012ca8: 0057e2b3 or t0,a5,t0
20012cac: 8eb9 xor a3,a3,a4
20012cae: 40d8f8b3 andn a7,a7,a3
20012cb2: 0505 addi a0,a0,1
20012cb4: 0585 addi a1,a1,1
20012cb6: 187d addi a6,a6,-1
20012cb8: fe0812e3 bnez a6,20012c9c <consttime_memeq_byte+0xa>
20012cbc: 00c60363 beq a2,a2,20012cc2 <consttime_memeq_byte+0x30>
20012cc0: 0000 unimp
20012cc2: fec61fe3 bne a2,a2,20012cc0 <consttime_memeq_byte+0x2e>
20012cc6: 00028f63 beqz t0,20012ce4 <consttime_memeq_byte+0x52>
20012cca: 557d li a0,-1
20012ccc: 00a89363 bne a7,a0,20012cd2 <consttime_memeq_byte+0x40>
20012cd0: 0000 unimp
20012cd2: fea88fe3 beq a7,a0,20012cd0 <consttime_memeq_byte+0x3e>
20012cd6: 1d400513 li a0,468
20012cda: 8082 ret
20012cdc: c0001073 unimp
20012ce0: c0001073 unimp
20012ce4: 557d li a0,-1
20012ce6: 00a88363 beq a7,a0,20012cec <consttime_memeq_byte+0x5a>
20012cea: 0000 unimp
20012cec: fea89fe3 bne a7,a0,20012cea <consttime_memeq_byte+0x58>
20012cf0: 73900513 li a0,1849
20012cf4: 8082 ret
20012cf6: c0001073 unimp
20012cfa: c0001073 unimp
and here the one when switching to your suggestion:
20012c92 <consttime_memeq_byte>:
20012c92: 4281 li t0,0
20012c94: 58fd li a7,-1
20012c96: 8832 mv a6,a2
20012c98: 02080263 beqz a6,20012cbc <consttime_memeq_byte+0x2a>
20012c9c: 00054703 lbu a4,0(a0)
20012ca0: 0005c683 lbu a3,0(a1)
20012ca4: 87ba mv a5,a4
20012ca6: 8fb5 xor a5,a5,a3
20012ca8: 0057e2b3 or t0,a5,t0
20012cac: 8eb9 xor a3,a3,a4
20012cae: 40d8f8b3 andn a7,a7,a3
20012cb2: 0505 addi a0,a0,1
20012cb4: 0585 addi a1,a1,1
20012cb6: 187d addi a6,a6,-1
20012cb8: fe0812e3 bnez a6,20012c9c <consttime_memeq_byte+0xa>
20012cbc: 00c60363 beq a2,a2,20012cc2 <consttime_memeq_byte+0x30>
20012cc0: 0000 unimp
20012cc2: fec61fe3 bne a2,a2,20012cc0 <consttime_memeq_byte+0x2e>
20012cc6: 00028f63 beqz t0,20012ce4 <consttime_memeq_byte+0x52>
20012cca: 557d li a0,-1
20012ccc: 00a89363 bne a7,a0,20012cd2 <consttime_memeq_byte+0x40>
20012cd0: 0000 unimp
20012cd2: fea88fe3 beq a7,a0,20012cd0 <consttime_memeq_byte+0x3e>
20012cd6: 1d400513 li a0,468
20012cda: 8082 ret
20012cdc: c0001073 unimp
20012ce0: c0001073 unimp
20012ce4: 557d li a0,-1
20012ce6: 00a88363 beq a7,a0,20012cec <consttime_memeq_byte+0x5a>
20012cea: 0000 unimp
20012cec: fea89fe3 bne a7,a0,20012cea <consttime_memeq_byte+0x58>
20012cf0: 73900513 li a0,1849
20012cf4: 8082 ret
20012cf6: c0001073 unimp
20012cfa: c0001073 unimp
Looks to be identical :-)
Which implementation would you prefer? Both work for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
I don't have a strong preference.
From a personal style perspective:
- I like the pointer arithmetic solution because it more closely resembles the generated code.
- I don't really like the cast-then-array-index construction
((unsigned char *)lhs)[it]
. I prefer to cast the void pointers to the desired type and then operate on the variable with the desired type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, regardless of which form you use, both the current code and my example fragment cast away const
. The variables & casts should probably keep const, as we promise that we aren't mutating the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback.
I agree with your points - I've changed the code to the arithmetic solution. I've also added the const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thansk @nasahlpa !
The current hardened_memeq() is hardened against SCA and is const. time. However, it requires that the input buffers are 32-bit word aligned. This commit adds a new consttime_memeq_byte() function that can operate also on buffers that are not 32-bit aligned. It is const. time but not hardened against SCA. Signed-off-by: Pascal Nasahl <[email protected]>
We should use consttime_memeq_byte instead of hardened_memeq to also allow safely operating on a byte buffer (cipher_input). Signed-off-by: Pascal Nasahl <[email protected]>
ced6dea
to
7387509
Compare
The current hardened_memeq() is hardened against SCA and is const. time. However, it requires that the input buffers are 32-bit word aligned.
This commit adds a new consttime_memeq_byte() function that can operate also on buffers that are not 32-bit aligned. It is const.
time but not hardened against SCA.