Skip to content

Conversation

nasahlpa
Copy link
Member

The current hardened_memeq() is hardened against SCA and is const. time. However, it requires that the input buffers are 32-bit word aligned.

This commit adds a new consttime_memeq_byte() function that can operate also on buffers that are not 32-bit aligned. It is const.
time but not hardened against SCA.

Copy link
Contributor

@siemen11 siemen11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This came after offline discussions, a super nice function to compare bytes in a constant time manner such that non-world aligned buffers can be matched! Thanks Pascal!

@nasahlpa nasahlpa marked this pull request as ready for review October 16, 2025 18:13
@nasahlpa nasahlpa requested a review from a team as a code owner October 16, 2025 18:13
@nasahlpa nasahlpa requested review from alees24 and removed request for a team October 16, 2025 18:13
Comment on lines 134 to 135
unsigned char a = ((unsigned char *)lhs)[it];
unsigned char b = ((unsigned char *)rhs)[it];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you compare the assembly generated by this array indexing construction vs. writing the loop with pointer arithmetic?

e.g.:

unsigned char *lh = (unsigned char*)lhs;
unsigned char *rh = (unsigned char*)rhs;
for(; it < len; ++it, ++lh, ++rh) {
  unsigned char a = *lh, b= *rh;
  /* the rest of the loop... */
}

This seems like something the optimizer would figure out, but I can't help but be curious about the generated code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe I asked the same question @cfrantz !
Pascal showed me assembly and it does exactly that: start pointers in reg and increment them in loop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!

Here the ASM output of the current implementation:

20012c92 <consttime_memeq_byte>:
20012c92:	4281                	li	t0,0
20012c94:	58fd                	li	a7,-1
20012c96:	8832                	mv	a6,a2
20012c98:	02080263          	beqz	a6,20012cbc <consttime_memeq_byte+0x2a>
20012c9c:	00054703          	lbu	a4,0(a0)
20012ca0:	0005c683          	lbu	a3,0(a1)
20012ca4:	87ba                	mv	a5,a4
20012ca6:	8fb5                	xor	a5,a5,a3
20012ca8:	0057e2b3          	or	t0,a5,t0
20012cac:	8eb9                	xor	a3,a3,a4
20012cae:	40d8f8b3          	andn	a7,a7,a3
20012cb2:	0505                	addi	a0,a0,1
20012cb4:	0585                	addi	a1,a1,1
20012cb6:	187d                	addi	a6,a6,-1
20012cb8:	fe0812e3          	bnez	a6,20012c9c <consttime_memeq_byte+0xa>
20012cbc:	00c60363          	beq	a2,a2,20012cc2 <consttime_memeq_byte+0x30>
20012cc0:	0000                	unimp
20012cc2:	fec61fe3          	bne	a2,a2,20012cc0 <consttime_memeq_byte+0x2e>
20012cc6:	00028f63          	beqz	t0,20012ce4 <consttime_memeq_byte+0x52>
20012cca:	557d                	li	a0,-1
20012ccc:	00a89363          	bne	a7,a0,20012cd2 <consttime_memeq_byte+0x40>
20012cd0:	0000                	unimp
20012cd2:	fea88fe3          	beq	a7,a0,20012cd0 <consttime_memeq_byte+0x3e>
20012cd6:	1d400513          	li	a0,468
20012cda:	8082                	ret
20012cdc:	c0001073          	unimp
20012ce0:	c0001073          	unimp
20012ce4:	557d                	li	a0,-1
20012ce6:	00a88363          	beq	a7,a0,20012cec <consttime_memeq_byte+0x5a>
20012cea:	0000                	unimp
20012cec:	fea89fe3          	bne	a7,a0,20012cea <consttime_memeq_byte+0x58>
20012cf0:	73900513          	li	a0,1849
20012cf4:	8082                	ret
20012cf6:	c0001073          	unimp
20012cfa:	c0001073          	unimp

and here the one when switching to your suggestion:

20012c92 <consttime_memeq_byte>:
20012c92:	4281                	li	t0,0
20012c94:	58fd                	li	a7,-1
20012c96:	8832                	mv	a6,a2
20012c98:	02080263          	beqz	a6,20012cbc <consttime_memeq_byte+0x2a>
20012c9c:	00054703          	lbu	a4,0(a0)
20012ca0:	0005c683          	lbu	a3,0(a1)
20012ca4:	87ba                	mv	a5,a4
20012ca6:	8fb5                	xor	a5,a5,a3
20012ca8:	0057e2b3          	or	t0,a5,t0
20012cac:	8eb9                	xor	a3,a3,a4
20012cae:	40d8f8b3          	andn	a7,a7,a3
20012cb2:	0505                	addi	a0,a0,1
20012cb4:	0585                	addi	a1,a1,1
20012cb6:	187d                	addi	a6,a6,-1
20012cb8:	fe0812e3          	bnez	a6,20012c9c <consttime_memeq_byte+0xa>
20012cbc:	00c60363          	beq	a2,a2,20012cc2 <consttime_memeq_byte+0x30>
20012cc0:	0000                	unimp
20012cc2:	fec61fe3          	bne	a2,a2,20012cc0 <consttime_memeq_byte+0x2e>
20012cc6:	00028f63          	beqz	t0,20012ce4 <consttime_memeq_byte+0x52>
20012cca:	557d                	li	a0,-1
20012ccc:	00a89363          	bne	a7,a0,20012cd2 <consttime_memeq_byte+0x40>
20012cd0:	0000                	unimp
20012cd2:	fea88fe3          	beq	a7,a0,20012cd0 <consttime_memeq_byte+0x3e>
20012cd6:	1d400513          	li	a0,468
20012cda:	8082                	ret
20012cdc:	c0001073          	unimp
20012ce0:	c0001073          	unimp
20012ce4:	557d                	li	a0,-1
20012ce6:	00a88363          	beq	a7,a0,20012cec <consttime_memeq_byte+0x5a>
20012cea:	0000                	unimp
20012cec:	fea89fe3          	bne	a7,a0,20012cea <consttime_memeq_byte+0x58>
20012cf0:	73900513          	li	a0,1849
20012cf4:	8082                	ret
20012cf6:	c0001073          	unimp
20012cfa:	c0001073          	unimp

Looks to be identical :-)

Which implementation would you prefer? Both work for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

I don't have a strong preference.

From a personal style perspective:

  • I like the pointer arithmetic solution because it more closely resembles the generated code.
  • I don't really like the cast-then-array-index construction ((unsigned char *)lhs)[it]. I prefer to cast the void pointers to the desired type and then operate on the variable with the desired type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, regardless of which form you use, both the current code and my example fragment cast away const. The variables & casts should probably keep const, as we promise that we aren't mutating the data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback.

I agree with your points - I've changed the code to the arithmetic solution. I've also added the const.

Copy link
Contributor

@johannheyszl johannheyszl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thansk @nasahlpa !

The current hardened_memeq() is hardened against SCA and is const.
time. However, it requires that the input buffers are 32-bit word
aligned.

This commit adds a new consttime_memeq_byte() function that can
operate also on buffers that are not 32-bit aligned. It is const.
time but not hardened against SCA.

Signed-off-by: Pascal Nasahl <[email protected]>
We should use consttime_memeq_byte instead of hardened_memeq to
also allow safely operating on a byte buffer (cipher_input).

Signed-off-by: Pascal Nasahl <[email protected]>
@nasahlpa nasahlpa force-pushed the hardened_memeq_const branch from ced6dea to 7387509 Compare October 17, 2025 16:41
@nasahlpa nasahlpa added the CherryPick:earlgrey_1.0.0 This PR should be cherry-picked to earlgrey_1.0.0 label Oct 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CherryPick:earlgrey_1.0.0 This PR should be cherry-picked to earlgrey_1.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants