fix: more robust handling of whitespace and alternate pem format #232
I'll explain line by line what changed from the original version to make the code more robust when handling various PEM key formats. The key differences are:
Original Version
New Version
Changes and Reasons:
Added whitespace flexibility (\s*):
Original: ^---*BEGIN RSA PRIVATE KEY.*---*$
New: ^\s*-{3,}\s*BEGIN\s+RSA\s+PRIVATE\s+KEY\s*-{3,}\s*$
Reason: The original pattern rejected whitespace before the leading dashes, between the dashes and BEGIN, and after the trailing dashes. The new pattern allows whitespace before, after, and within the delimiters.
Explicit dash counting (-{3,}):
Original: ---* (two literal dashes followed by zero or more dashes, so two or more in total)
New: -{3,} (three or more dashes)
Reason: A more explicit and readable way to handle a variable number of dashes. It also requires at least three.
Explicit whitespace between words (\s+):
Original: Exactly one literal space between words, with no handling of extra whitespace
New: BEGIN\s+RSA\s+PRIVATE\s+KEY
Reason: Specifically allows for extra spaces between words like "BEGIN", "RSA", etc.
Full END pattern matching:
Original: ^---*END.*---*$
New: ^\s*-{3,}\s*END\s+RSA\s+PRIVATE\s+KEY\s*-{3,}\s*$
Reason: The original pattern used .* after "END", which was too permissive. The new pattern requires the full "END RSA PRIVATE KEY" text, preventing incorrect matches.
Added PKCS8 pattern:
Original: Only had PKCS1 pattern
New: Added separate pattern for PKCS8 format (BEGIN PRIVATE KEY without "RSA")
Reason: Some PEM files use the PKCS8 format header/footer without "RSA" in the text.
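The delimiter patterns described above can be sketched in Java as follows. This is a minimal sketch, not the code from the PR: the class and constant names (PemPatterns, PKCS1_BEGIN, and so on) are assumptions made for illustration, and the quantifiers are assumed to be `\s*`, `\s+` and `-{3,}` as the change descriptions state.

```java
import java.util.regex.Pattern;

// Hypothetical holder class; the names are illustrative, not taken from the PR.
class PemPatterns {
    // PKCS1 delimiters: optional whitespace around the dashes, at least three
    // dashes, and one or more whitespace characters between the words.
    static final Pattern PKCS1_BEGIN = Pattern.compile(
            "^\\s*-{3,}\\s*BEGIN\\s+RSA\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$");
    static final Pattern PKCS1_END = Pattern.compile(
            "^\\s*-{3,}\\s*END\\s+RSA\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$");

    // PKCS8 delimiters: the same shape without "RSA" in the text.
    static final Pattern PKCS8_BEGIN = Pattern.compile(
            "^\\s*-{3,}\\s*BEGIN\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$");
    static final Pattern PKCS8_END = Pattern.compile(
            "^\\s*-{3,}\\s*END\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$");

    public static void main(String[] args) {
        String[] lines = {
            "-----BEGIN RSA PRIVATE KEY-----",
            "  --- BEGIN  RSA PRIVATE KEY ---  ",
            "-----BEGIN PRIVATE KEY-----",
            "-----END PUBLIC KEY-----"
        };
        for (String line : lines) {
            System.out.println("[" + line + "]"
                    + " pkcs1Begin=" + PKCS1_BEGIN.matcher(line).matches()
                    + " pkcs8Begin=" + PKCS8_BEGIN.matcher(line).matches());
        }
    }
}
```

Because the END patterns spell out the full footer text, a footer such as "-----END PUBLIC KEY-----" no longer matches, which the permissive `.*` form would have accepted.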
Original loadKeySpec Method
New loadKeySpec Method
Changes and Reasons:
Dual format support:
Original: Only tested against PKCS1 pattern
New: Tests against both PKCS1 and PKCS8 patterns
Reason: Provides support for both key formats, increasing compatibility
Extracted shared logic:
Original: All processing in the main method
New: Dedicated extractKeySpec method for common processing
Reason: Better code organization and avoids duplication
Whitespace handling:
Original: Used Base64.getMimeDecoder()
New: Explicitly removes all whitespace with replaceAll("\\s+", "")
Reason: More explicit control over whitespace handling. The MIME decoder silently skips any character outside the Base64 alphabet, not only whitespace.
Empty content detection:
Original: No explicit check for empty content
New: if (base64Content.isEmpty()) { return Optional.empty(); }
Reason: Prevents trying to decode empty strings, which caused one of the test failures
Exception handling:
Original: No exception handling
New: try/catch block for IllegalArgumentException
Reason: More robust handling of invalid base64 content
Decoder change:
Original: Base64.getMimeDecoder()
New: Base64.getDecoder()
Reason: After explicitly handling whitespace, the standard decoder is more appropriate
Conditional PKCS8 conversion:
Original: Always converted to PKCS8
New: byte[] pkcs8Key = isPKCS1 ? toPkcs8(decodedKey) : decodedKey;
Reason: Only converts when necessary (a PKCS8 key is already in the required format)
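The processing steps listed above (strip whitespace, reject empty content, decode with the standard decoder, convert only PKCS1 input) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the PR's code: the class name KeySpecSketch and the signature of extractKeySpec are hypothetical, and toPkcs8 is shown as one plausible implementation that wraps the PKCS1 bytes in a PKCS8 PrivateKeyInfo structure using hand-written DER. The real helper is not shown in this description and may differ.

```java
import java.io.ByteArrayOutputStream;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;
import java.util.Optional;

// Hypothetical class; names and signatures are assumptions for illustration.
class KeySpecSketch {

    // DER: INTEGER 0 (the PrivateKeyInfo version).
    private static final byte[] VERSION_ZERO = {0x02, 0x01, 0x00};

    // DER: AlgorithmIdentifier { rsaEncryption (1.2.840.113549.1.1.1), NULL }.
    private static final byte[] RSA_ALGORITHM_ID = {
        0x30, 0x0D, 0x06, 0x09, 0x2A, (byte) 0x86, 0x48, (byte) 0x86,
        (byte) 0xF7, 0x0D, 0x01, 0x01, 0x01, 0x05, 0x00
    };

    // body is the text between the BEGIN and END lines.
    static Optional<PKCS8EncodedKeySpec> extractKeySpec(String body, boolean isPKCS1) {
        // Remove all whitespace so the strict (non-MIME) decoder can be used.
        String base64Content = body.replaceAll("\\s+", "");
        if (base64Content.isEmpty()) {
            return Optional.empty();
        }
        try {
            byte[] decodedKey = Base64.getDecoder().decode(base64Content);
            // Only PKCS1 input needs wrapping; PKCS8 input is used as-is.
            byte[] pkcs8Key = isPKCS1 ? toPkcs8(decodedKey) : decodedKey;
            return Optional.of(new PKCS8EncodedKeySpec(pkcs8Key));
        } catch (IllegalArgumentException e) {
            // Thrown by the decoder when the content is not valid Base64.
            return Optional.empty();
        }
    }

    // Assumed behaviour: SEQUENCE { version, algorithmId, OCTET STRING { pkcs1 } }.
    static byte[] toPkcs8(byte[] pkcs1) {
        byte[] octetString = tlv(0x04, pkcs1);
        ByteArrayOutputStream inner = new ByteArrayOutputStream();
        inner.write(VERSION_ZERO, 0, VERSION_ZERO.length);
        inner.write(RSA_ALGORITHM_ID, 0, RSA_ALGORITHM_ID.length);
        inner.write(octetString, 0, octetString.length);
        return tlv(0x30, inner.toByteArray());
    }

    // Encodes one DER tag-length-value element with a definite length.
    private static byte[] tlv(int tag, byte[] value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        int len = value.length;
        if (len < 0x80) {
            out.write(len); // short form
        } else {
            int numBytes = (32 - Integer.numberOfLeadingZeros(len) + 7) / 8;
            out.write(0x80 | numBytes); // long form: count, then big-endian length
            for (int i = numBytes - 1; i >= 0; i--) {
                out.write((len >>> (8 * i)) & 0xFF);
            }
        }
        out.write(value, 0, value.length);
        return out.toByteArray();
    }
}
```

A caller would pass the resulting spec to KeyFactory.getInstance("RSA").generatePrivate(...). Returning Optional.empty() for both empty and malformed content keeps the two failure cases indistinguishable to the caller, which matches the behaviour described above but may not suit code that needs to report why a key was rejected.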