fix: more robust handling of whitespace and alternate pem format #232

Vikas-Sajanani · 2025-05-10T16:00:16Z

I'll explain line by line what changed from the original version to make the code more robust for handling various PEM key formats. Let me break down the key differences:

Original Version

javaprivate static final Pattern PKCS1_PEM_KEY_PATTERN =
    Pattern.compile("(?m)(?s)^---*BEGIN RSA PRIVATE KEY.*---*$(.*)^---*END.*---*$.*");

New Version

java// Matches RSA PEM format
private static final Pattern PKCS1_PEM_KEY_PATTERN =
    Pattern.compile("(?m)(?s)^\\s*-{3,}\\s*BEGIN\\s+RSA\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$(.*)^\\s*-{3,}\\s*END\\s+RSA\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$.*");

// Matches PKCS8 PEM format
private static final Pattern PKCS8_PEM_KEY_PATTERN =
    Pattern.compile("(?m)(?s)^\\s*-{3,}\\s*BEGIN\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$(.*)^\\s*-{3,}\\s*END\\s+PRIVATE\\s+KEY\\s*-{3,}\\s*$.*");

Changes and Reasons:

Added whitespace flexibility (\s*):

Original: ^---BEGIN RSA PRIVATE KEY.---$
New: ^\s-{3,}\sBEGIN\s+RSA\s+PRIVATE\s+KEY\s-{3,}\s*$
Reason: The original pattern didn't handle spaces well. The new pattern allows for spaces before, after, and within the delimiters.

Explicit dash counting (-{3,}):

Original: ---* (three dashes followed by zero or more dashes)
New: -{3,} (three or more dashes)
Reason: More explicit and readable way to handle variable numbers of dashes.

Explicit whitespace between words (\s+):

Original: No explicit handling of spaces between words
New: BEGIN\s+RSA\s+PRIVATE\s+KEY
Reason: Specifically allows for extra spaces between words like "BEGIN", "RSA", etc.

Full END pattern matching:

Original: ^---END.---$
New: ^\s-{3,}\sEND\s+RSA\s+PRIVATE\s+KEY\s-{3,}\s*$
Reason: The original pattern used .* after "END", which was too permissive. The new pattern requires the full "END RSA PRIVATE KEY" text, preventing incorrect matches.

Added PKCS8 pattern:

Original: Only had PKCS1 pattern
New: Added separate pattern for PKCS8 format (BEGIN PRIVATE KEY without "RSA")
Reason: Some PEM files use the PKCS8 format header/footer without "RSA" in the text.

Original loadKeySpec Method

javapublic static Optional<KeySpec> loadKeySpec(final byte[] privateKey) {
  final Matcher isPEM = PKCS1_PEM_KEY_PATTERN.matcher(new String(privateKey));
  if (!isPEM.matches()) {
    return Optional.empty();
  }

  byte[] pkcs1Key = Base64.getMimeDecoder().decode(isPEM.group(1));
  byte[] pkcs8Key = toPkcs8(pkcs1Key);
  final PKCS8EncodedKeySpec keySpec = new PKCS8EncodedKeySpec(pkcs8Key);
  return Optional.of(keySpec);
}

New loadKeySpec Method

javapublic static Optional<KeySpec> loadKeySpec(final byte[] privateKey) {
  final String keyString = new String(privateKey);
  
  // Try to match PKCS1 (RSA) format first
  Matcher isPKCS1 = PKCS1_PEM_KEY_PATTERN.matcher(keyString);
  if (isPKCS1.matches()) {
    return extractKeySpec(isPKCS1.group(1), true);
  }
  
  // Try to match PKCS8 format
  Matcher isPKCS8 = PKCS8_PEM_KEY_PATTERN.matcher(keyString);
  if (isPKCS8.matches()) {
    return extractKeySpec(isPKCS8.group(1), false);
  }
  
  // Not a recognized PEM format
  return Optional.empty();
}

private static Optional<KeySpec> extractKeySpec(String base64Content, boolean isPKCS1) {
  try {
    // Remove all whitespace
    base64Content = base64Content.replaceAll("\\s+", "");
    
    // Check if content is empty after whitespace removal
    if (base64Content.isEmpty()) {
      return Optional.empty();
    }
    
    // Decode the base64 content
    byte[] decodedKey = Base64.getDecoder().decode(base64Content);
    
    // Convert to PKCS8 if necessary
    byte[] pkcs8Key = isPKCS1 ? toPkcs8(decodedKey) : decodedKey;
    
    return Optional.of(new PKCS8EncodedKeySpec(pkcs8Key));
  } catch (IllegalArgumentException e) {
    // Failed to decode base64 content
    return Optional.empty();
  }
}

Changes and Reasons:

Dual format support:

Original: Only tested against PKCS1 pattern
New: Tests against both PKCS1 and PKCS8 patterns
Reason: Provides support for both key formats, increasing compatibility

Extracted shared logic:

Original: All processing in the main method
New: Dedicated extractKeySpec method for common processing
Reason: Better code organization and avoids duplication

Whitespace handling:

Original: Used Base64.getMimeDecoder()
New: Explicitly removes all whitespace with replaceAll("\s+", "")
Reason: More explicit control over whitespace handling

Empty content detection:

Original: No explicit check for empty content
New: if (base64Content.isEmpty()) { return Optional.empty(); }
Reason: Prevents trying to decode empty strings, which caused one of your test failures

Exception handling:

Original: No exception handling
New: try/catch block for IllegalArgumentException
Reason: More robust handling of invalid base64 content

Decoder change:

Original: Base64.getMimeDecoder()
New: Base64.getDecoder()
Reason: After explicitly handling whitespace, the standard decoder is more appropriate

Conditional PKCS8 conversion:

Original: Always converted to PKCS8
New: byte[] pkcs8Key = isPKCS1 ? toPkcs8(decodedKey) : decodedKey;
Reason: Only converts when necessary (PKCS8 format is already in the right format)

… tests

fix: more robust handling of whitespace and alternate pem format with…

2db92ee

… tests

Vikas-Sajanani mentioned this pull request May 10, 2025

PEMs downloaded from GitHub App Registration Portal have empty new line #153

Open

Vikas-Sajanani added 2 commits May 10, 2025 17:19

fix: format changes for jdk 17 pipeline

ae85938

fix: add newline char to clear pipeline

438ba18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: more robust handling of whitespace and alternate pem format #232

fix: more robust handling of whitespace and alternate pem format #232

Uh oh!

Vikas-Sajanani commented May 10, 2025

Uh oh!

Uh oh!

fix: more robust handling of whitespace and alternate pem format #232

Are you sure you want to change the base?

fix: more robust handling of whitespace and alternate pem format #232

Uh oh!

Conversation

Vikas-Sajanani commented May 10, 2025

Uh oh!

Uh oh!