Skip to content

Conversation

@cjnoname
Copy link

This PR fixes the etag middleware issue in AWS Lambda environments, where the response body cannot be read due to the "Body has already been read" error. This occurs when the Lambda runtime disallows re-reading the body stream, leading to failure when attempting to generate or validate ETags.

#4031

@codecov
Copy link

codecov bot commented May 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.34%. Comparing base (2805d32) to head (95e813b).
Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4166      +/-   ##
==========================================
+ Coverage   91.31%   91.34%   +0.02%     
==========================================
  Files         168      168              
  Lines       10788    10821      +33     
  Branches     3183     3076     -107     
==========================================
+ Hits         9851     9884      +33     
  Misses        936      936              
  Partials        1        1              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yusukebe
Copy link
Member

Hi @cjnoname

Thank you for the PR. I'll review this, but can you add proper tests for this change?

@cjnoname
Copy link
Author

cjnoname commented Jun 1, 2025

Hi @cjnoname

Thank you for the PR. I'll review this, but can you add proper tests for this change?

Hey mate, I've added tests to bring the coverage up to 100%. Feel free to make any changes if needed.

@cjnoname cjnoname closed this Jun 5, 2025
@cjnoname cjnoname reopened this Jun 6, 2025
@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

Does this really solve #4031? If PR #4198 can't do that, I think this PR also can't resolve this.

Though this is okay, I don't want to add runtime/platform-specific code like the following:

const isLambda = !!(
process.env.AWS_EXECUTION_ENV ||
process.env.LAMBDA_TASK_ROOT ||
process.env.AWS_LAMBDA_FUNCTION_NAME
)

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

Does this really solve #4031? If PR #4198 can't do that, I think this PR also can't resolve this.

Though this is okay, I don't want to add runtime/platform-specific code like the following:

const isLambda = !!(
process.env.AWS_EXECUTION_ENV ||
process.env.LAMBDA_TASK_ROOT ||
process.env.AWS_LAMBDA_FUNCTION_NAME
)

This solves the problem. We just need to ensure that res.clone().body is not consumed in a Lambda environment, because in Lambda, it consumes the "actual" body instead of the "cloned" one.

Your PR won’t work because the error isn’t thrown in the ETag file, but in the subsequent files. Any of those later files that try to consume the body will find that it has already been used, which causes the error.

If you prefer to avoid runtime/platform-specific code, you can use ArrayBuffer in all environments. It’s slightly slower than using clone().body, but the difference is hardly noticeable. That’s why I wrote the code this way—to ensure maximum speed in non-Lambda environments.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

@cjnoname

So, is it possible that the related problem occurs not only in ETag Middleware?

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

you prefer to avoid runtime/platform-specific code

Good question. I searched for .body usage across the entire application, and currently, the ETag middleware is the only place where res.clone().body is being used. In the Lambda environment, the body can only be consumed once, so for now, this issue only affects the ETag logic. However, if another middleware in the future also tries to consume res.clone().body, it will encounter the same problem in AWS Lambda.

The core difference is: in a browser environment, clone() duplicates both the response object and the underlying stream, allowing the stream to be consumed multiple times.

In contrast, in a Lambda environment, clone() only duplicates the response object—not the underlying stream.

To avoid writing runtime/platform-specific code, there are two ways to solve this:

  1. Consume the body only once – In this case, the ETag middleware should use res.arrayBuffer() instead of res.clone().body.

  2. Consume res.body once globally, cache it in memory, and reuse it wherever needed.

I believe the second option is the one you prefer, but it requires a more global understanding of the changes. Unfortunately, I’m unable to help with that, as I don’t have enough context on how the response body is being used throughout the application.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

@cjnoname

Thank you for the explanation. I don't want to change the internal logic, but how about adding the following option, like generateDigestFromContext or generateDigestFromResponse?

type ETagOptions = {
  retainedHeaders?: string[]
  weak?: boolean
  generateDigest?: (body: Uint8Array) => ArrayBuffer | Promise<ArrayBuffer>
  generateDigestFromContext?: (c: Context) => ArrayBuffer | Promise<ArrayBuffer> // Add
  // or `generateDigestFromResponse`
  // generateDigestFromResponse?: (res: Response) => ArrayBuffer | Promise<ArrayBuffer>
}

If you want to use ETag Middleware on AWS Lambda, you can pass the function to generate a hash from the Context object or the Response object. It will be a little bit bothersome to you, but this is more generic, and there will not be runtime/platform-specific code and performance degradation.

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

@cjnoname

So, is it possible that the related problem occurs not only in ETag Middleware?

This answer was generated by GPT-4.1. I hope it provides you with a deeper understanding of the issue.

You asked about [res.clone().body] and why, in Lambda environments, it sometimes doesn't work as expected (the stream isn't really cloned).

Explanation:

In standard environments (like browsers or Node.js with undisturbed Fetch API), [res.clone().body] gives you a fresh, readable stream. You can read the body multiple times: once from the original, once from the clone.
In AWS Lambda (or similar serverless environments), the Response object may not be a true Web Fetch Response. Sometimes, the body stream is already consumed or not properly cloneable. This means [res.clone().body] might not actually give you a new, readable stream—it could be locked, empty, or throw errors.

Why does this happen?

Lambda runtimes may wrap or polyfill the Response object, and their implementation of [clone()] may not fully support duplicating the stream.

Streams in JavaScript can only be read once. If the Lambda runtime or a previous middleware has already read the stream, cloning won't work.

How is this handled in your code?

Your code checks if it's running in Lambda ([isLambda]).
If so, it reads the body as an [ArrayBuffer] immediately ([await res.arrayBuffer()], then re-creates the Response. This ensures the body is available for hashing and for the client.
In non-Lambda environments, it safely uses [res.clone().body].

Summary:
In Lambda, [res.clone().body] is unreliable because the stream may already be consumed or not properly cloneable. The workaround is to read the body as a buffer and re-create the Response, as your code does. This ensures consistent behavior across environments.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

@cjnoname

So I can understand the behavior on the AWS Lambda, but I want to know if that does #4166 (comment) work for you?

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

@cjnoname

Thank you for the explanation. I don't want to change the internal logic, but how about adding the following option, like generateDigestFromContext or generateDigestFromResponse?

type ETagOptions = {
  retainedHeaders?: string[]
  weak?: boolean
  generateDigest?: (body: Uint8Array) => ArrayBuffer | Promise<ArrayBuffer>
  generateDigestFromContext?: (c: Context) => ArrayBuffer | Promise<ArrayBuffer> // Add
  // or `generateDigestFromResponse`
  // generateDigestFromResponse?: (res: Response) => ArrayBuffer | Promise<ArrayBuffer>
}

If you want to use ETag Middleware on AWS Lambda, you can pass the function to generate a hash from the Context object or the Response object. It will be a little bit bothersome to you, but this is more generic, and there will not be runtime/platform-specific code and performance degradation.

I cannot see the full implementation, but I’m sure any change that avoids calling res.clone().body can help.

But the best approach is still to read the full body once at the beginning and cache it so all subsequent middlewares can share it.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

But the best approach is still to read the full body once at the beginning and cache it so all subsequent middlewares can share it.

Using await res.arrayBuffer() will affect the performance if it sends a large amount of content. I think we shoud go with #4166 (comment)

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

But the best approach is still to read the full body once at the beginning and cache it so all subsequent middlewares can share it.

Using await res.arrayBuffer() will affect the performance if it sends a large amount of content. I think we shoud go with #4166 (comment)

I mean, if you could read res.body() once at the beginning and cache it in memory for all upcoming uses to reuse, that would be the best approach.

For example, in your context, you could define a variable like bodyText (or any suitable name) to hold the content of res.body(). Then, middlewares like etag can simply read the body from your context instead of cloning and reading it each time.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

For example, in your context, you could define a variable like bodyText (or any suitable name) to hold the content of res.body(). Then, middlewares like etag can simply read the body from your context instead of cloning and reading it each time.

Does this method really work on AWS Lambda? Can it read res.body() on AWS Lambda? Anyway, adding the option will be simpler and don't touch the logics. I'll create a PR.

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

res.body()

Lambda can read res.body(). You need to understand that the body is actually a stream, which can only be consumed once. If it is consumed in the etag middleware, it is gone and cannot be consumed again in the actual place where it is needed.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

Do you mean changing this res.clone().body to res.body will work?

const hash = await generateDigest(res.clone().body, generator)

To:

const hash = await generateDigest(res.body, generator) 

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

Do you mean changing this res.clone().body to res.body will work?

const hash = await generateDigest(res.clone().body, generator)

To:

const hash = await generateDigest(res.body, generator) 

Do you mean changing this res.clone().body to res.body will work?

const hash = await generateDigest(res.clone().body, generator)

To:

const hash = await generateDigest(res.body, generator) 

Lambda 環境では、時々 body のクローン(clone())が正しく動作しないことがあります。body 自体は正常に動作していますが、もしこの時点で body を消費してしまうと、後続の処理で再び body を使用しようとした際にエラーが発生してしまいます。

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

@cjnoname

I can understand. But can you answer the question? If not, we can't help you. Sorry.

@yusukebe
Copy link
Member

yusukebe commented Jun 7, 2025

My idea is the following:

diff --git a/src/middleware/etag/digest.ts b/src/middleware/etag/digest.ts
index 8c6b6f97..27c7941a 100644
--- a/src/middleware/etag/digest.ts
+++ b/src/middleware/etag/digest.ts
@@ -9,23 +9,25 @@ const mergeBuffers = (buffer1: ArrayBuffer | undefined, buffer2: Uint8Array): Ui
 }

 export const generateDigest = async (
-  stream: ReadableStream<Uint8Array> | null,
+  input: ReadableStream<Uint8Array> | ArrayBuffer | null,
   generator: (body: Uint8Array) => ArrayBuffer | Promise<ArrayBuffer>
 ): Promise<string | null> => {
-  if (!stream) {
+  if (!input) {
     return null
   }

   let result: ArrayBuffer | undefined = undefined
-
-  const reader = stream.getReader()
-  for (;;) {
-    const { value, done } = await reader.read()
-    if (done) {
-      break
+  if (input instanceof ArrayBuffer) {
+    result = await generator(new Uint8Array(input))
+  } else {
+    const reader = input.getReader()
+    for (;;) {
+      const { value, done } = await reader.read()
+      if (done) {
+        break
+      }
+      result = await generator(mergeBuffers(result, value))
     }
-
-    result = await generator(mergeBuffers(result, value))
   }

   if (!result) {
diff --git a/src/middleware/etag/index.ts b/src/middleware/etag/index.ts
index 613070af..9b4c3c76 100644
--- a/src/middleware/etag/index.ts
+++ b/src/middleware/etag/index.ts
@@ -3,6 +3,7 @@
  * ETag Middleware for Hono.
  */

+import type { Context } from '../../context'
 import type { MiddlewareHandler } from '../../types'
 import { generateDigest } from './digest'

@@ -10,6 +11,7 @@ type ETagOptions = {
   retainedHeaders?: string[]
   weak?: boolean
   generateDigest?: (body: Uint8Array) => ArrayBuffer | Promise<ArrayBuffer>
+  getBodyFromContext?: (c: Context) => ArrayBuffer | Promise<ArrayBuffer>
 }

 /**
@@ -93,7 +95,12 @@ export const etag = (options?: ETagOptions): MiddlewareHandler => {
       if (!generator) {
         return
       }
-      const hash = await generateDigest(res.clone().body, generator)
+      const body = options?.getBodyFromContext
+        ? await options.getBodyFromContext(c)
+        : res.clone().body
+
+      const hash = await generateDigest(body, generator)
+
       if (hash === null) {
         return

The usage:

const app = new Hono()
app.use(
  '/etag/*',
  etag({
    getBodyFromContext: async (c) => {
      return await c.res.arrayBuffer()
    },
  })
)

This does not break the logic on platforms/runtimes other than AWS Lambda.

@cjnoname
Copy link
Author

cjnoname commented Jun 7, 2025

My idea is the following:

diff --git a/src/middleware/etag/digest.ts b/src/middleware/etag/digest.ts
index 8c6b6f97..27c7941a 100644
--- a/src/middleware/etag/digest.ts
+++ b/src/middleware/etag/digest.ts
@@ -9,23 +9,25 @@ const mergeBuffers = (buffer1: ArrayBuffer | undefined, buffer2: Uint8Array): Ui
 }

 export const generateDigest = async (
-  stream: ReadableStream<Uint8Array> | null,
+  input: ReadableStream<Uint8Array> | ArrayBuffer | null,
   generator: (body: Uint8Array) => ArrayBuffer | Promise<ArrayBuffer>
 ): Promise<string | null> => {
-  if (!stream) {
+  if (!input) {
     return null
   }

   let result: ArrayBuffer | undefined = undefined
-
-  const reader = stream.getReader()
-  for (;;) {
-    const { value, done } = await reader.read()
-    if (done) {
-      break
+  if (input instanceof ArrayBuffer) {
+    result = await generator(new Uint8Array(input))
+  } else {
+    const reader = input.getReader()
+    for (;;) {
+      const { value, done } = await reader.read()
+      if (done) {
+        break
+      }
+      result = await generator(mergeBuffers(result, value))
     }
-
-    result = await generator(mergeBuffers(result, value))
   }

   if (!result) {
diff --git a/src/middleware/etag/index.ts b/src/middleware/etag/index.ts
index 613070af..9b4c3c76 100644
--- a/src/middleware/etag/index.ts
+++ b/src/middleware/etag/index.ts
@@ -3,6 +3,7 @@
  * ETag Middleware for Hono.
  */

+import type { Context } from '../../context'
 import type { MiddlewareHandler } from '../../types'
 import { generateDigest } from './digest'

@@ -10,6 +11,7 @@ type ETagOptions = {
   retainedHeaders?: string[]
   weak?: boolean
   generateDigest?: (body: Uint8Array) => ArrayBuffer | Promise<ArrayBuffer>
+  getBodyFromContext?: (c: Context) => ArrayBuffer | Promise<ArrayBuffer>
 }

 /**
@@ -93,7 +95,12 @@ export const etag = (options?: ETagOptions): MiddlewareHandler => {
       if (!generator) {
         return
       }
-      const hash = await generateDigest(res.clone().body, generator)
+      const body = options?.getBodyFromContext
+        ? await options.getBodyFromContext(c)
+        : res.clone().body
+
+      const hash = await generateDigest(body, generator)
+
       if (hash === null) {
         return

The usage:

const app = new Hono()
app.use(
  '/etag/*',
  etag({
    getBodyFromContext: async (c) => {
      return await c.res.arrayBuffer()
    },
  })
)

This does not break the logic on platforms/runtimes other than AWS Lambda.

This should work, mate. I'll try it out for a few days before merging the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants