page.goto() throwing error continuously in AWS Lambda with playwright-aws-lambda #74

Open
PrinceT078 opened this issue Apr 20, 2024 · 13 comments

Comments

@PrinceT078

Lambda 1:
playwright-aws-lambda v0.9.0
Node 16

Lambda 2:
playwright-aws-lambda v0.10.0
Node 18

AWS Lambda
CI/CD Gitlab

When navigating to a URL with page.goto(), we continuously get the error 'Error: page.goto: Target page, context or browser has been closed'.
At the handler level the error is: "TypeError: Cannot read properties of undefined (reading 'close')\n at d. (/var/task/src/handler.js:1:17387)\n at Generator.next ()\n at a (/var/task/src/handler.js:1:15631)"

The above error is occurring in Lambdas that were built by the GitLab pipeline on or after April 18. The same code still works fine in the Lambdas that were built and deployed before April 18.

The deployment packages for the old and new deployments are identical: same node module versions and configs.

Has a recent update somewhere affected this? TIA.
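
The handler-level TypeError on 'close' is consistent with cleanup code calling .close() on a browser variable that was never assigned, which would replace the underlying launch or navigation error. The handler source is minified and not shown here, so this is an assumption. A minimal sketch of a guarded cleanup, where `makeHandler` and the injected `launch` function are hypothetical names used only for illustration:

```javascript
// Hypothetical factory: `launch` is any async function that resolves to a
// Playwright Browser. It is injected so the sketch does not depend on a
// particular launcher package.
function makeHandler(launch) {
  return async function handler(event) {
    let browser; // stays undefined if launch() throws
    try {
      browser = await launch();
      const context = await browser.newContext();
      const page = await context.newPage();
      await page.goto(event.url, { waitUntil: 'load' });
      return { statusCode: 200, body: await page.title() };
    } finally {
      // Guard the cleanup: an exception thrown inside `finally` would
      // replace the original error with
      // "Cannot read properties of undefined (reading 'close')".
      if (browser) {
        await browser.close();
      }
    }
  };
}
```

With playwright-aws-lambda this could be wired up as `exports.handler = makeHandler(() => require('playwright-aws-lambda').launchChromium());`. The guard does not fix the page.goto failure itself; it only lets the original error surface instead of the TypeError.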

@alecmocatta

alecmocatta commented Apr 21, 2024

We saw this when the runtime version (Node.js 18.x, x86_64) was automatically updated from

arn:aws:lambda:us-east-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9

to

arn:aws:lambda:us-east-1::runtime:b475b23763329123d9e6f79f51886d0e1054f727f5b90ec945fcb2a3ec09afdd

Switching the runtime update mode to manual and using the former ARN fixed it for us.

Related:

@PrinceT078
Author

@alecmocatta Thank you! That helped.

@hdformat

hdformat commented Apr 24, 2024

For someone using ap-northeast-2:

arn:aws:lambda:ap-northeast-2::runtime:464db84005d4f68e67934a6df703e04e6e8782a50ec2882673a8e1c996ab814a

@lugfug

lugfug commented Apr 28, 2024

I can confirm that this issue is affecting the us-west-1 region.

Everything was working perfectly fine up to nodejs:18.v26 with our code running the playwright-aws-lambda module up until 04-20-2024, with:
INIT_START Runtime Version: nodejs:18.v26 Runtime Version ARN: arn:aws:lambda:us-west-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9

On 04-24-2024 I noticed our code was experiencing the "goto" issues others have described, like this error:
2024-04-27T22:15:23.989Z 7a9520ec-54ae-4f40-b7d0-ff7e7abd69c5 INFO Error details: Error: page.goto: Page closed
=========================== logs ===========================
navigating to "https://ipv4.icanhazip.com/", waiting until "load"
============================================================

It would not matter what site I tried to "goto" the error would always get thrown.

I had a devil of a time trying to diagnose the issue.

I even containerized my node.js code (in a Docker image), which presented a whole series of other, unrelated issues...

In a last-ditch effort for some sort of sanity-check, I came here and found that I was not alone in this issue!!!

I can confirm that the runtime nodejs:18.v28 absolutely borked the "goto" functionality of playwright-aws-lambda.
AVOID THIS VERSION:
INIT_START Runtime Version: nodejs:18.v28 Runtime Version ARN: arn:aws:lambda:us-west-1::runtime:b475b23763329123d9e6f79f51886d0e1054f727f5b90ec945fcb2a3ec09afdd
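
To check which runtime version a function actually started on, the INIT_START lines quoted above can be parsed from the function's logs. This is a rough sketch that assumes the line format shown in this thread; `parseInitStart`, `isReportedBroken`, and the `REPORTED_BROKEN` list are hypothetical helpers, and the list contains only versions that commenters here reported as failing:

```javascript
// Matches the INIT_START line format quoted in this thread. The separator
// between the two fields may be a space or a tab, hence \s+.
const INIT_START_RE =
  /INIT_START Runtime Version: (\S+)\s+Runtime Version ARN: (arn:aws:lambda:([a-z0-9-]+)::runtime:([0-9a-f]+))/;

// Runtime versions that commenters in this thread reported as breaking
// page.goto(). This is an illustration, not an authoritative list.
const REPORTED_BROKEN = new Set(['nodejs:18.v28', 'nodejs:16.v35']);

// Returns the parsed fields, or null if the line is not an INIT_START line.
function parseInitStart(line) {
  const match = INIT_START_RE.exec(line);
  if (!match) return null;
  const [, runtimeVersion, runtimeVersionArn, region, digest] = match;
  return { runtimeVersion, runtimeVersionArn, region, digest };
}

// True when the parsed runtime version is one of the reported-bad versions.
function isReportedBroken(parsed) {
  return parsed !== null && REPORTED_BROKEN.has(parsed.runtimeVersion);
}
```

Feeding each log line through `parseInitStart` and alerting when `isReportedBroken` returns true would flag an automatic runtime update before the navigation errors need to be diagnosed from scratch.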

My solution was to edit the AWS Lambda Function Runtime Settings and manually enter the old arn that was previously working, aka nodejs:18.v26,
arn:aws:lambda:us-west-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9

This is a temporary solution.
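
The console change described above can also be scripted. The sketch below assumes the AWS SDK for JavaScript v3 package `@aws-sdk/client-lambda` is installed and that the caller is allowed to call PutRuntimeManagementConfig; `buildPinParams` and `pinRuntime` are hypothetical helper names, and the function name in the usage note is a placeholder:

```javascript
// Builds the request parameters for Lambda's PutRuntimeManagementConfig
// API, pinning the function to one specific runtime version ARN.
function buildPinParams(functionName, runtimeVersionArn) {
  if (!/^arn:aws:lambda:[a-z0-9-]+::runtime:[0-9a-f]+$/.test(runtimeVersionArn)) {
    throw new Error(`Not a runtime version ARN: ${runtimeVersionArn}`);
  }
  return {
    FunctionName: functionName,
    UpdateRuntimeOn: 'Manual', // stop automatic runtime updates
    RuntimeVersionArn: runtimeVersionArn,
  };
}

// Sends the request. The SDK is required lazily so that buildPinParams
// can be used (and tested) without the SDK installed.
async function pinRuntime(functionName, runtimeVersionArn) {
  const {
    LambdaClient,
    PutRuntimeManagementConfigCommand,
  } = require('@aws-sdk/client-lambda');
  const params = buildPinParams(functionName, runtimeVersionArn);
  // The region is the fourth colon-separated field of the ARN.
  const region = runtimeVersionArn.split(':')[3];
  const client = new LambdaClient({ region });
  return client.send(new PutRuntimeManagementConfigCommand(params));
}
```

For example, `pinRuntime('my-function', 'arn:aws:lambda:us-west-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9')` would request the nodejs:18.v26 runtime reported as working in this comment. The runtime ARN's region has to match the region the function is deployed in.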

We all need to find a way to update this Module so that it is compatible with the current versions of the AWS node.js runtime.

How can I help???

@PrinceT078
Author

Hi

FYI, the same issue started happening with nodejs:16.v35, so we are going back to the last working version, nodejs:16.v33.

Any permanent solutions for this?

@PrinceT078 changed the title from "page.goto() throwing error continuously in AWS Lambda" to "page.goto() throwing error continuously in AWS Lambda with playwright-aws-lambda" on Apr 30, 2024
@StevenSawtelle
going into the AWS console and switching this also resolved the problem for us (although we are on us-west-2, but otherwise same).

however, our project is terraform managed, and from what i can tell, there is no way to specify a specific nodejs runtime arn as part of terraform, although there looks to be a request for that here. curious to know if anyone has a solution for this im not aware of, as otherwise this prevents our deploy flow from working without significant manual steps

@lugfug

lugfug commented Apr 30, 2024

I completely resolved my issue with a pretty dramatic solution.

Unfortunately, it won't be the ideal solution for everyone.

Since the underlying issue seems to be a compatibility conflict with the playwright-aws-lambda version 0.10.0 and the updated AWS (Lambda) runtime environment, which we don't have control over, the writing is on the wall.

AWS node.js runtime updates will, most likely, continue to be problematic for the playwright-aws-lambda module for the foreseeable future.

My Solution was to migrate our existing node.js application code into a Docker image to power an AWS Lambda Function from an AWS ECR image.

The original code, before the version conflict problems, was written for the AWS version of node.js 18, and it stopped working when the AWS Lambda Function node.js runtime was updated to nodejs:18.v28.

I accepted the challenge of containerizing my application code, which was a long overdue item on my to-do list.

Given that the container configuration would be fully isolated from the AWS data center, I decided to fully upgrade to node.js 20 and the most current versions of all modules.

Before you ask, NO playwright-aws-lambda still doesn't work in the container 😢

I migrated my code to work with node.js 20 and the Full Version of Playwright.

This is totally worth the effort.

If you have your AWS Lambda Function powered by an image, stored in the AWS ECR (registry) then your image can be up to 10 GB in size before Lambda complains.

This is awesome because you have a lot of storage overhead as your code starts to expand its module requirements.

The biggest advantage of containerizing your code is that YOU are FULLY in control of the runtime environment which will NEVER change unless YOU make the change to the dependencies or the underlying runtime software!

A quick word of caution...

If your code is going to run on an AWS Lambda Function with an x86 CPU, YOU MUST BUILD THE DOCKER IMAGE ON AN x86 CPU LOCALLY.

NO, you can't build the Docker image on Apple silicon or a desktop ARM CPU and run it on an AWS Lambda Function, even if the function is configured to run with an ARM CPU.

The Amazon Graviton2 ARM CPU used in AWS is missing multiple instruction sets that a desktop ARM CPU has.
Building software on a desktop ARM CPU is NOT guaranteed to work on an Amazon Graviton2 ARM-based CPU.

TRUST ME, this was a big issue for me because I'm 100% a Mac guy with mostly Apple silicon in all my devices.

You can write your code in VS Code on an M1, M2, M3, M4 Mac with no problem.

THEN you MUST build the Docker image on an Intel (Mac) CPU.

I just sync a folder between my M1 laptop and my Intel Mac, building the docker image (and pushing it to ECR) with the Intel Mac.

I will include my version of Dockerfile and package.json below.

You should ONLY use them for research purposes and DO NOT use them without adapting to your environment.

I wish you all well and hope this information is helpful.

I can now confirm that my code is running faster and more stable than it ever has since I containerized it in a Docker image.

BEST OF LUCK!

Lugfug

Dockerfile Example:

# FILE NAME: Dockerfile

# LOCATION: /lambda_function

# PROJECT NAME: YOUR_PROJECT_NAME

# DEPLOYMENT ENVIRONMENT: AWS Lambda Function

# ARCHITECTURE: x86_64

# AWS LAMBDA FUNCTION NAME: YOUR_FUNCTION_NAME

# DESCRIPTION:
# Dockerfile for Lambda Function
# Builds a custom runtime environment on AWS Lambda node.js base image.
# Installs dependencies, copies source code, and sets the Lambda handler.

# This Dockerfile defines the custom runtime environment for the Lambda function.
# It includes the base OS, installation of necessary dependencies, setting of 
# environment variables, and configuration of the file system and entry point 
# for the Lambda function. This ensures that the Lambda function has everything 
# it needs to execute in the AWS cloud environment.

# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# VERY IMPORTANT NOTE:
# This Project Code MUST be deployed on an AWS Lambda Function with x86_64 CPU architecture!
# This Dockerfile MUST be Built on a computer with an x86_64 CPU!
# You may edit these files on a Mac with an Apple Silicon (M1, M2, M3) processor...
# HOWEVER, you MUST build the Dockerfile on a computer with x86 CPU architecture, aka on a MacPro 2013, Windows, Ubuntu.
# Don't use a Virtual Machine if it's on an ARM CPU either.
# You CAN use Proxmox VM as long as Proxmox is running on an x86 CPU.
# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

# Working with Lambda container images
# https://docs.aws.amazon.com/lambda/latest/dg/images-create.html

# Deploy Node.js Lambda functions with container images
# https://docs.aws.amazon.com/lambda/latest/dg/nodejs-image.html#nodejs-image-instructions

# Package management tool: Amazon Linux 2023
# https://docs.aws.amazon.com/linux/al2023/ug/package-management.html

# Amazon Linux 2023 RPM packages as of the 2023.3.20240219 release
# https://docs.aws.amazon.com/linux/al2023/release-notes/all-packages-AL2023.3.html

# AWS base images for NodeJS
# https://gallery.ecr.aws/lambda/nodejs

# Node.js 20 (Amazon Linux 2023 "AL2023")
FROM public.ecr.aws/lambda/nodejs:20.2024.04.24.10-x86_64

# Install Dependencies for AL2023 to run Playwright
RUN dnf -y install \
    nss \
    dbus \
    atk \
    cups \
    at-spi2-atk \
    libdrm \
    libXcomposite \
    libXdamage \
    libXfixes \
    libXrandr \
    mesa-libgbm \
    pango \
    alsa-lib

# Copy package*.json to the Lambda task Root directory
COPY package*.json ${LAMBDA_TASK_ROOT}/

# Copy all files in ./lambda_function to the Lambda task Root directory
# NOTE: The .dockerignore file is explicitly ignoring the `node_modules` directory. 
COPY lambda_function/ ${LAMBDA_TASK_ROOT}

# Set Correct File Permissions Before Building the Docker Image
RUN chmod 755 -R ${LAMBDA_TASK_ROOT}

# Run npm install to install all the dependencies
RUN npm install

# Set the path where Playwright should install Chromium
ENV PLAYWRIGHT_BROWSERS_PATH=/var/task

# Install Playwright and specific browser binaries
RUN npx playwright install chromium

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "index.handler" ]

package.json:

{
  "name": "function_name",
  "version": "0.0.1463",
  "date": "04-29-2024",
  "state": "(In Development)",
  "description": "This is an AWS Lambda Function that connects to a URL and extracts the data from the webpage.",
  "note": "This package.json file must be in the lambda_function directory to run the zip command or any of the scripts",
  "main": "index.js",
  "scripts": {
  },
  "keywords": [],
  "author": "ME",
  "license": "ISC",
  "dependencies": {
    "@mozilla/readability": "^0.5.0",
    "adm-zip": "^0.5.12",
    "aws-sdk": "^2.1608.0",
    "axios": "^1.6.8",
    "cheerio": "^1.0.0-rc.12",
    "fingerprint-generator": "^2.1.50",
    "fingerprint-injector": "^2.1.50",
    "generative-bayesian-network": "^2.1.50",
    "got-scraping": "^4.0.5",
    "header-generator": "^2.1.50",
    "https-proxy-agent": "^7.0.4",
    "jsdom": "^24.0.0",
    "playwright": "^1.43.1",
    "playwright-aws-lambda": "^0.10.0",
    "playwright-chromium": "^1.43.1",
    "playwright-core": "^1.43.1",
    "@playwright/browser-chromium": "^1.43.1",
    "top-user-agents": "^2.1.20",
    "tslib": "^2.6.2",
    "turndown": "^7.1.3",
    "unique-random-array": "^3.0.0"
  }
}
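
The Dockerfile above sets CMD to index.handler, but the handler itself is not shown. A minimal sketch of what such an index.js could look like with the full playwright package follows. The launch flags are an assumption based on commonly used Chromium switches for restricted container environments, not the author's actual settings, and `createHandler` and `LAUNCH_OPTIONS` are hypothetical names:

```javascript
// Assumed Chromium launch options for a Lambda container. These flags are
// real Chromium switches, but whether each one is needed depends on the
// image and is not confirmed by this thread.
const LAUNCH_OPTIONS = {
  headless: true,
  args: ['--no-sandbox', '--disable-gpu', '--disable-dev-shm-usage'],
};

// `chromium` is Playwright's browser type object. It is injected so the
// sketch can be exercised without a browser installed.
function createHandler(chromium) {
  return async function handler(event) {
    const browser = await chromium.launch(LAUNCH_OPTIONS);
    try {
      const page = await browser.newPage();
      await page.goto(event.url, { waitUntil: 'load' });
      return { statusCode: 200, body: await page.content() };
    } finally {
      // Always release the browser, even when navigation fails.
      await browser.close();
    }
  };
}

// In index.js this would be exported as:
//   exports.handler = createHandler(require('playwright').chromium);
```

Because PLAYWRIGHT_BROWSERS_PATH is set with ENV in the Dockerfile, the same path is visible at runtime, so Playwright should look for the Chromium binary that `npx playwright install chromium` placed there during the build.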

@PrinceT078
Author

PrinceT078 commented May 1, 2024

@StevenSawtelle If you want to specify nodejs runtime arn through terraform give this a try. We are using the same and it works!

resource "null_resource" "my_lambda"{
 provisioner "local-exec" {
   command = "aws lambda put-runtime-management-config --function-name ARN** --update-runtime-on Manual --runtime-version-arn NodeARN**"
 }
}

ARN** - Your Lambda function ARN.
NodeARN** - The runtime version ARN to pin. For nodejs:18.v26 in us-east-1 = arn:aws:lambda:us-east-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9

@LinusU

LinusU commented May 1, 2024

Going from the latest (nodejs:18.v28 arn:aws:lambda:eu-north-1::runtime:b475b23763329123d9e6f79f51886d0e1054f727f5b90ec945fcb2a3ec09afdd) down to the previous (nodejs:18.v26 arn:aws:lambda:eu-north-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9) resolved this for us! 🚀

@SabatinoMasala

For anyone in the eu-central-1 region, I confirmed this ARN works:

arn:aws:lambda:eu-central-1::runtime:0cdcfbdefbc5e7d3343f73c2e2dd3cba17d61dea0686b404502a0c9ce83931b9

@StevenSawtelle
you're awesome, thanks for this idea! for others: i had to tweak this a decent bit to include an assume-role command before the put-runtime-management-config, because we have a whole aws codebuild pipeline with tight privileges and such involved, but this idea does work wonderfully.

hope this issue still gets resolved so we can go back to auto runtime instead of being stuck on this one, but i'm super happy we dont have to worry about this in the meantime :)

@PrinceT078
Author

Hi,

Is anyone facing a similar issue with Node.js v20?
After upgrading my Lambda to Node.js v20, I am getting a browser launch error: #78

@ryan-dutton

ryan-dutton commented Aug 12, 2024

After trying various approaches and running into "context or browser has been closed", sometimes consistently and sometimes intermittently, I decided to go with the recommendation from lugfug and build a custom container image.
Since the Playwright docs reference Debian bookworm, I decided to go with the flow and use that rather than an AWS base image.

My Dockerfile looks like this:

FROM node:20-bookworm-slim

# Define custom function directory
ARG FUNCTION_DIR="/function"

# Create the function directory
RUN mkdir -p ${FUNCTION_DIR}

# Fixes browser binaries not being found
ENV PLAYWRIGHT_BROWSERS_PATH=0

WORKDIR ${FUNCTION_DIR}

# Install build dependencies
RUN apt-get update && \
    apt-get install -y \
    g++ make cmake unzip libcurl4-openssl-dev poppler-utils \
    build-essential autoconf automake libtool m4 python3 libssl-dev

# Install Node.js dependencies
COPY package.json ${FUNCTION_DIR}

RUN npm update && npm install

# Install the runtime interface client
RUN npm install aws-lambda-ric [email protected] --save

# As per the Playwright documentation, we need to install the browsers
RUN npx -y [email protected] install --with-deps

# Required for Node runtimes which use npm@8.6.0+ because
# by default npm writes logs under /home/.npm and Lambda fs is read-only
ENV NPM_CONFIG_CACHE=/tmp/.npm

# Do this last so code changes don't cause a full rebuild
COPY index.js ${FUNCTION_DIR}

# Set runtime interface client as default command for the container runtime
# and for some reason /usr/local/bin/npx is a symbolic link to /usr/local/lib/node_modules/npm/bin/npx-cli.js
ENTRYPOINT ["/usr/local/lib/node_modules/npm/bin/npx-cli.js", "aws-lambda-ric"]
# Pass the name of the function handler as an argument to the runtime
CMD ["index.handler"]

8 participants