filipopo/undetected-chromedriver-lambda

undetected-chromedriver-lambda

A minimal example of using undetected-chromedriver on AWS Lambda, based on docker-selenium-lambda

Quick start guide

These instructions are based on the instructions provided by AWS.

To deploy this on AWS you will need an ECR repository. You can create one with the following command

aws ecr create-repository --repository-name undetected-chromedriver-lambda --image-scanning-configuration scanOnPush=true --image-tag-mutability MUTABLE

Next, log Docker in to your ECR registry. The example uses the region us-east-1 and the AWS account ID 100000000000; be sure to replace both with your own

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 100000000000.dkr.ecr.us-east-1.amazonaws.com

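Next, get the image, either by pulling the prebuilt one

docker pull filipmania/undetected-chromedriver-lambda:latest

or by building it yourself from the repository root

docker build --platform linux/amd64 -t filipmania/undetected-chromedriver-lambda:latest .

and tag it for your ECR repository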

docker tag filipmania/undetected-chromedriver-lambda:latest 100000000000.dkr.ecr.us-east-1.amazonaws.com/undetected-chromedriver-lambda:latest

Now you are ready to push it

docker push 100000000000.dkr.ecr.us-east-1.amazonaws.com/undetected-chromedriver-lambda:latest
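The registry host and image URI in the login, tag, and push commands all follow the same pattern. As a minimal sketch (the helper name `ecr_image_uri` is made up for illustration and is not part of this repository), the URI can be assembled from the account ID, region, and repository name:

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build the ECR image URI used by the docker tag and docker push commands."""
    # The registry host is the same one passed to `docker login`.
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Placeholder account ID and region from the quick start; replace with your own.
uri = ecr_image_uri("100000000000", "us-east-1", "undetected-chromedriver-lambda")
print(uri)
```

Swapping in your own account ID and region gives the exact value to use in each of the three commands.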

After this, you can create a Lambda function from the image URI, which is 100000000000.dkr.ecr.us-east-1.amazonaws.com/undetected-chromedriver-lambda:latest. You will need to raise the Lambda timeout above the default of 3 seconds.
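If you prefer to script this step, the sketch below assembles the arguments that boto3's Lambda `create_function` call accepts for a container image. The function name, the role ARN, and the 60 second timeout are assumptions chosen for illustration; use an execution role that exists in your account and a timeout that suits your pages.

```python
def create_function_kwargs(image_uri: str, role_arn: str, timeout_seconds: int = 60) -> dict:
    """Assemble keyword arguments for creating a Lambda function from a container image."""
    return {
        "FunctionName": "undetected-chromedriver-lambda",  # assumed name
        "PackageType": "Image",            # deploy from a container image, not a zip
        "Code": {"ImageUri": image_uri},   # the image pushed to ECR
        "Role": role_arn,                  # execution role ARN (hypothetical below)
        "Timeout": timeout_seconds,        # must exceed the 3 second default
    }


kwargs = create_function_kwargs(
    "100000000000.dkr.ecr.us-east-1.amazonaws.com/undetected-chromedriver-lambda:latest",
    "arn:aws:iam::100000000000:role/lambda-execution-role",  # hypothetical role
)
print(kwargs)
```

With boto3 installed and credentials configured, these could be passed as `boto3.client("lambda").create_function(**kwargs)`.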

Advanced usage

To go beyond the minimal example, change the code, rebuild the image, and then run the container

Dynamic url

Change the chrome.get(...) call in main.py to chrome.get(event.get('url')), then build and run the container

docker build --platform linux/amd64 -t filipmania/undetected-chromedriver-lambda:latest .
docker run -p 9000:8080 filipmania/undetected-chromedriver-lambda:latest

Then invoke the local endpoint exposed by the amazon/aws-lambda-python base image. Any JSON passed with -d becomes the function's event

curl 'http://localhost:9000/2015-03-31/functions/function/invocations' -d '{"url": "https://example.com"}'
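The same invocation can be made from Python using only the standard library. This is a sketch rather than part of the repository: the helper names `build_invocation` and `invoke` are made up, and `invoke` only works while the container from the previous step is running on port 9000.

```python
import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invocation(event: dict) -> urllib.request.Request:
    """Wrap the event as a JSON body; supplying data makes this a POST request."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(LOCAL_ENDPOINT, data=body)


def invoke(event: dict) -> str:
    """Send the event to the running container and return the raw response text."""
    with urllib.request.urlopen(build_invocation(event)) as response:
        return response.read().decode("utf-8")


request = build_invocation({"url": "https://example.com"})
print(request.get_method(), request.full_url)
```

Calling `invoke({"url": "https://example.com"})` then plays the role of the curl command above.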

You can also supply a fallback, such as chrome.get(event.get('url', 'https://example.com')), in which case the function can be invoked with an empty event

curl 'http://localhost:9000/2015-03-31/functions/function/invocations' -d '{}'
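One detail worth knowing: `event.get('url', default)` only falls back when the key is absent, so an event such as `{"url": null}` would still pass `None` to the browser. The sketch below is a slightly stricter variant; `target_url` and `DEFAULT_URL` are illustrative names, not code from this repository.

```python
DEFAULT_URL = "https://example.com"


def target_url(event) -> str:
    """Return the url from the event, falling back when it is missing, None, or empty."""
    # `event or {}` also covers the case where no event is passed at all.
    return (event or {}).get("url") or DEFAULT_URL


print(target_url({"url": "https://example.org"}))
print(target_url({}))
```

Inside the handler this would be used as `chrome.get(target_url(event))`.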

Local execution

Add the following to the end of main.py, making sure os is imported at the top of the file

if __name__ == '__main__' and os.getenv('AWS_LAMBDA_FUNCTION_NAME') is None:
    print(handler())

Now you can run the function outside of Lambda

docker build --platform linux/amd64 -t filipmania/undetected-chromedriver-lambda:latest .
docker run --entrypoint python filipmania/undetected-chromedriver-lambda:latest main.py

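You can add a dynamic url to this setup by building the event from the command-line arguments (sys must be imported) and passing it to handler, with something like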

event = {'url': sys.argv[1]} if len(sys.argv) > 1 else {}
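To make the command-line handling easy to check in isolation, the one-liner can be wrapped in a small function. This is a sketch under the assumption that the handler accepts the event as its first argument; `build_event` is a made-up helper name.

```python
import sys


def build_event(argv: list) -> dict:
    """Turn command-line arguments into a Lambda-style event dict."""
    # argv[0] is the script name, so the url (if any) is argv[1].
    return {"url": argv[1]} if len(argv) > 1 else {}


# In main.py this would feed the handler, e.g. print(handler(build_event(sys.argv)))
print(build_event(["main.py", "https://example.com"]))
print(build_event(["main.py"]))
```

The url would then be appended to the run command, for example `docker run --entrypoint python filipmania/undetected-chromedriver-lambda:latest main.py https://example.com`.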

Thanks to

https://github.com/umihico/docker-selenium-lambda

https://github.com/ultrafunkamsterdam/undetected-chromedriver

Unknown ghost from issue 2
