Screenshot webpages and write all text within the rendered page to a file

reallygoodprogrammer/screenshawty

screenshawty

A small Go application that takes a screenshot of a webpage and writes a list of the words that appear in the rendered text, using playwright-go. I use it to scrape visible page data from large sets of URLs.

install

# install playwright-go
go install github.com/playwright-community/playwright-go/cmd/playwright@<version>
playwright install --with-deps

# install screenshawty
go install github.com/reallygoodprogrammer/screenshawty@latest

example and usage

# pipe urls into screenshawty
# writes files to ./shawty_output/...
cat urls | screenshawty

Usage of screenshawty:
  -concurrency int
        concurrency for requests (default 5)
  -dir string
        directory to write data to (default "shawty_output")
  -help
        display help message
  -timeout float
        timeout in ms (default 10000)
  -wait-time int
        wait time before taking screenshot (default 2)
