Let's start crawling real web pages! For these remaining steps, you'll need a website you can crawl, preferably a small one with fewer than 100 pages so the crawling doesn't take all day. You can use my personal blog, https://wagslane.dev, if you don't have another in mind.
Create a `crawlPage` function in `crawl.js`. For now, it will just take a base URL (the root of the site we're going to crawl).
For now, your function should:

- Use `fetch` to fetch the webpage at the `baseURL`
- If the HTTP status code is an error-level code (400+), print an error and return
- If the response `content-type` header isn't `text/html`, print an error and return
- Otherwise, just print the HTML body as a string and be done
Remember to use `try/catch` as appropriate for anything that could result in an error!
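If you'd like a starting point, here's one way the steps above could be sketched, assuming Node 18+ (which provides a global `fetch`). The exact log messages and variable names are illustrative, not required:

```javascript
// crawl.js — a minimal sketch of crawlPage.
// Assumes Node 18+ for the built-in global fetch().
async function crawlPage(baseURL) {
  try {
    const resp = await fetch(baseURL);

    // Error-level status codes are 400 and above
    if (resp.status >= 400) {
      console.log(`error in fetch with status code: ${resp.status} on page: ${baseURL}`);
      return;
    }

    // Bail out if the response isn't HTML
    const contentType = resp.headers.get('content-type');
    if (!contentType || !contentType.includes('text/html')) {
      console.log(`non-html response, content type: ${contentType}, on page: ${baseURL}`);
      return;
    }

    // For now, just print the HTML body as a string
    console.log(await resp.text());
  } catch (err) {
    // Network failures, DNS errors, etc. land here
    console.log(`error in fetch: ${err.message}, on page: ${baseURL}`);
  }
}
```

Note the `includes('text/html')` check rather than a strict equality: the header usually arrives as something like `text/html; charset=utf-8`, so an exact comparison would reject valid HTML pages.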
Import `crawlPage` into your `main` function, and call it with the `baseURL` passed in and an empty object. Give your program a shot! It should print some HTML that it fetched from the internet!
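The wiring in `main` might look something like the following sketch. In your project, `crawlPage` would be imported from `crawl.js`; it's stubbed inline here so the snippet runs standalone, and the argument handling is one reasonable approach, not the only one:

```javascript
// main.js (sketch) — in the real project, import crawlPage from crawl.js;
// a stub is defined here so this snippet is self-contained.
async function crawlPage(baseURL) {
  console.log(`would crawl: ${baseURL}`);
}

async function main() {
  // Expect exactly one CLI argument: the base URL to crawl
  const args = process.argv.slice(2);
  if (args.length < 1) {
    console.log('no website provided');
    return;
  }
  const baseURL = args[0];
  console.log(`starting crawl of: ${baseURL}`);

  // The empty object will later track which pages we've visited
  await crawlPage(baseURL, {});
}

main();
```

Passing the extra empty object now is harmless (JavaScript ignores surplus arguments), and it sets up the signature you'll grow into once the crawler starts tracking visited pages.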