Commit 6534ece

Merge pull request #1532 from unclecode/fix/update-documentation
Standardize C4A-Script tutorial, add CLI identity-based crawling, and add sponsorship CTA
2 parents 89e28d4 + 7dfe528 commit 6534ece

7 files changed: 55 additions, 19 deletions


docs/examples/c4a_script/tutorial/README.md

Lines changed: 5 additions & 5 deletions
@@ -18,7 +18,7 @@ A comprehensive web-based tutorial for learning and experimenting with C4A-Scrip

 2. **Install Dependencies**
 ```bash
-pip install flask
+pip install -r requirements.txt
 ```

 3. **Launch the Server**
@@ -28,7 +28,7 @@ A comprehensive web-based tutorial for learning and experimenting with C4A-Scrip

 4. **Open in Browser**
 ```
-http://localhost:8080
+http://localhost:8000
 ```

 **🌐 Try Online**: [Live Demo](https://docs.crawl4ai.com/c4a-script/demo)
@@ -325,7 +325,7 @@ Powers the recording functionality:
 ### Configuration
 ```python
 # server.py configuration
-PORT = 8080
+PORT = 8000
 DEBUG = True
 THREADED = True
 ```
@@ -343,9 +343,9 @@ THREADED = True
 **Port Already in Use**
 ```bash
 # Kill existing process
-lsof -ti:8080 | xargs kill -9
+lsof -ti:8000 | xargs kill -9
 # Or use different port
-python server.py --port 8081
+python server.py --port 8001
 ```

 **Blockly Not Loading**

docs/examples/c4a_script/tutorial/server.py

Lines changed: 1 addition & 1 deletion
@@ -216,7 +216,7 @@ def get_examples():
 'name': 'Handle Cookie Banner',
 'description': 'Accept cookies and close newsletter popup',
 'script': '''# Handle cookie banner and newsletter
-GO http://127.0.0.1:8080/playground/
+GO http://127.0.0.1:8000/playground/
 WAIT `body` 2
 IF (EXISTS `.cookie-banner`) THEN CLICK `.accept`
 IF (EXISTS `.newsletter-popup`) THEN CLICK `.close`'''

docs/md_v2/advanced/identity-based-crawling.md

Lines changed: 36 additions & 0 deletions
@@ -82,6 +82,42 @@ If you installed Crawl4AI (which installs Playwright under the hood), you alread

 ---

+### Creating a Profile Using the Crawl4AI CLI (Easiest)
+
+If you prefer a guided, interactive setup, use the built-in CLI to create and manage persistent browser profiles.
+
+1. Launch the profile manager:
+```bash
+crwl profiles
+```
+
+2. Choose "Create new profile" and enter a profile name. A Chromium window opens so you can log in to sites and configure settings. When finished, return to the terminal and press `q` to save the profile.
+
+3. Profiles are saved under `~/.crawl4ai/profiles/<profile_name>` (for example: `/home/<you>/.crawl4ai/profiles/test_profile_1`) along with a `storage_state.json` for cookies and session data.
+
+4. Optionally, choose "List profiles" in the CLI to view available profiles and their paths.
+
+5. Use the saved path with `BrowserConfig.user_data_dir`:
+```python
+from crawl4ai import AsyncWebCrawler, BrowserConfig
+
+profile_path = "/home/<you>/.crawl4ai/profiles/test_profile_1"
+
+browser_config = BrowserConfig(
+    headless=True,
+    use_managed_browser=True,
+    user_data_dir=profile_path,
+    browser_type="chromium",
+)
+
+async with AsyncWebCrawler(config=browser_config) as crawler:
+    result = await crawler.arun(url="https://example.com/private")
+```
+
+The CLI also supports listing and deleting profiles, and even testing a crawl directly from the menu.
+
+---
+
 ## 3. Using Managed Browsers in Crawl4AI

 Once you have a data directory with your session data, pass it to **`BrowserConfig`**:
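
Reviewer sketch for the profile location added in step 3 of this hunk: rather than hard-coding `/home/<you>/...`, the path could be built from the user's home directory. This assumes the `~/.crawl4ai/profiles/<profile_name>` layout stated in the added text; `profile_dir` is a hypothetical helper for illustration and is not part of Crawl4AI.

```python
from pathlib import Path
from typing import Optional


def profile_dir(profile_name: str, base: Optional[Path] = None) -> Path:
    """Return the directory where a named profile is expected to live.

    Assumes the ~/.crawl4ai/profiles/<profile_name> layout described in the
    added docs; this helper is hypothetical, not a Crawl4AI API.
    """
    # Reject empty names and names containing path separators, so the
    # result always stays directly under the profiles root.
    if not profile_name or "/" in profile_name or "\\" in profile_name:
        raise ValueError(f"invalid profile name: {profile_name!r}")
    root = base if base is not None else Path.home() / ".crawl4ai" / "profiles"
    return root / profile_name


if __name__ == "__main__":
    path = profile_dir("test_profile_1")
    print(path)
    print("exists:", path.is_dir())
```

The resulting path, converted with `str(...)`, could then be passed as `user_data_dir` in the `BrowserConfig` call from step 5.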

docs/md_v2/apps/c4a-script/README.md

Lines changed: 5 additions & 5 deletions
@@ -18,7 +18,7 @@ A comprehensive web-based tutorial for learning and experimenting with C4A-Scrip

 2. **Install Dependencies**
 ```bash
-pip install flask
+pip install -r requirements.txt
 ```

 3. **Launch the Server**
@@ -28,7 +28,7 @@ A comprehensive web-based tutorial for learning and experimenting with C4A-Scrip

 4. **Open in Browser**
 ```
-http://localhost:8080
+http://localhost:8000
 ```

 **🌐 Try Online**: [Live Demo](https://docs.crawl4ai.com/c4a-script/demo)
@@ -325,7 +325,7 @@ Powers the recording functionality:
 ### Configuration
 ```python
 # server.py configuration
-PORT = 8080
+PORT = 8000
 DEBUG = True
 THREADED = True
 ```
@@ -343,9 +343,9 @@ THREADED = True
 **Port Already in Use**
 ```bash
 # Kill existing process
-lsof -ti:8080 | xargs kill -9
+lsof -ti:8000 | xargs kill -9
 # Or use different port
-python server.py --port 8081
+python server.py --port 8001
 ```

 **Blockly Not Loading**

docs/md_v2/apps/c4a-script/server.py

Lines changed: 2 additions & 2 deletions
@@ -216,7 +216,7 @@ def get_examples():
 'name': 'Handle Cookie Banner',
 'description': 'Accept cookies and close newsletter popup',
 'script': '''# Handle cookie banner and newsletter
-GO http://127.0.0.1:8080/playground/
+GO http://127.0.0.1:8000/playground/
 WAIT `body` 2
 IF (EXISTS `.cookie-banner`) THEN CLICK `.accept`
 IF (EXISTS `.newsletter-popup`) THEN CLICK `.close`'''
@@ -283,7 +283,7 @@ def get_examples():
     return jsonify(examples)

 if __name__ == '__main__':
-    port = int(os.environ.get('PORT', 8080))
+    port = int(os.environ.get('PORT', 8000))
     print(f"""
 ╔══════════════════════════════════════════════════════════╗
 ║ C4A-Script Interactive Tutorial Server ║
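
Reviewer sketch of the port lookup changed in this hunk: the `PORT` environment variable wins when set, otherwise the new default of 8000 applies. `resolve_port` and `DEFAULT_PORT` are hypothetical names, and the range check is an addition for illustration that does not appear in `server.py`.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 8000  # default introduced by this commit (previously 8080)


def resolve_port(env: Optional[Mapping[str, str]] = None,
                 default: int = DEFAULT_PORT) -> int:
    """Mirror int(os.environ.get('PORT', default)), plus a range check.

    Hypothetical helper; the range check is not in the original server.py.
    """
    source = os.environ if env is None else env
    port = int(source.get("PORT", default))
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


if __name__ == "__main__":
    print(resolve_port({}))                # falls back to the default
    print(resolve_port({"PORT": "8001"}))  # environment value overrides it
```

Assuming `port` is the value the server then binds to, running `PORT=8001 python server.py` would start it on 8001 without editing the file.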

docs/md_v2/core/c4a-script.md

Lines changed: 5 additions & 5 deletions
@@ -69,12 +69,12 @@ The tutorial includes a Flask-based web interface with:
 cd docs/examples/c4a_script/tutorial/

 # Install dependencies
-pip install flask
+pip install -r requirements.txt

 # Launch the tutorial server
-python app.py
+python server.py

-# Open http://localhost:5000 in your browser
+# Open http://localhost:8000 in your browser
 ```

 ## Core Concepts
@@ -111,8 +111,8 @@ CLICK `.submit-btn`
 # By attribute
 CLICK `button[type="submit"]`

-# By text content
-CLICK `button:contains("Sign In")`
+# By accessible attributes
+CLICK `button[aria-label="Search"][title="Search"]`

 # Complex selectors
 CLICK `.form-container input[name="email"]`

docs/md_v2/index.md

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@

 Crawl4AI is the #1 trending GitHub repository, actively maintained by a vibrant community. It delivers blazing-fast, AI-ready web crawling tailored for large language models, AI agents, and data pipelines. Fully open source, flexible, and built for real-time performance, **Crawl4AI** empowers developers with unmatched speed, precision, and deployment ease.

-> **Note**: If you're looking for the old documentation, you can access it [here](https://old.docs.crawl4ai.com).
+> Enjoy using Crawl4AI? Consider **[becoming a sponsor](https://github.com/sponsors/unclecode)** to support ongoing development and community growth!

 ## 🆕 AI Assistant Skill Now Available!
