A full-stack job search assistant that crawls postings from major tech companies, analyzes them with LLMs, parses your resume, and runs AI mock interviews — so you can focus on preparing, not sifting through job boards.
Four pieces wired into one flow: a crawler pulls postings from company career sites, an LLM reads each one for requirements and skills, your resume gets parsed and scored against them, and any posting can drive an AI mock interview straight off its job description. The React frontend ties it together, so you go from "what's out there" to "let me practice for this one" without leaving the app.
- Job crawler — pulls postings from Tencent, NetEase, ByteDance, Amazon and more, via API or Selenium, with automatic cleaning and normalization.
- LLM analysis — extracts education/major requirements, scores skill tags (1–5), and classifies each posting into a job taxonomy.
- Resume parsing & matching — parses PDF/Word resumes, scores skills, and computes a case-insensitive job-resume match percentage.
- AI mock interview — generates questions from any job description and runs a multi-turn interview with real-time feedback.
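
The case-insensitive match percentage mentioned above can be pictured with a minimal sketch. It assumes the score is simply the share of a posting's skill tags that also appear on the resume; the function name `match_percentage` is hypothetical, and the project's actual scoring (for example, weighting by the 1–5 skill scores) may differ.

```python
def match_percentage(job_skills, resume_skills):
    """Percentage of the job's skill tags found on the resume, ignoring case.

    Assumption: plain set overlap, with no weighting by skill score.
    """
    # Normalize both sides so "Python" and "python" count as the same tag.
    job = {s.strip().lower() for s in job_skills if s.strip()}
    resume = {s.strip().lower() for s in resume_skills if s.strip()}
    if not job:
        return 0.0  # a posting with no tags cannot be matched
    return 100.0 * len(job & resume) / len(job)


# Two of the four job tags appear on the resume.
print(match_percentage(["Python", "SQL", "Docker", "React"], ["python", "sql", "Go"]))  # 50.0
```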
FindJobs-Agent/
├── FrontEnd/ # React frontend
│ ├── src/
│ │ ├── components/ # Page components
│ │ │ ├── JobsPage.tsx # Job browsing
│ │ │ ├── ResumePage.tsx # Resume analysis
│ │ │ └── InterviewPage.tsx # AI interview
│ │ └── App.tsx
│ └── package.json
├── job_crawler_v2.py # Multi-company crawler (primary)
├── job_crawler_selenium.py # Selenium crawler
├── job_agent.py # LLM job analysis agent
├── pipeline.py # Data processing pipeline
├── api_server.py # Flask API server
├── interview_agent.py # AI interview module
├── resume_parser.py # Resume parser
├── tag_rate.py # Skill scoring
├── llm_client.py # LLM client
├── tech_taxonomy.json # Job taxonomy
├── all_labels.csv # Skill tag library
└── requirements.txt
- Python 3.9+
- Node.js 18+
- Chrome (required for Selenium crawler)
git clone https://github.com/he-yufeng/FindJobs-Agent.git
cd FindJobs-Agent
pip install -r requirements.txt

Create an API_key.md file with your OpenAI API key:
sk-your-api-key-here
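
For illustration, a key file in this shape can be read with a few lines of Python. This is a sketch, not the project's own loader: the helper name `load_api_key` is hypothetical, and it assumes the key sits alone on the first non-empty line of the file.

```python
from pathlib import Path


def load_api_key(path="API_key.md"):
    """Return the first non-empty line of the key file, stripped of whitespace.

    Assumption: the file holds only the key, as in the example above.
    """
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            return line
    raise ValueError(f"no API key found in {path}")
```

Keeping the key in a file like this means it should stay out of version control.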
python api_server.py
cd FrontEnd
npm install
npm run dev

Visit http://localhost:8080 in your browser.
pipeline.py chains crawl → analyze → score → serve. Run the whole thing, or a single stage:
python pipeline.py # crawl + analyze + build site data
python job_crawler_v2.py -c tencent netease amazon -m 300 # crawl only (--list shows companies)
python pipeline.py --analyze-only --max-jobs 50 # analyze only (for testing)

| Endpoint | Method | Description |
|---|---|---|
| `/api/jobs` | GET | List job postings |
| `/api/jobs/<id>` | GET | Get job details |
| `/api/resume/upload` | POST | Upload resume |
| `/api/resume/analyze` | POST | Analyze resume |
| `/api/interview/start` | POST | Start mock interview |
| `/api/interview/answer` | POST | Submit interview answer |

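
A small standard-library client for the endpoints above might look like the following sketch. The base URL is an assumption (Flask's default port 5000; check what `api_server.py` actually binds to), the helper names are hypothetical, and the request and response bodies are assumed to be JSON. The fields each endpoint expects are not documented here, so none are shown.

```python
import json
import urllib.request

# Assumption: Flask's default port. Adjust to match your api_server.py.
BASE_URL = "http://localhost:5000"


def api_url(path, base=BASE_URL):
    """Join the base URL and an endpoint path with exactly one slash."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def get_json(path):
    """Send a GET request and decode the JSON response body."""
    with urllib.request.urlopen(api_url(path)) as resp:
        return json.load(resp)


def post_json(path, payload):
    """Send a JSON-encoded POST request and decode the JSON response body."""
    req = urllib.request.Request(
        api_url(path),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the server running, `get_json("/api/jobs")` would fetch the job list, and `post_json` would cover the interview endpoints once you know the payload they expect. A resume upload most likely needs a multipart form rather than JSON, which this sketch does not handle.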
Crawl, analyze, resume match, and mock interview work end to end. The next steps widen the funnel and follow the hunt past the match:
- More job sources — extend the crawler beyond the current company set to job boards and aggregators, so matching isn't limited to a fixed list.
- Incremental crawls — track which postings were already seen and fetch only new ones, instead of re-crawling and re-analyzing the full set each run.
- Application tracking — a simple board for where each application stands (applied / replied / interview), so the tool follows the job hunt past the match step.
- Voice mock interviews — speech in and out for the AI interviewer, closer to a real screen than a text chat.
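
The incremental-crawl idea can be sketched as a fingerprint set that persists between runs. Nothing here exists in the project yet: the function names, the JSON state file, and the choice of `company`, `title`, and `url` as identifying fields are all assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path

# Assumption: these fields together identify a posting across crawls.
KEY_FIELDS = ("company", "title", "url")


def posting_key(posting):
    """Stable fingerprint of a posting, insensitive to case and padding."""
    raw = "|".join(str(posting.get(f, "")).strip().lower() for f in KEY_FIELDS)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def filter_new(postings, seen):
    """Return postings whose key is not in `seen`, and record their keys."""
    fresh = []
    for posting in postings:
        key = posting_key(posting)
        if key not in seen:
            seen.add(key)
            fresh.append(posting)
    return fresh


def load_seen(path):
    """Load the set of known keys, or an empty set on the first run."""
    p = Path(path)
    return set(json.loads(p.read_text(encoding="utf-8"))) if p.exists() else set()


def save_seen(path, seen):
    """Persist the known keys so the next run can skip them."""
    Path(path).write_text(json.dumps(sorted(seen)), encoding="utf-8")
```

Only the postings returned by `filter_new` would then go on to LLM analysis, which is where most of the per-run cost sits.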
FindJobs-Agent is one of the applied agents I've built. A few others you might find useful:
- CoreCoder — want to understand how a coding agent really works? Read the whole ~1k-line engine end to end, not a black box.
- RepoWiki — dropped into an unfamiliar codebase? It gives you a guided wiki and a where-to-start reading path, a self-hostable DeepWiki alternative.
- ContractGuard — catch the risky clauses before you sign: it reads contracts and flags the dangerous bits.
- GitSense — want to contribute to open source? It finds issues worth your time and gauges whether your PR will get merged.
- CodeABC — understand any codebase even if you don't code, built for non-programmers.
Issues and pull requests are welcome!
MIT License

