This Python script automatically filters CV files based on keywords. It extracts text from PDFs, DOCX documents, and TXT files, checks for target keywords, and then sorts the CVs into two folders:
- match/ → CV contains at least one keyword
- do_not_match/ → CV does not contain any keywords
It also supports importing a LinkedIn applicant ZIP file automatically.
- Reads PDF, DOCX, and TXT CVs.
- Automatically extracts text.
- Keyword-based filtering (Python, Excel, Machine Learning, etc.).
- Creates output folders automatically.
- Supports LinkedIn applicant export ZIP files.
- Clean and fast filtering that works with large batches of CVs.
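The text-extraction feature listed above could look roughly like the sketch below. It is not the script's actual code: the function name `extract_text` is hypothetical, and the use of the third-party `pypdf` and `python-docx` packages is an assumption (check requirements.txt for the libraries the script really uses).

```python
from pathlib import Path


def extract_text(path):
    """Return the text of a TXT, PDF, or DOCX file; empty string for other types."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        # Ignore undecodable bytes so one odd file does not stop a batch
        return path.read_text(encoding="utf-8", errors="ignore")
    if suffix == ".pdf":
        from pypdf import PdfReader  # assumption: pypdf is installed
        reader = PdfReader(str(path))
        # extract_text() may yield nothing for image-only pages
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        from docx import Document  # assumption: python-docx is installed
        return "\n".join(p.text for p in Document(str(path)).paragraphs)
    return ""
```

The PDF and DOCX imports are placed inside their branches so that plain TXT files can be processed even if those packages are missing.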
pip install -r requirements.txt
1. Place your LinkedIn exported ZIP file as:
linkedin_applicants.zip
2. Run the script:
python Cv_filter.py
3. Results:
- Matched CVs → match/
- Not matched → do_not_match/
- Extracted files → cvs/
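The ZIP import and the sorting into the folders above can be sketched with the standard library alone. The function names `extract_applicants` and `sort_cv` are hypothetical, and copying files (rather than moving them) is an assumption; the actual script may behave differently.

```python
import shutil
import zipfile
from pathlib import Path


def extract_applicants(zip_path, dest="cvs"):
    """Unpack the applicant ZIP into dest and return the extracted file paths."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    # Collect files only, including any nested inside subfolders of the archive
    return sorted(p for p in dest_dir.rglob("*") if p.is_file())


def sort_cv(cv_path, is_match, match_dir="match", no_match_dir="do_not_match"):
    """Copy a CV into the match or no-match folder, creating the folder if needed."""
    target = Path(match_dir if is_match else no_match_dir)
    target.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(cv_path, target / Path(cv_path).name))
```

A call such as `extract_applicants("linkedin_applicants.zip")` would fill cvs/, after which each file can be passed to `sort_cv` together with the result of the keyword check.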
Edit the KEYWORDS list in Cv_filter.py to add or remove keywords as needed.
KEYWORDS = ["python", "excel", "data analysis"]
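As an illustration of how such a list might be applied, here is a minimal sketch that assumes case-insensitive substring matching. The function name `matches_keywords` is hypothetical and the script's real matching logic may differ.

```python
KEYWORDS = ["python", "excel", "data analysis"]


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if at least one keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)
```

With plain substring matching, "excel" also matches words such as "excellent"; a word-boundary regular expression would be stricter if that causes false positives.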