OnionDomainExtractor is a Python script designed to automate the search for and extraction of .onion domains from the dark web using the OnionSearch tool, published by megadose. This project enhances the capabilities of the OnionScanner project by streamlining the collection and classification of onion domain information, making it a valuable tool for researchers, cybersecurity experts, and privacy advocates.
- Concurrent Processing: Searches multiple keywords in parallel, reducing total search time compared with running the searches one after another.
- Automated Classification: Continuously checks and classifies domains based on HTML content, enabling a thorough exploration of relevant sites.
- Comprehensive Output: Generates detailed lists of found domains, categorizing them based on relevance to specified keywords.
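The concurrent keyword search can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `search_all`, `extract_onions`, and the `search_fn` callable are hypothetical names, and `search_fn` stands in for whatever invokes OnionSearch and returns its raw result text. The regular expression assumes v3 onion addresses (56 base32 characters).

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Assumption: only v3 onion addresses (56 base32 chars + ".onion") are of interest.
ONION_RE = re.compile(r"\b[a-z2-7]{56}\.onion\b")


def extract_onions(text):
    """Return the unique v3 .onion hostnames found in text, sorted."""
    return sorted(set(ONION_RE.findall(text.lower())))


def search_all(keywords, search_fn, max_workers=4):
    """Run search_fn(keyword) for every keyword in a thread pool.

    search_fn is a hypothetical callable that returns raw result text
    for one keyword (for example, the output of an OnionSearch run).
    """
    keywords = list(keywords)  # allow any iterable, keep a stable order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keywords.
        results = pool.map(search_fn, keywords)
        return {kw: extract_onions(raw) for kw, raw in zip(keywords, results)}
```

Threads suit this workload because each search spends most of its time waiting on the network rather than on the CPU.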
The script starts the Tor service to access the .onion network. It reads keywords from a specified text file and employs OnionSearch to find matching domains. After extraction, it uses curl to fetch the HTML content of these domains and searches for keyword matches. This loop continues, ensuring that new domains are regularly classified based on their content.
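The fetch-and-classify step described above might look something like the sketch below. The function names are hypothetical, and two details are assumptions rather than facts about the script: Tor is taken to be listening on its default SOCKS port (127.0.0.1:9050), and curl is pointed at that proxy with `--socks5-hostname` instead of being wrapped in torsocks. Matching is a plain case-insensitive substring test.

```python
import subprocess


def load_keywords(path):
    """Read one keyword per line from a text file, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def fetch_html(domain, timeout=60):
    """Fetch a hidden service's front page with curl through Tor.

    Assumption: Tor's SOCKS proxy is at 127.0.0.1:9050 (its default).
    Returns an empty string if curl is missing, fails, or times out.
    """
    cmd = [
        "curl", "--silent", "--max-time", str(timeout),
        "--socks5-hostname", "127.0.0.1:9050",  # resolve .onion names via Tor
        f"http://{domain}",
    ]
    try:
        proc = subprocess.run(
            cmd, capture_output=True, encoding="utf-8",
            errors="replace", timeout=timeout + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return proc.stdout if proc.returncode == 0 else ""


def matches_keywords(html, keywords):
    """Return the keywords that occur in html (case-insensitive substring)."""
    lowered = html.lower()
    return [kw for kw in keywords if kw.lower() in lowered]
```

A domain would then be classified by calling `matches_keywords(fetch_html(domain), load_keywords("Keywords.txt"))` and treating an empty list as "no match".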
- Cybersecurity Research: Analyze hidden services and gather intelligence on potential threats.
- Privacy Studies: Investigate the dark web's impact on personal privacy and anonymity.
- Market Research: Explore niche markets that exist within the dark web.
Before running the script, ensure you have the following dependencies installed:
- python3
- tor
- onionsearch
- torsocks
- curl (used to fetch the HTML content of each domain)
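No additional Python libraries need to be installed with pip for concurrency: concurrent.futures is part of the Python 3 standard library, so it does not belong in a requirements.txt file. OnionSearch itself is installed with pip in the steps below.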
- Install Tor:
sudo apt-get install tor
- Install OnionSearch:
pip3 install onionsearch
- Clone the repository:
git clone https://github.com/N4rr34n6/OnionDomainExtractor.git
cd OnionDomainExtractor
- Prepare the Keywords.txt file with relevant keywords.
- Run the script:
python3 onion_domain_extractor.py
- Follow the prompts to initiate the search with OnionSearch.
The output files generated by the script include:
- onion_domains_list.txt: A list of extracted .onion domains.
- domains_to_scan.txt: Domains classified as matching specified keywords.
- domains_no_match.txt: Domains that do not match the keywords.
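As an illustration of how the three files relate to each other, the sketch below writes them from a single mapping of domain to matched keywords. The function name, the shape of the mapping, and the one-domain-per-line layout are assumptions made for this example, not a description of the script's internals.

```python
from pathlib import Path


def write_results(out_dir, classified):
    """Write the three output lists, one domain per line.

    classified maps each domain to the list of keywords found in its
    HTML; an empty list means no match. (Hypothetical data shape.)
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    domains = sorted(classified)
    matched = [d for d in domains if classified[d]]
    unmatched = [d for d in domains if not classified[d]]
    for name, rows in (
        ("onion_domains_list.txt", domains),    # every extracted domain
        ("domains_to_scan.txt", matched),       # at least one keyword found
        ("domains_no_match.txt", unmatched),    # no keyword found
    ):
        text = "".join(f"{row}\n" for row in rows)
        (out / name).write_text(text, encoding="utf-8")
    return matched, unmatched
```

Under this layout, every domain in onion_domains_list.txt appears in exactly one of the other two files.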
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the LICENSE file for more details.
Contributions are welcome! Please feel free to open issues or submit pull requests.
- What if the script fails to find domains? Ensure the Tor service is running and that OnionSearch is correctly configured.
- Can I use custom keywords? Absolutely! Modify the Keywords.txt file to include any keywords relevant to your search.
For support or inquiries, please reach out to me via GitHub.