Epic DOMAINCLAW-1 (closed)
Mail-Hound Prototype: Domain Probe, Redirect Tracking, Contact Extraction
Description
Goal
Build a working end-to-end MVP flow for Mail-Hound.
The expected workflow is:
Input domains → Fast Precheck → Select domains → Deep Crawl → Review Results → Export CSV/JSON → Review Logs → Run locally and with Docker.
The MVP should allow a user to input a list of domains, quickly validate them, select valid domains for deeper crawling, extract crawl/contact data, and export the results in a structured format.
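A minimal sketch of the first two steps (input domains, fast precheck), assuming the precheck only needs to confirm that a hostname resolves in DNS. The names `normalize_domains`, `fast_precheck`, and the `resolver` parameter are hypothetical and not taken from the Mail-Hound codebase; the real precheck may also probe HTTP.

```python
import socket
from typing import Callable, Iterable


def normalize_domains(raw: str) -> list[str]:
    """Split pasted text into unique, lowercased hostnames, keeping input order."""
    seen: dict[str, None] = {}
    for token in raw.replace(",", "\n").split():
        host = token.lower()
        # If the user pasted a URL, drop the scheme and any path.
        if "://" in host:
            host = host.split("://", 1)[1]
        host = host.split("/", 1)[0].strip(".")
        if host:
            seen.setdefault(host, None)
    return list(seen)


def _dns_resolves(host: str) -> bool:
    """Default check (an assumption): the host passes if DNS resolution succeeds."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def fast_precheck(
    domains: Iterable[str],
    resolver: Callable[[str], bool] = _dns_resolves,
) -> dict[str, bool]:
    """Map each domain to True/False so the UI can offer only valid ones for deep crawl."""
    return {domain: resolver(domain) for domain in domains}
```

The resolver is injectable so the precheck can be exercised without network access, for example `fast_precheck(["example.com"], resolver=lambda host: True)`.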
Scope
This sprint focuses on stability, usability, and producing usable output.
Fully solving advanced anti-bot measures, TLS fingerprinting, and complex crawler-hardening cases is out of scope for this sprint; those improvements can be handled in a later sprint.
Expected Outcome
By the end of the sprint, the prototype should:
- Run from the Streamlit UI.
- Accept multiple domains as input.
- Perform a fast precheck for each domain.
- Allow the user to select domains for deep crawling.
- Crawl selected domains and extract pages, redirects, and contacts.
- Export CSV and JSON files per run.
- Write logs per run.
- Run locally.
- Run with Docker Compose.
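As an illustration of the contact-extraction outcome above, the following sketch pulls email addresses out of a fetched page with a regular expression. It assumes the page HTML is already available as a string; `EMAIL_RE` and `extract_emails` are hypothetical names, and the pattern is deliberately simple rather than a full RFC 5322 matcher.

```python
import re
from html import unescape
from urllib.parse import unquote

# Simple pattern (an assumption): local part, "@", then at least two dot-separated labels.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(html: str) -> list[str]:
    """Return the unique, lowercased email addresses found in the page, sorted."""
    # Decode HTML entities (e.g. "&#64;") and percent-escapes (e.g. "%40")
    # so lightly obfuscated addresses are still matched.
    text = unquote(unescape(html))
    return sorted({match.group(0).lower() for match in EMAIL_RE.finditer(text)})
```

A pattern this loose can also match asset names such as `logo@2x.png`, so a real crawler would probably need a filter on top-level domains or file extensions before results are exported.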
Execution Priority
1. Fast Precheck and Domain Selection
2. Deep Crawl Selected Domains and Extract Emails
3. Export CSV/JSON and Per-Run Logging
4. Verify Local and Docker Run
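The export and per-run logging item could look roughly like the sketch below, which writes each run into its own directory. The layout (`results.csv`, `results.json`, `run.log`), the logger name, and the function `export_run` are assumptions made for illustration, not the project's actual structure.

```python
import csv
import json
import logging
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def export_run(rows: list[dict], base_dir: Path, run_id: Optional[str] = None) -> Path:
    """Write rows to <base_dir>/<run_id>/ as CSV and JSON, with a log file for the run."""
    run_id = run_id or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    run_dir = Path(base_dir) / run_id
    run_dir.mkdir(parents=True, exist_ok=True)

    # One logger and one file handler per run, so log files never mix runs.
    logger = logging.getLogger(f"mailhound.{run_id}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(run_dir / "run.log", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    try:
        # Use the union of all keys as CSV columns; missing values become empty cells.
        fields = sorted({key for row in rows for key in row})
        with open(run_dir / "results.csv", "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fields)
            writer.writeheader()
            writer.writerows(rows)
        (run_dir / "results.json").write_text(
            json.dumps(rows, indent=2), encoding="utf-8"
        )
        logger.info("exported %d rows to %s", len(rows), run_dir)
    finally:
        # Detach and close the handler so the log file is flushed and released.
        logger.removeHandler(handler)
        handler.close()
    return run_dir
```

Keeping every artifact of a run under one timestamped directory makes it straightforward to mount a single output folder as a volume when running under Docker Compose.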
Final Deliverable
A usable Mail-Hound MVP that can run end-to-end and produce real crawl data, contact/email extraction results, export files, logs, and a basic Docker deployment setup.