Epic DOMAINCLAW-1: Mail-Hound Prototype — Domain Probe, Redirect Tracking, Contact Extraction

Feature DOMAINCLAW-3: Export CSV/JSON and Per-Run Logging

Added by Redmine Admin about 1 month ago. Updated about 1 month ago.

Status: Closed
Priority: Normal
Assignee: Nguyen tuan kiet
Start date: 05/08/2026
Due date:
% Done: 100%
Estimated time:

Description

Objective

Run a deeper crawl only on the domains selected from the fast precheck result and extract useful crawl/contact data.

Description

Implement the deep crawl flow for selected domains.

The crawler should collect and display:

- Crawled pages
- Redirect events
- Contact information
- Extracted email addresses

The existing redirect rules should be respected, including:

- Recording cross-host redirects.
- Handling origin pages correctly when a soft redirect is detected.
- Continuing to scrape emails from the origin page if the origin still serves valid content.
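
The redirect rules above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names, the `kind` labels (`"http"`, `"soft"`), and the definition of "valid content" as an HTTP 200 response with a non-empty body are all assumptions made for the example. Hosts are compared exactly, so `www.example.com` and `example.com` would count as different hosts here.

```python
from urllib.parse import urlsplit


def is_cross_host(origin_url: str, target_url: str) -> bool:
    """Return True when the redirect target is on a different host than the origin."""
    origin_host = (urlsplit(origin_url).hostname or "").lower()
    target_host = (urlsplit(target_url).hostname or "").lower()
    return origin_host != target_host


def record_redirect(events: list, origin_url: str, target_url: str, kind: str) -> None:
    """Append one redirect event; `kind` is a hypothetical label such as 'http' or 'soft'."""
    events.append({
        "origin": origin_url,
        "target": target_url,
        "kind": kind,
        "cross_host": is_cross_host(origin_url, target_url),
    })


def should_scrape_origin(kind: str, origin_status: int, origin_body: str) -> bool:
    """Decide whether emails should still be scraped from the origin page.

    Assumption: "valid content" means HTTP 200 with a non-empty body.
    """
    return kind == "soft" and origin_status == 200 and bool(origin_body.strip())
```

Recording the `cross_host` flag on every event, rather than only storing cross-host redirects, keeps the redirect list complete while still making the cross-host cases easy to filter.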

Acceptance Criteria

- The user can start a deep crawl using the selected domains.
- The crawl result is displayed in the related UI tabs.
- The system shows data for:
  - Per-domain summary
  - Pages
  - Redirects
  - Contacts
- Extracted emails or contact records are shown when available.
- Redirects are recorded according to the existing redirect rules.
- The result reflects the domains selected by the user.

Definition of Done

- The deep crawl flow runs end-to-end without blocking errors.
- Crawl results are visible in the UI.
- Contact/email extraction works for valid pages.
- Redirect data is captured consistently.
- A crawl failure on one domain does not stop the entire run.
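
One way to meet the last point is to isolate each domain behind its own error boundary. The sketch below assumes a hypothetical `crawl_one(domain)` callable that returns a dict and may raise; neither that name nor the result shape comes from the ticket.

```python
from typing import Callable, Dict, Iterable


def crawl_all(domains: Iterable[str], crawl_one: Callable[[str], dict]) -> Dict[str, dict]:
    """Crawl every domain, recording failures instead of aborting the run."""
    results: Dict[str, dict] = {}
    for domain in domains:
        try:
            results[domain] = {"status": "ok", "data": crawl_one(domain)}
        except Exception as exc:
            # Any per-domain failure is captured so the remaining domains still run.
            results[domain] = {
                "status": "failed",
                "error": f"{type(exc).__name__}: {exc}",
            }
    return results
```

Keeping the failed domains in the result, with the exception type and message, also gives the UI and the exports something to show for them.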

Sub-task 3: Export CSV/JSON and Per-Run Logging

Type

Feature / Technical

Estimate

1 SP

Objective

Store every crawl run in a structured and traceable format so that users can review, debug, or share the results after the run is complete.

Description

Each run should generate a unique run ID.

For every run, the system should create an output folder using the following structure:

exports/<run_id>/

The following export files must be generated:

- summary.csv
- pages.csv
- redirects.csv
- contacts.csv
- results.json
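
The export step could look something like the sketch below. It is only an illustration under stated assumptions: the run ID format (UTC timestamp plus a short random suffix), the function names, and the shape of `results` (a dict with one list of row dicts per CSV name) are not specified in the ticket.

```python
import csv
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

CSV_NAMES = ("summary", "pages", "redirects", "contacts")


def new_run_id() -> str:
    """Build a run ID from a UTC timestamp plus a random suffix (assumed format)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stamp}-{uuid.uuid4().hex[:8]}"


def write_csv(path: Path, rows: list) -> None:
    """Write row dicts to CSV; the header is the union of keys in first-seen order."""
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        if fieldnames:
            writer.writeheader()
        writer.writerows(rows)


def export_run(results: dict, base_dir: Path = Path("exports")) -> Path:
    """Create exports/<run_id>/ and write the four CSV files plus results.json."""
    run_dir = base_dir / new_run_id()
    # exist_ok=False makes an accidental run ID collision fail loudly.
    run_dir.mkdir(parents=True, exist_ok=False)
    for name in CSV_NAMES:
        write_csv(run_dir / f"{name}.csv", results.get(name, []))
    (run_dir / "results.json").write_text(
        json.dumps(results, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return run_dir
```

In this sketch every required file is written even when its row list is empty, so the folder always contains the same five files.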

A separate log file should also be created for each run under:

logs/
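
A per-run log file can be set up with Python's standard `logging` module, as in the sketch below. The file name `logs/<run_id>.log`, the logger name, and the message format are assumptions; the ticket only requires one log file per run under `logs/`.

```python
import logging
from pathlib import Path


def open_run_logger(run_id: str, log_dir: Path = Path("logs")) -> logging.Logger:
    """Return a logger that writes only to logs/<run_id>.log (assumed naming)."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(f"run.{run_id}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep run messages out of the root logger
    handler = logging.FileHandler(log_dir / f"{run_id}.log", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def close_run_logger(logger: logging.Logger) -> None:
    """Flush and detach the file handlers once the run has finished."""
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
```

A failed domain could then be logged with something like `logger.error("domain=%s failed: %s", domain, exc)`, which keeps the domain name searchable in the log file.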

The exported data should match what is shown in the UI.

Acceptance Criteria

- Each run creates a unique output folder under exports/<run_id>/.
- The following required files are created after each run:
  - summary.csv
  - pages.csv
  - redirects.csv
  - contacts.csv
  - results.json
- A dedicated log file is created for each run.
- Exported CSV/JSON data can be opened and read successfully.
- Exported data matches the data displayed in the UI.
- Logs contain enough information to trace errors or failed domains.
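
The file-related criteria above lend themselves to an automated check. The following sketch verifies that the required files exist and parse; the function name and the problem-message format are hypothetical, and it does not cover the comparison against the UI data.

```python
import csv
import json
from pathlib import Path

REQUIRED_FILES = (
    "summary.csv",
    "pages.csv",
    "redirects.csv",
    "contacts.csv",
    "results.json",
)


def check_run_folder(run_dir: Path) -> list:
    """Return a list of problems; an empty list means the folder passed."""
    problems = []
    for name in REQUIRED_FILES:
        path = run_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        try:
            with path.open(newline="", encoding="utf-8") as fh:
                if name.endswith(".json"):
                    json.load(fh)          # raises ValueError on invalid JSON
                else:
                    list(csv.reader(fh))   # raises csv.Error on malformed CSV
        except (OSError, ValueError, csv.Error) as exc:
            problems.append(f"unreadable: {name} ({exc})")
    return problems
```

Because the check works on the folder alone, it can be run by a reviewer who has no access to the running UI.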

Definition of Done

- Output folder structure is stable and easy to inspect.
- Export files are not lost when the UI session ends.
- Logs are persisted per run.
- A third party can review the exported files without needing access to the running UI.

#1 Updated by Redmine Admin about 1 month ago
#2 Updated by Redmine Admin about 1 month ago
#3 Updated by Nguyen tuan kiet about 1 month ago
