The framework operates in two distinct modes to bridge the gap between theoretical planning and actual execution: Logical Attack Mode
No regulator currently permits fully autonomous pentesting across organizational boundaries. The DRL agent’s exploratory actions – which deliberately test malformed inputs or race conditions – can crash legacy systems. Thus, real implementations always include a human-in-the-loop gate that vets high-impact actions (e.g., write file to system32 ).
| Action | Reward | |--------|--------| | New service discovered | +0.1 | | New low-priv shell | +1.0 | | Privilege escalation to root | +10.0 | | Compromise domain controller | +100.0 | | Detection / Honeypot triggered | -5.0 | | Crash a critical service | -20.0 |