Opening Framing: The Art of Triage
Alerts flood in constantly. Dozens, hundreds, sometimes thousands per day. Most are noise—false positives, benign activity, or low-priority events. Hidden among them are real threats that require immediate action.
Triage is the skill of quickly separating signal from noise. It's not about being fast for speed's sake—it's about efficiently identifying what matters so you can focus your investigation efforts where they'll have impact.
This week covers the triage mindset, systematic investigation methodology, and the practical skills needed to turn alerts into answers.
Key insight: Good triage isn't about closing tickets quickly. It's about making accurate decisions quickly. A fast wrong decision is worse than a slightly slower right one.
1) The Triage Mindset
Effective triage requires a specific mental approach:
Triage Questions (in order):
1. What triggered this alert?
- Understand the detection logic
- What was the SIEM/tool looking for?
2. Is this expected behavior?
- For this user/system/time?
- Is there a legitimate explanation?
3. What's the potential impact?
- If this is real, how bad is it?
- What assets are at risk?
4. What additional context do I need?
- What other logs/data would help?
- What questions remain unanswered?
5. What's my decision?
- False positive → Close with documentation
- Needs investigation → Dig deeper
- Confirmed threat → Escalate/respond
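The five questions above funnel into one of three dispositions. As a minimal sketch (the boolean inputs are illustrative labels, not a standard alert schema):

```python
# Sketch of the triage flow above as a decision function.
# Inputs are illustrative, not a real alert schema.

def triage_decision(threat_confirmed: bool, has_legit_explanation: bool) -> str:
    """Map the triage questions to a final disposition."""
    if threat_confirmed:
        return "escalate"      # Confirmed threat: escalate/respond
    if has_legit_explanation:
        return "close"         # False positive: close with documentation
    return "investigate"       # Open questions remain: dig deeper
```

Note the ordering: a confirmed threat wins even if a plausible benign explanation exists; when neither is established, the default is to investigate, never to close.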
Classification Framework:
Alert Classifications:
True Positive (TP):
- Alert correctly identified malicious activity
- Action: Investigate and respond
False Positive (FP):
- Alert triggered on benign activity
- Action: Close, consider tuning rule
Benign True Positive (BTP):
- Alert correctly identified activity that looks suspicious but is actually authorized/expected
- Example: Pen test, authorized admin activity
- Action: Close, document exception
True Negative (TN):
- No alert, no threat (working as expected)
- Not visible in alert queue
False Negative (FN):
- Threat present but no alert
- Only discovered through hunting or incident
- Action: Improve detection
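The classification framework can be captured as a simple lookup table, useful for consistent ticket labeling. This is a sketch; the labels and action strings are paraphrased from the framework above:

```python
# The five-way classification framework as a lookup table.
# Action strings paraphrase the framework in the text above.

CLASSIFICATIONS = {
    "TP":  ("True Positive",        "Investigate and respond"),
    "FP":  ("False Positive",       "Close, consider tuning rule"),
    "BTP": ("Benign True Positive", "Close, document exception"),
    "TN":  ("True Negative",        "No action (not in alert queue)"),
    "FN":  ("False Negative",       "Improve detection"),
}

def recommended_action(code: str) -> str:
    """Expand a classification code into its name and recommended action."""
    name, action = CLASSIFICATIONS[code]
    return f"{name}: {action}"
```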
Priority Matrix:
                    Impact
                Low     Medium    High
              ┌────────┬────────┬────────┐
         High │   P3   │   P2   │   P1   │
Confidence    ├────────┼────────┼────────┤
       Medium │   P4   │   P3   │   P2   │
              ├────────┼────────┼────────┤
          Low │   P5   │   P4   │   P3   │
              └────────┴────────┴────────┘
P1: Immediate response (confirmed critical threat)
P2: Urgent investigation (high-impact possible threat)
P3: Standard investigation (moderate concern)
P4: Low priority (investigate when time permits)
P5: Minimal concern (quick review, likely close)
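The matrix is small enough to encode directly, which keeps priority assignment consistent across analysts. A sketch with confidence rows and impact columns mirroring the table above:

```python
# The priority matrix above as a (confidence, impact) lookup.

PRIORITY = {
    ("high",   "low"): "P3", ("high",   "medium"): "P2", ("high",   "high"): "P1",
    ("medium", "low"): "P4", ("medium", "medium"): "P3", ("medium", "high"): "P2",
    ("low",    "low"): "P5", ("low",    "medium"): "P4", ("low",    "high"): "P3",
}

def priority(confidence: str, impact: str) -> str:
    """Return the P1-P5 priority for a confidence/impact pair."""
    return PRIORITY[(confidence.lower(), impact.lower())]
```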
Key insight: Prioritization prevents thrashing. Without clear priority, analysts bounce between alerts without completing investigations.
2) Initial Alert Assessment
The first 2-5 minutes determine your path forward:
Initial Assessment Checklist:
□ Read the alert details
- What rule/signature triggered?
- What are the key fields?
□ Identify the entities
- Source: Who/what initiated?
- Destination: Who/what was targeted?
- User: Which account involved?
□ Check timestamps
- When did this occur?
- Business hours or off-hours?
- One-time or repeated?
□ Assess asset criticality
- Is the affected system important?
- What data/access does it have?
□ Quick context lookup
- Is this user/system known?
- Any recent tickets for same entity?
- Known maintenance or testing?
Entity Analysis:
For Source IP/Host:
Internal:
- Who owns this system?
- What's its normal function?
- Who normally uses it?
- Any recent alerts for it?
External:
- Reputation check (threat intel)
- Geolocation (expected region?)
- ASN/ownership
- Historical activity in logs
For User Account:
- What's their role?
- Normal working hours?
- Normal systems accessed?
- Recent password changes?
- Privileged account?
For Process/File:
- Known good or suspicious?
- Hash reputation
- Signed/unsigned?
- Normal for this system?
Quick Wins - Fast Closure Patterns:
Patterns that often indicate false positives:
1. Known scanning/testing
- Source is authorized scanner
- Scheduled vulnerability assessment
- Red team exercise
2. Administrative activity
- IT admin doing expected work
- Matches change ticket
- Normal tool for that role
3. Repeated known FP
- Same alert type closed as FP before
- Same entities, same context
- Note: Still document, consider tuning
4. Automated systems
- Backup jobs
- Monitoring systems
- Patch management
Always verify—don't assume!
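A pre-check against known benign sources can speed up these closures. The sketch below uses hardcoded placeholder lists; a real SOC would pull authorized scanners and service accounts from a CMDB or exception register, and the warning above still applies: a match is a candidate reason to close, not proof.

```python
# Hedged sketch of a "quick win" pre-check. The allow-lists are illustrative
# placeholders; in practice they come from a CMDB or exception register.

AUTHORIZED_SCANNERS = {"10.0.99.5", "10.0.99.6"}   # hypothetical scanner IPs
AUTOMATION_ACCOUNTS = {"svc_backup", "svc_patch"}  # hypothetical service accounts

def quick_win_reason(source_ip: str, user: str):
    """Return a candidate fast-closure reason, or None if none applies."""
    if source_ip in AUTHORIZED_SCANNERS:
        return "authorized scanner"
    if user in AUTOMATION_ACCOUNTS:
        return "automated system account"
    return None    # no known-benign match: triage normally
```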
Key insight: Initial assessment should take 2-5 minutes. If you can't classify in that time, it needs deeper investigation.
3) Investigation Methodology
When initial assessment indicates a real threat, investigate systematically:
Investigation Framework:
1. SCOPE
- What systems/users are involved?
- What's the timeframe?
- What data sources do I need?
2. EVIDENCE
- Collect relevant logs
- Preserve artifacts
- Document everything
3. ANALYZE
- Build timeline of events
- Identify attack progression
- Determine root cause
4. CONCLUDE
- What happened?
- What's the impact?
- What action is needed?
Building the Timeline:
Timeline Template:
Time (UTC) Source Event Significance
─────────────────────────────────────────────────────────────────
09:15:32 Email GW Phishing email received Initial delivery
09:17:45 Endpoint Attachment opened User interaction
09:17:48 Endpoint PowerShell executed Payload execution
09:18:02 Firewall Outbound to C2 IP C2 established
09:25:33 AD Logs Service account auth Lateral movement
09:28:15 File Server Large file access Data staging
09:35:00 Firewall Large outbound transfer Exfiltration
Timeline reveals:
- Attack progression
- Time to detect (gap between first event and alert)
- Scope of compromise
- Response urgency
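Timeline construction is mostly a sort-and-merge over events from different sources, plus a time-to-detect calculation. A sketch using a few events from the example table (same-day HH:MM:SS timestamps assumed for brevity):

```python
# Sketch: merge events from multiple sources into one timeline and compute
# time-to-detect. Event data mirrors the example table above; timestamps
# are assumed to fall on the same day for simplicity.

from datetime import datetime

events = [
    ("09:17:48", "Endpoint", "PowerShell executed"),
    ("09:15:32", "Email GW", "Phishing email received"),
    ("09:18:02", "Firewall", "Outbound to C2 IP"),
]

def build_timeline(events):
    """Sort (time, source, event) tuples chronologically."""
    return sorted(events, key=lambda e: e[0])

def time_to_detect(timeline, alert_time):
    """Seconds between the first event and the alert firing."""
    fmt = "%H:%M:%S"
    first = datetime.strptime(timeline[0][0], fmt)
    alert = datetime.strptime(alert_time, fmt)
    return (alert - first).total_seconds()
```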
Pivot Analysis:
Pivoting: Using one finding to discover more
Found suspicious IP? Pivot to find:
→ All connections to/from that IP
→ All users who connected to it
→ All systems that contacted it
→ DNS queries for associated domains
Found malicious file hash? Pivot to find:
→ All systems with that hash
→ Parent process that created it
→ Child processes it spawned
→ Network connections it made
Found compromised user? Pivot to find:
→ All authentications by that user
→ All systems accessed
→ All files touched
→ Any new accounts created
Each pivot may reveal new IOCs to pivot on again
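The pivot lists above can be expressed as search templates keyed by IOC type, so each new finding expands into the next round of queries. The query strings here are illustrative pseudo-SIEM syntax, not tied to any specific product:

```python
# Sketch: pivot ideas as search templates keyed by IOC type.
# Query strings are illustrative pseudo-SIEM syntax, not a real query language.

PIVOTS = {
    "ip": [
        "connections where dest_ip = {ioc} OR src_ip = {ioc}",
        "dns_queries where answer = {ioc}",
    ],
    "hash": [
        "process_events where file_hash = {ioc}",
        "network_events where process_hash = {ioc}",
    ],
    "user": [
        "auth_events where user = {ioc}",
        "file_access where user = {ioc}",
    ],
}

def pivot_queries(ioc_type, ioc):
    """Expand one IOC into the pivot searches to run next."""
    return [template.format(ioc=ioc) for template in PIVOTS[ioc_type]]
```

New IOCs found by these searches feed back into `pivot_queries`, which is the iterative loop the key insight below describes.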
Key insight: Investigation is iterative. Each finding opens new questions. Keep pivoting until you understand the full scope.
4) Common Alert Scenarios
Learn patterns for frequent alert types:
Brute Force / Failed Logins:
Alert: Multiple failed logins detected
Triage questions:
- How many failures? Over what time?
- Same source hitting multiple accounts? (spray)
- Same account from multiple sources? (distributed)
- Any successes after failures? (compromised!)
Investigation:
1. Query all auth events for source IP/user
2. Check if any succeeded
3. If success: investigate post-auth activity
4. Check threat intel for source IP
5. Determine if targeted or opportunistic
Common outcomes:
- External scanning → Block IP, close
- Credential spray with success → Incident!
- User forgot password → Close as BTP
- Service account misconfigured → Fix config
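The spray/distributed/compromise distinctions above come straight out of grouping the auth records. A sketch over chronologically ordered `(source_ip, user, success)` tuples, with an illustrative threshold of three:

```python
# Sketch: classify failed-login activity from chronologically ordered
# (source_ip, user, success) records. The threshold of 3 is illustrative.

from collections import defaultdict

def analyze_failed_logins(records, threshold=3):
    users_per_source = defaultdict(set)
    sources_per_user = defaultdict(set)
    failed_users, compromised = set(), set()
    for src, user, success in records:
        if success:
            if user in failed_users:
                compromised.add(user)   # success after failures: investigate!
        else:
            failed_users.add(user)
            users_per_source[src].add(user)
            sources_per_user[user].add(src)
    return {
        # one source hitting many accounts
        "spray_sources": [s for s, u in users_per_source.items() if len(u) >= threshold],
        # one account hit from many sources
        "distributed_targets": [u for u, s in sources_per_user.items() if len(s) >= threshold],
        "possible_compromise": sorted(compromised),
    }
```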
Malware Detection:
Alert: Malware detected on endpoint
Triage questions:
- Was it blocked or just detected?
- Known malware family or generic detection?
- How did it arrive? (email, web, USB)
- Was it executed?
Investigation:
1. Check EDR for full process tree
2. Identify delivery mechanism
3. Check for persistence mechanisms
4. Look for lateral movement
5. Search for same hash elsewhere
Common outcomes:
- Blocked before execution → Improve prevention, close
- Executed but contained → Clean system, investigate scope
- Active infection → Major incident response
Data Exfiltration Alert:
Alert: Large outbound data transfer
Triage questions:
- What system initiated?
- What was the destination?
- How much data?
- What type of data?
Investigation:
1. Identify user and process
2. Determine destination reputation
3. Check if business-justified
4. Review what data was accessed
5. Look for prior suspicious activity
Common outcomes:
- Backup to cloud service → Close as BTP
- Developer uploading to GitHub → Policy reminder
- Unknown destination, sensitive data → Incident!
Suspicious Process Execution:
Alert: Suspicious PowerShell/CMD activity
Triage questions:
- What command was executed?
- Who ran it?
- What was the parent process?
- Is this normal for this user/system?
Investigation:
1. Full command line analysis
2. Check parent process chain
3. Review user's recent activity
4. Check for encoded commands
5. Look for network connections
Common outcomes:
- Admin running script → Verify authorization
- Encoded download command → Likely malicious
- IT automation tool → Close as BTP
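Step 4 above ("check for encoded commands") is worth automating: PowerShell's `-EncodedCommand` (`-enc`) argument is Base64 over UTF-16LE text, so decoding it is two steps. A minimal sketch:

```python
# Decoding a `powershell -enc` payload during triage.
# -EncodedCommand is Base64 over UTF-16LE, so decode in two steps.

import base64

def decode_powershell_enc(b64: str) -> str:
    """Recover the original script text from a -EncodedCommand argument."""
    return base64.b64decode(b64).decode("utf-16-le")
```

A decoded command containing download-and-execute patterns (for example `IEX` plus a web client call) supports the "encoded download command, likely malicious" outcome above.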
Key insight: Pattern recognition speeds triage. As you see more alerts, you'll recognize scenarios faster.
5) Documentation and Handoff
Investigation without documentation is incomplete:
Ticket Documentation Standard:
Summary:
- One-line description of what happened
Classification:
- TP / FP / BTP
- Severity / Priority
Timeline:
- Key events in chronological order
Investigation Steps:
- What you searched
- What you found
- Queries used (for reproducibility)
Findings:
- Root cause (if determined)
- Scope of impact
- IOCs identified
Actions Taken:
- Containment measures
- Remediation steps
- Escalations made
Recommendations:
- Detection improvements
- Prevention measures
- Follow-up needed
Writing Good Notes:
Bad note:
"Checked logs, looks fine, closing as FP"
Good note:
"Alert triggered by user jsmith authenticating to
server SQL01 from IP 10.0.1.50 at 14:32 UTC.
Verified: jsmith is DBA, SQL01 is their assigned server,
source IP is their workstation per CMDB. Activity occurred
during business hours and matches normal pattern.
Classification: Benign True Positive
Action: No action needed, closing.
Recommendation: Consider excluding DBA group from this rule
for known database servers."
The good note:
- Explains what was checked
- Shows reasoning
- Enables future reference
- Suggests improvement
Escalation Communication:
When escalating, provide:
1. What happened (brief summary)
2. Why it's being escalated (severity/complexity)
3. What's been done (actions taken)
4. What's needed (specific ask)
5. Urgency level (how fast is response needed)
Example escalation:
"Escalating to Tier 2: Confirmed malware execution on
FINANCE-WS-42 (user: jdoe, Finance Dept).
Malware: Emotet dropper (hash: abc123...)
Delivery: Phishing email at 10:15 UTC
Execution: 10:23 UTC
C2 connection: Observed to 185.x.x.x
Actions taken: Endpoint isolated via EDR
Need: Deep malware analysis, scope assessment across
Finance department, potential IR escalation.
Urgency: High - active threat, user has access to
sensitive financial data."
Key insight: Your notes are the organizational memory. Future analysts (including future you) will rely on them.
Real-World Context: Triage Under Pressure
Real SOC triage has unique challenges:
Alert Fatigue: When 90% of alerts are false positives, it's tempting to close quickly without investigating. Discipline matters—the one you skip might be real.
Time Pressure: Queue is growing, shift is ending, management wants metrics. Resist the urge to rush. Accurate triage is more valuable than fast triage.
Uncertainty: Many alerts end with "probably benign but not certain." Document your uncertainty and reasoning. If it comes back, you'll know what was checked.
MITRE ATT&CK Application:
- Technique Identification: Map alerts to ATT&CK techniques
- Attack Chain Analysis: Use tactics to understand progression
- Gap Identification: What techniques led to this alert?
Key insight: Triage is a skill developed over thousands of alerts. Every investigation teaches you something.
Guided Lab: Alert Triage Simulation
Practice triaging alerts using systematic methodology.
Scenario 1: Failed Login Alert
ALERT: Multiple Failed Logins
Time: 2024-01-15 14:32:00 UTC
Source IP: 203.0.113.50
Target: VPN Gateway
User: admin
Failures: 47 in 5 minutes
Status: All failed, no success
Your triage:
1. What type of attack does this suggest?
2. What additional information do you need?
3. What's your priority assessment?
4. What action would you take?
Scenario 2: Malware Alert
ALERT: Malware Detected
Time: 2024-01-15 09:45:00 UTC
Host: ACCT-WS-15 (Accounting workstation)
User: mjohnson (Accounts Payable)
Detection: Emotet.Gen!A
File: C:\Users\mjohnson\Downloads\Invoice_Dec2023.doc
Action: Quarantined
Your triage:
1. Was this blocked or does it require investigation?
2. What's the likely delivery mechanism?
3. What should you check next?
4. What's the risk level?
Scenario 3: Data Transfer Alert
ALERT: Large Outbound Transfer
Time: 2024-01-15 03:15:00 UTC (Sunday)
Source: FILESVR-01 (File Server)
Destination: 185.100.87.XX (Unknown IP, Eastern Europe)
Size: 4.7 GB
Protocol: HTTPS
Your triage:
1. What makes this suspicious?
2. What's your priority assessment?
3. What do you need to investigate?
4. Should this be escalated?
Scenario 4: Suspicious Command
ALERT: Suspicious PowerShell
Time: 2024-01-15 11:22:00 UTC
Host: DEV-WS-07 (Developer workstation)
User: tsmith (Software Developer)
Command: powershell -enc [base64 string]
Parent Process: outlook.exe
Your triage:
1. Why is this suspicious?
2. What does the parent process tell you?
3. What would you investigate?
4. What's your classification?
Step 5: Reflection (mandatory)
- Which scenario was hardest to triage? Why?
- What information was missing that would have helped?
- How did the time and day affect your assessment?
- What patterns did you notice across scenarios?
Week 5 Outcome Check
By the end of this week, you should be able to:
- Apply a systematic triage methodology to alerts
- Classify alerts as TP, FP, or BTP
- Prioritize alerts based on impact and confidence
- Conduct structured investigations using pivot analysis
- Build investigation timelines
- Document findings and escalate appropriately
Next week: Incident Response Foundations—what happens when triage confirms a real incident.
🎯 Hands-On Labs (Free & Essential)
Practice triage workflows before moving to reading resources.
🎮 TryHackMe: Splunk 101 (Alert Triage)
What you'll do: Run searches, pivot from an alert, and extract key context.
Why it matters: Triage relies on fast log pivots and clear evidence.
Time estimate: 1.5-2 hours
📝 Lab Exercise: Alert Triage Worksheet
Task: Classify 8 sample alerts as TP/FP/BTP and assign priority.
Deliverable: Triage table with rationale + next investigative step for each.
Why it matters: Consistent triage reduces mistakes and speeds response.
Time estimate: 60-90 minutes
🏁 PicoCTF Practice: Forensics (Alert Validation)
What you'll do: Solve beginner forensics challenges that validate alert evidence.
Why it matters: Triage decisions depend on artifact confirmation.
Time estimate: 1-2 hours
💡 Lab Tip: Document the detection logic before you start pivoting so your reasoning is traceable.
🛡️ SIEM Query Optimization
Fast queries keep triage moving. Optimized searches reduce noise and surface high-signal evidence quickly.
Query optimization checklist:
- Filter early (time, host, user, index)
- Use structured fields instead of free-text
- Start narrow, then expand scope
- Save high-value queries as templates
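The "filter early" principle applies regardless of the query language: run cheap structured filters (time window, host, index) before expensive free-text matching. Sketched in Python terms, with an illustrative log shape:

```python
# Sketch: "filter early" applied to log searching. Cheap structured filters
# (host) run before the expensive regex match. Log shape is illustrative.

import re

def search_logs(logs, host, pattern):
    """Yield entries matching pattern, structured filters first."""
    rx = re.compile(pattern)
    for entry in logs:
        if entry["host"] != host:          # cheap structured filter first
            continue
        if rx.search(entry["message"]):    # expensive free-text match last
            yield entry
```

In a SIEM the same ordering means putting index, time range, and field equality terms before wildcard or free-text terms so the engine can prune events cheaply.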
📚 Building on CSY101 Week-14: Use standards-driven logging requirements to choose key fields.
Resources
Complete the required resources to build your foundation.
- SANS - Alert Triage Best Practices · 30-45 min · 50 XP · Resource ID: csy201_w5_r1 (Required)
- SANS - Hunt Evil Poster (Investigation Reference) · Reference · 50 XP · Resource ID: csy201_w5_r2 (Required)
- MITRE CAR - Cyber Analytics Repository · Reference · 25 XP · Resource ID: csy201_w5_r3 (Optional)
Lab: Full Alert Investigation
Goal: Conduct a complete investigation from initial alert through documented findings.
Scenario
You receive the following alert at the start of your shift:
ALERT: Possible Credential Theft
Time: 2024-01-15 08:15:00 UTC
Host: HR-WS-03 (HR Manager workstation)
User: lthompson (HR Manager)
Detection: Mimikatz-like activity detected
Process: C:\Windows\Temp\update.exe
Parent: explorer.exe
Network: Connection to 45.33.XX.XX:443 observed
Part 1: Initial Assessment (10 min)
- Document your initial assessment
- List your triage questions
- Assign initial priority
- Identify data sources needed
Part 2: Investigation (30 min)
- Document what queries you would run
- Create hypothetical findings for each query
- Build an attack timeline
- Identify all IOCs
Part 3: Scope Assessment
- What systems might be affected?
- What data might be at risk?
- What additional investigation is needed?
Part 4: Documentation
- Write complete ticket documentation
- Include all sections from the template
- Write escalation communication
Deliverable (submit):
- Initial assessment document
- Investigation queries and hypothetical findings
- Attack timeline
- Complete ticket documentation
- Escalation communication
Checkpoint Questions
- What is the difference between a True Positive and a Benign True Positive?
- What are the five questions to ask during initial triage?
- What is pivot analysis and why is it important?
- What should be included in ticket documentation?
- When should an alert be escalated to Tier 2?
- Why is documentation important even for false positives?
Week 05 Quiz
Test your understanding of alert triage, investigation workflow, and escalation decisions.
Format: 10 multiple-choice questions. Passing score: 70%. Time: Untimed.
Weekly Reflection
Reflection Prompt (200-300 words):
This week you learned alert triage and investigation—the core daily work of SOC analysts. You practiced systematic approaches to turning alerts into answers.
Reflect on these questions:
- How do you balance speed with thoroughness in triage? What would help you make better decisions faster?
- False positive rates are high in most SOCs. How does this affect analyst behavior and what can be done about it?
- Documentation takes time but provides value. How would you convince a time-pressured analyst to document thoroughly?
- What aspects of investigation feel most challenging to you? What would help you improve?
A strong reflection will honestly assess the challenges of triage work and identify strategies for improvement.
Verified Resources & Videos
- Investigation Techniques: Malware Archaeology Cheat Sheets
- Windows Investigation: LOLBAS Project (Living Off the Land)
- MITRE ATT&CK: Credential Access Tactic
Triage and investigation skills improve with practice. Every alert teaches you something—patterns, techniques, data sources. Keep notes on what you learn. Next week: when triage confirms a threat, incident response takes over.