Opening Framing: The Art of Triage
Alerts flood in constantly. Dozens, hundreds, sometimes thousands per day. Most are noise—false positives, benign activity, or low-priority events. Hidden among them are real threats that require immediate action.
Triage is the skill of quickly separating signal from noise. It's not about being fast for speed's sake—it's about efficiently identifying what matters so you can focus your investigation efforts where they'll have impact.
This week covers the triage mindset, systematic investigation methodology, and the practical skills needed to turn alerts into answers.
Key insight: Good triage isn't about closing tickets quickly. It's about making accurate decisions quickly. A fast wrong decision is worse than a slightly slower right one.
1) The Triage Mindset
Effective triage requires a specific mental approach:
Triage Questions (in order):
1. What triggered this alert?
- Understand the detection logic
- What was the SIEM/tool looking for?
2. Is this expected behavior?
- For this user/system/time?
- Is there a legitimate explanation?
3. What's the potential impact?
- If this is real, how bad is it?
- What assets are at risk?
4. What additional context do I need?
- What other logs/data would help?
- What questions remain unanswered?
5. What's my decision?
- False positive → Close with documentation
- Needs investigation → Dig deeper
- Confirmed threat → Escalate/respond
Classification Framework:
Alert Classifications:
True Positive (TP):
- Alert correctly identified malicious activity
- Action: Investigate and respond
False Positive (FP):
- Alert triggered on benign activity
- Action: Close, consider tuning rule
Benign True Positive (BTP):
- Alert correctly identified activity that looks suspicious
but is actually authorized/expected
- Example: Pen test, authorized admin activity
- Action: Close, document exception
True Negative (TN):
- No alert, no threat (working as expected)
- Not visible in alert queue
False Negative (FN):
- Threat present but no alert
- Only discovered through hunting or incident
- Action: Improve detection
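The five labels above can be captured in a small helper. This is an illustrative sketch, not a real triage engine: the boolean inputs (`alerted`, `malicious`, `sanctioned`) are simplifications of judgments that take real investigation to make.

```python
def classify_alert(alerted: bool, malicious: bool, sanctioned: bool = False) -> str:
    """Map triage findings to TP/FP/BTP/TN/FN.

    `sanctioned` marks activity that genuinely matches the detection
    (pen test, authorized admin work) but is approved: the BTP case.
    """
    if alerted:
        if malicious:
            return "TP"                       # alert correctly caught a threat
        return "BTP" if sanctioned else "FP"  # benign: authorized vs. misfire
    return "FN" if malicious else "TN"        # no alert: missed vs. quiet
```

Note that TN and FN never appear in the alert queue; FNs surface only through hunting or incidents, which is why they drive detection improvement rather than triage.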
Priority Matrix:
                       Impact
                Low      Medium    High
           ┌────────┬────────┬────────┐
      High │   P3   │   P2   │   P1   │
Confidence ├────────┼────────┼────────┤
    Medium │   P4   │   P3   │   P2   │
           ├────────┼────────┼────────┤
       Low │   P5   │   P4   │   P3   │
           └────────┴────────┴────────┘
P1: Immediate response (confirmed critical threat)
P2: Urgent investigation (high-impact possible threat)
P3: Standard investigation (moderate concern)
P4: Low priority (investigate when time permits)
P5: Minimal concern (quick review, likely close)
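The matrix translates directly into a lookup table. A minimal sketch (the P1-P5 assignments follow the matrix above; adjust the scale to your own SOC's conventions):

```python
# Confidence x Impact -> priority, exactly as in the matrix above.
PRIORITY = {
    ("high",   "low"): "P3", ("high",   "medium"): "P2", ("high",   "high"): "P1",
    ("medium", "low"): "P4", ("medium", "medium"): "P3", ("medium", "high"): "P2",
    ("low",    "low"): "P5", ("low",    "medium"): "P4", ("low",    "high"): "P3",
}

def triage_priority(confidence: str, impact: str) -> str:
    """Look up the priority for an alert's confidence/impact pair."""
    return PRIORITY[(confidence.lower(), impact.lower())]
```

Encoding the matrix this way makes prioritization consistent across analysts and shifts, which is exactly what prevents the thrashing described below.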
Key insight: Prioritization prevents thrashing. Without clear priority, analysts bounce between alerts without completing investigations.
2) Initial Alert Assessment
The first 2-5 minutes determine your path forward:
Initial Assessment Checklist:
□ Read the alert details
- What rule/signature triggered?
- What are the key fields?
□ Identify the entities
- Source: Who/what initiated?
- Destination: Who/what was targeted?
- User: Which account involved?
□ Check timestamps
- When did this occur?
- Business hours or off-hours?
- One-time or repeated?
□ Assess asset criticality
- Is the affected system important?
- What data/access does it have?
□ Quick context lookup
- Is this user/system known?
- Any recent tickets for same entity?
- Known maintenance or testing?
Entity Analysis:
For Source IP/Host:
Internal:
- Who owns this system?
- What's its normal function?
- Who normally uses it?
- Any recent alerts for it?
External:
- Reputation check (threat intel)
- Geolocation (expected region?)
- ASN/ownership
- Historical activity in logs
For User Account:
- What's their role?
- Normal working hours?
- Normal systems accessed?
- Recent password changes?
- Privileged account?
For Process/File:
- Known good or suspicious?
- Hash reputation
- Signed/unsigned?
- Normal for this system?
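The internal/external fork in the source-IP checklist can be automated with the standard `ipaddress` module. The check lists returned here are just the questions from above; the function names are illustrative.

```python
import ipaddress

def is_internal(ip: str) -> bool:
    """True for RFC 1918 and other private-range addresses."""
    return ipaddress.ip_address(ip).is_private

def source_checks(ip: str) -> list[str]:
    """Route a source IP to the right branch of the entity checklist."""
    if is_internal(ip):
        return ["owner", "normal function", "normal users", "recent alerts"]
    return ["reputation", "geolocation", "ASN/ownership", "historical activity"]
```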
Quick Wins - Fast Closure Patterns:
Patterns that often indicate false positives:
1. Known scanning/testing
- Source is authorized scanner
- Scheduled vulnerability assessment
- Red team exercise
2. Administrative activity
- IT admin doing expected work
- Matches change ticket
- Normal tool for that role
3. Repeated known FP
- Same alert type closed as FP before
- Same entities, same context
- Note: Still document, consider tuning
4. Automated systems
- Backup jobs
- Monitoring systems
- Patch management
Always verify—don't assume!
Key insight: Initial assessment should take 2-5 minutes. If you can't classify in that time, it needs deeper investigation.
3) Investigation Methodology
When initial assessment indicates a real threat, investigate systematically:
Investigation Framework:
1. SCOPE
- What systems/users are involved?
- What's the timeframe?
- What data sources do I need?
2. EVIDENCE
- Collect relevant logs
- Preserve artifacts
- Document everything
3. ANALYZE
- Build timeline of events
- Identify attack progression
- Determine root cause
4. CONCLUDE
- What happened?
- What's the impact?
- What action is needed?
Building the Timeline:
Timeline Template:
Time (UTC)   Source        Event                      Significance
──────────────────────────────────────────────────────────────────
09:15:32     Email GW      Phishing email received    Initial delivery
09:17:45     Endpoint      Attachment opened          User interaction
09:17:48     Endpoint      PowerShell executed        Payload execution
09:18:02     Firewall      Outbound to C2 IP          C2 established
09:25:33     AD Logs       Service account auth       Lateral movement
09:28:15     File Server   Large file access          Data staging
09:35:00     Firewall      Large outbound transfer    Exfiltration
Timeline reveals:
- Attack progression
- Time to detect (gap between first event and alert)
- Scope of compromise
- Response urgency
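Building the timeline is mostly a matter of normalizing timestamps to UTC and sorting. A minimal sketch (event dicts with a `"time"` key are an assumed shape, not a standard format):

```python
from datetime import datetime

def build_timeline(events: list[dict]) -> list[dict]:
    """Sort raw events chronologically (all timestamps already in UTC)."""
    return sorted(events, key=lambda e: e["time"])

def time_to_detect(events: list[dict], alert_time: datetime) -> float:
    """Minutes between the earliest event and the alert firing:
    the 'time to detect' gap the timeline reveals."""
    first = build_timeline(events)[0]["time"]
    return (alert_time - first).total_seconds() / 60
```

In practice the hard part is timestamp normalization: different sources log in different time zones and formats, and a timeline built from unnormalized times will mislead you.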
Pivot Analysis:
Pivoting: Using one finding to discover more
Found suspicious IP? Pivot to find:
→ All connections to/from that IP
→ All users who connected to it
→ All systems that contacted it
→ DNS queries for associated domains
Found malicious file hash? Pivot to find:
→ All systems with that hash
→ Parent process that created it
→ Child processes it spawned
→ Network connections it made
Found compromised user? Pivot to find:
→ All authentications by that user
→ All systems accessed
→ All files touched
→ Any new accounts created
Each pivot may reveal new IOCs to pivot on again
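A single pivot step is just a filtered projection over your logs; chaining calls follows the leads above. A sketch over an in-memory log list (in a real SOC this becomes a SIEM query, but the logic is the same):

```python
def pivot(logs: list[dict], field: str, value: str, collect: str) -> set:
    """One pivot step: distinct `collect` values from records where
    `field` matches the IOC. Chain calls to follow each new lead."""
    return {rec[collect] for rec in logs
            if rec.get(field) == value and collect in rec}
```

For example, `pivot(conn_logs, "dst_ip", bad_ip, "user")` answers "which users connected to this IP?", and each user found can be fed back in with `pivot(conn_logs, "user", u, "dst_ip")` to surface further suspicious destinations.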
Key insight: Investigation is iterative. Each finding opens new questions. Keep pivoting until you understand the full scope.
4) Common Alert Scenarios
Learn patterns for frequent alert types:
Brute Force / Failed Logins:
Alert: Multiple failed logins detected
Triage questions:
- How many failures? Over what time?
- Same source hitting multiple accounts? (spray)
- Same account from multiple sources? (distributed)
- Any successes after failures? (possible compromise!)
Investigation:
1. Query all auth events for source IP/user
2. Check if any succeeded
3. If success: investigate post-auth activity
4. Check threat intel for source IP
5. Determine if targeted or opportunistic
Common outcomes:
- External scanning → Block IP, close
- Credential spray with success → Incident!
- User forgot password → Close as BTP
- Service account misconfigured → Fix config
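The spray/distributed/success-after-failure questions can be answered mechanically from raw auth events. A sketch (the event shape and the threshold of 5 distinct entities are illustrative assumptions; tune to your environment):

```python
from collections import defaultdict

def brute_force_triage(auth_events: list[dict], threshold: int = 5) -> dict:
    """Summarize auth events: {"src": ip, "user": name, "success": bool},
    assumed to be in chronological order."""
    users_per_src = defaultdict(set)   # spray: one source, many accounts
    srcs_per_user = defaultdict(set)   # distributed: one account, many sources
    failed_users, success_after_fail = set(), set()
    for e in auth_events:
        if e["success"]:
            if e["user"] in failed_users:
                success_after_fail.add(e["user"])   # investigate post-auth!
        else:
            failed_users.add(e["user"])
            users_per_src[e["src"]].add(e["user"])
            srcs_per_user[e["user"]].add(e["src"])
    return {
        "spray_sources": [s for s, u in users_per_src.items() if len(u) >= threshold],
        "distributed_targets": [u for u, s in srcs_per_user.items() if len(s) >= threshold],
        "success_after_failure": sorted(success_after_fail),
    }
```

A success after failures is the pivot point: it may be a forgotten password (BTP), but it must be investigated as a potential compromise until post-auth activity says otherwise.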
Malware Detection:
Alert: Malware detected on endpoint
Triage questions:
- Was it blocked or just detected?
- Known malware family or generic detection?
- How did it arrive? (email, web, USB)
- Was it executed?
Investigation:
1. Check EDR for full process tree
2. Identify delivery mechanism
3. Check for persistence mechanisms
4. Look for lateral movement
5. Search for same hash elsewhere
Common outcomes:
- Blocked before execution → Improve prevention, close
- Executed but contained → Clean system, investigate scope
- Active infection → Major incident response
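Step 1, reconstructing the process tree, amounts to walking parent/child links from EDR telemetry. A sketch assuming simplified event records (real EDR APIs differ by vendor):

```python
from collections import defaultdict

def process_tree(events: list[dict], root_pid: int) -> list[str]:
    """Render a parent->child process chain as an indented listing.
    Each event: {"pid": int, "ppid": int, "image": str} (illustrative shape)."""
    children = defaultdict(list)
    by_pid = {}
    for e in events:
        children[e["ppid"]].append(e["pid"])
        by_pid[e["pid"]] = e["image"]

    def walk(pid: int, depth: int) -> list[str]:
        lines = ["  " * depth + by_pid.get(pid, f"pid {pid}")]
        for child in children[pid]:
            lines += walk(child, depth + 1)
        return lines

    return walk(root_pid, 0)
```

A chain like `outlook.exe → winword.exe → powershell.exe` is the delivery story in one glance: email, document opened, payload executed.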
Data Exfiltration Alert:
Alert: Large outbound data transfer
Triage questions:
- What system initiated?
- What was the destination?
- How much data?
- What type of data?
Investigation:
1. Identify user and process
2. Determine destination reputation
3. Check if business-justified
4. Review what data was accessed
5. Look for prior suspicious activity
Common outcomes:
- Backup to cloud service → Close as BTP
- Developer uploading to GitHub → Policy reminder
- Unknown destination, sensitive data → Incident!
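Answering "how much data, to where?" is an aggregation over flow records. A sketch (the flow shape and the 500 MB cutoff are illustrative assumptions, not a recommended threshold):

```python
from collections import defaultdict

def exfil_candidates(flows: list[dict], threshold_mb: float = 500) -> dict:
    """Sum outbound bytes per destination and flag totals over a cutoff.
    Each flow: {"dst": ip_or_domain, "bytes_out": int} (assumed shape)."""
    totals = defaultdict(int)
    for f in flows:
        totals[f["dst"]] += f["bytes_out"]
    limit = threshold_mb * 1024 * 1024
    return {dst: b for dst, b in totals.items() if b > limit}
```

Volume alone never settles the question: a flagged destination still needs the reputation check and the business-justification check from the steps above.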
Suspicious Process Execution:
Alert: Suspicious PowerShell/CMD activity
Triage questions:
- What command was executed?
- Who ran it?
- What was the parent process?
- Is this normal for this user/system?
Investigation:
1. Full command line analysis
2. Check parent process chain
3. Review user's recent activity
4. Check for encoded commands
5. Look for network connections
Common outcomes:
- Admin running script → Verify authorization
- Encoded download command → Likely malicious
- IT automation tool → Close as BTP
Key insight: Pattern recognition speeds triage. As you see more alerts, you'll recognize scenarios faster.
5) Documentation and Handoff
Investigation without documentation is incomplete:
Ticket Documentation Standard:
Summary:
- One-line description of what happened
Classification:
- TP / FP / BTP
- Severity / Priority
Timeline:
- Key events in chronological order
Investigation Steps:
- What you searched
- What you found
- Queries used (for reproducibility)
Findings:
- Root cause (if determined)
- Scope of impact
- IOCs identified
Actions Taken:
- Containment measures
- Remediation steps
- Escalations made
Recommendations:
- Detection improvements
- Prevention measures
- Follow-up needed
Writing Good Notes:
Bad note:
"Checked logs, looks fine, closing as FP"
Good note:
"Alert triggered by user jsmith authenticating to
server SQL01 from IP 10.0.1.50 at 14:32 UTC.
Verified: jsmith is DBA, SQL01 is their assigned server,
source IP is their workstation per CMDB. Activity occurred
during business hours and matches normal pattern.
Classification: Benign True Positive
Action: No action needed, closing.
Recommendation: Consider excluding DBA group from this rule
for known database servers."
The good note:
- Explains what was checked
- Shows reasoning
- Enables future reference
- Suggests improvement
Escalation Communication:
When escalating, provide:
1. What happened (brief summary)
2. Why it's being escalated (severity/complexity)
3. What's been done (actions taken)
4. What's needed (specific ask)
5. Urgency level (how fast is response needed)
Example escalation:
"Escalating to Tier 2: Confirmed malware execution on
FINANCE-WS-42 (user: jdoe, Finance Dept).
Malware: Emotet dropper (hash: abc123...)
Delivery: Phishing email at 10:15 UTC
Execution: 10:23 UTC
C2 connection: Observed to 185.x.x.x
Actions taken: Endpoint isolated via EDR
Need: Deep malware analysis, scope assessment across
Finance department, potential IR escalation.
Urgency: High - active threat, user has access to
sensitive financial data."
Key insight: Your notes are the organizational memory. Future analysts (including future you) will rely on them.
Real-World Context: Triage Under Pressure
Real SOC triage has unique challenges:
Alert Fatigue: When 90% of alerts are false positives, it's tempting to close quickly without investigating. Discipline matters—the one you skip might be real.
Time Pressure: Queue is growing, shift is ending, management wants metrics. Resist the urge to rush. Accurate triage is more valuable than fast triage.
Uncertainty: Many alerts end with "probably benign but not certain." Document your uncertainty and reasoning. If it comes back, you'll know what was checked.
MITRE ATT&CK Application:
- Technique Identification: Map alerts to ATT&CK techniques
- Attack Chain Analysis: Use tactics to understand progression
- Gap Identification: Which earlier techniques in the chain went undetected?
Key insight: Triage is a skill developed over thousands of alerts. Every investigation teaches you something.
Guided Lab: Alert Triage Simulation
Practice triaging alerts using systematic methodology.