CSY201 Week 11 - Build automation muscle before moving to reading resources.

Opening Framing: Scaling Security Operations

Alert volumes grow faster than analyst headcount. Without automation, SOCs drown in repetitive tasks: enriching alerts, gathering context, executing routine responses. Analysts burn out doing the same thing thousands of times.

Security automation changes this equation. SOAR (Security Orchestration, Automation, and Response) platforms automate routine tasks, orchestrate tool integrations, and execute playbooks consistently. Analysts focus on decisions that require human judgment.

This week covers automation strategy, SOAR capabilities, playbook design, and how to build automation that enhances rather than replaces human analysts.

Key insight: Good automation doesn't replace analysts—it makes them more effective by handling tasks that don't require human judgment.

1) Security Automation Fundamentals

Understanding what to automate and why:

Automation Candidates:

Good for automation:
- Repetitive, high-volume tasks
- Well-defined procedures
- Consistent inputs and outputs
- Low decision complexity
- Time-sensitive actions

Poor for automation:
- Novel situations
- High judgment required
- Complex context needed
- Significant business impact
- Unclear procedures

Examples:

Automate:                    Don't automate:
- IOC enrichment            - Incident escalation decisions
- Alert triage (initial)    - Complex investigations
- Ticket creation           - Customer communications
- Blocking known bad        - Legal/compliance decisions
- Report generation         - Novel threat analysis

Automation Benefits:

Speed:
- Automated enrichment in seconds vs. minutes
- Immediate response to clear threats
- 24/7 operation without fatigue

Consistency:
- Same process every time
- No steps forgotten
- Documented actions

Scale:
- Handle thousands of alerts
- Alert volume can grow without a matching increase in analyst headcount
- Process during volume spikes

Analyst focus:
- Reduce tedious work
- Focus on interesting problems
- Improve job satisfaction
- Reduce burnout

Automation Risks:

Over-automation:
- Automated actions without oversight
- Business disruption from false positives
- Analysts lose skills/context

Under-automation:
- Analysts overwhelmed
- Inconsistent response
- Slow response times

Poor automation:
- Unreliable integrations
- Brittle playbooks
- Maintenance burden
- False confidence

Mitigation:
- Start small, expand gradually
- Human approval for impactful actions
- Monitor automation effectiveness
- Regular review and tuning

Key insight: Automation amplifies both good and bad processes. Fix the process before automating it.

2) SOAR Platforms

SOAR brings orchestration, automation, and response together:

SOAR Components:

Orchestration:
- Connect disparate security tools
- Coordinate workflows across systems
- Central management of integrations

Automation:
- Execute tasks without human intervention
- Trigger-based actions
- Scheduled jobs

Response:
- Execute containment actions
- Update tickets and cases
- Communicate with stakeholders

Case Management:
- Track incidents
- Document investigations
- Collaboration features

SOAR Architecture:

┌─────────────────────────────────────────────────────┐
│                    SOAR Platform                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐          │
│  │ Playbook │  │   Case   │  │ Dashboards│          │
│  │  Engine  │  │ Manager  │  │ /Reports  │          │
│  └────┬─────┘  └────┬─────┘  └──────────┘          │
│       │             │                               │
│  ┌────┴─────────────┴────┐                         │
│  │    Integration Layer   │                         │
│  └───────────┬───────────┘                         │
└──────────────┼──────────────────────────────────────┘
               │
    ┌──────────┼──────────┐
    │          │          │
┌───┴───┐  ┌───┴───┐  ┌───┴───┐
│ SIEM  │  │  EDR  │  │Firewall│  ...more tools
└───────┘  └───────┘  └───────┘

Popular SOAR Platforms:

Commercial:
- Splunk SOAR (Phantom)
- Palo Alto Networks Cortex XSOAR (Demisto)
- IBM QRadar SOAR (Resilient)
- Swimlane
- ServiceNow SecOps

Open Source:
- Shuffle
- TheHive + Cortex
- StackStorm

Cloud-Native:
- Microsoft Sentinel (built-in playbooks on Azure Logic Apps)
- Chronicle SOAR (Siemplify)
- AWS Security Hub (limited)

Selection factors:
- Integration with existing tools
- Playbook development ease
- Case management needs
- Cost and licensing
- Cloud vs. on-premises

Key insight: SOAR value comes from integrations. A SOAR platform with few integrations is just an expensive ticketing system.

3) Playbook Design

Playbooks codify response procedures for automation:

Playbook Structure:

Trigger:
- What starts this playbook?
- SIEM alert, manual run, schedule, or API call

Inputs:
- What data does it need?
- Alert fields, IOCs, context

Steps:
- Actions to perform
- Decision points
- Error handling

Outputs:
- Results produced
- Updates made
- Notifications sent

Human tasks:
- Where is approval needed?
- What requires analyst judgment?

Example: Phishing Alert Playbook

Trigger: SIEM alert "Phishing Email Detected"

┌─────────────────────────────────────────┐
│ 1. EXTRACT IOCs                         │
│    - Sender email                       │
│    - URLs in body                       │
│    - Attachment hashes                  │
└─────────────┬───────────────────────────┘
              ↓
┌─────────────────────────────────────────┐
│ 2. ENRICH IOCs                          │
│    - Check URL reputation (VirusTotal)  │
│    - Check domain age (WHOIS)           │
│    - Check hash (malware DBs)           │
└─────────────┬───────────────────────────┘
              ↓
┌─────────────────────────────────────────┐
│ 3. ASSESS RISK                          │
│    - High risk indicators?              │
│    - Known campaign?                    │
└─────────────┬───────────────────────────┘
              ↓
         ┌────┴────┐
    High │         │ Low
         ↓         ↓
┌────────────┐ ┌────────────┐
│ 4a. SCOPE  │ │ 4b. Close  │
│ - Find all │ │ - Update   │
│   recipients│ │   ticket   │
│ - Check    │ │ - Log      │
│   clicks   │ │   findings │
└─────┬──────┘ └────────────┘
      ↓
┌─────────────────────────────────────────┐
│ 5. CONTAIN                              │
│    - Block sender                       │
│    - Block URLs                         │
│    - Quarantine from mailboxes          │
└─────────────┬───────────────────────────┘
              ↓
┌─────────────────────────────────────────┐
│ 6. HUMAN REVIEW                         │
│    - Analyst reviews actions            │
│    - Approves or modifies               │
└─────────────────────────────────────────┘
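
The flow above can be sketched in a few Python functions. This is a simplified illustration under stated assumptions: the alert is a plain dictionary with hypothetical keys (`sender`, `body`, `attachment_hashes`), reputation lookups are replaced by an in-memory table instead of real VirusTotal or WHOIS calls, and containment actions are recorded as strings rather than executed.

```python
import re

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_iocs(alert: dict) -> dict:
    """Step 1: pull sender, URLs, and attachment hashes out of the alert."""
    return {
        "sender": alert["sender"],
        "urls": URL_PATTERN.findall(alert.get("body", "")),
        "hashes": list(alert.get("attachment_hashes", [])),
    }


def enrich_iocs(iocs: dict, reputation: dict) -> dict:
    """Step 2: look up each URL and hash in a stubbed reputation table."""
    indicators = iocs["urls"] + iocs["hashes"]
    return {ioc: reputation.get(ioc, "unknown") for ioc in indicators}


def assess_risk(verdicts: dict) -> str:
    """Step 3: treat the alert as high risk if any indicator is malicious."""
    return "high" if "malicious" in verdicts.values() else "low"


def run_phishing_playbook(alert: dict, reputation: dict) -> dict:
    iocs = extract_iocs(alert)
    verdicts = enrich_iocs(iocs, reputation)
    risk = assess_risk(verdicts)
    result = {"risk": risk, "verdicts": verdicts, "actions": []}
    if risk == "low":
        result["status"] = "closed"  # step 4b: update ticket, log findings
        return result
    # Steps 4a and 5: a real platform would call email and firewall
    # integrations here; this sketch only records what would be done.
    result["actions"] = (
        [f"block sender {iocs['sender']}"]
        + [f"block URL {url}" for url in iocs["urls"]]
        + ["quarantine message from mailboxes"]
    )
    result["status"] = "pending_human_review"  # step 6
    return result


alert = {
    "sender": "billing@example.test",
    "body": "Please verify at http://login.example.test/reset today",
    "attachment_hashes": [],
}
reputation = {"http://login.example.test/reset": "malicious"}
outcome = run_phishing_playbook(alert, reputation)
print(outcome["status"], outcome["actions"])
```

Keeping each step a separate function makes it easy to test steps in isolation and to swap a stub for a real integration later.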

Playbook Best Practices:

Design principles:

1. Start simple
   - Basic automation first
   - Add complexity gradually
   - Validate each step works

2. Include human checkpoints
   - Approval for destructive actions
   - Review points for complex decisions
   - Escalation paths

3. Handle errors gracefully
   - What if enrichment fails?
   - What if tool is down?
   - Don't leave incidents in limbo

4. Document thoroughly
   - What the playbook does
   - When to use it
   - Expected outcomes
   - Known limitations

5. Test before production
   - Use test alerts
   - Verify each integration
   - Check error handling

6. Monitor and improve
   - Track playbook performance
   - Gather analyst feedback
   - Iterate and improve
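
Principle 3 (handle errors gracefully) is easier to see in code. The following is a minimal sketch, assuming each enrichment source is just a Python callable; the function names and the `needs_manual_review` status are hypothetical. The point is that a failed lookup falls through to the next source, and a total failure routes the incident to a person instead of leaving it in limbo.

```python
def enrich_with_fallback(ioc, lookups):
    """Try each (name, lookup) pair in order; never raise, always return a status."""
    errors = []
    for name, lookup in lookups:
        try:
            verdict = lookup(ioc)
            return {"ioc": ioc, "source": name, "verdict": verdict,
                    "status": "enriched", "errors": errors}
        except Exception as exc:  # broad on purpose: any failure moves to the next source
            errors.append(f"{name}: {exc}")
    # Every source failed, so hand the incident to an analyst.
    return {"ioc": ioc, "source": None, "verdict": None,
            "status": "needs_manual_review", "errors": errors}


def primary_lookup(ioc):
    raise TimeoutError("service unavailable")  # simulates a tool that is down


def secondary_lookup(ioc):
    return "suspicious"  # simulates a working backup source


result = enrich_with_fallback(
    "example.test",
    [("primary", primary_lookup), ("secondary", secondary_lookup)],
)
print(result["status"], result["source"], result["errors"])
```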

Key insight: The best playbooks handle 80% of cases automatically and make the remaining 20% easier for analysts.

4) Integration and APIs

Automation depends on tool integration:

Common Integration Patterns:

SIEM Integration:
- Receive alerts (trigger)
- Query for additional data
- Create notable events
- Update alert status

EDR Integration:
- Query endpoint telemetry
- Isolate endpoints
- Collect forensic data
- Kill processes

Firewall Integration:
- Query connection logs
- Block IPs/domains
- Update blocklists
- Check rule status

Email Integration:
- Search for emails
- Quarantine messages
- Block senders
- Pull headers

Threat Intel Integration:
- Lookup IOC reputation
- Get related IOCs
- Check threat reports
- Submit samples

Working with APIs:

REST API basics (most common):

# Example: Check IP reputation
GET https://api.threatintel.com/v1/ip/203.0.113.100
Headers:
  Authorization: Bearer YOUR_API_KEY
  Content-Type: application/json

Response:
{
  "ip": "203.0.113.100",
  "reputation": "malicious",
  "confidence": 95,
  "categories": ["c2", "malware"],
  "last_seen": "2024-01-15T10:30:00Z"
}

# Example: Block IP on firewall
POST https://firewall.company.com/api/v1/blocklist
Headers:
  Authorization: Bearer YOUR_API_KEY
  Content-Type: application/json
Body:
{
  "ip": "203.0.113.100",
  "duration": "permanent",
  "reason": "SOAR: Confirmed C2 server"
}
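
The same two calls can be expressed with Python's standard library. This sketch only builds the requests and does not send them, because the hostnames are the illustrative ones from the examples above rather than real services. The helper names are hypothetical, and the IP is a documentation-range address (RFC 5737) used as a stand-in.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; load real keys from a secrets store


def _headers() -> dict:
    return {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }


def build_reputation_request(ip: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for an IP reputation lookup."""
    return urllib.request.Request(
        f"https://api.threatintel.com/v1/ip/{ip}",
        headers=_headers(),
        method="GET",
    )


def build_block_request(ip: str, reason: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that adds an IP to the blocklist."""
    body = {"ip": ip, "duration": "permanent", "reason": reason}
    return urllib.request.Request(
        "https://firewall.company.com/api/v1/blocklist",
        data=json.dumps(body).encode("utf-8"),
        headers=_headers(),
        method="POST",
    )


lookup = build_reputation_request("203.0.113.100")
block = build_block_request("203.0.113.100", "SOAR: Confirmed C2 server")
print(lookup.get_method(), lookup.full_url)
print(block.get_method(), json.loads(block.data))
```

Passing either object to `urllib.request.urlopen(request, timeout=10)` would send it. Against a real service, always set a timeout so a slow API cannot stall the playbook.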

Integration Challenges:

Common issues:

Authentication:
- API keys, OAuth, certificates
- Key rotation and management
- Permission scoping

Rate limits:
- APIs limit requests per minute
- Batch requests where possible
- Implement backoff/retry

Data formats:
- Different tools, different schemas
- Normalization required
- Field mapping maintenance

Availability:
- External APIs may be down
- Timeout handling
- Fallback procedures

Security:
- Secure credential storage
- Audit API usage
- Least privilege for integrations
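
The "implement backoff/retry" advice under rate limits can be sketched as exponential backoff with a little jitter. This is a generic pattern, not any vendor's client: `RateLimitError` is a hypothetical exception standing in for an HTTP 429 response, and the `sleep` parameter exists only so the example can run without actually waiting.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error a client raises when the API answers HTTP 429."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call func(), doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the playbook's error path take over
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))  # jitter avoids retry bursts


attempts = {"count": 0}


def flaky_lookup():
    """Fails twice with a rate-limit error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return {"verdict": "clean"}


waits = []  # record the delays instead of sleeping
result = call_with_backoff(flaky_lookup, sleep=waits.append)
print(result, [round(w, 1) for w in waits])
```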

Key insight: Integration is the hard part. Budget significant time for building, testing, and maintaining integrations.
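
Much of that maintenance time goes into the "data formats" problem listed above: each tool names the same thing differently. A minimal sketch of field mapping follows. The source names, vendor field names, and common schema are all invented for illustration and do not describe any real product's output.

```python
# Hypothetical per-tool mappings: vendor field name -> common field name.
FIELD_MAPS = {
    "edr": {"hostname": "host", "sha256": "file_hash", "severity": "severity"},
    "siem": {"src_host": "host", "hash": "file_hash", "sev": "severity"},
}


def normalize_alert(source: str, raw: dict) -> dict:
    """Translate a raw alert into the common schema and flag unmapped fields."""
    mapping = FIELD_MAPS[source]
    normalized = {
        common: raw[vendor] for vendor, common in mapping.items() if vendor in raw
    }
    normalized["source"] = source
    # Unmapped fields often signal that a vendor changed its schema.
    normalized["unmapped_fields"] = sorted(set(raw) - set(mapping))
    return normalized


siem_alert = {"src_host": "ws-042", "hash": "abc123", "sev": "high", "rule": "R1"}
print(normalize_alert("siem", siem_alert))
```

Surfacing unmapped fields, rather than silently dropping them, gives an early warning when an API update changes what a tool sends.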

5) Measuring Automation Effectiveness

Track whether automation delivers value:

Automation Metrics:

Volume metrics:
- Alerts processed by automation
- Playbook executions per day
- Percentage of alerts auto-enriched
- Percentage of alerts auto-resolved

Time metrics:
- Mean time to enrich (automated vs. manual)
- Mean time to respond (automated vs. manual)
- Time saved per alert
- Total analyst hours saved

Quality metrics:
- False positive rate of auto-closures
- Escalations from auto-triage
- Playbook failure rate
- Analyst satisfaction
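
Several of these metrics can be computed directly from playbook execution records. The sketch below assumes a hypothetical record format (`status`, `seconds`, `auto_closed`, `reopened`) and made-up sample data. It treats an auto-closed alert that an analyst later reopened as a stand-in for an incorrect auto-closure, which is an assumption rather than a standard definition.

```python
from statistics import mean

# Made-up execution records for illustration only.
runs = [
    {"status": "success", "seconds": 28, "auto_closed": True, "reopened": False},
    {"status": "success", "seconds": 35, "auto_closed": True, "reopened": True},
    {"status": "failed", "seconds": 60, "auto_closed": False, "reopened": False},
    {"status": "success", "seconds": 31, "auto_closed": False, "reopened": False},
]


def summarize(runs: list) -> dict:
    """Compute volume, time, and quality metrics from execution records."""
    total = len(runs)
    auto_closed = [r for r in runs if r["auto_closed"]]
    reopened = sum(r["reopened"] for r in auto_closed)
    return {
        "executions": total,
        "failure_rate": sum(r["status"] == "failed" for r in runs) / total,
        "auto_resolved_rate": len(auto_closed) / total,
        "auto_closure_error_rate": reopened / len(auto_closed) if auto_closed else 0.0,
        "mean_seconds": mean(r["seconds"] for r in runs),
    }


summary = summarize(runs)
for metric, value in summary.items():
    print(f"{metric}: {value}")
```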

ROI Calculation:

Simple automation ROI:

Time saved:
- Manual enrichment: 5 min/alert
- Automated enrichment: 30 sec/alert
- Savings: 4.5 min/alert

Volume:
- 500 alerts/day requiring enrichment
- 4.5 min × 500 = 2,250 min = 37.5 hours/day

Value:
- Analyst cost: $50/hour
- Daily savings: 37.5 × $50 = $1,875
- Annual savings: ~$480,000 (about 256 days at $1,875/day)

Costs:
- SOAR platform license
- Integration development
- Maintenance time
- Training

ROI = (Savings - Costs) / Costs
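
The arithmetic above can be wrapped in a small function so the inputs are easy to vary. Two values below are assumptions, not figures from this lesson: 256 days per year is used because it reproduces the ~$480,000 annual figure from $1,875/day, and the $200,000 annual cost is a purely hypothetical placeholder for license, development, maintenance, and training combined.

```python
def automation_roi(manual_min, automated_min, alerts_per_day,
                   hourly_cost, days_per_year, annual_cost):
    """Return time and cost savings plus ROI = (savings - costs) / costs."""
    hours_saved_per_day = (manual_min - automated_min) * alerts_per_day / 60
    daily_savings = hours_saved_per_day * hourly_cost
    annual_savings = daily_savings * days_per_year
    return {
        "hours_saved_per_day": hours_saved_per_day,
        "daily_savings": daily_savings,
        "annual_savings": annual_savings,
        "roi": (annual_savings - annual_cost) / annual_cost,
    }


estimate = automation_roi(
    manual_min=5,
    automated_min=0.5,
    alerts_per_day=500,
    hourly_cost=50,
    days_per_year=256,     # assumption that reproduces the ~$480,000 figure
    annual_cost=200_000,   # hypothetical total cost, for illustration only
)
print(estimate)
```

With these inputs the ROI comes out to 1.4, meaning each dollar spent returns the dollar plus $1.40. The result is only as good as the cost estimate, so it is worth rerunning with pessimistic numbers.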

Continuous Improvement:

Improvement cycle:

1. Measure current state
   - How long do tasks take?
   - Where is time spent?
   - What's repetitive?

2. Identify opportunities
   - High volume tasks
   - Consistent procedures
   - Integration availability

3. Implement automation
   - Start simple
   - Test thoroughly
   - Deploy gradually

4. Measure improvement
   - Did metrics improve?
   - Any negative impacts?
   - Analyst feedback?

5. Iterate
   - Expand successful automation
   - Fix problems
   - Find new opportunities

Key insight: If you can't measure the improvement, you can't prove the value. Track metrics from day one.

Real-World Context: Automation in Practice

Automation transforms SOC operations:

Alert Enrichment: Before automation, analysts manually checked reputation for every IP, domain, and hash. Now SOAR does this automatically in seconds, presenting analysts with enriched alerts ready for decision-making.

Phishing Response: Automated phishing playbooks can identify all recipients, check who clicked, quarantine remaining emails, and block IOCs—all before an analyst even reviews the alert.

Threat Intelligence: New IOCs from threat feeds are automatically checked against historical data, added to blocklists, and searched across endpoints—continuous protection without manual effort.

Challenges Observed:

Integration maintenance: APIs change, breaking playbooks
Over-reliance: Analysts lose skills when automation fails
Complexity creep: Playbooks become unmaintainable

Key insight: Successful automation programs balance efficiency gains with maintaining analyst skills and judgment.

Guided Lab: Design an Automated Playbook

Practice designing automation for a common scenario.

Scenario: Malware Alert Automation

Current manual process:

1. Alert received: "Malware detected on endpoint"
2. Analyst checks EDR for details (2 min)
3. Analyst looks up hash on VirusTotal (2 min)
4. Analyst checks if file was executed (3 min)
5. Analyst queries SIEM for other occurrences (3 min)
6. Analyst decides: isolate or not (1 min)
7. If isolate: analyst isolates endpoint (2 min)
8. Analyst documents in ticket (5 min)

Total: ~18 minutes per alert
Volume: 50 alerts/day

Step 1: Design Playbook Flow

Draw the playbook workflow:
- What triggers it?
- What data is extracted?
- What enrichments occur?
- Where are decision points?
- What actions are automated?
- Where is human review needed?

Step 2: Define Integrations

List required integrations:
- EDR system (which actions?)
- Threat intel (which lookups?)
- SIEM (which queries?)
- Ticketing (which updates?)

For each integration:
- What API calls needed?
- What data is sent/received?
- What errors might occur?

Step 3: Define Decision Logic

Create decision criteria:

Auto-isolate if:
- [condition 1]
- [condition 2]

Require analyst review if:
- [condition 1]
- [condition 2]

Auto-close if:
- [condition 1]
- [condition 2]

Step 4: Calculate Expected Improvement

Estimate new timing:
- Automated steps: X seconds
- Analyst review: Y minutes
- Total time: Z

Calculate savings:
- Time saved per alert
- Daily time saved
- Monthly analyst hours recovered
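
A small calculator can help with this step. The manual step times and the volume of 50 alerts/day come from the scenario above. The automated time (60 seconds), the review time (3 minutes), and the 30-day month are placeholders to replace with your own estimates, not recommended answers.

```python
# Manual step durations in minutes, taken from the scenario above.
MANUAL_STEP_MINUTES = {
    "check EDR details": 2,
    "look up hash": 2,
    "check execution": 3,
    "query SIEM": 3,
    "decide on isolation": 1,
    "isolate endpoint": 2,
    "document in ticket": 5,
}
ALERTS_PER_DAY = 50


def estimate_savings(automated_seconds, review_minutes, days_per_month=30):
    """Compare the manual process against an automated one with analyst review."""
    manual = sum(MANUAL_STEP_MINUTES.values())
    new_total = automated_seconds / 60 + review_minutes
    saved_per_alert = manual - new_total
    daily_hours = saved_per_alert * ALERTS_PER_DAY / 60
    return {
        "manual_min_per_alert": manual,
        "new_min_per_alert": new_total,
        "saved_min_per_alert": saved_per_alert,
        "daily_hours_saved": daily_hours,
        "monthly_hours_saved": daily_hours * days_per_month,
    }


# Placeholder estimates: 60 s of automated steps, 3 min of analyst review.
lab_estimate = estimate_savings(automated_seconds=60, review_minutes=3)
print(lab_estimate)
```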

Reflection (mandatory)

What was hardest about designing this playbook?
Where did you choose human review vs. full automation? Why?
What could go wrong with this automation?
How would you test this before production?

Week 11 Outcome Check

By the end of this week, you should be able to:

- Identify tasks appropriate for automation
- Understand SOAR platform capabilities
- Design effective security playbooks
- Understand API integration concepts
- Measure automation effectiveness
- Balance automation with human judgment

Next week: Capstone—bringing everything together in a SOC simulation exercise.

🎯 Hands-On Labs (Free & Essential)

Build automation muscle before moving to reading resources.

🎮 TryHackMe: SOAR 101

What you'll do: Explore SOAR concepts, integrations, and automated workflows.
Why it matters: Automation scales response without burning out analysts.
Time estimate: 1.5-2 hours

📝 Lab Exercise: Playbook Design

Task: Draft a phishing triage playbook with enrichment, decision, and response steps.
Deliverable: Playbook diagram + inputs, outputs, and approval gates.
Why it matters: Clear playbooks enable safe automation.
Time estimate: 60-90 minutes

🎮 TryHackMe: Shuffle (Automation Workflows)

What you'll do: Build a simple automated workflow to enrich and route alerts.
Why it matters: Orchestration links tools into repeatable response.
Time estimate: 1-1.5 hours

🛡️ Lab: Deploy Wazuh EDR + Rules

What you'll do: Install Wazuh agent + manager and create a basic detection rule.
Deliverable: Rule snippet and screenshot of alert triggered.
Why it matters: EDR adds endpoint visibility beyond logs.
Time estimate: 90-120 minutes

💡 Lab Tip: Automate enrichment first; keep decisions human until you're confident.

🛡️ Endpoint Detection & Response (EDR)

EDR closes visibility gaps. It captures process behavior, file activity, and command execution that SIEM logs often miss.

EDR core capabilities:
- Process tree and command-line telemetry
- File and registry monitoring
- Behavioral detection rules
- Isolation and response actions

📚 Building on CSY102: Apply process and service hardening concepts to endpoint telemetry.

Resources

Complete the required resources to build your foundation.

Splunk SOAR Overview · 30-45 min · 50 XP · Resource ID: csy201_w11_r1 (Required)
TheHive Documentation · 45-60 min · 50 XP · Resource ID: csy201_w11_r2 (Required)
Shuffle - Open Source SOAR · Reference · 25 XP · Resource ID: csy201_w11_r3 (Optional)

Lab: Build a Simple Automation

Goal: Create a working automation script that demonstrates integration and orchestration concepts.

Part 1: IOC Enrichment Script

Build a Python script that automates IOC enrichment:

Requirements:
- Input: List of IOCs (IPs, domains, hashes)
- Process: Query free threat intel APIs
- Output: Enriched IOC report

APIs to use (free):
- VirusTotal (with free API key)
- AbuseIPDB
- URLhaus

Script should:
1. Read IOCs from file
2. Determine IOC type
3. Query appropriate API
4. Compile results
5. Output report
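
Step 2 (determine IOC type) is a good place to start. The sketch below is one possible approach: it uses the standard `ipaddress` module for IPs, hex length for hashes, and a deliberately simple regular expression for domains. The routing table at the end is just one plausible assignment of the three free APIs named above, not a requirement of the lab.

```python
import ipaddress
import re

HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}
HEX_PATTERN = re.compile(r"[0-9a-fA-F]+")
# Simplified domain check: dot-separated labels ending in an alphabetic TLD.
DOMAIN_PATTERN = re.compile(
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,63}"
)

# One plausible routing choice (an assumption, adjust to the APIs you use).
SOURCE_FOR_TYPE = {
    "ip": "AbuseIPDB",
    "domain": "URLhaus",
    "md5": "VirusTotal",
    "sha1": "VirusTotal",
    "sha256": "VirusTotal",
}


def classify_ioc(value: str) -> str:
    """Return 'ip', 'md5', 'sha1', 'sha256', 'domain', or 'unknown'."""
    value = value.strip()
    try:
        ipaddress.ip_address(value)  # accepts IPv4 and IPv6
        return "ip"
    except ValueError:
        pass
    if HEX_PATTERN.fullmatch(value) and len(value) in HASH_LENGTHS:
        return HASH_LENGTHS[len(value)]
    if DOMAIN_PATTERN.fullmatch(value):
        return "domain"
    return "unknown"


samples = ["203.0.113.7", "d41d8cd98f00b204e9800998ecf8427e", "example.test", "not an ioc"]
for ioc in samples:
    kind = classify_ioc(ioc)
    print(ioc, kind, SOURCE_FOR_TYPE.get(kind, "skip"))
```

Checking for an IP address first matters, because a dotted IPv4 address would otherwise be tested against the domain pattern.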

Part 2: Decision Logic

Add automated decision-making:

Based on enrichment results:
- If malicious score > 80%: Flag as "Block immediately"
- If malicious score 50-80%: Flag as "Investigate"
- If malicious score < 50%: Flag as "Likely benign"

Add to output report
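
These thresholds translate into a short function. One boundary decision is made explicit here: a score of exactly 80 falls in the "Investigate" band and exactly 50 does too, following the "> 80" and "50-80" wording above. The function name is hypothetical.

```python
def flag_for_score(score: float) -> str:
    """Map a malicious score (0-100) to the lab's three decision flags."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score > 80:
        return "Block immediately"
    if score >= 50:
        return "Investigate"
    return "Likely benign"


for score in (95, 80, 50, 12):
    print(score, flag_for_score(score))
```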

Part 3: Action Simulation

Simulate response actions:

For "Block immediately" IOCs:
- Generate firewall rule (simulated)
- Create ticket (simulated)
- Log action taken

Output:
- Actions that would be taken
- Commands that would be executed
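
A dry-run function is one way to produce this output. The sketch assumes the firewall is a Linux host managed with `iptables`, which is an assumption made for illustration. It handles IP indicators only, and the ticket fields are hypothetical. Nothing is executed: the command is returned as text.

```python
import ipaddress
from datetime import datetime, timezone
from typing import Optional


def simulate_block(ip: str, score: float, now: Optional[datetime] = None) -> dict:
    """Describe the response to a "Block immediately" IP without executing anything."""
    # Validating the input first keeps arbitrary text out of the command string.
    address = ipaddress.ip_address(ip)
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    command = f"iptables -A INPUT -s {address} -j DROP"
    ticket = {
        "title": f"Block malicious IP {address}",
        "severity": "high",
        "details": f"Malicious score {score}% exceeded the 80% threshold.",
    }
    log_line = f"{timestamp} SIMULATED block of {address} (score={score})"
    return {"commands": [command], "ticket": ticket, "log": log_line}


plan = simulate_block(
    "203.0.113.7", 92, now=datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)
)
for command in plan["commands"]:
    print("WOULD RUN:", command)
print("WOULD CREATE TICKET:", plan["ticket"]["title"])
print(plan["log"])
```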

Part 4: Documentation

Document how the script works
Explain decision logic
Describe how this would integrate with real tools
Identify limitations and improvements

Deliverable (submit):

- Python script (or pseudocode)
- Sample input file
- Sample output report
- Documentation

Checkpoint Questions

What types of tasks are good candidates for automation?
What does SOAR stand for and what are its components?
Why should playbooks include human checkpoints?
What are common challenges with API integrations?
How do you measure automation effectiveness?
What's the risk of over-automation in a SOC?

Week 11 Quiz

Test your understanding of automation strategy and SOAR playbooks.

Format: 10 multiple-choice questions. Passing score: 70%. Time: Untimed.

Weekly Reflection

Reflection Prompt (200-300 words):

This week you learned about security automation—using technology to scale SOC operations. You designed playbooks, considered integrations, and thought about measuring effectiveness.

Reflect on these questions:

Automation can reduce analyst workload but also reduce analyst skills. How would you balance efficiency with skill development?
Many automation projects fail. What factors do you think contribute to success vs. failure?
Where is the line between "automate" and "requires human judgment"? How would you decide?
If you were building a SOC automation program from scratch, what would you automate first and why?

A strong reflection will consider both the benefits and risks of automation, with practical recommendations.

Verified Resources & Videos

SOAR Playbooks: XSOAR Content Repository
Security APIs: Security API Collection
Python for Security: Python for Security Scripts

Automation is a force multiplier for security operations. The skills you've practiced—playbook design, integration thinking, decision logic—enable you to build systems that scale. Next week: your capstone brings everything together in a realistic SOC simulation.