Opening Framing: From Logs to Intelligence
Last week you analyzed logs with command-line tools. That approach is powerful, but it doesn't scale. When you have billions of events across thousands of systems, you need a platform that can aggregate, normalize, correlate, and alert automatically and in real time.
That platform is a SIEM (Security Information and Event Management). SIEMs are the central nervous system of security operations. They collect data from everywhere, make it searchable, correlate events across sources, and generate alerts when threats are detected.
This week covers SIEM architecture, core capabilities, query languages, and how to build effective detections. You'll work with real SIEM concepts that apply to any platform.
Key insight: A SIEM is only as good as what you put in and how you use it. Bad data and poor rules produce noise. Good data and thoughtful rules produce actionable intelligence.
1) SIEM Architecture
Understanding SIEM components helps you use them effectively:
SIEM Core Components:
┌─────────────────────────────────────────────────────────┐
│                          SIEM                           │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐     │
│  │ Collect │→ │ Parse/  │→ │ Index/  │→ │ Analyze │     │
│  │         │  │Normalize│  │ Store   │  │ /Alert  │     │
│  └─────────┘  └─────────┘  └─────────┘  └─────────┘     │
│       ↑                                      ↓          │
│  Log Sources                            Dashboards      │
│  - Firewalls                            Alerts          │
│  - Servers                              Reports         │
│  - Endpoints                            Cases           │
│  - Cloud                                                │
└─────────────────────────────────────────────────────────┘
Data flow:
1. Collection: Ingest logs via agents, syslog, APIs
2. Parsing: Extract fields, normalize format
3. Enrichment: Add context (geo, asset, user info)
4. Indexing: Store for fast retrieval
5. Correlation: Match patterns across events
6. Alerting: Notify on detected threats
7. Visualization: Dashboards and reports
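The first three steps of the data flow can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SIEM implements ingestion: the sshd-style line format, the field names in the normalized event, and the `ASSETS` inventory are all assumptions made for the example.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Hypothetical sshd-style log line format, chosen only for illustration.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) sshd: "
    r"(?P<outcome>Failed|Accepted) password for (?P<user>\S+) "
    r"from (?P<src_ip>\S+)$"
)

# Hypothetical asset inventory used for enrichment.
ASSETS = {"web01": {"owner": "platform-team", "criticality": "high"}}


def parse(raw: str) -> Optional[dict]:
    """Parsing: extract fields from the raw line; None signals a parse failure."""
    match = LINE_RE.match(raw)
    return match.groupdict() if match else None


def normalize(fields: dict) -> dict:
    """Normalization: map source-specific fields onto one common schema (UTC time)."""
    ts = datetime.fromisoformat(fields["ts"]).astimezone(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "host": fields["host"].lower(),
        "user": fields["user"].lower(),
        "src_ip": fields["src_ip"],
        "action": "login",
        "outcome": "failure" if fields["outcome"] == "Failed" else "success",
    }


def enrich(event: dict) -> dict:
    """Enrichment: attach asset context when the host is known."""
    return {**event, "asset": ASSETS.get(event["host"], {})}


def ingest(raw: str) -> Optional[dict]:
    fields = parse(raw)
    return enrich(normalize(fields)) if fields else None


event = ingest(
    "2024-01-15T10:23:45+01:00 WEB01 sshd: Failed password for Admin from 203.0.113.5"
)
```

Normalizing timestamps to UTC and lowercasing hosts and users at ingest time is what later makes correlation across sources possible; lines that fail to parse return `None` so they can be counted rather than silently dropped.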
Popular SIEM Platforms:
Commercial:
- Splunk Enterprise Security
- Microsoft Sentinel
- IBM QRadar
- LogRhythm
- Exabeam
- Securonix
Open Source / Free Tier:
- Elastic Security (ELK Stack)
- Wazuh
- Graylog
- OSSIM (AlienVault)
Cloud-Native:
- Microsoft Sentinel (Azure)
- Google Security Operations (formerly Chronicle)
- Amazon Security Lake + OpenSearch
Key differentiators:
- Query language and ease of use
- Correlation capabilities
- Integration ecosystem
- Pricing model (data volume, users, features)
- Cloud vs on-premises deployment
Deployment Considerations:
Sizing factors:
- Events per second (EPS)
- Data retention requirements
- Number of log sources
- Search performance needs
- Number of concurrent users
Architecture patterns:
Small (< 5,000 EPS):
- Single server or small cluster
- 30-90 day hot storage
Medium (5,000-50,000 EPS):
- Distributed collection
- Search head cluster
- Tiered storage (hot/warm/cold)
Large (> 50,000 EPS):
- Globally distributed
- Multiple indexer clusters
- Data lake integration
- Heavy use of summarization
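To see how EPS and retention turn into storage, a back-of-the-envelope calculation helps. The sketch below assumes an average event size of 500 bytes and an optional on-disk ratio; both numbers are placeholders for illustration, and real values vary widely by log source and platform.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_gb(eps: float, avg_event_bytes: float) -> float:
    """Raw ingest per day in gigabytes (1 GB = 10**9 bytes)."""
    return eps * avg_event_bytes * SECONDS_PER_DAY / 1e9


def storage_tb(eps: float, avg_event_bytes: float, retention_days: int,
               on_disk_ratio: float = 1.0) -> float:
    """Storage needed for the retention period in terabytes.

    on_disk_ratio is an assumption: below 1.0 models compression,
    above 1.0 models index overhead or replication.
    """
    return daily_volume_gb(eps, avg_event_bytes) * retention_days * on_disk_ratio / 1000


# Example: the upper edge of the "small" tier, 90 days of hot storage.
per_day = daily_volume_gb(eps=5_000, avg_event_bytes=500)
total = storage_tb(eps=5_000, avg_event_bytes=500, retention_days=90)
```

With these assumed inputs, 5,000 EPS works out to roughly 216 GB per day and about 19.4 TB for 90 days, which shows why retention and tiered storage matter even for smaller deployments.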
Key insight: SIEM architecture decisions affect search speed, storage costs, and detection capability. Plan carefully before deployment.
2) SIEM Query Languages
Every SIEM has a query language. Learn the concepts, adapt to any syntax:
Splunk SPL (Search Processing Language):
# Basic search
index=security sourcetype=WinEventLog EventCode=4625
# Filter and select fields
index=security EventCode=4625
| table _time, user, src_ip, dest
# Count by field
index=security EventCode=4625
| stats count by src_ip
| sort -count
# Time-based analysis
index=security EventCode=4625
| timechart span=1h count by src_ip
# Multiple conditions
index=security (EventCode=4625 OR EventCode=4624)
| stats count by EventCode, user
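# Correlation with map: for each source IP with more than 10 failures,
# run a follow-up search for successful logons (4624) from that IP.
# Note: map runs one search per input row and stops after maxsearches (default 10).
index=security EventCode=4625
| stats count by src_ip
| where count > 10
| map search="search index=security EventCode=4624 src_ip=$src_ip$"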
Elastic/KQL (Kibana Query Language):
# Basic search
event.code: 4625
# Field search
event.code: 4625 AND source.ip: 192.168.1.100
# Wildcards
user.name: admin*
# Range queries
@timestamp >= "2024-01-15" AND @timestamp < "2024-01-16"
# Boolean logic
(event.code: 4625 OR event.code: 4624) AND user.name: jsmith
# Elasticsearch DSL for complex queries
{
  "query": {
    "bool": {
      "must": [
        { "match": { "event.code": "4625" } }
      ],
      "filter": [
        { "range": { "@timestamp": { "gte": "now-24h" } } }
      ]
    }
  },
  "aggs": {
    "by_ip": { "terms": { "field": "source.ip" } }
  }
}
Microsoft Sentinel KQL (Kusto Query Language, unrelated to Kibana's KQL despite the shared acronym):
// Basic query
SecurityEvent
| where EventID == 4625
// Filter and project
SecurityEvent
| where EventID == 4625
| project TimeGenerated, Account, IpAddress, Computer
// Aggregation
SecurityEvent
| where EventID == 4625
| summarize count() by IpAddress
| order by count_ desc
// Time analysis
SecurityEvent
| where EventID == 4625
| summarize count() by bin(TimeGenerated, 1h), IpAddress
| render timechart
// Join tables
SecurityEvent
| where EventID == 4625
| join kind=inner (
SecurityEvent | where EventID == 4624
) on IpAddress, Account
Key insight: Query language fluency is essential for SOC analysts. Practice until searching feels natural—speed matters during incidents.
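The three query languages above share the same underlying steps: filter, aggregate, then sort. The Python sketch below restates those steps on a small in-memory list so the concept is visible without any platform syntax; the sample events and field names are made up for illustration.

```python
from collections import Counter

# Made-up sample events; a real SIEM would read these from an index or table.
events = [
    {"event_code": 4625, "src_ip": "203.0.113.5", "user": "admin"},
    {"event_code": 4625, "src_ip": "203.0.113.5", "user": "root"},
    {"event_code": 4624, "src_ip": "192.168.1.50", "user": "jsmith"},
    {"event_code": 4625, "src_ip": "192.168.1.50", "user": "jsmith"},
    {"event_code": 4625, "src_ip": "203.0.113.5", "user": "admin"},
]

# Filter: EventCode=4625 (SPL) / where EventID == 4625 (Kusto)
failed = [e for e in events if e["event_code"] == 4625]

# Aggregate: stats count by src_ip (SPL) / summarize count() by IpAddress (Kusto)
counts = Counter(e["src_ip"] for e in failed)

# Sort: sort -count (SPL) / order by count_ desc (Kusto)
ranked = counts.most_common()
```

Once the filter, aggregate, sort pattern is familiar, moving between platforms is mostly a matter of looking up the matching keyword.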
3) Detection Rules and Correlation
SIEM value comes from detection rules that surface threats:
Detection Rule Components:
1. Data Source
- What logs does this rule need?
- Are they being collected?
2. Logic/Query
- What pattern indicates the threat?
- How specific vs. broad?
3. Threshold
- Single event or multiple?
- Time window for correlation?
4. Severity
- How critical if this fires?
- Drives response priority
5. Response
- What action when triggered?
- Who gets notified?
Detection Rule Examples:
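# Brute Force Detection (Splunk)
# Run as a scheduled search over a short window (for example, the last 5 minutes);
# severity and notification are set in the saved alert's configuration, not in SPL.
index=security EventCode=4625
| stats count as failures by src_ip, user
| where failures > 5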
// Impossible Travel (Sentinel KQL): find users with more than one sign-in location
SigninLogs
| summarize Locations=make_set(Location),
Times=make_list(TimeGenerated) by UserPrincipalName
| where array_length(Locations) > 1
// Additional logic to calculate travel time vs distance
# Suspicious Process Execution (Sigma - generic format)
title: Suspicious PowerShell Download
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        CommandLine|contains|all:
            - 'IEX'
            - 'WebClient'
            - 'DownloadString'
    condition: selection
level: high
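The impossible-travel query above stops at a comment saying that additional logic must compare travel time against distance. One way to sketch that logic is shown below in Python. It assumes each sign-in already carries latitude, longitude, and a timestamp, and it uses 900 km/h (roughly airliner cruising speed) as the plausibility cutoff; both the record shape and the cutoff are assumptions for illustration.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumed cutoff, tune for your environment


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def implied_speed_kmh(first, second):
    """Speed a user would need to travel between two sign-ins."""
    distance = haversine_km(first["lat"], first["lon"], second["lat"], second["lon"])
    hours = abs((second["time"] - first["time"]).total_seconds()) / 3600
    if hours == 0:
        return 0.0 if distance == 0 else float("inf")
    return distance / hours


def is_impossible_travel(first, second):
    return implied_speed_kmh(first, second) > MAX_PLAUSIBLE_KMH


london = {"lat": 51.5074, "lon": -0.1278, "time": datetime(2024, 1, 15, 10, 0)}
new_york = {"lat": 40.7128, "lon": -74.0060, "time": datetime(2024, 1, 15, 11, 0)}
paris = {"lat": 48.8566, "lon": 2.3522, "time": datetime(2024, 1, 15, 13, 0)}
```

Here London followed by New York one hour later is flagged, while London followed by Paris three hours later is not. In practice VPNs and mobile carriers produce misleading locations, so a rule like this usually needs an allowlist before it is useful.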
Correlation Rule Types:
Single Event:
- One event triggers alert
- Example: Known malware hash detected
- Low false positive, limited context
Threshold:
- Count exceeds limit in time window
- Example: >10 failed logins in 5 minutes
- Common for brute force, scanning
Sequence:
- Events occur in specific order
- Example: Login failure → success → privilege escalation
- Powerful for attack chains
Anomaly:
- Deviation from baseline
- Example: User accessing unusual systems
- Requires learning period, tuning
Absence:
- Expected event doesn't occur
- Example: No heartbeat from critical system
- Useful for availability monitoring
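Two of the rule types above, threshold and sequence, can be sketched as small Python functions over time-ordered events. This is a simplified illustration of the logic, not how a SIEM correlation engine is built; the event field names (`src_ip`, `user`, `action`, `time`) and the action labels are assumptions.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def threshold_alerts(events, limit=10, window=timedelta(minutes=5)):
    """Threshold rule: alert when one src_ip has more than `limit` events within `window`."""
    recent = defaultdict(deque)  # src_ip -> timestamps still inside the window
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        stamps = recent[ev["src_ip"]]
        stamps.append(ev["time"])
        while ev["time"] - stamps[0] > window:
            stamps.popleft()  # drop events that fell out of the window
        if len(stamps) > limit:
            alerts.append((ev["src_ip"], ev["time"], len(stamps)))
            stamps.clear()  # one alert per burst instead of one per extra event
    return alerts


def sequence_alerts(events, steps, window):
    """Sequence rule: alert when `steps` occur in order for one user within `window`."""
    progress = {}  # user -> (index of next expected step, time the sequence started)
    hits = []
    for ev in sorted(events, key=lambda e: e["time"]):
        idx, started = progress.get(ev["user"], (0, None))
        if idx > 0 and ev["time"] - started > window:
            idx, started = 0, None  # sequence expired, start over
        if ev["action"] == steps[idx]:
            if idx == 0:
                started = ev["time"]
            idx += 1
            if idx == len(steps):
                hits.append((ev["user"], started, ev["time"]))
                idx, started = 0, None
        progress[ev["user"]] = (idx, started)
    return hits


start = datetime(2024, 1, 15, 10, 0)
burst = [{"src_ip": "203.0.113.5", "time": start + timedelta(seconds=20 * i)} for i in range(11)]
slow = [{"src_ip": "10.0.0.25", "time": start + timedelta(minutes=i)} for i in range(11)]
alerts = threshold_alerts(burst + slow)

steps = ["login_failure", "login_success", "privilege_escalation"]
chain = [
    {"user": "jsmith", "action": "login_failure", "time": start},
    {"user": "jsmith", "action": "login_success", "time": start + timedelta(minutes=2)},
    {"user": "jsmith", "action": "privilege_escalation", "time": start + timedelta(minutes=5)},
]
hits = sequence_alerts(chain, steps, timedelta(minutes=15))
```

The burst of 11 failures in under four minutes trips the threshold, while the same number spread one per minute does not. The sequence function ignores unrelated events between steps, which is a design choice: stricter variants require the steps to be adjacent.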
Sigma: Universal Detection Format:
# Sigma rules are platform-agnostic
# Convert to Splunk, Elastic, Sentinel, etc.
title: Mimikatz Command Line
status: experimental
description: Detects Mimikatz execution
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        CommandLine|contains:
            - 'sekurlsa::'
            - 'kerberos::'
            - 'crypto::'
            - 'lsadump::'
    condition: selection
falsepositives:
    - Security tools that use similar patterns
level: critical
tags:
    - attack.credential_access
    - attack.t1003
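# Convert with the legacy sigmac tool:
sigmac -t splunk mimikatz.yml
sigmac -t es-qs mimikatz.yml
# sigmac has been superseded by sigma-cli (pySigma), where the rough equivalent is
# `sigma convert -t <backend> mimikatz.yml`; available backends depend on installed plugins.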
Key insight: Good detection rules are specific enough to catch real threats, but not so narrow they miss variations. Balance is an art developed through tuning.
4) SIEM Operations Best Practices
Effective SIEM use requires disciplined operations:
Data Quality:
Ensure quality data:
Collection:
□ All critical sources sending?
□ No gaps in data flow?
□ Timestamps accurate (NTP sync)?
Parsing:
□ Fields extracted correctly?
□ No parsing failures?
□ Normalized to common schema?
Enrichment:
□ Asset information accurate?
□ User context available?
□ Threat intel integrated?
Monitoring data health:
- Track EPS by source
- Alert on collection failures
- Regular data quality audits
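Tracking EPS by source and alerting on collection failures both reduce to simple bookkeeping. The sketch below assumes you can obtain per-source event counts for an interval and a last-seen timestamp per source; the source names and the 15-minute silence limit are placeholders.

```python
from datetime import datetime, timedelta


def eps_by_source(event_counts, interval_seconds):
    """Average events per second for each source over one measurement interval."""
    return {source: count / interval_seconds for source, count in event_counts.items()}


def silent_sources(last_seen, now, max_silence=timedelta(minutes=15)):
    """Sources whose most recent event is older than `max_silence`."""
    return sorted(source for source, seen in last_seen.items() if now - seen > max_silence)


now = datetime(2024, 1, 15, 12, 0)
last_seen = {
    "fw-edge-01": now - timedelta(minutes=2),
    "dc01": now - timedelta(hours=3),
    "ws-finance-42": now - timedelta(minutes=14),
}
rates = eps_by_source({"fw-edge-01": 30_000, "dc01": 600}, interval_seconds=300)
silent = silent_sources(last_seen, now)
```

A single global silence limit is the simplest option; sources that are naturally quiet (a rarely used admin host, for example) usually need their own, longer limit to avoid noisy health alerts.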
Rule Management:
Rule lifecycle:
Development:
1. Identify detection gap
2. Research attack technique
3. Write initial rule
4. Test against historical data
Deployment:
1. Enable in detection-only mode
2. Monitor for false positives
3. Tune thresholds/logic
4. Promote to production
Maintenance:
1. Regular review of rule performance
2. Tune based on analyst feedback
3. Update for new attack variants
4. Deprecate obsolete rules
Documentation:
- What the rule detects
- Why it matters
- Expected false positives
- Investigation steps
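The "monitor for false positives" and "regular review of rule performance" steps are easier to act on with a number attached. A minimal sketch, assuming analysts record a verdict of `true_positive` or `false_positive` for each alert and that a precision below 0.5 marks a rule for tuning (both the verdict labels and the cutoff are assumptions):

```python
from collections import defaultdict


def rule_precision(dispositions):
    """Return {rule: true_positives / total_alerts} from (rule, verdict) pairs."""
    totals = defaultdict(int)
    true_pos = defaultdict(int)
    for rule, verdict in dispositions:
        totals[rule] += 1
        if verdict == "true_positive":
            true_pos[rule] += 1
    return {rule: true_pos[rule] / totals[rule] for rule in totals}


def needs_tuning(dispositions, min_precision=0.5):
    """Rules whose share of true positives falls below `min_precision`."""
    precision = rule_precision(dispositions)
    return sorted(rule for rule, value in precision.items() if value < min_precision)


# Hypothetical triage results collected while rules run in detection-only mode.
triage = [
    ("brute_force", "true_positive"),
    ("brute_force", "false_positive"),
    ("brute_force", "true_positive"),
    ("ps_download", "false_positive"),
    ("ps_download", "false_positive"),
    ("ps_download", "true_positive"),
]
to_tune = needs_tuning(triage)
```

Precision alone says nothing about missed attacks, so it complements rather than replaces testing against historical data.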
Performance Optimization:
Query optimization:
Slow:
index=* | search error
Fast:
index=application sourcetype=app_log error
Tips:
- Specify index and sourcetype
- Filter early, aggregate late
- Use time ranges
- Avoid wildcards at start of terms
- Use summary indexes for dashboards
Resource management:
- Schedule heavy searches off-peak
- Use data models for common queries
- Archive old data to cheaper storage
- Monitor search concurrency
Key insight: SIEM is an ongoing investment. Without continuous tuning and maintenance, it becomes an expensive log warehouse instead of a detection platform.
5) Building Effective Dashboards
Dashboards transform data into actionable visibility:
Dashboard Types:
Operational Dashboard:
- Real-time metrics
- Alert queue status
- Current incidents
- Used by: SOC analysts, shift leads
Executive Dashboard:
- High-level KPIs
- Trends over time
- Risk posture
- Used by: CISO, management
Investigation Dashboard:
- Deep-dive views
- Drill-down capability
- Used by: Tier 2/3 analysts
Threat Dashboard:
- Specific threat monitoring
- Campaign tracking
- IOC hits
- Used by: Threat intel team
Effective Dashboard Design:
SOC Operations Dashboard:
┌─────────────────────┬───────────────┬───────────────┐
│ Open Alerts: 47     │ Critical: 3   │ MTTD: 4.2h    │
├─────────────────────┴───────────────┴───────────────┤
│                                                     │
│       [Alert Volume - Last 24h - Time Chart]        │
│                                                     │
├─────────────────────┬───────────────────────────────┤
│ Top Alert Types     │ Top Source IPs                │
│ 1. Failed Login 23  │ 1. 192.168.1.50 15            │
│ 2. Malware Det. 12  │ 2. 10.0.0.25    12            │
│ 3. Policy Viol.  8  │ 3. 203.0.113.5   8            │
├─────────────────────┴───────────────────────────────┤
│ Recent Critical Alerts                              │
│ • 10:23 - Ransomware detected - SERVER01            │
│ • 10:15 - Data exfil alert - WS-FINANCE-42          │
│ • 09:58 - Brute force success - DC01                │
└─────────────────────────────────────────────────────┘
Design principles:
- Most important info at top
- Use color purposefully (red=critical)
- Enable drill-down to details
- Refresh appropriately (not too fast)
- Avoid chart junk—every element should inform
Key insight: A good dashboard answers questions at a glance. If analysts have to think hard to interpret it, redesign it.
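The header row of the sample dashboard (open alerts, critical count, MTTD) is derived from underlying records. The sketch below shows one way those numbers could be computed, reading MTTD as mean time to detect, that is, the average gap between when an incident started and when it was detected. The record fields and status values are assumptions for illustration.

```python
from datetime import datetime
from statistics import mean


def header_metrics(alerts, incidents):
    """Compute the three headline numbers for an operations dashboard."""
    open_alerts = [a for a in alerts if a["status"] == "open"]
    critical = [a for a in open_alerts if a["severity"] == "critical"]
    delays_h = [
        (i["detected"] - i["started"]).total_seconds() / 3600 for i in incidents
    ]
    return {
        "open_alerts": len(open_alerts),
        "critical": len(critical),
        "mttd_hours": mean(delays_h) if delays_h else None,
    }


alerts = [
    {"status": "open", "severity": "critical"},
    {"status": "open", "severity": "medium"},
    {"status": "closed", "severity": "critical"},
]
incidents = [
    {"started": datetime(2024, 1, 15, 6, 0), "detected": datetime(2024, 1, 15, 8, 0)},
    {"started": datetime(2024, 1, 15, 1, 0), "detected": datetime(2024, 1, 15, 7, 0)},
]
metrics = header_metrics(alerts, incidents)
```

Keeping the calculation this explicit makes it easy to explain to management what a number like "MTTD: 4.2h" actually measures.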
Real-World Context: SIEM in SOC Operations
SIEM is the SOC's primary tool:
Daily Operations: Analysts start their day in the SIEM—checking the alert queue, reviewing overnight activity, searching for anomalies. The SIEM is where investigations begin and evidence is gathered.
Incident Response: During incidents, the SIEM provides the timeline. What systems were accessed? What data was touched? When did the attack start? Incident commanders rely on SIEM queries to understand scope and impact.
Compliance: Auditors want evidence of security monitoring. SIEM provides logs, alerts, and reports that demonstrate due diligence. Many compliance frameworks (PCI DSS, for example) require centralized log collection and regular log review, which a SIEM or equivalent tooling provides.
MITRE ATT&CK Integration:
- Detection Coverage: Map rules to ATT&CK techniques
- Gap Analysis: Identify techniques without detection
- Threat Intel: Search for technique-specific IOCs
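Detection coverage and gap analysis come down to set arithmetic once each rule is tagged with ATT&CK technique IDs. In the sketch below, the Mimikatz mapping to T1003 comes from the Sigma rule's tags earlier in this lesson; the other rule-to-technique mappings and the priority list are illustrative assumptions, not an authoritative mapping.

```python
# Rule name -> ATT&CK technique IDs it is intended to detect.
RULES = {
    "Mimikatz Command Line": {"T1003"},               # from the Sigma rule's tags
    "Brute Force Detection": {"T1110"},               # assumed mapping
    "Suspicious PowerShell Download": {"T1059.001"},  # assumed mapping
}

# Hypothetical list of techniques this organization cares most about.
PRIORITY_TECHNIQUES = {"T1003", "T1021", "T1059.001", "T1110", "T1566"}


def coverage_report(rules, priorities):
    """Return covered techniques, gaps, and the covered fraction of priorities."""
    detected = set().union(*rules.values()) if rules else set()
    covered = sorted(priorities & detected)
    gaps = sorted(priorities - detected)
    ratio = len(covered) / len(priorities) if priorities else 0.0
    return {"covered": covered, "gaps": gaps, "ratio": ratio}


report = coverage_report(RULES, PRIORITY_TECHNIQUES)
```

With these inputs the gaps are T1021 (Remote Services) and T1566 (Phishing). Having a rule tagged with a technique does not prove the rule works, so coverage figures are best read alongside rule testing results.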
Key insight: SIEM mastery is career-defining for SOC analysts. The analyst who can quickly find answers in the SIEM is invaluable during incidents.
Guided Lab: SIEM Query Practice
Let's practice SIEM queries using common scenarios. These exercises use generic syntax—adapt to your platform.