Opening Framing: The Foundation of Detection
When an incident occurs, logs tell the story. Who accessed the system? What commands were executed? Where did the data go? Without logs, investigations are guesswork. With comprehensive logs, you can reconstruct exactly what happened.
Log management is the foundation of security operations. Before you can detect threats, you must collect the right data. Before you can investigate incidents, you must be able to search and correlate that data. Before you can prove what happened, you must have reliable log retention.
This week covers the complete log lifecycle: collection, normalization, storage, analysis, and retention. You'll learn to work with logs from multiple sources and extract the signals that matter.
Key insight: The SOC analyst who masters log analysis can investigate any incident. Logs are your primary evidence source.
1) The Log Lifecycle
Effective log management follows a structured lifecycle:
Log Lifecycle Stages:
1. Generation
- Systems create log entries
- Applications record events
- Network devices log traffic
2. Collection
- Agents forward logs
- Syslog receives messages
- APIs pull from cloud services
3. Normalization
- Convert to common format
- Parse fields consistently
- Enrich with context
4. Storage
- Index for fast search
- Compress for efficiency
- Tier based on age
5. Analysis
- Search and query
- Correlate across sources
- Alert on patterns
6. Retention
- Keep per policy/compliance
- Archive older logs
- Eventually delete
Log Collection Architecture:
┌─────────────┐
│ SIEM │
│ (Central) │
└──────┬──────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌────┴────┐ ┌────┴────┐ ┌────┴────┐
│Log │ │Log │ │Log │
│Forwarder│ │Forwarder│ │Collector│
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
┌────┴────┐ ┌────┴────┐ ┌────┴────┐
│Windows │ │Linux │ │Network │
│Servers │ │Servers │ │Devices │
└─────────┘ └─────────┘ └─────────┘
Collection methods:
- Agent-based: Software on each system
- Agentless: Pull via API, WMI, SSH
- Syslog: Standard protocol (UDP/TCP 514, TLS 6514)
- Cloud API: AWS CloudTrail, Azure Activity
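The syslog method above frames every message with a PRI value derived from facility and severity. A minimal sketch of that computation and an RFC 5424-style message (the hostname, app name, and PID are sample values, not from a real system):

```shell
# Syslog PRI = facility * 8 + severity (RFC 5424).
# Facility 4 (auth) at severity 4 (warning) yields <36>.
facility=4
severity=4
pri=$(( facility * 8 + severity ))
# VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
msg="<${pri}>1 2024-01-15T10:23:45Z webserver sshd 12345 - - Failed password"
echo "$msg"
```

Knowing how PRI packs two values into one number helps when a collector shows raw `<36>` prefixes instead of decoded facility/severity fields.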
Log Volume Considerations:
Typical daily volumes:
Source Events/Day Size/Day
─────────────────────────────────────────────────
Windows DC 1-10 million 1-10 GB
Firewall 10-100 million 10-100 GB
Web Proxy 10-50 million 5-50 GB
DNS Server 50-500 million 10-100 GB
EDR (per endpoint) 10,000-100,000 100 MB-1 GB
Challenges:
- Storage costs grow quickly
- Search performance degrades with volume
- Network bandwidth for collection
- Processing power for parsing
Solutions:
- Selective collection (not everything)
- Tiered storage (hot/warm/cold)
- Log aggregation and summarization
- Cloud-based elastic scaling
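The tiered-storage idea above comes down to back-of-the-envelope math. A sketch of sizing a one-year retention plan; the daily volume, tier split, and compression ratio are illustrative assumptions, not benchmarks:

```shell
# Rough storage sizing: daily ingest x retention, split into tiers.
daily_gb=50      # assumed average daily ingest
hot_days=30      # searchable (uncompressed) tier
cold_days=335    # archive tier, completing a 365-day policy
compress=5       # assumed ~5:1 compression ratio in the archive
hot=$(( daily_gb * hot_days ))
cold=$(( daily_gb * cold_days / compress ))
echo "hot: ${hot} GB  cold: ${cold} GB  total: $(( hot + cold )) GB"
```

Even with generous compression, the archive tier dominates—which is why retention policy decisions are storage-budget decisions.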
Key insight: Collect what you need, not everything possible. More logs isn't always better—targeted collection beats drowning in data.
2) Critical Log Sources
Not all logs are equal. Focus on high-value sources:
Windows Event Logs:
Security Log (most important):
- 4624: Successful logon
- 4625: Failed logon
- 4648: Explicit credentials used
- 4672: Special privileges assigned
- 4688: Process created (if enabled)
- 4698: Scheduled task created
- 4720: User account created
- 4732: Member added to security group
- 4768: Kerberos TGT requested
- 4769: Kerberos service ticket requested
System Log:
- 7045: Service installed
- 7036: Service state change
PowerShell Logs (if enabled):
- 4103: Module logging
- 4104: Script block logging
Sysmon (if deployed):
- 1: Process creation with hashes
- 3: Network connection
- 7: Image loaded (DLLs)
- 11: File created
- 13: Registry value set
Linux Logs:
/var/log/auth.log (Debian) or /var/log/secure (RHEL):
- SSH authentication
- sudo usage
- User switching (su)
- PAM authentication
/var/log/syslog or /var/log/messages:
- System events
- Service starts/stops
- Kernel messages
/var/log/audit/audit.log (if auditd enabled):
- System calls
- File access
- Process execution
- User commands
Application-specific:
- /var/log/apache2/ or /var/log/nginx/
- /var/log/mysql/
- Application custom logs
Network Device Logs:
Firewall:
- Allowed connections
- Denied connections
- NAT translations
- VPN connections
IDS/IPS:
- Alert events
- Blocked attacks
- Protocol anomalies
Proxy:
- URL requests
- User agent strings
- Bytes transferred
- Categories blocked
DNS:
- Query requests
- Response codes
- Query types
Cloud Logs:
AWS:
- CloudTrail: API calls
- VPC Flow Logs: Network traffic
- GuardDuty: Threat findings
- S3 Access Logs
Azure:
- Activity Log: Control plane
- Sign-in Logs: Authentication
- Audit Logs: Directory changes
- NSG Flow Logs: Network
Google Cloud:
- Cloud Audit Logs
- VPC Flow Logs
- Access Transparency Logs
Key insight: Start with authentication logs—they tell you who did what. Add process execution and network logs for complete visibility.
3) Log Parsing and Normalization
Raw logs must be parsed into structured data:
Raw log entry:
Jan 15 10:23:45 webserver sshd[12345]: Failed password for invalid user admin from 192.168.1.100 port 54321 ssh2
Parsed fields:
{
"timestamp": "2024-01-15T10:23:45",
"hostname": "webserver",
"program": "sshd",
"pid": 12345,
"event_type": "authentication_failure",
"user": "admin",
"user_exists": false,
"src_ip": "192.168.1.100",
"src_port": 54321,
"protocol": "ssh2"
}
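Extracting those fields from the raw sshd line can be sketched with standard shell tools. The field positions assume this exact message shape; production parsers handle many variants:

```shell
line='Jan 15 10:23:45 webserver sshd[12345]: Failed password for invalid user admin from 192.168.1.100 port 54321 ssh2'
host=$(echo "$line" | awk '{print $4}')                            # 4th space-separated field
user=$(echo "$line" | grep -oE 'user [^ ]+'    | awk '{print $2}') # word after "user"
ip=$(echo "$line"   | grep -oE 'from [0-9.]+'  | awk '{print $2}') # address after "from"
port=$(echo "$line" | grep -oE 'port [0-9]+'   | awk '{print $2}') # number after "port"
printf '{"hostname":"%s","user":"%s","src_ip":"%s","src_port":%s}\n' \
  "$host" "$user" "$ip" "$port"
```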
Common Log Formats:
Syslog (RFC 5424):
<priority>version timestamp hostname app-name procid msgid structured-data msg
Common Event Format (CEF):
CEF:Version|Vendor|Product|Version|ID|Name|Severity|Extension
Log Event Extended Format (LEEF):
LEEF:Version|Vendor|Product|Version|EventID|key=value pairs (tab-delimited)
JSON (increasingly common):
{"timestamp":"...","event":"...","src":"...","dst":"..."}
Windows Event XML:
<Event><System>...</System><EventData>...</EventData></Event>
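Pipe-delimited formats like CEF split cleanly on `|`. A quick sketch of pulling header fields from a made-up "Acme" firewall event:

```shell
# CEF header: CEF:Version|Vendor|Product|Version|ID|Name|Severity|Extension
cef='CEF:0|Acme|Firewall|1.0|100|Connection Denied|5|src=10.0.0.5 dst=192.168.1.10'
parsed=$(echo "$cef" | awk -F'|' '{print "vendor=" $2, "name=" $6, "severity=" $7}')
echo "$parsed"
```

The extension field (after the seventh pipe) holds key=value pairs and needs a second parsing pass.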
Normalization Challenges:
Same event, different formats:
Firewall A: "DENY tcp 10.0.0.5 → 192.168.1.10:443"
Firewall B: "action=block proto=6 src=10.0.0.5 dst=192.168.1.10 dport=443"
Firewall C: {"action":"denied","protocol":"TCP","source":"10.0.0.5"...}
Normalized form:
{
"action": "deny",
"protocol": "tcp",
"src_ip": "10.0.0.5",
"dst_ip": "192.168.1.10",
"dst_port": 443
}
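Normalizing Firewall B's key=value line into the common schema can be sketched in awk; the block-to-deny and 6-to-tcp mappings mirror the example above:

```shell
raw='action=block proto=6 src=10.0.0.5 dst=192.168.1.10 dport=443'
norm=$(echo "$raw" | awk '{
  # load key=value pairs into an associative array
  for (i = 1; i <= NF; i++) { split($i, kv, "="); f[kv[1]] = kv[2] }
  # map vendor vocabulary onto the normalized schema
  act   = (f["action"] == "block") ? "deny" : f["action"]
  proto = (f["proto"]  == "6")     ? "tcp"  : f["proto"]
  printf "{\"action\":\"%s\",\"protocol\":\"%s\",\"src_ip\":\"%s\",\"dst_ip\":\"%s\",\"dst_port\":%s}\n",
         act, proto, f["src"], f["dst"], f["dport"]
}')
echo "$norm"
```

A real pipeline keeps one such mapping per source; the payoff is that every downstream query sees only the normalized form.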
Normalization enables:
- Consistent queries across sources
- Correlation rules that work everywhere
- Dashboards that aggregate data
- Analysts who don't need to know every format
Field Enrichment:
Add context to raw logs:
Original: src_ip=192.168.1.100
Enriched:
{
"src_ip": "192.168.1.100",
"src_hostname": "workstation-42",
"src_user": "jsmith",
"src_department": "Finance",
"src_location": "Building A",
"src_is_critical": false,
"src_geo_country": "US"
}
Enrichment sources:
- Asset inventory (hostname, owner)
- Directory services (user info)
- Threat intel (reputation)
- GeoIP databases (location)
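Enrichment is essentially a lookup against those sources. A toy sketch joining an event's source IP against a two-row asset inventory (the CSV contents and hostnames are invented for illustration):

```shell
# Toy asset inventory: ip,hostname,owner,department
inv=$(mktemp)
cat > "$inv" <<'EOF'
192.168.1.100,workstation-42,jsmith,Finance
192.168.1.101,workstation-43,adoe,HR
EOF

# Look up the event's source IP and attach hostname and owner.
enriched=$(awk -F',' -v ip="192.168.1.100" '
  $1 == ip { printf "src_ip=%s src_hostname=%s src_user=%s\n", ip, $2, $3 }
' "$inv")
echo "$enriched"
rm -f "$inv"
```

SIEMs do the same join at ingest time (lookup tables, LDAP connectors) so analysts never query bare IPs.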
Key insight: Time spent on parsing and normalization pays off massively during investigations. Messy logs = slow analysis.
4) Log Analysis Techniques
Effective log analysis requires systematic techniques:
Search Strategies:
Start broad, narrow down:
1. Time window
- When did the suspicious activity occur?
- What's the relevant timeframe?
2. Source filter
- Which systems are involved?
- What log sources are relevant?
3. Event type
- What kind of activity?
- Authentication? Network? Process?
4. Specific indicators
- Known bad IPs, domains, hashes?
- Specific user accounts?
Example progression:
- All events, last 24 hours (too broad)
- All auth events, last 24 hours
- Failed auth events, last 24 hours
- Failed auth events, external IPs, last 24 hours
- Failed auth for user "admin", external IPs, last 24 hours
Pattern Recognition:
Anomaly patterns to recognize:
Brute Force:
- Many failed logins, same target
- Sequential attempts in short time
- Often followed by success
Lateral Movement:
- Successful auth to multiple systems
- Short time between connections
- Often using same credentials
Data Exfiltration:
- Large outbound data transfers
- Unusual destination
- Outside business hours
Persistence:
- New scheduled tasks
- New services installed
- Registry run key modifications
C2 Communication:
- Regular beacon intervals
- Unusual ports or protocols
- Base64 or encoded data
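Regular beacon intervals can be spotted by diffing consecutive connection timestamps to one destination. A sketch with fabricated epoch times; real traffic jitters, so in practice you bucket the deltas and look for low variance rather than exact matches:

```shell
# Epoch timestamps of outbound connections to a single destination.
deltas=$(printf '%s\n' 1705312800 1705312860 1705312920 1705312980 |
  awk 'NR > 1 { print $1 - prev } { prev = $1 }')
echo "$deltas"   # constant 60-second gaps suggest an automated check-in
```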
Correlation Techniques:
Connecting related events:
Temporal correlation:
- Events within same time window
- "What else happened at 3:42 AM?"
Entity correlation:
- Same user across systems
- Same IP across log sources
- Same process across events
Sequence correlation:
- Login → privilege escalation → data access
- Download → execution → network connection
- Recon → exploit → persist
Example correlation rule:
IF (failed_login > 5 in 5 minutes from same src_ip)
AND (successful_login from same src_ip within 10 minutes)
THEN alert "Possible Brute Force Success"
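That rule can be sketched in awk over a normalized "timestamp event src_ip" stream. The event data is fabricated, and this simplification counts all prior failures rather than enforcing a sliding 5-minute window as a SIEM rule engine would:

```shell
alert=$(awk '
  $2 == "failed"  { fails[$3]++; last_fail[$3] = $1 }
  # fire when a success follows more than 5 failures within 600 seconds
  $2 == "success" && fails[$3] > 5 && ($1 - last_fail[$3]) <= 600 {
      printf "ALERT: possible brute force success from %s\n", $3
  }
' <<'EOF'
100 failed 203.0.113.7
130 failed 203.0.113.7
160 failed 203.0.113.7
190 failed 203.0.113.7
220 failed 203.0.113.7
250 failed 203.0.113.7
400 success 203.0.113.7
EOF
)
echo "$alert"
```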
Key insight: Analysis is about asking the right questions. Start with "what am I looking for?" not "let me search for everything."
5) Command-Line Log Analysis
Master command-line tools for quick analysis:
Essential Commands:
# View logs
cat /var/log/auth.log
less /var/log/syslog
tail -f /var/log/auth.log # Follow in real-time
# Search with grep
grep "Failed password" /var/log/auth.log
grep -i "error" /var/log/syslog # Case insensitive
grep -v "CRON" /var/log/syslog # Exclude pattern
grep -E "Failed|Invalid" /var/log/auth.log # Multiple patterns
# Extract fields with awk
awk '{print $1, $2, $3, $11}' /var/log/auth.log
awk -F',' '{print $3}' file.csv # CSV field extraction
# Count and sort
grep "Failed password" /var/log/auth.log | wc -l
grep -oE "from [0-9.]+" /var/log/auth.log | sort | uniq -c | sort -rn
# Time-based filtering
awk '/^Jan 15 1[0-2]:/' /var/log/auth.log # Jan 15, 10-12 hours
Practical Analysis Examples:
# Find brute force attacks
# Top 10 IPs with failed SSH logins
grep "Failed password" /var/log/auth.log | \
grep -oE "from [0-9.]+" | \
sort | uniq -c | sort -rn | head -10
# Find successful logins after failures
grep -E "(Failed|Accepted)" /var/log/auth.log | \
grep -B5 "Accepted"
# Extract unique usernames attempted
grep "Failed password" /var/log/auth.log | \
grep -oE "for (invalid user )?[^ ]+" | \
sort | uniq -c | sort -rn
# Timeline of sudo commands
grep "sudo:" /var/log/auth.log | \
grep "COMMAND" | \
awk '{print $1, $2, $3, $6, $NF}'
JSON Log Analysis with jq:
# Parse JSON logs
cat events.json | jq '.'
# Extract specific fields
cat events.json | jq '.src_ip, .event_type'
# Filter events
cat events.json | jq 'select(.event_type == "login_failed")'
# Count by field
cat events.json | jq -r '.src_ip' | sort | uniq -c | sort -rn
# Complex queries (jq has no CIDR matching; approximate with a prefix test)
cat events.json | jq 'select(.event_type == "login_failed" and
(.src_ip | startswith("10.") | not))'
Windows Log Analysis (PowerShell):
# Get security events
Get-WinEvent -LogName Security -MaxEvents 100
# Filter by Event ID
Get-WinEvent -FilterHashtable @{LogName='Security';ID=4625}
# Failed logins in last 24 hours
$start = (Get-Date).AddDays(-1)
Get-WinEvent -FilterHashtable @{
LogName='Security'
ID=4625
StartTime=$start
}
# Extract specific fields
Get-WinEvent -FilterHashtable @{LogName='Security';ID=4624} |
Select-Object TimeCreated,
@{N='User';E={$_.Properties[5].Value}},
@{N='SourceIP';E={$_.Properties[18].Value}}
Key insight: Command-line analysis is fast and powerful. These skills work anywhere—you don't always have SIEM access.
Real-World Context: Log Analysis in Investigations
Log analysis drives every security investigation:
Incident Timeline: When investigating a breach, logs provide the timeline. What was the first malicious event? How did the attacker move through the environment? When did exfiltration occur? Logs answer these questions.
Alert Validation: When an alert fires, logs provide context. Is this a real threat or false positive? What activity preceded the alert? What happened after? Logs turn a single alert into a complete picture.
Threat Hunting: Hunters proactively search logs for threats that didn't trigger alerts. Anomalous patterns, rare events, suspicious combinations—all found through log analysis.
MITRE ATT&CK Data Sources:
- DS0002 - User Account: Authentication logs
- DS0009 - Process: Process creation logs
- DS0029 - Network Traffic: Firewall, proxy, flow logs
- DS0022 - File: File creation/modification logs
Key insight: Logs are your evidence. Preserve them, understand them, and learn to extract their secrets efficiently.
Guided Lab: Command-Line Log Analysis
Let's practice log analysis using command-line tools on real log data.