CSY103 Week 06 Beginner

Programming Fundamentals

Opening Framing: Beyond Single Values

So far, you've worked with individual variables: one IP address, one port, one username. But security data comes in collections: lists of blocked IPs, tables of user permissions, mappings of ports to services, collections of IOCs from threat intelligence feeds.

Data structures let you organize related data together. A list holds an ordered collection you can iterate through. A dictionary maps keys to values for fast lookup. Together, they cover most of the data organization you will need in security scripting.

This week marks a turning point: you'll move from processing single items to managing collections—the foundation of real security tools.

Key insight: The right data structure makes code simple; the wrong one makes it painful. Lists for sequences, dictionaries for lookups—choose based on how you'll use the data.

1) Lists: Ordered Collections

Lists store ordered sequences of items. Items can be any type and can be accessed by position (index):

# Creating lists
blocked_ips = ["192.168.1.50", "10.0.0.25", "172.16.0.100"]
open_ports = [22, 80, 443, 8080]
mixed_data = ["admin", 5, True, 3.14]

# Accessing by index (0-based)
print(blocked_ips[0])   # "192.168.1.50" (first)
print(blocked_ips[-1])  # "172.16.0.100" (last)
print(open_ports[1:3])  # [80, 443] (slice)

# Length
print(len(blocked_ips))  # 3

Modifying Lists:

# Add items
blocked_ips.append("203.0.113.50")       # Add to end
blocked_ips.insert(0, "198.51.100.1")    # Insert at position

# Remove items
blocked_ips.remove("10.0.0.25")          # Remove by value
removed = blocked_ips.pop()               # Remove and return last
del blocked_ips[0]                        # Remove by index

# Check membership
if "192.168.1.50" in blocked_ips:
    print("IP is blocked")

List Operations:

# Combine lists
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = list1 + list2  # [1, 2, 3, 4, 5, 6]

# Sort
ports = [443, 22, 80, 8080]
ports.sort()              # In-place: [22, 80, 443, 8080]
sorted_ports = sorted(ports, reverse=True)  # New list, descending

# Reverse
ports.reverse()           # In-place reversal

Key insight: Lists maintain order and allow duplicates. Use lists when sequence matters (log entries, scan results) or when you need to iterate through items.
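As a small illustration of the point above, the sketch below shows that a list keeps repeated entries in arrival order, along with one common way to drop duplicates while keeping first-seen order. The `scan_hits` data and the variable names are made up for this example.

```python
# Hypothetical sample data: the same IP shows up more than once
scan_hits = ["10.0.0.5", "10.0.0.9", "10.0.0.5", "10.0.0.7", "10.0.0.9"]

# Lists keep every entry, in the order it was added
print(len(scan_hits))                # 5
print(scan_hits.count("10.0.0.5"))   # 2

# Remove duplicates but keep first-seen order
unique_hits = []
for ip in scan_hits:
    if ip not in unique_hits:
        unique_hits.append(ip)

print(unique_hits)  # ['10.0.0.5', '10.0.0.9', '10.0.0.7']
```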

2) Dictionaries: Key-Value Mappings

Dictionaries store key-value pairs. Instead of accessing by position, you access by key—perfect for lookups:

# Creating dictionaries
port_services = {
    22: "SSH",
    80: "HTTP",
    443: "HTTPS",
    3389: "RDP"
}

user_info = {
    "username": "admin",
    "role": "administrator",
    "failed_logins": 3,
    "is_locked": False
}

# Accessing by key
print(port_services[22])         # "SSH"
print(user_info["username"])     # "admin"

# Safe access with .get() (no error if missing)
print(port_services.get(8080, "Unknown"))  # "Unknown"

Modifying Dictionaries:

# Add or update
port_services[8080] = "HTTP-Alt"    # Add new
port_services[22] = "Secure Shell"  # Update existing

# Remove
del port_services[3389]             # Remove by key
removed = port_services.pop(80)     # Remove and return value

# Check if key exists
if 443 in port_services:
    print("HTTPS mapping exists")

Iterating Dictionaries:

# Iterate keys
for port in port_services:
    print(port)

# Iterate values
for service in port_services.values():
    print(service)

# Iterate both (most common)
for port, service in port_services.items():
    print(f"Port {port}: {service}")

Key insight: Dictionaries provide average-case O(1) lookup: access time stays roughly constant as the dictionary grows. Use dictionaries when you need to look up values by a unique key.
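To make the lookup difference concrete, here is a minimal sketch that finds a user record two ways: by scanning a list, and by looking it up in a dictionary built once from that list. The `users` records and the names `find_in_list` and `users_by_name` are assumptions for illustration only.

```python
# Hypothetical records stored as a list of dictionaries
users = [
    {"username": "admin", "role": "administrator"},
    {"username": "jsmith", "role": "analyst"},
    {"username": "guest", "role": "viewer"},
]

def find_in_list(username):
    """Scan the list item by item until a match is found."""
    for user in users:
        if user["username"] == username:
            return user
    return None

# Build a dictionary index once: username -> record
users_by_name = {}
for user in users:
    users_by_name[user["username"]] = user

# Both approaches return the same record
print(find_in_list("jsmith")["role"])     # analyst
print(users_by_name["jsmith"]["role"])    # analyst
print(users_by_name.get("nobody"))        # None
```

The list scan checks more items as the list grows, while the dictionary lookup goes to the key directly.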

3) Security Data Patterns

Let's see how lists and dictionaries model real security data:

Pattern 1: Blocklist (List)

# Simple blocklist - order doesn't matter, just membership
ip_blocklist = [
    "192.168.1.50",
    "203.0.113.100",
    "198.51.100.25"
]

def is_blocked(ip):
    return ip in ip_blocklist

# Check incoming connection
incoming_ip = "203.0.113.100"
if is_blocked(incoming_ip):
    print(f"DENIED: {incoming_ip} is blocklisted")
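When only membership matters, a set is a common alternative to a list. The sketch below is one possible variant of the blocklist above; the names `ip_blockset` and `is_blocked_fast` are hypothetical and not part of the course code.

```python
# Same addresses as the list version, stored in a set
ip_blockset = {"192.168.1.50", "203.0.113.100", "198.51.100.25"}

def is_blocked_fast(ip):
    """Membership test against a set (hash-based, like dictionary keys)."""
    return ip in ip_blockset

print(is_blocked_fast("203.0.113.100"))  # True
print(is_blocked_fast("8.8.8.8"))        # False

# Adding an address that is already present changes nothing
ip_blockset.add("192.168.1.50")
print(len(ip_blockset))  # 3
```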

Pattern 2: Threat Intelligence (Dictionary)

# IOC database with metadata
ioc_database = {
    "5d41402abc4b2a76b9719d911017c592": {
        "type": "MD5",
        "malware_family": "Emotet",
        "severity": "HIGH",
        "first_seen": "2024-01-15"
    },
    "192.168.1.50": {
        "type": "IP",
        "category": "C2 Server",
        "severity": "CRITICAL",
        "first_seen": "2024-01-10"
    }
}

# Look up an IOC
hash_to_check = "5d41402abc4b2a76b9719d911017c592"
if hash_to_check in ioc_database:
    info = ioc_database[hash_to_check]
    print(f"MATCH: {info['malware_family']} ({info['severity']})")

Pattern 3: Event Counter (Dictionary)

# Count events by source
login_attempts = [
    "192.168.1.50", "10.0.0.25", "192.168.1.50",
    "192.168.1.50", "172.16.0.1", "10.0.0.25"
]

# Build counter dictionary
ip_counts = {}
for ip in login_attempts:
    if ip in ip_counts:
        ip_counts[ip] += 1
    else:
        ip_counts[ip] = 1

# Or use .get() for cleaner code
ip_counts = {}
for ip in login_attempts:
    ip_counts[ip] = ip_counts.get(ip, 0) + 1

print(ip_counts)
# {'192.168.1.50': 3, '10.0.0.25': 2, '172.16.0.1': 1}

Key insight: The counter pattern (dictionary counting occurrences) is fundamental to security analytics—detecting anomalies, finding top talkers, identifying patterns.
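The standard library also ships `collections.Counter`, which implements the same counting pattern. A minimal sketch, reusing the sample login data from above (the threshold value is an arbitrary choice for the example):

```python
from collections import Counter

login_attempts = [
    "192.168.1.50", "10.0.0.25", "192.168.1.50",
    "192.168.1.50", "172.16.0.1", "10.0.0.25"
]

# Counter builds the {item: count} mapping in one step
ip_counts = Counter(login_attempts)
print(ip_counts["192.168.1.50"])  # 3

# most_common(n) returns the n highest counts as (item, count) tuples
top_talkers = ip_counts.most_common(2)
print(top_talkers)  # [('192.168.1.50', 3), ('10.0.0.25', 2)]

# Flag sources at or above a threshold (arbitrary value for illustration)
THRESHOLD = 3
suspicious = []
for ip, count in ip_counts.items():
    if count >= THRESHOLD:
        suspicious.append(ip)
print(suspicious)  # ['192.168.1.50']
```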

4) Nested Structures

Real security data often requires nested structures—lists of dictionaries or dictionaries containing lists:

# List of dictionaries: Multiple events
security_events = [
    {
        "timestamp": "2024-01-15 09:23:45",
        "event_type": "login_failure",
        "source_ip": "203.0.113.50",
        "username": "admin"
    },
    {
        "timestamp": "2024-01-15 09:24:12",
        "event_type": "login_failure",
        "source_ip": "203.0.113.50",
        "username": "root"
    },
    {
        "timestamp": "2024-01-15 09:25:00",
        "event_type": "login_success",
        "source_ip": "192.168.1.10",
        "username": "jsmith"
    }
]

# Process events
for event in security_events:
    if event["event_type"] == "login_failure":
        print(f"Failed login: {event['username']} from {event['source_ip']}")

Dictionary with Lists:

# User permissions model
user_permissions = {
    "admin": ["read", "write", "delete", "admin"],
    "analyst": ["read", "write"],
    "viewer": ["read"]
}

# Check permission
def has_permission(username, action):
    if username not in user_permissions:
        return False
    return action in user_permissions[username]

print(has_permission("analyst", "read"))    # True
print(has_permission("analyst", "delete"))  # False
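The same check can be written more compactly with `.get()`. This is an alternative sketch, not a replacement for the version above; the function name `has_permission_short` is hypothetical.

```python
user_permissions = {
    "admin": ["read", "write", "delete", "admin"],
    "analyst": ["read", "write"],
    "viewer": ["read"]
}

def has_permission_short(username, action):
    """Unknown users get an empty permission list, so the result is False."""
    return action in user_permissions.get(username, [])

print(has_permission_short("analyst", "write"))   # True
print(has_permission_short("viewer", "delete"))   # False
print(has_permission_short("mallory", "read"))    # False
```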

Complex Nesting: Firewall Rules

# Firewall rule structure
firewall_rules = {
    "inbound": [
        {"action": "allow", "port": 443, "source": "any"},
        {"action": "allow", "port": 22, "source": "192.168.0.0/16"},
        {"action": "deny", "port": 23, "source": "any"}
    ],
    "outbound": [
        {"action": "allow", "port": 443, "source": "any"},
        {"action": "deny", "port": 25, "source": "any"}
    ]
}

# Process inbound rules
print("Inbound Rules:")
for rule in firewall_rules["inbound"]:
    print(f"  {rule['action'].upper()} port {rule['port']} from {rule['source']}")

Key insight: Most API responses, log formats (JSON), and configuration files use nested structures. Master navigation through nested data and you can parse anything.
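Nested data from real sources often has missing fields. The sketch below shows one way to walk a nested dictionary without raising `KeyError`. The `alert` record and its field names are invented for the example.

```python
# Hypothetical alert record: a dictionary containing a dictionary and a list
alert = {
    "id": 1001,
    "source": {"ip": "203.0.113.50", "geo": {"country": "NL"}},
    "tags": ["bruteforce", "ssh"],
}

# Direct access: chain one key (or index) per level
print(alert["source"]["geo"]["country"])  # NL
print(alert["tags"][0])                   # bruteforce

# Safe access: .get() with an empty dict as default keeps the chain going
city = alert.get("source", {}).get("geo", {}).get("city", "unknown")
print(city)  # unknown

# A missing top-level section is handled the same way
user = alert.get("user", {}).get("name", "n/a")
print(user)  # n/a
```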

5) List Comprehensions: Pythonic Processing

List comprehensions provide a concise way to create lists from existing data—extremely useful for filtering and transforming security data:

# Traditional loop approach
ports = [22, 80, 443, 8080, 3389]
privileged = []
for port in ports:
    if port < 1024:
        privileged.append(port)

# List comprehension (same result, one line)
privileged = [port for port in ports if port < 1024]
print(privileged)  # [22, 80, 443]

Security Applications:

# Extract all failed login IPs
events = [
    {"type": "login_fail", "ip": "10.0.0.1"},
    {"type": "login_success", "ip": "10.0.0.2"},
    {"type": "login_fail", "ip": "10.0.0.3"},
]

failed_ips = [e["ip"] for e in events if e["type"] == "login_fail"]
print(failed_ips)  # ['10.0.0.1', '10.0.0.3']

# Transform data: uppercase all usernames
usernames = ["admin", "root", "guest"]
upper_names = [name.upper() for name in usernames]
print(upper_names)  # ['ADMIN', 'ROOT', 'GUEST']

# Filter and transform: get lengths of long passwords
passwords = ["abc", "password123", "x", "SecureP@ssw0rd!"]
long_pwd_lengths = [len(p) for p in passwords if len(p) >= 8]
print(long_pwd_lengths)  # [11, 15]

Dictionary Comprehensions:

# Create port:service mapping from lists
ports = [22, 80, 443]
services = ["SSH", "HTTP", "HTTPS"]

port_map = {port: service for port, service in zip(ports, services)}
print(port_map)  # {22: 'SSH', 80: 'HTTP', 443: 'HTTPS'}

# Invert a dictionary
service_to_port = {v: k for k, v in port_map.items()}
print(service_to_port)  # {'SSH': 22, 'HTTP': 80, 'HTTPS': 443}

Key insight: Comprehensions are "Pythonic": they express intent clearly and often run somewhat faster than the equivalent loop that calls append(). Security professionals who read Python encounter them constantly.
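Comprehensions also work on the nested structures from section 4. As a sketch, the example below pulls values out of a small list of rule dictionaries (a trimmed copy of the firewall sample) and uses a set comprehension to drop duplicates. The variable names are illustrative.

```python
inbound_rules = [
    {"action": "allow", "port": 443, "source": "any"},
    {"action": "allow", "port": 22, "source": "192.168.0.0/16"},
    {"action": "deny", "port": 23, "source": "any"},
]

# List comprehension: ports that are explicitly allowed
allowed_ports = [r["port"] for r in inbound_rules if r["action"] == "allow"]
print(allowed_ports)  # [443, 22]

# Set comprehension: distinct actions used in the rule set
actions = {r["action"] for r in inbound_rules}
print(sorted(actions))  # ['allow', 'deny']

# Dictionary comprehension: port -> action lookup table
port_action = {r["port"]: r["action"] for r in inbound_rules}
print(port_action[23])  # deny
```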

Real-World Context: Data Structures in Security Tools

Data structures are the backbone of security tools:

SIEM Event Storage: Security events can be modeled as lists of dictionaries, where each event is a dictionary with timestamp, type, source, and so on. Queries filter and aggregate these structures. Conceptually, a search like "source_ip=10.0.0.1" filters a list of dictionaries.

Threat Intelligence Platforms: IOC databases are dictionaries mapping indicators to metadata. VirusTotal's API returns JSON (nested dictionaries) with scan results, vendor detections, and file metadata—all accessed by key.

Configuration Management: Security tool rules and configs (Snort, YARA, Suricata) can often be parsed into dictionaries. A Suricata rule, for example, can be represented as a dictionary with action, protocol, source, destination, and options as keys.

MITRE ATT&CK Reference: The ATT&CK framework itself is a data structure! Techniques map to tactics (dictionary), each technique has metadata (nested dictionary), and mitigations are lists. The STIX format represents this as nested JSON.

Key insight: JSON—the universal data exchange format—is just nested dictionaries and lists. Master Python data structures and you can work with any API, any log format, any configuration.
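To see that mapping directly, the sketch below uses Python's built-in `json` module to turn a JSON string into dictionaries and lists and back again. The JSON text is a made-up sample, not the response format of any real API.

```python
import json

# Made-up JSON text, similar in shape to an API response
raw = '{"indicator": "203.0.113.50", "malicious": true, "tags": ["c2", "beacon"], "score": null}'

# json.loads: JSON text -> Python objects
data = json.loads(raw)
print(type(data).__name__)                 # dict
print(data["tags"][1])                     # beacon
print(data["malicious"], data["score"])    # True None

# Modify it like any other dictionary
data["tags"].append("apt")

# json.dumps: Python objects -> JSON text
print(json.dumps(data, indent=2))
```

JSON `true`, `false`, and `null` become Python `True`, `False`, and `None`; objects become dictionaries and arrays become lists.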

Guided Lab: Threat Intelligence Database

Let's build a simple threat intelligence database using dictionaries and implement lookup and reporting functions.

Step 1: Create the Script

Create threat_intel.py:

# Threat Intelligence Database
# Demonstrates lists, dictionaries, and nested structures

# IOC Database (dictionary of dictionaries)
threat_db = {
    "5d41402abc4b2a76b9719d911017c592": {
        "type": "hash",
        "hash_type": "MD5",
        "malware": "Emotet",
        "severity": "HIGH",
        "tags": ["banking", "trojan", "botnet"]
    },
    "203.0.113.50": {
        "type": "ip",
        "category": "C2",
        "malware": "Cobalt Strike",
        "severity": "CRITICAL",
        "tags": ["apt", "c2", "beacon"]
    },
    "evil-domain.com": {
        "type": "domain",
        "category": "Phishing",
        "malware": "Credential Harvester",
        "severity": "MEDIUM",
        "tags": ["phishing", "credentials"]
    },
    "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2": {
        "type": "hash",
        "hash_type": "SHA256",
        "malware": "Ransomware",
        "severity": "CRITICAL",
        "tags": ["ransomware", "encryption", "extortion"]
    }
}

def lookup_ioc(indicator):
    """Look up an IOC in the threat database; return None if not found."""
    return threat_db.get(indicator)

def search_by_tag(tag):
    """Find all IOCs with a specific tag."""
    results = []
    for ioc, info in threat_db.items():
        if tag in info["tags"]:
            results.append({"indicator": ioc, "info": info})
    return results

def search_by_severity(severity):
    """Find all IOCs with specific severity."""
    return [
        {"indicator": ioc, "info": info}
        for ioc, info in threat_db.items()
        if info["severity"] == severity
    ]

def generate_report():
    """Generate threat intelligence summary."""
    # Count by type
    type_counts = {}
    severity_counts = {}
    all_tags = []
    
    for ioc, info in threat_db.items():
        # Count types
        ioc_type = info["type"]
        type_counts[ioc_type] = type_counts.get(ioc_type, 0) + 1
        
        # Count severities
        sev = info["severity"]
        severity_counts[sev] = severity_counts.get(sev, 0) + 1
        
        # Collect tags
        all_tags.extend(info["tags"])
    
    # Count tag frequency
    tag_counts = {}
    for tag in all_tags:
        tag_counts[tag] = tag_counts.get(tag, 0) + 1
    
    return {
        "total_iocs": len(threat_db),
        "by_type": type_counts,
        "by_severity": severity_counts,
        "top_tags": sorted(tag_counts.items(), key=lambda x: x[1], reverse=True)[:5]
    }


# Main execution
if __name__ == "__main__":
    print("=" * 50)
    print("THREAT INTELLIGENCE DATABASE")
    print("=" * 50)
    
    # Test lookup
    print("\n[1] IOC Lookup Test")
    test_ioc = "203.0.113.50"
    result = lookup_ioc(test_ioc)
    if result:
        print(f"  Found: {test_ioc}")
        print(f"  Type: {result['type']}")
        print(f"  Malware: {result['malware']}")
        print(f"  Severity: {result['severity']}")
    
    # Test tag search
    print("\n[2] Search by Tag: 'c2'")
    c2_results = search_by_tag("c2")
    for item in c2_results:
        print(f"  {item['indicator']}: {item['info']['malware']}")
    
    # Test severity search
    print("\n[3] Search by Severity: 'CRITICAL'")
    critical = search_by_severity("CRITICAL")
    for item in critical:
        print(f"  {item['indicator']}: {item['info']['malware']}")
    
    # Generate report
    print("\n[4] Threat Intelligence Report")
    report = generate_report()
    print(f"  Total IOCs: {report['total_iocs']}")
    print(f"  By Type: {report['by_type']}")
    print(f"  By Severity: {report['by_severity']}")
    print(f"  Top Tags: {report['top_tags']}")
    
    print("\n" + "=" * 50)

Step 2: Run and Analyze

Run the script and observe how data structures enable complex queries.

Step 3: Reflection (mandatory)

  1. Why is a dictionary the right choice for the threat database?
  2. How does search_by_severity() use list comprehension?
  3. What data structure does generate_report() return?
  4. How would you add a new IOC to the database?

Week 6 Outcome Check

By the end of this week, you should be able to:

Next week: File Operations—where we read real log files and write reports, connecting our data structures to persistent storage.

🎯 Hands-On Labs (Free & Essential)

Practice lists and dictionaries before moving to reading resources.

🎮 TryHackMe: Python Basics (Data Structures)

What you'll do: Work with lists and dictionaries in short exercises.
Why it matters: Most security data is best modeled as collections.
Time estimate: 1-1.5 hours

📝 Lab Exercise: IOC Dictionary Builder

Task: Build a dictionary that maps IPs to severity and tags.
Deliverable: A script that prints lookups and counts by severity.
Why it matters: Lookups are a constant part of triage and detection.
Time estimate: 45-60 minutes

🏁 PicoCTF Practice: General Skills (Data Structures)

What you'll do: Solve beginner challenges that require list/dict usage.
Why it matters: Data structures keep your scripts efficient and readable.
Time estimate: 1-2 hours

💡 Lab Tip: Dictionaries are for fast lookup; lists are for ordered processing. Choose intentionally.

🛡️ Secure Coding: Safe Data Structures

Lists and dictionaries often encode policy: allowlists, deny rules, and detection mappings. Handle them defensively.

Data structure safety checklist:
- Normalize keys (case, whitespace) before lookup
- Use dict.get(key, default) for safe fallbacks
- Prefer sets for allowlists/denylists
- Avoid mutating a list while iterating
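The checklist items can be sketched roughly as follows. The helper `normalize_key` and the sample data are assumptions made for this example; real normalization rules depend on the data you handle.

```python
def normalize_key(value):
    """Strip surrounding whitespace and lowercase a lookup key."""
    return value.strip().lower()

# Prefer a set for allowlists/denylists; store keys already normalized
denied_domains = {normalize_key(d) for d in ["Evil-Domain.com ", "bad.example"]}

# Normalize before lookup so case or whitespace tricks do not bypass the check
print(normalize_key("  EVIL-DOMAIN.COM") in denied_domains)  # True

# dict.get with a default: unknown roles fall back to no permissions
role_permissions = {"analyst": ["read", "write"], "viewer": ["read"]}
print(role_permissions.get("intern", []))  # []

# Avoid mutating a list while iterating: build a new list instead
sessions = ["alice", "", "bob", "", "carol"]
active_sessions = [s for s in sessions if s]
print(active_sessions)  # ['alice', 'bob', 'carol']
```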

📚 Building on CSY101 Week-13: Model bypasses that exploit unnormalized keys.

Resources

Complete the required resources to build your foundation.

Lab: Security Event Aggregator

Goal: Build a script that aggregates security events and produces statistical summaries.

Linux/Windows Path (same for both)

  1. Create event_aggregator.py
  2. Create a list of 20+ security event dictionaries with fields: timestamp, event_type, source_ip, destination_port, severity
  3. Implement these functions:
    • count_by_type(events) - return dict of event_type counts
    • count_by_source(events) - return dict of source_ip counts
    • filter_by_severity(events, severity) - return filtered list
    • get_top_sources(events, n) - return top n source IPs
  4. Use at least one list comprehension
  5. Print a formatted summary report

Deliverable (submit):

Checkpoint Questions

  1. What is the difference between a list and a dictionary?
  2. How do you access the third element of a list?
  3. How do you safely access a dictionary key that might not exist?
  4. What does .items() return when iterating a dictionary?
  5. Write a list comprehension to get all even numbers from 1-10.
  6. Why is the counter pattern useful in security analytics?

Weekly Reflection

Reflection Prompt (200-300 words):

This week you learned data structures—the organizing principles that make complex security data manageable. Lists and dictionaries are fundamental to every security tool and data format.

Reflect on these questions:

A strong reflection will connect data structures to practical security data management challenges.

Verified Resources & Videos

Data structures are the foundation of data processing. With lists and dictionaries, you can model virtually any security data. Next week: reading and writing files to work with real data.

Week 06 Quiz

Test your understanding of the weekly concepts.

Format: 10 multiple-choice questions. Passing score: 70%. Time: Untimed.
