Opening Framing: Data Is Everything
In Week 1, you wrote your first script—but it just printed static text. Real security scripts work with data: IP addresses, usernames, timestamps, hash values, log entries, threat scores. To manipulate this data, we need to store it somewhere. That's what variables do.
But not all data is the same. An IP address is text, a port number is a number, and "is this IP malicious?" is a yes/no question. Python handles these differently, and understanding data types prevents bugs that could make your security tools fail when you need them most.
This week, you'll learn to store, retrieve, and manipulate the fundamental building blocks of all security data.
Key insight: Every piece of security data—from a single password character to a massive log file—is ultimately stored in variables with specific types. Master this, and you master the foundation of all data processing.
1) Variables: Naming and Assignment
A variable is a name that refers to a value stored in memory. Think of it as a labeled box: the label is the variable name, and the contents are the value.
# Creating variables (assignment)
ip_address = "192.168.1.100"
port_number = 443
is_malicious = False
# Using variables
print(ip_address)
print(port_number)
print(is_malicious)
Naming Rules:
- Must start with a letter or underscore
- Can contain letters, numbers, and underscores
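- Case-sensitive: IP and ip are different variables
- Cannot use Python keywords like if, for, while (print is a built-in function rather than a keyword in Python 3, but reusing it as a variable name hides the function, so avoid that too)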
Naming Conventions (best practices):
- Use descriptive names: source_ip not x
- Use snake_case: failed_login_count not failedLoginCount
- Be consistent throughout your script
Key insight: Good variable names make code self-documenting. When you read blocked_ip_list, you immediately know what it contains. When you read data2, you have no idea.
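The naming rules above can be checked from Python itself. The following is a minimal sketch, not a required part of the lesson: the helper name is_usable_name and the candidate names are hypothetical, chosen only for illustration. It relies on the standard keyword module and the str.isidentifier() method.

```python
import keyword

# Hypothetical candidate names, chosen only for illustration
candidates = ["source_ip", "_retry_count", "2nd_attempt", "failed-logins", "for"]

def is_usable_name(name):
    """Return True if name follows the identifier rules and is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

for name in candidates:
    print(f"{name!r}: usable={is_usable_name(name)}")
```

A name that starts with a digit or contains a hyphen fails the identifier check, and a reserved word such as for fails the keyword check.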
2) Strings: Text Data
Strings are sequences of characters—text. In security, strings hold: IP addresses, usernames, file paths, hash values, log messages, URLs, email addresses, and much more.
# Creating strings (use quotes)
username = "admin"
file_path = '/var/log/auth.log'
hash_value = "5d41402abc4b2a76b9719d911017c592"
# String operations
print(len(username)) # Length: 5
print(username.upper()) # ADMIN
print(username.lower()) # admin
print(hash_value[0:8]) # First 8 chars: 5d41402a
Essential String Methods for Security:
log_line = "Failed password for admin from 192.168.1.50"
# Check if string contains something
print("admin" in log_line) # True
print(log_line.startswith("Failed")) # True
print(log_line.endswith("50")) # True
# Split string into parts
parts = log_line.split(" ")
print(parts) # ['Failed', 'password', 'for', 'admin', 'from', '192.168.1.50']
# Strip whitespace (critical for parsing!)
messy = " 192.168.1.1 \n"
clean = messy.strip()
print(clean) # 192.168.1.1 (no surrounding whitespace or newline)
Key insight: Most security data arrives as strings—log files, network packets, user input. String manipulation is the most common operation in security scripts.
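The methods above can be combined to pull fields out of a log line. This is a rough sketch that assumes every line has exactly the layout of the sample ("Failed password for <user> from <ip>"). Real authentication logs usually carry extra fields such as timestamps and hostnames, so the index positions here are an assumption, not a general rule.

```python
# Sample line from the section above, with a trailing newline as read from a file
log_line = "Failed password for admin from 192.168.1.50\n"

# strip() removes the newline; split() with no argument splits on any whitespace
fields = log_line.strip().split()

# Assumed layout: Failed password for <user> from <ip>
username = fields[3]
source_ip = fields[5]
is_failure = log_line.startswith("Failed")

print(f"user={username} ip={source_ip} failed={is_failure}")
```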
3) Numbers: Integers and Floats
Python has two main number types: integers (whole numbers) and floats (decimal numbers). In security, numbers represent: port numbers, byte counts, timestamps, risk scores, thresholds, and counts.
# Integers (whole numbers)
port = 443
failed_attempts = 5
byte_count = 1048576
# Floats (decimal numbers)
risk_score = 7.5
percentage = 0.85
response_time = 0.023
# Arithmetic operations
total = failed_attempts + 10 # Addition: 15
remaining = 100 - failed_attempts # Subtraction: 95
doubled = failed_attempts * 2 # Multiplication: 10
average = 100 / 4 # Division: 25.0 (always float!)
integer_div = 100 // 4 # Floor division: 25 (stays an integer)
remainder = 100 % 3 # Modulo: 1
Security-Relevant Calculations:
# Threshold checking
max_attempts = 5
current_attempts = 3
attempts_remaining = max_attempts - current_attempts
print(f"Attempts remaining: {attempts_remaining}")
# Percentage calculation
total_requests = 1000
blocked_requests = 150
block_rate = (blocked_requests / total_requests) * 100
print(f"Block rate: {block_rate}%") # 15.0%
Key insight: Integer vs. float matters! Port numbers must be integers (you can't connect to port 443.5). Risk scores might be floats for precision. Know which type your data requires.
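To see how the choice of operator decides whether you get an integer or a float, here is a small sketch. The values are made up for illustration and were picked so the results are exact.

```python
byte_count = 1048576      # illustrative value: exactly 1 MiB
failed_attempts = 10
total_attempts = 40
risk_score = 7.5

mebibytes = byte_count / (1024 * 1024)   # true division always yields a float
whole_kib = byte_count // 1024           # floor division of two ints yields an int
failure_rate = failed_attempts / total_attempts * 100
truncated_score = int(risk_score)        # int() drops the fractional part

print(type(mebibytes).__name__, mebibytes)
print(type(whole_kib).__name__, whole_kib)
print(f"Failure rate: {failure_rate:.1f}%")
print(truncated_score)
```

Note that int() truncates rather than rounds, so converting a float score to an integer silently loses precision.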
4) Booleans: True/False
Booleans represent truth values: True or False. In security, booleans answer yes/no questions: Is this IP blocked? Is the user authenticated? Did the scan find vulnerabilities?
# Boolean values
is_authenticated = True
is_blocked = False
has_vulnerabilities = True
# Booleans from comparisons
port = 22
is_ssh = (port == 22) # True
is_high_port = (port > 1024) # False
is_privileged = (port < 1024) # True
# String comparisons
username = "admin"
is_admin = (username == "admin") # True
is_root = (username == "root") # False
Boolean Operators:
# and - both must be True
is_admin = True
is_active = True
can_access = is_admin and is_active # True
# or - at least one must be True
is_blocked = False
is_suspicious = True
needs_review = is_blocked or is_suspicious # True
# not - inverts the value
is_safe = True
is_dangerous = not is_safe # False
Key insight: Security decisions are fundamentally boolean—allow/deny, safe/unsafe, detected/missed. Booleans are how we encode these decisions in code.
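As an illustration of combining comparisons and boolean operators into one allow/deny decision, consider the sketch below. The function should_allow, its parameters, and the default limit of 5 are hypothetical and not part of any real access-control API.

```python
def should_allow(is_authenticated, is_blocked, failed_attempts, max_attempts=5):
    """Allow only authenticated, unblocked users who are under the attempt limit."""
    under_limit = failed_attempts < max_attempts   # comparison produces a bool
    return is_authenticated and not is_blocked and under_limit

print(should_allow(True, False, 2))   # authenticated, not blocked, under limit
print(should_allow(True, True, 0))    # blocked overrides authentication
print(should_allow(True, False, 5))   # limit reached
```

Every condition must hold for access to be granted, which is why the conditions are joined with and rather than or.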
5) Type Conversion and Type Errors
Sometimes you need to convert between types. Data read from text files and from user input arrives as strings, even when it represents numbers, so you must convert it explicitly.
# String to integer
port_string = "443"
port_number = int(port_string)
print(port_number + 1) # 444
# String to float
score_string = "7.5"
score_number = float(score_string)
# Number to string
port = 443
port_text = str(port)
message = "Connected to port " + port_text
# Check type
print(type(port_string)) # <class 'str'>
print(type(port_number)) # <class 'int'>
Common Type Errors:
# ERROR: Can't add string and integer
port = "443"
# next_port = port + 1 # TypeError!
next_port = int(port) + 1 # Correct: 444
# ERROR: Can't concatenate string and integer
port = 443
# message = "Port: " + port # TypeError!
message = "Port: " + str(port) # Correct: "Port: 443"
message = f"Port: {port}" # Better: f-strings handle conversion
Key insight: Type errors are among the most common bugs. When your script crashes with "TypeError," you're mixing incompatible types. Always know what type your data is.
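Conversion can also fail when the string does not hold a valid number, in which case int() raises a ValueError rather than a TypeError. One possible way to handle that is sketched below. The helper parse_port is hypothetical, and the 1 to 65535 range check assumes TCP/UDP port numbering. It uses try/except, which may not have been introduced yet in this course.

```python
def parse_port(text):
    """Convert text to a port number, or return None if it is not a valid port."""
    try:
        port = int(text.strip())
    except ValueError:
        # Raised for input such as "http" or "443.5"
        return None
    if 1 <= port <= 65535:
        return port
    return None

print(parse_port("443"))
print(parse_port(" 8080\n"))
print(parse_port("443.5"))
print(parse_port("70000"))
```

Returning None for bad input is one design choice among several. Raising an error or logging the rejected value would be equally reasonable, depending on the script.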
Real-World Context: Types and Security Vulnerabilities
Type handling isn't just about avoiding bugs—it's about security:
Type Confusion Vulnerabilities: Many exploits abuse how programs handle unexpected input. When a program expects an integer but receives a specially crafted string, the results can be catastrophic. Buffer overflows, format string attacks, and injection attacks all rely on a program trusting that input has the type, size, or format it expects.
SQL Injection Example: If a login form inserts the username string directly into a SQL query without validation, an attacker can enter ' OR '1'='1 to bypass authentication. The database interprets part of the string as SQL code rather than as data, which is a confusion between data and code at the application layer.
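The difference between treating input as code and treating it as data can be demonstrated with Python's built-in sqlite3 module. This is a sketch under assumptions: the in-memory database, the users table, and its columns are invented for the demonstration and do not represent any real login system.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical users table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("admin", "administrator"))

attacker_input = "' OR '1'='1"

# Unsafe: the input is pasted into the query text, so it becomes SQL code
unsafe_query = "SELECT * FROM users WHERE username = '" + attacker_input + "'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safer: a ? placeholder keeps the input as a plain string value
safe_rows = conn.execute(
    "SELECT * FROM users WHERE username = ?", (attacker_input,)
).fetchall()

print(len(unsafe_rows))  # every row matches because the condition is always true
print(len(safe_rows))    # no user is literally named ' OR '1'='1
conn.close()
```

With the placeholder, the database driver passes the value separately from the query text, so the quotes in the input never change the structure of the statement.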
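Unchecked Length Values: Heartbleed (CVE-2014-0160), a bug in OpenSSL disclosed in 2014, involved improper handling of a length value. Strictly speaking, it was a buffer over-read caused by a missing bounds check rather than an integer overflow: an attacker could specify a length larger than the actual data, causing the server to return extra memory, potentially containing passwords and private keys.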
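The underlying lesson, never trust a length supplied by the sender, can be sketched in Python. This is only a conceptual illustration: echo_payload is a hypothetical function, and Python's own slicing would not leak adjacent memory the way the unchecked C code did. The point is the explicit comparison between the claimed length and the real one.

```python
def echo_payload(payload, claimed_length):
    """Return the first claimed_length bytes, rejecting lengths that do not fit."""
    if not isinstance(claimed_length, int) or claimed_length < 0:
        raise ValueError("length must be a non-negative integer")
    if claimed_length > len(payload):
        # The sender claims more data than it actually sent
        raise ValueError("claimed length exceeds actual payload size")
    return payload[:claimed_length]

print(echo_payload(b"bird", 4))

try:
    echo_payload(b"bird", 64000)
except ValueError as error:
    print(f"Rejected: {error}")
```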
MITRE ATT&CK Reference: T1027 (Obfuscated Files or Information) often involves encoding data as different types to evade detection—base64 encoding binary as text, hex encoding strings, etc.
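To make the encoding idea concrete, the sketch below shows the same bytes represented as base64 text and as hex text, using only the standard library. The sample string is an arbitrary, harmless value chosen for illustration.

```python
import base64

command = "whoami"                 # arbitrary sample text
raw = command.encode("utf-8")      # str -> bytes

b64_text = base64.b64encode(raw).decode("ascii")   # bytes -> base64 str
hex_text = raw.hex()                               # bytes -> hex str

print(b64_text)   # d2hvYW1p
print(hex_text)   # 77686f616d69

# Decoding reverses each step and recovers the original string
print(base64.b64decode(b64_text).decode("utf-8"))
print(bytes.fromhex(hex_text).decode("utf-8"))
```

A detection rule that searches only for the literal text would miss both encoded forms, which is why analysts often decode suspicious strings before inspecting them.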
Key insight: Understanding types isn't academic—it's security-critical. Attackers exploit type confusion; defenders must understand types to write secure code and recognize attacks.
Guided Lab: Password Strength Analyzer
Let's build a script that analyzes password characteristics using variables and types.