Opening Framing
The Anatomy of a Black Box: When you hold an IoT device, you are holding a sealed computer. The vendor does not want you to know how it works. They glue the case shut, strip the debug headers, and encrypt the updates. Your job is to break that seal.
Why This Matters: Firmware analysis is the most efficient path to critical vulnerabilities. While network scanning (Nmap) might show you port 80 is open, firmware analysis shows you the PHP source code running behind port 80, the hardcoded AWS keys in `config.json`, and the logic flaw in the authentication binary.
Real-World Relevance: The Mirai botnet succeeded because it knew the default Telnet credentials (such as admin:12345) hardcoded into the firmware of roughly 600,000 devices. If security researchers had analyzed that firmware sooner, the weakness could have been patched before the botnet knocked major sites offline.
- Identify common firmware image formats (Raw, TRX, uImage).
- Extract filesystems using automated (Binwalk) and manual (dd) techniques.
- Analyze entropy graphs to distinguish between code, data, and encryption.
- Modify and Repack firmware to inject backdoors or enable debugging.
- Bypass basic obfuscation techniques used by vendors to hide firmware contents.
- Understand Flash Translation Layers (FTL) and UBIFS for raw NAND analysis.
Reading firmware directly from flash memory chips.
Analyzing firmware to identify kernel versions and libraries.
Enumerating the filesystem for sensitive config files.
1) Firmware Acquisition: Getting the Blob
Before you can analyze firmware, you must obtain it. There are three primary paths, ranked by difficulty.
Method A: The Vendor's Website (Easy)
Most vendors publish "updates" on their support page. This is the cleanest source—no hardware required.
- Pro: Easy to get, usually unencrypted (older devices).
- Con: May only contain the "update" partition, not the full bootloader or factory settings (NVRAM).
Method B: Man-in-the-Middle (Medium)
If the vendor only allows "Over-the-Air" (OTA) updates via the app, you capture the traffic.
# 1. Set up a Wi-Fi Access Point on your Kali machine
# 2. Connect the IoT device to your AP
# 3. Run Wireshark or tcpdump
$ sudo tcpdump -i wlan0 -w update_capture.pcap
# 4. Trigger "Check for Updates" in the mobile app
# 5. Look for HTTP GET requests or FTP transfers in the PCAP
# 6. Extract the URL: http://update.vendor.com/fw/v2.bin
Method C: Hardware Extraction (Hard)
If no update is available, you must pull it from the chip (SPI Flash) using the techniques we will cover in Week 06 (Hardware Hacking). This gives you the "Ground Truth"—exactly what is on the device right now, including user data (Wi-Fi passwords).
2) The Anatomy of a Firmware Image
A firmware file is not a single executable. It is a container, like a ZIP file, but often without a standard header. It typically contains:
0x00000000 +-----------------------+
           | Bootloader            | (e.g., U-Boot)
           | (Initializes HW)      |
0x00040000 +-----------------------+
           | U-Boot Environment    | (Config: bootdelay=0)
0x00050000 +-----------------------+
           | Kernel Image          | (Linux Kernel, uImage)
           | (LZMA Compressed)     |
0x00200000 +-----------------------+
           | Root Filesystem       | (The Holy Grail)
           | (SquashFS / JFFS2)    | Contains /etc/shadow, /bin/sh
0x00F00000 +-----------------------+
           | User Configuration    | (NVRAM / OverlayFS)
           | (Saved Settings)      |
0x01000000 +-----------------------+
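If you already know the layout, each region can be sliced out by offset. The sketch below assumes the example offsets from the diagram above; real devices differ, and `PARTITIONS` and `split_image` are illustrative names, not part of any tool.

```python
# Partition map copied from the example layout above (hypothetical device).
PARTITIONS = [
    ("bootloader",  0x00000000, 0x00040000),
    ("uboot_env",   0x00040000, 0x00050000),
    ("kernel",      0x00050000, 0x00200000),
    ("rootfs",      0x00200000, 0x00F00000),
    ("user_config", 0x00F00000, 0x01000000),
]


def split_image(image: bytes, partitions=PARTITIONS) -> dict:
    """Slice a full flash image into named regions by [start, end) offsets."""
    regions = {}
    for name, start, end in partitions:
        if end > len(image):
            raise ValueError(f"{name} ends at {end:#x}, past the end of the image")
        regions[name] = image[start:end]
    return regions


# Demo on a synthetic 16 MB image filled with 0xFF (what erased flash looks like).
demo = split_image(bytes([0xFF]) * 0x01000000)
for name, blob in demo.items():
    print(f"{name:12s} {len(blob):#010x} bytes")
```

A vendor update file often contains only some of these regions, so treat the map as something you confirm with a signature scan rather than assume.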
Common Internals
| Component | Signature (Magic Bytes) | Description |
|---|---|---|
| uImage | 27 05 19 56 | U-Boot wrapped kernel. Contains header with CRC and Load Address. |
| SquashFS | 68 73 71 73 (hsqs) | Read-only, highly compressed filesystem. Standard for routers. |
| CramFS | 28 CD 3D 45 | Obsolete, simple, read-only. Found in old IP cameras. |
| UBIFS | UBI# | Unsorted Block Image. Managed flash (FTL) filesystem for raw NAND. |
| TRX | HDR0 | Header format used by Broadcom/Linksys devices. |
| GZIP | 1F 8B 08 | Standard compression header. |
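To see how little magic is involved in a "magic byte" scan, here is a minimal sketch that searches a blob for the signatures in the table above. It is only a rough approximation of what a scanner like Binwalk does (real tools also validate the header that follows each hit); `SIGNATURES` and `scan_signatures` are names made up for this example.

```python
# Magic bytes from the table above, in the byte order shown there.
SIGNATURES = {
    "uImage":   bytes.fromhex("27051956"),
    "SquashFS": b"hsqs",
    "CramFS":   bytes.fromhex("28CD3D45"),
    "UBI":      b"UBI#",
    "TRX":      b"HDR0",
    "GZIP":     bytes.fromhex("1F8B08"),
}


def scan_signatures(data: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, name) for every magic found in data."""
    hits = []
    for name, magic in signatures.items():
        offset = data.find(magic)
        while offset != -1:
            hits.append((offset, name))
            offset = data.find(magic, offset + 1)
    return sorted(hits)


# Demo: a fake image with a uImage header at 0 and a SquashFS magic at 0x10000.
fake = bytearray(0x20000)
fake[0:4] = SIGNATURES["uImage"]
fake[0x10000:0x10004] = SIGNATURES["SquashFS"]
for offset, name in scan_signatures(bytes(fake)):
    print(f"{offset:<10d} {offset:#010x} {name}")
```

Short signatures such as the three GZIP bytes will also match by chance inside compressed or encrypted data, which is why every hit needs a sanity check.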
Filesystem Comparison Guide
Choosing the right extractor depends on understanding what you are looking at.
| FS Type | Read/Write | Compression | Use Case | Extraction Tool |
|---|---|---|---|---|
| SquashFS | Read-Only | GZIP / LZMA / XZ | Router RootFS (90% of cases) | unsquashfs / sasquatch |
| JFFS2 | Read/Write | ZLIB | Journaling Flash (Older) | jefferson |
| UBIFS | Read/Write | LZO / ZLIB | Raw NAND (Smart Home Hubs) | ubi_reader |
| YAFFS2 | Read/Write | None | Android / Old NAND | unyaffs |
| RomFS | Read-Only | None | Very small embedded | binwalk |
Concept Check: NOR vs NAND Flash
When extracting from hardware, the type of chip determines the format of the data.
NOR Flash:
- Used for small data (Bootloaders, WiFi settings).
- Linear Addressing: Every byte is directly addressable, so a dump reads like a flat file.
- Result: Cleaner images, easier to Binwalk.
- Filesystem: SquashFS, JFFS2.
NAND Flash:
- Used for large data (Android phones, Advanced IoT).
- OOB (Out of Band) Data: Each page has extra bytes for Error Correction (ECC).
- Result: Dumps contain a run of "spare bytes" after every 512- or 2048-byte page. You MUST strip these before Binwalking!
- Filesystem: UBIFS, YAFFS2.
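Stripping the spare bytes from a raw NAND dump can be sketched in a few lines. This assumes the common layout where each page is immediately followed by its own OOB area, with 2048-byte pages and 64 spare bytes; check the chip's datasheet, because some controllers interleave data and ECC differently. `strip_oob` is a name invented for this example.

```python
def strip_oob(dump: bytes, page_size: int = 2048, oob_size: int = 64) -> bytes:
    """Drop the spare area that follows every page in a raw NAND dump."""
    stride = page_size + oob_size
    if len(dump) % stride != 0:
        raise ValueError("dump length is not a multiple of page size + OOB size")
    pages = (dump[i:i + page_size] for i in range(0, len(dump), stride))
    return b"".join(pages)


# Demo: three pages of 0xAA data, each followed by 64 spare bytes of 0xEE.
raw = (b"\xAA" * 2048 + b"\xEE" * 64) * 3
clean = strip_oob(raw)
print(len(raw), "->", len(clean))
```

The length check is a cheap way to catch a wrong page or OOB size before you waste time scanning a misaligned image.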
Firmware Architecture Families
Not everything is Linux! You will encounter different operating systems.
- Embedded Linux: Has a filesystem (/bin, /etc). Uses init and a shell (often BusyBox rather than bash). Easy to analyze (it's just Linux).
- RTOS: Real-Time Operating System. No filesystem (usually). It's one giant static binary. Analysis requires Ghidra/IDA Pro.
- Bare Metal: No OS. Just a while(1) loop running directly on the CPU (e.g. Arduino style). Hardest to analyze.
3) Automated Extraction: Binwalk
Binwalk is the primary tool for firmware analysis. It scans the binary for known signatures (Magic Bytes) and extracts them.
Advanced Binwalk Flags
Master these flags to become a power user:
- -e (Extract): Automatically extract recognized file types.
- -M (Matryoshka): Recursively scan extracted files. (e.g. gzip -> tar -> elf).
- -A (Opcode Scan): Scan for executable code signatures (ARM/MIPS/PPC). Useful if headers are stripped!
- -E (Entropy): Visualize entropy.
- --dd=".*": Extract EVERYTHING, regardless of type.
Rule of Thumb: Check the decimal offset. If a SquashFS header appears at offset 1234567, is that aligned? Most partitions align to 64KB (0x10000) boundaries, so a signature at an odd offset like 1234567 is likely garbage noise.
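The alignment test itself is one modulo operation. A minimal sketch, assuming the 64KB boundary mentioned above (`is_aligned` is an illustrative helper, not a Binwalk feature):

```python
def is_aligned(offset: int, boundary: int = 0x10000) -> bool:
    """True if offset falls exactly on a multiple of boundary."""
    return offset % boundary == 0


for offset in (0x200000, 1234567):
    verdict = "aligned" if is_aligned(offset) else "not aligned, look closer"
    print(f"{offset:#x}: {verdict}")
```

A vendor header at the front of an update file shifts every offset by its own length, so a failed check is a prompt to investigate, not proof of a false positive.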
Binwalk Workflow Diagram
[ FIRMWARE.BIN ]
|
v
[ 1. Signature Scan ] ---> Does it have Magic bytes? (e.g. 0x27051956)
|
+---> YES: [ 2. Extraction ] ---> Extract using 'dd'
| |
| +---> [ 3. Recursive Scan ] ---> Is there a file inside the file?
| (e.g. gzip inside uImage)
v
[ 4. Entropy Scan ] ---> Is it random noise? (Encryption)
4) Entropy Analysis: Seeing the Invisible
What if `binwalk` finds nothing? The firmware might be encrypted or obfuscated. Entropy measures the randomness of data, normalized here to a scale of 0 to 1.
- Entropy 0.0 - 0.2: Zero padding, large blocks of same characters.
- Entropy 0.5 - 0.8: English text, machine code (ARM/MIPS instructions).
- Entropy 0.999...: Compressed data (Zip) OR Encrypted data (AES).
(Visualizing Entropy): A flat line at 1.0 usually means encryption. A line at 1.0 that "dips" is often Compression (headers lower the average).
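# How Binwalk calculates entropy (simplified)
import math
import sys
from collections import Counter

import matplotlib.pyplot as plt

def shannon_entropy(data):
    """Return the Shannon entropy of a bytes object, normalized to 0-1."""
    if not data:
        return 0.0
    entropy = 0.0
    # Counter tallies every byte value in a single pass over the block
    for count in Counter(data).values():
        p_x = count / len(data)
        entropy -= p_x * math.log2(p_x)
    return entropy / 8.0  # 8 bits per byte is the maximum, so this maps to 0-1

def scan_file(filename, block_size=1024):
    """Plot the entropy of each block_size chunk of the file."""
    entropies = []
    with open(filename, 'rb') as f:
        while chunk := f.read(block_size):  # walrus operator needs Python 3.8+
            entropies.append(shannon_entropy(chunk))
    plt.plot(entropies)
    plt.title("Firmware Entropy Analysis")
    plt.xlabel(f"Block index ({block_size} bytes per block)")
    plt.ylabel("Entropy (0-1)")
    plt.show()

if __name__ == "__main__":
    scan_file(sys.argv[1])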
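If you want a text verdict instead of a plot, the ranges listed earlier can be turned into a rough classifier. The source ranges leave gaps (0.2 to 0.5 and 0.8 to 0.99), so the labels for those gaps and the exact cut-off values are assumptions made for this sketch; `classify_entropy` is a hypothetical helper.

```python
def classify_entropy(value: float) -> str:
    """Map a normalized (0-1) entropy value to a rough content guess."""
    if value <= 0.2:
        return "padding or repeated bytes"
    if value < 0.5:
        return "sparse or highly structured data"  # gap not covered above
    if value <= 0.8:
        return "text or machine code"
    if value < 0.99:
        return "dense data, possibly partly compressed"  # gap not covered above
    return "compressed or encrypted"


for value in (0.0, 0.65, 0.999):
    print(f"{value:.3f}: {classify_entropy(value)}")
```

Keep the block size in mind: a 1024-byte block is too small a sample to reach 1.0 even for truly random data (expect roughly 0.98), so the top threshold only makes sense with larger blocks.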
5) Manual Extraction (The Scalpel)
Sometimes automation fails. You need to manually carve out the chunk of file you want.
Scenario: Binwalk identifies that a SquashFS filesystem starts at offset 1184292 (0x121224), but fails to extract it because the header is non-standard.
The Toolkit: dd
dd is nicknamed the "Disk Destroyer" or "Data Duplicator". It copies bytes from an input file (if) to an output file (of). With bs=1, skip and count are measured in single bytes.
# Syntax: dd if=INPUT of=OUTPUT bs=1 skip=OFFSET count=LENGTH
# Extract EVERYTHING after the offset:
dd if=firmware.bin of=filesystem.squashfs bs=1 skip=1184292
# If you know the exact size (from the header size field):
dd if=firmware.bin of=filesystem.squashfs bs=1 skip=1184292 count=7384212
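The same carve can be scripted when dd is unavailable or you want to batch several offsets. This is a minimal sketch of the equivalent seek-and-copy; `carve` is a name chosen for this example, and the demo runs on a throwaway file rather than real firmware.

```python
import os
import tempfile


def carve(src_path: str, dst_path: str, offset: int, length=None) -> int:
    """Copy length bytes (or everything to EOF) starting at offset; return bytes written."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        data = src.read() if length is None else src.read(length)
        dst.write(data)
    return len(data)


# Demo: 16 bytes of fake header followed by a fake filesystem payload.
with tempfile.TemporaryDirectory() as tmp:
    firmware = os.path.join(tmp, "firmware.bin")
    carved = os.path.join(tmp, "filesystem.squashfs")
    with open(firmware, "wb") as f:
        f.write(b"\x00" * 16 + b"hsqs" + b"payload")
    written = carve(firmware, carved, offset=16)
    with open(carved, "rb") as f:
        carved_bytes = f.read()
    print(written, carved_bytes[:4])
```

For the scenario above the call would be `carve("firmware.bin", "filesystem.squashfs", 1184292)`. The function reads the carved region into memory in one go, which is fine for typical firmware sizes.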
Correcting Headers
Vendors often alter standard headers (e.g., changing "hsqs" to "shsq") to break tools like Binwalk. This is "Security by Obscurity".
Fix: Open the extracted `filesystem.squashfs` in a hex editor (Bless / GHex) and change the bytes back to the standard magic. The dump below shows the bytes as they sit in firmware.bin; in the carved file the same magic is at offset 0.
OFFSET 00 01 02 03 04 05 06 07 ASCII
00121220 FF FF FF FF 73 68 73 71 ....shsq <-- "shsq" (Non-standard)
00121228 04 00 00 00 01 00 00 10 ........
SOLUTION:
1. Open file in Hex Editor: $ bless filesystem.squashfs
2. Overwrite '73 68 73 71' (shsq) with '68 73 71 73' (hsqs)
3. Save and run: $ unsquashfs filesystem.squashfs
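If you have many images with the same scrambled magic, the hex-editor step can be scripted. A minimal sketch, assuming the only change the vendor made is the four magic bytes (`fix_squashfs_magic` is a hypothetical helper):

```python
def fix_squashfs_magic(image: bytes, bad_magic: bytes = b"shsq") -> bytes:
    """Return a copy of image whose first four bytes are the standard magic."""
    if image[:4] == b"hsqs":
        return image  # already standard, nothing to do
    if image[:4] != bad_magic:
        raise ValueError(f"unexpected magic: {image[:4]!r}")
    return b"hsqs" + image[4:]


# Demo: the first bytes of the carved file, taken from the hexdump above
# (carving at 0x121224 already dropped the FF padding in front).
carved = bytes.fromhex("73687371" "0400000001000010")
patched = fix_squashfs_magic(carved)
print(patched[:4])
```

If unsquashfs still fails after the patch, the vendor probably changed more than the magic, and a patched extractor such as sasquatch is the next thing to try.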
6) Firmware Modification & Repacking
Extraction is passive. To prove impact, we often need to modify the firmware (add a backdoor) and repack it.
The "Backdoor Loop"
- Unpack: unsquashfs filesystem.squashfs -> Creates squashfs-root/
- Modify:
  - Add a user: Edit squashfs-root/etc/shadow
  - Enable Telnet: Edit squashfs-root/etc/init.d/rcS to start telnetd.
  - Inject binary: Copy a static gdbserver or reverse shell agent.
- Repack: mksquashfs squashfs-root/ new_filesystem.squashfs -comp lzma -b 131072
  Note: You MUST match the compression type (LZMA/GZIP) and block size of the original!
- Reconstruct: Append the new filesystem back to the original header+kernel.
  cat header.bin kernel.bin new_filesystem.squashfs > hacked_firmware.bin
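To find out which compression type and block size the original used, you can read them straight from the SquashFS superblock. The sketch below assumes a standard little-endian SquashFS 4.0 image (block size at offset 12, compression ID at offset 20, bytes used at offset 40); vendor-modified formats may not follow this layout. `read_squashfs_params` and `COMPRESSORS` are names invented for this example.

```python
import struct

# Compression IDs defined by the SquashFS 4.0 on-disk format.
COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}


def read_squashfs_params(image: bytes) -> dict:
    """Decode the superblock fields that matter when repacking."""
    if image[:4] != b"hsqs":
        raise ValueError("not a little-endian SquashFS image")
    block_size, = struct.unpack_from("<I", image, 12)
    comp_id, = struct.unpack_from("<H", image, 20)
    bytes_used, = struct.unpack_from("<Q", image, 40)
    return {
        "compression": COMPRESSORS.get(comp_id, f"unknown ({comp_id})"),
        "block_size": block_size,
        "bytes_used": bytes_used,
    }


# Demo on a synthetic 96-byte superblock (values chosen to match the text above).
sb = bytearray(96)
sb[0:4] = b"hsqs"
struct.pack_into("<I", sb, 12, 131072)
struct.pack_into("<H", sb, 20, 2)
struct.pack_into("<Q", sb, 40, 7384212)
params = read_squashfs_params(bytes(sb))
print(params)
```

The bytes_used field is also the size you would pass as count when carving with dd. Running `unsquashfs -s` on the image reports the same superblock information.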
When you unpack a filesystem as a regular user, all files become owned by YOU (for example UID 1000). When you repack it, the files on the router will be owned by that UID (which doesn't exist there), not root (UID 0).
Fix: Always run sudo unsquashfs and sudo mksquashfs, or run both commands under fakeroot to preserve ownership. mksquashfs can also force every file to be owned by root with its -all-root option.
The device bootloader typically calculates a checksum (CRC32) of the image before booting. If your modified image has a different checksum (it will), the device will refuse to boot ("Brick").
Fix: You must calculate the new CRC32 and update the header (uImage header or TRX header), for example with mkimage from u-boot-tools.
import sys
import zlib

def calculate_checksum(filename):
    """Calculates CRC32 of a firmware image for header patching"""
    try:
        with open(filename, 'rb') as f:
            data = f.read()
    except FileNotFoundError:
        print("[-] File not found.")
        return None
    # Mask to 32 bits so the value is always an unsigned CRC
    crc = zlib.crc32(data) & 0xFFFFFFFF
    print(f"[+] File: {filename}")
    print(f"[+] Size: {len(data)} bytes")
    print(f"[+] CRC32: {crc:08X}")
    return crc

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python3 crc_calc.py <firmware_file>")
    else:
        calculate_checksum(sys.argv[1])
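Computing the CRC is only half the job; it also has to be written back into the header. The sketch below does this for the legacy 64-byte uImage header only, assuming its usual big-endian layout (header CRC at offset 4, payload size at offset 12, data CRC at offset 24). TRX and vendor-specific headers use different layouts and are not covered. `fix_uimage_crcs` is a name made up for this example.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
HEADER_LEN = 64  # fixed size of the legacy uImage header


def fix_uimage_crcs(image: bytes) -> bytes:
    """Recompute the data CRC and header CRC fields of a legacy uImage."""
    header = bytearray(image[:HEADER_LEN])
    magic, = struct.unpack_from(">I", header, 0)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    size, = struct.unpack_from(">I", header, 12)  # payload length field
    payload = image[HEADER_LEN:HEADER_LEN + size]
    # Data CRC covers the payload only.
    struct.pack_into(">I", header, 24, zlib.crc32(payload) & 0xFFFFFFFF)
    # Header CRC is computed over the header with its own CRC field zeroed.
    struct.pack_into(">I", header, 4, 0)
    struct.pack_into(">I", header, 4, zlib.crc32(bytes(header)) & 0xFFFFFFFF)
    return bytes(header) + image[HEADER_LEN:]


# Demo: a bare header with stale (zero) CRCs in front of a modified payload.
payload = b"modified kernel and rootfs"
header = bytearray(HEADER_LEN)
struct.pack_into(">I", header, 0, UIMAGE_MAGIC)
struct.pack_into(">I", header, 12, len(payload))
fixed = fix_uimage_crcs(bytes(header) + payload)
print(fixed[4:8].hex(), fixed[24:28].hex())
```

If repacking changed the payload length, update the size field before calling this, otherwise the data CRC is computed over the wrong range.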
7) Defensive Architecture: Signing & Encryption
As defenders, how do we stop researchers (and attackers) from doing this?
Signing: The bootloader contains a Public Key. The firmware is signed with the vendor's Private Key.
If verify(firmware, pub_key) == FAIL: Halt();
This prevents modification (Repacking), but not analysis.
Encryption: The entire firmware blob is encrypted (e.g., AES-CBC). The device decrypts it in RAM during boot.
This prevents static analysis (Entropy = 1.0). To defeat this, we have to get the key or the decrypted image out of the running device, for example by using voltage glitching to dump the decrypted RAM.
[ HW ROOT OF TRUST ] ---> [ BootROM (Immutable) ]
| Verifies (RSA-2048)
v
[ Bootloader (U-Boot) ]
| Verifies (RSA-2048)
v
[ Kernel (Linux) ]
| Verifies (Dm-Verity)
v
[ Filesystem (RootFS) ]
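The verify step performed at each link of the chain can be illustrated with textbook RSA. This is a toy only: the key is a tiny modulus built from two Mersenne primes, there is no padding scheme, and every name here (`sign`, `verify`, `digest_int`, the constants) is invented for the example. Real secure boot uses RSA-2048 or similar with proper padding, implemented in boot ROM or hardware.

```python
import hashlib

# TOY KEY for illustration only: two Mersenne primes give a ~150-bit modulus.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_int(firmware: bytes) -> int:
    """SHA-256 of the image, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big") % N


def sign(firmware: bytes) -> int:
    """Vendor side: sign the digest with the private key."""
    return pow(digest_int(firmware), D, N)


def verify(firmware: bytes, signature: int) -> bool:
    """Bootloader side: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest_int(firmware)


original = b"kernel+rootfs v1.0"
sig = sign(original)
print(verify(original, sig))                # genuine image
print(verify(original + b"backdoor", sig))  # repacked image
```

The point to take away is that the bootloader only ever needs N and E. Without D you cannot produce a matching signature for a repacked image, which is why signing blocks modification while leaving the contents readable.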
Case Study: The Smart Plug Fail (Simulated)
To bring this all together, let's look at a fictional but realistic engagement.
Phase 1: Discovery
We download fw_update.bin for a popular smart plug. Running binwalk reveals a SquashFS filesystem. No encryption!
Phase 2: Extraction
We extract it: binwalk -e fw_update.bin. We find typical Linux dirs: `/bin`, `/etc`, `/usr`.
Phase 3: Analysis (The Gold Mine)
We grep for "password", "key", "token".
grep -rniE "password|key|token" squashfs-root/
We find a file: /etc/mqtt_config.json containing:
{
"host": "iot.vendor.com",
"port": 8883,
"client_id": "PLUG_9921",
"password": "SuperSecretHardcodedPassword123!"
}
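The same hunt can be scripted so it runs against every firmware you unpack. A minimal sketch, assuming a small keyword list of our own choosing (`SECRET_PATTERN` and `scan_for_secrets` are illustrative names); the demo builds a throwaway directory that mimics the finding above.

```python
import os
import re
import tempfile

# Keyword list is an assumption for this example; extend it for your target.
SECRET_PATTERN = re.compile(rb"password|secret|token|api[_-]?key", re.IGNORECASE)


def scan_for_secrets(root: str, pattern=SECRET_PATTERN):
    """Yield (relative_path, line_number, line) for each matching line under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and device nodes in the extracted tree
            with open(path, "rb") as f:
                for lineno, line in enumerate(f, start=1):
                    if pattern.search(line):
                        yield os.path.relpath(path, root), lineno, line.strip()


# Demo: a fake extracted rootfs containing one config file.
with tempfile.TemporaryDirectory() as rootfs:
    os.makedirs(os.path.join(rootfs, "etc"))
    with open(os.path.join(rootfs, "etc", "mqtt_config.json"), "w") as f:
        f.write('{\n  "host": "iot.vendor.com",\n'
                '  "password": "SuperSecretHardcodedPassword123!"\n}\n')
    findings = list(scan_for_secrets(rootfs))

for path, lineno, line in findings:
    print(f"{path}:{lineno}: {line.decode(errors='replace')}")
```

Files are opened in binary mode on purpose: secrets also hide inside executables, where a text-mode read would fail on the first invalid byte.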
Phase 4: Impact
This password was the same for every plug. We could use this to subscribe to the
MQTT topic `#` and control ANY user's plug remotely.
Lesson: Firmware analysis turned a $20 device into a global botnet key.
Tools of the Trade (Mega-List)
| Tool | System Package | Function |
|---|---|---|
| Binwalk | binwalk | Signature scan and extraction (Automation). |
| dd | (Built-in) | Manual bit-for-bit copying/carving. |
| hexdump | (Built-in) | View raw hex bytes (or xxd). |
| strings | binutils | Extract printable ASCII strings (strings -n 10 file.bin). |
| unsquashfs | squashfs-tools | Extract SquashFS filesystems. |
| sasquatch | (Build from git) | Patched unsquashfs for non-standard vendor formats. |
| jefferson | pip3 install jefferson | Extract JFFS2 filesystems. |
| ubi_reader | pip3 install ubi_reader | Extract UBIFS/NAND images. |
| fact_extractor | (Docker) | FACT (Firmware Analysis and Comparison Tool) - Enterprise grade automation. |
Security Verification Checklist
When analyzing a new firmware, run this 5-point check:
- Entropy Check: Is it encrypted? (binwalk -E)
- Secrets Scan: Are there hardcoded keys? (grep / strings)
- Version Check: Is the kernel more than 5 years old? (Linux 2.6 is a red flag)
- Insecure Services: Is Telnet enabled? (RC scripts)
- Permissions: Is the web server running as Root? (UID check)
Guided Lab: The Firmware Mod Kit
Objective: Manually extract a filesystem, inject a "Backdoor" file, and repack it.
Scenario: You verify a vulnerability by planting a file named pwned.txt in the root directory.
Step 1: Analyze the Target
Download the sample file iot_camera_v1.bin (simulated) and scan it with Binwalk:
$ binwalk iot_camera_v1.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
128           0x80            SquashFS filesystem, little endian, version 4.0
Step 2: Extract
Since the offset is 128, let's carve it manually.
dd if=iot_camera_v1.bin of=rootfs.squashfs bs=1 skip=128
unsquashfs rootfs.squashfs
# This creates the 'squashfs-root' directory
Step 3: Modify
cd squashfs-root
echo "If you are reading this, I have root." > pwned.txt
# Verify it's there
ls -l pwned.txt
Step 4: Repack
cd ..
# -comp must match the compression used by the original image
mksquashfs squashfs-root/ mod_rootfs.squashfs -comp gzip
# Check if size fits! IT MUST NOT BE LARGER than the flash partition.
ls -l rootfs.squashfs mod_rootfs.squashfs
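The size check can be automated, and a smaller filesystem can be padded back up to the original length so that anything stored after it keeps its offset. Padding with 0xFF (the value of erased flash) is an assumption made for this sketch, as is the helper name `pad_to_partition`; some devices expect a different fill or no padding at all.

```python
def pad_to_partition(blob: bytes, partition_size: int, fill: int = 0xFF) -> bytes:
    """Pad blob up to partition_size with fill bytes; refuse if it does not fit."""
    if len(blob) > partition_size:
        raise ValueError(
            f"image is {len(blob) - partition_size} bytes too large for the partition"
        )
    return blob + bytes([fill]) * (partition_size - len(blob))


# Demo with made-up sizes: a 10-byte "filesystem" in a 16-byte "partition".
padded = pad_to_partition(b"hsqs\x00\x01\x02\x03\x04\x05", 16)
print(len(padded), padded[-6:].hex())
```

In the lab you would read mod_rootfs.squashfs, pass the size of rootfs.squashfs as partition_size, and treat the ValueError as the signal to remove files or raise the compression level.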
Extra Credit: Header Patching
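The lab isn't over. mod_rootfs.squashfs is only the filesystem: when you carved it out, you cut off the header bytes (0-127). Use cat to splice them back together. If the header stores a length or checksum, it must be updated to match the modified filesystem.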
# Get the original header (first 128 bytes)
dd if=iot_camera_v1.bin of=header.bin bs=1 count=128
# Combine
cat header.bin mod_rootfs.squashfs > pwned_firmware.bin
Outcome Check
- Explain the difference between a Bootloader, Kernel, and Filesystem.
- Interpret Binwalk output and identify potential false positives.
- Use `dd` to manually carve binary data based on offsets.
- Analyze entropy graphs to identify encrypted partitions.
- Perform the Extract-Modify-Repack cycle on a Linux filesystem.
- Verify CRC checksums to prevent bricking during repacking.
Resources & Tools
Glossary of Terms
- Blob: Binary Large Object. A raw firmware image with no filesystem structure.
- SquashFS: A compressed, read-only filesystem optimized for embedded devices. It compresses both data and metadata (inodes).
- LZMA: Lempel-Ziv-Markov chain Algorithm. A dictionary-based compression method with high ratios, standard in IoT.
- NVRAM: Non-Volatile RAM. Used to store configuration settings (bootargs, wifi_ssid) that persist after reboot.
- Entropy: A measure of randomness. High entropy (close to 1.0) = Compression or Encryption. Mid-range entropy = Code/Text. Low entropy = Padding.
- OOB Data: Out-Of-Band data. Spare bytes in NAND flash pages used for Error Correction Codes (ECC), not visible in normal filesystems.
- U-Boot: Das U-Boot, the Universal Boot Loader. The de facto standard bootloader for embedded Linux devices.
- Brick: A device that fails to boot, often due to a corrupted firmware update/modification (Soft brick) or hardware damage (Hard brick).