Opening Framing: Intelligence Drives Success
Every successful penetration test begins with reconnaissance. Before touching a single system, skilled testers gather extensive information about their target—technology stack, employees, network ranges, business relationships, and potential attack vectors.
This intelligence shapes everything that follows. Knowing an organization uses a specific technology helps you find relevant vulnerabilities. Discovering employee names enables targeted phishing. Finding forgotten subdomains reveals additional attack surface.
This week covers passive and active reconnaissance, OSINT (Open Source Intelligence) techniques, and how to systematically gather information while staying within scope.
Key insight: Time invested in reconnaissance multiplies the effectiveness of every subsequent phase. Rushing to exploitation means missing opportunities.
1) Passive vs. Active Reconnaissance
Understanding the distinction is critical:
Passive Reconnaissance:
Definition:
- Gathering information without directly interacting with target systems
- Target generally cannot detect your activities
- Uses public information sources
Examples:
- Search engine queries
- WHOIS lookups
- DNS records (public)
- Social media research
- Job postings
- Public documents
- Archive.org
Advantages:
- Effectively undetectable by the target
- Generally legal (public information)
- Can be done pre-authorization
- Low risk to target systems
Limitations:
- Information may be outdated
- Limited depth
- Can't discover internal details
Active Reconnaissance:
Definition:
- Directly interacting with target systems
- Target could potentially detect activity
- Requires authorization
Examples:
- Port scanning
- Banner grabbing
- DNS zone transfers
- Directory brute forcing
- Vulnerability scanning
Advantages:
- Current, accurate information
- Discovers actual attack surface
- Finds internal details
Limitations:
- Detectable by IDS/SIEM
- Requires authorization
- May trigger alerts
- Could impact systems
⚠️ Active recon without authorization = illegal
Reconnaissance Phases:
Typical workflow:
Phase 1: Passive OSINT
├── Company information
├── Employee enumeration
├── Technology identification
├── Domain/IP discovery
└── Document harvesting
Phase 2: Semi-Passive
├── DNS enumeration
├── Subdomain discovery
├── Certificate transparency
└── Search engine dorking
Phase 3: Active (requires authorization)
├── Port scanning
├── Service enumeration
├── Directory brute forcing
└── Vulnerability scanning
Document everything as you go!
Key insight: Start passive, go active only with authorization. You'd be surprised how much you can learn without touching a single target system.
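One way to enforce the "active only with authorization" rule is a small scope guard that runs before any active step. The sketch below is illustrative only: the function name `in_scope` and the file layout (one in-scope domain per line, `#` for comments) are assumptions, not part of any standard tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a host is listed in the scope file,
# either exactly or as a subdomain of a listed domain.
# Assumes the scope file holds one in-scope domain per line; '#' starts a comment.
in_scope() {
    host=$1
    scope_file=$2
    while IFS= read -r entry; do
        case $entry in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        case $host in
            "$entry"|*."$entry") return 0 ;;   # exact match or subdomain
        esac
    done < "$scope_file"
    return 1
}
```

A typical use would be to chain it in front of an active command, for example `in_scope "$host" scope/scope.txt && nmap -sV "$host"`, so that out-of-scope hosts are skipped rather than scanned.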
2) OSINT Techniques
Open Source Intelligence provides rich information:
Search Engine Reconnaissance:
Google Dorking (advanced search operators):
# Find specific file types
site:target.com filetype:pdf
site:target.com filetype:xlsx
site:target.com filetype:docx
# Find login pages
site:target.com inurl:login
site:target.com inurl:admin
site:target.com intitle:"login"
# Find exposed directories
site:target.com intitle:"index of"
site:target.com intitle:"directory listing"
# Find configuration files
site:target.com filetype:conf
site:target.com filetype:env
site:target.com filetype:xml
# Find error messages
site:target.com "sql syntax"
site:target.com "php error"
site:target.com "stack trace"
# Exclude the main site to surface other subdomains
site:target.com -site:www.target.com
Google Hacking Database (GHDB):
https://www.exploit-db.com/google-hacking-database
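The dork patterns above are repetitive enough to generate per target. The following is a minimal sketch that only prints query strings for manual use; the function name `make_dorks` is an assumption, and the set of queries is a small subset of the operators listed above.

```shell
#!/bin/sh
# Print a starter set of dork queries for one domain (first argument).
# Each output line is a query to paste into the search engine by hand.
make_dorks() {
    domain=$1
    for ext in pdf xlsx docx conf env xml; do
        printf 'site:%s filetype:%s\n' "$domain" "$ext"
    done
    for word in login admin; do
        printf 'site:%s inurl:%s\n' "$domain" "$word"
    done
    printf 'site:%s intitle:"index of"\n' "$domain"
}
```

Running `make_dorks target.com > dorks.txt` keeps the query list alongside the rest of the engagement notes.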
Domain and Infrastructure:
# WHOIS lookup
whois target.com
# Returns:
# - Registrant information
# - Name servers
# - Registration dates
# - Contact details (sometimes)
# DNS enumeration
# (many servers refuse or minimize ANY queries per RFC 8482, so query specific types too)
dig target.com ANY
dig target.com MX
dig target.com NS
dig target.com TXT
# Reverse DNS
dig -x [IP address]
# DNS history
# Use SecurityTrails, ViewDNS, DNSDumpster
# Certificate Transparency logs
# Find subdomains via SSL certificates
# https://crt.sh/?q=%25.target.com   (%25 is the URL-encoded % wildcard)
# Shodan (search engine for devices)
# https://www.shodan.io
# Find exposed services, IoT devices, etc.
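To keep DNS results reproducible, the individual `dig` queries can be wrapped in one function that writes a single report. This is a sketch under assumptions: the function name `dns_snapshot` and the `DIG` override variable are hypothetical, and the record types chosen are just a reasonable starting set.

```shell
#!/bin/sh
# Collect several record types for one domain into a single text report.
# DIG is an assumed override hook so the lookup command can be swapped out
# (for example, with a stub when working offline).
dns_snapshot() {
    domain=$1
    for rtype in A AAAA MX NS TXT; do
        printf '## %s %s\n' "$domain" "$rtype"
        ${DIG:-dig} +short "$domain" "$rtype"
    done
}
```

For example, `dns_snapshot target.com > dns_records.txt` saves all five lookups in one file with a header line per record type.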
Employee and Organization OSINT:
LinkedIn:
- Employee names and roles
- Technology skills mentioned
- Company size and structure
- Job postings (reveal tech stack)
Job Postings:
- Technologies used
- Security tools deployed
- Cloud providers
- Development practices
GitHub/GitLab:
- Company repositories
- Employee personal repos
- Exposed credentials
- Code structure
Document Metadata:
# Extract metadata from public documents
exiftool document.pdf
# May reveal:
# - Author names
# - Software versions
# - Internal paths
# - Usernames
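When many documents are harvested, the identity-related fields can be pulled out of the metadata in bulk. The sketch below assumes exiftool's default "Tag : value" text output and looks only at the `Author` and `Last Modified By` tags; the function name `extract_authors` is hypothetical.

```shell
#!/bin/sh
# Read exiftool's default "Tag : value" output on stdin and print the
# unique values of the Author and Last Modified By tags.
extract_authors() {
    awk '/^(Author|Last Modified By) *: / {
        sub(/^[^:]*: /, "")   # drop the tag name and separator
        print
    }' | sort -u
}
```

A possible invocation is `exiftool *.pdf *.docx | extract_authors > employees.txt`; other tags of interest can be added to the pattern as needed.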
Social Media:
- Twitter/X for company and employees
- Company blogs
- Conference presentations
- Technical write-ups
Key insight: People post amazing amounts of useful information publicly. Job postings alone can reveal an organization's entire technology stack.
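Employee names gathered this way are often turned into candidate email addresses once the organization's address format is known. The sketch below assumes a `first.last@domain` format purely for illustration; the real format must be confirmed from public sources, and the function name `email_candidates` is hypothetical.

```shell
#!/bin/sh
# Turn "First Last" lines on stdin into first.last@domain candidates.
# The first.last pattern is an assumption; confirm the real format first.
# Middle names are ignored: only the first and last fields are used.
email_candidates() {
    domain=$1
    awk -v d="$domain" 'NF >= 2 { print tolower($1 "." $NF) "@" d }' | sort -u
}
```

For example, `email_candidates target.com < employees.txt` produces a deduplicated list for use in an authorized phishing assessment.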
3) Subdomain Enumeration
Enumerating subdomains expands the attack surface you can assess:
Why Subdomains Matter:
- Forgotten systems (dev, staging, old)
- Different security postures
- Different technologies
- Separate teams/maintenance
- Often less monitored
Common interesting subdomains:
dev.target.com
staging.target.com
test.target.com
api.target.com
admin.target.com
vpn.target.com
mail.target.com
owa.target.com
portal.target.com
legacy.target.com
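Once a subdomain list exists, the names above can serve as a quick triage filter. This is a minimal sketch; the function name `flag_interesting` is an assumption, and the keyword list simply mirrors the examples in this section.

```shell
#!/bin/sh
# Print only hostnames whose leftmost label matches one of the
# commonly interesting names listed in this section.
flag_interesting() {
    grep -E '^(dev|staging|test|api|admin|vpn|mail|owa|portal|legacy)\.'
}
```

Used as `flag_interesting < all_subdomains.txt`, it gives a short list to review first; it does not replace looking at the full output.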
Enumeration Techniques:
# Certificate Transparency
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Online tools
# - crt.sh
# - dnsdumpster.com
# - securitytrails.com
# - virustotal.com
# Subfinder (passive)
subfinder -d target.com -o subfinder.txt
# Amass (comprehensive)
amass enum -d target.com -o amass.txt
# Assetfinder
assetfinder --subs-only target.com > assetfinder.txt
# Brute force (active - requires authorization)
gobuster dns -d target.com -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt
# Combine multiple sources
# Compare results from different tools
# Remove duplicates
cat subfinder.txt amass.txt assetfinder.txt | sort -u > all_subdomains.txt
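Raw tool output usually needs more cleanup than `sort -u` alone: mixed case, wildcard entries such as `*.target.com`, trailing dots, and names from unrelated domains all show up. The following sketch handles those cases; the function name `normalize_subdomains` is hypothetical.

```shell
#!/bin/sh
# Merge raw tool output from stdin into a clean list for one domain:
# lowercase, strip wildcard prefixes and trailing dots, drop names that
# are not the domain itself or one of its subdomains, remove duplicates.
normalize_subdomains() {
    domain=$1
    tr '[:upper:]' '[:lower:]' |
        sed -e 's/^\*\.//' -e 's/\.$//' |
        awk -v d="$domain" '
            $0 == d { print; next }
            length($0) > length(d) &&
                substr($0, length($0) - length(d)) == "." d { print }
        ' |
        sort -u
}
```

For example: `cat subfinder.txt amass.txt assetfinder.txt | normalize_subdomains target.com > all_subdomains.txt`.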
Validating Subdomains:
# Check which subdomains resolve
while IFS= read -r sub; do
    if host "$sub" > /dev/null 2>&1; then
        echo "$sub"
    fi
done < all_subdomains.txt > live_subdomains.txt
# Or use httpx/httprobe
cat all_subdomains.txt | httpx -silent -o live_http.txt
# Get HTTP response codes
cat live_subdomains.txt | httpx -status-code -title
# Screenshot all live sites
# gowitness or aquatone
# (gowitness 2.x syntax shown; 3.x uses "gowitness scan file -f")
gowitness file -f live_http.txt
Key insight: The main website might be hardened, but that forgotten dev.target.com from 2019 might be wide open.
4) Technology Fingerprinting
Identifying technologies guides your attack approach:
Web Technology Detection:
# Wappalyzer (browser extension)
# Detects CMS, frameworks, libraries
# WhatWeb (command line)
whatweb target.com
# Output example:
# Apache, PHP 7.4, WordPress 5.8, jQuery 3.6
# BuiltWith
# https://builtwith.com
# Netcraft
# https://sitereport.netcraft.com
Server Fingerprinting:
# HTTP headers reveal technology
curl -I https://target.com
# Look for:
# Server: Apache/2.4.41 (Ubuntu)
# X-Powered-By: PHP/7.4.3
# X-AspNet-Version: 4.0.30319
# Nmap service detection (active)
nmap -sV target.com
# Banner grabbing (active): send a minimal request and read the response headers
printf 'HEAD / HTTP/1.0\r\n\r\n' | nc -v target.com 80
# SSL/TLS analysis
sslscan target.com
testssl.sh target.com
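The technology-revealing headers can be filtered out of a full response so they are easy to log per host. This is a small sketch covering only the three headers named in this section; the function name `tech_headers` is an assumption.

```shell
#!/bin/sh
# Read raw HTTP response headers on stdin and print the ones that
# commonly disclose server-side technology (case-insensitive match).
tech_headers() {
    tr -d '\r' | grep -iE '^(server|x-powered-by|x-aspnet-version):'
}
```

For example, `curl -sI https://target.com | tech_headers >> technology.txt`. Missing headers prove nothing, since many servers strip or fake them.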
CMS Identification:
WordPress:
- /wp-admin/
- /wp-content/
- /wp-includes/
- /xmlrpc.php
Drupal:
- /sites/
- /modules/
- /CHANGELOG.txt (Drupal 7; /core/CHANGELOG.txt in Drupal 8+)
- /core/
Joomla:
- /administrator/
- /components/
- /modules/
- /README.txt
Generic indicators:
- Generator meta tags
- JavaScript library paths
- CSS framework classes
- Cookie names
- Error messages
# CMSmap for automated detection
cmsmap https://target.com
# WPScan for WordPress
wpscan --url https://target.com
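The path indicators above can be expressed as a simple lookup for triaging discovered URLs. This is only a heuristic sketch: the function name `guess_cms` is hypothetical, a single path is weak evidence, and `/modules/` is deliberately left out because it appears in both the Drupal and Joomla lists.

```shell
#!/bin/sh
# Map one URL path to a CMS guess using the indicator paths in this section.
# /modules/ is skipped because both Drupal and Joomla use it.
guess_cms() {
    case $1 in
        /wp-admin/*|/wp-content/*|/wp-includes/*|/xmlrpc.php) echo WordPress ;;
        /sites/*|/core/*|/CHANGELOG.txt) echo Drupal ;;
        /administrator/*|/components/*) echo Joomla ;;
        *) echo unknown ;;
    esac
}
```

For example, `guess_cms /wp-content/themes/x/style.css` prints `WordPress`. Confirm any guess with a dedicated tool such as WPScan or CMSmap before relying on it.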
Why Technology Matters:
Once you know the technology:
1. Find known vulnerabilities
- CVE databases
- Exploit-DB
- Vendor advisories
2. Find default credentials
- Admin interfaces
- Database defaults
- Application defaults
3. Find common misconfigurations
- Directory listings
- Debug modes
- Default files
4. Target your testing
- PHP injection vs. ASP.NET
- MySQL vs. MSSQL syntax
- Linux vs. Windows commands
Example:
"Apache 2.4.49" → CVE-2021-41773 (path traversal)
"WordPress 5.4" → Look for plugin vulnerabilities
Key insight: Technology identification tells you what to look for. Knowing the stack turns generic testing into targeted attacks.
5) Organizing Reconnaissance Data
Good organization prevents losing valuable intelligence:
Directory Structure:
~/pentests/target_name/
├── scope/
│   ├── authorization.pdf
│   ├── scope.txt
│   └── contacts.txt
├── recon/
│   ├── passive/
│   │   ├── whois.txt
│   │   ├── dns_records.txt
│   │   ├── subdomains.txt
│   │   ├── employees.txt
│   │   └── technology.txt
│   └── active/
│       ├── nmap/
│       ├── screenshots/
│       └── directories/
├── vulnerabilities/
├── exploitation/
├── evidence/
└── report/
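Creating this layout by hand for every engagement invites drift, so it can be scripted. The sketch below creates only the directories (files such as `scope.txt` are added as the work proceeds); the function name `init_engagement` is an assumption.

```shell
#!/bin/sh
# Create the engagement skeleton under a base directory and print its root.
# Usage: init_engagement <base_dir> <target_name>
init_engagement() {
    root=$1/$2
    mkdir -p \
        "$root/scope" \
        "$root/recon/passive" \
        "$root/recon/active/nmap" \
        "$root/recon/active/screenshots" \
        "$root/recon/active/directories" \
        "$root/vulnerabilities" \
        "$root/exploitation" \
        "$root/evidence" \
        "$root/report"
    printf '%s\n' "$root"
}
```

For example, `cd "$(init_engagement ~/pentests target_name)"` creates the tree and moves into it. Because `mkdir -p` is used, rerunning the function on an existing engagement is harmless.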
Note-Taking Tools:
Command-line documentation:
# Script your terminal
script -a recon_session.log
# Or use tmux logging
Note-taking applications:
- CherryTree (hierarchical notes)
- Obsidian (markdown, linking)
- Notion (collaborative)
- OneNote (Microsoft)
Dedicated pentest tools:
- Dradis (reporting framework)
- Faraday (collaborative platform)
- reconFTW (automated recon)
Key practices:
- Timestamp everything
- Save raw output
- Document commands used
- Note interesting findings immediately
- Take screenshots
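Several of these practices (timestamps, raw output, commands used) can be covered by one small wrapper. The sketch below is one possible approach, not a standard tool; the function name `log_cmd` is hypothetical.

```shell
#!/bin/sh
# Run a command, appending a UTC timestamp, the exact command line,
# and its raw output (stdout and stderr) to a log file.
# Output is also shown on the terminal via tee.
# Usage: log_cmd <logfile> <command> [args...]
log_cmd() {
    logfile=$1
    shift
    {
        printf '=== %s $ %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*"
        "$@" 2>&1
    } | tee -a "$logfile"
}
```

For example, `log_cmd whois.txt whois target.com` records when the lookup ran, what was run, and everything it returned. Note that the function's exit status is that of `tee`, not of the wrapped command.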
Creating a Target Profile:
Target Profile Document:
ORGANIZATION OVERVIEW
- Company Name:
- Industry:
- Size:
- Key business functions:
TECHNICAL INFRASTRUCTURE
- Primary domains:
- IP ranges:
- Hosting providers:
- Cloud services:
WEB PRESENCE
- Main website technology:
- Subdomains discovered:
- Web applications:
- Login portals:
EMPLOYEES
- Key personnel:
- IT/Security contacts:
- Email format:
- LinkedIn profiles:
TECHNOLOGY STACK
- Web servers:
- Programming languages:
- Databases:
- Security tools observed:
POTENTIAL ATTACK VECTORS
- [List based on findings]
- [Prioritized by likelihood]
Key insight: Reconnaissance data is useless if you can't find it later. Organize from the start—your future self will thank you.
Real-World Context: Recon in Practice
How professionals approach reconnaissance:
Time Investment: Professional pentesters often spend 30-40% of their engagement time on reconnaissance. This isn't wasted time—it's what makes exploitation efficient and comprehensive.
Automation vs. Manual: Tools automate the grunt work, but human analysis spots patterns tools miss. The best results combine automated scanning with manual review and creative thinking.
Legal Boundaries: Passive OSINT is generally legal, but active scanning requires authorization. Be very clear about where the line is and document your authorization.
MITRE ATT&CK Mapping:
- T1595 - Active Scanning: Port scans, vulnerability scans
- T1592 - Gather Victim Host Information: Technology fingerprinting
- T1589 - Gather Victim Identity Information: Employee OSINT
- T1590 - Gather Victim Network Information: DNS, IP ranges
Key insight: Attackers do extensive recon. Understanding their techniques helps you both attack (as a pentester) and defend (by reducing your exposure).
Guided Lab: Comprehensive Reconnaissance
Conduct reconnaissance against an authorized target.