Opening Framing: The Foundation of Testing
Before exploiting vulnerabilities, you must find them. Before finding vulnerabilities, you must understand the application. Information gathering and mapping is where professional testing begins—and where it differs most from amateur attempts.
A rushed tester jumps straight to SQL injection payloads. A professional tester first maps every endpoint, identifies every parameter, understands the technology stack, and discovers hidden functionality. This systematic approach finds vulnerabilities that automated scanners miss.
This week covers passive and active reconnaissance, application mapping, content discovery, and technology fingerprinting: the steps that build a complete picture of your target.
Key insight: The quality of your reconnaissance determines the quality of your findings.
1) Passive Reconnaissance
Gathering information without touching the target:
Passive Recon Sources:
Search Engines:
- Google dorking
- Bing, DuckDuckGo
- Cached pages
- Indexed files
Domain Information:
- WHOIS records
- DNS records (A, CNAME, MX, TXT)
- Certificate Transparency logs
- Historical DNS (SecurityTrails)
Code Repositories:
- GitHub (organization accounts)
- GitLab, Bitbucket
- Leaked credentials in commits
- API keys, secrets
Archive Services:
- Wayback Machine
- Archive.today
- Historical versions
- Removed content
Job Postings:
- Technology stack hints
- Security tools in use
- Infrastructure details
Google Dorking for Web Apps:
# Find login pages
site:target.com inurl:login
site:target.com inurl:admin
site:target.com intitle:"login"
# Find exposed files
site:target.com filetype:pdf
site:target.com filetype:xlsx
site:target.com filetype:sql
site:target.com filetype:log
site:target.com filetype:env
# Find configuration files
site:target.com filetype:xml
site:target.com filetype:conf
site:target.com filetype:config
# Find backup files
site:target.com filetype:bak
site:target.com filetype:old
site:target.com inurl:backup
# Find error messages
site:target.com "sql syntax"
site:target.com "mysql_fetch"
site:target.com "warning" "error"
site:target.com "stack trace"
# Find directories
site:target.com intitle:"index of"
site:target.com intitle:"directory listing"
# Exclude the www host to surface other subdomains
site:target.com -site:www.target.com
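The dork patterns above are easy to template per target. The following is a minimal sketch, assuming you only want to generate the query strings and open them manually in a browser; the names `DORK_TEMPLATES`, `build_dorks`, and `search_url` are illustrative and not part of any tool.

```python
from urllib.parse import quote_plus

# A subset of the dork patterns listed above; {d} is replaced by the target domain.
DORK_TEMPLATES = {
    "login pages": [
        'site:{d} inurl:login',
        'site:{d} inurl:admin',
        'site:{d} intitle:"login"',
    ],
    "exposed files": [
        'site:{d} filetype:sql',
        'site:{d} filetype:log',
        'site:{d} filetype:env',
    ],
    "directory listings": ['site:{d} intitle:"index of"'],
}


def build_dorks(domain):
    """Return {category: [query, ...]} with the domain filled in."""
    return {
        category: [template.format(d=domain) for template in templates]
        for category, templates in DORK_TEMPLATES.items()
    }


def search_url(query):
    """Build a search URL for one query (meant to be opened by hand)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for category, queries in build_dorks("target.com").items():
        print(f"# {category}")
        for query in queries:
            print(search_url(query))
```

Keeping the templates in one dictionary makes it simple to add the remaining categories (configuration files, backups, error messages) in the same way.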
Certificate Transparency:
# Certificate Transparency logs reveal subdomains
# crt.sh
https://crt.sh/?q=%.target.com
# Returns all certificates issued for domain
# Reveals:
# - Subdomains (including internal!)
# - dev.target.com
# - staging.target.com
# - api-internal.target.com
# Automate with curl
# -r prints raw strings, so multi-name entries split onto separate lines before sorting
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Tools:
# - Amass
# - Subfinder
# - Assetfinder
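The curl pipeline above can also be sketched in Python when you want to post-process the results. This is a rough sketch, assuming the crt.sh JSON output is a list of objects with a `name_value` field that may hold several newline-separated names (the same field the jq filter reads); the function names are illustrative.

```python
import json
from urllib.request import urlopen


def extract_subdomains(entries, domain):
    """Collect unique hostnames under `domain` from crt.sh JSON entries."""
    found = set()
    for entry in entries:
        # name_value may contain several names separated by newlines
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard prefix
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


def fetch_crtsh(domain, timeout=30):
    """Query crt.sh over the network (same URL as the curl example)."""
    url = f"https://crt.sh/?q=%25.{domain}&output=json"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for host in extract_subdomains(fetch_crtsh("target.com"), "target.com"):
        print(host)
```

Separating parsing from fetching lets you rerun the parsing step on saved JSON without querying the service again.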
Key insight: Passive recon sends no traffic to the target, so the target cannot detect it, and it often reveals more than expected, including internal assets and forgotten systems.
2) Active Application Mapping
Systematically exploring the application:
Application Mapping Goals:
1. Enumerate all functionality
- Every page
- Every form
- Every feature
2. Identify entry points
- URL parameters
- POST body parameters
- Headers (cookies, auth)
- File uploads
3. Understand data flow
- Input → Processing → Output
- Where does data go?
- How is it transformed?
4. Map user roles
- Anonymous
- Authenticated
- Admin
- Different permission levels
Manual Crawling with Burp:
Systematic browsing:
1. Configure scope in Burp
Target → Scope → Add target URL
2. Browse the application manually
- Click every link
- Submit every form
- Test every feature
- Try different user roles
3. Review Burp Site Map
- Hierarchical view of application
- Identify all endpoints
- Note parameters
4. Examine each request
- Parameters (GET, POST)
- Cookies
- Custom headers
- Request body formats (JSON, XML)
5. Note interesting responses
- Error messages
- Different response lengths
- Redirects
- Set-Cookie headers
Automated Crawling:
# Burp crawler (named Spider before Burp Suite 2.0; crawl-only scans require Professional)
Target → Site map → Right-click → Scan → Crawl
# Configure:
# - Crawl depth
# - Form submission
# - Scope limitations
# Limitations:
# - JavaScript-heavy apps need manual help
# - Form logic may not be followed correctly
# - Auth flows often break
# Best approach:
# 1. Manual browse with auth
# 2. Spider from authenticated state
# 3. Review and fill gaps manually
Creating Application Map:
Document your findings:
APPLICATION MAP
===============
Domain: target.com
Authentication:
├── /login (POST username, password)
├── /logout
├── /register (POST email, username, password)
├── /forgot-password (POST email)
└── /reset-password (GET token, POST new_password)
User Dashboard:
├── /dashboard
├── /profile (GET, POST - update profile)
├── /settings (GET, POST - change settings)
└── /notifications
API Endpoints:
├── /api/v1/users (GET - list, POST - create)
├── /api/v1/users/{id} (GET, PUT, DELETE)
├── /api/v1/products (GET, POST)
└── /api/v1/orders (GET, POST)
Admin (requires admin role):
├── /admin/dashboard
├── /admin/users
└── /admin/settings
Entry Points per Endpoint:
/api/v1/users/{id}
- URL parameter: id (integer)
- Headers: Authorization, Content-Type
- Body (PUT): name, email, role
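An application map like the one above can also be kept as structured data, which makes it easier to query later. The sketch below is one possible layout, not a standard format; the `Endpoint` class, its field names, and the role assigned to each endpoint are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    path: str
    methods: list
    params: dict = field(default_factory=dict)  # location -> [parameter names]
    role: str = "anonymous"                     # assumed minimum role

    def entry_points(self):
        """Flatten params into (location, name) pairs."""
        return [(loc, name) for loc, names in self.params.items() for name in names]


# A few entries from the map above
app_map = [
    Endpoint("/login", ["POST"], {"body": ["username", "password"]}),
    Endpoint(
        "/api/v1/users/{id}",
        ["GET", "PUT", "DELETE"],
        {
            "path": ["id"],
            "header": ["Authorization", "Content-Type"],
            "body": ["name", "email", "role"],
        },
        role="authenticated",
    ),
    Endpoint("/admin/users", ["GET"], role="admin"),
]


def endpoints_for_role(endpoints, role):
    """List the paths recorded for a given role."""
    return [e.path for e in endpoints if e.role == role]


if __name__ == "__main__":
    for endpoint in app_map:
        print(endpoint.path, endpoint.methods, endpoint.entry_points())
```

Recording the role per endpoint pays off during access-control testing, when each path is replayed with a lower-privileged session.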
Key insight: Complete mapping takes time but reveals every potential vulnerability point.
3) Content Discovery
Finding hidden files and directories:
Why Content Discovery Matters:
Applications often have:
- Backup files (.bak, .old, ~)
- Configuration files (.config, .env)
- Development artifacts (.git, .svn)
- Admin interfaces
- API documentation
- Debug endpoints
- Forgotten functionality
These are often:
- Not linked from anywhere
- Not protected
- Full of sensitive information
Directory Brute Forcing:
# Gobuster
gobuster dir -u https://target.com -w /usr/share/wordlists/dirb/common.txt
gobuster dir -u https://target.com -w /usr/share/seclists/Discovery/Web-Content/raft-medium-directories.txt
# With extensions
gobuster dir -u https://target.com -w wordlist.txt -x php,asp,aspx,jsp,html,js
# With cookies (authenticated)
gobuster dir -u https://target.com -w wordlist.txt -c "session=abc123"
# Ffuf (faster)
ffuf -u https://target.com/FUZZ -w wordlist.txt
ffuf -u https://target.com/FUZZ -w wordlist.txt -e .php,.html,.txt
# With filtering
ffuf -u https://target.com/FUZZ -w wordlist.txt -fc 404 # Filter 404s
ffuf -u https://target.com/FUZZ -w wordlist.txt -fs 1234 # Filter by size
# Feroxbuster (recursive)
feroxbuster -u https://target.com -w wordlist.txt
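Conceptually, the tools above do two things: expand a wordlist into candidate URLs and discard uninteresting responses. The following sketch shows only that logic, with no network code; it is not how gobuster or ffuf are implemented, and the function names are illustrative.

```python
def build_candidates(base_url, words, extensions=()):
    """Yield base/word, then base/word.ext for each extension."""
    base = base_url.rstrip("/")
    for word in words:
        word = word.strip().lstrip("/")
        if not word or word.startswith("#"):
            continue  # skip blank lines and wordlist comments
        yield f"{base}/{word}"
        for ext in extensions:
            yield f"{base}/{word}.{ext.lstrip('.')}"


def is_interesting(status, size, filter_codes=(404,), filter_sizes=()):
    """Drop responses whose status code or body size is filtered out."""
    return status not in filter_codes and size not in filter_sizes


if __name__ == "__main__":
    for url in build_candidates("https://target.com", ["admin", "backup"], ["php", "bak"]):
        print(url)
```

Filtering by size matters when a server answers every unknown path with a 200 status and the same custom error page, which a status filter alone would not catch.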
Wordlist Selection:
# SecLists - essential wordlists
/usr/share/seclists/Discovery/Web-Content/
Common choices:
- common.txt (quick scan)
- raft-medium-directories.txt
- raft-large-directories.txt
- directory-list-2.3-medium.txt
Technology-specific:
- /Discovery/Web-Content/CMS/
- /Discovery/Web-Content/api/
- /Discovery/Web-Content/CGIs.txt
Custom wordlist generation:
# CeWL - scrape words from target
cewl https://target.com -d 2 -m 5 -w custom.txt
# Add common patterns
admin, backup, config, debug, dev, test, staging
Finding Sensitive Files:
# Common sensitive files to check:
Configuration:
/.env
/config.php
/wp-config.php
/web.config
/application.yml
/.htaccess
Version Control:
/.git/HEAD
/.git/config
/.svn/entries
/.hg/
Backup Files:
/backup.sql
/database.sql
/site.zip
/backup.tar.gz
/*.bak
Development:
/phpinfo.php
/info.php
/test.php
/debug
/.DS_Store
/Thumbs.db
Documentation:
/swagger.json
/api-docs
/openapi.yaml
/README.md
/CHANGELOG.md
Server Status:
/server-status
/server-info
/.well-known/
Git Repository Exposure:
# Check for exposed .git
curl https://target.com/.git/HEAD
# If accessible, dump entire repo:
# git-dumper
git-dumper https://target.com/.git/ ./git-dump
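# Or manually (works only if directory listing is enabled on /.git/):
wget --mirror -I .git https://target.com/.git/
# Then, inside the dumped repo (./git-dump for git-dumper, ./target.com for wget):
cd git-dump
git checkout -- .
git log
git show [commit]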
# Often contains:
# - Source code
# - Credentials
# - Configuration
# - Development history
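A 200 response for /.git/HEAD does not prove exposure by itself, because some servers return a custom error page with a 200 status. A small content check helps here. This is a sketch, assuming the body of the response has already been fetched (for example with the curl command above); `looks_like_git_head` is an illustrative name.

```python
import re

# A real HEAD file is either a symbolic ref or a detached 40-character SHA-1
_REF = re.compile(r"^ref: refs/\S+$")
_SHA = re.compile(r"^[0-9a-f]{40}$")


def looks_like_git_head(body):
    """Return True if the response body has the shape of a .git/HEAD file."""
    text = body.strip()
    return bool(_REF.match(text) or _SHA.match(text))


if __name__ == "__main__":
    print(looks_like_git_head("ref: refs/heads/main\n"))
    print(looks_like_git_head("<html><body>Not Found</body></html>"))
```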
Key insight: Hidden content often contains the most critical vulnerabilities—backup files with credentials, exposed git repos with secrets.
4) Technology Fingerprinting
Identifying the application's technology stack:
Why Fingerprinting Matters:
Knowing the stack reveals:
- Known CVEs for specific versions
- Default credentials
- Common misconfigurations
- Attack techniques that apply
Technology Stack Components:
- Web server (Apache, Nginx, IIS)
- Programming language (PHP, Java, Python, .NET)
- Framework (Laravel, Spring, Django, Express)
- CMS (WordPress, Drupal, Joomla)
- Frontend framework (React, Angular, Vue)
- Database (MySQL, PostgreSQL, MongoDB)
- WAF (Cloudflare, AWS WAF, ModSecurity)
Fingerprinting Methods:
# HTTP Headers
curl -I https://target.com
# Headers worth noting (examples; a single response rarely has all three):
Server: nginx/1.19.0
X-Powered-By: PHP/7.4.3
X-AspNet-Version: 4.0.30319
# Cookies
PHPSESSID → PHP
JSESSIONID → Java
ASP.NET_SessionId → .NET
connect.sid → Node.js/Express
# File Extensions
.php → PHP
.asp/.aspx → ASP.NET
.jsp → Java
.py (rare) → Python
# Response Patterns
- Error messages (framework-specific)
- Default pages
- URL structures (/wp-admin → WordPress)
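The header and cookie hints above can be turned into a simple lookup. The sketch below is a minimal illustration that covers only the indicators listed in this section; real fingerprinting tools use far larger signature sets, and the names `COOKIE_HINTS` and `fingerprint` are assumptions.

```python
# Session cookie names and the technology they suggest (from the list above)
COOKIE_HINTS = {
    "phpsessid": "PHP",
    "jsessionid": "Java",
    "asp.net_sessionid": ".NET",
    "connect.sid": "Node.js/Express",
}

# Response headers that commonly disclose the stack
HEADER_NAMES = ("server", "x-powered-by", "x-aspnet-version")


def fingerprint(headers, cookie_names):
    """Return a list of human-readable technology hints."""
    hints = []
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in HEADER_NAMES:
        if name in lowered:
            hints.append(f"{name}: {lowered[name]}")
    for cookie in cookie_names:
        tech = COOKIE_HINTS.get(cookie.lower())
        if tech:
            hints.append(f"cookie {cookie} -> {tech}")
    return hints


if __name__ == "__main__":
    headers = {"Server": "nginx/1.19.0", "X-Powered-By": "PHP/7.4.3"}
    for hint in fingerprint(headers, ["PHPSESSID"]):
        print(hint)
```

Treat the output as hints rather than facts: headers can be removed or spoofed, and cookie names can be renamed.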
Automated Fingerprinting:
# Wappalyzer (browser extension)
# Shows technologies as you browse
# WhatWeb (command line)
whatweb https://target.com
whatweb -v https://target.com # Verbose
# Webanalyze
webanalyze -host https://target.com
# Nmap HTTP scripts
nmap -sV -p 80,443 --script=http-headers,http-server-header target.com
# Nuclei technology detection
nuclei -u https://target.com -t http/technologies/ # technologies/ in older nuclei-templates releases
CMS-Specific Enumeration:
# WordPress
wpscan --url https://target.com
wpscan --url https://target.com --enumerate u # Users
wpscan --url https://target.com --enumerate p # Plugins
wpscan --url https://target.com --enumerate t # Themes
# Manual WordPress checks
/wp-admin/
/wp-login.php
/wp-content/plugins/
/wp-content/themes/
/xmlrpc.php
/wp-json/wp/v2/users
# Drupal
droopescan scan drupal -u https://target.com
# Joomla
joomscan --url https://target.com
# Generic
cmseek -u https://target.com
WAF Detection:
# Detecting WAF presence
# wafw00f
wafw00f https://target.com
# Manual detection
# Send obviously malicious request (URL-encode the payload so curl accepts it):
curl -G "https://target.com/" --data-urlencode "id=1' OR '1'='1"
# WAF indicators:
# - 403 Forbidden
# - Custom error pages
# - Different response headers
# - Request blocked message
# Common WAFs:
# - Cloudflare (cf-ray header)
# - AWS WAF
# - Akamai
# - ModSecurity
# - Imperva
# Why it matters:
# - Need to craft bypass payloads
# - Different WAFs have different weaknesses
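The manual indicators above can be checked in a few lines once you have the response to the malicious probe. This is a heuristic sketch only: the exact block-message text ("request blocked") is an assumption, and none of these signals proves a WAF is present without comparing against a normal request.

```python
def waf_indicators(status, headers, body):
    """Return the WAF indicators seen in one response to a malicious probe."""
    found = []
    lowered = {k.lower(): v for k, v in headers.items()}
    if status == 403:
        found.append("403 Forbidden")
    if "cf-ray" in lowered:
        found.append("cf-ray header (Cloudflare)")
    # Assumed wording; real block pages vary by vendor
    if "request blocked" in body.lower():
        found.append("block message in body")
    return found


if __name__ == "__main__":
    print(waf_indicators(403, {"CF-RAY": "abc123"}, "Request blocked"))
```

Note that a `cf-ray` header only shows that traffic passes through Cloudflare; it appears on allowed requests as well.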
Key insight: Technology stack knowledge focuses your testing. PHP apps need different tests than Java apps.
5) Identifying Entry Points
Entry points are where attackers inject malicious input:
Entry Point Categories:
URL Parameters:
https://target.com/page?id=123&action=view
- id, action are parameters
- Test each for injection
POST Body:
username=admin&password=secret
- Form submissions
- JSON/XML payloads
HTTP Headers:
- Cookie: session=abc
- Authorization: Bearer xyz
- User-Agent
- Referer
- X-Forwarded-For
- Custom headers
File Uploads:
- Filename
- File content
- MIME type
Path Parameters:
/api/users/123/orders/456
- 123 (user ID)
- 456 (order ID)
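The categories above can be pulled out of a captured request mechanically. The sketch below assumes a form-encoded body (JSON and XML bodies would need their own parsing) and treats numeric path segments as likely object IDs, which is a heuristic rather than a rule; `extract_entry_points` is an illustrative name.

```python
from urllib.parse import parse_qsl, urlsplit


def extract_entry_points(url, body="", headers=None):
    """Return (location, name) pairs for one request."""
    parts = urlsplit(url)
    points = [("query", name) for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    points += [("body", name) for name, _ in parse_qsl(body, keep_blank_values=True)]
    # Numeric path segments are likely object identifiers
    points += [("path", seg) for seg in parts.path.split("/") if seg.isdigit()]
    points += [("header", name) for name in (headers or {})]
    return points


if __name__ == "__main__":
    url = "https://target.com/api/users/123/orders/456?id=123&action=view"
    for location, name in extract_entry_points(url, "username=admin&password=secret"):
        print(location, name)
```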
Documenting Entry Points:
Entry Point Documentation:
ENDPOINT: POST /api/users
Entry Points:
┌─────────────────┬────────────┬──────────────────────┐
│ Parameter │ Location │ Type │
├─────────────────┼────────────┼──────────────────────┤
│ name │ Body │ String │
│ email │ Body │ String (email format)│
│ role │ Body │ String (enum?) │
│ Authorization │ Header │ Bearer token │
│ Content-Type │ Header │ application/json │
└─────────────────┴────────────┴──────────────────────┘
Testing Priority:
1. email - SQL injection, format validation
2. role - Privilege escalation
3. name - XSS, length limits
4. Authorization - Token validation
Using Burp to Identify Parameters:
# Burp Proxy → HTTP History
For each request, examine:
1. URL parameters (visible in URL)
2. Body parameters (view in Raw or Params tab)
3. Cookies (Cookie header)
4. Other headers
# Burp feature (Professional): Engagement tools → Analyze target
# Summarizes the unique parameters discovered
# Param Miner extension
# Discovers hidden parameters:
# - Guesses common parameter names
# - Detects parameters that change behavior
# - Finds headers that affect application
Right-click request → Extensions → Param Miner → Guess params
Hidden Parameter Discovery:
# Arjun - parameter discovery
arjun -u https://target.com/page
arjun -u https://target.com/page -m POST
# Common hidden parameters:
debug=true
test=1
admin=1
source=1
id, user_id, account_id
role, privilege, permission
redirect, next, return_url
callback, jsonp
_method (method override)
page, limit, offset
sort, order
format (json, xml)
version, v, api_version
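The basic idea behind hidden parameter discovery is to compare a baseline response with the response after adding one candidate parameter. The sketch below shows that idea in its simplest form; it is not the algorithm used by Arjun or Param Miner, and the `fetch` callable (which returns a `(status, body)` tuple) is a hypothetical hook you would supply yourself.

```python
def find_hidden_params(fetch, url, candidates, value="1"):
    """Return candidate names whose presence changes the response.

    fetch(url) must return a (status, body) tuple.
    """
    baseline = fetch(url)
    separator = "&" if "?" in url else "?"
    hits = []
    for name in candidates:
        probe = f"{url}{separator}{name}={value}"
        if fetch(probe) != baseline:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Stand-in for a real HTTP client: only "debug" changes the page
    def fake_fetch(u):
        return (200, "debug on") if "debug=" in u else (200, "ok")

    print(find_hidden_params(fake_fetch, "https://target.com/page", ["test", "debug", "admin"]))
```

An exact comparison like this produces false positives on pages with dynamic content such as timestamps or CSRF tokens, so in practice the comparison needs to tolerate that noise.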
Key insight: Every entry point is a potential vulnerability. Missing one means missing potential findings.
Real-World Context: Recon in Bug Bounty
How reconnaissance differentiates successful hunters:
Surface Area Competition: In bug bounty, thousands of researchers test the same applications. Low-hanging fruit on main domains is found quickly. Success comes from finding assets others miss—subdomains, legacy systems, APIs.
Automation vs. Manual: The best hunters combine automated tools with manual analysis. Tools find breadth; humans find depth. A typical workflow pairs automated subdomain enumeration with manual review of each discovered asset.
Continuous Monitoring: Top hunters monitor targets continuously. New subdomains, new functionality, new versions—each is an opportunity before others notice.
MITRE ATT&CK Mapping:
- T1595 - Active Scanning: Content discovery, fingerprinting
- T1592 - Gather Victim Host Information: Technology identification
- T1589 - Gather Victim Identity Information: User enumeration
Key insight: Exceptional recon is often the difference between finding critical vulnerabilities and finding nothing.
Guided Lab: Comprehensive Reconnaissance
Perform complete reconnaissance of a target application.