Opening Framing: The Foundation of Testing
Before exploiting vulnerabilities, you must find them. Before finding vulnerabilities, you must understand the application. Information gathering and mapping is where professional testing begins—and where it differs most from amateur attempts.
A rushed tester jumps straight to SQL injection payloads. A professional tester first maps every endpoint, identifies every parameter, understands the technology stack, and discovers hidden functionality. This systematic approach finds vulnerabilities that automated scanners miss.
This week covers passive and active reconnaissance, application mapping, content discovery, and technology fingerprinting: everything you need to build a complete picture of your target.
Key insight: The quality of your reconnaissance determines the quality of your findings.
1) Passive Reconnaissance
Gathering information without touching the target:
Passive Recon Sources:
Search Engines:
- Google dorking
- Bing, DuckDuckGo
- Cached pages
- Indexed files
Domain Information:
- WHOIS records
- DNS records (A, CNAME, MX, TXT)
- Certificate Transparency logs
- Historical DNS (SecurityTrails)
Code Repositories:
- GitHub (organization accounts)
- GitLab, Bitbucket
- Leaked credentials in commits
- API keys, secrets
Archive Services:
- Wayback Machine
- Archive.today
- Historical versions
- Removed content
Job Postings:
- Technology stack hints
- Security tools in use
- Infrastructure details
Google Dorking for Web Apps:
# Find login pages
site:target.com inurl:login
site:target.com inurl:admin
site:target.com intitle:"login"
# Find exposed files
site:target.com filetype:pdf
site:target.com filetype:xlsx
site:target.com filetype:sql
site:target.com filetype:log
site:target.com filetype:env
# Find configuration files
site:target.com filetype:xml
site:target.com filetype:conf
site:target.com filetype:config
# Find backup files
site:target.com filetype:bak
site:target.com filetype:old
site:target.com inurl:backup
# Find error messages
site:target.com "sql syntax"
site:target.com "mysql_fetch"
site:target.com "warning" "error"
site:target.com "stack trace"
# Find directories
site:target.com intitle:"index of"
site:target.com intitle:"directory listing"
# Exclude the www host to surface other subdomains
site:target.com -site:www.target.com
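The dork list above follows one pattern: a `site:` scope plus a single operator fragment. A minimal sketch of generating those queries for any in-scope domain is shown here. The names `DORK_TEMPLATES`, `build_dorks`, and `search_url` are hypothetical helpers, not part of any tool, and only a subset of the fragments is included.

```python
from urllib.parse import quote_plus

# Operator fragments copied from the dork list above, grouped by goal.
# (Illustrative subset; extend with the remaining fragments as needed.)
DORK_TEMPLATES = {
    "login pages": ['inurl:login', 'inurl:admin', 'intitle:"login"'],
    "exposed files": ['filetype:sql', 'filetype:log', 'filetype:env'],
    "backups": ['filetype:bak', 'filetype:old', 'inurl:backup'],
    "directory listings": ['intitle:"index of"'],
}


def build_dorks(domain: str) -> dict[str, list[str]]:
    """Return ready-to-paste queries, each scoped with site:<domain>."""
    return {
        goal: [f"site:{domain} {fragment}" for fragment in fragments]
        for goal, fragments in DORK_TEMPLATES.items()
    }


def search_url(query: str) -> str:
    """Build a search URL to open in a browser for manual review."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for goal, queries in build_dorks("target.com").items():
        print(f"# {goal}")
        for query in queries:
            print(query)
```

The sketch only builds strings. Reviewing results by hand in a browser keeps the activity passive and avoids automated scraping of the search engine.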
Certificate Transparency:
# Certificate Transparency logs reveal subdomains
# crt.sh
https://crt.sh/?q=%.target.com
# Returns all certificates issued for domain
# Reveals:
# - Subdomains (including internal!)
# - dev.target.com
# - staging.target.com
# - api-internal.target.com
# Automate with curl (jq -r prints raw names, one per line, so sort -u can deduplicate them)
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Tools:
# - Amass
# - Subfinder
# - Assetfinder
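The crt.sh JSON output can also be post-processed in a script. The sketch below assumes the response is a JSON array whose records carry a `name_value` field holding one or more newline-separated names, which is the field the curl example above reads. The function name `extract_subdomains` and the `SAMPLE` data are illustrative, not real results.

```python
import json


def extract_subdomains(crtsh_json: str, domain: str) -> list[str]:
    """Collect unique hostnames under `domain` from crt.sh JSON output."""
    found = set()
    for record in json.loads(crtsh_json):
        # One certificate can list several names, separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # wildcard entry: keep the parent name
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


# Hand-written sample in the shape crt.sh returns (not real data).
SAMPLE = json.dumps([
    {"name_value": "dev.target.com\nstaging.target.com"},
    {"name_value": "*.target.com"},
    {"name_value": "DEV.target.com"},
    {"name_value": "mail.other.org"},
])

print(extract_subdomains(SAMPLE, "target.com"))
# → ['dev.target.com', 'staging.target.com', 'target.com']
```

Lowercasing and stripping the wildcard prefix removes the duplicates that raw certificate data tends to contain. The suffix check drops names that belong to other domains listed on a shared certificate.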
Key insight: Passive recon is undetectable and often reveals more than expected—including internal assets and forgotten systems.
2) Active Application Mapping
Systematically exploring the application:
Application Mapping Goals:
1. Enumerate all functionality
- Every page
- Every form
- Every feature
2. Identify entry points
- URL parameters
- POST body parameters
- Headers (cookies, auth)
- File uploads
3. Understand data flow
- Input → Processing → Output
- Where does data go?
- How is it transformed?
4. Map user roles
- Anonymous
- Authenticated
- Admin
- Different permission levels
Manual Crawling with Burp:
Systematic browsing:
1. Configure scope in Burp
Target → Scope → Add target URL
2. Browse the application manually
- Click every link
- Submit every form
- Test every feature
- Try different user roles
3. Review Burp Site Map
- Hierarchical view of application
- Identify all endpoints
- Note parameters
4. Examine each request
- Parameters (GET, POST)
- Cookies
- Custom headers
- Request body formats (JSON, XML)
5. Note interesting responses
- Error messages
- Different response lengths
- Redirects
- Set-Cookie headers
Automated Crawling:
# Burp crawler (Burp Suite Professional; replaced the legacy Spider in Burp Suite 2.x)
Target → Site map → Right-click → Scan → Crawl
# Configure:
# - Crawl depth
# - Form submission
# - Scope limitations
# Limitations:
# - JavaScript-heavy apps need manual help
# - Form logic may not be followed correctly
# - Auth flows often break
# Best approach:
# 1. Manual browse with auth
# 2. Crawl from the authenticated state
# 3. Review and fill gaps manually
Creating Application Map:
Document your findings:
APPLICATION MAP
===============
Domain: target.com
Authentication:
├── /login (POST username, password)
├── /logout
├── /register (POST email, username, password)
├── /forgot-password (POST email)
└── /reset-password (GET token, POST new_password)
User Dashboard:
├── /dashboard
├── /profile (GET, POST - update profile)
├── /settings (GET, POST - change settings)
└── /notifications
API Endpoints:
├── /api/v1/users (GET - list, POST - create)
├── /api/v1/users/{id} (GET, PUT, DELETE)
├── /api/v1/products (GET, POST)
└── /api/v1/orders (GET, POST)
Admin (requires admin role):
├── /admin/dashboard
├── /admin/users
└── /admin/settings
Entry Points per Endpoint:
/api/v1/users/{id}
- URL parameter: id (integer)
- Headers: Authorization, Content-Type
- Body (PUT): name, email, role
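An application map is easier to keep current when it is stored as structured data, not free text. The sketch below is one possible shape, assuming a small `Endpoint` record and a `render` helper. Both names are hypothetical, and the fields simply mirror the map above: path, methods, required role, and parameters with their location.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    path: str
    methods: list[str]
    role: str = "anonymous"  # lowest role that can reach the endpoint
    params: dict[str, str] = field(default_factory=dict)  # name -> location


def render(endpoints: list[Endpoint]) -> str:
    """Group endpoints by role and emit one line per endpoint."""
    lines = []
    for role in sorted({e.role for e in endpoints}):
        lines.append(f"[{role}]")
        in_role = sorted((e for e in endpoints if e.role == role), key=lambda e: e.path)
        for e in in_role:
            params = ", ".join(f"{name} ({loc})" for name, loc in e.params.items())
            lines.append(f"  {','.join(e.methods)} {e.path}  {params}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    app_map = [
        Endpoint("/login", ["POST"], params={"username": "body", "password": "body"}),
        Endpoint("/api/v1/users/{id}", ["GET", "PUT", "DELETE"], role="authenticated",
                 params={"id": "path", "role": "body"}),
        Endpoint("/admin/users", ["GET"], role="admin"),
    ]
    print(render(app_map))
```

Grouping by role makes it easy to see which endpoints need to be retested under each account during access-control testing.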
Key insight: Complete mapping takes time but reveals every potential vulnerability point.
3) Content Discovery
Finding hidden files and directories:
Why Content Discovery Matters:
Applications often have:
- Backup files (.bak, .old, ~)
- Configuration files (.config, .env)
- Development artifacts (.git, .svn)
- Admin interfaces
- API documentation
- Debug endpoints
- Forgotten functionality
These resources are often:
- Not linked from anywhere
- Not protected
- Full of sensitive information
Directory Brute Forcing:
# Gobuster
gobuster dir -u https://target.com -w /usr/share/wordlists/dirb/common.txt
gobuster dir -u https://target.com -w /usr/share/seclists/Discovery/Web-Content/raft-medium-directories.txt
# With extensions
gobuster dir -u https://target.com -w wordlist.txt -x php,asp,aspx,jsp,html,js
# With cookies (authenticated)
gobuster dir -u https://target.com -w wordlist.txt -c "session=abc123"
# Ffuf (faster)
ffuf -u https://target.com/FUZZ -w wordlist.txt
ffuf -u https://target.com/FUZZ -w wordlist.txt -e .php,.html,.txt
# With filtering
ffuf -u https://target.com/FUZZ -w wordlist.txt -fc 404 # Filter 404s
ffuf -u https://target.com/FUZZ -w wordlist.txt -fs 1234 # Filter by size
# Feroxbuster (recursive)
feroxbuster -u https://target.com -w wordlist.txt
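The tools above share the same core loop: expand a wordlist into candidate URLs, request each one, and filter out uninteresting responses. The sketch below shows only the expansion and filtering logic, with no network access. `candidates` and `is_hit` are hypothetical names, and the filter merely imitates the idea behind ffuf's `-fc` and `-fs` options; it is not ffuf's implementation.

```python
def candidates(base_url: str, words: list[str], extensions: list[str]) -> list[str]:
    """Expand each word into a bare URL plus one URL per extension."""
    base = base_url.rstrip("/")
    urls = []
    for word in words:
        word = word.strip().lstrip("/")
        if not word or word.startswith("#"):
            continue  # skip blank lines and wordlist comments
        urls.append(f"{base}/{word}")
        urls.extend(f"{base}/{word}{ext}" for ext in extensions)
    return urls


def is_hit(status: int, size: int, filter_codes=(404,), filter_sizes=()) -> bool:
    """Keep a response unless its status code or body size is filtered."""
    return status not in filter_codes and size not in filter_sizes


if __name__ == "__main__":
    for url in candidates("https://target.com", ["admin", "backup"], [".php", ".bak"]):
        print(url)
```

Size filtering matters for applications that answer every unknown path with `200 OK` and the same "not found" page. Filtering on that page's size removes the noise that a status-code filter alone would keep.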
Wordlist Selection:
# SecLists - essential wordlists
/usr/share/seclists/Discovery/Web-Content/
Common choices:
- common.txt (quick scan)
- raft-medium-directories.txt
- raft-large-directories.txt
- directory-list-2.3-medium.txt
Technology-specific:
- /Discovery/Web-Content/CMS/
- /Discovery/Web-Content/api/
- /Discovery/Web-Content/CGIs.txt
Custom wordlist generation:
# CeWL - scrape words from target
cewl https://target.com -d 2 -m 5 -w custom.txt
# Add common patterns
admin, backup, config, debug, dev, test, staging
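To make the idea behind CeWL concrete, here is a simplified sketch of scraping words from a page's visible text. It is not CeWL's implementation: it handles a single HTML string, ignores crawl depth, and keeps only alphabetic words. `words_from_html` is a hypothetical name, and `min_len=5` mirrors the `-m 5` option used above.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def words_from_html(html: str, min_len: int = 5) -> list[str]:
    """Return unique lowercase words with at least min_len letters."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[A-Za-z]+", " ".join(parser.chunks))
    return sorted({w.lower() for w in words if len(w) >= min_len})


if __name__ == "__main__":
    page = "<h1>Acme Payroll Portal</h1><p>Contact billing</p>"
    print(words_from_html(page))
```

Product names, team names, and internal jargon from the target's own pages often match directory and file names that generic wordlists miss.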
Finding Sensitive Files:
# Common sensitive files to check:
Configuration:
/.env
/config.php
/wp-config.php
/web.config
/application.yml
/.htaccess
Version Control:
/.git/HEAD
/.git/config
/.svn/entries
/.hg/
Backup Files:
/backup.sql
/database.sql
/site.zip
/backup.tar.gz
/*.bak
Development:
/phpinfo.php
/info.php
/test.php
/debug
/.DS_Store
/Thumbs.db
Documentation:
/swagger.json
/api-docs
/openapi.yaml
/README.md
/CHANGELOG.md
Server Status:
/server-status
/server-info
/.well-known/
Git Repository Exposure:
# Check for exposed .git
curl https://target.com/.git/HEAD
# If accessible, dump entire repo:
# git-dumper
git-dumper https://target.com/.git/ ./git-dump
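# Or manually (only works when directory listing is enabled on /.git/):
wget --mirror -I .git https://target.com/.git/
# Then, from the dump directory (git-dump for git-dumper, target.com for wget):
cd git-dump
git checkout -- . # Restore tracked files from the downloaded objects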
git log
git show [commit]
# Often contains:
# - Source code
# - Credentials
# - Configuration
# - Development history
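A `200 OK` for `/.git/HEAD` does not prove exposure, because many servers return a custom error page with a success status. A minimal sketch of validating the response body is shown below. It assumes a SHA-1 repository, where `HEAD` holds either a symbolic ref or a 40-character commit hash. `looks_like_git_head` is a hypothetical helper.

```python
import re

# .git/HEAD contains either "ref: refs/heads/<branch>" or a detached commit hash.
_REF = re.compile(r"ref: refs/[\w./-]+")
_SHA1 = re.compile(r"[0-9a-f]{40}")


def looks_like_git_head(body: str) -> bool:
    """True if the body matches what git writes to .git/HEAD."""
    text = body.strip()
    return bool(_REF.fullmatch(text) or _SHA1.fullmatch(text))


if __name__ == "__main__":
    print(looks_like_git_head("ref: refs/heads/main\n"))
    print(looks_like_git_head("<html><body>Not Found</body></html>"))
```

Checking the content before launching a dump tool avoids wasted effort and false findings in the report.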
Key insight: Hidden content often contains the most critical vulnerabilities—backup files with credentials, exposed git repos with secrets.
4) Technology Fingerprinting
Identifying the application's technology stack:
Why Fingerprinting Matters:
Knowing the stack reveals:
- Known CVEs for specific versions
- Default credentials
- Common misconfigurations
- Attack techniques that apply
Technology Stack Components:
- Web server (Apache, Nginx, IIS)
- Programming language (PHP, Java, Python, .NET)
- Framework (Laravel, Spring, Django, Express)
- CMS (WordPress, Drupal, Joomla)
- Frontend framework (React, Angular, Vue)
- Database (MySQL, PostgreSQL, MongoDB)
- WAF (Cloudflare, AWS WAF, ModSecurity)
Fingerprinting Methods:
# HTTP Headers
curl -I https://target.com
Server: nginx/1.19.0
X-Powered-By: PHP/7.4.3
X-AspNet-Version: 4.0.30319
# Cookies
PHPSESSID → PHP
JSESSIONID → Java
ASP.NET_SessionId → .NET
connect.sid → Node.js/Express
# File Extensions
.php → PHP
.asp/.aspx → ASP.NET
.jsp → Java
.py (rare) → Python
# Response Patterns
- Error messages (framework-specific)
- Default pages
- URL structures (/wp-admin → WordPress)
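The header and cookie mappings above translate directly into a lookup table. The sketch below applies them to one response. `fingerprint`, `COOKIE_HINTS`, and `HEADER_HINTS` are hypothetical names, and the table contains only the indicators listed in this section.

```python
# Session cookie names and the stack each one suggests (from the list above).
COOKIE_HINTS = {
    "PHPSESSID": "PHP",
    "JSESSIONID": "Java",
    "ASP.NET_SessionId": ".NET",
    "connect.sid": "Node.js/Express",
}
# Response headers that commonly disclose server or framework versions.
HEADER_HINTS = ("Server", "X-Powered-By", "X-AspNet-Version")


def fingerprint(headers: dict[str, str], cookie_names: list[str]) -> list[str]:
    """Return human-readable technology hints from one HTTP response."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    hints = []
    for name in HEADER_HINTS:
        value = lowered.get(name.lower())
        if value:
            hints.append(f"{name}: {value}")
    for cookie in cookie_names:
        if cookie in COOKIE_HINTS:
            hints.append(f"cookie {cookie} -> {COOKIE_HINTS[cookie]}")
    return hints


if __name__ == "__main__":
    print(fingerprint({"Server": "nginx/1.19.0", "X-Powered-By": "PHP/7.4.3"}, ["PHPSESSID"]))
```

Treat every hint as evidence, not proof: headers can be removed or spoofed, so confirm the stack with at least two independent indicators.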
Automated Fingerprinting:
# Wappalyzer (browser extension)
# Shows technologies as you browse
# WhatWeb (command line)
whatweb https://target.com
whatweb -v https://target.com # Verbose
# Webanalyze
webanalyze -host https://target.com
# Nmap HTTP scripts
nmap -sV -p 80,443 --script=http-headers,http-server-header target.com
# Nuclei technology detection
nuclei -u https://target.com -t http/technologies/ # older template releases used technologies/
CMS-Specific Enumeration:
# WordPress
wpscan --url https://target.com
wpscan --url https://target.com --enumerate u # Users
wpscan --url https://target.com --enumerate p # Plugins
wpscan --url https://target.com --enumerate t # Themes
# Manual WordPress checks
/wp-admin/
/wp-login.php
/wp-content/plugins/
/wp-content/themes/
/xmlrpc.php
/wp-json/wp/v2/users
# Drupal
droopescan scan drupal -u https://target.com
# Joomla
joomscan --url https://target.com
# Generic
cmseek -u https://target.com
WAF Detection:
# Detecting WAF presence
# wafw00f
wafw00f https://target.com
# Manual detection
# Send an obviously malicious request (URL-encode the payload so curl sends a valid URL):
curl -G --data-urlencode "id=1' OR '1'='1" https://target.com/
# WAF indicators:
# - 403 Forbidden
# - Custom error pages
# - Different response headers
# - Request blocked message
# Common WAFs:
# - Cloudflare (cf-ray header)
# - AWS WAF
# - Akamai
# - ModSecurity
# - Imperva
# Why it matters:
# - Need to craft bypass payloads
# - Different WAFs have different weaknesses
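The manual indicators above can be checked programmatically once you have the response to a malicious-looking probe. The sketch below is a rough heuristic under stated assumptions: it recognizes Cloudflare only by the `cf-ray` header or a `Server: cloudflare` value, and treats the word "blocked" in the body as a block message. `waf_hints` is a hypothetical name, and real detection tools such as wafw00f use far larger signature sets.

```python
def waf_hints(status: int, headers: dict[str, str], body: str = "") -> list[str]:
    """List WAF indicators seen in the response to a malicious-looking request."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    hints = []
    if "cf-ray" in lowered or lowered.get("server") == "cloudflare":
        hints.append("Cloudflare header present")
    if status == 403:
        hints.append("403 Forbidden on a malicious-looking request")
    if "blocked" in body.lower():
        hints.append("block message in response body")
    return hints


if __name__ == "__main__":
    print(waf_hints(403, {"CF-RAY": "abc123-LHR"}, "Request blocked"))
    print(waf_hints(200, {"Server": "nginx"}))
```

Always compare against a benign request to the same endpoint. A 403 only suggests a WAF when the harmless version of the request succeeds.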
Key insight: Technology stack knowledge focuses your testing. PHP apps need different tests than Java apps.
5) Identifying Entry Points
Entry points are where attackers inject malicious input:
Entry Point Categories:
URL Parameters:
https://target.com/page?id=123&action=view
- id, action are parameters
- Test each for injection
POST Body:
username=admin&password=secret
- Form submissions
- JSON/XML payloads
HTTP Headers:
- Cookie: session=abc
- Authorization: Bearer xyz
- User-Agent
- Referer
- X-Forwarded-For
- Custom headers
File Uploads:
- Filename
- File content
- MIME type
Path Parameters:
/api/users/123/orders/456
- 123 (user ID)
- 456 (order ID)
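The categories above can be pulled out of a captured request mechanically. The sketch below is a simplified illustration: it treats numeric path segments as path parameters, lists every header name, and parses only URL-encoded form bodies (JSON, XML, and multipart uploads are left out). `entry_points` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlsplit


def entry_points(method: str, url: str, headers: dict[str, str], body: str = "") -> list[tuple[str, str]]:
    """Return (location, name) pairs for the inputs an attacker controls."""
    parts = urlsplit(url)
    found = [("query", name) for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    # Numeric path segments are usually object IDs worth testing.
    found += [("path", seg) for seg in parts.path.split("/") if seg.isdigit()]
    found += [("header", name) for name in headers]
    content_type = headers.get("Content-Type", "")
    if method.upper() == "POST" and content_type.startswith("application/x-www-form-urlencoded"):
        found += [("body", name) for name, _ in parse_qsl(body, keep_blank_values=True)]
    return found


if __name__ == "__main__":
    for location, name in entry_points(
        "POST",
        "https://target.com/api/users/123/orders/456?action=view",
        {"Content-Type": "application/x-www-form-urlencoded", "Cookie": "session=abc"},
        "username=admin&password=secret",
    ):
        print(location, name)
```

Running a helper like this over exported proxy history gives a first-pass inventory that you then refine by hand.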
Documenting Entry Points:
Entry Point Documentation:
ENDPOINT: POST /api/users
Entry Points:
┌─────────────────┬────────────┬──────────────────────┐
│ Parameter │ Location │ Type │
├─────────────────┼────────────┼──────────────────────┤
│ name │ Body │ String │
│ email │ Body │ String (email format)│
│ role │ Body │ String (enum?) │
│ Authorization │ Header │ Bearer token │
│ Content-Type │ Header │ application/json │
└─────────────────┴────────────┴──────────────────────┘
Testing Priority:
1. email - SQL injection, format validation
2. role - Privilege escalation
3. name - XSS, length limits
4. Authorization - Token validation
Using Burp to Identify Parameters:
# Burp Proxy → HTTP History
For each request, examine:
1. URL parameters (visible in URL)
2. Body parameters (view in Raw or Params tab)
3. Cookies (Cookie header)
4. Other headers
# Burp feature: Engagement tools → Analyze target (Burp Suite Professional)
# Summarizes the unique parameters discovered, along with static and dynamic URLs
# Param Miner extension
# Discovers hidden parameters:
# - Guesses common parameter names
# - Detects parameters that change behavior
# - Finds headers that affect application
Right-click request → Extensions → Param Miner → Guess params
Hidden Parameter Discovery:
# Arjun - parameter discovery
arjun -u https://target.com/page
arjun -u https://target.com/page -m POST
# Common hidden parameters:
debug=true
test=1
admin=1
source=1
id, user_id, account_id
role, privilege, permission
redirect, next, return_url
callback, jsonp
_method (method override)
page, limit, offset
sort, order
format (json, xml)
version, v, api_version
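Parameter discovery tools work by adding a guessed parameter and checking whether the response changes. The sketch below shows that comparison in its simplest form, with no network code. `with_param` and `behaviour_changed` are hypothetical names, and the 20-byte tolerance is an arbitrary assumption meant to absorb small dynamic differences such as timestamps.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url: str, name: str, value: str) -> str:
    """Return url with one extra query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def behaviour_changed(baseline: tuple[int, int], probe: tuple[int, int], tolerance: int = 20) -> bool:
    """Compare (status, body length) pairs from the baseline and probe responses."""
    status_differs = baseline[0] != probe[0]
    length_differs = abs(baseline[1] - probe[1]) > tolerance
    return status_differs or length_differs


if __name__ == "__main__":
    guesses = [("debug", "true"), ("test", "1"), ("admin", "1")]
    for name, value in guesses:
        print(with_param("https://target.com/page?id=123", name, value))
```

A changed status or body length is only a lead. Confirm by repeating the request and reading the response to see what the parameter actually does.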
Key insight: Every entry point is a potential vulnerability. Missing one means missing potential findings.
Real-World Context: Recon in Bug Bounty
How reconnaissance differentiates successful hunters:
Surface Area Competition: In bug bounty, thousands of researchers test the same applications. Low-hanging fruit on main domains is found quickly. Success comes from finding assets others miss—subdomains, legacy systems, APIs.
Automation vs. Manual: The best hunters combine automated tools with manual analysis. Tools find breadth; humans find depth. A typical workflow pairs automated subdomain enumeration with manual review of each discovered asset.
Continuous Monitoring: Top hunters monitor targets continuously. New subdomains, new functionality, new versions—each is an opportunity before others notice.
MITRE ATT&CK Mapping:
- T1595 - Active Scanning: Content discovery, fingerprinting
- T1592 - Gather Victim Host Information: Technology identification
- T1589 - Gather Victim Identity Information: User enumeration
Key insight: Exceptional recon is often the difference between finding critical vulnerabilities and finding nothing.
Guided Lab: Comprehensive Reconnaissance
Perform complete reconnaissance of a target application.
Step 1: Passive Reconnaissance
# Choose target (your lab app or authorized target)
# Google dorking
site:target.com filetype:pdf
site:target.com inurl:admin
site:target.com "error"
# Certificate transparency
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Wayback Machine
https://web.archive.org/web/*/target.com/*
Step 2: Application Mapping
# Configure Burp scope
# Browse entire application manually
# Click every link, submit every form
# Review Target → Site Map
# Document all endpoints found
# Create application map diagram
Step 3: Content Discovery
# Directory brute forcing
ffuf -u https://target.com/FUZZ -w /usr/share/seclists/Discovery/Web-Content/common.txt
# Check for sensitive files
curl https://target.com/.git/HEAD
curl https://target.com/.env
curl https://target.com/robots.txt
curl https://target.com/sitemap.xml
Step 4: Technology Fingerprinting
# Automated fingerprinting
whatweb https://target.com
# Manual analysis
curl -I https://target.com
# Note Server, X-Powered-By, cookies
# WAF detection
wafw00f https://target.com
Step 5: Entry Point Documentation
# For each endpoint found:
# - List all parameters
# - Note parameter types
# - Identify testing priorities
# Use Burp → Target → Site Map → select endpoint → view params
Reflection (mandatory)
- What did you discover that wasn't obvious from normal browsing?
- Which content discovery technique found the most interesting results?
- How would knowing the technology stack change your testing approach?
- What entry points look most promising for vulnerability testing?
Week 02 Quiz
Test your understanding of Information Gathering and Application Mapping.
Format: 10 multiple-choice questions. Passing score: 70%. Time: Untimed.
Week 2 Outcome Check
By the end of this week, you should be able to:
- Perform passive reconnaissance without touching the target
- Systematically map an application using Burp Suite
- Discover hidden content using directory brute forcing
- Identify technology stacks through fingerprinting
- Document all entry points for testing
- Create comprehensive application documentation
Next week: Authentication Vulnerabilities—attacking the identity verification systems.
🎯 Hands-On Labs (Free & Essential)
Apply what you learned through practical reconnaissance and information gathering exercises. Complete these labs before moving to reading resources.
🎮 TryHackMe: Passive Reconnaissance
What you'll do: Learn passive information gathering techniques that don't directly interact with the target. Practice WHOIS lookups, DNS enumeration, search engine reconnaissance, and social media intelligence gathering.
Why it matters: Passive recon is stealthy and legal: you're only viewing public information. Master these techniques to gather intelligence without alerting the target or triggering IDS/IPS.
Time estimate: 1.5-2 hours
🎮 TryHackMe: Active Reconnaissance
What you'll do: Learn active information gathering through direct interaction with the target. Practice port scanning with Nmap, service enumeration, banner grabbing, and vulnerability scanning.
Why it matters: Active recon reveals the attack surface: open ports, running services, software versions. This intel drives your entire testing strategy and identifies initial entry points.
Time estimate: 2-3 hours
🎮 TryHackMe: Content Discovery
What you'll do: Learn techniques to discover hidden web content: directories, files, subdomains, and parameters. Practice with tools like dirb, gobuster, and ffuf.
Why it matters: Hidden admin panels, backup files, and undocumented APIs are goldmines for vulnerability discovery. Content discovery often finds the most critical vulnerabilities.
Time estimate: 1.5-2 hours
💡 Lab Strategy: Start with Passive Recon (safe, legal, stealthy), then Active Recon (direct interaction), and finish with Content Discovery (finding hidden attack surface). This progression mirrors real-world pentesting methodology. Total: 500 XP and 5-7 hours of reconnaissance practice.
🛡️ Defensive Architecture & Secure Design Patterns
Recon shows attackers how to map your application. Defensive design removes what they can see and limits what they can learn.
Attack Surface Reduction
Every exposed endpoint is a potential risk. Minimize public-facing functionality and harden what must remain.
Attack surface reduction checklist:
- Remove debug routes, sample apps, and test endpoints
- Disable directory listing and verbose server banners
- Keep configs, backups, and secrets out of web roots
- Restrict admin paths to VPN, allowlists, or SSO
- Separate dev/staging from production
- Maintain a living asset inventory
Security Headers as Defensive Baseline
Headers reduce information disclosure and limit browser abuse before an attacker reaches the application logic.
Baseline headers:
Content-Security-Policy: default-src 'self'
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Referrer-Policy: no-referrer
Permissions-Policy: geolocation=()
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
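A quick way to verify the baseline is to compare a response's headers against the list above. The sketch below assumes you already have the headers as a dictionary, for example parsed from `curl -I` output. `audit_headers`, `BASELINE`, and `DISCLOSING` are hypothetical names, and the disclosing-header list reuses the fingerprinting headers from earlier in this lesson.

```python
# Baseline headers from the list above.
BASELINE = [
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
    "Permissions-Policy",
    "Strict-Transport-Security",
]
# Headers that may disclose stack details to someone fingerprinting the site.
DISCLOSING = ["Server", "X-Powered-By", "X-AspNet-Version"]


def audit_headers(headers: dict[str, str]) -> dict[str, list[str]]:
    """Report missing baseline headers and present disclosing headers."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    return {
        "missing": [h for h in BASELINE if h.lower() not in present],
        "disclosing": [h for h in DISCLOSING if h.lower() in present],
    }


if __name__ == "__main__":
    sample = {"X-Frame-Options": "DENY", "Server": "nginx/1.19.0"}
    print(audit_headers(sample))
```

This checks presence only. A header with a weak value, such as a permissive Content-Security-Policy, still needs manual review.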
Real-World Breach: Uber 2016 (Exposed Credentials)
Attackers used compromised credentials to reach a private GitHub repository that contained hard-coded AWS credentials, then used those keys to access internal systems and S3 data. Lessons learned: secret scanning, least-privilege access, and rapid key rotation keep small leaks from becoming breaches.
Defensive Labs
Lab: Implement Security Headers Baseline
Configure headers in a test app or web server and verify with `curl -I` and SecurityHeaders.com. Document before/after.
Lab: Create an Attack Surface Inventory
Build a complete endpoint inventory, classify each by exposure level, and propose hardening actions (remove, restrict, or monitor).
📚 Building on:
- CSY101 Week-13: Use threat modeling to identify recon-driven abuse cases.
- CSY101 Week-14: Map controls to CIS Controls and NIST 800-53.
- CSY104 Week-11: Use CVSS scoring to prioritize remediation of exposed endpoints.
Reading Resources (Free + Authoritative)
Complete the required resources to build your foundation.
- PortSwigger - Information Disclosure · 45-60 min · 50 XP · Resource ID: csy203_w2_r1 (Required)
- SecLists - Security Wordlists · 30-45 min · 50 XP · Resource ID: csy203_w2_r2 (Required)
- HackTricks - Web Methodology · Reference · 25 XP · Resource ID: csy203_w2_r3 (Optional)
Lab: Full Reconnaissance Assessment
Goal: Produce comprehensive reconnaissance documentation for a target application.
Part 1: Passive Reconnaissance
- Perform Google dorking (document 10+ queries)
- Check certificate transparency
- Search Wayback Machine
- Look for code repositories
- Document all findings
Part 2: Application Mapping
- Manually browse application through Burp
- Create visual site map
- Identify all user roles and functionality
- Document authentication flow
Part 3: Content Discovery
- Run directory brute forcing with 2+ wordlists
- Check for sensitive files (list of 20+)
- Test for exposed version control
- Document all discovered content
Part 4: Technology Analysis
- Fingerprint web server
- Identify programming language/framework
- Check for CMS
- Detect WAF presence
- Research known vulnerabilities for versions found
Part 5: Entry Point Inventory
- Document all endpoints
- List parameters for each endpoint
- Categorize by input type
- Prioritize for vulnerability testing
Deliverable (submit):
- Passive recon findings document
- Application map diagram
- Content discovery results
- Technology stack analysis
- Complete entry point inventory
- Testing priority recommendations
Checkpoint Questions
- What is the difference between passive and active reconnaissance?
- How can certificate transparency reveal subdomains?
- What tool would you use for directory brute forcing?
- Why is an exposed .git directory dangerous?
- How do you identify a PHP application from HTTP headers?
- What are three categories of entry points in web applications?
Weekly Reflection
Reflection Prompt (200-300 words):
This week you learned systematic reconnaissance—the foundation of professional web application testing. You discovered hidden content, identified technologies, and mapped attack surfaces.
Reflect on these questions:
- How does thorough reconnaissance change the testing process compared to immediately trying attacks?
- What surprised you about what's discoverable through passive reconnaissance alone?
- If you were a developer, what would you do differently after seeing how easily hidden content can be found?
- How would you explain the importance of reconnaissance to someone who thinks security testing is just "running scanners"?
A strong reflection will connect reconnaissance methodology to real-world testing effectiveness and defensive implications.
Verified Resources & Videos
- Subdomain Enumeration: OWASP Amass
- Directory Fuzzing: ffuf - Fast Web Fuzzer
- Fingerprinting: Wappalyzer
Reconnaissance is where professional testing begins. The documentation and mapping skills you develop now form the foundation for all vulnerability testing. A well-documented attack surface leads to systematic, comprehensive testing. Next week: authentication vulnerabilities—attacking the front door.