CSY202 Week 02 - Build a recon workflow before moving to reading resources.

Opening Framing: Intelligence Drives Success

Every successful penetration test begins with reconnaissance. Before touching a single system, skilled testers gather extensive information about their target—technology stack, employees, network ranges, business relationships, and potential attack vectors.

This intelligence shapes everything that follows. Knowing an organization uses specific technology helps you find relevant vulnerabilities. Discovering employee names enables targeted phishing. Finding forgotten subdomains reveals attack surface.

This week covers passive and active reconnaissance, OSINT (Open Source Intelligence) techniques, and how to systematically gather information while staying within scope.

Key insight: Time invested in reconnaissance multiplies the effectiveness of every subsequent phase. Rushing to exploitation means missing opportunities.

1) Passive vs. Active Reconnaissance

Understanding the distinction is critical:

Passive Reconnaissance:

Definition:
- Gathering information without directly interacting
  with target systems
- Target cannot detect your activities
- Uses public information sources

Examples:
- Search engine queries
- WHOIS lookups
- DNS records (public)
- Social media research
- Job postings
- Public documents
- Archive.org

Advantages:
- Not detectable by the target (no direct contact)
- Generally legal (public information)
- Can be done pre-authorization
- Low risk

Limitations:
- Information may be outdated
- Limited depth
- Can't discover internal details

Active Reconnaissance:

Definition:
- Directly interacting with target systems
- Target could potentially detect activity
- Requires authorization

Examples:
- Port scanning
- Banner grabbing
- DNS zone transfers
- Directory brute forcing
- Vulnerability scanning

Advantages:
- Current, accurate information
- Discovers actual attack surface
- Finds internal details

Limitations:
- Detectable by IDS/SIEM
- Requires authorization
- May trigger alerts
- Could impact systems

⚠️ Active recon without authorization = illegal

Reconnaissance Phases:

Typical workflow:

Phase 1: Passive OSINT
├── Company information
├── Employee enumeration
├── Technology identification
├── Domain/IP discovery
└── Document harvesting

Phase 2: Semi-Passive
├── DNS enumeration
├── Subdomain discovery
├── Certificate transparency
└── Search engine dorking

Phase 3: Active (requires auth)
├── Port scanning
├── Service enumeration
├── Directory brute forcing
└── Vulnerability scanning

Document everything as you go!

Key insight: Start passive, go active only with authorization. You'd be surprised how much you can learn without touching a single target system.
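
One way to enforce the "active only with authorization" rule is a scope check that runs before any active command. The sketch below is a minimal illustration, not a standard tool: the function name in_scope and the one-domain-per-line layout of the scope file are assumptions made for this example.

```shell
#!/bin/sh
# in_scope HOST SCOPE_FILE
# Exit 0 if HOST is a listed domain or a subdomain of one, else exit 1.
# Assumed scope file format: one domain per line, '#' starts a comment line.
in_scope() (
  host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  while IFS= read -r entry || [ -n "$entry" ]; do
    entry=$(printf '%s' "$entry" | tr '[:upper:]' '[:lower:]')
    case "$entry" in ''|'#'*) continue ;; esac
    case "$host" in
      # Exact match, or a suffix match on a full label boundary
      "$entry"|*".$entry") exit 0 ;;
    esac
  done < "$2"
  exit 1
)

# Usage (hypothetical paths):
#   in_scope dev.target.com scope/scope.txt && nmap -sV dev.target.com
```

Matching on ".domain" rather than a bare suffix keeps look-alike names such as nottarget.com from passing the check.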

2) OSINT Techniques

Open Source Intelligence provides rich information:

Search Engine Reconnaissance:

Google Dorking (advanced search operators):

# Find specific file types
site:target.com filetype:pdf
site:target.com filetype:xlsx
site:target.com filetype:docx

# Find login pages
site:target.com inurl:login
site:target.com inurl:admin
site:target.com intitle:"login"

# Find exposed directories
site:target.com intitle:"index of"
site:target.com intitle:"directory listing"

# Find configuration files
site:target.com filetype:conf
site:target.com filetype:env
site:target.com filetype:xml

# Find error messages
site:target.com "sql syntax" 
site:target.com "php error"
site:target.com "stack trace"

# Exclude certain results (here: hide the main www site to surface other hosts)
site:target.com -www

Google Hacking Database (GHDB):
https://www.exploit-db.com/google-hacking-database
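
Typing the same operators for every engagement gets repetitive, so the queries can be generated per domain. This is a small sketch under the assumption that you paste the results into a search engine by hand; the function name make_dorks is hypothetical, and the operators are the ones listed above.

```shell
#!/bin/sh
# make_dorks DOMAIN
# Print one search query per line, built from the operators shown above.
make_dorks() (
  domain=$1
  for ext in pdf xlsx docx conf env xml; do
    printf 'site:%s filetype:%s\n' "$domain" "$ext"
  done
  for word in login admin; do
    printf 'site:%s inurl:%s\n' "$domain" "$word"
  done
  printf 'site:%s intitle:"index of"\n' "$domain"
)

# Usage: make_dorks target.com > recon/passive/dorks.txt
```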

Domain and Infrastructure:

# WHOIS lookup
whois target.com

# Returns:
# - Registrant information
# - Name servers
# - Registration dates
# - Contact details (sometimes)

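# DNS enumeration
# Note: many servers refuse or minimize ANY queries (RFC 8482),
# so also query each record type individually
dig target.com ANY
dig target.com MX
dig target.com NS
dig target.com TXT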

# Reverse DNS
dig -x [IP address]

# DNS history
# Use SecurityTrails, ViewDNS, DNSDumpster

# Certificate Transparency logs
# Find subdomains via SSL certificates
# https://crt.sh/?q=%.target.com

# Shodan (search engine for devices)
# https://www.shodan.io
# Find exposed services, IoT devices, etc.
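
To keep raw DNS output for later reference, the individual dig queries can be wrapped in a loop that writes one file per record type. A minimal sketch, assuming dig is installed; the function name dns_sweep, the record type list, and the output layout are illustrative choices.

```shell
#!/bin/sh
# dns_sweep DOMAIN OUTDIR
# Query each record type separately and save the answers as OUTDIR/<TYPE>.txt.
dns_sweep() (
  domain=$1
  outdir=$2
  mkdir -p "$outdir" || exit 1
  for rtype in A AAAA MX NS TXT SOA; do
    # +noall +answer limits the output to the answer section
    dig +noall +answer "$domain" "$rtype" > "$outdir/$rtype.txt"
  done
)

# Usage: dns_sweep target.com recon/passive/dns
```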

Employee and Organization OSINT:

LinkedIn:
- Employee names and roles
- Technology skills mentioned
- Company size and structure
- Job postings (reveal tech stack)

Job Postings:
- Technologies used
- Security tools deployed
- Cloud providers
- Development practices

GitHub/GitLab:
- Company repositories
- Employee personal repos
- Exposed credentials
- Code structure

Document Metadata:
# Extract metadata from public documents
exiftool document.pdf

# May reveal:
# - Author names
# - Software versions
# - Internal paths
# - Usernames

Social Media:
- Twitter/X for company and employees
- Company blogs
- Conference presentations
- Technical write-ups
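
The document metadata step above scales better as a loop over a folder of downloaded files. The following is a sketch, assuming exiftool is installed and the documents were already saved locally; the function name harvest_metadata and the choice of tags (Author, Creator, Producer) are assumptions for illustration.

```shell
#!/bin/sh
# harvest_metadata DIR
# Print author and software tags for every regular file in DIR.
harvest_metadata() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    printf '== %s\n' "$f"
    # -s prints short tag names; tags a file lacks are simply omitted
    exiftool -s -Author -Creator -Producer "$f"
  done
}

# Usage: harvest_metadata recon/passive/documents > recon/passive/metadata.txt
```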

Key insight: People post amazing amounts of useful information publicly. Job postings alone can reveal an organization's entire technology stack.

3) Subdomain Enumeration

Finding all subdomains expands the attack surface you can assess:

Why Subdomains Matter:

- Forgotten systems (dev, staging, old)
- Different security postures
- Different technologies
- Separate teams/maintenance
- Often less monitored

Common interesting subdomains:
dev.target.com
staging.target.com
test.target.com
api.target.com
admin.target.com
vpn.target.com
mail.target.com
owa.target.com
portal.target.com
legacy.target.com

Enumeration Techniques:

# Certificate Transparency
curl -s "https://crt.sh/?q=%.target.com&output=json" | jq -r '.[].name_value' | sort -u

# Online tools
# - crt.sh
# - dnsdumpster.com
# - securitytrails.com
# - virustotal.com

# Subfinder (passive)
subfinder -d target.com -o subdomains.txt

# Amass (comprehensive)
amass enum -d target.com -o amass_output.txt

# Assetfinder
assetfinder --subs-only target.com

# Brute force (active - requires auth)
gobuster dns -d target.com -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt

# Combine multiple sources
# Compare results from different tools
# Remove duplicates
cat subfinder.txt amass.txt assetfinder.txt | sort -u > all_subdomains.txt
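
A plain sort -u keeps entries that differ only in case, wildcard prefixes such as *.dev.target.com (common in certificate data), and names that are not under the target domain at all. The sketch below shows one way to normalize before merging; the function name normalize_subs is hypothetical.

```shell
#!/bin/sh
# normalize_subs DOMAIN FILE...
# Lowercase, strip CRs, "*." prefixes and trailing dots, keep only names
# equal to DOMAIN or ending in ".DOMAIN", then sort and de-duplicate.
normalize_subs() (
  domain=$1
  shift
  cat "$@" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d '\r' \
    | sed -e 's/^\*\.//' -e 's/\.$//' \
    | awk -v d="$domain" '
        {
          suffix = "." d
          n = length($0); m = length(suffix)
          if ($0 == d || (n > m && substr($0, n - m + 1) == suffix)) print
        }' \
    | sort -u
)

# Usage: normalize_subs target.com subfinder.txt amass.txt assetfinder.txt > all_subdomains.txt
```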

Validating Subdomains:

# Check which subdomains resolve
while IFS= read -r sub; do
  if host "$sub" > /dev/null 2>&1; then
    echo "$sub"
  fi
done < subdomains.txt > live_subdomains.txt

# Or use httpx/httprobe
cat subdomains.txt | httpx -silent -o live_http.txt

# Get HTTP response codes
cat live_subdomains.txt | httpx -status-code -title

# Screenshot all live sites
# gowitness or aquatone
# (gowitness subcommands differ between major versions; check gowitness --help)
gowitness file -f live_http.txt

Key insight: The main website might be hardened, but that forgotten dev.target.com from 2019 might be wide open.

4) Technology Fingerprinting

Identifying technologies guides your attack approach:

Web Technology Detection:

# Wappalyzer (browser extension)
# Detects CMS, frameworks, libraries

# WhatWeb (command line)
whatweb target.com

# Output example:
# Apache, PHP 7.4, WordPress 5.8, jQuery 3.6

# Builtwith
# https://builtwith.com

# Netcraft
# https://sitereport.netcraft.com

Server Fingerprinting:

# HTTP headers reveal technology
curl -I https://target.com

# Look for:
Server: Apache/2.4.41 (Ubuntu)
X-Powered-By: PHP/7.4.3
X-AspNet-Version: 4.0.30319

# Nmap service detection (active)
nmap -sV target.com

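# Banner grabbing (active)
nc -v target.com 80
# Then type the request line below and press Enter twice
# (a blank line ends the request headers)
HEAD / HTTP/1.0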

# SSL/TLS analysis
sslscan target.com
testssl.sh target.com
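
When checking many hosts, it helps to reduce each response to only the headers that disclose software. A minimal sketch, assuming the three headers listed above are the ones of interest; the function name fingerprint_headers is hypothetical.

```shell
#!/bin/sh
# fingerprint_headers
# Read raw HTTP response headers on stdin and print only the
# version-disclosing ones (header names matched case-insensitively).
fingerprint_headers() {
  tr -d '\r' | grep -i -E '^(server|x-powered-by|x-aspnet-version):'
}

# Usage (active, authorized targets only):
#   curl -sI https://target.com | fingerprint_headers
```

A missing header is not proof that the technology is absent, since many servers are configured to suppress or falsify these values.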

CMS Identification:

WordPress:
- /wp-admin/
- /wp-content/
- /wp-includes/
- /xmlrpc.php

Drupal:
- /sites/
- /modules/
- /CHANGELOG.txt
- /core/

Joomla:
- /administrator/
- /components/
- /modules/
- /README.txt

Generic indicators:
- Generator meta tags
- JavaScript library paths
- CSS framework classes
- Cookie names
- Error messages

# CMSmap for automated detection
cmsmap https://target.com

# WPScan for WordPress
wpscan --url https://target.com
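
The path indicators above can be turned into a rough first-pass classifier over a list of paths you have already collected (for example from crawl output). This is only a heuristic sketch: the function name guess_cms and the simple counting scheme are assumptions, and the shared /modules/ path is left out because it appears in both the Drupal and Joomla lists.

```shell
#!/bin/sh
# guess_cms
# Read URL paths on stdin (one per line) and print the CMS whose
# indicator paths occur most often, or "unknown" if none match.
guess_cms() {
  awk '
    /\/wp-admin\// || /\/wp-content\// || /\/wp-includes\// || /\/xmlrpc\.php/ { wp++ }
    /\/sites\// || /\/core\// || /\/CHANGELOG\.txt/                          { drupal++ }
    /\/administrator\// || /\/components\// || /\/README\.txt/               { joomla++ }
    END {
      best = "unknown"; max = 0
      if (wp > max)     { best = "WordPress"; max = wp }
      if (drupal > max) { best = "Drupal";    max = drupal }
      if (joomla > max) { best = "Joomla";    max = joomla }
      print best
    }'
}

# Usage: guess_cms < recon/active/directories/paths.txt
```

Treat the result as a hint to confirm with a dedicated scanner, not as a finding.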

Why Technology Matters:

Once you know the technology:

1. Find known vulnerabilities
   - CVE databases
   - Exploit-DB
   - Vendor advisories

2. Find default credentials
   - Admin interfaces
   - Database defaults
   - Application defaults

3. Find common misconfigurations
   - Directory listings
   - Debug modes
   - Default files

4. Target your testing
   - PHP vs. ASP.NET payloads
   - MySQL vs. MSSQL syntax
   - Linux vs. Windows commands

Example:
"Apache 2.4.49" → CVE-2021-41773 (path traversal)
"WordPress 5.4" → Look for plugin vulnerabilities

Key insight: Technology identification tells you what to look for. Knowing the stack turns generic testing into targeted attacks.

5) Organizing Reconnaissance Data

Good organization prevents losing valuable intelligence:

Directory Structure:

~/pentests/target_name/
├── scope/
│   ├── authorization.pdf
│   ├── scope.txt
│   └── contacts.txt
├── recon/
│   ├── passive/
│   │   ├── whois.txt
│   │   ├── dns_records.txt
│   │   ├── subdomains.txt
│   │   ├── employees.txt
│   │   └── technology.txt
│   └── active/
│       ├── nmap/
│       ├── screenshots/
│       └── directories/
├── vulnerabilities/
├── exploitation/
├── evidence/
└── report/
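
Creating this tree by hand for every engagement invites inconsistency, so a small helper can build it. A sketch, assuming the layout shown above; the function name init_engagement and the optional base-directory argument are hypothetical.

```shell
#!/bin/sh
# init_engagement NAME [BASE]
# Create the engagement directory tree under BASE (default: ~/pentests)
# and print the path of the engagement root.
init_engagement() (
  name=$1
  base=${2:-$HOME/pentests}
  root="$base/$name"
  for dir in scope recon/passive recon/active/nmap recon/active/screenshots \
             recon/active/directories vulnerabilities exploitation evidence report; do
    mkdir -p "$root/$dir" || exit 1
  done
  printf '%s\n' "$root"
)

# Usage: cd "$(init_engagement target_name)"
```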

Note-Taking Tools:

Command-line documentation:
# Script your terminal
script -a recon_session.log

# Or use tmux logging

Note-taking applications:
- CherryTree (hierarchical notes)
- Obsidian (markdown, linking)
- Notion (collaborative)
- OneNote (Microsoft)

Dedicated pentest tools:
- Dradis (reporting framework)
- Faraday (collaborative platform)
- Reconftw (automated recon)

Key practices:
- Timestamp everything
- Save raw output
- Document commands used
- Note interesting findings immediately
- Take screenshots
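
Several of these practices (timestamps, raw output, commands used) can be covered by one wrapper that runs a command and logs it. A minimal sketch; the function name run_logged and the log file naming scheme are assumptions for illustration.

```shell
#!/bin/sh
# run_logged LOGDIR COMMAND [ARGS...]
# Run COMMAND, show its output, and append it to a UTC-timestamped log
# file in LOGDIR together with the exact command line used.
run_logged() (
  logdir=$1
  shift
  mkdir -p "$logdir" || exit 1
  stamp=$(date -u +%Y%m%dT%H%M%SZ)
  tool=$(basename "$1")
  outfile="$logdir/${stamp}_${tool}.log"
  {
    printf '# command: %s\n' "$*"
    printf '# started: %s\n' "$stamp"
    "$@" 2>&1
  } | tee -a "$outfile"
)

# Usage: run_logged recon/passive whois target.com
```

Because the output passes through tee, the wrapper returns the exit status of tee rather than that of the wrapped command; check the log if you need to know whether the tool itself failed.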

Creating a Target Profile:

Target Profile Document:

ORGANIZATION OVERVIEW
- Company Name: 
- Industry:
- Size:
- Key business functions:

TECHNICAL INFRASTRUCTURE
- Primary domains:
- IP ranges:
- Hosting providers:
- Cloud services:

WEB PRESENCE
- Main website technology:
- Subdomains discovered:
- Web applications:
- Login portals:

EMPLOYEES
- Key personnel:
- IT/Security contacts:
- Email format:
- LinkedIn profiles:

TECHNOLOGY STACK
- Web servers:
- Programming languages:
- Databases:
- Security tools observed:

POTENTIAL ATTACK VECTORS
- [List based on findings]
- [Prioritized by likelihood]

Key insight: Reconnaissance data is useless if you can't find it later. Organize from the start—your future self will thank you.

Real-World Context: Recon in Practice

How professionals approach reconnaissance:

Time Investment: Professional pentesters often spend 30-40% of their engagement time on reconnaissance. This isn't wasted time—it's what makes exploitation efficient and comprehensive.

Automation vs. Manual: Tools automate the grunt work, but human analysis spots patterns tools miss. The best results combine automated scanning with manual review and creative thinking.

Legal Boundaries: Passive OSINT is generally legal, but active scanning requires authorization. Be very clear about where the line is and document your authorization.

MITRE ATT&CK Mapping:

T1595 - Active Scanning: Port scans, vulnerability scans
T1592 - Gather Victim Host Information: Technology fingerprinting
T1589 - Gather Victim Identity Information: Employee OSINT
T1590 - Gather Victim Network Information: DNS, IP ranges

Key insight: Attackers do extensive recon. Understanding their techniques helps you both attack (as a pentester) and defend (by reducing your exposure).

Guided Lab: Comprehensive Reconnaissance

Conduct reconnaissance against an authorized target.

Step 1: Select a Target

Choose an authorized target:

Option 1: Use public practice targets
- scanme.nmap.org (Nmap project; permits light scanning)
- testphp.vulnweb.com (Acunetix)
- demo.testfire.net (IBM)

Option 2: Your own domain/infrastructure

Option 3: Bug bounty program (follow their rules)
- hackerone.com/directory
- bugcrowd.com/programs

⚠️ ONLY test authorized targets!

Step 2: Passive Reconnaissance

# WHOIS
whois [target-domain]

# DNS records
# (ANY is often refused or minimized, so query specific types too)
dig [target-domain] ANY
dig [target-domain] MX
dig [target-domain] TXT

# Subdomain enumeration (passive)
curl -s "https://crt.sh/?q=%.[target-domain]&output=json" | jq -r '.[].name_value' | sort -u

# Google dorking
# site:[target-domain] filetype:pdf
# site:[target-domain] inurl:admin

# Technology detection
# Note: whatweb sends requests to the target, so it is not strictly passive
whatweb [target-domain]

Step 3: Active Reconnaissance (authorized targets only)

# HTTP headers
curl -I https://[target-domain]

# Subdomain brute force (if authorized)
gobuster dns -d [target-domain] -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt

# Validate subdomains
cat subdomains.txt | httpx -status-code -title

Step 4: Document Findings

Create target profile:

1. Document all discovered domains/IPs
2. List technologies identified
3. Note interesting findings
4. Identify potential attack vectors
5. Prioritize next steps

Reflection (mandatory)

What was the most surprising information you found publicly?
How could this information be used by an attacker?
What could the organization do to reduce their exposure?
How did passive vs. active recon results differ?

Week 2 Outcome Check

By the end of this week, you should be able to:

Distinguish between passive and active reconnaissance
Use OSINT techniques to gather intelligence
Enumerate subdomains using multiple methods
Fingerprint web technologies
Organize reconnaissance data effectively
Create a comprehensive target profile

Next week: Scanning and Enumeration—actively mapping the target network and services.

🎯 Hands-On Labs (Free & Essential)

Build a recon workflow before moving to reading resources.

🎮 TryHackMe: Passive Recon

What you'll do: Gather intelligence using public sources and OSINT tools.
Why it matters: Passive recon builds target context without detection risk.
Time estimate: 1-1.5 hours

🎮 TryHackMe: OH SINT

What you'll do: Perform OSINT analysis to build a target profile.
Why it matters: OSINT often reveals access paths and weak points.
Time estimate: 1-2 hours

🎮 TryHackMe: Active Recon

What you'll do: Perform authorized, active recon techniques against targets.
Why it matters: Active recon reveals real attack surface data.
Time estimate: 1-1.5 hours

🏁 PicoCTF Practice: Web Exploitation (Recon Skills)

What you'll do: Solve beginner web challenges that reinforce recon techniques.
Why it matters: Recon skills transfer directly into web exploitation.
Time estimate: 1-2 hours

🛡️ Lab: Implement Proper Encryption

What you'll do: Encrypt and decrypt data using AES-GCM or ChaCha20-Poly1305.
Deliverable: Script showing secure encryption + integrity verification.
Why it matters: Many breaches come from using outdated or unsafe modes.
Time estimate: 60-90 minutes

💡 Lab Tip: Record sources, timestamps, and scope notes so findings are defensible.

🛡️ Key Management in the Real World

Encryption is only as strong as its keys. During recon, you should identify where keys are stored and how they are rotated.

Key management basics:
- Rotate keys on a schedule or after exposure
- Store keys in a vault, not source code
- Separate encryption keys from data storage
- Use unique keys per environment

📚 Building on CSY101 Week-14: Map key handling to compliance requirements and audit evidence.

Resources

Complete the required resources to build your foundation.

OSINT Framework · 45-60 min · 50 XP · Resource ID: csy202_w2_r1 (Required)
Google Hacking Database · 30-45 min · 50 XP · Resource ID: csy202_w2_r2 (Required)
OWASP Amass Documentation · Reference · 25 XP · Resource ID: csy202_w2_r3 (Optional)

Lab: Full OSINT Investigation

Goal: Conduct comprehensive OSINT against an authorized target and document findings professionally.

Scenario

You've been engaged to perform a penetration test. Before active testing begins, you're conducting OSINT to understand the target's attack surface.

Part 1: Domain Intelligence

Perform WHOIS lookup
Enumerate all DNS record types
Identify all subdomains (minimum 3 methods)
Check certificate transparency logs
Document IP ranges and hosting providers

Part 2: Technology Profiling

Identify web server and version
Detect CMS and frameworks
List JavaScript libraries
Note any exposed version information
Check for known CVEs for identified technologies

Part 3: Organization Intelligence

Search for public documents (PDF, DOCX, etc.)
Extract metadata from found documents
Research key employees (if in scope)
Identify email format
Review job postings for technology hints

Part 4: Attack Surface Analysis

List all discovered entry points
Prioritize by potential value
Identify potential vulnerabilities based on technologies
Recommend next steps for active testing

Deliverable (submit):

Complete target profile document
Subdomain list with validation status
Technology stack analysis
Attack surface summary with prioritized vectors
Raw tool output (appendix)

Checkpoint Questions

What is the difference between passive and active reconnaissance?
What information can be obtained from WHOIS records?
What are three methods for discovering subdomains?
What is a Google dork and how is it useful?
Why is technology fingerprinting valuable for a penetration test?
What OSINT sources can reveal an organization's technology stack?

Week 02 Quiz

Test your understanding of reconnaissance methods and OSINT techniques.

Format: 10 multiple-choice questions. Passing score: 70%. Time: Untimed.

Weekly Reflection

Reflection Prompt (200-300 words):

This week you learned reconnaissance—the foundation of effective penetration testing. You gathered intelligence using OSINT techniques and organized your findings.

Reflect on these questions:

What did you find that surprised you about how much information is publicly available?
How could organizations reduce their OSINT exposure without impacting business operations?
Why do you think professional pentesters spend so much time on reconnaissance?
How has your perspective on information sharing changed after learning these techniques?

A strong reflection will consider both the offensive (pentesting) and defensive (protecting organizations) implications.

Verified Resources & Videos

Subdomain Tools: Subfinder by ProjectDiscovery
Technology Detection: Wappalyzer
Certificate Search: crt.sh - Certificate Transparency

Reconnaissance is where penetration tests are won or lost. The intelligence you gather shapes every subsequent phase. Practice these skills constantly—they improve with experience. Next week: active scanning and enumeration.