Initial commit: antifragile cybersecurity consulting blueprint

Complete repository of frameworks, playbooks, and assessment resources
for cybersecurity consultations focused on antifragile enterprise design.

Includes:
- Core philosophy and manifest (5 pillars)
- 12 modular engagement packages
- AI sovereignty and operations frameworks
- Zero-budget vulnerability discovery and hardening playbooks
- M365 E3 hardening and antifragile project plans
- Osquery sovereign discovery platform blueprint
- Perimeter scanning capability guide
- AI-assisted TVM blueprint for AI-powered adversaries
- Vertical specializations: banking, telco, power/utilities
- CIS Controls v8 and NIST CSF 2.0 mappings
- Risk registers and assessment templates
- C-suite conversation guide and business case templates
2026-05-09 16:53:22 +02:00
commit 763da003d3
35 changed files with 9711 additions and 0 deletions


@@ -0,0 +1,380 @@
# On-Premises AD and Endpoint Hardening Playbook
> *"The cloud gets the glory. Active Directory gets compromised."*
This playbook covers the security of on-premises Active Directory, Windows endpoints, and the identity boundary between on-premises and cloud (hybrid identity). It is designed for consulting engagements where the client maintains on-premises infrastructure alongside M365—common in telco, power, and banking environments.
---
## The On-Premise Reality
Most M365 clients did not start in the cloud. They have:
- Active Directory forests with 10+ years of technical debt
- Group Policy objects (GPOs) that no one dares to change
- Service accounts with passwords set to "never expire"
- Admin accounts that log in from the same workstations as regular users
- Backup systems that have never been tested
- KRBTGT accounts that have never been rotated
Our job is not to shame them. Our job is to **fix the kill chain fast** and give them a path to sustainable hygiene.
---
## Phase 1: AD Kill Chain Assessment (Days 1-7)
### Identity Census
**Export and analyze the full AD estate**:
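```powershell
# Requires the ActiveDirectory module (RSAT); -NoTypeInformation keeps the CSV header clean on Windows PowerShell 5.1
# All users with properties (multi-valued attributes are joined so they survive CSV export)
Get-ADUser -Filter * -Properties LastLogonDate, PasswordLastSet, PasswordNeverExpires, ServicePrincipalName, MemberOf |
    Select-Object SamAccountName, Enabled, LastLogonDate, PasswordLastSet, PasswordNeverExpires,
        @{ Name = 'ServicePrincipalName'; Expression = { $_.ServicePrincipalName -join ';' } },
        @{ Name = 'MemberOf'; Expression = { $_.MemberOf -join ';' } } |
    Export-Csv ad-users.csv -NoTypeInformation
# All groups (especially privileged)
Get-ADGroup -Filter * | Where-Object { $_.Name -match "admin|operator|backup|account|server" } | Export-Csv ad-priv-groups.csv -NoTypeInformation
# All computer accounts
Get-ADComputer -Filter * -Properties LastLogonDate, OperatingSystem | Export-Csv ad-computers.csv -NoTypeInformation
# Service accounts (have SPN or description indicating service use)
Get-ADUser -Filter { ServicePrincipalName -like "*" } -Properties ServicePrincipalName |
    Select-Object SamAccountName, Enabled, @{ Name = 'ServicePrincipalName'; Expression = { $_.ServicePrincipalName -join ';' } } |
    Export-Csv ad-spns.csv -NoTypeInformation
```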
**What to look for**:
| Red Flag | Risk | Action |
|----------|------|--------|
| Accounts with PasswordNeverExpires = $true | Leaked or cracked credentials stay valid indefinitely | Force rotation; justify exceptions |
| Admin accounts with last logon > 90 days | Stale, possibly compromised | Disable; verify with owner |
| Users in Domain Admins who should not be | Lateral movement path | Remove; document justification for remaining |
| Computer accounts with last logon > 180 days | Ghost machines, easy targets | Disable; purge after 30 days |
| Service accounts with interactive logon | Violates least privilege; exposes service credentials in interactive sessions | Convert to managed service accounts or gMSA |
| Duplicate SPNs | Kerberos authentication failures, potential attack vector | Fix immediately |
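The user-level red flags above can be triaged offline from the exported CSV. The following is a minimal Python sketch, assuming the export carries `SamAccountName`, `Enabled`, `PasswordNeverExpires`, and `LastLogonDate` columns and that dates are in ISO 8601 form. The `flag_users` helper and the sample rows are hypothetical, and the 90-day check is applied to every enabled user rather than to admins only. `Export-Csv` writes dates in the exporting host's locale, so the date parser will likely need adjusting.

```python
import csv
import io
from datetime import datetime, timedelta

STALE_USER_DAYS = 90  # threshold taken from the red-flag table


def parse_date(value):
    """Return a datetime, or None when the field is empty (no recorded logon)."""
    value = value.strip()
    return datetime.fromisoformat(value) if value else None


def flag_users(rows, now):
    """Yield (account, reason) pairs for enabled accounts that match a red flag."""
    for row in rows:
        if row["Enabled"].strip().lower() != "true":
            continue  # disabled accounts are out of scope for this pass
        name = row["SamAccountName"]
        if row["PasswordNeverExpires"].strip().lower() == "true":
            yield name, "password never expires"
        last_logon = parse_date(row["LastLogonDate"])
        if last_logon is None or now - last_logon > timedelta(days=STALE_USER_DAYS):
            yield name, "no logon in the last 90 days"


# Hypothetical sample standing in for ad-users.csv
SAMPLE = """SamAccountName,Enabled,PasswordNeverExpires,LastLogonDate
svc-backup,True,True,2025-01-10T08:00:00
jdoe,True,False,2026-04-30T09:15:00
old-admin,True,False,
"""

findings = list(flag_users(csv.DictReader(io.StringIO(SAMPLE)), datetime(2026, 5, 9)))
for account, reason in findings:
    print(f"{account}: {reason}")
```

Passing `now` explicitly keeps the check reproducible when the same export is re-analyzed later in the engagement.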
### Privileged Access Assessment
**Map the tier model** (if it exists) or establish one:
| Tier | Scope | Examples |
|------|-------|----------|
| Tier 0 | Controls AD and identity | Domain Admins, Enterprise Admins, Schema Admins, Account Operators, KRBTGT |
| Tier 1 | Controls server workloads | Server Admins, Database Admins, Backup Operators |
| Tier 2 | Controls workstations | Workstation Admins, Help Desk |
**Immediate actions**:
- Empty Account Operators, Backup Operators, and Print Operators where possible, and treat any remaining members as Tier 0 (these built-in groups have dangerous default permissions)
- Ensure no Tier 0 account ever logs on to a Tier 2 device (workstation)
- Document every member of Domain Admins with business justification
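The logon rule above can be checked against collected logon records. This is a rough Python sketch that generalizes the rule to "no account logs on to a device in a less trusted tier than its own". The tier mappings, account names, and host names are hypothetical; in a real engagement they would be derived from group membership and OU placement.

```python
from collections import namedtuple

Logon = namedtuple("Logon", ["account", "host"])

# Hypothetical tier assignments (lower number = more privileged)
ACCOUNT_TIER = {"da-alice": 0, "srv-bob": 1, "hd-carol": 2}
HOST_TIER = {"dc01": 0, "paw-01": 0, "sql01": 1, "ws-117": 2}


def tier_violations(logons):
    """Return logons where an account authenticates to a less trusted tier than its own."""
    violations = []
    for logon in logons:
        account_tier = ACCOUNT_TIER.get(logon.account)
        host_tier = HOST_TIER.get(logon.host)
        if account_tier is None or host_tier is None:
            continue  # unclassified principals need manual review first
        if host_tier > account_tier:
            violations.append(logon)
    return violations


observed = [
    Logon("da-alice", "dc01"),
    Logon("da-alice", "ws-117"),  # Tier 0 account on a workstation
    Logon("srv-bob", "sql01"),
    Logon("hd-carol", "ws-117"),
]
for v in tier_violations(observed):
    print(f"VIOLATION: {v.account} logged on to {v.host}")
```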
### The KRBTGT Account
The KRBTGT account is the **cryptographic foundation of the entire Kerberos realm**. Its password hash is used to encrypt and sign every ticket-granting ticket (TGT) the domain issues. If an adversary has this hash, they can forge golden tickets until the password has been reset twice.
**Check last password change**:
```powershell
Get-ADUser krbtgt -Properties PasswordLastSet
```
- If last changed > 180 days ago: **rotate immediately**
- If never changed (common in old forests): **rotate immediately, but plan carefully**
**Rotation procedure** (do not do this during business hours without planning):
```powershell
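# Reset-KrbtgtKeyInteractive is not a built-in cmdlet; it comes from Microsoft's KRBTGT reset script,
# so load that script first and check its parameters against the version you downloaded.
# Requires Domain Admin. Reset twice: wait at least the maximum TGT lifetime (10 hours by default)
# and confirm replication to every DC before the second reset.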
Reset-KrbtgtKeyInteractive -Domain "corp.example.com"
```
Or use the Microsoft KRBTGT rotation script: `https://github.com/microsoft/New-KrbtgtKeys.ps1`
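**Warning**: The first reset leaves existing tickets valid, because AD keeps the previous KRBTGT key. The second reset invalidates every ticket issued under the old key, so resetting twice in quick succession forces all users and services to re-authenticate. Plan for: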
- Off-hours execution
- Service account impact (may need restart)
- VPN reconnection requirements
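The timing rules in this section can be expressed as a small planning helper. The sketch below assumes the wait between resets is governed by the default 10-hour maximum ticket lifetime. The two-hour margin and both function names are illustrative assumptions, not values taken from Microsoft guidance.

```python
from datetime import datetime, timedelta

MAX_PASSWORD_AGE_DAYS = 180          # rotation threshold used in this playbook
DEFAULT_MAX_TGT_LIFETIME_HOURS = 10  # Kerberos policy default


def needs_rotation(password_last_set, now):
    """True when the KRBTGT password is older than the playbook threshold."""
    return now - password_last_set > timedelta(days=MAX_PASSWORD_AGE_DAYS)


def earliest_second_reset(first_reset,
                          max_tgt_lifetime_hours=DEFAULT_MAX_TGT_LIFETIME_HOURS,
                          margin_hours=2):
    """Earliest time the second reset should run: ticket lifetime plus a safety margin."""
    return first_reset + timedelta(hours=max_tgt_lifetime_hours + margin_hours)


first = datetime(2026, 5, 9, 22, 0)  # off-hours first reset
print(needs_rotation(datetime(2015, 3, 1), first))
print(earliest_second_reset(first))
```

If the domain's Kerberos policy uses a non-default ticket lifetime, pass that value instead of relying on the default.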
---
## Phase 2: Endpoint Hardening (Days 8-14)
### Microsoft Defender Antivirus (E3 Baseline)
E3 includes Defender Antivirus but **not** the advanced EDR features. Maximize what you have:
**Enable all protection features** (often disabled by previous AV migration):
```powershell
# Check current state
Get-MpPreference | Select-Object Disable*, Exclusion*
# Enable real-time protection
Set-MpPreference -DisableRealtimeMonitoring $false
# Enable behaviour monitoring
Set-MpPreference -DisableBehaviorMonitoring $false
# Enable network protection (blocks malicious IPs/URLs at network layer)
Set-MpPreference -EnableNetworkProtection Enabled
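# Enable attack surface reduction (ASR) rules in audit mode. Each rule is addressed by GUID; the two below are examples.
# Note: audit events are written to the local Defender operational log; centralized reporting requires Defender for Endpoint. Verify enforcement licensing for your SKU.
Add-MpPreference -AttackSurfaceReductionRules_Ids D4F940AB-401B-4EFC-AADC-AD5F3C50688A -AttackSurfaceReductionRules_Actions AuditMode  # Block Office apps from creating child processes
Add-MpPreference -AttackSurfaceReductionRules_Ids 9E6C4E1F-7D60-472F-BA1A-A39EF669E4B2 -AttackSurfaceReductionRules_Actions AuditMode  # Block credential stealing from lsass.exe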
```
**Update signatures and engine**:
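```powershell
# Pull the latest security intelligence (definitions), then confirm what is installed
Update-MpSignature
Get-MpComputerStatus | Select-Object AntivirusSignatureLastUpdated, AntivirusSignatureVersion, AMEngineVersion
```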
### Sysmon Deployment (Free Telemetry)
Since E3 lacks EDR, **Sysmon is non-negotiable**. It provides process creation, network connections, driver loading, and file creation telemetry.
**Deployment**:
1. Download Sysmon from Microsoft Sysinternals
2. Use the SwiftOnSecurity configuration: `sysmonconfig-export.xml`
3. Deploy via GPO or Intune:
```cmd
sysmon.exe -accepteula -i sysmonconfig-export.xml
```
**Log forwarding**: Configure Windows Event Forwarding (WEF) or use a free log collector (Wazuh agent, nxlog) to centralize Sysmon logs.
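Once Sysmon events are centralized, even a simple script can surface high-signal patterns. The sketch below parses a Sysmon process-creation record (Event ID 1) in Windows event XML and flags an Office application spawning a shell. The parent and child process lists and the trimmed sample event are illustrative assumptions, and the heuristic is far from a complete detection.

```python
import ntpath
import xml.etree.ElementTree as ET

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}
OFFICE_PARENTS = {"winword.exe", "excel.exe", "outlook.exe"}   # illustrative list
SHELL_CHILDREN = {"powershell.exe", "cmd.exe", "wscript.exe"}  # illustrative list


def parse_process_create(xml_text):
    """Return the EventData fields of a Sysmon Event ID 1 record, or None for other events."""
    root = ET.fromstring(xml_text)
    if root.findtext("e:System/e:EventID", namespaces=NS) != "1":
        return None
    return {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}


def office_spawned_shell(fields):
    """True when an Office process is the parent of a script host or shell."""
    parent = ntpath.basename(fields.get("ParentImage") or "").lower()
    child = ntpath.basename(fields.get("Image") or "").lower()
    return parent in OFFICE_PARENTS and child in SHELL_CHILDREN


# Trimmed, hypothetical event; real records carry many more fields
SAMPLE_EVENT = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System><EventID>1</EventID></System>
  <EventData>
    <Data Name="Image">C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe</Data>
    <Data Name="ParentImage">C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE</Data>
  </EventData>
</Event>"""

fields = parse_process_create(SAMPLE_EVENT)
print(office_spawned_shell(fields))
```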
### LAPS (Local Administrator Password Solution)
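LAPS is **free from Microsoft** and essential. It randomizes local admin passwords per machine and stores them securely in AD. The commands below use the legacy LAPS PowerShell module (`AdmPwd.PS`); on current Windows releases, Windows LAPS is built in and ships equivalent cmdlets (for example `Update-LapsADSchema`), so confirm which variant the client runs before starting.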
**Deployment**:
1. Download LAPS from Microsoft
2. Extend AD schema (one-time, irreversible):
```powershell
Import-Module AdmPwd.PS   # legacy LAPS management module
Update-AdmPwdADSchema
```
3. Set permissions for computer self-write:
```powershell
Set-AdmPwdComputerSelfPermission -OrgUnit "OU=Workstations,DC=corp,DC=example,DC=com"
```
4. Set read permissions for authorized admins only:
```powershell
Set-AdmPwdReadPasswordPermission -OrgUnit "OU=Workstations,DC=corp,DC=example,DC=com" -AllowedPrincipals "HelpDesk-Admins"
```
5. Deploy LAPS client via GPO
**The conversation**:
> *"Every workstation with the same local admin password is a domino. If I compromise one, I own them all. LAPS makes every password unique and rotates it automatically. It is free, from Microsoft, and takes one day to deploy."*
### Windows Firewall Hardening
Enable and log all profiles:
```powershell
# Enable all profiles
Set-NetFirewallProfile -Profile Domain,Public,Private -Enabled True
# Enable logging for dropped packets
Set-NetFirewallProfile -Profile Domain,Public,Private -LogBlocked True -LogFileName "%systemroot%\system32\LogFiles\Firewall\pfirewall.log"
```
**Block inbound by default** except:
- RDP (only via jump host or PAW)
- SMB (only server-to-server, block workstation inbound)
- Required application ports (documented)
### Credential Guard and Device Guard (Where Hardware Supports)
Credential Guard isolates LSASS to prevent credential theft (Mimikatz-style attacks).
**Requirements**: UEFI 2.3.1c+, Secure Boot, TPM 2.0, Hyper-V Hypervisor
**Enable via GPO**:
- Computer Configuration → Administrative Templates → System → Device Guard → Turn On Virtualization Based Security
- Enable Credential Guard
**Banking/telco/power**: These sectors often have hardware that supports Credential Guard. Enable it. It is free and dramatically reduces credential theft risk.
---
## Phase 3: Network Segmentation and Boundary (Days 15-21)
### The Active Directory Perimeter
Most AD environments are "flat": every workstation can reach every server, every VLAN trusts every other VLAN. This is the kill chain.
**Segmentation priorities** (work with existing network team):
| Segment | What It Contains | Access Rules |
|---------|-----------------|--------------|
| Tier 0 | Domain controllers, AD admin jump hosts | No inbound from Tier 1 or 2 beyond the authentication and directory ports clients require (Kerberos, LDAP, DNS). Admin access only from PAWs. |
| Tier 1 | Servers, databases, applications | No inbound from Tier 2 (workstations) except required application ports. |
| Tier 2 | Workstations, user devices | Internet and internal app access only. No direct server admin access. |
| Management | Monitoring, backup, patch management | Outbound to all tiers for management traffic. Inbound restricted to admin sources. |
| OT Boundary | SCADA, ICS, control systems | **Air-gapped or one-way diode**. If integration required, use data diode or unidirectional gateway. |
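The access rules in the table can be captured as a default-deny flow matrix that the network team reviews and tests against. The following is a sketch only: the segment labels and port numbers are illustrative placeholders rather than a recommended rule set, and authentication traffic to domain controllers is omitted for brevity.

```python
# Keys are (source segment, destination segment); values are allowed destination ports.
# Ports are placeholders for illustration, not a recommended rule set.
ALLOWED_FLOWS = {
    ("paw", "tier0"): {3389, 5986},      # admin access to Tier 0 only from PAWs
    ("tier2", "tier1"): {443},           # required application ports only
    ("management", "tier0"): {5986},     # management traffic outbound to all tiers
    ("management", "tier1"): {5986},
    ("management", "tier2"): {5986},
    # No entries with "ot" as source or destination: the OT boundary stays closed.
}


def is_allowed(src, dst, port):
    """Default-deny: a flow passes only if the segment pair and port are listed."""
    return port in ALLOWED_FLOWS.get((src, dst), set())


print(is_allowed("paw", "tier0", 3389))    # admin session from a PAW
print(is_allowed("tier2", "tier0", 3389))  # workstation RDP to a domain controller
print(is_allowed("tier1", "ot", 502))      # server reaching into OT
```

Keeping the matrix as data makes it easy to compare against the firewall's exported rule base during the assessment.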
### DNS Security
DNS is the most underrated security control. Most malware needs DNS to find its command and control.
**Immediate actions**:
- Point all endpoints to a DNS resolver with filtering:
- **Quad9** (9.9.9.9) — free, blocks known malicious domains
- **Cloudflare for Teams** (free tier) — filtering + logging
- **Microsoft DNS security** (if available)
- Enable DNS query logging on internal DNS servers
- Block DNS over HTTPS (DoH) at the firewall unless using a managed DoH provider (otherwise endpoints can bypass DNS filtering and logging, and DNS tunneling goes unseen)
### Network Monitoring on a Budget
**Zeek (formerly Bro)** — open-source network analysis framework:
- Deploy on a SPAN port or network tap at internet boundary
- Provides connection logs, DNS logs, HTTP logs, SSL certificate logs
- Feed into Wazuh, Splunk Free, or Elastic Stack
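Before the logs reach a SIEM, a short script can already answer "what is leaving the network that should not be". The sketch below reads Zeek's tab-separated `conn.log` format, using the `#fields` header to name the columns, and counts outbound destinations on ports outside an expected set. The `EXPECTED_PORTS` allowlist and the sample records are hypothetical.

```python
from collections import Counter

EXPECTED_PORTS = {53, 80, 443}  # hypothetical egress allowlist


def read_zeek_tsv(text):
    """Parse a Zeek TSV log into dicts keyed by the names in the #fields header."""
    fields, rows = [], []
    for line in text.splitlines():
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            rows.append(dict(zip(fields, line.split("\t"))))
    return rows


def unexpected_destinations(rows):
    """Count (responder IP, responder port, protocol) tuples on ports outside the allowlist."""
    counts = Counter(
        (r["id.resp_h"], int(r["id.resp_p"]), r["proto"])
        for r in rows
        if int(r["id.resp_p"]) not in EXPECTED_PORTS
    )
    return counts.most_common()


# Hypothetical, trimmed conn.log sample (real logs carry more columns)
SAMPLE = "\n".join([
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto",
    "1746800000.0\tC1\t10.0.2.15\t51000\t203.0.113.7\t4444\ttcp",
    "1746800005.0\tC2\t10.0.2.15\t51001\t203.0.113.7\t4444\ttcp",
    "1746800010.0\tC3\t10.0.2.16\t51002\t198.51.100.9\t443\ttcp",
])

for dest, hits in unexpected_destinations(read_zeek_tsv(SAMPLE)):
    print(dest, hits)
```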
**Suricata** — open-source IDS/IPS:
- Deploy at internet boundary and critical internal segments
- Use Emerging Threats Open ruleset (free)
- Alert on known malicious indicators
**The conversation**:
> *"You do not need a $100,000 NDR platform to see malicious traffic. You need a SPAN port, an old server, and Zeek. We will show you the connections your firewall is allowing that it should not be."*
---
## Phase 4: Hybrid Identity Security (Days 22-30)
### Azure AD Connect Health
Most on-premises AD environments are synchronized to Entra ID (formerly Azure AD) via Azure AD Connect (since renamed Microsoft Entra Connect).
**Immediate hardening**:
- **Secure the Azure AD Connect server**: Treat it as Tier 0. No interactive logon except admins.
- **Enable PTA (Pass-Through Authentication) or PHS (Password Hash Sync) + Seamless SSO**: Evaluate which is appropriate
- PHS: Better resilience (can authenticate even if AAD Connect is down)
- PTA: Passwords never leave premises (some regulatory preference)
- **Enable password hash synchronization even if using PTA**: Provides fallback auth and enables Identity Protection detections if you later upgrade to P2
- **Enable Seamless SSO**: Reduces password prompts, improves MFA adoption
**Azure AD Connect configuration audit**:
```powershell
# On the AAD Connect server
Get-ADSyncScheduler
Get-ADSyncConnector
```
Verify:
- Only required OUs are syncing
- No accidental filtering exclusions that hide accounts
- The sync account has minimal necessary permissions
### AD FS (If Present)
AD FS is a **high-value target**. If compromised, the adversary controls federation for all cloud apps.
**Immediate hardening**:
- **Upgrade to latest supported version** (AD FS 2019 or later)
- **Enable Extranet Lockout**: Prevents brute force against AD FS from the internet
- **Enable Extranet Smart Lockout (ESL)**: Locks out attempts from unfamiliar locations while the legitimate user keeps access from familiar ones
- **Require MFA for AD FS extranet access** (if MFA infrastructure exists)
- **Review relying party trusts**: Remove stale or unknown trusts
- **Enable AD FS audit logging**: Forward to SIEM
**The conversation**:
> *"If I compromise AD FS, I do not need to crack your passwords. I just federate myself as an administrator. AD FS is Tier 0. Treat it accordingly."*
---
## OT / Critical Infrastructure Specifics (Telco, Power)
### The IT/OT Boundary
In power and telco environments, the AD forest often extends closer to OT than it should.
**Rules**:
- OT networks must not trust IT AD forests directly
- If Active Directory is required in OT, use a **separate forest** with one-way trust or no trust
- SCCM / Intune patch management for OT systems must be on a separate hierarchy
- Administrative credentials for OT must never be used on IT workstations
### Control System Workstations
- Engineering workstations (EWS) and operator stations (HMI) must run **application whitelisting** (AppLocker or third-party)
- USB ports: disabled or strictly controlled
- No internet access from OT VLANs
- Antivirus signatures updated via offline mechanism, not direct internet
### NIS2 and Critical Infrastructure
For EU critical infrastructure (power, telco):
- Incident reporting to the CSIRT or competent authority: early warning within 24 hours, incident notification within 72 hours, final report within one month
- Supply chain security: document every vendor with AD or network access
- Encryption: data at rest and in transit for sensitive systems
- Multi-factor authentication for all remote access to critical systems
See [Vertical: Power Utilities](../reference/vertical-power-utilities.md) for comprehensive OT alignment.
---
## Banking Specifics
### Privileged Access for Financial Data
- Database administrators with access to core banking systems: **vault all credentials**, require dual authorization
- SWIFT infrastructure: isolated network, dedicated workstations, no internet
- Audit trails for all financial transaction system access: immutable, 7+ years retention
### Regulatory Alignment
| Regulation | AD/Endpoint Implication |
|-----------|------------------------|
| **PSD2** | Strong authentication for payment service users; MFA for internal payment systems |
| **DORA** | ICT risk management includes identity and access; recovery testing mandatory |
| **GDPR** | Access to personal data must be logged, justified, and time-bounded |
| **NIS2** (for systemic banks) | Incident reporting, supply chain risk management, encryption |
See [Vertical: Banking](../reference/vertical-banking.md) for comprehensive regulatory alignment.
---
## 30-Day Checklist for AD/Endpoint Engagements
- [ ] Full AD identity census exported and analyzed
- [ ] KRBTGT password rotation completed (or scheduled with plan)
- [ ] All privileged groups documented and justified
- [ ] LAPS deployed to all workstations
- [ ] Sysmon deployed to all Windows endpoints
- [ ] Defender Antivirus fully enabled and updated
- [ ] Windows Firewall enabled and logging on all endpoints
- [ ] DNS filtering deployed (Quad9 / Cloudflare)
- [ ] Network segmentation plan documented (even if not fully implemented)
- [ ] Azure AD Connect server secured and audited
- [ ] AD FS hardened (if present)
- [ ] Backup of AD System State tested (verify you can restore a DC)
- [ ] Credential Guard enabled on capable hardware
---
*Previous: [M365 E3 Hardening](m365-e3-hardening.md)*
*Next: [Implementation Playbook](implementation-playbook.md)*


@@ -0,0 +1,326 @@
# AI-Assisted Threat and Vulnerability Management Blueprint
> *"Mythos will scan your entire perimeter in hours, not weeks. But here is the asymmetry: Mythos finds vulnerabilities. AI-assisted TVM finds them first, prioritizes them by exploitability in your specific environment, and generates the remediation code before the adversary writes the exploit."*
This blueprint provides a concrete, board-ready program for organizations facing the reality that AI-powered adversaries—whether criminal tools or agentic systems like Mythos—can discover and weaponize vulnerabilities faster than human teams can patch them.
It is designed for CTOs who need to go to the board with **something tangible**: not just "fix the basics," but an active, modern defensive capability that uses artificial intelligence as a force multiplier against AI-powered offence.
---
## The Problem: AI-Powered Offense Changes the Math
### Traditional Vulnerability Management
| Step | Traditional Timeline | Human Effort |
|------|---------------------|--------------|
| Scan for vulnerabilities | Weekly or monthly | Automated scanner |
| Prioritize findings | Days to weeks | Analyst reads CVSS, debates internally |
| Assess exploitability | Weeks | Manual research, PoC testing |
| Create remediation | Weeks to months | Engineering ticket, backlog queue |
| Validate fix | Months | Re-scan, manual verification |
| **Total cycle** | **3-9 months** | **Heavy human bottlenecks** |
### AI-Powered Offense (Mythos-Class)
| Capability | Impact |
|-----------|--------|
| **Continuous autonomous scanning** | Perimeter scanned daily, not monthly |
| **Intelligent vulnerability chaining** | Identifies kill chains: vuln A + vuln B + misconfiguration C = domain compromise |
| **Automated exploit generation** | Proof-of-concept code generated in minutes for newly disclosed CVEs |
| **Context-aware targeting** | Prioritizes vulnerabilities on internet-facing, privileged, or unmonitored assets |
| **Speed** | What took a human red team weeks takes an AI agent hours |
**The board conversation the CTO fears**:
> *"We have 12,000 open vulnerabilities. Our patching SLA is 90 days for critical. Mythos—or a criminal group using similar tooling—can scan our entire estate, chain our weaknesses, and have an exploit ready before we have even assigned the ticket."*
**The traditional consultant response** (which is correct but insufficient):
> *"We need to implement CIS IG1, clean up our attack surface, and get our house in order."*
**The problem**: The board has heard this before. The CTO has heard this before. It sounds like the same plan that has failed for five years, now with an AI-shaped deadline.
---
## The Asymmetric Response: AI-Assisted TVM
AI-assisted TVM does not replace basic hygiene. It **accelerates it by an order of magnitude**. The goal is not to eliminate all vulnerabilities—that is impossible. The goal is to **compress the find-to-fix cycle so dramatically that the adversary's AI advantage is neutralized**.
| Traditional TVM | AI-Assisted TVM | Speed Multiplier |
|----------------|-----------------|------------------|
| Scan → prioritize by CVSS | Scan → prioritize by **exploitability × asset criticality × active threat intelligence** | 10x faster prioritization |
| Manual research: "Is this actually exploitable?" | AI predicts exploitability from code patterns, social media chatter, and dark web indicators | 100x faster assessment |
| Manual ticket creation and assignment | AI generates **remediation code, GPO scripts, or Intune policies** with human review | 10x faster remediation prep |
| Monthly re-scan to verify | Continuous validation via **agent-based monitoring and drift detection** | Real-time verification |
| Analyst reads 500-page scan report | AI synthesizes **top 10 actions that reduce risk most** into a one-page brief | Board-ready in seconds |
---
## The Architecture
### Layer 1: Discovery and Inventory
**Goal**: Know what you have before the adversary does.
| Source | What It Provides | AI Enhancement |
|--------|-----------------|---------------|
| **Defender Exposure Management** (E5) | Vulnerability inventory, misconfigurations, Secure Score | AI prioritizes recommendations by actual exploitability, not just severity |
| **Network scanners** (Tenable, Qualys, Rapid7, OpenVAS) | Traditional vulnerability scanning | AI correlates scan results with threat intel to predict which vulns will be exploited first |
| **Cloud security posture** (Defender for Cloud, Prisma, Wiz) | Cloud resource misconfigurations | AI identifies cloud-specific kill chains (e.g., overly permissive S3 → compromised IAM → lateral movement) |
| **Zero-budget discovery** (PowerShell, SSH scripts, Syft/Grype, osquery) | Server inventory, SBOMs, package-level CVE correlation | AI aggregates script-based findings into unified risk view. See [Zero-Budget Vulnerability Discovery](zero-budget-vulnerability-discovery.md) |
| **osquery + FleetDM** | Cross-platform endpoint inventory, real-time process/network data, policy compliance | AI queries live endpoint state for prioritization and kill chain simulation. See [Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md) |
| **Attack surface management** (Cortex Xpanse, Shodan, Nuclei, Amass) | External-facing assets unknown to IT | AI maps shadow IT and forgotten assets faster than manual discovery. See [Perimeter Scanning Capability](perimeter-scanning-capability.md) |
| **Software bill of materials (SBOM)** | Known vulnerable components in applications | AI monitors SBOMs against real-time CVE disclosure and exploit availability |
### Layer 2: Intelligent Prioritization
**Goal**: Stop patching by CVSS. Start patching by **probability of exploitation in your environment**.
| Input | AI Processing | Output |
|-------|--------------|--------|
| CVE database + exploit code availability | Predictive model: will this be exploited in the wild in the next 7/14/30 days? | Risk-ranked vulnerability list |
| Asset criticality (CMDB + business context) | Cross-reference: which vulnerable assets are Tier 0 / Tier 1 / internet-facing? | Environment-specific priority |
| Active threat intelligence (MISP, CISA KEV, vendor advisories) | Correlation: are threat actors currently targeting this vulnerability? | Threat-informed urgency |
| Network topology and segmentation | Kill chain simulation: can this vulnerability be reached from the internet? From a compromised workstation? | Reachability-adjusted risk |
| Compensating controls | Control validation: is the vulnerable host behind WAF? Is EDR monitoring it? | Residual risk calculation |
| External attack surface (perimeter scan findings) | Outside-in risk multiplier: internet-facing vulns weighted 10x higher than internal | Perimeter-aware priority |
**The outside-in weighting**: A vulnerability on an internet-facing server is 10x more urgent than the same vulnerability on an internal workstation because adversary AI scanners find it first. See [Perimeter Scanning Capability](perimeter-scanning-capability.md).
**The result**: Instead of 12,000 vulnerabilities sorted by CVSS, the team sees **the 50 vulnerabilities that matter this week**—ranked by the probability that an AI-powered adversary will exploit them in the client's specific architecture.
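The inputs in the table can be combined into a single ranking score. The following is a minimal sketch of one way to do that, not the scoring model of any particular product. Only the 10x internet-facing multiplier comes from the text above; the tier weights, the active-threat multiplier, and the findings are hypothetical.

```python
from dataclasses import dataclass

INTERNET_FACING_MULTIPLIER = 10.0        # outside-in weighting described above
ACTIVE_THREAT_MULTIPLIER = 2.0           # hypothetical boost for actively targeted vulns
TIER_WEIGHT = {0: 3.0, 1: 2.0, 2: 1.0}   # hypothetical asset-criticality weights


@dataclass
class Finding:
    name: str
    exploit_probability: float    # 0..1, e.g. output of a predictive model
    asset_tier: int               # 0, 1, or 2
    internet_facing: bool
    actively_targeted: bool
    control_effectiveness: float  # 0..1 share of risk removed by compensating controls


def priority(f):
    """Exploitability x criticality x exposure x threat activity, reduced by controls."""
    score = f.exploit_probability * TIER_WEIGHT[f.asset_tier]
    if f.internet_facing:
        score *= INTERNET_FACING_MULTIPLIER
    if f.actively_targeted:
        score *= ACTIVE_THREAT_MULTIPLIER
    return score * (1.0 - f.control_effectiveness)


findings = [
    Finding("vuln-a", 0.2, 2, True, False, 0.0),   # modest bug on an internet-facing host
    Finding("vuln-b", 0.9, 1, False, False, 0.5),  # likely exploit, internal, behind controls
    Finding("vuln-c", 0.5, 0, False, True, 0.0),   # Tier 0 asset under active targeting
]
ranked = sorted(findings, key=priority, reverse=True)
print([f.name for f in ranked])
```

Note how `vuln-b`, the finding with the highest raw exploit probability, ranks last once exposure and compensating controls are taken into account.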
### Layer 3: Automated Remediation Preparation
**Goal**: Reduce the time from "identified" to "fix ready" from weeks to hours.
| Vulnerability Type | AI-Generated Remediation | Human Review Required |
|-------------------|-------------------------|----------------------|
| Missing OS patch | PowerShell/Intune update policy + deployment ring recommendation | Yes: test and schedule |
| Misconfigured firewall rule | Corrected rule + impact analysis + rollback script | Yes: network team validation |
| Default credential | Password randomization script + vault storage + service restart procedure | Yes: application owner sign-off |
| TLS configuration weakness | Hardened registry settings / nginx config / Azure Front Door policy | Yes: SSL/TLS team validation |
| Cloud IAM over-permission | Least-privilege policy + impact simulation | Yes: cloud team review |
| Container image vulnerability | Updated Dockerfile + base image recommendation | Yes: CI/CD pipeline test |
**Key principle**: AI generates the **draft remediation**. Humans validate, test, and deploy. This is not autonomous patching. It is **augmented patching**—the AI does the research and scripting; the human does the judgment and approval.
### Layer 4: Continuous Validation
**Goal**: Prove that fixes worked and detect drift immediately.
| Validation Method | AI Enhancement |
|-------------------|---------------|
| Re-scan after patch | AI correlates patch deployment with scan results; flags failed patches automatically |
| Configuration drift detection | AI baselines "known good"; alerts on deviation within hours, not months |
| Exploit attempt detection | AI monitors EDR/SIEM for exploitation techniques targeting recently disclosed CVEs |
| Adversarial simulation | AI-driven purple team exercises that target the **exact vulnerabilities** still open |
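At its core, configuration drift detection is a comparison against a recorded "known good" state. The sketch below shows that comparison as a plain dictionary diff, with no AI model involved. The setting names echo Defender preferences used elsewhere in this repository, but the baseline and current values are hypothetical.

```python
def detect_drift(baseline, current):
    """Return {setting: (expected, actual)} for every deviation from the baseline."""
    drift = {}
    for key, expected in baseline.items():
        actual = current.get(key, "<missing>")
        if actual != expected:
            drift[key] = (expected, actual)
    # Settings that appear only in the current state are drift too (e.g. new exclusions)
    for key in current.keys() - baseline.keys():
        drift[key] = ("<absent>", current[key])
    return drift


baseline = {
    "DisableRealtimeMonitoring": False,
    "EnableNetworkProtection": "Enabled",
    "FirewallDomainProfile": True,
}
current = {
    "DisableRealtimeMonitoring": True,       # protection switched off since baseline
    "EnableNetworkProtection": "Enabled",
    "FirewallDomainProfile": True,
    "ExclusionPath": "C:\\Temp",             # exclusion added since baseline
}
for setting, (expected, actual) in sorted(detect_drift(baseline, current).items()):
    print(f"{setting}: expected {expected!r}, found {actual!r}")
```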
---
## The 30-60-90 Day AI-Assisted TVM Sprint
### Phase 1: Baseline and Acceleration (Days 0-30)
**Theme**: *Know your enemy's starting point. Beat them to the first move.*
**Week 1: Threat-Informed Asset Discovery**
- Inventory all vulnerability scanning sources (Defender Exposure Management, Tenable, Qualys, cloud scanners, or zero-budget scripts if no commercial tools exist)
- Identify gaps: which assets are not scanned? Which scans are stale?
- Deploy **attack surface management** scan: discover what the internet sees
- Deploy **Shadow IT discovery**: unknown cloud apps, unapproved infrastructure
- Run **zero-budget discovery sweep** on servers without EDR/scanner coverage. See [Zero-Budget Vulnerability Discovery](zero-budget-vulnerability-discovery.md)
**Deliverable**: Asset and vulnerability inventory with coverage gaps identified
**Week 2: AI-Powered Prioritization Engine**
- Integrate vulnerability data with:
- CISA Known Exploited Vulnerabilities (KEV) catalog
- ExploitDB / GitHub exploit availability
- Dark web chatter monitoring (where feasible)
- Client's CMDB for asset criticality
- Deploy **local AI model** (or Azure OpenAI with structured prompting) to:
- Synthesize scan results into risk-ranked action list
- Predict which vulnerabilities will be exploited in next 30 days
- Generate one-page executive brief weekly
**Deliverable**: AI-prioritized vulnerability list; first executive brief
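The KEV integration is the simplest of these feeds to wire up and needs no model at all. The sketch below splits scanner findings into "in KEV" and "backlog", assuming the catalog's JSON layout of a `vulnerabilities` list whose entries carry a `cveID` field. The inline catalog, hosts, and CVE identifiers are placeholders rather than real catalog data; in practice the catalog would be downloaded from CISA.

```python
import json

# Placeholder excerpt in the shape of the KEV JSON feed (not real catalog entries)
KEV_SAMPLE = json.loads("""
{"vulnerabilities": [
    {"cveID": "CVE-0000-0001"},
    {"cveID": "CVE-0000-0002"}
]}
""")

# Hypothetical scanner output
findings = [
    {"host": "web01", "cve": "CVE-0000-0001"},
    {"host": "db01", "cve": "CVE-0000-0099"},
    {"host": "vpn01", "cve": "CVE-0000-0002"},
]


def kev_ids(catalog):
    """Collect the CVE identifiers listed in the catalog."""
    return {entry["cveID"] for entry in catalog["vulnerabilities"]}


def split_by_kev(scan_findings, catalog):
    """Return (urgent, backlog): findings with and without a KEV match."""
    known = kev_ids(catalog)
    urgent = [f for f in scan_findings if f["cve"] in known]
    backlog = [f for f in scan_findings if f["cve"] not in known]
    return urgent, backlog


urgent, backlog = split_by_kev(findings, KEV_SAMPLE)
print("Known exploited:", [f["host"] for f in urgent])
print("Backlog:", [f["host"] for f in backlog])
```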
**Week 3: Remediation Acceleration**
- Select top 20 vulnerabilities from AI-prioritized list
- Use AI to generate remediation scripts/policies for each
- Human review and validation
- Deploy fixes in controlled maintenance windows
- Measure: time from identification to fix ready vs. historical baseline
**Deliverable**: 20 critical vulnerabilities remediated or in controlled deployment
**Week 4: Validation and Board Briefing**
- Re-scan to validate fixes
- AI generates before/after risk dashboard
- Board briefing: "We had 12,000 vulnerabilities. AI identified the 50 that mattered. We fixed the top 20 in 30 days. Here is the trend."
**Deliverable**: Board-ready TVM dashboard; 30-day metrics report
---
### Phase 2: Operationalization (Days 30-60)
**Theme**: *Make AI-assisted TVM the operating rhythm, not a project.*
**Week 5-6: Integration into SOC Workflow**
- Vulnerability alerts feed into SOC triage queue
- AI enriches vulnerability alerts with: exploit availability, asset criticality, business impact
- SOC analysts can escalate high-risk vulnerabilities as incidents
- Automated containment: vulnerable internet-facing assets temporarily restricted pending patch
**Week 7-8: Automated Remediation Pipeline**
- Build CI/CD pipeline for vulnerability remediation:
- AI generates patch policy → security team reviews → automated deployment to test ring → validation → production deployment
- Target: 80% of routine patches (OS, browser, standard apps) automated with human approval
- Exception handling: complex or risky patches remain manual
**Week 9-10: Purple Team Targeting Open Vulnerabilities**
- Purple team exercise: red team attempts to exploit vulnerabilities **still open** from the AI-prioritized list
- Measures: Did the SOC detect the exploitation attempt? Did the vulnerability allow compromise? How fast was response?
- Findings feed back into AI prioritization model
**Deliverable**: Operating rhythm established; automated pipeline operational; first vulnerability-focused purple team complete
---
### Phase 3: Strategic Advantage (Days 60-90)
**Theme**: *Convert vulnerability management from cost centre to competitive advantage.*
**Week 11-12: Predictive and Proactive**
- AI monitors CVE disclosure streams in real time
- Within 24 hours of critical CVE disclosure:
- AI assesses: are we affected? Which assets? What is the exposure?
- AI generates: risk assessment, remediation script, communication draft
- Human team validates and deploys in <48 hours
- Compare: industry average for critical CVE response is 30-60 days. Target: <48 hours for high-confidence remediations.
**Ongoing: Continuous Improvement**
- Weekly AI-generated TVM executive brief
- Monthly purple team exercise targeting open vulnerabilities
- Quarterly board report: mean time to remediate, AI prediction accuracy, adversarial simulation results
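Two of the quarterly metrics can be computed directly from ticket data. The sketch below assumes mean time to remediate is measured in whole days over closed findings, and approximates "AI prediction accuracy" as precision, meaning the share of flagged vulnerabilities that were later exploited. Both definitions and all sample records are assumptions chosen for illustration.

```python
from datetime import date
from statistics import mean


def mean_time_to_remediate(records):
    """Average days from identification to fix, ignoring findings that are still open."""
    days = [(r["fixed"] - r["identified"]).days for r in records if r["fixed"] is not None]
    return mean(days) if days else None


def prediction_precision(predicted, exploited):
    """Share of vulnerabilities flagged as likely-to-be-exploited that were in fact exploited."""
    predicted = set(predicted)
    if not predicted:
        return None
    return len(predicted & set(exploited)) / len(predicted)


# Hypothetical ticket records
records = [
    {"identified": date(2026, 3, 1), "fixed": date(2026, 3, 5)},
    {"identified": date(2026, 3, 10), "fixed": date(2026, 3, 16)},
    {"identified": date(2026, 3, 20), "fixed": None},  # still open
]
print(mean_time_to_remediate(records))
print(prediction_precision({"a", "b", "c", "d"}, {"a", "c", "x"}))
```

Open findings are excluded from the average here, which flatters the metric; reporting the count of open items alongside it keeps the board view honest.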
---
## The Board-Ready Demo Script
When the CTO walks into the boardroom with this program, they bring **evidence, not promises**.
### The 10-Minute Demo
**Minute 1-2: The Threat**
> *"Last month, an AI-powered scanning tool identified 12,000 vulnerabilities in our environment. Industry average time to patch a critical vulnerability: 60 days. Industry average time for an AI-powered adversary to weaponize a newly disclosed vulnerability: 5 days. The gap is fatal."*
**Minute 2-4: The Traditional Response**
> *"Our previous approach was to patch by CVSS score. The board has seen this plan before. It requires 20 additional engineers we cannot hire, 9 months we do not have, and produces a false sense of security because CVSS does not predict exploitability."*
**Minute 4-7: The AI-Assisted Alternative**
[Show the dashboard live]
> *"This is our AI-assisted TVM platform. It does not show us 12,000 vulnerabilities. It shows us the 47 vulnerabilities that an adversary is likely to exploit in our specific environment this month, ranked by probability."*
[Click on top vulnerability]
> *"This vulnerability—CVE-2024-XXXX—is on three of our internet-facing web servers. CVSS score: 7.5. But the AI has cross-referenced exploit availability, our network topology, and active threat intelligence. It predicts 85% probability of exploitation within 14 days. It has already generated the remediation script. We are deploying it tonight."*
[Show before/after]
> *"In 30 days, we reduced our exploitable attack surface by 40%. We did not hire 20 engineers. We used AI to prioritize, generate fixes, and validate. Our mean time to remediate a critical vulnerability dropped from 60 days to 4 days."*
**Minute 7-10: The Ask**
> *"We are not asking for a three-year transformation. We are asking for a 90-day sprint to operationalize AI-assisted vulnerability management. The investment is less than one senior engineer's annual salary. The return is closing the 55-day gap between adversary weaponization and our remediation."*
---
## Tool Stack Recommendations
### Microsoft-Centric (Most Common for Our Clients)
| Layer | Microsoft Tool | AI Enhancement |
|-------|---------------|---------------|
| Discovery | Defender Exposure Management + Defender for Cloud | AI prioritizes exposure recommendations by exploitability |
| Prioritization | Azure OpenAI / local LLM + CISA KEV feed + MISP | Predictive exploitability scoring |
| Remediation | Intune + Azure Policy + PowerShell + Azure Automation | AI-generated remediation scripts and policies |
| Validation | Defender for Endpoint + Sentinel | AI-driven drift detection and adversarial simulation validation |
| Reporting | Power BI + Azure OpenAI synthesis | Natural language executive briefs generated automatically |
### Open-Source and Hybrid
| Layer | Tool | Role |
|-------|------|------|
| Discovery | Wazuh + OpenVAS + osquery/FleetDM + Cloud-native scanners | Vulnerability, configuration, and real-time endpoint discovery |
| Prioritization | Local LLM (Llama 3, Mistral) + exploit prediction models | On-premise AI for sensitive environments |
| Remediation | Ansible + Puppet + custom scripts | Infrastructure-as-code remediation |
| Validation | VulnHub + Atomic Red Team + Caldera | Continuous adversarial validation |
| Reporting | Grafana + custom dashboards + LLM synthesis | Real-time metrics and executive summaries |
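
The prioritization layer in both stacks can be illustrated with a minimal sketch. It assumes a hand-picked additive score; the `Finding` fields, the weights, and the `priority_score` and `rank` names are hypothetical and do not describe how any product listed above actually scores vulnerabilities. The point is only that known exploitation and exposure can outrank raw CVSS.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Finding:
    cve_id: str
    cvss: float            # base score, 0.0-10.0
    in_kev: bool           # listed in the CISA KEV catalog
    internet_facing: bool  # affected asset is reachable from the internet


def priority_score(finding: Finding) -> float:
    """Illustrative score: severity plus bonuses for exploitation and exposure."""
    score = finding.cvss / 10.0   # normalise severity to 0-1
    if finding.in_kev:
        score += 1.0              # assumed weight: known exploitation dominates
    if finding.internet_facing:
        score += 0.5              # assumed weight: exposure raises urgency
    return score


def rank(findings: Sequence[Finding]) -> List[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority_score, reverse=True)


# Placeholder identifiers, not real CVEs.
findings = [
    Finding("CVE-EXAMPLE-A", 9.8, in_kev=False, internet_facing=False),
    Finding("CVE-EXAMPLE-B", 7.5, in_kev=True, internet_facing=True),
    Finding("CVE-EXAMPLE-C", 5.0, in_kev=False, internet_facing=True),
]
ordered = [f.cve_id for f in rank(findings)]
```

With these placeholder inputs, the 7.5-rated finding that is both in KEV and internet-facing ranks ahead of the isolated 9.8. A production model would replace the fixed weights with threat intelligence and topology data.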
---
## The Honest Limitations
AI-assisted TVM is powerful but not magic. Be honest with the board:
| What AI TVM Does Well | What AI TVM Cannot Do |
|----------------------|----------------------|
| Prioritizes faster and smarter than humans | Cannot patch systems without human approval and testing |
| Generates remediation scripts and policies | Cannot fix architectural debt or design flaws |
| Predicts which vulnerabilities will be exploited | Cannot predict zero-days before disclosure |
| Validates fixes continuously | Cannot replace basic hygiene (CIS IG1 is still mandatory) |
| Reduces analyst workload by 70% | Cannot operate without skilled human oversight |
**The framing**:
> *"AI-assisted TVM does not replace our need to implement CIS IG1, harden our endpoints, and govern our identities. What it does is compress the vulnerability management cycle from months to days—giving us a fighting chance against adversaries who operate at machine speed. It is the accelerator. Basic hygiene is still the foundation."*
---
## Integration With Existing Frameworks
| Document | Integration Point |
|----------|-------------------|
| [Rapid Modernisation Plan](rapid-modernisation-plan.md) | AI TVM maps to Phase 1 (Hygiene: visibility), Phase 2 (Control: prioritized remediation), and Phase 4 (Antifragility: continuous learning) |
| [Modular Engagements](../core/modular-engagements.md) | AI TVM can be delivered as a standalone 90-day module or embedded in Module 3 (M365 Security Hardening) and Module 12 (Blue/Purple Team) |
| [Zero-Budget Hardening](zero-budget-hardening.md) | AI TVM leverages existing Microsoft tooling (Defender Exposure Management, Intune) before recommending new purchases |
| [Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md) | osquery provides the owned, queryable data layer for AI prioritization; FleetDM enables continuous endpoint monitoring |
| [Azure OpenAI Sovereignty Bridge](../core/azure-openai-sovereignty-bridge.md) | Azure OpenAI can power the prioritization and synthesis layers; local AI can power air-gapped environments |
| [Antifragile Risk Register](../assessment-templates/antifragile-risk-register.md) | AI TVM directly addresses vulnerability-related risks with convex payoff: small AI investment prevents catastrophic exploitation |
---
## Metrics and KPIs
| Metric | Before | 30-Day Target | 90-Day Target |
|--------|--------|--------------|---------------|
| Mean time to prioritize critical vuln | 14 days | 24 hours | 4 hours |
| Mean time to remediate critical vuln | 60 days | 14 days | 4 days |
| Vulnerabilities with known exploits (open) | Unknown | Measured | <10 |
| % of estate with current scan coverage | 60% | 90% | 98% |
| AI prediction accuracy (exploited vs. not) | N/A | 70% | 85% |
| Time to generate remediation script | 2 days | 2 hours | 30 minutes |
| Executive brief generation time | 8 hours | 30 minutes | 5 minutes (automated) |
| Purple team detection rate (open vulns) | Unknown | 50% | 80% |
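
As a rough illustration of how the "mean time to remediate" row could be measured, the sketch below averages the days between detection and fix. The record layout (pairs of detection and remediation dates) and the function name are assumptions for this example; a real export from a scanner or ticketing system would need mapping into this shape.

```python
from datetime import date
from statistics import mean
from typing import Iterable, Optional, Tuple

Record = Tuple[date, Optional[date]]  # (detected, remediated or None if still open)


def mean_days_to_remediate(records: Iterable[Record]) -> Optional[float]:
    """Average days from detection to fix; open findings are excluded."""
    durations = [
        (remediated - detected).days
        for detected, remediated in records
        if remediated is not None
    ]
    if not durations:
        return None  # nothing closed yet, so the metric is undefined
    return mean(durations)


# Placeholder data for illustration only.
sample = [
    (date(2026, 1, 1), date(2026, 1, 5)),
    (date(2026, 1, 10), date(2026, 1, 16)),
    (date(2026, 1, 20), None),
]
mttr = mean_days_to_remediate(sample)
```

Excluding open findings flatters the number, so the open-vulnerability count in the table should be reported alongside it.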
---
*For the AI operations inevitability argument, see [AI Operations Inevitability](../core/ai-operations-inevitability.md).*
*For the business case template, see [Business Case Template](business-case-template.md).*
*For board conversation guidance, see [C-Suite Conversation Guide](../core/c-suite-conversation-guide.md).*


@@ -0,0 +1,245 @@
# Business Case Template
> *"The board does not buy security. The board buys risk reduction, regulatory survival, and competitive advantage. Price it accordingly."*
This template provides a reusable structure for building financial justification for antifragile engagements. It is designed to be adapted per client, per vertical, and per regulatory context. The output should be a 4-6 page document that a CFO can evaluate in 15 minutes.
---
## Document Structure
### Page 1: Executive Summary
**Subtitle**: *Investment Proposal: Antifragile Enterprise Program*
| Element | Content |
|---------|---------|
| **Investment ask** | €[X] over 180 days, phase-gated with go/no-go decisions at days 30, 60, 90 |
| **Primary return** | Reduction of existential cyber risk; regulatory compliance evidence; competitive differentiation through AI sovereignty |
| **Break-even** | Day 90 (via avoided regulatory fine exposure, reduced insurance premiums, or operational resilience) |
| **Risk of inaction** | Quantified below; summary: [X]% probability of material incident within 24 months at estimated cost of €[Y] |
### Page 2: Cost of Inaction
**Frame**: The most expensive decision is the one not to act.
#### Direct Costs (Quantifiable)
| Risk Category | Probability (Client-Specific) | Average Industry Cost | Expected Value |
|--------------|------------------------------|----------------------|----------------|
| Ransomware incident (recovery + downtime) | [X]% | €4.5M | €[X * 4.5M] |
| Regulatory fine (DORA / NIS2 / national) | [X]% | 1-2% global turnover | €[X * % GT] |
| Data breach notification and remediation | [X]% | €3.8M (per IBM Cost of Data Breach Report) | €[X * 3.8M] |
| Cloud AI vendor price increase / lock-in | [X]% | 200-500% price shock | €[X * shock] |
| Competitive intelligence loss (cloud AI training) | [X]% | Unquantifiable but existential | High |
**Calculation**:
```
Expected Loss = Σ (Probability_i × Cost_i)
```
Present this as: *"Without intervention, the organization faces an expected loss of €[X] over 24 months. The proposed program costs €[Y], representing a [Z]:1 return on risk reduction."*
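
The expected-loss sum can be sketched in a few lines of Python. The `Risk` class, the function names, and the probabilities are placeholders introduced for illustration; the two cost figures are the industry averages from the table above, and real inputs must come from the client-specific assessment.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # chance of occurrence over the 24-month horizon, 0.0-1.0
    cost_eur: float     # estimated cost if the event occurs


def expected_loss(risks: Sequence[Risk]) -> float:
    """Sum of probability x cost across all risk categories."""
    for risk in risks:
        if not 0.0 <= risk.probability <= 1.0:
            raise ValueError(f"probability out of range for {risk.name}")
    return sum(risk.probability * risk.cost_eur for risk in risks)


def return_ratio(loss_eur: float, program_cost_eur: float) -> float:
    """The [Z] in the '[Z]:1 return on risk reduction' sentence."""
    if program_cost_eur <= 0:
        raise ValueError("program cost must be positive")
    return loss_eur / program_cost_eur


# Placeholder probabilities; replace with client-specific estimates.
example_risks = [
    Risk("Ransomware incident", 0.20, 4_500_000),
    Risk("Data breach remediation", 0.10, 3_800_000),
]
loss = expected_loss(example_risks)
ratio = return_ratio(loss, 400_000)
```

Keeping each risk as a separate record makes it easy to show the CFO which category drives the total.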
#### Indirect Costs (Narrative)
- **Reputational damage**: Customer churn, difficulty acquiring new business, talent attrition
- **Operational paralysis**: During an incident, leadership attention is diverted from growth to survival
- **Insurance premium increases**: Cyber insurers are tightening terms; resilience demonstrably reduces premiums
- **Regulatory scrutiny**: A single incident triggers multi-year regulatory attention and reporting obligations
---
### Page 3: Investment Structure
**Frame**: We spend your money as if it were our own. Configuration first. Purchase only if justified.
#### Phase-Gated Budget
| Phase | Timeline | Primary Activity | Estimated Cost | Go/No-Go Gate |
|-------|----------|-----------------|----------------|---------------|
| **1. Hygiene** | Days 0-30 | Configuration of existing tools; identity cleanse; visibility | €[X] (primarily labor) | Day 30: Demonstrate risk reduction or stop |
| **2. Control** | Days 30-60 | ASR, MFA enforcement, network segmentation, vendor lockdown | €[X] (labor + minimal tooling) | Day 60: Validate control effectiveness |
| **3. Sovereignty** | Days 60-90 | Local AI pilot; recovery drills; T0 asset protection | €[X] (labor + local inference hardware if needed) | Day 90: Prove local AI viability |
| **4. Antifragility** | Days 90-180 | Chaos engineering; red team; continuous improvement | €[X] (labor + external testing) | Day 180: Maturity assessment and next-phase planning |
| **Total** | 180 days | | **€[X]** | |
#### Cost Categories
| Category | Typical % of Budget | Description |
|----------|--------------------|-------------|
| Consulting / Labor | 60-70% | Configuration, process design, training, documentation |
| Existing Tool Activation | 0% | Included in current licensing; no new purchase |
| Local AI Infrastructure | 10-20% | Hardware or sovereign cloud for inference (only if pilot justifies) |
| External Testing | 10-15% | Red team, penetration testing, regulatory validation |
| Training / Change Management | 5-10% | Security awareness, champion programs, board briefings |
#### Compare to Alternatives
| Alternative Approach | Cost | Timeline | Risk |
|---------------------|------|----------|------|
| **Do nothing** | €0 | — | Expected loss €[X] over 24 months |
| **Traditional security audit** | €[X] | 90 days | Produces report; no structural change |
| **Full E5 licensing upgrade** | €[X]/user/year | 30 days | Solves some gaps; does not address architecture or AI sovereignty |
| **Managed security service (MSSP)** | €[X]/month | Ongoing | Outsources detection; does not reduce structural fragility |
| **Antifragile program (this proposal)** | €[X] | 180 days | Structural change, regulatory evidence, AI sovereignty, measurable resilience |
---
### Page 4: Return on Investment
**Frame**: The return is not revenue. It is **avoided cost + preserved optionality + regulatory license to operate**.
#### Quantifiable Returns
| Return Category | Calculation | 12-Month Value | 24-Month Value |
|----------------|-------------|---------------|----------------|
| Avoided ransomware recovery | Probability reduction × €4.5M | €[X] | €[Y] |
| Avoided regulatory fine | Probability reduction × % GT | €[X] | €[Y] |
| Insurance premium reduction | 10-20% reduction on cyber premium | €[X] | €[Y] |
| Cloud AI cost stabilization | Shift from variable API costs to fixed infra | €[X] | €[Y] |
| Reduced incident response cost | Faster detection and containment | €[X] | €[Y] |
| **Total Quantifiable Return** | | **€[X]** | **€[Y]** |
#### Strategic Returns (Narrative)
| Return Category | Description |
|----------------|-------------|
| **Competitive moat** | Proprietary data improves only your models; competitors cannot replicate your operational intelligence |
| **Regulatory agility** | Demonstrable resilience accelerates regulatory approvals, market entries, and partnership discussions |
| **Talent retention** | Engineers and security professionals prefer organizations that invest in durability over firefighting |
| **M&A readiness** | Clean identity architecture, tested recovery, and documented controls increase valuation and reduce due-diligence friction |
| **Vendor negotiation leverage** | Documented exit architectures improve negotiating position with all major suppliers |
#### ROI Summary
```
ROI = (Total Return - Total Investment) / Total Investment × 100%
```
Present as: *"This program delivers a [X]% return in year one, rising to [Y]% in year two, with strategic optionality that compounds beyond quantification."*
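
A minimal sketch of the ROI formula, assuming both inputs are in the same currency and cover the same period. The function name and the example figures are illustrative only.

```python
def roi_percent(total_return_eur: float, total_investment_eur: float) -> float:
    """ROI = (Total Return - Total Investment) / Total Investment x 100%."""
    if total_investment_eur <= 0:
        raise ValueError("investment must be positive")
    return (total_return_eur - total_investment_eur) / total_investment_eur * 100.0


# Placeholder figures: 300k quantifiable return on a 200k program.
year_one_roi = roi_percent(300_000, 200_000)
```

Running it once with the 12-month return and once with the 24-month return gives the [X]% and [Y]% values for the summary sentence.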
---
### Page 5: Risk and Sensitivity Analysis
**Frame**: We are honest about what could go wrong. That honesty is why you should trust us.
#### Program Risks
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Operational disruption during hygiene phase | Medium | Medium | Changes executed in maintenance windows; rollback procedures documented; "get out of jail free" executive authorization |
| Client team capacity constraints | High | Medium | Weekly sprints with clear priorities; we do the heavy lifting; client provides decisions, not labor |
| Scope creep | Medium | High | Ruthless phase gating; kill chain prioritization; deferred items tracked for future phases |
| Tool activation reveals deeper problems | High | Low | This is the point. Early discovery is cheaper than late discovery. |
| Executive sponsor departure | Low | High | Board-level endorsement; documented in steering committee minutes; knowledge transfer at each phase |
#### Sensitivity Analysis
| Scenario | Investment Adjustment | Outcome |
|----------|----------------------|---------|
| **Best case** | No additional tooling needed | Program completes under budget; all value from configuration |
| **Base case** | Local AI hardware required for pilot | Slight budget increase; sovereign intelligence proven |
| **Worst case** | Deeper technical debt than anticipated | Extend Phase 1 by 30 days; additional labor cost; still cheaper than incident |
---
### Page 6: Recommendation and Next Steps
**The Ask (Full Program)**:
> *"We recommend approval of a 180-day antifragile enterprise program, structured in four phases with hard go/no-go gates at days 30, 60, 90, and 180. The initial 30-day investment is €[X] with a defined deliverable: identification and initial closure of the organizational kill chain. If measurable risk reduction is not demonstrated by Day 30, the program stops with no further obligation."*
**The Ask (Modular Alternative)**:
> *"Alternatively, we can start with a single, fixed-scope module chosen based on your highest-priority pain. Each module is 30-60 days, fixed price, with defined deliverables and a hard stop. If the value is proven, we proceed to the next module. If not, you have still received a complete, bounded solution. See [Modular Engagements](../core/modular-engagements.md) for the module menu."*
**Immediate Next Steps**:
| Step | Owner | Timeline |
|------|-------|----------|
| Executive sponsor designation | CEO / Board | Week 0 |
| Steering committee scheduling | COO / Chief of Staff | Week 0 |
| Data room access (AD, cloud IAM, network diagrams) | CISO / IT Director | Week 0 |
| SOW execution and kickoff | Procurement / Consultant | Week 1 |
| Week 1 stakeholder interviews | Consultant | Week 1 |
| Day 30 steering committee and go/no-go | Executive Sponsor | Day 30 |
---
## Vertical-Specific Financial Adjustments
### Banking
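- **Regulatory fine exposure**: DORA penalties for financial entities are set by national competent authorities, so confirm the applicable national ceiling; 2% of global turnover is a commonly cited planning figure. Use the client's actual global turnover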
- **SWIFT CSP non-compliance**: Potential disconnection from SWIFT network; catastrophic for international payments
- **PSD2 SCA failure**: Transaction rejection rates, customer abandonment, regulator attention
- **Insurance context**: Many banks are self-insured for cyber; frame as direct balance-sheet protection
### Telco / Power (Critical Infrastructure)
- **NIS2 penalties**: For essential entities, up to €10M or 2% of global annual turnover, whichever is higher (for important entities, €7M or 1.4%)
- **Operational downtime**: Power outages measured in €/minute; telco downtime in subscriber churn
- **National security implications**: Some incidents trigger government intervention or nationalization risk
- **Supply chain**: Single vendor failure can disable critical infrastructure; optionality has direct monetary value
### Generic Enterprise
- **Ransomware**: Primary quantifiable risk; use industry averages if client-specific data unavailable
- **Business interruption**: Use revenue/day × estimated downtime
- **Reputation**: Use customer acquisition cost × estimated churn from breach notification
---
## The CFO Conversation: Key Metrics
When presenting to the CFO, lead with these metrics and no others:
1. **Expected loss without intervention** (24 months): €[X]
2. **Program cost**: €[Y]
3. **Risk reduction ROI**: [Z]%
4. **Cash payback period**: [X] days
5. **Probability of material incident**: [before]% → [after]%
Everything else is supporting detail.
---
## Template Appendix: Client-Specific Worksheets
### Worksheet 1: Revenue at Risk
```
Annual revenue: €_________
Revenue per day: €_________ (annual / 365)
Critical system downtime tolerance: _________ days
Revenue at risk from downtime: €_________ (revenue/day × tolerance)
```
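
Worksheet 1 reduces to two multiplications. The sketch below assumes revenue is spread evenly across 365 days, which is the same simplification the worksheet makes; the function name and sample values are illustrative.

```python
def revenue_at_risk(annual_revenue_eur: float, downtime_tolerance_days: float) -> float:
    """Revenue per day (annual / 365) multiplied by the downtime tolerance."""
    if annual_revenue_eur < 0 or downtime_tolerance_days < 0:
        raise ValueError("inputs must be non-negative")
    revenue_per_day = annual_revenue_eur / 365
    return revenue_per_day * downtime_tolerance_days


# Placeholder values: 36.5M annual revenue, 3 days of tolerated downtime.
at_risk = revenue_at_risk(36_500_000, 3)
```

Seasonal businesses may want to substitute peak-period daily revenue for the flat average.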
### Worksheet 2: Regulatory Fine Exposure
```
Global turnover (if applicable): €_________
Applicable regulation: [DORA / NIS2 / National / None]
Maximum fine %: _________%
Maximum fine €: €_________
Probability of fine (current): _________%
Expected fine exposure: €_________
```
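
Worksheet 2 can be sketched as follows. The optional `fixed_floor_eur` parameter is an assumption added for regimes that state the maximum as "the higher of a fixed amount or a percentage of turnover", as the NIS2 bullet above does; leave it at zero when only a percentage applies. All names and figures are illustrative.

```python
def maximum_fine(
    global_turnover_eur: float,
    max_fine_pct: float,
    fixed_floor_eur: float = 0.0,
) -> float:
    """Higher of the turnover-based ceiling and an optional fixed amount."""
    pct_based = global_turnover_eur * max_fine_pct / 100
    return max(pct_based, fixed_floor_eur)


def expected_fine_exposure(max_fine_eur: float, probability_pct: float) -> float:
    """Maximum fine weighted by the estimated probability of being fined."""
    if not 0 <= probability_pct <= 100:
        raise ValueError("probability must be between 0 and 100")
    return max_fine_eur * probability_pct / 100


# Placeholder client: 1bn turnover, 2% ceiling, 10M fixed amount, 5% probability.
ceiling = maximum_fine(1_000_000_000, 2, fixed_floor_eur=10_000_000)
exposure = expected_fine_exposure(ceiling, 5)
```

Using the maximum fine is deliberately conservative; regulators rarely impose the ceiling, so a second run with a typical fine gives a more realistic midpoint.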
### Worksheet 3: Cloud AI Cost Trajectory
```
Current monthly cloud AI spend: €_________
Projected 24-month spend: €_________
Local AI infrastructure cost: €_________
Break-even month: _________
24-month savings: €_________
Data leakage risk (narrative): [Eliminated / Reduced / Unchanged]
```
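
The break-even month in Worksheet 3 can be estimated with a simple cumulative comparison. This sketch assumes flat monthly cloud spend, a one-off local infrastructure cost, and an optional flat monthly running cost for the local option; the running-cost parameter is an addition that the worksheet does not list, and all names are hypothetical.

```python
from typing import Optional


def break_even_month(
    monthly_cloud_eur: int,
    local_upfront_eur: int,
    local_monthly_eur: int = 0,
    horizon_months: int = 24,
) -> Optional[int]:
    """First month in which cumulative cloud spend meets or exceeds local cost."""
    for month in range(1, horizon_months + 1):
        cloud_total = monthly_cloud_eur * month
        local_total = local_upfront_eur + local_monthly_eur * month
        if cloud_total >= local_total:
            return month
    return None  # no break-even within the horizon


def savings_over_horizon(
    monthly_cloud_eur: int,
    local_upfront_eur: int,
    local_monthly_eur: int = 0,
    horizon_months: int = 24,
) -> int:
    """Cloud spend minus local cost over the horizon (negative means a loss)."""
    cloud_total = monthly_cloud_eur * horizon_months
    local_total = local_upfront_eur + local_monthly_eur * horizon_months
    return cloud_total - local_total


# Placeholder figures: 10k/month cloud, 60k hardware, 2k/month to run it.
month = break_even_month(10_000, 60_000, 2_000)
savings = savings_over_horizon(10_000, 60_000, 2_000)
```

If cloud spend is expected to grow, a flat projection understates the savings, so the result can be presented as a conservative floor.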
---
*For the board conversation guide, see [C-Suite Conversation Guide](../core/c-suite-conversation-guide.md).*
*For the one-page executive summary, see [Executive Summary](../core/executive-summary.md).*


@@ -0,0 +1,297 @@
# Endpoint Management: The Antifragile Entry Vector
> *"Every client who asks you to manage their devices is actually asking you to see their blind spots. Endpoint management is the Trojan horse that gets you inside the perimeter—and from there, every other security conversation becomes natural."*
This playbook positions **endpoint management**—Microsoft Intune, Endpoint Manager, and modern device management—as the ideal entry vector for antifragile consulting engagements. It is designed for M365/Azure consultancies whose clients arrive with a specific, bounded request ("manage our devices" or "replace SCCM") and who need a structured path from that request to a comprehensive security transformation.
---
## Why Endpoint Management Is the Perfect Entry Vector
### 1. Clients Ask for It
Unlike abstract security frameworks, endpoint management solves **immediate, visible pain**:
| Client Pain | Why They Call |
|-------------|--------------|
| "We need to manage remote worker laptops" | COVID-era remote work became permanent; devices are invisible |
| "We are retiring SCCM and moving to the cloud" | On-premise management infrastructure is end-of-life or too expensive |
| "We need mobile device management for field staff" | Tablets and phones access email and customer data with no oversight |
| "Our auditor asked for proof of device compliance" | Regulatory gap; no evidence that devices meet security baselines |
| "We bought Intune licenses and never turned them on" | Common scenario: E3/E5 includes Intune but deployment stalled |
| "Users install whatever they want" | Shadow IT on endpoints; malware risk; unlicensed software |
**The insight**: Every one of these requests is a **symptom of deeper fragility**. The client sees the device problem. You see the identity, data, network, and governance problem that the device problem reveals.
### 2. It Creates Immediate Visibility
Once Intune is deployed, you can see:
- Every managed device: OS version, patch level, encryption status
- Every application installed: sanctioned and shadow
- Every configuration drift: firewall off, AV disabled, unknown admin accounts
- Every compliance failure: unencrypted disk, missing updates, jailbroken phone
This visibility is **the foundation of everything else**. You cannot harden what you cannot see. You cannot govern what you cannot inventory.
### 3. It Touches Every Security Domain
Endpoint management is not an island. It is the **intersection point** of:
| Domain | Endpoint Management Connection |
|--------|-------------------------------|
| **Identity** | Device compliance becomes a conditional access signal; non-compliant devices cannot access data |
| **Network** | VPN profiles, certificate deployment, DNS settings, Wi-Fi security |
| **Data** | DLP enforcement at the endpoint; remote wipe; encryption policies |
| **Application** | App deployment, update management, software inventory, browser policies |
| **Threat detection** | EDR onboarding (Defender for Endpoint), ASR rule deployment, vulnerability visibility |
| **AI governance** | Devices are where shadow AI usage happens; endpoint visibility reveals unsanctioned AI tools |
### 4. It Produces Visible Results Fast
In 30 days, a client can see:
- A dashboard of all their devices
- Non-compliant devices highlighted in red
- Policies pushing encryption, updates, and security baselines
- Remote workers no longer "flying blind"
This builds **trust and political capital** for the harder conversations that follow.
---
## The Trojan Horse Strategy
### The Opening Request
> *"Can you help us deploy Intune? We need to manage our laptops and phones."*
### Your Response
> *"Absolutely. And while we are deploying Intune, we will see things that need attention—accounts that should not exist, devices that are not encrypted, applications that are leaking data. We will fix the device problem in 30 days. But we will also give you a map of what we found, because the device is usually where the bigger problems show up first."*
### The Natural Expansion Path
| Endpoint Management Phase | What We Discover | What We Propose Next |
|--------------------------|-----------------|---------------------|
| **Device enrollment and inventory** | Orphaned AD accounts, devices with no owner, unknown machines on the network | Identity hygiene blitz; CMDB seeding |
| **Compliance policy deployment** | No disk encryption, outdated OS, missing patches, legacy authentication | Endpoint hardening; patch management; ASR rules |
| **Application management** | Shadow IT, unlicensed software, consumer AI apps on corporate devices | Application governance; sanctioned AI alternative (Azure OpenAI bridge) |
| **Conditional access integration** | No device-based access control; same credentials work from any device anywhere | Identity security architecture; MFA enforcement; location policies |
| **Remote worker security** | Home networks, personal printers, USB devices, split tunneling | Zero-trust architecture; DNS security; data loss prevention |
---
## The 30-60-90 Day Endpoint Management Sprint
### Phase 1: Visibility (Days 0-30)
**Objective**: Know every device. Know its state. Know its owner.
| Week | Action | Deliverable | Natural Discovery |
|------|--------|-------------|-------------------|
| 1 | Tenant readiness review: Intune licensing, roles, connectors, update rings | Readiness report | Often finds unused E5 Security licenses; orphaned Intune configs from previous attempts |
| 1 | AD/AAD device inventory: What devices exist? Which are managed? Which are not? | Device census spreadsheet | Ghost devices; stale computer accounts; devices with no owner |
| 2 | Enrollment campaign: Auto-enrollment for AAD-joined devices; manual for BYOD/COPE | Enrollment metrics (% managed) | Users with multiple unmanaged devices; non-standard hardware |
| 2 | Compliance baseline: Encryption, OS version, password policy, firewall | Compliance dashboard | Massive non-compliance: unencrypted disks, outdated Windows, disabled firewalls |
| 3 | Application inventory: Installed apps via Intune inventory or WDAC/AppLocker audit | Application report | Shadow IT goldmine: unauthorized VPNs, consumer cloud storage, AI apps, games |
| 3 | Policy deployment (audit mode): Push basic policies without enforcement to measure impact | Policy readiness report | Devices that will break; apps that will be blocked; users who will be affected |
| 4 | Enforcement (gradual): Enable policies in waves; prioritize highest-risk users | Enforcement wave report | Executive devices that were never managed; admin machines with no PAW |
**The Phase 1 conversation**:
> *"We now manage 85% of your devices. Twenty-three devices are unencrypted. Fourteen are running Windows versions that no longer receive security updates. Seven users have installed consumer AI tools that send data to third-party clouds. We fixed the device management request. Here is what we found—and here is what we should fix next."*
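
The headline numbers in that conversation come from a device census. A minimal sketch of the summary step is shown below, assuming the census has already been exported into simple records; the `Device` fields and `census_summary` function are hypothetical and are not an Intune API.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Union


@dataclass(frozen=True)
class Device:
    name: str
    managed: bool
    encrypted: bool
    os_supported: bool  # OS version still receives security updates


def census_summary(devices: Sequence[Device]) -> Dict[str, Union[int, float]]:
    """Headline figures for the Phase 1 steering committee."""
    total = len(devices)
    managed = [d for d in devices if d.managed]
    return {
        "managed_pct": round(100 * len(managed) / total, 1) if total else 0.0,
        "unencrypted": sum(1 for d in managed if not d.encrypted),
        "unsupported_os": sum(1 for d in managed if not d.os_supported),
    }


# Placeholder census rows.
census = [
    Device("LT-001", managed=True, encrypted=True, os_supported=True),
    Device("LT-002", managed=True, encrypted=False, os_supported=True),
    Device("LT-003", managed=True, encrypted=True, os_supported=False),
    Device("LT-004", managed=False, encrypted=False, os_supported=False),
]
summary = census_summary(census)
```

Counting encryption and OS gaps only on managed devices keeps the figures defensible, because the state of unmanaged devices is unknown rather than non-compliant.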
---
### Phase 2: Control (Days 30-60)
**Objective**: Ensure every managed device meets the security baseline. Eliminate the highest-risk gaps.
| Week | Action | Deliverable |
|------|--------|-------------|
| 5 | Encryption enforcement: BitLocker (Windows), FileVault (macOS) | Encryption coverage: 100% of managed devices |
| 5 | Update rings: Deploy Windows Update for Business; test and production rings | Patch compliance report |
| 6 | Application control: Block known-bad categories; require approved app installation | Application control policy deployed |
| 6 | Browser hardening: Edge/Chrome policies, extension management, safe browsing | Browser security baseline |
| 7 | Conditional access integration: Device compliance as access signal | CA policies: compliant device required for M365 access |
| 7 | Admin device hardening: PAW enrollment, dedicated admin profiles, restricted browsing | Admin device compliance: 100% |
| 8 | Mobile device hardening: iOS/Android app protection policies, jailbreak detection | Mobile compliance report |
| 8 | DNS and network: Deploy secure DNS (DoH/DoT) via Intune profile | Network security baseline |
**The Phase 2 conversation**:
> *"Your devices are now encrypted, patched, and compliant. Only managed, healthy devices can access your email and documents. But device compliance is the only conditional access control in place: there is no MFA enforcement or sign-in risk policy yet, so a stolen password used from a compliant device still works. That is the next bridge to cross."*
---
### Phase 3: Sovereignty and Expansion (Days 60-90)
**Objective**: Use endpoint visibility to drive broader security transformation.
| Week | Action | Deliverable |
|------|--------|-------------|
| 9 | Shadow AI discovery: Review application inventory for AI/ML tools; proxy log correlation | Shadow AI report |
| 9 | Sanctioned AI deployment: Azure OpenAI bridge or local AI alternative for approved use | AI governance pilot |
| 10 | EDR deployment: Defender for Endpoint (if E5) or Wazuh/Sysmon augmentation (if E3) | EDR coverage report |
| 10 | Vulnerability management: Integrate Intune compliance data with vulnerability prioritization | Risk-based patch prioritization |
| 11 | Data loss prevention: Endpoint DLP policies (if Purview licensed) or manual controls | DLP baseline |
| 11 | Recovery validation: Test remote wipe, device replacement workflow, backup of device config | Recovery procedure tested |
| 12 | Governance handover: Client team trained on Intune operations; runbooks documented; monitoring automated | Operational handover complete |
**The Phase 3 conversation**:
> *"Your endpoint estate is now managed, hardened, and visible. From here, the natural next steps are identity hardening—because devices are only as strong as the accounts that access them—and AI sovereignty—because we found consumer AI tools on twelve corporate devices that are sending your data to third parties. We can fix both in the next 90 days."*
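
The week 9 shadow AI review amounts to matching the application inventory against a watchlist. The sketch below assumes the inventory has been exported as a mapping of device name to installed application names; the function name, the data shape, and the application names are placeholders, and a real review would also need the proxy log correlation mentioned in the table.

```python
from typing import Dict, List, Set


def find_shadow_ai(
    inventory: Dict[str, List[str]],
    watchlist: Set[str],
) -> Dict[str, List[str]]:
    """Return devices that have at least one watchlisted application installed."""
    lowered = {name.lower() for name in watchlist}
    hits: Dict[str, List[str]] = {}
    for device, apps in inventory.items():
        matched = sorted(app for app in apps if app.lower() in lowered)
        if matched:
            hits[device] = matched
    return hits


# Placeholder names; substitute the client's own watchlist.
inventory = {
    "LT-001": ["Office", "ExampleChat Desktop"],
    "LT-002": ["Office"],
    "LT-003": ["examplechat desktop", "ExampleTranscriber"],
}
watchlist = {"ExampleChat Desktop", "ExampleTranscriber"}
shadow_ai = find_shadow_ai(inventory, watchlist)
```

Exact name matching misses browser-based tools and renamed binaries, which is why the inventory pass is a starting point and not a complete discovery method.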
---
## Client Archetypes and Approach
### Archetype 1: The SCCM Retiree
**Profile**: Mature on-premises environment; SCCM administering thousands of devices; management wants cloud-native management.
**Entry conversation**:
> *"SCCM has served you well, but it requires infrastructure, VPN connectivity, and on-premises presence. Intune manages devices wherever they are—home, hotel, airport—without VPN. We can run SCCM and Intune in parallel during migration, then retire SCCM once coverage is proven. During the migration, we will also modernize your security baselines because they have likely not been updated since the SCCM deployment began."*
**Key considerations**:
- Co-management (SCCM + Intune) as a transitional state
- Task sequence migration to Intune proactive remediations and PowerShell scripts
- Windows Update for Business replacing WSUS
- Driver and firmware update strategy (Intune is weaker here; plan for Windows Update for Business or third-party tools)
### Archetype 2: The Remote-First Convert
**Profile**: Post-COVID organization; devices scattered globally; no visibility into home office security.
**Entry conversation**:
> *"Your devices are in forty home offices, three countries, and an unknown number of coffee shops. You currently have no visibility into whether they are encrypted, patched, or compromised. Intune gives you that visibility in two weeks. From there, we can enforce compliance so that only healthy devices access company data—regardless of where the device is physically located."*
**Key considerations**:
- BYOD vs. corporate-owned: define the boundary clearly
- Privacy regulations: employee monitoring on personal devices requires legal review
- Network security: home Wi-Fi is untrusted; DNS security and VPN policies critical
- Licensing: Intune is included in Microsoft 365 E3 (not Office 365 E3); no additional purchase required for basic MDM
### Archetype 3: The Compliance-Driven Client
**Profile**: Regulated industry (banking, healthcare, critical infrastructure); auditor found device management gaps; needs evidence.
**Entry conversation**:
> *"Your auditor wants proof that every device accessing customer data is encrypted, patched, and compliant. Intune does not just achieve compliance—it generates the evidence automatically. Every device reports its state. Every policy violation is logged. Every remediation is tracked. When the auditor returns, you show them a dashboard, not a prayer."*
**Key considerations**:
- Evidence retention: compliance reports must be retained for auditor review
- Segregation: regulated devices may need separate compliance policies
- Documentation: every policy must have a business justification for auditor review
### Archetype 4: The Intune License Hoarder
**Profile**: Bought E3/E5 years ago; Intune was never deployed; licenses are "shelfware."
**Entry conversation**:
> *"You are already paying for Intune. It is included in your E3 licenses. Deploying it costs nothing beyond our time—and it will reveal whether you are getting value from the rest of your Microsoft investment. We often find that organizations with unused Intune also have unused MFA, unused conditional access, and unused Defender features. Intune is the first domino."*
**Key considerations**:
- Zero incremental licensing cost is a powerful argument
- Often reveals other underutilized E3/E5 capabilities
- Fastest path to visible ROI
---
## E3 vs. E5 Endpoint Management
| Capability | E3 Inclusion | E5 Addition | Practical Impact |
|-----------|-------------|-------------|------------------|
| Intune MDM/MAM | Yes | Yes | Full device and app management |
| Windows Update for Business | Yes | Yes | Cloud-native patching |
| BitLocker management | Yes | Yes | Encryption deployment and key escrow |
| Defender Antivirus | Yes | Yes | Basic AV configuration via Intune |
| **Defender for Endpoint (EDR)** | **No** (Plan 1 only, without EDR) | **Yes** (Plan 2) | Behavioral detection, threat hunting, automated investigation |
| **Advanced compliance policies** | **Basic** | **Enhanced** | Risk-based conditional access integration |
| **Endpoint DLP** | **No** | **Yes** (Purview) | Data loss prevention at the endpoint |
| **Attack Surface Reduction (ASR)** | **Rules only** (limited reporting) | **Yes** (full reporting and hunting) | Exploit protection, controlled folder access |
**The E3 approach**:
- Intune for configuration, compliance, and application management
- Sysmon + Wazuh for EDR-like visibility
- Manual vulnerability prioritization
- LAPS for local admin password management
**The E5 approach**:
- Everything in E3, plus Defender for Endpoint full EDR
- ASR rules deployed via Intune
- Automated investigation and remediation
- Endpoint DLP for data governance
- Threat analytics and vulnerability management integration
---
## Converting Endpoint Management Into Antifragile Engagement
### The 30-Day Pivot
At the 30-day steering committee, present:
1. **Device management results**: enrollment %, compliance %, encryption %
2. **Discovery findings**: the top 5 security gaps revealed by device visibility
3. **The expansion proposal**: 60-90 day roadmap to address those gaps
**Example pivot**:
> *"We enrolled 340 devices and achieved 94% compliance. During enrollment, we discovered 12 devices with consumer AI tools sending data to third-party clouds, 8 accounts with standing global admin rights, and no conditional access policies at all. The device problem is solved. We now propose a 60-day identity and access hardening sprint to close the gaps we found."*
### The Natural Service Ladder
```
Month 1: Endpoint Management (Intune deployment)
↓ Discovery of identity, app, and data gaps
Month 2-3: Identity Hardening (MFA, conditional access, PIM)
↓ Discovery of shadow AI and data leakage
Month 4-6: AI Sovereignty (Azure OpenAI bridge, local AI pilot)
↓ Discovery of architectural fragility
Month 7-12: Antifragile Architecture (exit architectures, chaos engineering, red team)
```
---
## Talking Points for Executives
### For the CEO
> *"Your employees are working from home offices, airports, and coffee shops on devices you cannot see. Intune gives you visibility in two weeks and control in four. It is not surveillance—it is ensuring that the device accessing your strategy documents is encrypted, patched, and owned by your company, not a contractor with a personal laptop."*
### For the CFO
> *"You already own Intune. It is included in your E3 licenses. We are not selling you software. We are extracting value you have already paid for. The average organization with E3 uses less than 40% of included security capabilities. Intune is the fastest way to prove ROI on existing licensing."*
### For the CISO
> *"Intune is not just device management. It is the enforcement point for every other security control. Your conditional access policies are useless if they cannot evaluate device health. Your DLP policies are toothless if they do not apply to endpoints. Your identity security is theoretical if stolen credentials work from any unmanaged device. Intune makes the rest of your security stack actually work."*
### For the IT Director
> *"We know SCCM has been reliable. But it requires VPN, on-premises infrastructure, and manual touch. Intune automates what SCCM does and adds capabilities SCCM cannot: mobile device management, application protection on personal devices, and cloud-native patching without VPN. We run them in parallel, migrate gradually, and retire SCCM only when you are confident."*
---
## Integration With Existing Frameworks
| Framework Document | Integration Point |
|-------------------|-------------------|
| [M365 E3 Hardening](m365-e3-hardening.md) | Intune is the primary E3 endpoint management tool; this document extends it with entry-vector strategy |
| [M365 Antifragile Project](m365-antifragile-project.md) | Endpoint management is a core workstream in both greenfield and modernisation projects |
| [Rapid Modernisation Plan](rapid-modernisation-plan.md) | Phase 1 (Hygiene) device visibility maps directly to endpoint management deployment |
| [Zero-Budget Hardening](zero-budget-hardening.md) | Intune is free in E3; Sysmon/Wazuh augment E3 endpoint security without new purchases |
| [Azure OpenAI Sovereignty Bridge](../core/azure-openai-sovereignty-bridge.md) | Device application inventory reveals shadow AI; Intune becomes the enforcement point for sanctioned AI |
| [AI Operations Inevitability](../core/ai-operations-inevitability.md) | Endpoints are where defensive AI agents run; managed endpoints are prerequisite for AI-driven endpoint security |
---
*For the M365 E3 hardening specifics, see [M365 E3 Hardening](m365-e3-hardening.md).*
*For the rapid modernisation plan, see [Rapid Modernisation Plan](rapid-modernisation-plan.md).*
*For the M365 antifragile project playbook, see [M365 Antifragile Project](m365-antifragile-project.md).*


@@ -0,0 +1,403 @@
# Implementation Playbook
> *"This is not an upgrade. It is an insurance policy against the obsolescence of your own company."*
This playbook provides tactical, step-by-step guidance for delivering the [Rapid Modernisation Plan](rapid-modernisation-plan.md) in a client environment. It is organized by workstream and intended for hands-on consultants, security architects, and technical leads.
---
## Table of Contents
1. [Engagement Kickoff](#engagement-kickoff)
2. [Workstream: Identity and Access](#workstream-identity-and-access)
3. [Workstream: Perimeter and Visibility](#workstream-perimeter-and-visibility)
4. [Workstream: AI Sovereignty](#workstream-ai-sovereignty)
5. [Workstream: Resilience and Recovery](#workstream-resilience-and-recovery)
6. [Workstream: Culture and Governance](#workstream-culture-and-governance)
7. [Common Failure Modes](#common-failure-modes)
8. [Tools and Templates](#tools-and-templates)
---
## Engagement Kickoff
### Pre-Engagement Checklist
Before arriving on-site or starting the remote engagement:
- [ ] Client has signed SOW with explicit scope, authority, and escalation paths
- [ ] Key stakeholders identified: CISO, CIO, legal, business unit sponsors
- [ ] Initial data room access granted: AD exports, cloud IAM, network diagrams, CMDB if exists
- [ ] Emergency contact list established with authority to disable accounts / block access
- [ ] Backup verification: confirm backups exist and have been tested within last 90 days
- [ ] "Get out of jail free" letter: written executive authorization for disruptive security actions
### Day 0: Stakeholder Interviews
Interview each stakeholder for 30 minutes. Ask the same five questions:
1. What is the shortest path to a business-ending incident here?
2. What are you most worried about that you are not telling the board?
3. What is the one system whose failure would stop revenue for 24 hours?
4. Where is your proprietary data going that you cannot fully track?
5. If you had to replace your primary cloud vendor in 90 days, could you?
Document answers. Look for contradictions between stakeholders—these reveal hidden dependencies.
### Day 0: Establish the War Room
- Physical or virtual space for daily standups
- Shared dashboard: tasks, blockers, risks
- Direct escalation path to executive sponsor
- Decision log: every major decision recorded with rationale and owner
---
## Workstream: Identity and Access
### Objective
Eliminate unknown identities, reduce privilege, and establish just-in-time access before attackers exploit standing permissions.
### Week 1: Identity Census
**Step 1: Export all identities**
- Active Directory: all users, groups, computers, service accounts
- Cloud IAM: AWS IAM, Azure AD / Entra ID, GCP IAM
- SaaS platforms with local identity stores
- Non-human identities: API keys, service principals, OAuth apps, managed identities
**Step 2: Deduplicate and correlate**
- Match cloud identities to on-premises identities
- Identify orphaned accounts: no owner, no recent use, no documented purpose
- Identify over-privileged accounts: admin rights without justification
**Step 3: Categorize by risk**
| Category | Action | Timeline |
|----------|--------|----------|
| Orphaned, unused > 90 days | Disable immediately | Day 1-2 |
| Shared accounts | Target for elimination or vaulting | Week 1-2 |
| Admin / privileged | Force password rotation + MFA enforcement | Day 3-5 |
| Service accounts with interactive logon | Review and restrict | Week 1-2 |
| External / vendor access | Audit and time-bound | Week 1-2 |
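The triage table above can be sketched as a small rule function. This is a minimal illustration, not a finished tool: the `Identity` fields, the action strings, and the order in which rules are checked are assumptions made for the example; only the categories and the 90-day threshold come from the table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=90)  # threshold taken from the table above


@dataclass
class Identity:
    """Hypothetical record shape for one row of the identity census."""
    name: str
    owner: Optional[str]            # None means no documented owner
    last_used: Optional[datetime]   # None means never used
    is_shared: bool = False
    is_privileged: bool = False
    is_service: bool = False
    interactive_logon: bool = False
    is_external: bool = False


def triage(identity: Identity, now: datetime) -> str:
    """Return the first matching action, checked in an assumed order of urgency."""
    unused = identity.last_used is None or now - identity.last_used > STALE_AFTER
    if identity.owner is None and unused:
        return "disable"
    if identity.is_privileged:
        return "rotate-password-and-enforce-mfa"
    if identity.is_shared:
        return "eliminate-or-vault"
    if identity.is_service and identity.interactive_logon:
        return "review-and-restrict"
    if identity.is_external:
        return "audit-and-time-bound"
    return "no-action"
```

Running `triage` over the deduplicated census gives a first-pass work queue; every "disable" result should still be confirmed with the client before action.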
### Week 2: Privilege Reduction
**Step 1: Implement Privileged Access Workstations (PAWs)**
- Dedicated machines for admin tasks
- No internet browsing, no email, no non-admin applications
- Physical or strongly virtualized separation
**Step 2: Deploy Just-in-Time (JIT) elevation where possible**
- Microsoft Entra PIM (formerly Azure AD PIM), AWS IAM Identity Center, or third-party PAM
- Maximum elevation duration: 4 hours
- Require approval for elevation to privileged roles; remove standing admin assignments
**Step 3: Password hygiene enforcement**
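- Minimum 14 characters, no composition (complexity) rules; NIST SP 800-63B advises against mandatory complexity, and the 14-character floor is a stricter engagement default than its baseline
- Audit against known-breached password lists
- Eliminate password rotation mandates unless compromise is suspected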
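As a rough sketch of the password rules above: a length check plus a lookup against a locally held set of breached-password hashes. The function names and the use of uppercase SHA-1 hex digests as the lookup format are assumptions for illustration; the example deliberately performs no complexity check.

```python
import hashlib

MIN_LENGTH = 14  # minimum length from the policy above


def sha1_hex(password: str) -> str:
    """Uppercase SHA-1 hex digest, an assumed format for the breached-hash set."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def check_password(password: str, breached_sha1: set) -> list:
    """Return a list of policy violations; an empty list means the password passes."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append(f"shorter than {MIN_LENGTH} characters")
    if sha1_hex(password) in breached_sha1:
        problems.append("appears in breached-password list")
    return problems
```

In practice the breached set would be loaded from an offline corpus agreed with the client, so candidate passwords never leave the environment.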
### Week 3-4: MFA and Conditional Access
- Enforce MFA on all remote access: VPN, cloud admin, RDP gateways
- Implement risk-based conditional access:
- Unmanaged device → require MFA + compliant device
- Impossible travel → block or step-up
- Legacy authentication → block entirely
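The three conditions above can be illustrated with a toy decision function. This is a sketch of the reasoning only, not how any identity provider evaluates policy: the `SignIn` shape, the 900 km/h plausibility threshold, and the returned action strings are all assumptions.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MAX_PLAUSIBLE_KMH = 900.0  # assumed threshold, roughly airliner cruising speed


@dataclass
class SignIn:
    when: datetime
    lat: float
    lon: float
    managed_device: bool = True
    legacy_auth: bool = False


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def impossible_travel(prev: SignIn, cur: SignIn) -> bool:
    """True when the implied speed between two sign-ins exceeds the threshold."""
    hours = (cur.when - prev.when).total_seconds() / 3600
    if hours <= 0:
        return False  # out-of-order events are not judged here
    return haversine_km(prev.lat, prev.lon, cur.lat, cur.lon) / hours > MAX_PLAUSIBLE_KMH


def decide(cur: SignIn, prev: Optional[SignIn] = None) -> str:
    """Apply the three rules in order: legacy auth, impossible travel, device state."""
    if cur.legacy_auth:
        return "block"
    if prev is not None and impossible_travel(prev, cur):
        return "step-up"
    if not cur.managed_device:
        return "require-mfa-and-compliant-device"
    return "allow"
```

The value of writing the rules out this way is that edge cases (VPN exit nodes, shared NAT addresses) can be discussed with the client before the real policy is switched from report-only to enforced.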
### Common Pitfalls
- **Over-scoping**: Do not attempt to fix every identity in 30 days. Focus on privileged and external first.
- **Breaking automation**: Service account password rotations can break CI/CD. Coordinate with application owners. Test in non-production first.
- **Shadow IT identities**: SaaS platforms with standalone accounts (Slack, Zoom, etc.) are often missed. Use email domain scanning or CASB tools.
---
## Workstream: Perimeter and Visibility
### Objective
Know exactly what the organization looks like from the outside, and monitor every path that crosses the trust boundary.
### Week 1-2: External Attack Surface Mapping
**Step 1: Reconnaissance (passive first)**
- Enumerate subdomains: certificate transparency logs and search engine dorks (passive); DNS brute force (active)
- Identify exposed services: Shodan and Censys (passive); custom port scanning from external vantage points (active, requires written authorization)
- Map cloud assets: public S3 buckets, open storage accounts, exposed databases
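For the certificate transparency step, a small parser can turn a saved search export into a deduplicated host list. The sketch below assumes a JSON array of records with a newline-separated `name_value` field, as crt.sh-style exports commonly provide; treat that layout and the function name as assumptions and adjust to the actual export.

```python
import json


def subdomains_from_ct(ct_json: str, apex: str) -> list:
    """Collect unique hostnames under `apex` from CT-log search results."""
    apex = apex.lower().strip(".")
    found = set()
    for record in json.loads(ct_json):
        # name_value may hold several newline-separated names (assumed layout)
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # wildcard entries still reveal the parent zone
            if name == apex or name.endswith("." + apex):
                found.add(name)
    return sorted(found)
```

The suffix check includes the leading dot so that lookalike domains such as `evil-example.com` are not mistaken for client assets.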
**Step 2: Active validation**
- Confirm ownership of discovered assets with client
- Test for default credentials on exposed management interfaces
- Document findings with risk ratings: P0 (immediate), P1 (urgent), P2 (planned)
### Week 2-3: Internal Visibility
**Step 1: Deploy endpoint detection**
- Microsoft Defender for Endpoint, CrowdStrike, SentinelOne, or equivalent
- Target: 100% of managed Windows, macOS, Linux endpoints
- Validate: can you see process execution, network connections, and file modifications?
**Step 2: Network monitoring**
- Deploy sensors at:
- Internet boundary
- Internal network segments (especially IT/OT boundaries)
- Critical server VLANs
- Enable DNS query logging and analysis
**Step 3: Log aggregation**
- Centralize logs from: identity systems, endpoints, firewalls, cloud control planes, critical applications
- Minimum retention: 90 days hot, 1 year cold
- Ensure tamper protection: attackers delete logs
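One way to reason about tamper protection is a hash chain, where each log entry commits to the one before it. The sketch below is an illustration of that idea only, with assumed names; it does not describe how any particular SIEM stores data.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain: list, event: dict) -> None:
    """Add an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "hash": _digest(prev_hash, event)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or removed interior entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits and deletions in the middle of the log, but not truncation of the tail; that requires periodically copying the latest hash to a system the attacker cannot reach, such as immutable storage.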
### Week 3-4: CMDB Seeding
- Populate CMDB with T0 and T1 assets first
- For each asset: owner, criticality, dependencies, recovery requirements
- Accept imperfection. A partially correct CMDB is infinitely better than no CMDB.
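A seed CMDB can be as small as a list of records with the four attributes named above. The following sketch assumes a hypothetical `Asset` shape and treats an empty dependency list as acceptable, since some assets genuinely have none.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical seed record; field names are illustrative."""
    name: str
    tier: str                       # "T0" or "T1" for the initial seed
    owner: str = ""
    criticality: str = ""
    dependencies: list = field(default_factory=list)
    recovery_requirements: str = ""


def missing_fields(asset: Asset) -> list:
    """List the required attributes that are still empty for this asset."""
    required = ("owner", "criticality", "recovery_requirements")
    return [name for name in required if not getattr(asset, name)]


def seed_order(assets: list) -> list:
    """Sort so that T0 assets are populated before T1, then by name."""
    return sorted(assets, key=lambda a: (a.tier, a.name))
```

Reporting `missing_fields` per asset makes the imperfection visible and trackable instead of blocking the seeding effort.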
### Common Pitfalls
- **Scanning without authorization**: Ensure written approval for active scanning. Some jurisdictions treat unauthorized scanning as criminal.
- **Alert fatigue**: Do not enable every detection rule on day one. Start with high-confidence, high-impact alerts. Tune before expanding.
- **Log storage costs**: Centralized logging is expensive. Prioritize critical systems. Use tiered storage.
---
## Workstream: AI Sovereignty
### Objective
Convert intelligence from a rented commodity into an owned, protected, T0-class asset.
### Week 1-2: AI Usage Discovery
**Step 1: Survey**
- Interview department heads: engineering, legal, marketing, operations, finance
- Ask: "What AI tools are you using? What data are you putting into them?"
- Expect 30-50% shadow usage. Employees use personal ChatGPT accounts, browser extensions, and mobile apps.
**Step 2: Technical discovery**
- Review proxy logs for AI API traffic: OpenAI, Anthropic, Google, Azure OpenAI
- Review SaaS billing for AI-enabled tools
- Review browser extensions and endpoint software inventories
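The proxy log review can be approximated with a short script. This sketch assumes each log line has been reduced to `user host` and uses a small, non-exhaustive list of AI API hostnames; both the line format and the list are assumptions to adapt per client.

```python
from collections import Counter

# Illustrative, non-exhaustive hostnames for AI APIs; extend per client
AI_HOSTS = (
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "openai.azure.com",
)


def is_ai_host(host: str) -> bool:
    """Match a listed hostname exactly or as a parent domain of the given host."""
    host = host.lower().rstrip(".")
    return any(host == known or host.endswith("." + known) for known in AI_HOSTS)


def ai_usage_by_user(log_lines) -> Counter:
    """Count AI API requests per user from 'user host' lines (assumed format)."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2 and is_ai_host(parts[1]):
            counts[parts[0]] += 1
    return counts
```

The per-user counts are a conversation starter for the survey interviews, not evidence of wrongdoing; browser-based chat usage will need separate hostnames.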
**Step 3: Data flow mapping**
For each discovered AI tool, document:
- Data types entering the tool
- Data residency and processing location
- Vendor terms: training use, retention, deletion, subprocessing
- Regulatory implications: GDPR, DORA, NIS2, industry-specific
### Week 3-4: Local AI Infrastructure
**Step 1: Select hardware or sovereign cloud**
| Option | When to Use |
|--------|-------------|
| On-premise GPU servers | High volume, strict air-gap, existing data centre capacity |
| Sovereign cloud (EU, national) | Regulatory requirements, no on-premises GPU expertise |
| Edge inference nodes | Distributed organization, OT environments, low-latency requirements |
**Step 2: Select initial model**
For most organizations, start with:
- **Base model**: Llama 3, Mistral, or Qwen (7B-13B parameters, quantized to 4-bit)
- **Deployment**: Ollama, vLLM, or llama.cpp for inference
- **Orchestration**: LangChain or custom RAG pipeline for proprietary data integration
- **Fine-tuning**: QLoRA for domain adaptation on proprietary datasets
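To size hardware for the model range above, a back-of-the-envelope estimate of weight memory is parameters times bits per weight, divided by eight. The helper below is a rough sketch under that assumption; it ignores quantization metadata, KV cache, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for size in (7, 13):
    print(f"{size}B at 4-bit: ~{weight_memory_gb(size, 4):.1f} GB of weights")
```

By this estimate a 7B model at 4-bit needs about 3.5 GB for weights and a 13B model about 6.5 GB, which is why this size class tends to fit on a single workstation-class GPU for a pilot.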
**Step 3: Deploy with T0 controls**
- Network segmentation: inference hosts have no direct internet egress
- Access control: model weights encrypted at rest; access requires multi-party approval
- Audit: log all prompts, responses, and model access
- Backup: immutable backups of weights, configurations, and vector databases
### Week 5-8: Pilot and Measure
Select one high-value, low-risk workflow:
| Workflow | Why It Works |
|----------|-------------|
| Internal code review assistant | Proprietary code never leaves perimeter; measurable quality improvement |
| Security log analysis | Feeds defensive AI directly; reduces analyst workload |
| Policy / compliance document drafting | High volume, repetitive, proprietary domain knowledge |
| Customer support triage | Reduces response time; training data is historical tickets |
**Measurement criteria**:
- Accuracy vs. cloud baseline (human-evaluated on a sample)
- Cost per inference (compute + personnel)
- Data leakage incidents: zero
- User satisfaction: qualitative survey
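The quantitative criteria above reduce to simple arithmetic. The sketch below shows one possible way to compute them; the function names and the 0.05 accuracy tolerance against the cloud baseline are assumptions that should be replaced by the success criteria agreed before the pilot starts.

```python
def cost_per_inference(compute_cost: float, personnel_cost: float, inferences: int) -> float:
    """Total cost (compute + personnel) divided by the number of inferences served."""
    if inferences <= 0:
        raise ValueError("inferences must be positive")
    return (compute_cost + personnel_cost) / inferences


def accuracy(ratings: list) -> float:
    """Share of human-evaluated samples judged acceptable (True)."""
    if not ratings:
        raise ValueError("no ratings supplied")
    return sum(ratings) / len(ratings)


def pilot_passes(local_acc: float, cloud_acc: float, leakage_incidents: int,
                 tolerance: float = 0.05) -> bool:
    """Pass when leakage is zero and local accuracy is within tolerance of cloud."""
    return leakage_incidents == 0 and local_acc >= cloud_acc - tolerance
```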
### Common Pitfalls
- **Over-engineering the first deployment**: Do not build a full MLOps platform for the pilot. Start simple. Prove value. Then scale.
- **Ignoring GPU availability**: GPU procurement can take months. Have a cloud fallback for the pilot if on-premises hardware is delayed.
- **Neglecting prompt injection**: Local models are not immune to adversarial prompts. Implement input validation and output filtering.
- **Forgetting the human loop**: AI augments decisions; it does not replace accountability. Design workflows where humans retain final authority.
---
## Workstream: Resilience and Recovery
### Objective
Ensure that when—not if—a critical system fails, recovery is fast, tested, and deterministic.
### Week 1-4: Backup Validation
**Step 1: Inventory backup coverage**
- For every T0 and T1 asset: what is backed up, how often, where, by what mechanism
- Identify gaps: databases without point-in-time recovery, VMs without application-consistent snapshots
**Step 2: Test restoration**
- Select one critical system per week
- Perform full restoration to isolated environment
- Document: time to restore, data loss window, manual steps required, blockers encountered
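Each restoration test can be recorded in a structure that derives the two headline numbers automatically. This is a minimal sketch: the `RestoreTest` fields are hypothetical, and comparing against recovery time and recovery point objectives (RTO/RPO) is a common convention assumed here, not something the client necessarily has defined yet.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestoreTest:
    system: str
    failure_at: datetime       # simulated moment of loss
    last_backup_at: datetime   # newest recoverable backup before the failure
    restored_at: datetime      # moment the system was validated as usable
    manual_steps: int = 0

    @property
    def time_to_restore(self) -> timedelta:
        return self.restored_at - self.failure_at

    @property
    def data_loss_window(self) -> timedelta:
        return self.failure_at - self.last_backup_at


def meets_targets(test: RestoreTest, rto: timedelta, rpo: timedelta) -> bool:
    """True when both the restore time and the data loss window are within target."""
    return test.time_to_restore <= rto and test.data_loss_window <= rpo
```

Note that `restored_at` should mark the point of validated usability, not the moment the restore job finished.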
**Step 3: Fix what breaks**
- If a backup cannot be restored, the backup does not exist
- Update procedures, fix tooling, re-test
### Month 2-3: Recovery Automation
- Automate the most common recovery scenarios: VM restore, database point-in-time recovery, Active Directory forest recovery
- Document runbooks for scenarios that cannot be fully automated
- Train multiple team members on each runbook
### Month 3-6: Chaos Engineering
**Step 1: Game days**
- Scheduled, announced simulations of failure scenarios
- Example: simulate domain controller failure during business hours
- Measure: detection time, escalation time, resolution time, communication quality
**Step 2: Chaos experiments**
- Unannounced, bounded experiments in non-production
- Example: terminate API service instances, block DNS resolution, fill disk space
- Validate: auto-scaling, alerting, runbook accuracy
**Step 3: Production chaos**
- Only after months of successful game days and non-production experiments
- Start with low-risk failures: single instance termination, network latency injection
- Always have automated rollback and a human kill switch
### Common Pitfalls
- **Assuming backups work**: Untested backups are prayers, not plans.
- **Recovery without validation**: A restored system that cannot authenticate users or connect to databases is not recovered.
- **Chaos without guardrails**: Never run chaos experiments when the organization is already under stress (active incident, change freeze, key personnel on leave).
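The guardrails described in this workstream (no experiments under organizational stress, a human kill switch, guaranteed rollback) can be sketched as a thin wrapper around any experiment. The names below are hypothetical and the callables stand in for whatever tooling performs the actual fault injection.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OrgState:
    active_incident: bool = False
    change_freeze: bool = False
    key_personnel_on_leave: bool = False


def blockers(state: OrgState) -> list:
    """Names of the stress conditions that are currently true."""
    return [name for name, value in vars(state).items() if value]


def run_experiment(state: OrgState,
                   inject: Callable[[], None],
                   rollback: Callable[[], None],
                   kill_switch: Callable[[], bool]) -> str:
    """Refuse under stress, honour the kill switch, and always roll back."""
    reasons = blockers(state)
    if reasons:
        return "refused: " + ", ".join(reasons)
    if kill_switch():
        return "aborted: kill switch engaged"
    try:
        inject()
        return "completed"
    finally:
        rollback()  # runs even when the injection raises
```

The pre-flight refusal is the important part: it turns the pitfall above from a matter of judgment on the day into a check that has to be consciously overridden.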
---
## Workstream: Culture and Governance
### Objective
Embed antifragile principles into decision-making, hiring, and organizational habits.
### Tactics
**Blameless Post-Mortems**
- Within 48 hours of significant incident
- Focus: what about the system allowed this mistake? Not: who made the mistake?
- Mandatory output: at least one structural change (policy, architecture, or procedure)
- Publish internally: transparency builds trust and disseminates learning
**Security Champions Program**
- Identify one volunteer per team who acts as security liaison
- Monthly 1-hour meeting: new threats, policy changes, team-specific concerns
- Champions feed team context up and security guidance down
**Red Team as a Service**
- Monthly or quarterly adversarial simulations
- Report to CISO and board, not just IT
- Measure: time to detect, time to contain, time to evict
- Trend over time: the organization should get faster, not just more compliant
**Antifragile Metrics Review**
- Monthly steering committee reviews:
- Mean time to structural fix (from incident)
- Number of chaos experiments run and lessons learned
- % of vendor dependencies with documented exit plan
- AI sovereignty maturity score
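Two of the steering committee metrics above are straightforward to compute from an incident log and a vendor list. The sketch below assumes simple input shapes (date pairs and a name-to-boolean mapping) chosen for illustration.

```python
from datetime import datetime
from statistics import mean


def mean_days_to_structural_fix(incidents):
    """incidents: (incident_date, fix_date) pairs; open items (fix_date None) are skipped."""
    spans = [(fixed - opened).days for opened, fixed in incidents if fixed is not None]
    return mean(spans) if spans else None


def exit_plan_coverage(vendors: dict) -> float:
    """Percentage of vendor dependencies with a documented exit plan."""
    if not vendors:
        return 0.0
    return 100.0 * sum(vendors.values()) / len(vendors)
```

Skipping open items flatters the mean, so the count of incidents still awaiting a structural fix should be reported alongside it.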
### Common Pitfalls
- **Post-mortems without action**: If findings are not tracked to completion, they become theater.
- **Security champions without authority**: Champions need time allocation and executive backing, or they become scapegoats.
- **Metrics without narrative**: Numbers alone do not persuade boards. Pair metrics with stories: "Here is what we learned, here is what we changed, here is why we are safer."
---
## Common Failure Modes
| Failure Mode | Symptom | Remedy |
|-------------|---------|--------|
| **Scope creep** | 30-day phase stretches to 90 days | Time-box ruthlessly. Document deferred items for next phase. |
| **Tool obsession** | Team debates SIEM vendor for 3 weeks | Pick the good-enough tool. Implementation beats selection. |
| **Perfectionism** | CMDB project stalls waiting for completeness | Seed with critical assets. Expand iteratively. |
| **Vendor capture** | Recommendations always favor one provider | Disclose relationships. Maintain independence. Document alternatives. |
| **Executive fatigue** | Board stops attending updates | Lead with business risk, not technical detail. Show cost of inaction. |
| **Operational resistance** | IT refuses to disable legacy accounts | Use the "get out of jail free" letter. Escalate to executive sponsor. |
| **Pilot purgatory** | Local AI pilot runs forever without production migration | Define hard success criteria and production migration date before starting. |
---
## Tools and Templates
### Templates Included in This Repository
- [T0 Asset Classification Worksheet](../core/t0-asset-framework.md#t0-classification-worksheet)
- AI Usage Discovery Interview Guide (see Workstream: AI Sovereignty)
- Blameless Post-Mortem Template (to be added)
- Chaos Experiment Planning Template (to be added)
- Vendor Exit Architecture Template (to be added)
### Recommended External Tools
| Category | Options | Notes |
|----------|---------|-------|
| Endpoint Detection | Microsoft Defender, CrowdStrike, SentinelOne | Choose based on existing Microsoft footprint |
| SIEM / Log Analysis | Sentinel, Splunk, Elastic, Wazuh | Wazuh is open-source and sufficient for many environments |
| Identity Governance | Azure AD / Entra ID, Okta, Saviynt | Match to primary cloud identity provider |
| PAM / Vault | CyberArk, Delinea, HashiCorp Vault | Essential for service account and secret management |
| CMDB | ServiceNow, Device42, GLPI, or spreadsheet | Any CMDB is better than no CMDB |
| Local AI Inference | Ollama, vLLM, llama.cpp, TGI | Start simple; scale to TGI or vLLM for production load |
| Chaos Engineering | Gremlin, Chaos Mesh, custom scripts | Gremlin for enterprise; Chaos Mesh for Kubernetes |
---
*This playbook is a living document. Update it with lessons from every engagement.*
*Previous: [Rapid Modernisation Plan](rapid-modernisation-plan.md)*


@@ -0,0 +1,310 @@
# M365 Antifragile Project Playbook
> *"Most M365 deployments create fragile monocultures: one tenant, one identity provider, one way in, and no way out. We architect M365 as an antifragile platform: decoupled, observable, recoverable, and sovereign."*
This playbook applies antifragile principles to Microsoft 365 projects—both **greenfield deployments** (new tenant, new organization, or post-merger consolidation) and **modernisation** (existing tenant hardening, restructuring, or security transformation).
It is designed for M365/Azure consultancies who want to deliver resilient, governance-ready, and future-proof M365 environments—not just functional ones.
---
## The Antifragile M365 Philosophy
Traditional M365 projects optimize for:
- **User adoption**: How quickly can we get people using Teams?
- **Feature enablement**: Which M365 apps should we roll out?
- **License efficiency**: Are we using all our E3/E5 seats?
Antifragile M365 projects optimize for:
- **Structural decoupling**: Can we migrate, split, or exit this tenant without existential disruption?
- **Observability**: Do we know who has access to what, and what they are doing with it?
- **Recoverability**: Can we rebuild this tenant from zero in 48 hours?
- **Sovereignty**: Does our proprietary data improve our position, or Microsoft's?
---
## Part 1: Greenfield M365 Deployment
### Phase 0: Architecture and Sovereignty Design (Before Migration)
**Objective**: Design the tenant so it does not become a trap.
| Decision | Antifragile Default | Fragile Alternative |
|----------|--------------------|---------------------|
| **Tenant location** | Data center in client's primary jurisdiction (e.g., EU, Germany, Switzerland) | Default US tenant with data residency afterthought |
| **Domain strategy** | Custom domain owned by client; MX records client-controlled | Microsoft-managed domain; no exit path |
| **Identity architecture** | Cloud-only Entra ID with documented exit path, OR hybrid with phased cloud-native migration | Hybrid AD with indefinite synchronization; no cloud-only plan |
| **Email archiving** | Immutable third-party journal or customer-managed retention; not Exchange Online-only | Exchange Online retention only; vendor-dependent |
| **External sharing** | Default off; enabled per-site with justification | Default on; locked down reactively after incidents |
| **Guest access** | Disabled by default; enabled via governed workflow | Enabled by default; cleaned up never |
| **Third-party apps** | Admin consent required; app catalog governed | User consent allowed; shadow OAuth proliferation |
| **Backup strategy** | Third-party backup with immutable storage; tested quarterly | Native recycle bin only; no recovery testing |
**The conversation**:
> *"We are not just setting up email and Teams. We are designing the digital foundation of your organization for the next decade. Every decision we make in the first two weeks will either preserve your optionality or eliminate it. We choose optionality."*
---
### Phase 1: Tenant Foundation (Week 1-2)
**Identity and Access Architecture**
- **Custom domain verification**: Client retains DNS control; Microsoft is a service, not an owner
- **Break-glass accounts**: 2-3 global admins, excluded from conditional access, complex passwords managed offline
- **Initial admin roles**: No standing global admins for daily work; delegated admin roles (Exchange admin, SharePoint admin, User admin)
- **Security defaults or conditional access baseline**:
- E3: Security defaults, or a conditional access baseline using the Entra ID P1 included in M365 E3: MFA for all admins and users; block legacy authentication
- E5: Everything in the E3 baseline, plus compliant devices for admins and risky sign-in policies (Entra ID P2)
**Data Governance Foundation**
- **Retention policies**: Define retention from day one
- Email: 7 years for regulated industries; 3 years for general business
- Teams chat: 2 years minimum
- SharePoint: per-site classification
- **Microsoft Purview labels** (if licensed): Deploy default sensitivity labels (Public, Internal, Confidential, Highly Confidential)
- **Data loss prevention** (if licensed): Pilot DLP for PCI, PII, and client-defined crown jewels
**Baseline Security Configuration**
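- **Audit logging**: Confirm the Unified Audit Log is enabled immediately; for regulated clients, plan long-term retention (10-year retention requires an additional audit license; otherwise export logs to client-controlled storage)
- **Mailbox auditing**: Verify via PowerShell that it is enabled for all mailboxes (on by default in current tenants) and that no mailbox bypasses it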
- **Alert policies**: Configure default alert policies for elevated privileges, malware, phishing
- **Secure Score**: Baseline and weekly tracking
---
### Phase 2: Workload Deployment (Week 3-6)
**Deployment Order (Antifragile Priority)**
| Priority | Workload | Why First? |
|----------|----------|-----------|
| 1 | **Exchange Online** | Identity verified, email secured, archiving established |
| 2 | **SharePoint / OneDrive** | Document governance foundation before content accumulates |
| 3 | **Teams** | Collaboration with the governance guardrails already in place |
| 4 | **Intune / Endpoint Management** | Device compliance before conditional access enforcement; see [Endpoint Management Entry Vector](endpoint-management-entry-vector.md) |
| 5 | **Power Platform** | Low-code governance before citizen developers create shadow IT |
| 6 | **Copilot / AI features** | Only after data governance, access control, and sovereignty architecture are proven |
**The antifragile rule**: Governance before workload. Every Teams channel created without retention policy is technical debt. Every Power App deployed without DLP is a future incident.
---
### Phase 3: Hardening and Governance (Week 7-10)
**Conditional Access (Entra ID P1 in E3; P2 in E5 for risk-based policies)**
- Require MFA for all users
- Require compliant or Microsoft Entra hybrid joined device for sensitive apps
- Block legacy authentication
- Block downloads from unmanaged devices for confidential content
- Require password change on high user risk (requires Entra ID P2)
- Enforce token protection (token binding) where supported
**SharePoint and OneDrive Lockdown**
- External sharing: Only people in your organization (default)
- Anyone links: Disabled
- Guest access: Admin-controlled per site
- Site creation: Admin-only or governed workflow
- Access requests: Disabled or routed to site owner
**Teams Governance**
- Team creation: Governed workflow (not open to all)
- Guest access in Teams: Disabled by default; enabled per team with justification
- Private channel creation: Restricted
- Third-party apps in Teams: Admin-approved catalog only
- Meeting recordings: Retention policy applied; transcription governed
**Power Platform Governance**
- Environment strategy: Default environment restricted; production environments for approved use cases
- DLP policies: Block connectors that exfiltrate data (personal email, unauthorized cloud storage)
- Data policies: Prevent citizen developers from creating unmanaged databases of customer data
- ALM: Require solution packaging for production environments
---
### Phase 4: Validation and Handover (Week 11-12)
**Recovery Testing**
- Perform tenant recovery drill: restore a deleted mailbox, a deleted SharePoint site, a corrupted Teams channel
- Validate backup integrity if third-party backup is deployed
- Document recovery runbooks
**Governance Documentation**
- Acceptable use policy for M365
- Data classification and handling guide
- Guest access policy
- External sharing decision tree
- Incident response runbook for M365-specific threats (BEC, OAuth consent grants, data exfiltration)
**Knowledge Transfer**
- Admin training: Entra ID, Exchange admin center, SharePoint admin, Security & Compliance
- End-user training: Phishing awareness, data handling, external sharing procedures
- Champion program: Identify M365 champions per department
---
## Part 2: M365 Modernisation
### The Modernisation Audit
Before any changes, assess the current tenant against antifragile criteria:
| Category | Audit Question | Finding |
|----------|---------------|---------|
| **Identity** | How many global admins? How many unused accounts? Is PIM enabled? | |
| **Access** | Is conditional access deployed? Is legacy auth blocked? Is MFA enforced? | |
| **Data** | Are sensitivity labels deployed? Is DLP active? Who can share externally? | |
| **Applications** | How many enterprise apps? How many OAuth consents? Are they justified? | |
| **Devices** | What is EDR coverage? Is Intune managing devices? Are PAWs used for admin? | |
| **Recovery** | When was the last backup test? Is there a tenant recovery plan? | |
| **Governance** | Is there an acceptable use policy? Who owns site creation? | |
| **AI** | Is shadow AI in use? Is there a sanctioned alternative? | |
**The conversation**:
> *"Most M365 modernisations start with 'What new features should we enable?' We start with 'What would kill this organization if it failed?' Then we fix that first."*
---
### Phase 1: Kill Chain Closure (Week 1-4)
**Identity Blitz**
```powershell
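# Export and analyze the full identity estate (Microsoft.Graph module; sign-in data needs AuditLog.Read.All)
Connect-MgGraph -Scopes 'User.Read.All','AuditLog.Read.All','Directory.Read.All'
# AccountEnabled and SignInActivity are only returned when requested via -Property
Get-MgUser -All -Property DisplayName,UserPrincipalName,AccountEnabled,SignInActivity |
    Select-Object DisplayName,UserPrincipalName,AccountEnabled,@{Name='LastSignInDateTime';Expression={$_.SignInActivity.LastSignInDateTime}} |
    Export-Csv users.csv -NoTypeInformation
Get-MgDirectoryRole | ForEach-Object { Get-MgDirectoryRoleMember -DirectoryRoleId $_.Id }
Get-MgOAuth2PermissionGrant -All | Export-Csv oauth-grants.csv -NoTypeInformation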
```
- Disable unused accounts (> 90 days inactive)
- Remove excessive admin roles
- Revoke stale OAuth consents
- Enable PIM for all privileged roles (if licensed)
- Enforce MFA for all users (conditional access via the Entra ID P1 included in E3; add risk-based policies with Entra ID P2 in E5)
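Once a user export exists, the 90-day inactivity rule can be applied offline. The sketch below assumes a CSV with `UserPrincipalName`, `AccountEnabled`, and `LastSignInDateTime` columns, with timestamps normalized to ISO 8601 UTC; those column names and the timestamp format are assumptions about the export, so adjust the parsing to what the file actually contains.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def stale_accounts(csv_text: str, now: datetime, max_idle_days: int = 90) -> list:
    """Return UPNs of enabled accounts with no sign-in inside the idle window."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["AccountEnabled"].strip().lower() != "true":
            continue  # already disabled
        raw = row["LastSignInDateTime"].strip()
        # An empty value means no recorded sign-in; treat it as stale
        last = datetime.fromisoformat(raw.replace("Z", "+00:00")) if raw else None
        if last is None or last < cutoff:
            stale.append(row["UserPrincipalName"])
    return stale
```

Accounts with no recorded sign-in include newly created users and service identities, so the output is a review list for the client rather than a disable list.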
**External Access Lockdown**
- Audit all guest users: business justification per guest
- Audit all external shares: revoke stale links
- Audit all enterprise apps: remove unused, justify retained
- Disable user consent for apps (admin consent required)
**Email Security Tuning**
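- E3: Maximize EOP (anti-spam, anti-malware, and anti-phishing with spoof intelligence; impersonation protection requires Defender for Office 365)
- E5: Enable Safe Links, Safe Attachments, and advanced anti-phishing with impersonation protection
- Mailbox auditing: verify it is enabled for all mailboxes and that no mailbox bypasses it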
---
### Phase 2: Structural Improvement (Week 5-8)
**Data Governance Deployment**
- Deploy sensitivity labels (if Purview available) or manual classification guidance
- Deploy retention policies for all workloads
- Deploy DLP policies for high-sensitivity data types
- Site provisioning governance: restrict site creation or implement approval workflow
**Device and Endpoint**
- Deploy Intune MDM for all corporate devices
- Deploy the Microsoft Defender Antivirus features available in E3
- Consider Sysmon + Wazuh for EDR-like visibility without E5
- Deploy LAPS for local admin password randomization
**Power Platform Cleanup**
- Inventory all environments, apps, and flows
- Apply DLP policies
- Migrate unmanaged production apps to governed environments
- Document and train citizen developers
---
### Phase 3: Sovereignty and AI Integration (Week 9-12)
**AI Sovereignty Bridge**
- Inventory shadow AI usage
- Deploy Azure OpenAI Service as sanctioned alternative (see [Azure OpenAI Sovereignty Bridge](../core/azure-openai-sovereignty-bridge.md))
- Configure private endpoints, CMK, and conditional access for AI endpoints
- Pilot Copilot for M365 with governance guardrails (if licensed)
**Tenant Recovery Validation**
- Third-party backup test: restore mailbox, SharePoint site, Teams data
- Document tenant rebuild runbook
- Validate domain recovery procedures (DNS, MX, SPF, DKIM, DMARC)
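Part of validating domain recovery is confirming that the published mail authentication records say what the runbook expects. The sketch below checks record strings that have already been retrieved (for example, pasted from a DNS export); it performs no DNS lookups, and the function names are illustrative.

```python
def parse_tags(record: str) -> dict:
    """Parse a 'tag=value; tag=value' TXT record such as DMARC into a dict."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_enforced(record: str) -> bool:
    """True when the DMARC policy is quarantine or reject rather than none."""
    tags = parse_tags(record)
    return tags.get("v") == "DMARC1" and tags.get("p", "").lower() in ("quarantine", "reject")


def spf_has_fail_all(record: str) -> bool:
    """True when the SPF record ends in a hard-fail or soft-fail 'all' mechanism."""
    terms = record.split()
    return bool(terms) and terms[0].lower() == "v=spf1" and terms[-1].lower() in ("-all", "~all")
```

Keeping the expected record values in the tenant rebuild runbook means these checks can be rerun after any DNS change or registrar migration.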
**Operational Handover**
- Transfer admin knowledge to client team
- Establish recurring governance review cadence
- Deploy automated Secure Score monitoring
---
## Antifragile M365 Checklist
### Greenfield Deployment
- [ ] Tenant in correct geographic region
- [ ] Custom domain with client-controlled DNS
- [ ] Break-glass accounts created and secured
- [ ] Security defaults or conditional access baseline
- [ ] Unified Audit Log enabled
- [ ] Retention policies defined and deployed
- [ ] External sharing default: off
- [ ] Guest access default: disabled
- [ ] User consent for apps: disabled
- [ ] Intune MDM baseline configured
- [ ] Third-party backup deployed and tested
- [ ] Recovery runbook documented
- [ ] Admin and end-user training completed
- [ ] AI governance framework defined before Copilot deployment
### Modernisation
- [ ] Full identity census completed
- [ ] Unused accounts disabled
- [ ] Admin roles minimized and justified
- [ ] OAuth consents audited and cleaned
- [ ] MFA enforced for 100% of users
- [ ] Legacy authentication blocked
- [ ] External sharing audited and locked down
- [ ] Guest access audited and time-bounded
- [ ] Email security tuned (EOP or Defender for O365)
- [ ] Sensitivity labels or classification guidance deployed
- [ ] Retention policies applied to all workloads
- [ ] Power Platform governed with DLP
- [ ] Shadow AI inventoried and sanctioned alternative deployed
- [ ] Backup recovery tested
- [ ] Secure Score trending upward
---
## Integration With the Rapid Modernisation Plan
| Rapid Modernisation Phase | M365 Project Mapping |
|--------------------------|---------------------|
| **Hygiene (Days 0-30)** | Identity audit; external access lockdown; MFA enforcement; shadow AI inventory |
| **Control (Days 30-60)** | Conditional access; data governance; device management; email security tuning |
| **Sovereignty (Days 60-90)** | Azure OpenAI bridge deployment; backup recovery validation; tenant exit architecture |
| **Antifragility (Days 90-180)** | Automated governance monitoring; quarterly recovery drills; red team including M365 vectors; AI pilot expansion |
---
*For the M365 E3 hardening specifics, see [M365 E3 Hardening](m365-e3-hardening.md).*
*For the Azure OpenAI sovereignty bridge, see [Azure OpenAI Sovereignty Bridge](../core/azure-openai-sovereignty-bridge.md).*
*For the M365 project risk register, see [M365 Project Risk Register](../assessment-templates/m365-project-risk-register.md).*


@@ -0,0 +1,331 @@
# M365 E3 Hardening Playbook
> *"Most of your clients own E3, not E5. That is not a handicap. It is a constraint that forces precision."*
This playbook is designed for consulting engagements where the client's primary environment is **Microsoft 365 with E3 licensing**. It provides a pragmatic hardening roadmap that respects the E3 feature boundary while closing critical security gaps through configuration, process, and low-cost augmentation.
E3 provides the foundation. The gaps are real but manageable. This document shows you exactly what E3 gives you, what it does not, and how to close the gaps without immediately pushing an E5 upgrade.
---
## What E3 Actually Includes (Security-Relevant)
| Capability | E3 Inclusion | Notes |
|-----------|-------------|-------|
| Exchange Online Protection (EOP) | Yes | Anti-malware, anti-spam, basic anti-phishing |
| Azure AD Free / Entra ID Free | Yes | Basic identity, no conditional access, no PIM |
| Microsoft Defender Antivirus | Yes | Client-side AV, no EDR, no ASR |
| Office 365 Audit Logging | Yes | Must be manually enabled |
| Basic Mobile Device Management (MDM) | Yes | Via Microsoft Intune limited enrollment |
| Self-Service Password Reset (SSPR) | Yes | Requires Azure AD Basic configuration |
| Teams, SharePoint, OneDrive | Yes | Data governance limited without Purview |
## What E3 Does NOT Include (The Gaps)
| Capability | Missing in E3 | Business Impact |
|-----------|---------------|-----------------|
| Microsoft Defender for Endpoint P2 | No | No EDR, no ASR rules, no threat analytics, no automated investigation |
| Entra ID P2 / P1 Conditional Access | No | No risk-based policies, no device compliance gating, no location-based rules |
| Entra ID PIM | No | No just-in-time admin elevation |
| Microsoft Defender for Office 365 P2 | No | No Safe Links, no Safe Attachments, no advanced anti-phishing |
| Microsoft Purview | No | No DLP, no sensitivity labels, no insider risk management |
| Microsoft Sentinel | No | No native SIEM; logs go to Log Analytics only with additional cost |
---
## The E3 Hardening Strategy
We operate in three layers:
1. **Maximize E3** — Every configuration, every policy, every log that E3 can produce
2. **Augment E3** — Open-source and low-cost tools that close the most dangerous gaps
3. **Justify E5 selectively** — Use E3 gaps as evidence for strategic E5 upgrades, not blanket licensing
---
## Phase 1: E3 Foundation (Week 1-2)
### Identity and Access
**Enable MFA for All Users**
E3 includes MFA via Azure AD Free/Entra ID Free, but it is **per-user MFA** (less flexible than conditional access). This is still mandatory.
- Navigate to Microsoft Entra admin center → Users → Per-user MFA
- Enable MFA for all administrative accounts first
- Roll out to all users within 14 days
- Enroll at least one backup method per user (authenticator app + phone)
**Document the Gap**: Per-user MFA cannot enforce risk-based step-up, device compliance, or location-based blocking. Document this as a risk for the steering committee.
**Disable Legacy Authentication**
- Microsoft 365 admin center → Settings → Org settings → Modern authentication
- Verify legacy auth is disabled tenant-wide
- If specific protocols are required (e.g., IMAP for legacy devices), document exceptions with expiration dates
**Audit and Cleanse Identities**
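- Export all users: `Get-MgUser -All | Export-Csv -Path users.csv -NoTypeInformation` (Microsoft Graph PowerShell; the legacy MSOnline `Get-MsolUser` cmdlets are deprecated)
- Export all guest users: `Get-MgUser -All -Filter "userType eq 'Guest'"` (guests are easy to overlook in the default user views)
- Export all service principals / enterprise apps: `Get-MgServicePrincipal -All`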
- Disable unused accounts (> 90 days inactive)
- Review and revoke excessive OAuth consents
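The 90-day inactivity rule is easy to apply to a CSV export. A minimal sketch, assuming the export has been shaped into two columns named `UserPrincipalName` and `LastSignIn` (ISO 8601 timestamps, empty when the account has never signed in); both column names and the function name are hypothetical and need adjusting to the real export.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def inactive_accounts(csv_text, now, max_idle_days=90):
    """Return UPNs whose last sign-in is older than max_idle_days (or missing)."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        last = row["LastSignIn"].strip()
        if not last:
            # Never signed in: treat as stale and review manually
            stale.append(row["UserPrincipalName"])
        elif datetime.fromisoformat(last) < cutoff:
            stale.append(row["UserPrincipalName"])
    return stale


if __name__ == "__main__":
    sample = (
        "UserPrincipalName,LastSignIn\n"
        "alice@example.com,2026-04-01T08:00:00+00:00\n"
        "bob@example.com,2025-11-01T08:00:00+00:00\n"
        "svc-backup@example.com,\n"
    )
    print(inactive_accounts(sample, datetime(2026, 5, 1, tzinfo=timezone.utc)))
```

Accounts with no recorded sign-in are flagged rather than silently skipped, because service accounts and abandoned invitations tend to hide there. Review the list before disabling anything.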
**Secure Break-Glass Accounts**
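- Create 2-3 cloud-only Global Admin accounts for emergency access, kept outside the routine MFA rollout so a failure of the normal MFA path cannot lock out the tenant; protect them with an independent phishing-resistant method (e.g., FIDO2 security keys stored offline), since Microsoft enforces MFA on admin portal sign-ins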
- Use non-personal, complex passwords (20+ characters, managed offline)
- Log every use; review quarterly
### Email Security (EOP-Only)
**Harden Anti-Phishing in EOP**
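EOP anti-phishing is basic but not useless. Configure what it offers aggressively:
- Microsoft Defender portal → Email & collaboration → Policies & rules → Threat policies → Anti-phishing
- Enable spoof intelligence
- Set action for spoofed senders: **Quarantine**
- Honor the sender's DMARC policy when a message fails DMARC (`p=quarantine` / `p=reject`)
- Enable the first contact safety tip and unauthenticated sender indicators
- Impersonation protection (your own domains; users such as CEO, CFO, board members) and mailbox intelligence are Defender for Office 365 features, not EOP; record them as part of the email security gap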
**Configure Anti-Malware**
- Exchange admin center → Protection → Anti-malware
- Enable common attachment filter (block executable content)
- Notify internal senders if malware detected
- Notify administrators with full message details
**Anti-Spam Tuning**
- Exchange admin center → Protection → Anti-spam
- Set bulk email threshold to 6 or 7 (aggressive)
- Enable SPF hard fail evaluation
- Configure outbound spam notifications
### Audit Logging
**Enable Unified Audit Log**
This is **not enabled by default** in many tenants and is the most underutilized E3 feature.
```powershell
# Verify status
Get-AdminAuditLogConfig | Select-Object UnifiedAuditLogIngestionEnabled
# Enable if false
Set-AdminAuditLogConfig -UnifiedAuditLogIngestionEnabled $true
```
- Retention: 180 days for Audit (Standard), 90 days for records created before October 2023; document the gap vs. 1-year requirement in some regulations
- Export for analysis: `Search-UnifiedAuditLog` or use Microsoft Purview Audit (Standard) if available
**Enable Mailbox Auditing**
```powershell
# Enable for all mailboxes
Get-Mailbox -ResultSize Unlimited | Set-Mailbox -AuditEnabled $true
```
### SharePoint and OneDrive
**External Sharing Lockdown**
- SharePoint admin center → Policies → Sharing
- Default: **Only people in your organization**
- Override per site only with documented business justification
- Disable "Anyone" links (anonymous sharing)
**OneDrive Retention**
- OneDrive admin center → Storage
- Set retention for deleted users: 30 days minimum
- Document data ownership transfer process
---
## Phase 2: Augment E3 (Week 3-4)
### Close the EDR Gap (No Defender for Endpoint P2)
E3 includes Microsoft Defender Antivirus but **not** EDR. You need visibility.
| Option | Cost | Effort | When to Use |
|--------|------|--------|-------------|
| **Wazuh** (open-source) | Free | Medium | Need centralized EDR-like visibility without purchase |
| **Sysmon + free log forwarding** | Free | Medium | Need detailed Windows endpoint telemetry |
| **Upgrade select users to E5 Security** | ~$10/user/month | Low | Critical users only (admins, executives, finance) |
| **Microsoft Defender for Business** | ~$3/user/month | Low | Small business clients; includes EDR-lite |
**Recommended Hybrid Approach for E3 Clients**:
1. Deploy **Sysmon** (free) on all Windows endpoints with the SwiftOnSecurity config
2. Forward Sysmon logs to **Wazuh** (free) or existing syslog/SIEM
3. Upgrade **only privileged users** to Microsoft Defender for Endpoint P2 via add-on or E5 Security
4. This gives you EDR coverage where it matters most at ~10% of full E5 cost
### Close the Conditional Access Gap (No Entra ID P1/P2)
Without conditional access, you cannot enforce:
- Device compliance gating
- Location-based blocking
- Risk-based step-up
- Block legacy auth per-protocol
**Mitigations within E3**:
- **Per-user MFA**: Enforce for 100% of users (already covered above)
- **Block legacy auth tenant-wide**: Already covered above
- **Intune MDM enrollment**: E3 includes basic Intune; enroll all corporate devices
- **Third-party MFA with policy engine**: Duo, Okta (additional cost, but cheaper than full E5)
**The Strategic Conversation**:
> *"E3 gives us strong authentication but weak authorization. We can enforce MFA, but we cannot say 'only from a managed device in the Czech Republic.' If that is a requirement for your risk profile, the minimum viable upgrade is Entra ID P1 for conditional access, not a full E5 jump."*
### Close the Email Security Gap (No Defender for Office 365 P2)
EOP anti-phishing is reactive. Safe Links and Safe Attachments are proactive.
**Mitigations within E3**:
- **URL filtering via transport rules**: Block or quarantine messages linking to known bad TLDs; mail flow rules can match URL patterns but cannot rewrite links or check them at click time
- **Attachment filtering**: Block executable attachments at transport rule level (EOP already does this partially)
- **User education**: Phishing simulation via free or low-cost platforms (GoPhish is open-source)
- **Third-party email gateway**: Proofpoint, Mimecast, Avanan (~$3-5/user/month)
**The Strategic Conversation**:
> *"EOP catches spam and known malware. It does not rewrite URLs or sandbox attachments. For a bank/telco/power client, that gap is meaningful. The most cost-effective close is either Defender for Office 365 P1 add-on or a third-party gateway. Let us quantify the phishing risk first, then size the investment."*
### Close the PAM Gap (No PIM)
Without PIM, administrative roles are standing privileges.
**Mitigations within E3**:
- **Dedicated admin accounts**: Separate admin and user identity for every administrator
- **PAW (Privileged Access Workstation)**: Physical or virtual separation for admin tasks
- **Time-bounded access via process**: Manual approval workflow for admin elevation
- **Quarterly admin access review**: Document every admin; remove stale assignments
- **LAPS**: Free from Microsoft; randomizes local admin passwords
---
## Phase 3: M365-Specific Threat Scenarios
### Scenario 1: Business Email Compromise (BEC)
**The Attack**: Adversary compromises executive mailbox, sends fraudulent payment instructions.
**E3 Defenses**:
- Anti-phishing policy in EOP (configured above)
- Mailbox auditing (configured above)
- MFA on all accounts (prevents initial compromise)
- Outbound spam policy: flag unusual send patterns
**Gap**: No Safe Links to rewrite URLs in real-time; no automated investigation.
**Augmentation**: User education + third-party email gateway.
### Scenario 2: OAuth / Consent Grant Attack
**The Attack**: User grants permissions to malicious app; adversary gains persistent access.
**E3 Defenses**:
- Audit all enterprise apps: `Get-MgServicePrincipal -All` (Microsoft Graph PowerShell; replaces the deprecated `Get-AzureADServicePrincipal`)
- Review OAuth consents quarterly
- Disable user consent to apps (admin consent required)
- Microsoft 365 admin center → Settings → Org settings → User consent to apps → **Off**
**Gap**: No automated anomaly detection for consent grants.
**Augmentation**: Manual quarterly review + scripting.
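The scripting half of that review can be as simple as flagging grants that carry broad mail, file, or directory scopes. A minimal sketch, assuming the consent grants have been exported to records with an `app` name, a `consentType`, and a space-separated `scope` string; the record layout, the `HIGH_RISK_SCOPES` selection, and the function name are illustrative choices, not a Microsoft-defined risk list.

```python
# Delegated Microsoft Graph permissions that give broad access to mail,
# files, or the directory. The selection is illustrative; tune it per client.
HIGH_RISK_SCOPES = {
    "Mail.Read",
    "Mail.ReadWrite",
    "Mail.Send",
    "Files.Read.All",
    "Files.ReadWrite.All",
    "Directory.ReadWrite.All",
}


def risky_grants(grants):
    """Return (app, consent_type, risky_scopes) for grants with high-risk scopes."""
    flagged = []
    for grant in grants:
        scopes = set(grant["scope"].split())
        risky = sorted(scopes & HIGH_RISK_SCOPES)
        if risky:
            flagged.append((grant["app"], grant["consentType"], risky))
    return flagged


if __name__ == "__main__":
    exported = [
        {"app": "Calendar Helper", "consentType": "Principal", "scope": "User.Read Calendars.Read"},
        {"app": "PDF Converter Pro", "consentType": "Principal", "scope": "User.Read Mail.Read Files.ReadWrite.All offline_access"},
    ]
    for app, consent_type, scopes in risky_grants(exported):
        print(f"{app} ({consent_type}): {', '.join(scopes)}")
```

Tenant-wide grants deserve the closest look, because a single admin consent exposes every mailbox or file the scope covers.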
### Scenario 3: Data Exfiltration via SharePoint / OneDrive
**The Attack**: Insider or compromised account bulk-downloads sensitive files.
**E3 Defenses**:
- External sharing locked down (configured above)
- Audit logging enabled (configured above)
- Basic retention policies
**Gap**: No DLP, no sensitivity labels, no insider risk analytics.
**Augmentation**:
- PowerShell scripts to detect bulk downloads
- Quarterly access reviews on sensitive sites
- Process: data classification by site owner (manual but effective)
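Bulk-download detection can also be sketched outside PowerShell, working on audit records that were exported beforehand. The example below assumes each record is a dict with `UserId`, `Operation`, and an ISO 8601 `CreationTime`; the threshold of 100 downloads per user per hour and the function name are assumptions to tune per client.

```python
from collections import Counter
from datetime import datetime


def bulk_downloaders(records, threshold=100):
    """Return (user, hour, count) where a user hit the download threshold in one hour."""
    counts = Counter()
    for rec in records:
        if rec["Operation"] != "FileDownloaded":
            continue
        # Bucket by calendar hour, e.g. "2026-05-01T14"
        hour = datetime.fromisoformat(rec["CreationTime"]).strftime("%Y-%m-%dT%H")
        counts[(rec["UserId"], hour)] += 1
    return sorted(
        (user, hour, n) for (user, hour), n in counts.items() if n >= threshold
    )


if __name__ == "__main__":
    sample = [
        {"UserId": "eve@example.com", "Operation": "FileDownloaded", "CreationTime": f"2026-05-01T14:{m:02d}:00"}
        for m in range(5)
    ]
    sample.append({"UserId": "bob@example.com", "Operation": "FileDownloaded", "CreationTime": "2026-05-01T14:10:00"})
    print(bulk_downloaders(sample, threshold=5))
```

Fixed hourly buckets miss a burst that straddles the hour boundary; a sliding window is more accurate but harder to explain in a findings report.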
### Scenario 4: Lateral Movement via Compromised Credentials
**The Attack**: Phished credentials → mailbox compromise → password reset on other services → full identity takeover.
**E3 Defenses**:
- MFA (prevents password-only access)
- SSPR with MFA enforcement (prevents account lockout abuse)
**Gap**: No risk-based step-up; no impossible travel blocking.
**Augmentation**: Monitor for impossible travel in audit logs (manual or scripted).
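A scripted impossible-travel check compares consecutive sign-ins per user and flags pairs whose implied speed exceeds what an airliner could manage. A minimal sketch, assuming sign-in records have already been enriched with approximate coordinates (`lat`, `lon`) from IP geolocation; the record layout, the 900 km/h ceiling, and the function names are assumptions.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def impossible_travel(signins, max_kmh=900.0):
    """Return (user, earlier_time, later_time) for implausibly fast location changes."""
    flagged = []
    previous = {}
    ordered = sorted(signins, key=lambda s: (s["user"], datetime.fromisoformat(s["time"])))
    for s in ordered:
        prev = previous.get(s["user"])
        if prev is not None:
            delta = datetime.fromisoformat(s["time"]) - datetime.fromisoformat(prev["time"])
            hours = delta.total_seconds() / 3600
            km = haversine_km(prev["lat"], prev["lon"], s["lat"], s["lon"])
            # Same-instant sign-ins from two different places are always suspicious
            if km > 0 and (hours == 0 or km / hours > max_kmh):
                flagged.append((s["user"], prev["time"], s["time"]))
        previous[s["user"]] = s
    return flagged


if __name__ == "__main__":
    sample = [
        {"user": "alice", "time": "2026-05-01T08:00:00", "lat": 50.08, "lon": 14.44},   # Prague
        {"user": "alice", "time": "2026-05-01T09:00:00", "lat": 40.71, "lon": -74.01},  # New York
    ]
    print(impossible_travel(sample))
```

IP geolocation is coarse, and VPN or mobile egress points produce false positives, so treat hits as leads for review rather than as proof of compromise.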
---
## The E5 Upgrade Conversation
There will come a point where E3 augmentation is no longer cost-effective. Frame the E5 conversation around **specific capability gaps**, not feature lust.
| E5 Capability | What It Solves | When to Recommend |
|--------------|----------------|-------------------|
| Defender for Endpoint P2 | EDR, ASR, threat analytics | Client has had malware incident or is in regulated industry |
| Entra ID P2 | Conditional access, PIM, identity protection | Client has admin compromise or needs device/location gating |
| Defender for Office 365 P2 | Safe Links, Safe Attachments, automated investigation | Client has had phishing-driven incident |
| Purview | DLP, sensitivity labels, insider risk | Client handles customer PII, financial data, or trade secrets |
| Sentinel | SIEM, SOAR, threat hunting | Client has dedicated SOC or regulatory SIEM requirements |
**The Pitch**:
> *"We have extracted 80% of the security value from your E3 investment. The remaining 20% requires capabilities that only exist in E5 or specific add-ons. I am not recommending a blanket upgrade. I am recommending we selectively license the gaps that match your actual risk profile."*
---
## OT / Critical Infrastructure Overlay (Telco, Power)
For clients with operational technology (OT) or critical infrastructure obligations:
| E3 Consideration | OT Implication |
|-----------------|----------------|
| MFA enforcement | Admin accounts for OT-facing M365 tenants must have hardware tokens (no phone SMS in control rooms) |
| Audit logging | Default retention (180 days on Audit Standard) may be insufficient; plan export to long-term storage |
| External sharing | OneDrive/SharePoint must not become accidental conduit between IT and OT networks |
| Guest access | Strictly prohibit guest accounts in OT-connected tenants |
| Email security | EOP is baseline; NIS2 and critical infrastructure regulations may mandate advanced email filtering |
See [Vertical: Power Utilities](../reference/vertical-power-utilities.md) for full OT alignment.
---
## Banking Overlay
For financial services clients:
| E3 Consideration | Regulatory Implication |
|-----------------|----------------------|
| Audit logging | DORA's ICT risk management chapter (Articles 5-16; detection mechanisms in Article 10) requires comprehensive logging and monitoring |
| MFA | PSD2 Strong Customer Authentication principles apply to internal systems |
| Data residency | M365 data must remain in EU/geographically appropriate datacenters |
| DLP gap | No native DLP in E3; manual data governance + eventual Purview upgrade likely required |
| Email archiving | Financial regulations often require immutable, long-term email retention |
See [Vertical: Banking](../reference/vertical-banking.md) for full regulatory alignment.
---
*Previous: [Zero-Budget Hardening](zero-budget-hardening.md)*
*Next: [AD and Endpoint Hardening](ad-endpoint-hardening.md)*
For how Intune deployment becomes the natural entry point for broader security transformation, see [Endpoint Management Entry Vector](endpoint-management-entry-vector.md).

View File

@@ -0,0 +1,644 @@
# osquery: The Sovereign Discovery Platform
> *"Tenable sees what Tenable chooses to show you. osquery sees whatever you ask it to see. The difference is sovereignty."*
This document provides a complete blueprint for building a **custom vulnerability discovery, compliance, and asset inventory platform** on osquery—the open-source, cross-platform endpoint agent that exposes operating systems as SQL databases. It is designed for consultancies and clients who want **owned visibility** rather than rented scanner reports.
Osquery is the technical expression of the antifragile principle: **sovereign intelligence**. Your data. Your queries. Your infrastructure. No third-party black box.
---
## Why osquery Fits the Antifragile Posture
| Commercial Scanner | osquery |
|-------------------|---------|
| Proprietary detection logic | **Open-source SQL queries you can inspect, modify, and extend** |
| Data sent to vendor cloud | **Data stays on your infrastructure** |
| Vendor-defined scan scope | **You define what to query; if you can think it, you can ask it** |
| Per-asset licensing cost | **Free and open-source** |
| Quarterly or monthly scans | **Continuous or on-demand; you control the cadence** |
| Generic report templates | **Custom dashboards and reports built on your data** |
| Vendor lock-in | **Portable SQL queries; migrate to any platform** |
**The executive framing**:
> *"Tenable is a rented microscope. It shows you what the manufacturer decided you should see. osquery is a laboratory. You design the experiments, you collect the samples, and you interpret the results. It requires more expertise—but it produces intelligence that no competitor can replicate because it is built on your specific questions about your specific environment."*
---
## What osquery Actually Is
Osquery is an **endpoint agent** that runs on Windows, macOS, Linux, and FreeBSD. It exposes the operating system as a relational database with **hundreds of tables**:
| Table Category | Examples | What You Can Ask |
|----------------|----------|-----------------|
| **Processes** | `processes`, `process_memory_map`, `process_open_sockets` | "Show me processes listening on external ports" |
| **Network** | `listening_ports`, `interface_details`, `etc_hosts` | "Show me hosts with no firewall enabled" |
| **Users & Authentication** | `users`, `groups`, `shadow`, `logged_in_users` | "Show me accounts with password never expires" |
| **Software & Packages** | `programs`, `deb_packages`, `rpm_packages`, `chrome_extensions` | "Show me installed software with known vulnerable versions" |
| **System Configuration** | `os_version`, `system_info`, `registry` | "Show me all Windows Server 2012 machines" |
| **Security** | `startup_items`, `scheduled_tasks`, `authorizations` | "Show me persistence mechanisms" |
| **File Integrity** | `file_events`, `hash` | "Show me changes to /etc/passwd in the last hour" |
| **Hardware** | `usb_devices`, `system_info`, `cpu_info` | "Show me unmanaged USB devices" |
**The power**: You write SQL. osquery returns live system data. No proprietary query language. No vendor-defined limits.
---
## Deployment Architecture
### Model 1: Standalone / Ad-Hoc (Proof of Concept)
For a first sweep or targeted investigation:
```bash
# Install osquery on a single system
# Windows: choco install osquery
# macOS: brew install osquery
# Ubuntu: apt install osquery
# Run a query interactively
osqueryi "SELECT name, version, install_date FROM programs WHERE name LIKE '%Adobe%'"
# Run a query from file
osqueryi --json < queries/windows-software-inventory.sql > results.json
```
**Use case**: Consultant's laptop runs osqueryi against a script-generated target list via SSH/WinRM. No infrastructure. No agents permanently deployed. Perfect for first sweeps.
### Model 2: Scheduled Agent with Local Logging (Basic Monitoring)
Deploy osquery as a daemon with scheduled queries writing to local files or syslog:
```json
// /etc/osquery/osquery.conf
{
  "schedule": {
    "installed_software": {
      "query": "SELECT name, version, install_date FROM programs;",
      "interval": 86400,
      "description": "Daily software inventory (programs is Windows-only; use deb_packages or rpm_packages on Linux)"
    },
    "listening_ports": {
      "query": "SELECT lp.pid, lp.port, lp.protocol, p.name, p.path FROM listening_ports lp LEFT JOIN processes p ON lp.pid = p.pid WHERE lp.address != '127.0.0.1';",
      "interval": 3600,
      "description": "Hourly external listening ports"
    },
    "installed_patches": {
      "query": "SELECT hotfix_id, description, installed_on FROM patches;",
      "interval": 86400,
      "description": "Daily installed-hotfix inventory; osquery reports installed patches only, so compare against required KBs off-host"
    }
  },
  "options": {
    "logger_path": "/var/log/osquery",
    "logger_plugin": "filesystem"
  }
}
```
**Use case**: Small environments (50-500 endpoints) where centralized management is not yet justified. Logs are collected by existing SIEM or file forwarder.
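When no SIEM is in place yet, the results log can be read directly. With the filesystem logger, scheduled query results are written as one JSON object per line, carrying the query `name`, the `hostIdentifier`, an `action` (`added` or `removed`), and the row in `columns`. The sketch below picks out newly appeared listeners; the function name and sample values are illustrative, and the column set depends on the scheduled query.

```python
import json


def new_listeners(log_lines, query_name="listening_ports"):
    """Return (host, port, process_name) for rows a scheduled query newly reported."""
    found = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("name") != query_name or event.get("action") != "added":
            continue
        cols = event["columns"]
        # osquery logs every column value as a string
        found.append((event["hostIdentifier"], int(cols["port"]), cols.get("name", "")))
    return found


if __name__ == "__main__":
    sample = [
        '{"name":"listening_ports","hostIdentifier":"web01","action":"added",'
        '"columns":{"port":"8080","protocol":"6","name":"python3"}}',
        '{"name":"listening_ports","hostIdentifier":"web01","action":"removed",'
        '"columns":{"port":"22","protocol":"6","name":"sshd"}}',
    ]
    print(new_listeners(sample))
```

In practice the lines would come from the results file under the configured `logger_path`; reading them through a forwarder or a file handle works the same way.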
### Model 3: FleetDM (The Recommended Control Plane)
FleetDM is an open-source management platform for osquery. It provides:
- Centralized query scheduling across thousands of endpoints
- Live query capability (ask a question; get answers in seconds)
- Policy enforcement (compliance checks with pass/fail reporting)
- Software inventory and vulnerability mapping
- Device health monitoring
- SSO integration
- API for automation and reporting
**Deployment**:
```
┌─────────────┐     ┌─────────────┐     ┌─────────────────┐
│   FleetDM   │────▶│    MySQL    │────▶│  Redis (cache)  │
│  (Web/API)  │     │ (datastore) │     │                 │
└──────┬──────┘     └─────────────┘     └─────────────────┘
       │ HTTPS (TLS 1.2+)
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   osquery   │     │   osquery   │     │   osquery   │
│  (Windows)  │     │   (Linux)   │     │   (macOS)   │
└─────────────┘     └─────────────┘     └─────────────┘
```
**FleetDM pricing**:
- **Free tier**: Up to 1,000 hosts, full osquery management, basic vulnerability mapping
- **Premium**: ~$7/host/year for advanced features, SSO, API access, premium support
- **Self-hosted**: The software is open-source; you pay only for infrastructure
**The business case**:
| Solution | Cost for 1,000 hosts/year | Data Sovereignty |
|----------|--------------------------|------------------|
| Tenable.io | ~$50,000-$100,000 | Data in vendor cloud |
| Qualys VMDR | ~$40,000-$80,000 | Data in vendor cloud |
| FleetDM + osquery | ~$7,000 (premium) or $0 (free) + infrastructure | **Data in your infrastructure** |
---
## Query Packs for Vulnerability Discovery
### Windows Vulnerability Discovery Pack
```sql
-- windows-vuln-discovery.sql
-- Run via: osqueryi < windows-vuln-discovery.sql
-- 1. End-of-life operating systems
SELECT
  si.computer_name,
  os.name AS os_name,
  os.version AS os_version,
  os.build AS os_build,
  CASE
    WHEN os.version LIKE '6.1%' THEN 'Windows 7/Server 2008 R2 - END OF LIFE'
    WHEN os.version LIKE '6.2%' THEN 'Windows 8/Server 2012 - END OF LIFE'
    WHEN os.version LIKE '6.3%' THEN 'Windows 8.1/Server 2012 R2 - END OF LIFE'
    -- Compare the build as a number and before the generic 10.0 branch;
    -- a text comparison, or a branch placed later, never flags old builds
    WHEN os.version LIKE '10.0%' AND CAST(os.build AS INTEGER) < 17763 THEN 'Windows 10 pre-1809/Server 2016 - Outdated build'
    WHEN os.version LIKE '10.0%' THEN 'Windows 10/11 or Server 2016+ - Check build'
    ELSE 'Current or check manually'
  END AS eol_status
FROM os_version os
CROSS JOIN system_info si;
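-- 2. Installed hotfix inventory (input for missing-patch analysis)
-- The patches table lists installed hotfixes only, and installed_on is a
-- locale-formatted text value, so date arithmetic on it is unreliable.
-- Export the inventory and compare it against the required KB list off-host.
SELECT
  si.computer_name,
  p.hotfix_id,
  p.description,
  p.installed_on
FROM patches p
CROSS JOIN system_info si;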
-- 3. Software with known vulnerable versions (customizable)
-- CAST(version AS INTEGER) keeps the leading integer, i.e. the major version.
-- Thresholds are illustrative; update them from current vendor advisories.
SELECT
  si.computer_name,
  p.name,
  p.version,
  CASE
    WHEN p.name LIKE '%Adobe%Reader%' AND CAST(p.version AS INTEGER) < 23 THEN 'POTENTIALLY VULNERABLE'
    -- Java 8 is listed as "Java 8 Update NNN"; the update number starts at character 15
    WHEN p.name LIKE 'Java 8 Update %' AND CAST(SUBSTR(p.name, 15) AS INTEGER) < 381 THEN 'POTENTIALLY VULNERABLE'
    WHEN p.name LIKE '%Chrome%' AND CAST(p.version AS INTEGER) < 120 THEN 'POTENTIALLY VULNERABLE'
    ELSE 'REVIEW MANUALLY'
  END AS vuln_status
FROM programs p
CROSS JOIN system_info si
WHERE p.name IN ('Adobe Reader', 'Java', 'Google Chrome', 'Mozilla Firefox', 'Microsoft Edge')
  OR p.name LIKE '%Adobe%'
  OR p.name LIKE '%Java%'
  OR p.name LIKE '%Chrome%';
-- 4. Local administrators (excessive count = risk)
SELECT
si.computer_name,
COUNT(*) AS admin_count,
GROUP_CONCAT(u.username, '; ') AS admin_users
FROM users u
JOIN user_groups ug ON u.uid = ug.uid
JOIN groups g ON ug.gid = g.gid
CROSS JOIN system_info si
WHERE g.groupname = 'Administrators'
GROUP BY si.computer_name
HAVING admin_count > 3;
-- 5. Services listening on external interfaces
-- 0.0.0.0 and :: (all interfaces) stay in scope: those listeners are
-- reachable from the network and are the ones that matter most
SELECT
  si.computer_name,
  lp.port,
  lp.protocol,
  p.name AS process_name,
  p.path AS process_path,
  lp.address
FROM listening_ports lp
LEFT JOIN processes p ON lp.pid = p.pid
CROSS JOIN system_info si
WHERE lp.address NOT IN ('127.0.0.1', '::1')
  AND lp.address NOT LIKE '169.254.%'
  AND lp.port > 0;
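-- 6. Firewall disabled profiles
-- windows_firewall_rules lists individual rules, not profile state, so read
-- the per-profile EnableFirewall value from the registry instead.
-- GPO-managed settings live under SOFTWARE\Policies\Microsoft\WindowsFirewall.
SELECT
  si.computer_name,
  r.key AS profile_key,
  r.data AS firewall_enabled,
  'CRITICAL: FIREWALL DISABLED' AS status
FROM registry r
CROSS JOIN system_info si
WHERE r.key LIKE 'HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\%Profile'
  AND r.name = 'EnableFirewall'
  AND r.data = '0';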
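-- 7. BitLocker encryption status (Windows)
-- bitlocker_info exposes drive_letter, protection_status (1 = on), and
-- conversion_status; it has no letter, type, or encrypted columns
SELECT
  si.computer_name,
  d.drive_letter,
  d.encryption_method,
  d.conversion_status,
  d.protection_status,
  CASE WHEN d.protection_status = 1 THEN 'ENCRYPTED' ELSE 'UNENCRYPTED' END AS encryption_status
FROM bitlocker_info d
CROSS JOIN system_info si;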
```
### Linux Vulnerability Discovery Pack
```sql
-- linux-vuln-discovery.sql
-- 1. OS version and kernel (check for EOL)
SELECT
si.hostname,
os.name,
os.version,
os.platform,
os.platform_like,
k.version AS kernel_version
FROM os_version os
CROSS JOIN system_info si
LEFT JOIN kernel_info k ON 1=1;
-- 2. Packages with known CVEs (requires vulners or manual correlation)
SELECT
si.hostname,
dp.name,
dp.version,
dp.source,
dp.arch
FROM deb_packages dp
CROSS JOIN system_info si
WHERE dp.name IN ('openssl', 'openssh-server', 'nginx', 'apache2', 'mysql-server', 'postgresql')
UNION ALL
SELECT
si.hostname,
rp.name,
rp.version,
rp.source,
rp.arch
FROM rpm_packages rp
CROSS JOIN system_info si
WHERE rp.name IN ('openssl', 'openssh-server', 'nginx', 'httpd', 'mariadb-server', 'postgresql-server');
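-- 3. SSH hardening checks
-- ssh_configs parses client-side ssh_config files; daemon settings in
-- sshd_config are read through the augeas table. Options left at their
-- built-in defaults are absent from the file and return no row.
SELECT
  si.hostname,
  a.label AS option_name,
  a.value,
  CASE
    WHEN a.label = 'PermitRootLogin' AND a.value = 'yes' THEN 'CRITICAL: Root login permitted'
    WHEN a.label = 'PasswordAuthentication' AND a.value = 'yes' THEN 'HIGH: Password auth enabled'
    WHEN a.label = 'Port' AND a.value != '22' THEN 'INFO: Non-standard port'
    ELSE 'Review'
  END AS risk
FROM augeas a
CROSS JOIN system_info si
WHERE a.path = '/etc/ssh/sshd_config'
  AND a.label IN ('PermitRootLogin', 'PasswordAuthentication', 'Port', 'Protocol', 'MaxAuthTries');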
-- 4. Sudoers with NOPASSWD (privilege escalation risk)
-- The osquery table is named sudoers
SELECT
  si.hostname,
  su.source,
  su.header,
  su.rule_details
FROM sudoers su
CROSS JOIN system_info si
WHERE su.rule_details LIKE '%NOPASSWD%'
  OR su.rule_details LIKE '%ALL=(ALL:ALL) ALL%';
-- 5. Listening ports with process attribution
-- 0.0.0.0 (all interfaces) stays in scope; port > 0 drops UNIX domain sockets
SELECT
  si.hostname,
  lp.port,
  lp.protocol,
  lp.address,
  p.name AS process_name,
  p.pid,
  p.path
FROM listening_ports lp
LEFT JOIN processes p ON lp.pid = p.pid
CROSS JOIN system_info si
WHERE lp.address NOT IN ('127.0.0.1', '::1')
  AND lp.port > 0;
-- 6. Setuid/setgid binaries (privilege escalation paths)
-- suid_bin enumerates setuid/setgid executables directly. The file table
-- would need a directory constraint (path matches one exact file) and
-- octal mode parsing (mode looks like '4755', never '4000').
SELECT
  si.hostname,
  sb.path,
  sb.username,
  sb.groupname,
  sb.permissions
FROM suid_bin sb
CROSS JOIN system_info si;
-- 7. Container presence and image versions
SELECT
si.hostname,
dc.id,
dc.name,
dc.image,
dc.image_id,
dc.state,
dc.created
FROM docker_containers dc
CROSS JOIN system_info si
WHERE dc.state = 'running';
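-- 8. Kubernetes pod security (if applicable)
-- kubernetes_pods is not a core osquery table; it needs an extension such as
-- kubequery, and the available columns depend on that extension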
SELECT
si.hostname,
kp.name,
kp.namespace,
kp.status,
kp.containers
FROM kubernetes_pods kp
CROSS JOIN system_info si;
```
### macOS Vulnerability Discovery Pack
```sql
-- macos-vuln-discovery.sql
-- 1. macOS version (check for EOL)
SELECT
si.computer_name,
os.name,
os.version,
os.platform,
os.build
FROM os_version os
CROSS JOIN system_info si;
-- 2. Installed applications (macOS apps)
SELECT
si.computer_name,
a.name,
a.bundle_short_version,
a.bundle_version,
a.path
FROM apps a
CROSS JOIN system_info si
WHERE a.name IN ('Safari', 'Google Chrome', 'Firefox', 'Microsoft Edge', 'Adobe Acrobat Reader', 'Zoom', 'Slack');
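-- 3. Gatekeeper and SIP status
-- gatekeeper has no key/value columns; assessments_enabled carries the state.
-- sip_config returns one row per flag; the 'sip' row is the overall status.
SELECT
  si.computer_name,
  'Gatekeeper' AS check_name,
  CASE WHEN g.assessments_enabled = 1 THEN 'ENABLED' ELSE 'DISABLED' END AS value
FROM gatekeeper g
CROSS JOIN system_info si
UNION ALL
SELECT
  si.computer_name,
  'SIP' AS check_name,
  CASE WHEN sip.enabled = 1 THEN 'ENABLED' ELSE 'DISABLED' END AS value
FROM sip_config sip
CROSS JOIN system_info si
WHERE sip.config_flag = 'sip';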
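-- 4. FileVault encryption status
-- FileVault state is exposed through the disk_encryption table
SELECT
  si.computer_name,
  de.name,
  de.user_uuid,
  de.filevault_status,
  CASE WHEN de.filevault_status = 'on' THEN 'ENCRYPTED' ELSE 'UNENCRYPTED' END AS encryption_status
FROM disk_encryption de
CROSS JOIN system_info si
WHERE de.user_uuid != '';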
```
---
## Building the Custom TVM Platform on osquery + FleetDM
### Step 1: Deploy FleetDM (1 day)
```bash
# Option A: Docker Compose (fastest for proof of concept)
git clone https://github.com/fleetdm/fleet.git
cd fleet/tools/osquery
docker-compose up -d
# Option B: Binary deployment for production
curl -L https://github.com/fleetdm/fleet/releases/latest/download/fleet.zip -o fleet.zip
unzip fleet.zip
./fleet prepare db
./fleet serve
```
### Step 2: Enroll Endpoints (1 day)
Generate an enrollment secret in FleetDM, then deploy osquery with FleetDM configuration:
```bash
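# Windows (via Intune, GPO, or script)
# Install osquery MSI, write the enroll secret to a file, then set FleetDM flags
# (osquery reads the secret from a file: --enroll_secret_path; the server flag is --tls_hostname)
osqueryd.exe --enroll_secret_path="C:\Program Files\osquery\enroll_secret" --tls_hostname=fleet.yourcompany.com:443
# Linux (via package manager + config)
apt install osquery
# Edit /etc/osquery/osquery.flags:
# --enroll_secret_path=/etc/osquery/enroll_secret
# --tls_hostname=fleet.yourcompany.com:443
# plus the TLS plugin flags from the FleetDM enrollment instructions
# (e.g. --config_plugin=tls, --logger_plugin=tls)
systemctl enable osqueryd && systemctl start osqueryd
# macOS (via MDM or script)
brew install --cask osquery
# Similar flag configuration
launchctl load /Library/LaunchDaemons/io.osquery.agent.plist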
```
### Step 3: Define Policies (Compliance Checks)
FleetDM policies are scheduled queries that evaluate to PASS or FAIL. A host passes when the query returns at least one row and fails when it returns none:
```sql
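-- Policy: Disk encryption enabled (Windows)
SELECT 1 FROM bitlocker_info WHERE drive_letter = 'C:' AND protection_status = 1;
-- Policy: macOS FileVault enabled
SELECT 1 FROM disk_encryption WHERE user_uuid != '' AND filevault_status = 'on' LIMIT 1;
-- Policy: No password authentication on SSH (Linux)
SELECT 1 FROM augeas WHERE path = '/etc/ssh/sshd_config' AND label = 'PasswordAuthentication' AND value = 'no';
-- Policy: No root login via SSH (Linux)
SELECT 1 FROM augeas WHERE path = '/etc/ssh/sshd_config' AND label = 'PermitRootLogin' AND value = 'no';
-- Policy: Windows Firewall enabled on all profiles (Domain, Standard, Public)
SELECT 1 FROM registry WHERE key LIKE 'HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\%Profile' AND name = 'EnableFirewall' AND data = '1' GROUP BY name HAVING COUNT(*) = 3;
-- Policy: OS updates applied within 30 days (patches.installed_on is text, so use the update history timestamps)
SELECT 1 FROM windows_update_history WHERE date > (strftime('%s', 'now') - 30 * 86400) LIMIT 1;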
```
**Dashboard output**: FleetDM shows percentage compliance per policy across all enrolled hosts.
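The same percentage can be recomputed outside the dashboard for board reporting. A minimal sketch, assuming each policy record carries a `name` plus `passing_host_count` and `failing_host_count` fields as retrieved from the FleetDM API; treat those field names as an assumption and check them against the API version in use.

```python
def compliance_report(policies):
    """Return (policy_name, percent_passing) sorted with the weakest policy first."""
    report = []
    for policy in policies:
        passing = policy["passing_host_count"]
        failing = policy["failing_host_count"]
        total = passing + failing
        # A policy with no results yet has no meaningful percentage
        percent = round(100.0 * passing / total, 1) if total else None
        report.append((policy["name"], percent))
    # Policies without data sort first so they are not mistaken for compliant
    return sorted(report, key=lambda item: -1.0 if item[1] is None else item[1])


if __name__ == "__main__":
    sample = [
        {"name": "Disk encryption enabled", "passing_host_count": 3, "failing_host_count": 1},
        {"name": "No root login via SSH", "passing_host_count": 1, "failing_host_count": 2},
        {"name": "Firewall enabled", "passing_host_count": 0, "failing_host_count": 0},
    ]
    for name, percent in compliance_report(sample):
        print(name, "no data" if percent is None else f"{percent}%")
```

Sorting weakest-first puts the policies that need attention at the top of the report.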
### Step 4: Vulnerability Correlation (The Custom Layer)
FleetDM's free tier includes basic CVE mapping for installed software. For advanced correlation, build a custom pipeline:
```python
# vuln-correlator.py
# Runs nightly: pulls FleetDM software inventory, correlates with CVE database
import json
import sqlite3
from datetime import datetime

import requests

# 1. Pull software inventory from FleetDM API
FLEET_API = "https://fleet.yourcompany.com/api/v1/fleet"
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}
TIMEOUT = 30  # seconds; keeps the nightly job from hanging on a stalled request


def fleet_get(path):
    """GET a FleetDM API path and return the decoded JSON body."""
    response = requests.get(f"{FLEET_API}{path}", headers=HEADERS, timeout=TIMEOUT)
    response.raise_for_status()
    return response.json()


hosts = fleet_get("/hosts")["hosts"]

# 2. Connect to local CVE database (NVD dump or vulners)
conn = sqlite3.connect("cve-db.sqlite")
cursor = conn.cursor()
findings = []
for host in hosts:
    host_id = host["id"]
    host_name = host["hostname"]
    # Get installed software (the response shape varies across FleetDM
    # versions; adjust the keys below to match the deployed API)
    software = fleet_get(f"/hosts/{host_id}/software")["software"]
    for app in software:
        name = app["name"]
        version = app.get("version")
        if not version:
            continue  # nothing to correlate without a version string
        # Query CVE database for this software+version
        # (substring matching on affected_versions is coarse; expect false positives)
        cursor.execute(
            """
            SELECT cve_id, severity, description
            FROM cves
            WHERE software_name = ? AND affected_versions LIKE ?
            """,
            (name, f"%{version}%"),
        )
        for cve_id, severity, description in cursor.fetchall():
            findings.append({
                "host": host_name,
                "software": name,
                "version": version,
                "cve": cve_id,
                "severity": severity,
                "description": (description or "")[:200],
            })
conn.close()

# 3. Generate report
with open(f"vuln-report-{datetime.now().strftime('%Y%m%d')}.json", "w") as f:
    json.dump(findings, f, indent=2)

# 4. Push critical findings to SIEM or Slack
# (Integration code here)
```
### Step 5: AI-Assisted Prioritization
Feed osquery/FleetDM data into the AI TVM prioritization engine:
```
[FleetDM Software Inventory] ──▶ [CVE Correlator] ──▶ [AI Prioritization]
[FleetDM Policy Failures] ──▶ [Risk Scoring] ──▶ [AI Prioritization]
[osquery Listening Ports] ──▶ [Exposure Analysis] ──▶ [AI Prioritization]
[osquery OS Version] ──▶ [EOL Detection] ──▶ [AI Prioritization]
──▶ [Executive Brief]
```
The AI receives structured, queryable data from osquery—not proprietary scan reports. This means:
- You can ask the AI: "Which hosts have both Adobe Reader and an open RDP port?"
- You can ask the AI: "Show me all Linux servers running kernel versions with known CVEs"
- You can ask the AI: "What changed in our software inventory since last week?"
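
Questions like the first one reduce to a join over tabular data, which is why the osquery format suits this use. A minimal sketch using an in-memory SQLite database; the table layout and sample rows are hypothetical stand-ins for exported inventory, not Fleet's own schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE software (host TEXT, name TEXT, version TEXT);
    CREATE TABLE listening_ports (host TEXT, port INTEGER);
""")
# Hypothetical sample inventory
conn.executemany("INSERT INTO software VALUES (?, ?, ?)", [
    ("ws-01", "Adobe Acrobat Reader", "23.1"),
    ("ws-02", "Adobe Acrobat Reader", "23.1"),
    ("srv-01", "nginx", "1.24"),
])
conn.executemany("INSERT INTO listening_ports VALUES (?, ?)", [
    ("ws-01", 3389),
    ("srv-01", 443),
    ("srv-01", 3389),
])


def hosts_with_software_and_port(db, name_pattern, port):
    """Return hosts that have matching software installed AND the port open."""
    rows = db.execute("""
        SELECT DISTINCT s.host
        FROM software AS s
        JOIN listening_ports AS p ON p.host = s.host
        WHERE s.name LIKE ? AND p.port = ?
        ORDER BY s.host
    """, (name_pattern, port)).fetchall()
    return [row[0] for row in rows]


print(hosts_with_software_and_port(conn, "%Adobe%Reader%", 3389))
```

An AI assistant given the same tables can generate and run this kind of query itself; the sketch only shows that the answer is a deterministic lookup and not an inference.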
---
## The Consultant's Delivery Model
### Engagement 1: Osquery Discovery Sprint (5 days)
| Day | Activity | Deliverable |
|-----|----------|-------------|
| 1 | Deploy FleetDM proof-of-concept | Operational FleetDM instance |
| 2 | Enroll 10-20 representative hosts | Live endpoint data flowing |
| 3 | Run vulnerability discovery query packs | Raw findings exported |
| 4 | Build custom queries for client's specific concerns | Client-specific query library |
| 5 | Present findings + propose scaled deployment | Board-ready report; deployment roadmap |
**Investment**: €3,500-€5,500 (labor only; software is free)
**Standalone value**: Complete asset and vulnerability inventory of representative estate
### Engagement 2: Custom Platform Build (30 days)
- Scale FleetDM to full estate
- Build custom query library for client's specific compliance and security needs
- Integrate CVE correlation pipeline
- Build executive dashboards
- Train internal team on query authoring
- Hand over operational control
### Engagement 3: Continuous Improvement Retainer
- Monthly: New CVE correlation rules, query tuning, policy updates
- Quarterly: Purple team exercise using osquery data for detection validation
- Annually: Platform architecture review, query library refresh
---
## When Osquery Is the Right Choice
| Scenario | Recommendation |
|----------|---------------|
| Client has 50-5,000 endpoints, no existing scanner | **osquery + FleetDM is ideal.** Cheaper, more flexible, and sovereign. |
| Client has 5,000-50,000 endpoints, heterogeneous | **osquery + FleetDM can scale.** Consider premium tier or multi-node deployment. |
| Client needs compliance audit trails (PCI, SOC 2) | **Supplement with commercial scanner.** Auditors prefer vendor-validated reports. osquery provides operational intelligence; commercial scanner provides audit evidence. |
| Client has heavy OT/ICS environment | **osquery for IT endpoints; specialized scanner for OT.** osquery does not speak Modbus or OPC-UA. |
| Client wants "set and forget" | **Commercial scanner may be better.** osquery requires ongoing query authoring and maintenance. |
---
## Talking Points for the CTO
**When they say**: *"We are considering Tenable but it is expensive."*
**You respond**:
> *"Tenable is excellent at what it does. But it is a rented microscope with a fixed lens. Osquery is a laboratory. For the cost of one Tenable subscription, you can build a sovereign vulnerability discovery platform that answers questions Tenable never thought to ask. Let us run a 5-day proof of concept. If osquery does not find actionable vulnerabilities in your environment, you have the evidence to justify Tenable. If it does, you have a cheaper, more flexible alternative that you own outright."*
**When they say**: *"We do not have the expertise to write SQL queries."*
**You respond**:
> *"You do not need to write them from scratch. The osquery community has published thousands of battle-tested queries. FleetDM includes hundreds of pre-built policies. We start with those, customize them for your environment, and train your team to extend them. The expertise grows with the platform."*
**When they say**: *"Our SIEM already collects endpoint data."*
**You respond**:
> *"Your SIEM collects logs. Logs are what the system chose to record. Osquery queries are what you choose to ask. A log might tell you a process started. An osquery query can tell you every process with a network connection, its parent process, its binary hash, and whether that hash matches a known good baseline. The difference is interrogation versus observation."*
---
## Integration With Existing Frameworks
| Document | Integration |
|----------|-------------|
| [Zero-Budget Vulnerability Discovery](zero-budget-vulnerability-discovery.md) | osquery is the most powerful zero-budget discovery method; it replaces or supplements PowerShell/SSH scripts |
| [AI-Assisted TVM Blueprint](ai-assisted-tvm.md) | osquery provides the structured data feed for AI prioritization; it is the discovery layer of the AI TVM architecture |
| [Perimeter Scanning Capability](perimeter-scanning-capability.md) | osquery covers internal endpoints; perimeter scanning covers external attack surface; together they provide complete visibility |
| [Modular Engagements](../core/modular-engagements.md) | osquery sprint can be delivered as a standalone 5-day module or as the foundation of a larger TVM engagement |
| [Business Case Template](business-case-template.md) | osquery + FleetDM costs vs. commercial scanner costs |
---
*For script-based discovery without agents, see [Zero-Budget Vulnerability Discovery](zero-budget-vulnerability-discovery.md).*
*For the AI prioritization layer, see [AI-Assisted TVM Blueprint](ai-assisted-tvm.md).*
*For external attack surface scanning, see [Perimeter Scanning Capability](perimeter-scanning-capability.md).*

@@ -0,0 +1,344 @@
# Perimeter Scanning Capability: Build, Partner, or Hybrid?
> *"You cannot prioritize what you cannot see. And your internal vulnerability scanner will never tell you what the internet sees."*
This document provides a strategic framework for building external attack surface visibility—the "outside-in" perspective that reveals what adversaries (and AI-powered scanners like Mythos) see when they look at your organization from the public internet.
It addresses the build-vs-partner decision for perimeter scanning and maps external findings into the AI-assisted TVM prioritization engine.
---
## Why External Scanning Is Non-Negotiable
### The Asymmetry Problem
An adversary attacking your organization starts from the outside. They see:
- Your public IP ranges
- Your exposed services and ports
- Your forgotten cloud storage buckets
- Your expired certificates
- Your development sites still publicly accessible
- Your subsidiary domains you forgot you owned
Your internal vulnerability scanner sees none of this. It scans from the inside, authenticated, with full knowledge of the network. The adversary scans from the outside, unauthenticated, with zero prior knowledge.
**The board framing**:
> *"Your internal scanner says your web servers are patched. But from the internet, we can see three development instances running outdated Apache versions on ports you did not know were exposed. The internal scanner is blind to what the adversary sees first. External scanning closes that blindness."*
### The Mythos-Specific Risk
AI-powered scanning agents do not sleep, do not get bored, and do not miss open ports because they were in a hurry. They:
- Scan entire IPv4 space continuously
- Correlate services with CVE databases in real time
- Chain findings: open port + service version + known exploit = instant target
- Discover forgotten assets faster than human reconnaissance teams
If you are not scanning your perimeter at least as aggressively as your adversaries, you are relying on luck.
---
## The Three Models
| Model | Description | Investment | Timeline | Best For |
|-------|-------------|-----------|----------|----------|
| **Build (Open-Source)** | Self-hosted Nuclei, OpenVAS, Amass on cheap VPS infrastructure | Low (€200-500/month infrastructure) | 1-2 weeks to operational | Tech-savvy teams; consultants who want independence; proof-of-concept |
| **Partner (Commercial)** | Shodan Enterprise, Censys, Tenable.asm, Cortex Xpanse, Mandiant ASM | Medium to high (€10K-€100K/year) | Immediate (SaaS) | Organizations needing continuous monitoring, compliance evidence, or limited internal expertise |
| **Hybrid** | Open-source stack for active scanning + commercial platform for passive discovery and trends | Medium (€15K-€30K/year) | 2-4 weeks | Most organizations; balances cost, capability, and coverage |
---
## Model 1: Build (Open-Source Stack)
### The Consultant's Scanning Infrastructure
For a consulting practice, owning your own scanning capability provides independence, speed, and a differentiator.
**Infrastructure**: 2-3 cheap VPS instances (Hetzner, DigitalOcean, Vultr)
- €5-10/month per instance
- Distributed across geographies (EU, US, Asia) to simulate global adversary perspective
- Containerized scanning workloads
**Core Stack**:
| Tool | Purpose | Why It Matters |
|------|---------|---------------|
| **Amass** | DNS enumeration, subdomain discovery, asset mapping | Finds forgotten domains, dev sites, acquisitions |
| **Naabu** | Fast port scanning | Identifies exposed services beyond standard ports |
| **httpx** | Web service fingerprinting | Identifies technologies, versions, and potential vulnerabilities |
| **Nuclei** | Vulnerability detection (10,000+ templates) | Specific CVE detection, misconfiguration checks, exposed panels |
| **Subfinder** | Passive subdomain discovery | Leverages certificate transparency, search engines, archives |
| **Katana** | Web crawler | Discovers hidden endpoints, API paths, exposed files |
| **Gau** (GetAllUrls) | URL enumeration from archives | Finds old URLs that might still resolve to live services |
| **OpenVAS / Greenbone** | Full vulnerability scanning | Deep inspection of discovered services |
| **Nmap + NSE scripts** | Service detection and vulnerability checks | Reliable, comprehensive, scriptable |
### Deployment Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ SCANNING CONTROLLER │
│ (Scheduling, results aggregation, report generation) │
│ - Cron jobs or Jenkins/GitHub Actions │
│ - SQLite/PostgreSQL for results storage │
│ - Python/PowerShell for report generation │
└────────────────────┬────────────────────────────────────────┘
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ EU VPS │ │ US VPS │ │ ASIA VPS│
│ Amass │ │ Amass │ │ Amass │
│ Nuclei │ │ Nuclei │ │ Nuclei │
│ Naabu │ │ Naabu │ │ Naabu │
└─────────┘ └─────────┘ └─────────┘
```
### The First External Scan Protocol
```bash
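# 1. PASSIVE RECONNAISSANCE (no packets sent to target)
# -passive limits amass to third-party data sources (flag availability varies by amass version)
amass enum -passive -d client-domain.com -o amass-results.txt
subfinder -d client-domain.com -o subfinder-results.txt
# Merge and deduplicate
cat amass-results.txt subfinder-results.txt | sort -u > all-domains.txt
# 2. DISCOVER LIVE SERVICES
# Plain URL list for nuclei; fingerprint details go to a separate file
cat all-domains.txt | httpx -silent -o live-web-services.txt
httpx -l live-web-services.txt -silent -tech-detect -status-code -o web-fingerprints.txt
cat all-domains.txt | naabu -p - -o live-ports.txt
# 3. VULNERABILITY DETECTION
nuclei -list live-web-services.txt -severity critical,high -o nuclei-findings.txt
# 4. DEEP INSPECTION (for high-value targets)
# naabu writes host:port lines; nmap -iL expects one host per line
cut -d: -f1 live-ports.txt | sort -u > live-ports-targets.txt
# -O (OS detection) requires root privileges
sudo nmap -sV -O --script "default,vuln" -iL live-ports-targets.txt -oA deep-scan
# 5. REPORT GENERATION
# Aggregate, deduplicate, prioritize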
```
**What this produces in 4 hours**:
- Complete subdomain map
- All live web services with technology fingerprinting
- Critical and high-severity vulnerability findings
- Exposed development sites, admin panels, default credentials
- Certificate expiration warnings
- Geographic distribution of exposed services
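
Step 5 of the protocol (aggregate, deduplicate, prioritize) can be sketched as follows. The record layout (`host`, `finding`, `severity`) is a hypothetical normalized form, and the sample findings are invented; each tool's real output needs its own parser in front of this:

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


def aggregate_findings(*sources):
    """Merge findings from several tools, drop duplicates, sort by severity."""
    seen = set()
    merged = []
    for source in sources:
        for finding in source:
            # Case-insensitive key: tools disagree on hostname and ID casing
            key = (finding["host"].lower(), finding["finding"].lower())
            if key in seen:
                continue
            seen.add(key)
            merged.append(finding)
    return sorted(merged, key=lambda f: (SEVERITY_RANK.get(f["severity"], 99), f["host"]))


# Hypothetical normalized output from two scanners
nuclei_findings = [
    {"host": "dev.example.com", "finding": "exposed-admin-panel", "severity": "high"},
    {"host": "old.example.com", "finding": "CVE-2021-41773", "severity": "critical"},
]
nmap_findings = [
    {"host": "OLD.example.com", "finding": "cve-2021-41773", "severity": "critical"},
    {"host": "mail.example.com", "finding": "expired-certificate", "severity": "medium"},
]

report = aggregate_findings(nuclei_findings, nmap_findings)
for item in report:
    print(f"{item['severity']:<9} {item['host']:<20} {item['finding']}")
```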
### Limitations of the Build Model
| Limitation | Impact | Mitigation |
|-----------|--------|------------|
| IP blocking by CDNs / WAFs | Incomplete scan results | Rotate source IPs; use multiple VPS locations; respect rate limits |
| Legal exposure | Scanning without explicit authorization is illegal in many jurisdictions | Always have written authorization; define scope strictly; exclude third-party infrastructure |
| Maintenance burden | Tools require updates; templates require refresh | Automated CI/CD pipeline for tool updates; weekly template sync for Nuclei |
| No historical trending | Point-in-time snapshots only | Store results in database; generate trend reports quarterly |
| Limited cloud asset discovery | Cannot see inside AWS/Azure/GCP without API access | Supplement with cloud-native discovery (see zero-budget discovery) |
---
## Model 2: Partner (Commercial Platforms)
### Shodan / Censys (Passive Discovery)
**What they do**: Continuously scan the entire internet and index services, certificates, devices, and vulnerabilities. You query their database instead of scanning yourself.
**Best for**:
- Discovering forgotten assets ("We did not know we had a server in that IP range")
- Historical tracking ("When did this service first appear on the internet?")
- IoT/OT device discovery (Shodan specializes in industrial control systems)
- Certificate transparency monitoring (detects unauthorized certificates)
**Pricing**:
- Shodan API: ~$60/month for developer; Enterprise starts at ~$10K/year
- Censys: Similar pricing tiers
**The consultant use case**: Even without enterprise licensing, API credits allow you to query client IP ranges during assessments. The output is professional-grade and defensible.
### Tenable Attack Surface Management (Tenable.asm)
**What it does**: Continuous attack surface monitoring combining external scanning, cloud API integration, and business context.
**Best for**:
- Clients who need compliance-ready external scanning
- Continuous monitoring (not point-in-time)
- Integration with Tenable.io / Tenable.sc for unified internal + external view
**Pricing**: ~€15K-€50K/year depending on asset count
### Cortex Xpanse (Palo Alto Networks)
**What it does**: Enterprise-grade attack surface management with threat intelligence integration.
**Best for**:
- Large enterprises with complex M&A history
- Organizations needing integration with Palo Alto firewalls and Prisma Cloud
- High-frequency M&A environments where attack surface changes constantly
### Mandiant Attack Surface Management (Google Cloud)
**What it does**: Combines attack surface monitoring with Mandiant threat intelligence.
**Best for**:
- Organizations facing advanced persistent threats
- Clients who want attack surface data correlated with APT TTPs
---
## Model 3: Hybrid (Recommended for Most Clients)
The hybrid model combines the strengths of both approaches:
### The Consultant's Hybrid Stack
| Function | Tool | Model |
|----------|------|-------|
| **Continuous passive discovery** | Shodan API + Censys API | Partner (€500-€1K/month) |
| **Active vulnerability scanning** | Nuclei + OpenVAS on consultant VPS | Build (€200/month) |
| **Deep penetration testing** | Nmap + custom scripts + manual validation | Build (labor) |
| **Cloud asset correlation** | AWS/Azure/GCP APIs + native security tools | Build (free APIs) |
| **Historical trending and reporting** | Self-hosted database + Grafana + AI synthesis | Build (€50/month) |
| **Compliance validation** | Tenable.asm or Qualys WAS (optional) | Partner (if required) |
### Why Hybrid Wins
1. **Cost efficiency**: Passive discovery via APIs is cheap. Active scanning is cheaper self-hosted than commercial.
2. **Coverage**: APIs find things your scanners miss (historical data, third-party mentions). Active scanning validates exploitability.
3. **Independence**: You are not locked into a single vendor. If Shodan raises prices, you can shift to Censys or increase active scanning.
4. **Credibility**: Having your own infrastructure demonstrates technical competence. Clients trust consultants who own their tools.
---
## The Perimeter-to-TVM Integration
External scanning findings must feed into the vulnerability prioritization engine. Here is how:
### The Outside-In Risk Multiplier
A vulnerability on an **internet-facing** system is far more dangerous than the same vulnerability on an internal workstation, because an adversary can reach it without first gaining a foothold. The AI-assisted TVM engine weights findings accordingly:
| Finding Location | Risk Multiplier | Why |
|-----------------|-----------------|-----|
| Internet-facing, no WAF | 10x | Direct adversary access; no defence in depth |
| Internet-facing, behind WAF | 5x | WAF bypass possible; still directly reachable |
| DMZ, reachable from internet | 4x | Compromise enables lateral movement |
| Internal, privileged access | 3x | High impact if compromised; requires initial access first |
| Internal, standard user | 1x | Baseline risk |
| Air-gapped / OT | 2x-5x | Isolation is protection, but compromise is catastrophic |
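
Applied to a base severity score, the multiplier table works roughly as below. The use of a CVSS-style base score and the midpoint 3.5 for the 2x-5x air-gapped range are assumptions made for illustration; the table itself fixes neither:

```python
# Multipliers from the table above; dictionary keys are illustrative labels
RISK_MULTIPLIER = {
    "internet_no_waf": 10.0,
    "internet_behind_waf": 5.0,
    "dmz": 4.0,
    "internal_privileged": 3.0,
    "internal_standard": 1.0,
    "air_gapped_ot": 3.5,  # assumption: midpoint of the 2x-5x range
}


def weighted_risk(base_score: float, location: str) -> float:
    """Scale a base severity score by the outside-in multiplier for its location."""
    return base_score * RISK_MULTIPLIER[location]


# A medium finding on the open internet outranks a high finding on a workstation
print(weighted_risk(5.0, "internet_no_waf"))
print(weighted_risk(7.5, "internal_standard"))
```

The point of the example is the reordering: exposure, not raw severity, decides which finding is remediated first.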
### The Integration Pipeline
```
[External Scan Results]
├─ Nuclei findings (CVEs, misconfigs)
├─ Shodan/Censys exposed services
├─ Certificate issues
└─ Cloud-exposed storage/assets
[Correlation Engine]
├─ Map external finding to internal asset (IP → hostname → owner)
├─ Cross-reference with internal vulnerability scan
├─ Check compensating controls (WAF, CDN, rate limiting)
└─ Apply outside-in risk multiplier
[AI Prioritization]
├─ Exploitability prediction
├─ Threat intelligence correlation
├─ Business impact assessment
└─ Generate ranked remediation list
[Executive Dashboard]
├─ "Top 10 internet-facing risks"
├─ "Attack surface trend: growing or shrinking?"
├─ "Mean time to remediate externally exposed vulnerability"
└─ Board-ready brief
```
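
The correlation step can be sketched as a lookup from external finding to internal asset record. All field names and inventory contents here are hypothetical; in practice the inventory would come from the CMDB or from Fleet. Treating an unmapped IP as unprotected is also an assumption, chosen so that unknown assets are never under-weighted:

```python
# Hypothetical inventory keyed by public IP (documentation address range)
ASSET_INVENTORY = {
    "203.0.113.10": {"hostname": "web-prod-01", "owner": "E-Commerce", "behind_waf": True},
    "203.0.113.25": {"hostname": "dev-legacy-03", "owner": "Engineering", "behind_waf": False},
}


def correlate(finding: dict, inventory: dict) -> dict:
    """Attach hostname, owner, and an outside-in multiplier to an external finding."""
    asset = inventory.get(finding["ip"])
    if asset is None:
        # Assumption: an unmapped asset is treated as unprotected until identified
        return {**finding, "hostname": None, "owner": "UNKNOWN", "multiplier": 10}
    multiplier = 5 if asset["behind_waf"] else 10
    return {
        **finding,
        "hostname": asset["hostname"],
        "owner": asset["owner"],
        "multiplier": multiplier,
    }


external = [
    {"ip": "203.0.113.10", "issue": "outdated TLS configuration"},
    {"ip": "203.0.113.25", "issue": "exposed admin panel"},
    {"ip": "203.0.113.99", "issue": "open RDP"},
]
for item in external:
    print(correlate(item, ASSET_INVENTORY))
```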
### The Weekly Cadence
| Day | Activity | Source |
|-----|----------|--------|
| Monday | Review Shodan/Censys alerts for new exposed services | Passive APIs |
| Tuesday | Run targeted Nuclei scan on new/changed assets | Active scanning |
| Wednesday | Correlate external findings with internal vulnerability data | Integration engine |
| Thursday | Generate AI-prioritized action list | AI TVM engine |
| Friday | Executive brief: "What changed on our perimeter this week?" | Automated synthesis |
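
The Friday question reduces to a set difference between two snapshots. A minimal sketch, assuming each snapshot is a set of `(host, port)` pairs; this is a simplification, since real snapshots would also carry service and version data:

```python
def perimeter_diff(last_week: set, this_week: set) -> dict:
    """Compare two exposure snapshots and report what appeared and what closed."""
    return {
        "new": sorted(this_week - last_week),
        "closed": sorted(last_week - this_week),
    }


# Hypothetical snapshots
last_week = {("www.example.com", 443), ("vpn.example.com", 443), ("dev.example.com", 8080)}
this_week = {("www.example.com", 443), ("vpn.example.com", 443), ("test.example.com", 3389)}

changes = perimeter_diff(last_week, this_week)
print("New exposures:", changes["new"])
print("Closed exposures:", changes["closed"])
```

Storing each weekly snapshot also addresses the "no historical trending" limitation of the build model.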
---
## The Board Conversation
**When the CTO asks**: *"Do we need to pay for external scanning? Can't we just use our internal scanner?"*
**You respond**:
> *"Your internal scanner sees what is inside your walls. Mythos—and every criminal scanner on the internet—sees what is outside your walls. In our first external scan of your perimeter, we found [X] services exposed to the internet that your internal team did not know existed. [Y] of them have known vulnerabilities. [Z] of them are running end-of-life software. The internal scanner will never find these because they are outside its scope. External scanning is not optional. It is the perspective your adversary already has."*
**When the CFO asks**: *"How much does this cost?"*
**You respond**:
> *"A hybrid approach—combining API-based passive monitoring with active open-source scanning—costs approximately €1,000-€2,000 per month. That is less than the cost of one incident response retainer day. And it provides continuous visibility, not a quarterly snapshot. If you need compliance-ready reports, we can add Tenable or Qualys later. But the baseline visibility is achievable now, at low cost."*
---
## Legal and Ethical Considerations
### Authorization Is Mandatory
Never scan a client's infrastructure—or any infrastructure—without explicit, written authorization.
**The authorization letter must specify**:
- Exact IP ranges, domains, and cloud accounts in scope
- Excluded systems (production payment gateways, safety-critical OT, third-party services)
- Scanning intensity and timing restrictions
- Emergency contact for scan-related incidents
- Data handling: where scan results are stored, who has access, retention period
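
The scope and exclusion lists from the authorization letter can be enforced in tooling before any packet is sent. A minimal sketch using the standard-library `ipaddress` module; the ranges are documentation addresses chosen for illustration, and the function name is hypothetical:

```python
import ipaddress


def in_scope(target: str, allowed: list, excluded: list) -> bool:
    """True only if the target is inside an allowed range and outside every exclusion."""
    addr = ipaddress.ip_address(target)
    if any(addr in ipaddress.ip_network(net) for net in excluded):
        return False  # exclusions always win over inclusions
    return any(addr in ipaddress.ip_network(net) for net in allowed)


# Hypothetical scope taken from an authorization letter
ALLOWED = ["198.51.100.0/24"]
EXCLUDED = ["198.51.100.200/29"]  # e.g. a payment gateway segment

for candidate in ["198.51.100.15", "198.51.100.203", "192.0.2.1"]:
    print(candidate, "in scope" if in_scope(candidate, ALLOWED, EXCLUDED) else "OUT OF SCOPE")
```

Checking exclusions first means a mistake in the allowed list cannot pull an excluded system back into scope.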
### Rate Limiting and Resilience
- Respect `robots.txt` on web services (though adversaries do not)
- Limit concurrent connections to avoid service disruption
- Scan during maintenance windows for critical services
- Have an immediate stop mechanism if unexpected impact occurs
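
Pacing and the stop mechanism can be combined in the loop that drives the scanner. A minimal sketch, assuming a stop is signalled by creating a file; the file name, the delay, and the `scan_one` callback are all illustrative choices, not features of any particular tool:

```python
import os
import time


def run_paced(targets, scan_one, delay_seconds=1.0, stop_file="STOP_SCAN"):
    """Scan targets one at a time, pausing between them; halt if stop_file appears."""
    completed = []
    for target in targets:
        if os.path.exists(stop_file):
            break  # operator or client requested an immediate stop
        scan_one(target)
        completed.append(target)
        time.sleep(delay_seconds)
    return completed


# Demo with a stand-in scan function and no delay
scanned = run_paced(["a.example.com", "b.example.com"], scan_one=print, delay_seconds=0)
```

The stop check runs before each target, so the worst-case reaction time is one scan plus one delay interval.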
### Data Sovereignty
- Scan results may contain sensitive data (service versions, internal hostnames, certificate details)
- Store results in the client's jurisdiction
- Encrypt at rest and in transit
- Delete results after the engagement unless contract specifies retention
---
## The Consultant's Advantage
Owning perimeter scanning capability provides three competitive advantages:
1. **Speed**: You can deliver external attack surface findings in 24 hours, not the 2-week procurement cycle of commercial platforms.
2. **Differentiation**: Most M365 consultants do not offer attack surface management. You do.
3. **Entry vector**: External scanning often reveals the most compelling findings—exposed admin panels, outdated services, forgotten acquisitions. These findings naturally lead to broader engagement.
**The pitch**:
> *"Before we discuss your M365 security or endpoint management, let us scan your perimeter. In 24 hours, we will show you what the internet sees. I suspect we will find something that changes your prioritization. If we do not, the scan was free. If we do, we have the evidence to justify the security investments you have been considering."*
---
## Integration With Existing Frameworks
| Document | Integration |
|----------|-------------|
| [AI-Assisted TVM Blueprint](ai-assisted-tvm.md) | Perimeter findings feed the AI prioritization engine with outside-in risk weighting |
| [Zero-Budget Vulnerability Discovery](zero-budget-vulnerability-discovery.md) | Internal discovery (scripts + osquery) + external scanning = complete visibility |
| [Business Case Template](business-case-template.md) | Perimeter scanning costs (€1K-€2K/month hybrid) vs. incident response costs |
| [Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md) | osquery covers internal endpoint visibility; perimeter scanning covers external attack surface; together they provide complete visibility |
| [Modular Engagements](../core/modular-engagements.md) | Perimeter scan can be delivered as a standalone 2-3 day module or included in AI TVM |
---
*For internal vulnerability discovery without commercial tools, see [Zero-Budget Vulnerability Discovery](zero-budget-vulnerability-discovery.md).*
*For the sovereign endpoint discovery platform, see [Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md).*
*For the AI-assisted prioritization layer, see [AI-Assisted TVM Blueprint](ai-assisted-tvm.md).*

@@ -0,0 +1,322 @@
# Rapid Modernisation Plan
> *"We must change our strategy from 'detect the attacker in time' to 'become the target that is not worth attacking.' Reactive mode is unsustainable. We must ensure the game is played on our field."*
## For the Executive Reader
This is not a three-year digital transformation. It is a **180-day strategic reset** with measurable business outcomes at each phase gate.
| Phase | Timeline | What the Board Sees |
|-------|----------|---------------------|
| **Hygiene** | Days 0-30 | Visibility. For the first time, we know every identity, asset, and gap that could end the company. |
| **Control** | Days 30-60 | Containment. The highest-risk exposures are closed using tools already owned. |
| **Sovereignty** | Days 60-90 | Ownership. Proprietary intelligence is reclaimed. Recovery from disaster is proven, not assumed. |
| **Antifragility** | Days 90-180 | Advantage. The organization learns faster from disruption than competitors do. |
**Investment principle**: Configuration first. Procurement only if justified. Most value is extracted from existing tools before any new purchase is discussed.
**Governance**: Weekly steering committee. Monthly board update. Quarterly antifragility assessment. Hard go/no-go gates at days 30, 60, and 90.
**Modularity**: While this document presents the full 180-day program, every phase can be delivered as an independent, fixed-scope module. See [Modular Engagements](../core/modular-engagements.md) for the menu of standalone engagements.
*For the business case and financial justification, see [Business Case Template](business-case-template.md).*
*For board conversation guidance, see [C-Suite Conversation Guide](../core/c-suite-conversation-guide.md).*
---
## For the Practitioner
This playbook provides a **time-boxed, phase-gated roadmap** for transforming a fragile enterprise into an antifragile one. It is designed for immediate deployment in consulting engagements and can be adapted to organizational size, industry, and regulatory context.
The plan is structured in **four phases**: Hygiene (Days 0-30), Control (Days 30-60), Sovereignty (Days 60-90), and Antifragility (Days 90-180). Each phase builds on the previous. Skipping phases creates the illusion of progress while leaving structural fragility intact.
> **Core tenet**: Before any new purchase is discussed, exhaust the capabilities of existing tooling. See the [Zero-Budget Hardening Playbook](zero-budget-hardening.md) for the tactical expression of this principle.
---
## Phase 1: Hygiene (Days 0-30)
**Theme**: *You cannot defend what you cannot see.*
The first 30 days are aggressive, disruptive, and non-negotiable. The goal is not perfection; it is **visibility**. Every unknown identity, unmapped dependency, and unmonitored access path is a latent failure waiting to happen.
### Week 1-2: Identity and Access Blitz
**Tool strategy**: Use existing AD / Entra ID / IAM. No new purchases.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Aggressive identity audit | IAM / Security | Complete inventory of all human and non-human identities | ADUC, Entra ID portal, AWS IAM console |
| Disable all unknown / unused accounts | IAM | List of disabled accounts with business justification for exceptions | Existing IAM + PowerShell / CLI scripts |
| Rotate all critical passwords and shared secrets | Security Ops | Rotation log with verification | Existing IAM + LAPS (free from Microsoft) |
| Target: admin accounts, service accounts, krbtgt equivalents | AD / Cloud IAM | Documentation of every privileged account | Existing directory services |
| Implement password hygiene (minimum: audit) | IAM | Baseline report on password policy compliance | Native password policies + audit logs |
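
The "disable unknown / unused accounts" action usually starts from an export of account names and last-logon dates. A minimal sketch, assuming a CSV with `account` and `last_logon` (ISO date) columns; the column names, sample rows, and the 90-day threshold are illustrative assumptions:

```python
import csv
import io
from datetime import date, timedelta


def stale_accounts(csv_text: str, today: date, max_age_days: int = 90) -> list:
    """Return accounts that never logged on or last logged on before the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        last_logon = (row["last_logon"] or "").strip()
        if not last_logon or date.fromisoformat(last_logon) < cutoff:
            stale.append(row["account"])
    return stale


# Hypothetical export
EXPORT = "\n".join([
    "account,last_logon",
    "svc-backup,2023-01-15",
    "j.smith,2025-05-01",
    "old-admin,",
])

print(stale_accounts(EXPORT, today=date(2025, 6, 1)))
```

The output is a candidate list for review, not a disable list: service accounts that authenticate without interactive logon can look stale in some exports.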
### Week 2-3: Perimeter and Communication Mapping
**Tool strategy**: Use native firewall management, open-source scanners, and manual audit before purchasing new NDR/VM platforms.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Audit all vendor / supplier access paths | Security / Procurement | Inventory of VPN, RDP, Citrix, SSH, FTP, SCP, API keys | Existing IAM, VPN logs, firewall logs |
| Review and document firewall rules | Network Team | Rule set with business justification for each | Native firewall management interfaces |
| Map public-facing assets from external perspective | Security | Attack surface report with P0 classification | Free/open-source: Shodan, certificate transparency logs, nmap |
| Implement aggressive vulnerability scanning | Security | Weekly scan results with trending | Existing scanner, Microsoft Defender Vulnerability Management, or OpenVAS |
### Week 3-4: Visibility and Monitoring Baseline
**Tool strategy**: Maximize existing EDR/SIEM before considering new platforms. A spreadsheet CMDB is infinitely better than no CMDB.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Deploy endpoint detection on all managed devices | SOC / MDE | Coverage report: % of estate monitored | Existing EDR (Defender, CrowdStrike, SentinelOne) |
| Establish log aggregation for critical systems | Security | Centralized logging for T0 and T1 assets | Existing SIEM, syslog server, or cloud native logging (Sentinel, CloudWatch, Cloud Logging) |
| Create initial CMDB seed for critical systems | IT / Security | CMDB populated with crown jewels | Existing ITAM, ServiceNow, or spreadsheet |
| Document "kill chain": shortest path to organizational failure | Security Architect | Threat model and mitigation map | Manual analysis + stakeholder interviews |
### Phase 1 Exit Criteria
- [ ] 100% of identities known and validated
- [ ] 100% of privileged access reviewed
- [ ] All public-facing assets identified and scanned
- [ ] Centralized logging operational for critical systems
- [ ] CMDB seeded with T0/T1 assets
- [ ] Initial "kill chain" documented
### Phase 1 Mantra
> *"Do not be afraid to break things temporarily. Disable first, justify second. Visibility before permission."*
---
## Phase 2: Control (Days 30-60)
**Theme**: *What we have seen, we must now contain.*
With visibility established, the next 30 days focus on **closing the highest-risk gaps** without introducing operational paralysis. This is the phase of quick wins and surface reduction.
### Week 5-6: Attack Surface Reduction (ASR)
**Tool strategy**: ASR rules and PAWs are native Microsoft capabilities. For non-Microsoft environments, use existing endpoint management.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Eliminate shared accounts where possible | IAM | Reduction metric: % of shared accounts decommissioned | Existing IAM + access review process |
| Implement Attack Surface Reduction rules on endpoints | Endpoint Security | ASR policy deployed and compliance measured | Microsoft Defender ASR (already owned in E3/E5) |
| Harden admin access: dedicated PAWs, no browsing, no email | Security | PAW architecture documented and deployed | Existing Windows / Intune / GPO |
| Review and minimize permissions across all platforms | IAM / App Owners | Permission matrix with least-privilege gaps identified | Native IAM interfaces + scripts |
### Week 6-7: Network and DNS Security
**Tool strategy**: Use existing DNS infrastructure, firewall segmentation, and open-source sensors (Zeek/Suricata) before buying NDR.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Deploy DNS security (filtering, logging, anomaly detection) | Network | DNS security coverage report | Existing DNS infrastructure, Quad9/Cloudflare free tiers, Microsoft DNS security |
| Segment IT/OT networks where they intersect | Network / OT | Network segmentation diagram and policy | Existing firewalls and VLANs |
| Deploy network sensors at critical boundaries | SOC | Sensor coverage map with alerting validated | Zeek or Suricata (open-source) or existing IDS/IPS |
### Week 7-8: Multi-Factor Authentication and Conditional Access
**Tool strategy**: MFA and conditional access are native capabilities of Entra ID, Okta, and cloud IAM. No additional purchase required.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Enforce MFA on all remote access paths | IAM | MFA coverage: 100% of remote access | Entra ID, Okta, Duo, or native cloud IAM MFA |
| Implement conditional access policies | IAM / Cloud | Policy set: device compliance, location, risk score | Entra ID Conditional Access, AWS IAM, GCP IAM |
| Review and harden M365 / Google Workspace security | Cloud Team | Cloud security posture report | Microsoft Secure Score, Google Security Health Analytics |
### Phase 2 Exit Criteria
- [ ] Shared accounts reduced by minimum 50%
- [ ] ASR rules active on all managed endpoints
- [ ] MFA enforced on 100% of remote and privileged access
- [ ] DNS security operational
- [ ] Network segmentation policy defined and initial segments implemented
- [ ] Conditional access policies active for cloud workloads
### Phase 2 Mantra
> *"The goal is not to block everything. It is to ensure that every allowed path is known, justified, and monitored."*
---
## Phase 3: Sovereignty (Days 60-90)
**Theme**: *Reclaim what should never have been rented.*
This is where the antifragile approach diverges sharply from conventional hardening. The focus shifts from defending the perimeter to **owning the intelligence** that drives the organization.
### Week 9-10: AI Sovereignty Assessment
**Tool strategy**: Discovery requires interviews and proxy log analysis. No purchase needed for assessment.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Inventory all AI usage: approved and shadow | Security / AI Lead | AI usage map with data classification | Proxy logs, SaaS billing review, employee interviews |
| Classify AI workloads by sovereignty requirement | Security Architect | T0/T1/T2 AI asset classification | Existing data classification framework |
| Identify highest-value local AI pilot candidate | AI Lead / Business | Pilot scope document with success criteria | Business stakeholder interviews |
| Assess vendor AI terms: data usage, training, termination | Legal / Security | Risk register for each AI provider | Legal review of existing contracts |
### Week 10-11: Local AI Infrastructure Deployment
**Tool strategy**: Start with existing hardware or low-cost sovereign cloud. Use open-source inference servers (Ollama, vLLM, llama.cpp).
| Action | Owner | Deliverable | Existing / Low-Cost Tool Leverage |
|--------|-------|-------------|----------------------------------|
| Deploy local inference infrastructure (on-prem or sovereign cloud) | Infrastructure | Operational inference cluster | Underutilized servers, retired workstations, or sovereign cloud VM |
| Establish model versioning and artifact management | MLOps / Security | Model registry with provenance tracking | Git + DVC or simple artifact storage |
| Implement access controls for model weights and training data | Security | T0-class protection for AI assets | Existing file servers, encryption, IAM |
| Deploy initial pilot: RAG or fine-tuned model on proprietary data | AI Team | Working pilot with performance baseline | Ollama, llama.cpp, or vLLM (open-source) + quantized open models |
### Week 11-12: Backup, Recovery, and Validation
**Tool strategy**: Use existing backup and DR infrastructure. The goal is to test and document, not to buy.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Perform full recovery drill of one critical system from backup | IT / Security | Recovery time documented, gaps identified | Existing backup solution |
| Validate backup integrity for all T0 assets | Backup Admin | Integrity report with sample restorations | Existing backup solution + integrity scripts |
| Test local AI pilot under degraded network conditions | AI / Infrastructure | Resilience validation report | Existing network infrastructure + manual testing |
| Document and exercise incident response for AI-specific threats | SOC / Security | Runbook: model poisoning, data exfiltration, adversarial input | Existing IR framework + internal knowledge |
### Phase 3 Exit Criteria
- [ ] All AI usage inventoried and classified
- [ ] Local inference infrastructure operational
- [ ] One high-value AI pilot deployed and measured
- [ ] T0 protection applied to model weights and training data
- [ ] Critical system recovery drill completed successfully
- [ ] AI-specific incident response runbook created
### Phase 3 Mantra
> *"We are moving from being consumers of intelligence to manufacturers of our own. The vault is built; now we fill it."*
---
## Phase 4: Antifragility (Days 90-180)
**Theme**: *Build systems that grow stronger from disruption.*
The final phase converts the hardened foundation into an adaptive, learning organization. This is where antifragility becomes operational reality.
### Month 4: Structural Decoupling and Optionality
**Tool strategy**: Documentation, architecture, and open-source chaos tools (Chaos Mesh, Gremlin free tier, custom scripts). Work, not purchases.
| Action | Owner | Deliverable | Existing / Free Tool Leverage |
|--------|-------|-------------|------------------------------|
| Document exit architecture for all major platform dependencies | Enterprise Architecture | 90-day exit plan per critical vendor | Architecture documentation, existing runbooks |
| Implement abstraction layers for proprietary integrations | Engineering | Interface documentation and migration test | Existing development tools and frameworks |
| Establish dual-vendor readiness for one critical category | Procurement / Engineering | Technical proof of capability | Existing engineering capacity, open standards |
| Deploy chaos engineering: simulate critical dependency failure | Resilience Team | Chaos experiment report with findings | Chaos Mesh (open-source), custom scripts, Gremlin free tier |
### Month 5: Stress-to-Signal Conversion
**Tool strategy**: Process and culture changes require no licensing. Use existing EDR/SIEM for detection validation.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Implement blameless post-mortem process with structural mandates | Culture / Security | Post-mortem template and governance | Existing collaboration tools (Confluence, SharePoint, Notion) |
| Deploy production chaos engineering with automated rollback | Resilience Team | Monthly chaos experiment schedule | Existing orchestration + open-source chaos tools |
| Create feedback loop: incident findings → architecture changes | Security Architect | Closed-loop metrics: mean time to structural fix | Existing ticketing system (Jira, ServiceNow) |
| Launch "red team as a service": continuous adversarial testing | Security | Monthly red team report | Internal team + existing EDR/SIEM for detection validation |
### Month 6: Defensive AI and Continuous Modernisation
**Tool strategy**: Defensive AI runs on the local inference infrastructure already deployed. Posture measurement uses existing APIs and open-source dashboards.
| Action | Owner | Deliverable | Existing / Low-Cost Tool Leverage |
|--------|-------|-------------|----------------------------------|
| Expand local AI to defensive use cases: anomaly detection, code review, vulnerability prioritization | AI / Security | Defensive AI capability map | Local AI cluster deployed in Phase 3 |
| Implement automated security posture measurement | Security | Continuous compliance dashboard | Existing APIs (Microsoft Graph, AWS APIs) + Grafana or open-source dashboard |
| Evaluate and migrate additional AI workloads to local infrastructure | AI Lead | Migration roadmap with quarterly targets | Local AI infrastructure + business case templates |
| Conduct first antifragility maturity assessment | Consultant / Security | Baseline maturity score with gap analysis | Spreadsheet or existing GRC tool |
| Pilot organizational integration: embed security in one product team | Consultant / Engineering | Shift-left pilot metrics | Existing team structure + collaboration tools |
| **Deploy AI-assisted TVM operationalization** | AI / Security | AI TVM dashboard; <48h critical CVE response | Defender Exposure Management + Azure OpenAI or local LLM; see [AI-Assisted TVM Blueprint](ai-assisted-tvm.md) |
### Phase 4 Exit Criteria
- [ ] Exit architectures documented for top 5 vendor dependencies
- [ ] Chaos engineering operational in production
- [ ] Mean time to structural fix < 14 days from incident
- [ ] Defensive AI pilot operational
- [ ] First antifragility maturity assessment completed
- [ ] Quarterly antifragility review calendar established
### Phase 4 Mantra
> *"We do not want fewer incidents. We want incidents that teach us something we could not have learned any other way."*
---
## Governance and Cadence
### Weekly Steering Committee
- Review blockers and escalations
- Validate phase exit criteria
- Adjust scope based on organizational readiness
### Monthly Board Update
- Risk reduction metrics
- Antifragility maturity trend
- Investment vs. risk-exposure reduction
- Strategic narrative: "This is not a cost centre; it is optionality insurance"
### Quarterly Retrospective
- What failed that taught us something?
- What assumptions have been invalidated?
- What new dependencies have emerged?
- What can be simplified or removed?
---
## Success Metrics
| Dimension | Metric | Target |
|-----------|--------|--------|
| **Visibility** | % of assets in CMDB | 100% of T0/T1 within 30 days |
| **Control** | Mean time to contain new identity | < 1 hour |
| **Sovereignty** | % of proprietary AI workloads local | 100% of T0-class within 90 days |
| **Resilience** | Recovery time for critical system | < 4 hours |
| **Learning** | Structural fixes per incident | ≥ 1 |
| **Optionality** | Vendor dependencies without exit plan | 0 |
---
## Adaptation Guide
### Small Organizations (< 100 employees)
- Compress Phases 1-2 into 30 days
- Use managed sovereign cloud for local AI instead of on-premises hardware
- Focus on identity, backup, and one high-value AI pilot
- Leverage Microsoft Business Premium or Google Workspace security features fully before any additional purchase
### Regulated Industries (Finance, Healthcare, Critical Infrastructure)
- Extend Phase 1 to 45 days for compliance mapping
- Integrate regulatory requirements into T0 classification
- Add compliance validation gates at each phase exit
### Highly Distributed Organizations
- Prioritize network segmentation and DNS security in Phase 1
- Deploy edge inference nodes in Phase 3 instead of central cluster
- Emphasize operational resilience and disconnected operations
### Organizations with Heavy Technical Debt
- Accept that 20 years of debt cannot be cleared in 180 days
- Use defensive AI in Phase 4 to accelerate debt identification and prioritization
- Focus on "kill chain" protection rather than comprehensive cleanup
- Map every action to CIS IG1 to show standards alignment without additional framework investment
---
*Next: [Implementation Playbook](implementation-playbook.md)*
*Previous: [T0 Asset Framework](../core/t0-asset-framework.md)*


@@ -0,0 +1,265 @@
# Zero-Budget Hardening Playbook
> *"The most expensive security tool is the one you already bought and never turned on."*
This playbook provides tactical guidance for hardening an enterprise's security posture using **existing tools, native platform capabilities, and open-source alternatives**. It is designed for consultants whose clients need to reduce technological debt and improve resilience without additional software procurement.
The philosophy is simple: **maximize current investment before discussing new investment**. This builds trust, demonstrates competence, and preserves optionality for strategic purchases later.
---
## The Underutilization Audit
Before proposing any new tool, conduct this audit. It typically reveals that the client already owns 60-80% of the capabilities they need.
### Microsoft-Centric Environments (Most Common)
> **Critical distinction**: Most of our clients own **E3**, not E5. The table below shows the E5 ideal; see [M365 E3 Hardening](m365-e3-hardening.md) for the pragmatic E3 reality.
| Capability | What E5 Includes | What E3 Includes | What Is Often Unused | Activation Effort |
|-----------|------------------|------------------|---------------------|-------------------|
| Endpoint Detection | Defender for Endpoint P2 (EDR, ASR) | Defender Antivirus only (no EDR) | Real-time protection, network protection | Low |
| SIEM / Log Analytics | Microsoft Sentinel | Log Analytics only (no Sentinel) | Basic KQL queries, log forwarding | Medium |
| Identity Protection | Entra ID P2 (PIM, conditional access, risk) | Entra ID Free (per-user MFA only) | Per-user MFA, basic audit | Low |
| Email Security | Defender for Office 365 P2 (Safe Links, Safe Attachments) | EOP only (basic anti-phishing) | Anti-malware, anti-spam tuning | Low |
| Data Protection | Microsoft Purview (DLP, labels) | None | N/A | N/A |
| Cloud Security | Microsoft Defender for Cloud | Basic Defender for Cloud (limited) | Secure score review | Low |
| PAM (Basic) | Entra ID PIM + LAPS | LAPS only (no PIM) | LAPS deployment | Low |
**E3 Strategy**: Maximize native E3 capabilities, augment with open-source tools (Wazuh, Sysmon), and selectively license add-ons for critical users rather than blanket E5 upgrades.
**The Pitch (E3 Clients)**:
> *"You own E3, not E5. That means we do not have EDR, conditional access, or advanced email filtering out of the box. But we do have solid foundations: antivirus, basic MFA, audit logging, and EOP. Our first job is to turn every E3 knob to maximum, then close the most dangerous gaps with free tools like Sysmon and Wazuh. If gaps remain that threaten your specific risk profile, we will size a selective upgrade—not a blanket one."*
### Multi-Cloud / Heterogeneous Environments
| Capability | Native Free/Cheap Options |
|-----------|--------------------------|
| Vulnerability scanning | AWS Inspector (basic), Azure Update Manager, Google OS Config |
| Configuration compliance | AWS Config (basic), Azure Policy, Google Organization Policy |
| Log aggregation | CloudWatch Logs, Azure Monitor Logs, Cloud Logging |
| Identity security | AWS IAM Access Analyzer, Azure AD Identity Protection, Google Cloud IAM Recommender |
| Network monitoring | VPC Flow Logs, Azure NSG Flow Logs, Google Cloud VPC Flow Logs |
| Cost anomaly detection | AWS Cost Anomaly Detection, Azure Cost Management, Google Cloud Billing Alerts |
### Open-Source Force Multipliers
When native capabilities are insufficient, these open-source tools can close gaps without license costs:
| Category | Tool | When to Use |
|----------|------|-------------|
| EDR / XDR | Wazuh | Need centralized endpoint visibility but no EDR budget |
| SIEM | Wazuh (again), Graylog, Grafana Loki | Need log analysis without commercial SIEM |
| Vulnerability Management | OpenVAS | Need scanning without commercial VM platform |
| Network Monitoring | Zeek, Suricata | Need IDS/IPS without commercial NDR |
| Asset Discovery | OpenLDAP scripts, Nmap, Masscan | Need network asset discovery |
| Threat Intelligence | MISP (open-source), AlienVault OTX | Need IOC sharing and correlation |
| Password Auditing | Hashcat, John the Ripper | Need to audit password strength internally |
| Backup Verification | Custom scripts (rsync, hash verification) | Need to validate backup integrity |
| Local AI Inference | Ollama, llama.cpp, vLLM | Need sovereign AI without API costs |
---
## The 30-Day Zero-Budget Sprint
This sprint assumes the client has a typical Microsoft-centric environment with E3 or E5 licensing. Adapt for other environments.
### Week 1: Turn On What You Own
> **Note for E3 clients**: Skip the ASR and advanced EDR steps below. E3 includes Defender Antivirus only. See [M365 E3 Hardening](m365-e3-hardening.md) for the E3-specific week 1 plan. The steps below assume E5 or Defender for Endpoint P2.
**Day 1-2: Microsoft Defender for Endpoint (E5 Only)**
- Verify onboarding coverage: what % of endpoints are reporting?
- Enable ASR rules in **Audit** mode (not block) to measure impact:
- Block executable content from email client and webmail
- Block JavaScript or VBScript from launching downloaded executable content
- Block Office applications from creating child processes
- Block Office applications from injecting code into other processes
- Block Adobe Reader from creating child processes
- Block persistence through WMI event subscription
- Enable exploit protection with default settings
- Enable network protection in **Audit** mode
**Day 3-4: Entra ID (Azure AD) Hardening**
- **E5 clients**: Enable security defaults **or** configure conditional access:
- Require MFA for all users, all cloud apps
- Block legacy authentication
- Require compliant or hybrid Azure AD joined device for admin roles
- Enable PIM for Global Administrator and other privileged roles
- **E3 clients**: Enable per-user MFA for all users (no conditional access available)
- Block legacy authentication tenant-wide
- Review and reduce standing admin assignments manually
- Document conditional access as a gap for steering committee
**Day 5: Email Security**
- **E5 clients**: Enable Safe Links and Safe Attachments for all recipients; configure anti-phishing policies with impersonation protection
- **E3 clients**: Tune EOP anti-phishing, anti-malware, and anti-spam to maximum aggression; configure spoof intelligence in EOP; document the Safe Links, Safe Attachments, and impersonation-protection gaps (these require Defender for Office 365)
- Enable mailbox auditing for all users (works in E3)
### Week 2: Visibility and Hygiene
**Day 6-7: Log Aggregation**
- Enable diagnostic settings for all Azure resources to Log Analytics
- Enable Microsoft 365 auditing
- If no Sentinel, use Log Analytics + KQL for basic querying
**Day 8-9: Identity Hygiene**
- Export all users, groups, and service principals
- Disable unused accounts (> 90 days inactive, no owner)
- Identify shared mailboxes with login capability and restrict
- Review enterprise applications (OAuth consents) and revoke suspicious grants
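The 90-day inactivity rule above can be applied to a user export with a few lines of shell. This is a minimal sketch, assuming a hypothetical `users.csv` with the columns `UserPrincipalName,LastSignIn,Owner` (ISO dates, no commas inside fields) and GNU `date`; the function name and the file layout are illustrative and do not correspond to any fixed Microsoft export format.

```bash
#!/usr/bin/env bash
# Sketch: list accounts inactive for more than N days that have no owner.
# Assumed (hypothetical) CSV layout: UserPrincipalName,LastSignIn,Owner
# LastSignIn is an ISO date (YYYY-MM-DD); an empty value means "never signed in".
# Requires GNU date for the -d option.

stale_accounts() {
    local csv="$1" max_days="${2:-90}" today="${3:-$(date +%F)}"
    local today_epoch cutoff upn last owner last_epoch
    today_epoch=$(date -d "$today" +%s)
    cutoff=$(( today_epoch - max_days * 86400 ))

    # Skip the header row, strip Windows line endings, read one account per line.
    tail -n +2 "$csv" | tr -d '\r' | while IFS=, read -r upn last owner || [ -n "$upn" ]; do
        # Owned accounts go to their owner for review instead of being disabled.
        if [ -n "$owner" ]; then
            continue
        fi
        if [ -z "$last" ]; then
            echo "$upn never"
            continue
        fi
        last_epoch=$(date -d "$last" +%s) || continue
        if [ "$last_epoch" -lt "$cutoff" ]; then
            echo "$upn $last"
        fi
    done
}
```

Calling `stale_accounts users.csv 90` prints one candidate per line with its last sign-in date. Treat the output as a review list rather than a disable list: service accounts often have no interactive sign-in at all.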
**Day 10: Secure Score Review**
- Review Microsoft Secure Score (Defender for Cloud + M365)
- Pick 5 improvements that require **no purchase**
- Execute them
### Week 3: Configuration and Control
**Day 11-12: Windows Defender Firewall**
- Enforce firewall on all profiles (domain, private, public)
- Enable logging for dropped packets
- Review and document any exceptions
**Day 13-14: LAPS (Local Administrator Password Solution)**
- Deploy LAPS via GPO or Intune
- Set unique random passwords for all local admin accounts
- Configure password expiration (30-60 days)
**Day 15: DNS Security**
- Enable DNS over HTTPS (DoH) on Windows 11 endpoints via Intune/GPO
- Configure DNS filtering (Quad9, Cloudflare for Teams free tier, or native Microsoft DNS security)
- Enable DNS query logging if infrastructure supports it
### Week 4: Validation and Documentation
**Day 16-17: Backup Verification**
- Inventory all backup jobs
- Select one non-critical system and perform test restore
- Document gaps in coverage or recovery time
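A test restore is only useful if the restored data is compared against the source. The sketch below shows one way to do that with a checksum manifest, assuming file-level backups, GNU coreutils (`sha256sum`), and illustrative function names; image-level or application-consistent backups need a different check.

```bash
#!/usr/bin/env bash
# Sketch: compare a restored directory against a checksum manifest taken from the source.
# Function names and paths are illustrative. Requires sha256sum (GNU coreutils).

# Write a manifest of SHA-256 hashes for every file under a directory.
make_manifest() {
    local dir="$1" manifest="$2"
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Verify a restored copy against the manifest.
# A non-zero exit status means at least one file is missing or altered.
verify_restore() {
    local restored_dir="$1" manifest="$2"
    # The manifest is fed on stdin so a relative manifest path still works after cd.
    ( cd "$restored_dir" && sha256sum --check --quiet ) < "$manifest"
}
```

Run `make_manifest /data/app /tmp/app.sha256` on the source before the drill and `verify_restore /restore/app /tmp/app.sha256` after it. Files that changed legitimately between backup and manifest will show up as failures, so take the manifest as close to the backup window as possible.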
**Day 18-19: External Perspective**
- Run basic external scan using free tools (Shodan search for your IP ranges, SSL Labs for public websites)
- Document exposed services and missing TLS configurations
**Day 20: Metrics and Reporting**
- Calculate "before and after" metrics:
- EDR coverage %
- MFA enrollment %
- Secure Score change
- Number of disabled unused accounts
- Number of ASR audit-mode triggers
- Present to stakeholders with cost: **$0 in new licensing**
---
## The 60-90 Day Extension: Configuration as Control
Once the initial sprint proves value, extend into structural improvements that require work but not purchase.
### Conditional Access Refinement
| Policy | Target | Risk Addressed |
|--------|--------|----------------|
| Require MFA from untrusted locations | All users | Credential stuffing, brute force from abroad |
| Require compliant device for sensitive apps | Finance, HR, Engineering | Data exfiltration from unmanaged devices |
| Block download from unmanaged devices | SharePoint, OneDrive | Shadow IT data leakage |
| Require password change on high user risk | All users | Compromised credential remediation |
### ASR Rules: From Audit to Block
After 30 days of audit-mode data:
- Review ASR rule hits
- Identify false positives and create exclusions
- Switch high-confidence rules to **Block** mode
- Monitor for 2 weeks, then iterate
### Automated Response (No SOAR Required)
Use native platform automation:
| Platform | Native Automation | Use Case |
|----------|-------------------|----------|
| Microsoft | Logic Apps + Sentinel / Defender APIs | Auto-isolate high-risk device, auto-disable compromised account |
| AWS | EventBridge + Lambda | Auto-snapshot compromised EC2, auto-revoke suspicious IAM key |
| Azure | Logic Apps + Azure Monitor | Auto-scale compromised resource, auto-trigger runbook |
| Google Cloud | Cloud Functions + Cloud Monitoring | Auto-suspend suspicious service account |
These require no additional licensing—only development time.
---
## AI Sovereignty on Existing Hardware
Local AI does not require a $50,000 GPU cluster to start. Many organizations have underutilized servers or workstations that can run quantized models.
### Minimum Viable Local AI
| Component | Specification | Typical Source |
|-----------|--------------|----------------|
| CPU inference host | 8+ cores, 32GB+ RAM | Underutilized server, retired workstation |
| Storage | 100GB SSD for models and data | Existing SAN or local SSD |
| GPU (optional) | NVIDIA with 8GB+ VRAM for faster inference | Existing CAD/ML workstation |
| Software | Ollama or llama.cpp | Free, open-source |
| Model | Llama 3.1 8B or Mistral 7B (4-bit quantized) | Free download |
**Pilot Workflow**: Internal code review assistant or security log summarizer. These are low-risk, high-signal use cases that prove local AI viability without disrupting operations.
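The log summarizer pilot can be prototyped in a few lines against a local Ollama instance. This is a sketch under stated assumptions: Ollama is listening on its default port (11434), the model tag `llama3.1:8b` has already been pulled, and `jq` and `curl` are installed. The prompt wording and function names are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: summarize a security log excerpt with a local Ollama instance.
# Assumptions: default Ollama port, model already pulled, jq and curl available.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
MODEL="${MODEL:-llama3.1:8b}"

# Build the JSON request body; jq handles escaping of quotes and newlines in the log text.
build_payload() {
    local log_text="$1"
    jq -cn --arg model "$MODEL" --arg log "$log_text" '{
        model: $model,
        stream: false,
        prompt: ("Summarize the following security log lines in three bullet points. Flag anything that looks like credential abuse.\n\n" + $log)
    }'
}

# Send the last N lines of a log file to the local model and print the response text.
summarize_log() {
    local file="$1" lines="${2:-200}"
    build_payload "$(tail -n "$lines" "$file")" |
        curl -sS "$OLLAMA_URL/api/generate" -H 'Content-Type: application/json' --data-binary @- |
        jq -r '.response'
}
```

`summarize_log /var/log/auth.log 100` sends the last 100 lines to the model. Nothing leaves the host, which is the point of the pilot; output quality on a 4-bit 8B model should be measured against analyst-written summaries before anyone relies on it.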
---
## Common Objections and Responses
| Objection | Response |
|-----------|----------|
| "We need a proper EDR, not Defender." | Defender for Endpoint is a Leader in Gartner Magic Quadrant. Most organizations have not enabled its advanced features. Let us turn those on first and measure. |
| "Open source is not enterprise-grade." | Zeek, Suricata, Wazuh, and Ollama are used by Fortune 500 companies and government agencies. The issue is not the tool; it is the expertise to run it. |
| "We don't have time to configure this." | Configuration is a one-time investment with perpetual returns. Buying a new tool also requires configuration—plus negotiation, procurement, and onboarding. |
| "Our auditor wants to see vendor support." | For audit evidence, native platform capabilities (Microsoft, AWS, Google) come with vendor backing. Open-source can be supplemented with commercial support if needed. |
| "The board wants us to buy something." | The board wants risk reduction. Show them risk reduction at zero incremental cost, and they will trust you when you later recommend strategic purchases. |
---
## The Consultant's Value Proposition
When you deliver zero-budget hardening, you demonstrate:
1. **Independence**: You are not here to sell software. You are here to solve problems.
2. **Competence**: You know how to extract value from complex platforms.
3. **Speed**: Visible improvement in 30 days builds momentum and political capital.
4. **Trust**: When you later recommend a purchase, it will be because the gap genuinely requires it—not because you have a quota.
### The Opening Pitch
> *"Before we talk about what to buy, let us talk about what you already own. In our experience, most organizations are utilizing less than 40% of their existing security capabilities. Our 30-day sprint will turn on, tune, and operationalize what you have already paid for. If there is still a gap after that, we will recommend the minimum viable purchase to close it."*
---
## Integration With Rapid Modernisation
The Zero-Budget Hardening Playbook maps directly onto the [Rapid Modernisation Plan](rapid-modernisation-plan.md):
| Rapid Modernisation Phase | Zero-Budget Focus |
|--------------------------|-------------------|
| Hygiene (Days 0-30) | Turn on existing EDR, enable MFA, configure conditional access, inventory identities |
| Control (Days 30-60) | ASR rules, LAPS, DNS security, log aggregation with existing tools |
| Sovereignty (Days 60-90) | Local AI on existing hardware, backup verification with existing solution |
| Antifragility (Days 90-180) | Open-source network monitoring, native automation, chaos engineering with free tools |
---
*Previous: [Rapid Modernisation Plan](rapid-modernisation-plan.md)*
*Next: [Implementation Playbook](implementation-playbook.md)*

View File

@@ -0,0 +1,625 @@
# Zero-Budget Vulnerability Discovery
> *"Most organizations do not know what vulnerabilities they have because they have never looked. Not because Tenable is too expensive. Because nobody wrote a PowerShell script and ran it."*
This playbook provides practical, script-based methods for discovering vulnerabilities across Windows servers, Linux servers, containers, and network devices **without purchasing commercial vulnerability scanners** like Tenable, Qualys, or Rapid7. It is designed for the first sweep—the baseline discovery that proves value before any procurement discussion.
The approach is **agentless and authentication-based** where possible: we use existing administrative access (SSH, WinRM, RDP, Azure/AWS APIs) to collect inventory and correlate it with vulnerability data. No agents. No new licenses. Just scripts, open-source tools, and expertise.
---
## The Philosophy: Discovery Before Procurement
Before recommending Tenable, Qualys, or any commercial scanner, we prove that:
1. The client does not know their inventory
2. There are critical vulnerabilities that can be found with free tools
3. The commercial scanner will be worth the money—once we know what gaps it needs to fill
**The rule**: If a script run from a laptop finds 50 critical missing patches in 2 hours, the business case for a commercial scanner becomes trivial. The scanner is no longer a gamble. It is an operationalization of proven need.
---
## Method 1: Windows Server Enumeration (PowerShell)
Most Windows environments have at least partial administrative access. A PowerShell script run with domain admin or local admin credentials can enumerate the entire estate in hours.
### The Basic Script: What to Collect
```powershell
# Save as Get-ServerVulnBaseline.ps1
# Run from a management workstation with domain admin or appropriate privileges.
# Requires the RSAT ActiveDirectory and ServerManager modules and WinRM access to the targets.
Import-Module ActiveDirectory

$Computers = Get-ADComputer -Filter 'OperatingSystem -like "*Server*"' |
    Select-Object -ExpandProperty Name
$Results = @()

foreach ($Computer in $Computers) {
    try {
        # -ErrorAction Stop sends unreachable hosts to the catch block
        $Session = New-CimSession -ComputerName $Computer -OperationTimeoutSec 30 -ErrorAction Stop

        # OS Version and Build
        $OS = Get-CimInstance -CimSession $Session -ClassName Win32_OperatingSystem

        # Installed Hotfixes
        $Hotfixes = Get-CimInstance -CimSession $Session -ClassName Win32_QuickFixEngineering |
            Select-Object -ExpandProperty HotFixID

        # Installed Software (MSI-installed packages only)
        # Caution: querying Win32_Product is slow and triggers an MSI consistency check on each host
        $Software = @(Get-CimInstance -CimSession $Session -ClassName Win32_Product |
            Select-Object Name, Version, Vendor)

        # Windows Features / Roles
        $Features = Get-WindowsFeature -ComputerName $Computer | Where-Object {$_.Installed} |
            Select-Object -ExpandProperty Name

        # Antivirus Status: root\SecurityCenter2 is normally absent on Server SKUs,
        # so treat "None detected" as "verify manually", not as proof that no AV is installed
        $AV = Get-CimInstance -CimSession $Session -Namespace "root\SecurityCenter2" -ClassName AntiVirusProduct -ErrorAction SilentlyContinue

        # Firewall Status
        $Firewall = Get-NetFirewallProfile -CimSession $Session | Select-Object Name, Enabled

        # Local Administrators (queried on the remote host, not on this workstation)
        $Admins = Invoke-Command -ComputerName $Computer -ErrorAction SilentlyContinue -ScriptBlock {
            Get-LocalGroupMember -Group "Administrators"
        }

        $Results += [PSCustomObject]@{
            ComputerName    = $Computer
            OSVersion       = $OS.Caption
            OSBuild         = $OS.BuildNumber
            LastBoot        = $OS.LastBootUpTime
            Hotfixes        = ($Hotfixes -join ";")
            SoftwareCount   = $Software.Count
            # Inner parentheses are required so -join applies to the pipeline output
            KeySoftware     = (($Software | Where-Object {$_.Name -match "SQL|IIS|Exchange|SharePoint|Remote Desktop|Citrix"} | ForEach-Object {"$($_.Name)=$($_.Version)"}) -join ";")
            Features        = ($Features -join ";")
            AVProduct       = if ($AV) { $AV.displayName -join ";" } else { "None detected" }
            FirewallEnabled = @($Firewall | Where-Object {$_.Enabled -eq $true}).Count
            LocalAdmins     = ($Admins | Measure-Object).Count
            Reachable       = $true
        }

        Remove-CimSession -CimSession $Session
    }
    catch {
        $Results += [PSCustomObject]@{
            ComputerName = $Computer
            OSVersion    = "Unreachable"
            Reachable    = $false
            Error        = $_.Exception.Message
        }
    }
}

# Export-Csv takes its columns from the first object, so fix the column set explicitly
$Results |
    Select-Object ComputerName, OSVersion, OSBuild, LastBoot, Hotfixes, SoftwareCount, KeySoftware,
        Features, AVProduct, FirewallEnabled, LocalAdmins, Reachable, Error |
    Export-Csv -Path "ServerBaseline.csv" -NoTypeInformation
```
**What this produces in 30 minutes**:
- A CSV of every Windows Server with OS build, patches, software, roles, AV status, firewall status
- Immediate red flags: servers with no AV, no firewall, ancient OS builds, excessive local admins
- A hotfix list you can correlate against Microsoft Security Response Center bulletins
### The OS Build Risk Filter
Once you have the CSV, filter for end-of-life or near-end-of-life OS builds:
| OS / Build | Status | Risk |
|-----------|--------|------|
| Windows Server 2008 R2 / 2012 R2 | End of life | Critical |
| Windows Server 2016 (Build 14393) | Extended support | High |
| Windows Server 2019 (Build 17763) | Active, but check patch level | Medium |
| Windows Server 2022 (Build 20348) | Current | Low |
**The conversation**:
> *"We ran a script for 30 minutes and found 12 servers running operating systems that no longer receive security patches. Three of them are internet-facing. We do not need a €50,000 scanner to tell us that is a kill chain. We need it to track the remediation. But first, we fix these 12."*
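The build table above can be applied to `ServerBaseline.csv` mechanically. The sketch below assumes the CSV keeps `ComputerName`, `OSVersion`, and `OSBuild` as its first three columns (as the enumeration script writes them) and that those three fields contain no commas; the function name and the risk thresholds simply mirror the table and are not an official classification.

```bash
#!/usr/bin/env bash
# Sketch: tag each server in ServerBaseline.csv with the risk level from the build table.
# Assumes the first three columns are ComputerName, OSVersion, OSBuild.
# Output: tab-separated name, build, risk.

classify_builds() {
    local csv="$1"
    awk -F'","' 'NR > 1 {
        name = $1;  gsub(/["\r]/, "", name)
        build = $3; gsub(/["\r]/, "", build)
        if ($2 ~ /^Unreachable/) { risk = "Unknown"; build = "" }
        else if (build + 0 < 14393) risk = "Critical"   # 2012 R2 and older
        else if (build + 0 < 17763) risk = "High"       # Server 2016
        else if (build + 0 < 20348) risk = "Medium"     # Server 2019
        else risk = "Low"                                # Server 2022 or newer
        printf "%s\t%s\t%s\n", name, build, risk
    }' "$csv"
}
```

`classify_builds ServerBaseline.csv | grep Critical` gives the short list for the conversation with the client. Build number alone says nothing about patch level, so the Medium and Low rows still need the hotfix column reviewed.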
---
## Method 2: Linux Server Enumeration (Bash / SSH)
For Linux estates, SSH-based enumeration is fast and requires no agents.
### The Basic Script
```bash
#!/bin/bash
# Save as linux-vuln-baseline.sh
# Run from a jump host with SSH key access to target servers

OUTPUT_DIR="./linux-baseline-$(date +%Y%m%d)"
mkdir -p "$OUTPUT_DIR"

# One hostname per line in server-list.txt
mapfile -t SERVERS < server-list.txt

for SERVER in "${SERVERS[@]}"; do
    [ -n "$SERVER" ] || continue
    echo "Scanning $SERVER..."
    # BatchMode fails fast instead of prompting for a password.
    # accept-new (OpenSSH 7.6+) records unknown host keys but still rejects changed ones.
    # The quoted heredoc is sent verbatim, so nothing is expanded on the jump host.
    ssh -o ConnectTimeout=10 -o BatchMode=yes -o StrictHostKeyChecking=accept-new \
        "$SERVER" 'sh -s' > "$OUTPUT_DIR/$SERVER.txt" 2>&1 <<'REMOTE'
echo '=== OS ==='
cat /etc/os-release
echo '=== KERNEL ==='
uname -r
echo '=== PACKAGES ==='
if command -v rpm >/dev/null; then rpm -qa --last; fi
if command -v dpkg >/dev/null; then dpkg -l; fi
if command -v apt >/dev/null; then apt list --installed 2>/dev/null; fi
echo '=== SERVICES ==='
systemctl list-units --type=service --state=running
echo '=== LISTENING PORTS ==='
ss -tlnp
echo '=== USERS WITH SHELL ==='
# Match the login shell field only, not any line containing "sh"
grep -E '/(bash|sh|zsh)$' /etc/passwd
echo '=== SUDOERS ==='
# Reading /etc/sudoers needs root; the section stays empty otherwise
grep -Ev '^(#|$)' /etc/sudoers 2>/dev/null
echo '=== SSH CONFIG ==='
grep -E 'PermitRootLogin|PasswordAuthentication|Port' /etc/ssh/sshd_config
REMOTE
done

echo "Results in $OUTPUT_DIR"
```
**What this produces**:
- Per-server files with OS, kernel, all installed packages, running services, listening ports, user accounts, SSH hardening
- Immediate red flags: password authentication enabled, root login permitted, ancient kernels, unnecessary services exposed
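The SSH red flags can be pulled out of the per-server files without reading each one by hand. A minimal sketch, assuming the output directory and `.txt` naming used by the enumeration script; the function name and the two flag labels are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: flag weak SSH settings in the per-server baseline files.
# Assumes one <server>.txt file per host containing the captured sshd_config lines.

flag_ssh_weaknesses() {
    local dir="$1" file server
    for file in "$dir"/*.txt; do
        [ -e "$file" ] || continue
        server=$(basename "$file" .txt)
        # Only active (uncommented) directives count; matching is case-insensitive.
        if grep -qiE '^[[:space:]]*PermitRootLogin[[:space:]]+yes' "$file"; then
            echo "$server root-login-permitted"
        fi
        if grep -qiE '^[[:space:]]*PasswordAuthentication[[:space:]]+yes' "$file"; then
            echo "$server password-auth-enabled"
        fi
    done
}
```

`flag_ssh_weaknesses ./linux-baseline-20260509` prints one line per finding. A server that produces no output is not necessarily hardened: OpenSSH enables password authentication by default when the directive is absent or commented out, so hosts with no explicit `PasswordAuthentication no` deserve a second look.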
### The Package-to-CVE Correlation
For the first sweep, you do not need a commercial correlator. Use open-source tools:
**Option A: Grype (recommended)**
```bash
# Install grype and syft (single binaries, no dependencies)
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
# On each server, generate SBOM and scan
syft dir:/ -o json > /tmp/sbom.json
grype sbom:/tmp/sbom.json -o table > /tmp/vulns.txt
```
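**Option B: Native package-manager security checks**
```bash
# Debian: debsecan correlates installed packages with the Debian Security Tracker
apt-get install -y debsecan
debsecan --suite "$(. /etc/os-release && echo "$VERSION_CODENAME")" --only-fixed --format detail
# Ubuntu: simulate an upgrade and keep the packages coming from the -security pocket
apt-get update && apt-get -s upgrade | grep -i '^Inst.*security'
# For RHEL/CentOS with yum (the plugin package is only needed on older releases)
yum install -y yum-plugin-security
yum --security check-update
```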
**What this produces**:
- A list of installed packages with known CVEs
- Severity ratings
- Whether fixes are available in the distribution repositories
---
## Method 3: Container and Application SBOM
Modern environments run containers. Containers bundle vulnerabilities. SBOM + CVE correlation is the fastest way to find them.
### SBOM Generation
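```bash
# Install Syft
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
# Generate SBOM for the image behind each running container
# (syft scans images, so list image references rather than container names)
mkdir -p sboms
docker ps --format '{{.Image}}' | sort -u | while read -r image; do
    # Image references contain / : and @, which are not safe in file names
    safe_name=$(echo "$image" | tr '/:@' '___')
    syft "$image" -o spdx-json > "sboms/${safe_name}.json"
done
# Generate SBOM from container images in registry
# (Requires registry access; adapt for ACR, ECR, GCR, Harbor, etc.)
```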
### CVE Scanning the SBOMs
```bash
# Install Grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin
# Scan all SBOMs
mkdir -p vulns
for sbom in sboms/*.json; do
    grype "sbom:$sbom" -o json > "vulns/$(basename "$sbom" .json)-vulns.json"
done
# Aggregate critical findings
jq -r '.matches[] | select(.vulnerability.severity == "Critical") | [.artifact.name, .artifact.version, .vulnerability.id, .vulnerability.severity] | @tsv' vulns/*.json | sort | uniq -c | sort -rn > critical-vulns.txt
```
**What this produces in 1 hour**:
- A complete inventory of every container's software components
- Every known CVE in those components
- Critical vulnerabilities ranked by frequency (if 15 containers have the same vulnerable log4j version, that is your top fix)
**The conversation**:
> *"We generated software bills of materials for your 40 running containers and found 340 known vulnerabilities. 12 are critical. Five of those critical vulnerabilities are in your customer-facing API container. We have the updated base image ready. No scanner purchase required."*
---
## Method 4: Network-Based Unauthenticated Scanning
When you cannot authenticate to every system, network scanning fills gaps.
### OpenVAS / Greenbone (Free)
Greenbone Community Edition is a full vulnerability scanner that requires only network access:
```bash
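# Deploy via the official Greenbone Community Containers (Docker Compose).
# Download docker-compose.yml from the Greenbone Community documentation first;
# there is no single all-in-one image published as greenbone/community-edition.
docker compose -f docker-compose.yml -p greenbone-community-edition pull
docker compose -f docker-compose.yml -p greenbone-community-edition up -d
# Log in to the web UI (by default http://127.0.0.1:9392), create a target list, run a scan
# Produces: full vulnerability report with CVSS, CVE references, and remediation guidance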
```
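**Limitations**: The Community Edition runs on the Greenbone Community Feed, which omits many checks aimed at enterprise products, and Greenbone's terms differ between its community and commercial offerings. Review those terms before using it on a paid engagement; the alternatives are Greenbone Cloud Service or a commercial Greenbone subscription.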
### Nmap Vulnerability Scripts
```bash
# Fast service discovery
nmap -sV -sC -O --top-ports 1000 -oA network-sweep $TARGET_NETWORK
# Vulnerability detection with NSE scripts
nmap --script vuln -p 21,22,23,25,53,80,110,135,139,143,443,445,993,995,1723,3306,3389,5900,8080 $TARGET_IP
# SMB vulnerability check (EternalBlue / MS17-010, etc.); quote the glob so the shell does not expand it
nmap --script "smb-vuln*" -p 445 $TARGET_IP
# SSL/TLS weakness check
nmap --script ssl-enum-ciphers,ssl-heartbleed,ssl-poodle -p 443 $TARGET_IP
```
**What this produces**:
- Unauthenticated vulnerability findings
- Service versions that can be correlated with CVEs
- Network topology and unexpected exposed services
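
To show how the service versions can be pulled out for CVE correlation, here is a minimal sketch that flattens the grepable output written by `-oA network-sweep` (the `.gnmap` file) into tab-separated rows. `gnmap_to_tsv` is a hypothetical helper name. The parsing assumes the standard `Ports:` entry layout of `port/state/proto/owner/service/rpcinfo/version/`; for anything authoritative, the XML output is the safer source.

```bash
#!/usr/bin/env bash
# Sketch only: gnmap_to_tsv is a hypothetical helper, not part of Nmap.
# Prints "host<TAB>port<TAB>proto<TAB>service<TAB>version" for every open port.
gnmap_to_tsv() {
  awk -F'\t' '
    /Ports: / {
      split($1, h, " ")                 # $1 looks like "Host: 10.0.0.5 (name)"
      host = h[2]
      for (f = 2; f <= NF; f++) {
        if ($f !~ /^Ports: /) continue
        ports = $f
        sub(/^Ports: /, "", ports)
        n = split(ports, entries, ", ")
        for (i = 1; i <= n; i++) {
          # Entry layout: port/state/proto/owner/service/rpcinfo/version/
          split(entries[i], p, "/")
          if (p[2] == "open")
            printf "%s\t%s\t%s\t%s\t%s\n", host, p[1], p[3], p[5], p[7]
        }
      }
    }
  ' "$@"
}
```

Running `gnmap_to_tsv network-sweep.gnmap > services.tsv` gives a flat list that can be searched against a CVE source by product and version. A version string that itself contains `", "` would be split incorrectly, which is one reason this stays a sketch.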
### ProjectDiscovery Stack (Modern, Fast, Free)
```bash
# Install (requires a recent Go toolchain; amass is an OWASP project, not part of ProjectDiscovery)
go install -v github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest
go install -v github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest
go install -v github.com/owasp-amass/amass/v4/...@master
# Reconnaissance pipeline
# 1. Find live hosts
naabu -list targets.txt -o live-hosts.txt
# 2. Identify web services
httpx -list live-hosts.txt -o web-services.txt
# 3. Run vulnerability templates (10,000+ community templates)
nuclei -list web-services.txt -severity critical,high -o findings.txt
# 4. DNS enumeration
amass enum -d example.com -o dns-findings.txt
```
**What Nuclei produces**:
- Specific CVE detections (CVE-2024-XXXX)
- Misconfiguration findings (exposed .git, default credentials, open redirects)
- Technology fingerprinting
- CVE-based findings carry the CVE ID and reference links; misconfiguration and exposure findings are identified by template ID instead
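
For a first triage pass, the findings file can be tallied by severity. The sketch below assumes Nuclei's default plain-text line layout of `[template-id] [protocol] [severity] target`, with no colour codes or timestamps in the file; `nuclei_severity_counts` is a hypothetical helper name. A JSON Lines export is the more robust input if the format differs.

```bash
#!/usr/bin/env bash
# Sketch only: nuclei_severity_counts is a hypothetical helper, not a Nuclei feature.
# Counts findings per severity from plain-text Nuclei output (stdin or files).
nuclei_severity_counts() {
  # Splitting on "[" and "]" puts template-id in $2, protocol in $4, severity in $6.
  awk -F'[][]' '
    NF >= 7 { count[$6]++ }
    END     { for (s in count) printf "%s\t%d\n", s, count[s] }
  ' "$@" | sort
}
```

`nuclei_severity_counts findings.txt` prints one `severity<TAB>count` row per severity, sorted alphabetically, which is enough to decide where manual validation starts.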
---
## Method 5: Osquery Cross-Platform Discovery (The Sovereign Method)
> *"Tenable is a rented microscope. osquery is a laboratory."*
For clients who want **owned visibility** rather than rented scanner reports, osquery is the most powerful zero-budget discovery method available. It is an open-source agent that exposes the operating system as a SQL database—Windows, Linux, and macOS.
### Why osquery Belongs Here
| Script-Based Discovery | osquery-Based Discovery |
|----------------------|------------------------|
| Point-in-time (run once, get a snapshot) | Continuous or scheduled (run every hour, every day) |
| Per-platform scripts (PowerShell for Windows, bash for Linux) | Single SQL query language across all platforms |
| Static output (CSV, text files) | Structured, queryable data you can ask follow-up questions of |
| Requires admin access every time | Agent enrolls once; queries run remotely via FleetDM |
| Hard to scale past 100 systems | Scales to 10,000+ endpoints with FleetDM control plane |
| Cannot detect runtime state (running processes, open ports in real time) | Real-time process, network, and configuration visibility |
### The 2-Hour osquery Proof of Concept
```bash
# Install osquery on a management workstation
# Windows: choco install osquery
# macOS: brew install osquery
# Ubuntu: apt install osquery (after adding the official osquery apt repository)
# Run interactive discovery queries against the local system
# (For remote systems, copy the binary or use FleetDM enrollment)
# 1. Windows software inventory with a crude version flag
# (illustrative threshold: flags Adobe products whose major version is below 23; tune per product)
osqueryi "SELECT si.computer_name, p.name, p.version,
  CASE WHEN p.name LIKE '%Adobe%' AND CAST(p.version AS INTEGER) < 23 THEN 'POTENTIALLY VULNERABLE' ELSE 'REVIEW' END AS status
FROM programs p CROSS JOIN system_info si;"
# 2. Linux listening ports with process attribution
osqueryi "SELECT si.hostname, lp.port, lp.protocol, p.name, p.path
FROM listening_ports lp LEFT JOIN processes p ON lp.pid = p.pid
CROSS JOIN system_info si WHERE lp.address NOT IN ('127.0.0.1', '::1');"
# 3. SSH hardening check (Linux)
# (ssh_configs covers client-side ssh_config; sshd_config is read through the augeas table)
osqueryi "SELECT si.hostname, a.label, a.value,
  CASE WHEN a.label = 'PermitRootLogin' AND a.value = 'yes' THEN 'CRITICAL' ELSE 'REVIEW' END AS risk
FROM augeas a CROSS JOIN system_info si
WHERE a.path = '/etc/ssh/sshd_config' AND a.label IN ('PermitRootLogin', 'PasswordAuthentication', 'Port');"
# 4. End-of-life OS detection (all platforms)
# (matches on the numeric major/minor columns, because the version string format varies by distribution)
osqueryi "SELECT si.hostname, os.name, os.version, os.build,
  CASE
    WHEN os.platform = 'windows' AND os.major = 6 AND os.minor = 1 THEN 'Windows 7/2008 R2 - EOL'
    WHEN os.platform = 'windows' AND os.major = 6 AND os.minor = 2 THEN 'Windows 8/2012 - EOL'
    WHEN os.platform = 'centos' AND os.major = 7 THEN 'CentOS 7 - EOL June 2024'
    WHEN os.platform = 'ubuntu' AND os.major = 18 AND os.minor = 4 THEN 'Ubuntu 18.04 - standard support ended May 2023'
    ELSE 'Check manually'
  END AS eol_status
FROM os_version os CROSS JOIN system_info si;"
```
**What this produces in 2 hours**:
- Software inventory across all enrolled endpoints with version-based vulnerability flagging
- Real-time network exposure map (every listening port, every process)
- Configuration drift detection (firewall status, SSH hardening, encryption state)
- End-of-life operating system inventory
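
The interactive queries above are point-in-time. To illustrate the scheduled mode, the sketch below writes a minimal osquery configuration whose `schedule` section re-runs two of the queries on an interval. The file name, query names, and intervals are assumptions chosen for illustration; `write_osquery_schedule` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: write_osquery_schedule is a hypothetical helper.
# Writes an osquery config with two scheduled queries (intervals are in seconds).
write_osquery_schedule() {
  local path="${1:-osquery.conf}"
  cat > "$path" <<'EOF'
{
  "schedule": {
    "exposed_listening_ports": {
      "query": "SELECT lp.port, lp.protocol, lp.address, p.name, p.path FROM listening_ports lp LEFT JOIN processes p ON lp.pid = p.pid WHERE lp.address NOT IN ('127.0.0.1', '::1');",
      "interval": 3600
    },
    "os_version_inventory": {
      "query": "SELECT name, version, major, minor, platform FROM os_version;",
      "interval": 86400
    }
  }
}
EOF
}
```

After `write_osquery_schedule /etc/osquery/osquery.conf`, starting the daemon with `osqueryd --config_path=/etc/osquery/osquery.conf` would log results for each query on its interval. In a FleetDM deployment the same queries would normally be managed centrally rather than through a local file.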
### Scaling to the Estate: FleetDM (Free Tier)
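FleetDM is the open-source management platform for osquery. The core edition is free and self-hosted; confirm current host limits and premium-only features against Fleet's licensing before quoting numbers to a client: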
```bash
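# Local trial in Docker (about 15 minutes). tools/osquery in the Fleet repository is a
# test harness for osquery containers, not the Fleet server, so use the documented trial path.
# (fleetctl preview has been Fleet's documented local trial; confirm against the current docs.)
npm install -g fleetctl
fleetctl preview
# Enroll endpoints with a single command per host
# FleetDM provides live query capability: ask a question, get answers in seconds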
```
**For the complete osquery blueprint**—including query packs for Windows, Linux, and macOS vulnerability discovery, compliance policies, CVE correlation pipeline, and the consultant's 5-day delivery model—see **[Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md)**.
### When to Use osquery vs. Scripts
| Scenario | Use Scripts | Use osquery |
|----------|------------|-------------|
| One-time sweep of 20-50 servers | ✅ Fast, no installation | Overkill |
| Continuous monitoring of 200+ endpoints | ❌ Unsustainable | ✅ Designed for this |
| Client needs compliance dashboards | ❌ Ad-hoc reports | ✅ Built-in policy engine |
| Cross-platform environment (Windows + Linux + macOS) | ❌ Separate scripts | ✅ Single query language |
| Client wants to own the data and queries | ❌ Vendor-dependent | ✅ Full sovereignty |
---
## Method 6: Cloud-Native Discovery (No Agents)
For Azure / AWS / GCP environments, the cloud provider already has the data. You just need to query it.
### Azure
```powershell
# Azure VM inventory with OS info
Get-AzVM | Select-Object Name, ResourceGroupName, Location,
@{Name="OS";Expression={$_.StorageProfile.OsDisk.OsType}},
@{Name="ImagePublisher";Expression={$_.StorageProfile.ImageReference.Publisher}},
@{Name="ImageOffer";Expression={$_.StorageProfile.ImageReference.Offer}},
@{Name="ImageSKU";Expression={$_.StorageProfile.ImageReference.Sku}},
@{Name="ImageVersion";Expression={$_.StorageProfile.ImageReference.Version}}
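# Azure Update Manager: which VMs are missing critical updates?
# (Assessment results are exposed through Azure Resource Graph; requires the Az.ResourceGraph module)
Search-AzGraph -Query @"
patchassessmentresources
| where type =~ 'microsoft.compute/virtualmachines/patchassessmentresults'
| extend critical = toint(properties.availablePatchCountByClassification.critical)
| where critical > 0
| project id, critical
"@
# Microsoft Defender for Cloud secure score (free tier)
Get-AzSecuritySecureScore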
```
### AWS
```bash
# EC2 instance inventory
aws ec2 describe-instances --query 'Reservations[].Instances[].[InstanceId,ImageId,PlatformDetails,InstanceType,LaunchTime,State.Name]' --output table
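# Inspector findings (requires Amazon Inspector v2 to be enabled; billed per resource after the free trial)
aws inspector2 list-findings --filter-criteria '{"severity":[{"comparison":"EQUALS","value":"CRITICAL"}]}'
# Systems Manager patch compliance (requires the SSM agent and a completed patch scan)
aws ssm list-resource-compliance-summaries --filters Key=ComplianceType,Values=Patch,Type=EQUAL Key=Status,Values=NON_COMPLIANT,Type=EQUAL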
```
### GCP
```bash
# VM inventory
gcloud compute instances list --format="table(name,zone,status,machineType,disks[0].licenses[0])"
# OS Config vulnerability reports (requires VM Manager / OS Config to be enabled)
gcloud compute os-config vulnerability-reports list --location="$ZONE"
```
**The conversation**:
> *"Azure Update Manager costs nothing extra on your Azure VMs, and Amazon Inspector can be evaluated on a free trial. You are not using either. Before we discuss Tenable, let us turn on the vulnerability discovery tools your cloud subscription already gives you access to."*
---
## Method 7: The SBOM-to-CVE Pipeline
This method chains SBOM collection and CVE validation into a lightweight, zero-cost pipeline:
### Architecture
```
[Target System] → [Syft SBOM Generator] → [Grype CVE Scanner] → [Local AI Prioritizer] → [Executive Brief]
```
### Step-by-Step
```bash
#!/bin/bash
# zero-budget-tvm-pipeline.sh
# Run this from a management host with SSH/WinRM access to the estate
mkdir -p sboms vulns reports
# 1. COLLECT: Generate SBOMs from accessible systems
# Windows (via PowerShell remoting; Get-ADComputer needs the ActiveDirectory module on the management host)
pwsh -c "
\$Servers = Get-ADComputer -Filter {OperatingSystem -like '*Server*'}
foreach (\$s in \$Servers) {
  # Registry enumeration; results come back over the remoting session and are saved locally
  Invoke-Command -ComputerName \$s.Name -ErrorAction SilentlyContinue -ScriptBlock {
    Get-ItemProperty 'HKLM:\\Software\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\*' |
      Select-Object DisplayName, DisplayVersion, Publisher
  } | Export-Csv \"sboms/windows-\$(\$s.Name).csv\" -NoTypeInformation
}
"
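# Linux (via SSH)
while read -r server; do
  (
    ssh -n "$server" '
      if command -v syft >/dev/null; then
        syft dir:/ -o spdx-json
      else
        # Fallback: package manager output (not an SBOM)
        if command -v rpm >/dev/null; then rpm -qa; fi
        if command -v dpkg >/dev/null; then dpkg -l; fi
      fi
    ' > "sboms/${server}.out" 2>/dev/null
    # Keep only valid JSON as an SBOM; fallback listings are saved as text for manual review
    if jq -e . "sboms/${server}.out" >/dev/null 2>&1; then
      mv "sboms/${server}.out" "sboms/${server}.json"
    else
      mv "sboms/${server}.out" "sboms/${server}.packages.txt"
    fi
  ) &
done < linux-servers.txt
wait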
# 2. SCAN: Correlate with CVE database
for sbom in sboms/*.json; do
  if command -v grype >/dev/null; then
    grype "sbom:$sbom" -o json > "vulns/$(basename "$sbom" .json)-vulns.json"
  fi
done
# 3. PRIORITIZE: Extract critical/high, aggregate by CVE, count distinct systems
# (each input file is one system, so input_filename identifies the host)
jq -n '
  [inputs | input_filename as $system | .matches[]?
   | select(.vulnerability.severity == "Critical" or .vulnerability.severity == "High")
   | {cve: .vulnerability.id, severity: .vulnerability.severity, package: .artifact.name, version: .artifact.version, system: $system}]
  | group_by(.cve)
  | map({cve: .[0].cve, severity: .[0].severity, package: .[0].package, affected_systems: (map(.system) | unique | length)})
  | sort_by(.affected_systems)
  | reverse
' vulns/*.json > reports/aggregated-vulns.json
# 4. REPORT: Generate human-readable summary
# (unquoted EOF so that $(date) is expanded)
cat > reports/executive-summary.md << EOF
# Vulnerability Discovery Report
## Generated: $(date)
### Top Findings
EOF
jq -r '.[:20] | .[] | "- **\(.cve)** (\(.severity)): \(.package) — \(.affected_systems) systems affected"' reports/aggregated-vulns.json >> reports/executive-summary.md
echo "Report complete: reports/executive-summary.md"
```
**What this produces in 2-4 hours**:
- SBOMs for all accessible systems
- CVE correlation for every software component
- Aggregation: "CVE-2024-XXXX affects 23 of your servers"
- Executive summary: top 20 findings in Markdown
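
The aggregation claim depends on counting distinct systems rather than raw matches, since one host can match the same CVE through several packages. The sketch below shows that counting step on its own, using only `sort` and `awk`. The input is a hypothetical intermediate format of tab-separated `system`, `CVE`, `severity` rows (for example, exported from the Grype JSON), and `count_affected_systems` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: count_affected_systems is a hypothetical helper.
# Input rows:  system<TAB>cve<TAB>severity   (stdin or files)
# Output rows: affected_systems<TAB>cve<TAB>severity, most widespread first.
count_affected_systems() {
  local tab
  tab="$(printf '\t')"
  # sort -u collapses repeated (system, cve) rows so each host counts once per CVE.
  sort -u "$@" | awk -F'\t' '
    { hosts[$2]++; sev[$2] = $3 }
    END { for (c in hosts) printf "%d\t%s\t%s\n", hosts[c], c, sev[c] }
  ' | sort -t"$tab" -k1,1nr -k2,2
}
```

Fed with rows for the whole estate, the first line of output is the CVE present on the most systems, which is the figure the executive summary quotes.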
---
## The First Sweep Protocol
When you walk into a client with no vulnerability management program, run this sequence:
### Day 1: Discovery
| Hour | Activity | Tools |
|------|----------|-------|
| 0-1 | Identify scan targets from AD, Azure, AWS, or network range | Active Directory, cloud consoles |
| 1-3 | Run Windows PowerShell enumeration script | PowerShell, CIM sessions |
| 3-5 | Run Linux SSH enumeration script | Bash, SSH |
| 5-6 | Run network scan (Nmap + Nuclei) on external perimeter | Nmap, Nuclei |
| 6-8 | Generate container SBOMs and scan with Grype | Syft, Grype, Docker |
### Day 2: Correlation
| Hour | Activity | Tools |
|------|----------|-------|
| 0-2 | Correlate OS builds with Microsoft end-of-life list | Manual / spreadsheet |
| 2-4 | Correlate Linux packages with CVE database | Grype, vulners |
| 4-6 | Aggregate findings: top 20 vulnerabilities by frequency and severity | jq, Excel |
| 6-8 | Validate top 5 findings manually (exploitability check) | Nuclei, manual research |
### Day 3: Presentation
| Hour | Activity | Output |
|------|----------|--------|
| 0-2 | Create one-page executive summary | Markdown / PowerPoint |
| 2-4 | Present to steering committee: "Here is what we found in 48 hours with scripts" | Meeting |
| 4-6 | Discuss: what is the remediation path? What tools do we need to sustain this? | Roadmap |
**The conversation at Day 3**:
> *"In 48 hours, using only scripts and free tools, we found 340 known vulnerabilities across your estate. 23 are critical. Five of those are on internet-facing systems. Three are on end-of-life operating systems that cannot be patched. We can fix the patchable ones in two weeks. The unpatchable ones require architecture decisions. Here is the evidence. Now we can have an honest conversation about whether Tenable is worth the investment, or whether we build this capability with open-source tooling first."*
---
## When to Recommend Commercial Scanners
After the first sweep, you will know whether the client needs a commercial scanner:
| Scenario | Recommendation |
|----------|---------------|
| First sweep found <100 vulns, mostly patchable | **Do not buy Tenable yet.** Use scripts + cloud-native scanning + Intune/WSUS/SCCM for 6 months. Reassess. |
| First sweep found 100-500 vulns, client wants continuous visibility | **Deploy osquery + FleetDM first.** Provides owned, continuous monitoring for a fraction of scanner cost. Reassess in 6 months. |
| First sweep found 500+ vulns, heterogeneous estate | **Consider Tenable or Qualys** for continuous scanning and compliance reporting. Scripts cannot sustain at this scale. osquery can supplement for real-time data. |
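| Client needs compliance evidence (PCI, ISO 27001, SOC 2) | **Commercial scanner usually required.** PCI DSS external scans must come from an Approved Scanning Vendor, and ISO 27001 or SOC 2 auditors generally expect repeatable, vendor-backed scan reports rather than ad-hoc script output. |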
| Client has OT/IoT/embedded devices | **Specialized scanner required.** Traditional tools do not speak Modbus, BACnet, or proprietary protocols. |
| Client wants continuous attack surface monitoring | **Consider Tenable.asm, Cortex Xpanse, or Mandiant ASM.** Script-based discovery is point-in-time. |
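
The volume-based rows of the table reduce to a simple threshold rule, sketched below. The cut-offs come from the table, with the boundary at exactly 500 assigned to the top tier as an assumption. `recommend_next_step` is a hypothetical helper, and it deliberately ignores the compliance, OT, and attack-surface rows, which override the volume rule.

```bash
#!/usr/bin/env bash
# Sketch only: recommend_next_step is a hypothetical helper encoding the
# volume-based rows of the table; compliance and OT requirements take precedence.
recommend_next_step() {
  local vulns="$1"
  case "$vulns" in
    ''|*[!0-9]*) echo "usage: recommend_next_step <vulnerability-count>" >&2; return 2 ;;
  esac
  if [ "$vulns" -lt 100 ]; then
    echo "scripts"             # scripts + cloud-native scanning, reassess in 6 months
  elif [ "$vulns" -lt 500 ]; then
    echo "osquery-fleet"       # owned continuous monitoring first, reassess in 6 months
  else
    echo "commercial-scanner"  # consider Tenable or Qualys
  fi
}
```

For the 340 findings used in the Day 3 conversation, `recommend_next_step 340` lands in the middle tier, which matches the recommendation to deploy osquery and FleetDM first.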
---
## Honest Limitations
| What Script-Based Discovery Does Well | What It Cannot Do |
|--------------------------------------|-------------------|
| Finds missing patches and known CVEs | Cannot find zero-days or configuration logic flaws |
| Maps software inventory accurately | Cannot assess business impact without human context |
| Identifies end-of-life systems | Cannot provide the compliance audit trail auditors demand |
| Generates SBOMs for containers | Cannot scan air-gapped or offline systems without physical access |
| Costs zero in licensing | Requires administrative access (SSH/WinRM/domain admin) |
| Produces evidence fast | Requires technical expertise to interpret and act on findings |
**Note**: osquery addresses several script-based limitations: it enables continuous monitoring, scales to thousands of endpoints via FleetDM, and provides real-time process/network visibility. The trade-off is agent deployment and query maintenance. See [Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md).
---
## Integration With AI-Assisted TVM
The output of zero-budget discovery feeds directly into the AI-assisted TVM prioritization engine:
```
[zero-budget discovery] → Raw vulnerability data + SBOMs + OS inventories
[AI Prioritization] → Exploitability prediction + asset criticality + threat intel correlation
[Remediation Pipeline] → AI-generated scripts → human validation → deployment → validation
[Continuous Monitoring] → Re-scan → drift detection → quarterly purple team exercise
```
**The retained value**: Even if the client later buys Tenable, the SBOM pipeline and container scanning remain valuable. Tenable does not generate SBOMs as cleanly as Syft. Tenable does not scan containers as natively as Grype. The open-source stack **complements** the commercial scanner; it does not replace it.
---
*For the sovereign discovery platform built on osquery, see [Osquery: The Sovereign Discovery Platform](osquery-custom-platform.md).*
*For the AI-assisted TVM prioritization layer, see [AI-Assisted TVM Blueprint](ai-assisted-tvm.md).*
*For the perimeter scanning strategy, see [Perimeter Scanning Capability](perimeter-scanning-capability.md).*
*For the business case including tool costs, see [Business Case Template](business-case-template.md).*