Compare commits

9 Commits

Author SHA1 Message Date
tomas.kracmar 0d52474c30 chore: Add missing deliverable files (team guide, backlog, sample engagement) 2026-06-05 12:54:44 +02:00
Claude Sonnet 4.6 dc83336567 feat: Add assessment team guide for Brownhat Diagnostic execution
New: assessment-templates/assessment-team-guide.md

Pre-engagement: access checklist (M365, AD, docs); tool preparation
with deployment times; what to do if access is not ready.

Day 1 discipline: deploy ASTRAL and PULSAR before workshops start.
Step-by-step ASTRAL and PULSAR deployment commands. Passive external
scan in background. Microsoft Secure Score baseline.

Workshop signals: table of client statements -> likely findings ->
what to check on Day 2. Feeds technical assessment planning.

Day 2-3 tool runs in sequence:
1. CAExporter (30 min) - CA policy reality check; report-only mode;
   exclusion groups defeating the purpose
2. BloodHound (1-2h) - 5 required queries; KRBTGT last set check;
   Domain Admins on workstations; service account attack paths
3. Elysium (2-4h) - privilege requirements noted; privacy model
   explanation; what to document
4. Purple Knight (30 min) - indicators to focus on; cross-reference
   with BloodHound
5. Entra ID manual checks (1h) - app registrations, guest accounts,
   MFA registration status, AD Connect sync account
6. Intune/endpoint check (30 min) - via ASTRAL output
7. External attack surface (30-60 min) - Nmap, Shodan, crt.sh
8. Firewall rule review (30-60 min) - what to look for
9. Backup spot check (30 min) - the 'green tick' test

Kill chain synthesis: explicit step-by-step method for tracing
from outside to organisational failure.

Finding triage: kill chain test table; common priority inflation
mistakes.

Quick wins: 8-item checklist; three tests a quick win must pass.

Report structure: 5 sections, target 15-25 pages, specific guidance
per section including what makes a weak vs strong finding.

ASTRAL/PULSAR handover requirements before leaving site.

9 common assessment mistakes named explicitly.

Post-assessment checklist: 10 items before submitting the report.

index.md and assessment-templates/README.md updated.

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 10:42:18 +00:00
Claude Sonnet 4.6 097e93a431 feat: Add sample engagement for mid-market hybrid organisation
New: playbooks/sample-engagement-mid-market.md
  Client profile: 500 employees, 10 admins, AD+M365 E3, Intune,
  3rd party on-prem/cloud mix, NIS2 important entity, 3 offices,
  hybrid work, 80 external contractors. Fictional: Nexus Operations s.r.o.

  Sections:
  - Client profile and engagement context
  - Discovery call findings and disqualifier check
  - Brownhat Diagnostic: kill chain analysis, P0/P1/P2 findings table
  - 5 quick wins closeable before Day 30
  - Module recommendation and rationale (Modules 2, 6, 1, 7)
  - Day 30/90/180 deliverables specific to this client
  - Findings backlog pre-populated (23 items, P0 all closed by Day 90)
  - NIS2 Article 21 compliance map with evidence per measure
  - Investment estimate (55-80 consultant days)
  - Consultant notes: CISO handover, NIS2 pressure, two-domain AD,
    SAP credentials scope, contractor offboarding process dependency

index.md: Sample engagement added to playbooks table

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 10:26:20 +00:00
Claude Sonnet 4.6 10f9a9bded fix: Correct ADO/M365 integration claim in findings backlog
No native ADO -> Planner/To Do sync exists. Replace with accurate options:
- Teams tab: pin ADO board into Teams channel (built-in, no setup)
- Power Automate: available for notifications/Planner push but adds
  complexity; not recommended as default

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 10:14:20 +00:00
Claude Sonnet 4.6 486c092c32 feat: Add three concrete deployment options to findings backlog
Replace vague 'live where client works' with three ordered options:

Option 1 (default): ADO Work Items
  ASTRAL is already in ADO; Work Items are built in, zero additional
  tooling. Board setup guidance included. M365 Planner/To Do sync
  via ADO connector or Power Automate: non-technical owners see
  assigned findings in their daily task list without opening ADO.
  ASTRAL integration: link Work Items to drift PRs directly.

Option 2 (upgrade): CISO Assistant
  For clients building toward formal GRC. Bridges backlog to risk
  register: findings promoted from operational backlog to documented
  risks with treatment plans and compliance evidence links. Docker
  Compose, self-hosted, 30 minutes to deploy.

Option 3 (fallback): Git flat file
  For clients with technical capability and preference for minimal
  tooling. Template retained. Limitation noted: no notifications,
  no Planner sync - if the IT lead needs nudging, use ADO instead.

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 10:12:03 +00:00
Claude Sonnet 4.6 5c4e91179d feat: Add findings backlog as pragmatic alternative to risk register
New: assessment-templates/findings-backlog.md
  Design principles: lives where client works, every finding has an owner,
  feeds the housekeeping stream, accumulates from all sources.
  Format: 6-field minimal entry (ID, finding, source, priority, owner,
  status) with optional target date/effort/notes/closed date.
  P0/P1/P2 priority using kill chain test.
  Flat file template for Git-based clients.
  Population guide: Day 30 (from Brownhat), subsequent modules, continuous
  tools (ASTRAL drift, PULSAR alerts, Elysium, BloodHound).
  Monthly housekeeping cycle structure.
  Relationship to formal risk register explained.
  Backlog health indicators (warning signs it is not functioning).

Wired into existing framework:
  move-fast-and-fix-things.md: Rule 4 now names the backlog as the queue
  rapid-modernisation-plan.md: Day 30 item 7 and Phase 1 action updated
  engagement-model.md: Section 4 deliverables table updated at all stages
  assessment-templates/README.md: Production-ready templates section added
  index.md: Findings Backlog added to Assessment and Tools table

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 10:09:08 +00:00
Claude Sonnet 4.6 6162bb474f fix: Replace cloud AI cost rows in business case direct costs table
Remove 'Cloud AI vendor price shock' (not a security risk; unverifiable
number) and 'Competitive intelligence loss from AI training' (inaccurate
claim that contradicts corrections made throughout the framework).

Replace with:
- Incident response and forensics (EUR 150-500K, real range)
- Business interruption during recovery (client-specific daily revenue)

All five rows now map directly to risks the programme addresses and
are quantifiable in a CFO conversation.

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 09:59:12 +00:00
Claude Sonnet 4.6 3b69f255ec feat: Add concrete milestone deliverables at Day 30/90/180
rapid-modernisation-plan.md: New 'Milestone Deliverables' section with
23 numbered, verifiable deliverables across three milestones.

Day 30 (7 deliverables): Brownhat Diagnostic, ASTRAL deployed, PULSAR
deployed, T0 accounts hardened, attack surface report, quick wins closed,
stale account queue opened. Hard gate: if ASTRAL/PULSAR not deployed,
the bottleneck is access provisioning not scope.

Day 90 (9 more deliverables): MFA for all users enforced (not enrolled),
legacy auth blocked, CA baseline, P0/P1 vulns closed, BloodHound before/
after, vendor access hardened, T0 backup verified, ASTRAL restore drill,
PULSAR top 5 alert rules with runbooks.

Day 180 (7 more deliverables): Alert runbooks, custom detection rules,
client IT lead independence (live walkthrough), housekeeping 3 cycles,
module completion packages, risk register closure evidence, retained scope.

Each milestone includes the verifiable evidence column and a 'what this
value stands alone' statement. Section closes with honest timeline
modifiers (large AD, high user count, OT environments).

business-case-template.md: The Ask updated to quote the three milestones
explicitly.

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 09:54:49 +00:00
Claude Sonnet 4.6 878fca3f0b feat: Rewrite rapid-modernisation-plan and business-case for realism
rapid-modernisation-plan.md:
- Add honest framing section: what 180 days delivers vs. what takes 2-3 years
- Extend Phase 1 from 30 to 60 days; rename to Visibility
- Remove dangerous 'disable all unknown accounts in week 1-2' instruction
- Replace Phase 3 (AI Sovereignty) with Signal and Retained Capability
- Phase 3 now: detection engineering, alert runbooks, knowledge transfer
- Phase 4 made explicitly open-ended (not complete at day 180)
- Fix success metrics: remove unverifiable targets, replace with honest ones
- Remove 'compress Phases 1-2 into 30 days for small orgs' adaptation
- Add 'What This Plan Is Not' practitioner section
- ASTRAL and PULSAR integrated as Phase 1 deliverables
- AI Sovereignty moved to multi-year parallel initiative

business-case-template.md:
- Break-even corrected: Day 90 -> 12-18 months post-programme
- Phase budget table updated: 30/30/30/90 -> 60/60/60/ongoing
- Phase names and deliverables aligned with revised RMP
- AI sovereignty removed from core deliverables
- Sensitivity analysis: 3 scenarios -> 4 including abort condition
- Alternatives table: AI sovereignty removed from Antifragile programme description
- ROI table: cloud AI cost line replaced with audit preparation time saving
- The Ask: 30-day first gate -> 60-day first gate

Co-Authored-By: Tom Kracmar <tom+claude@cat6.cz>
2026-06-05 09:47:25 +00:00
9 changed files with 1407 additions and 248 deletions
@@ -2,7 +2,19 @@
> *"What gets measured gets managed. What gets managed honestly becomes antifragile."*
This directory contains diagnostic tools, maturity models, and assessment resources for evaluating organizational antifragility. Two production-ready tools are available now; additional assessments are in active development. This directory contains diagnostic tools, maturity models, and assessment resources for evaluating organizational antifragility.
## Production-Ready Templates
| Template | Purpose |
|----------|---------|
| [Assessment Team Guide](assessment-team-guide.md) | Technical execution guide for the Brownhat Diagnostic: tool sequence (ASTRAL, PULSAR, BloodHound, Elysium, Purple Knight, CAExporter), what to look for, kill chain synthesis, report structure, common mistakes. |
| [Findings Backlog](findings-backlog.md) | Single source of truth for all findings across every module and diagnostic. The input queue for the housekeeping stream. Pragmatic alternative to a formal risk register for organisations that do not have one. |
| [NIST CSF 2.0 Baseline Assessment](nist-csf-baseline.md) | The Brownhat Diagnostic: structured 2-half-day workshop, gap analysis, kill chain identification |
| [Module Completion Report](module-completion-report.md) | Completion package template for every module; includes backlog update |
| [Antifragile Risk Register](antifragile-risk-register.md) | Formal risk register template; the backlog feeds into this for organisations with mature risk management |
| [Risk Register Example](risk-register-example.md) | 8 fully populated entries from a realistic engagement — calibration reference |
| [M365 Project Risk Register](m365-project-risk-register.md) | M365-specific risk register with phase gates |
## Planned Assessments
@@ -0,0 +1,514 @@
# Assessment Team Guide: Technical Execution of the Brownhat Diagnostic
> *"The workshop tells you what the client thinks is happening. The tools tell you what is actually happening. Run the tools before the second session — the findings change the conversation."*
This guide covers the technical execution of the Brownhat Diagnostic. It is the companion to the [NIST CSF 2.0 Baseline Assessment](nist-csf-baseline.md), which covers the workshop methodology. Read both before your first diagnostic.
**Division of labour**: The workshop facilitator runs the NIST CSF sessions and manages the client conversation. The technical assessor runs the tools, collects evidence, and builds the findings. These can be the same person in smaller engagements, but if you have two people, split them — the findings from Day 2 tool runs should inform the workshop conversation, not interrupt it.
---
## Before You Arrive: Pre-Engagement Preparation
### Access to Request (Before Kickoff)
Send this checklist to the client IT lead at least 5 business days before Day 1. Missing access on Day 1 is the most common cause of diagnostic delay.
**M365 / Entra ID**:
- [ ] Global Reader role in Entra ID (read-only; sufficient for most checks)
- [ ] Entra ID audit log access (to verify logging is enabled before PULSAR deploys)
- [ ] Exchange admin centre read access
- [ ] SharePoint admin centre read access
- [ ] Intune read access (Device Management / Endpoint Manager)
- [ ] Microsoft Secure Score access
- [ ] Conditional Access policies read access
**Active Directory**:
- [ ] Domain User account on the domain(s) — BloodHound collection only needs this
- [ ] Read access to ADUC (Active Directory Users and Computers)
- [ ] Ability to run PowerShell on a domain-joined machine (for BloodHound collector and Elysium — see notes below on Elysium privilege requirements)
**Network / Infrastructure**:
- [ ] Access to firewall management interface (read-only; to review ruleset)
- [ ] VPN access or on-site working arrangement for Day 2 tool runs
- [ ] Previous pentest or audit reports (if any exist)
**Documents to request**:
- [ ] Network diagram (any version, however outdated — better than none)
- [ ] Asset inventory or CMDB export (even a spreadsheet)
- [ ] Previous security audit or pentest report
- [ ] List of third-party SaaS tools (from procurement or IT)
- [ ] Organisational chart for IT/security team
> **What to do if access is not ready**: Do not delay the workshop waiting for full access. Start Session 1 with what you have. Deploy ASTRAL and PULSAR as soon as any M365 access is confirmed; they produce value from minute one. Tool runs that need AD access can happen on Day 2-3 once an account is provisioned.
### Tool Preparation
Have these ready to deploy before Day 1. Do not learn a tool at a client's expense.
| Tool | Preparation | Time to deploy |
|------|-------------|---------------|
| ASTRAL | ADO project created; pipeline YAMLs ready; `bootstrap-tenant.ps1` reviewed | 2-4 hours on-site |
| PULSAR | Docker Compose environment ready; `bootstrap-tenant.ps1` reviewed | 1-2 hours on-site |
| BloodHound CE | Installed on assessment laptop; SharpHound collector downloaded | 15 minutes |
| Elysium | Cloned and tested in lab; KHDB download initiated (large file; download before arriving) | 30 min setup; KHDB download 30-60 min |
| CAExporter | Downloaded and tested | 10 minutes |
| Purple Knight | Downloaded from Semperis (free, requires registration) | 15 minutes |
| E8-CAT | Downloaded and tested (for Australian clients or E8-aligned clients) | 10 minutes |
| Nmap / Shodan | Nmap installed; Shodan account active (free tier sufficient) | Ready |
---
## Day 1: Deploy First, Ask Questions Later
The single most important discipline: **deploy ASTRAL and PULSAR before the first workshop session begins.** The baseline they capture is a point-in-time snapshot. If you wait until after the workshops, the baseline may already reflect changes the client made in response to your questions.
### Morning: Deploy Listening Tools
Before Session 1 starts, or during the first 30-minute introductions slot:
**Step 1 — ASTRAL deployment** (~2 hours, can run in background)
```powershell
# On the client's Azure DevOps or your assessment instance
.\deploy\bootstrap-tenant.ps1 -TenantName "<client>.onmicrosoft.com"
```
Follow the [ASTRAL onboarding runbook](https://github.com/cqrenet/astral/blob/main/deploy/onboarding-runbook.md). The initial full backup pipeline run captures the complete M365 configuration baseline. This is your "before" snapshot — everything you find during the assessment is measured against this.
**What ASTRAL captures on first run**:
- All Intune profiles, policies, compliance policies, applications, scripts
- All Conditional Access policies (with full named-object resolution via CAExporter integration)
- All Entra ID app registrations and enterprise applications
- All authentication methods and named locations
- Produces HTML/PDF as-built documentation automatically
**Step 2 — PULSAR deployment** (~1 hour, can run in background)
```bash
cp .env.example .env
# Fill in CLIENT_ID, CLIENT_SECRET, TENANT_ID from bootstrap output
docker compose up --build -d
```
Once running, trigger a manual fetch to confirm audit log ingestion is working:
```
GET http://localhost:8000/api/fetch-audit-logs
```
**What PULSAR captures immediately**: All M365 admin audit events from the Management Activity API (Exchange, SharePoint, Teams, Entra, Intune). Retention starts from this moment — every admin action from here forward is permanently searchable. For clients with no prior log retention, this is instant value.
**Step 3 — Microsoft Secure Score baseline** (10 minutes)
Navigate to `security.microsoft.com → Secure Score`. Screenshot the current score and the top 10 recommended actions. This is a quick reference point for the workshop conversation and gives the client a number they immediately understand.
**Step 4 — External scan** (runs in background during workshop; the Shodan and crt.sh lookups are passive, but the Nmap service scan actively probes the client's hosts)
```bash
# From your assessment machine
nmap -sV --open -p 80,443,8080,8443,3389,22,21,25,993,995 [client-public-IPs]
# Shodan CLI for ASN-based discovery
shodan search "org:[client-org-name]" --fields ip_str,port,banner,product
```
Also check:
- Certificate transparency logs: `crt.sh/?q=[client-domain]` — reveals subdomains, expired certs, shadow IT domains
- Shodan for the VPN endpoint specifically: firmware version, known CVEs
- `whois` and reverse DNS for all IP ranges the client mentions
---
## During the Workshops: What to Listen For
The [NIST CSF Baseline](nist-csf-baseline.md) has the full question set. Below are the specific signals to listen for that indicate P0/P1 findings. Note these immediately — they feed the technical checklist for Day 2.
| What the client says | What it likely means | Check on Day 2 |
|---------------------|---------------------|----------------|
| "We haven't tested our backups recently" | No restore has ever been done | Recovery drill required; check backup destination |
| "We use shared admin accounts" | Multiple people using one credential | Elysium; AD audit; no MFA possible on shared account |
| "Contractors have the same access as employees" | Likely no offboarding process; stale accounts | Elysium; AD account audit; HR cross-reference |
| "We have MFA but I think some people have exemptions" | CA policies in report-only or with large exclusion groups | CAExporter; Entra ID CA policy review |
| "The acquisition brought in a second AD" | Forest trusts; uncharted attack paths; duplicate admin accounts | BloodHound must cover both domains |
| "We use [legacy on-prem system] with its own accounts" | Shadow identity; service accounts not in scope of central IAM | Manual AD service account audit |
| "IT handles offboarding when HR tells us" | Offboarding depends on HR notification — often delayed | Elysium; compare AD accounts to HR list |
| "I'm not sure who all has admin access" | No privileged access inventory | BloodHound; ADUC privileged group audit |
| "We have a firewall but nobody has reviewed the rules in years" | Accumulated rules; likely any/any entries; retired services still open | Firewall rule export and review |
| "Some of our developers have direct access to production" | Uncontrolled privileged access to production systems | Scope question for Module 6 |
---
## Day 2-3: Technical Tool Runs
Run tools in this order. Earlier tools inform later ones.
### 1. CAExporter — Conditional Access Baseline (30 minutes)
Run first. The CA policy export reveals whether MFA is actually enforced or just configured. This is consistently the most surprising finding in M365 environments.
```powershell
# Requires Entra ID reader access
.\CAExporter.ps1 -TenantId <tenant-id> -OutputPath .\ca-export\
```
**What to look for**:
- Policies in **Report-Only** mode (not enforced — common; clients assume they are protected when they are not)
- Large **exclusion groups** containing most users ("AllUsers_ExceptionGroup" type)
- Policies that claim to block legacy authentication but have exclusions that defeat the purpose
- No policy enforcing device compliance
- Multiple overlapping policies with unclear precedence
**Output**: Excel workbook with one row per policy, conditions and controls expanded, groups and apps named rather than showing GUIDs. This is the CA baseline document.
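
The first two checks above can be pre-screened in code once the policies are available as data. The following is a minimal Python sketch, not part of CAExporter: it assumes the policies have been obtained as JSON objects shaped like Microsoft Graph `conditionalAccessPolicy` resources (`displayName`, `state`, `conditions.users.excludeGroups`). The function name `flag_ca_policies` and the exclusion threshold are hypothetical.

```python
from typing import Any

# Graph reports Report-Only policies with this state value.
REPORT_ONLY = "enabledForReportingButNotEnforced"


def flag_ca_policies(
    policies: list[dict[str, Any]], max_excluded_groups: int = 0
) -> list[tuple[str, str]]:
    """Return (policy name, reason) pairs for policies that need manual review.

    Assumption: each dict resembles a Graph conditionalAccessPolicy object.
    The threshold for "too many exclusion groups" is illustrative only.
    """
    flagged = []
    for policy in policies:
        name = policy.get("displayName", "<unnamed>")
        state = policy.get("state")
        if state == REPORT_ONLY:
            flagged.append((name, "report-only: not enforced"))
        elif state == "disabled":
            flagged.append((name, "disabled"))
        users = policy.get("conditions", {}).get("users", {})
        excluded = users.get("excludeGroups") or []
        if len(excluded) > max_excluded_groups:
            flagged.append(
                (name, f"{len(excluded)} exclusion group(s): check membership size")
            )
    return flagged


# Illustrative sample data, not taken from any real tenant.
sample = [
    {"displayName": "Require MFA", "state": REPORT_ONLY,
     "conditions": {"users": {"excludeGroups": []}}},
    {"displayName": "Block legacy auth", "state": "enabled",
     "conditions": {"users": {"excludeGroups": ["AllUsers_ExceptionGroup"]}}},
]
for name, reason in flag_ca_policies(sample):
    print(f"{name}: {reason}")
```

The script only narrows the list; group membership size and policy precedence still need to be read by a person.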
---
### 2. BloodHound — AD Attack Path Analysis (1-2 hours collection + analysis)
```powershell
# Run SharpHound from a domain-joined machine using the assessor domain account
.\SharpHound.exe -c All --zipfilename nexus-bloodhound.zip
```
Copy the zip to your assessment machine and import into BloodHound CE.
**Required queries** (run these first, every engagement):
```cypher
// Replace DOMAIN.LOCAL with the client's domain; run each query separately
// Shortest paths to Domain Admin from all non-admin users
MATCH p=shortestPath((u:User {admincount:false})-[*1..]->(g:Group {name:"DOMAIN ADMINS@DOMAIN.LOCAL"})) RETURN p
// All Domain Admin members with direct login sessions on workstations
MATCH (u:User)-[:MemberOf]->(g:Group {name:"DOMAIN ADMINS@DOMAIN.LOCAL"})
MATCH (u)-[:HasSession]->(c:Computer) WHERE NOT c.name CONTAINS "DC" RETURN u.name, c.name
// Kerberoastable accounts with high privilege
MATCH (u:User {hasspn:true}) WHERE u.admincount=true RETURN u.name, u.serviceprincipalnames
// ASREPRoastable accounts (no Kerberos pre-auth)
MATCH (u:User {dontreqpreauth:true}) RETURN u.name, u.enabled
// Service accounts with paths to Domain Admin
MATCH p=shortestPath((u:User)-[*1..5]->(g:Group {name:"DOMAIN ADMINS@DOMAIN.LOCAL"}))
WHERE u.name CONTAINS "$" OR u.name CONTAINS "SVC" OR u.name CONTAINS "SERVICE"
RETURN p
```
**What to document**:
- Number of paths to Domain Admin from non-admin users (the "847 paths" number from the sample)
- Shortest path length and the specific nodes on it — this is your kill chain
- Domain Admins with sessions on non-DC workstations — P1 finding in almost every environment
- Any service accounts that are Kerberoastable and have high privilege — often P0
- KRBTGT last password set date (check in ADUC or PowerShell)
```powershell
# KRBTGT last password set
Get-ADUser krbtgt -Properties PasswordLastSet | Select PasswordLastSet
```
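
To turn the `PasswordLastSet` value into something reportable, the age can be computed and compared against a threshold. A minimal Python sketch follows; the 180-day threshold and the function names are assumptions for illustration, since this guide does not prescribe a maximum KRBTGT password age.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical threshold: the guide does not set one; adjust per engagement.
MAX_KRBTGT_AGE_DAYS = 180


def password_age_days(password_last_set: datetime, now: Optional[datetime] = None) -> int:
    """Whole days between the PasswordLastSet timestamp and now (UTC)."""
    now = now or datetime.now(timezone.utc)
    return (now - password_last_set).days


def krbtgt_note(password_last_set: datetime, now: Optional[datetime] = None) -> str:
    """Produce a one-line note for the findings working file."""
    age = password_age_days(password_last_set, now)
    if age > MAX_KRBTGT_AGE_DAYS:
        return (f"KRBTGT password last set {age} days ago: "
                f"exceeds {MAX_KRBTGT_AGE_DAYS}-day threshold")
    return f"KRBTGT password last set {age} days ago: within threshold"


# Illustrative dates only.
last_set = datetime(2020, 1, 1, tzinfo=timezone.utc)
as_of = datetime(2021, 1, 1, tzinfo=timezone.utc)
print(krbtgt_note(last_set, as_of))
```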
---
### 3. Elysium — Password Audit (2-4 hours, requires elevated AD access)
> **Privilege requirement**: Elysium requires Domain Admin or equivalent (DSInternals needs to read password hashes). Confirm this access before scheduling. If it cannot be arranged during the diagnostic, schedule it for week 1 of Module 6.
```powershell
# Run from a domain controller or with delegated rights
.\Elysium.ps1 -Domain <domain-fqdn> -OutputPath .\elysium-output\
```
**What Elysium finds**:
- Accounts matching known-breached password hashes (from the KHDB — download before arriving)
- Accounts with blank passwords
- Accounts with passwords matching dictionary patterns
- Duplicate passwords across accounts (shared credential detection)
**Output to document**:
- Total accounts audited
- Accounts matching KHDB (breached) — split by privileged vs non-privileged
- Accounts with common passwords
- Any privileged account with a compromised or weak password → immediate P0
**Privacy handling**: Elysium does not transmit usernames or plaintext passwords. The KHDB comparison is local. The output is a list of SAMAccountNames to reset — not passwords. Communicate this clearly to the client before running.
---
### 4. Purple Knight — AD Security Scoring (30 minutes)
Purple Knight (Semperis, free) runs a broad checklist of AD security misconfigurations. Run it from any domain-joined machine.
```powershell
.\PurpleKnight.ps1
```
The report scores against ~100 indicators. **Focus on**:
- LDAP signing and channel binding status
- AdminSDHolder unusual members
- Protected Users group membership (or absence of it for admins)
- Reversible encryption enabled accounts
- Unconstrained delegation (computers and users)
- Machine account quota (default 10 — often abused for relay attacks)
- Exchange permissions on AD objects (if Exchange exists on-prem)
Cross-reference Purple Knight findings with BloodHound. Purple Knight finds the indicators; BloodHound shows how they chain together into attack paths.
---
### 5. Entra ID Manual Checks (1 hour)
These cannot be automated — they require visual inspection in the Entra admin centre.
**App registrations and enterprise applications**:
- Navigate to: `Entra ID → App registrations → All applications`
- Filter by: "High privilege permissions" — look for `Mail.ReadWrite`, `Directory.ReadWrite.All`, `User.ReadWrite.All`
- Note any apps with these permissions that are: (a) published by unknown parties, (b) have no documented owner, (c) were consented to by users rather than admins
- This is consistently where the most surprising findings live — OAuth consent abuse is underdetected in every mid-market environment
**Guest accounts**:
- Navigate to: `Entra ID → Users → Filter: User type = Guest`
- How many guests are there? When was their last sign-in? Are any of them former contractors?
**MFA registration status**:
- Navigate to: `Entra ID → Users → Per-user MFA` (legacy view) OR `Identity Protection → Monitoring → Authentication methods → User registration details`
- What % of users have MFA registered? What % have it enforced?
- Are there any break-glass accounts? Are they properly protected and audited?
**Entra ID Connect sync account** (hybrid environments only):
- Navigate to: `Entra ID → App registrations → find the sync account`
- Check what rights it has in Entra ID
- Cross-reference with on-prem AD: does this account have DCSync rights? (BloodHound query: search for the account name and check its paths)
---
### 6. Intune / Endpoint Check (30 minutes — via ASTRAL output or direct)
ASTRAL's first run will have produced an Intune inventory. Review:
- **Enrollment rate**: What % of devices are enrolled? What platforms?
- **Compliance policy coverage**: Is there a compliance policy? What does it enforce? Is it assigned to all devices?
- **Conditional Access integration**: Is the "Require compliant device" CA policy active — or in report-only?
- **Stale devices**: Devices with last check-in > 90 days are likely personal devices or ghost entries. Note the count.
- **Script inventory**: What PowerShell scripts are deployed via Intune? Any that look unfamiliar?
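
The stale-device count can be derived from any device inventory that carries a last check-in timestamp. This Python sketch assumes records with `deviceName` and `lastSyncDateTime` keys (the property names Microsoft Graph uses for Intune managed devices); whether ASTRAL's output exposes the same fields is an assumption, and `stale_devices` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def stale_devices(devices: list[dict], now: datetime, max_age_days: int = 90) -> list[str]:
    """Return names of devices whose last check-in is more than max_age_days ago."""
    cutoff = now - timedelta(days=max_age_days)
    return [d["deviceName"] for d in devices if d["lastSyncDateTime"] < cutoff]


# Illustrative inventory; timestamps are built relative to a fixed "now".
now = datetime(2026, 6, 5, tzinfo=timezone.utc)
inventory = [
    {"deviceName": "LAPTOP-OLD", "lastSyncDateTime": now - timedelta(days=155)},
    {"deviceName": "LAPTOP-ACTIVE", "lastSyncDateTime": now - timedelta(days=16)},
    {"deviceName": "LAPTOP-EDGE", "lastSyncDateTime": now - timedelta(days=90)},
]
stale = stale_devices(inventory, now)
print(f"{len(stale)} of {len(inventory)} devices stale: {stale}")
```

A device that checked in exactly 90 days ago is not counted, matching the "> 90 days" wording above.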
---
### 7. External Attack Surface (30-60 minutes)
By Day 2, the Nmap and Shodan scans from Day 1 should have results.
**Review**:
- Any RDP (3389) exposed to internet → P0 in almost every context
- Any management interfaces (firewalls, switches, VPN management) accessible from internet
- Any services with outdated banners suggesting old software versions
- Certificate expiry on any internet-facing service
- VPN endpoint firmware version → check against vendor advisory for known CVEs
**Additional check — subdomain enumeration**:
```bash
# Using crt.sh results and DNS brute force
cat crt-sh-results.txt | grep "<client-domain>" | sort -u
# For each subdomain found: what is it? Is it documented? Is it still active?
```
Undocumented subdomains pointing to forgotten services are a regular P1 finding.
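
If the crt.sh results were saved as JSON (`output=json`) rather than plain text, the same de-duplication can be done in Python. This is a sketch under that assumption: each record is expected to carry a `name_value` field that may hold several newline-separated names, and `subdomains_from_crtsh` is a hypothetical helper name.

```python
def subdomains_from_crtsh(entries: list[dict], domain: str) -> list[str]:
    """Return sorted, de-duplicated hostnames under `domain` from crt.sh JSON records."""
    domain = domain.lower().rstrip(".")
    found = set()
    for entry in entries:
        # One certificate record can list several names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat wildcard certs as their parent name
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


# Illustrative records using the reserved example.com domain.
records = [
    {"name_value": "www.example.com\nvpn.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "WWW.example.com"},
    {"name_value": "notexample.com"},
]
for host in subdomains_from_crtsh(records, "example.com"):
    print(host)
```

Each hostname in the output still needs the manual questions above: what is it, is it documented, is it still active?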
---
### 8. Firewall Rule Review (30-60 minutes)
Request an export of the firewall ruleset. Most firewall platforms support CSV or XML export.
**What to look for**:
- Rules with `source = ANY` and `destination = ANY` (any/any) → almost always P2 but sometimes P1 if it covers a sensitive segment
- Rules allowing direct internet access from server VLANs → P1
- Rules created for a specific project that are still active years later → P2
- Rules referencing IP addresses that no longer correspond to live systems
- No rule for blocking outbound traffic (egress filtering absent) → P1 for environments with sensitive data
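
The any/any check is easy to pre-screen from a CSV export. The sketch below is in Python and assumes a hypothetical column layout (`name`, `source`, `destination`, `service`, `action`); real exports differ by firewall vendor, so the column names and the set of "any" spellings will need adapting.

```python
import csv
import io

# Spellings that commonly mean "any" in rule exports (assumed, vendor-dependent).
ANY_VALUES = {"any", "all", "*", "0.0.0.0/0"}


def find_any_any_rules(csv_text: str) -> list[str]:
    """Return names of allow rules whose source and destination are both 'any'."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["action"].strip().lower() != "allow":
            continue
        source = row["source"].strip().lower()
        destination = row["destination"].strip().lower()
        if source in ANY_VALUES and destination in ANY_VALUES:
            hits.append(row["name"])
    return hits


# Illustrative export, not from a real ruleset.
export = """name,source,destination,service,action
legacy-migration,any,any,any,allow
web-in,any,10.0.1.10,tcp/443,allow
deny-all,any,any,any,deny
"""
print(find_any_any_rules(export))
```

The remaining checks (stale project rules, dead IP addresses, missing egress filtering) depend on context the export does not contain and stay manual.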
---
### 9. Backup and Recovery Spot Check (30 minutes)
Ask the IT lead to show you, live:
- Where backups are stored (destination)
- When the last backup ran and whether it completed successfully
- Whether the backup destination is on the same network segment as the system being backed up
- Whether anyone has ever triggered a test restore and what the result was
> **The standard answer**: "Backups run every night and we get a green tick." The right follow-up: "Show me the most recent successful restore test." In most environments, one has never been performed.
Document: backup target, last run, completion status, last restore test (date or "never").
---
## Synthesising Findings: From Data to Kill Chain
After tool runs are complete, before writing the report, do this step explicitly. Sit with your notes and answer one question:
**"What is the shortest sequence of steps an adversary with no prior access could take to cause the organisation to fail to operate?"**
Build the kill chain step by step:
1. Start from the outside (what can be accessed without credentials?)
2. What is the first credential gain? (phishing, password spray against legacy auth, VPN without MFA)
3. What does that credential give access to? (M365 if MFA is not enforced; VPN if no MFA there)
4. What can you do with M365 access? (read all email, access SharePoint, escalate via app permissions)
5. What is the path from M365 access to domain admin? (Entra ID admin → AD Connect sync account → DCSync)
6. What does domain admin give you? (everything on-prem, including ERP, backup servers)
7. What is the impact? (data exfiltration, ransomware, operational disruption)
Write this as a chain, not a list. The [sample engagement kill chain](sample-engagement-mid-market.md#kill-chain-assessment) shows the format.
---
## Finding Triage and Priority Assignment
For every finding, apply the kill chain test:
| Question | Priority |
|----------|----------|
| Is this a node on the kill chain? | **P0** — fix before anything else |
| If exploited, does material harm result even if not on the kill chain? | **P1** — fix this engagement |
| Real finding, real risk, but not on the kill chain and not immediately material? | **P2** — housekeeping queue |
| Best practice recommendation with no exploitable risk? | **Observation** — note in report, do not count as a finding |
**Common priority inflation mistakes**:
- Marking "no security awareness training programme" as P0 — it is P2 at most
- Marking every missing patch as P0 — only patches for internet-facing or kill-chain systems
- Marking "weak password policy" as P0 when Elysium shows no actual weak passwords — the policy is P2; actual weak credentials on privileged accounts are P0
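
The kill chain test is an ordered decision: the first question answered "yes" sets the priority. A minimal Python sketch of that ordering follows; the function and parameter names are illustrative, and the three yes/no answers still come from the assessor's judgement.

```python
def triage(on_kill_chain: bool, material_harm: bool, exploitable_risk: bool) -> str:
    """Apply the kill chain test questions in order and return the priority tier."""
    if on_kill_chain:
        return "P0"  # node on the kill chain: fix before anything else
    if material_harm:
        return "P1"  # material harm if exploited: fix this engagement
    if exploitable_risk:
        return "P2"  # real risk, not immediately material: housekeeping queue
    return "Observation"  # best practice only: note in report, not a finding


# Examples drawn from the inflation mistakes listed above.
print(triage(on_kill_chain=False, material_harm=False, exploitable_risk=True))
print(triage(on_kill_chain=True, material_harm=True, exploitable_risk=True))
```

The first call models a weak password policy where Elysium found no actual weak passwords; the second models a weak credential on a privileged account that sits on the kill chain.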
---
## Quick Wins Identification
A quick win must pass three tests:
1. **Closeable in hours or days, not weeks** — requires no procurement, no change window longer than one day, no significant testing
2. **Uses only existing tools and permissions** — no new purchase, no new deployment
3. **Meaningfully reduces risk** — not cosmetic
For M365/AD environments, the standard quick wins checklist:
- [ ] Activate CA policies already in Report-Only mode
- [ ] Remove large exception groups from CA compliance policies
- [ ] Block legacy authentication (CA policy template exists in every tenant)
- [ ] Enforce MFA at organisation level in GitHub / other SaaS tools
- [ ] Disable accounts confirmed as departed contractors (HR-verified, scripted disable)
- [ ] Enable audit logging where it is off (often disabled on legacy servers to save disk)
- [ ] Revoke suspicious OAuth app permissions (for obvious unknowns with high privilege)
- [ ] Change default credentials on any system where they are confirmed unchanged
---
## Report Structure
The Brownhat Diagnostic report has five sections. Target length: 15–25 pages. Not more: if it is longer, it will not be read.
### 1. Executive Summary (2 pages)
- Current state in one paragraph — honest, not alarming
- Kill chain: the specific path, named, diagrammed if possible
- P0 count, P1 count, P2 count
- Quick wins: what was closed immediately (if Day 1 quick wins were executed)
- Recommended first module and rationale
- NIS2 compliance gap summary (if applicable): which Article 21 measures have evidence, which do not
### 2. Methodology (0.5 pages)
- Workshop dates, attendees
- Tools used (ASTRAL, PULSAR, BloodHound, Elysium, Purple Knight, CAExporter, external scan)
- Access used (read-only Entra ID, domain user for BloodHound, domain admin for Elysium)
- What was NOT assessed (explicitly scoped out — sets expectations)
### 3. Findings (8–15 pages)
Organise by priority tier, not by domain.
**P0 — Kill Chain Nodes**: Each finding gets a half-page: the finding in one sentence, the evidence, the business impact in non-technical language, and the remediation. Name the specific accounts, policies, or systems involved. "Admin accounts lack MFA" is a weak finding. "3 of 5 Global Administrator accounts — `admin@nexus.onmicrosoft.com`, `it-admin@nexus.onmicrosoft.com`, and the break-glass account — can authenticate without MFA because the Conditional Access policy 'Require MFA' is in Report-Only mode" is a finding.
**P1 — Material Risk**: Same format, briefer. One paragraph per finding.
**P2 — Housekeeping Queue**: Table format only. ID, finding, why it matters in one sentence.
### 4. Module Recommendation (2 pages)
- Recommended sequence with rationale
- What each module closes (map to specific P0/P1 findings)
- Timeline estimate
- Investment estimate (effort ranges, not day rates — rates go in the proposal)
### 5. Quick Wins Closed (0.5 pages)
List what was already fixed during the diagnostic. This is the most important page for client confidence — they paid for the diagnostic and something is already better.
---
## Backlog Population
Before leaving the client site (or within 24 hours):
1. Create the ADO Work Items project (or agree on the tool with Ondřej)
2. Enter every finding as a Work Item: ID, finding text (one sentence), source (Brownhat Diagnostic), priority (P0/P1/P2), owner (named person)
3. Move quick wins to Closed with the date they were resolved
4. Brief the named IT lead on the backlog: where it lives, how the monthly cycle works, who owns what
5. Pin the ADO board as a Teams tab if applicable
The backlog handover is not optional. A diagnostic that produces a report but no maintained tracking system has a half-life of one steering committee meeting.
---
## ASTRAL and PULSAR Handover
By the end of the diagnostic engagement:
**ASTRAL**:
- First full backup has run and committed to the ADO repository
- Client IT lead can access the ADO project and review the baseline
- Drift detection is live — the first drift PR, if one occurs, should be reviewed together with the client as a training exercise
- Reviewer notification configured to email or Teams-notify Ondřej
**PULSAR**:
- Audit events ingesting and searchable
- Teams tab pinned in the IT channel
- Basic search walkthrough done with client IT lead: show them how to find a specific event, how to filter by actor and operation
- No alert rules yet — those come in Module 2/3 when there is a hardened baseline to alert against
---
## Common Mistakes in Assessment Execution
**Starting tool runs before access is confirmed.** Tool runs that fail eat time and erode confidence. Confirm credentials work before you need them.
**Running Elysium without telling the client what it does.** "We are going to compare your password hashes against a database of known-compromised credentials" needs to be explained before it happens. Most clients are fine with it once they understand the privacy model. Zero clients want a surprise.
**Presenting findings before you have run BloodHound.** The kill chain often only becomes clear once BloodHound has shown how the pieces connect. Do not anchor the client on an incomplete kill chain in Session 2 and then have to walk it back.
**Marking everything P0.** If you present 15 P0 findings, the client has no way to act. Real P0 items are rare, typically 3–8 in a first diagnostic. If you have more, re-examine your priority assignments.
**Leaving without a named owner for every P0.** The diagnostic ends. The report goes out. Nobody fixes the P0 items because nobody has their name on them. Get owner names in the room before you leave.
**Forgetting to document what you ran and what access you used.** The methodology section of the report should be written from notes taken during the assessment, not reconstructed from memory three days later.
---
## Post-Assessment Checklist
Before submitting the report:
- [ ] Kill chain written as a chain, not a list
- [ ] Every P0 finding has: evidence citation, specific named assets, remediation steps, named owner
- [ ] Quick wins section lists what was already fixed
- [ ] Module recommendation is tied to specific findings ("Module 2 closes P0-001, P0-002, P1-001, P1-004")
- [ ] ASTRAL baseline committed and accessible to client
- [ ] PULSAR ingesting and accessible to client
- [ ] Findings backlog populated in agreed tool, owners assigned
- [ ] Report reviewed for any claim that is an assertion rather than evidence (replace with what was found)
- [ ] NIS2 compliance map completed if client is in scope
- [ ] Next steps section includes: module recommendation, first meeting date, decision required from client
---
*Companion documents:*
*[NIST CSF 2.0 Baseline Assessment](nist-csf-baseline.md) — workshop methodology and questionnaires*
*[Sample Engagement: Mid-Market Hybrid](../playbooks/sample-engagement-mid-market.md) — calibration reference for findings and recommendations*
*[Findings Backlog](findings-backlog.md) — where findings land and how the housekeeping stream works*
*[Sovereign Tool Stack](../playbooks/sovereign-tool-stack.md) — full tool reference with deployment guidance*
*[Module Menu](../core/modular-engagements.md) — module selection after the diagnostic*
@@ -0,0 +1,216 @@
# Findings Backlog
> *"A finding without a home is a finding that will never be fixed. The risk register is the right home. The backlog is the one that actually exists."*
## The Problem This Solves
Every assessment, module, and engagement produces findings. Some get fixed immediately. Most do not. In organisations with mature risk management, findings go into a risk register, get assigned owners, get reviewed quarterly, and get closed over time.
In practice, most organisations do not have a working risk register. They have a template someone downloaded, a spreadsheet that was last updated during the ISO 27001 attempt three years ago, or a GRC tool that nobody logs into. Findings that go into these systems disappear.
The **findings backlog** is the pragmatic alternative. It is not a replacement for a formal risk register — it is the lightweight, maintainable system that fills the gap between "finding documented in a report" and "finding tracked to closure." For organisations that eventually build a working risk register, the backlog feeds into it. For organisations that never do, the backlog is their risk register in all but name.
---
## Deployment Options
Three options, in order of preference. Choose based on what the client will actually open.
### Option 1 — Azure DevOps Work Items (recommended for ASTRAL clients)
If ASTRAL is deployed, the client already has an ADO project. Work Items are built in — no additional tooling, no additional cost, same context as the ASTRAL drift PRs and restore pipeline. This is the default.
**Setup**: Create a Work Item type called "Security Finding" (or use the built-in Bug or Task type with a tag). Create a board with columns: `New → Triaged → In Progress → Blocked → Closed`. Add custom fields: Priority (P0/P1/P2), Source (Brownhat / BloodHound / Elysium / ASTRAL / PULSAR / Module N), and Target Date.
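If findings are entered by script rather than by hand, the Azure DevOps work item creation endpoint takes a JSON Patch document as its request body. The sketch below only builds that document and sends nothing. `System.Title` and `System.AssignedTo` are standard field reference names; the three `Custom.*` names are hypothetical and depend on how the custom fields are defined in the client's process.

```python
import json
from typing import Optional

# Hypothetical reference names for the custom fields described above.
PRIORITY_FIELD = "Custom.FindingPriority"
SOURCE_FIELD = "Custom.FindingSource"
TARGET_DATE_FIELD = "Custom.TargetDate"


def build_finding_patch(title: str, priority: str, source: str,
                        owner: str, target_date: Optional[str] = None) -> list:
    """Return a JSON Patch document describing one Security Finding."""
    if priority not in {"P0", "P1", "P2"}:
        raise ValueError(f"unknown priority: {priority}")
    if not owner.strip():
        raise ValueError("every finding needs a named owner")

    def add(field: str, value: str) -> dict:
        return {"op": "add", "path": f"/fields/{field}", "value": value}

    patch = [
        add("System.Title", title),
        add("System.AssignedTo", owner),
        add(PRIORITY_FIELD, priority),
        add(SOURCE_FIELD, source),
    ]
    if target_date is not None:
        patch.append(add(TARGET_DATE_FIELD, target_date))
    return patch


patch = build_finding_patch(
    title="47 user accounts belong to staff who have left; credentials remain valid",
    priority="P1",
    source="Brownhat Diagnostic",
    owner="marek.novak@example.com",  # placeholder identity
)
print(json.dumps(patch, indent=2))
```

Rejecting an empty owner at build time keeps unowned findings out of the board from the start.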
**Why it works**: Consultants who are already reviewing ASTRAL drift PRs see the backlog in the same tool. The client's IT lead who owns remediation works in the same project. No context switching.
**Teams tab**: Pin the ADO board directly into the relevant Teams channel as a tab — built into the Azure DevOps app for Teams, no additional setup. The IT lead who lives in Teams can view and update Work Items without opening ADO in a browser. This is the recommended surface for non-technical owners: it is always visible, requires no context switch, and keeps the canonical data in ADO.
**Power Automate (optional)**: If you need to push notifications into Teams or create tasks in Planner when a P0 item is opened or a target date is missed, Power Automate can bridge ADO to the M365 ecosystem. This adds complexity and a dependency on Power Automate flows — use it only if the Teams tab alone is not driving the right behaviour. There is no native ADO → Planner sync without Power Automate.
**ASTRAL integration**: When ASTRAL raises a drift PR for an unauthorised configuration change that cannot be immediately restored, link the ADO Work Item to the ASTRAL PR. The PR description, the before/after diff, and the reviewer decision are all in the same project — the Work Item is the remediation task, the ASTRAL PR is the evidence.
---
### Option 2 — CISO Assistant (upgrade path for clients building toward GRC)
[CISO Assistant](https://github.com/intuitem/ciso-assistant-community) is an open-source GRC platform already in the [Sovereign Tool Stack](../playbooks/sovereign-tool-stack.md). It provides risk register functionality, compliance framework mapping (NIS2, ISO 27001, DORA), evidence tracking, and audit-ready reporting — all self-hosted.
**When to use it instead of ADO Work Items**: When the client has an intent to build a formal risk management programme and needs a tool that can grow into it. CISO Assistant bridges the gap between a pragmatic backlog and a formal risk register: the same findings that start as backlog items can be promoted to documented risks with treatment plans, residual risk assessments, and compliance evidence links.
**How the backlog feeds CISO Assistant**: During each module, findings are entered into the backlog in ADO or a flat file. At quarterly review, P1 and significant P2 items that are not yet closed are promoted to CISO Assistant as risk entries with the evidence collected during the engagement. The backlog is operational; CISO Assistant is the strategic record.
**Deployment**: Docker Compose, ~30 minutes. Self-hosted on the client's infrastructure or on a VPS. See the sovereign tool stack for deployment guidance.
---
### Option 3 — Git flat file (fallback for clients without ADO or preference for simplicity)
A Markdown file committed to the same repository as ASTRAL (or a dedicated security repository). Zero additional tooling. Fully auditable via Git history. Works offline.
**When to use it**: Clients who have the technical capability to maintain a Markdown file in Git and prefer minimal tooling. Also useful as a transitional format before ADO Work Items are fully configured.
**Limitation**: No native assignment notifications, no Planner sync, no board view. Progress is visible only to people who look at the repository. For clients where the IT lead needs to be nudged, a flat file will be ignored. Use ADO Work Items or CISO Assistant instead.
The flat file template is provided below.
---
## Design Principles
**It must live where the client actually opens things.** If the backlog is in a tool the client never looks at, it does not exist. The three options above are ordered by likelihood of adoption. ADO Work Items wins because ASTRAL is already there — the path of least resistance is the path most likely to be walked.
**Every finding has an owner.** A finding without a named owner is not tracked — it is archived. The owner does not need to be the person who fixes it. They need to be the person who is accountable for whether it gets fixed.
**Priority drives the housekeeping stream.** The backlog is the input queue for Rule 4 (housekeeping as a permanent stream). The monthly housekeeping cycle picks from the backlog, resolves what it can, and updates statuses. Without the backlog, the housekeeping stream has no queue to work from.
**It accumulates from all sources.** Every diagnostic, every module, every ASTRAL drift alert, every PULSAR-flagged event, every BloodHound finding, every Elysium result feeds the backlog. Not just the big assessments. The backlog is the single source of truth for everything that has been found and not yet fixed.
---
## The Format
A minimal backlog entry has six fields. Do not add more until the client is actually maintaining this one.
| Field | What it contains |
|-------|-----------------|
| **ID** | Sequential identifier (B-001, B-002…). Never reuse an ID. |
| **Finding** | One sentence: what is wrong. Not "review accounts" — "47 user accounts belong to staff who have left; credentials remain valid." |
| **Source** | Which assessment or tool produced this: Brownhat Diagnostic, BloodHound, Elysium, ASTRAL drift, PULSAR alert, Module 6, etc. |
| **Priority** | P0 / P1 / P2 — using the kill chain test (see below) |
| **Owner** | Named person, not a team. "AD Team" is not an owner. "Marek Novák" is. |
| **Status** | Open / In Progress / Blocked / Closed |
Optional fields that add value once the basic discipline is established:
| Field | What it contains |
|-------|-----------------|
| **Target date** | The date by which this should be resolved. Not when the project ends — when this specific item should be done. |
| **Effort** | S / M / L — rough estimate; S = fixable in a day or less, M = a few days, L = needs planning |
| **Notes** | Blockers, context, related items, change window requirements |
| **Closed date** | When it was actually closed. Important for demonstrating progress to auditors. |
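For teams that script against the backlog, the six required fields and four optional ones map onto a small record type. This is a sketch, not a prescribed schema: the class and method names are hypothetical, and `to_row` assumes the column order ID, Finding, Source, Owner, Status, Target.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Priority(Enum):
    P0 = "P0"
    P1 = "P1"
    P2 = "P2"


class Status(Enum):
    OPEN = "Open"
    IN_PROGRESS = "In Progress"
    BLOCKED = "Blocked"
    CLOSED = "Closed"


@dataclass
class BacklogEntry:
    # Required fields
    id: str
    finding: str
    source: str
    priority: Priority
    owner: str
    status: Status = Status.OPEN
    # Optional fields
    target_date: Optional[date] = None
    effort: Optional[str] = None      # "S", "M" or "L"
    notes: str = ""
    closed_date: Optional[date] = None

    def __post_init__(self) -> None:
        if not self.owner.strip():
            raise ValueError("every finding needs a named owner")
        if self.effort not in (None, "S", "M", "L"):
            raise ValueError("effort must be S, M or L")

    def to_row(self) -> str:
        """Render the entry as one Markdown table row."""
        target = self.target_date.isoformat() if self.target_date else ""
        cells = [self.id, self.finding, self.source, self.owner,
                 self.status.value, target]
        return "| " + " | ".join(cells) + " |"


entry = BacklogEntry(
    id="B-001",
    finding="KRBTGT password over 365 days old",
    source="BloodHound",
    priority=Priority.P0,
    owner="Marek Novák",
    target_date=date(2026, 7, 1),
)
print(entry.to_row())
```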
---
## Priority Assignment
Use the kill chain test from the [Consultant Field Guide](../core/consultant-field-guide.md):
**P0 — Kill chain node.** If exploited, the organisation fails to operate. Fix before anything else. Examples: admin accounts without MFA, unpatched internet-facing system with known active exploit, backup that has never been restored, KRBTGT password over 365 days old.
**P1 — Material damage.** If exploited, significant harm results but the organisation survives. Fix within the current engagement. Examples: service accounts with non-expiring passwords, open RDP from internet, legacy authentication not blocked, stale privileged accounts.
**P2 — Should be fixed.** Real finding, real risk, but not existential. Goes into the housekeeping queue for the next available cycle. Examples: weak password policy on non-privileged accounts, missing DNS security filtering, unreviewed firewall rules from two years ago, undocumented vendor access with low privilege.
> **On priority inflation**: The most common backlog failure is everything being marked P0. If everything is urgent, nothing is. Be ruthless. An environment with more than 5–10 P0 items either has a genuinely catastrophic security posture (in which case the immediate conversation is with the executive sponsor) or the priority assignments are wrong.
---
## Backlog Template (Flat File Version)
For clients whose teams work in a repository (preferred — the backlog lives alongside ASTRAL):
```markdown
# Findings Backlog
Last reviewed: [DATE] | Owner: [NAME] | Next review: [DATE]
## P0 — Kill Chain (fix immediately)
| ID | Finding | Source | Owner | Status | Target |
|----|---------|--------|-------|--------|--------|
| B-001 | | | | | |
## P1 — Material Risk (fix this engagement)
| ID | Finding | Source | Owner | Status | Target |
|----|---------|--------|-------|--------|--------|
| B-010 | | | | | |
## P2 — Housekeeping Queue (monthly cycle)
| ID | Finding | Source | Owner | Status | Target |
|----|---------|--------|-------|--------|--------|
| B-100 | | | | | |
## Closed
| ID | Finding | Closed date | Closed by |
|----|---------|-------------|-----------|
| | | | |
```
Use ID ranges to signal priority at a glance: B-001–B-009 for P0, B-010–B-099 for P1, B-100+ for P2.
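A small helper can hand out the next ID for a priority tier while honouring the "never reuse an ID" rule. It is a sketch under two assumptions: the `next_id` name is hypothetical, and the caller passes every ID ever issued, including closed items.

```python
import re
from typing import Iterable

# Tier ranges: 1-9 for P0, 10-99 for P1, 100 and up for P2.
ID_RANGES = {"P0": (1, 9), "P1": (10, 99), "P2": (100, None)}


def next_id(priority: str, used_ids: Iterable[str]) -> str:
    """Return the next free ID in the tier, always above the highest one issued."""
    low, high = ID_RANGES[priority]
    numbers = []
    for raw in used_ids:
        match = re.fullmatch(r"B-(\d+)", raw)
        if match:
            numbers.append(int(match.group(1)))
    in_tier = [n for n in numbers if n >= low and (high is None or n <= high)]
    candidate = max(in_tier, default=low - 1) + 1
    if high is not None and candidate > high:
        raise ValueError(f"{priority} range is exhausted; review priority assignments")
    return f"B-{candidate:03d}"


issued = ["B-001", "B-002", "B-010", "B-012", "B-100"]
print(next_id("P0", issued), next_id("P1", issued), next_id("P2", issued))
```

Taking the highest issued number plus one, rather than the first gap, means an ID freed by a closed item is never handed out again. The nine-slot P0 range also acts as a brake on priority inflation.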
---
## Populating the Backlog
### On Day 30 (from the Brownhat Diagnostic)
The Brownhat Diagnostic produces the first population of the backlog. Every finding from the diagnostic gets an entry. Quick wins that are closed immediately during the engagement go straight to Closed with the closing date. Everything else — including findings the client acknowledges but cannot act on immediately — goes into the backlog with the appropriate priority.
The consultant populates the initial backlog as part of the diagnostic deliverable. It is not a separate engagement. It is what happens to findings instead of filing them in a PDF.
### From subsequent modules
Every module completion package includes an update to the backlog:
- Findings that were closed by the module move to Closed
- New findings discovered during the module are added
- The risk register update in the module completion package cross-references the backlog IDs
### From continuous tools (ASTRAL, PULSAR, Elysium)
- **ASTRAL** — when a drift PR is raised for an unauthorised configuration change, a backlog entry is created if the change is not immediately remediated. The ASTRAL PR link goes in the Notes field.
- **PULSAR** — when an alert is investigated and reveals a structural gap (not just an event), a backlog entry is created. The PULSAR event ID goes in the Notes field.
- **Elysium** (quarterly run) — each new compromised or weak credential that cannot be immediately reset gets a backlog entry.
- **BloodHound** (quarterly run) — new or persistent attack paths get backlog entries with the path description.
---
## The Housekeeping Cycle
The monthly housekeeping cycle (Rule 4 of [Move Fast and Fix Things](../core/move-fast-and-fix-things.md)) works from the backlog. The cycle has a simple structure:
1. **Review open P0 items.** If any P0 is still open and not blocked by an external dependency, it is the first priority. If it has been open more than 30 days without progress, it escalates to the executive sponsor.
2. **Work P1 items.** Each cycle resolves at least one P1 item. If no P1 items are being resolved, the housekeeping stream is not functioning — find the blocker.
3. **Advance P2 items.** Move through the P2 queue at the capacity available. Not every cycle will close P2 items. Every cycle should move at least one P2 item to In Progress.
4. **Review and reprioritise.** As the environment changes, priorities shift. A P2 item that has been open for six months may be a P0 in disguise if nothing above it has been fixed.
5. **Update statuses.** Every item touched in the cycle gets a status update, even if the update is "Blocked — waiting for change window."
The output of each cycle is a one-page summary: items closed this cycle, items In Progress, blockers, and the current P0/P1 count. This summary goes to the named client lead. If retained capability is in scope, it goes to the CQRE consultant as well.
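The one-page summary can be generated from the backlog data. The sketch below assumes each item carries a `last_progress` date (a hypothetical field, the date of the last meaningful status change) and flags open P0 items with no progress for more than 30 days for escalation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Item:
    id: str
    priority: str                  # "P0", "P1" or "P2"
    status: str                    # "Open", "In Progress", "Blocked", "Closed"
    last_progress: date            # hypothetical: date of last status change
    closed_date: Optional[date] = None


def cycle_summary(items: List[Item], cycle_start: date, today: date) -> dict:
    """Summarise one housekeeping cycle for the named client lead."""
    open_items = [i for i in items if i.status != "Closed"]
    return {
        "closed": [i.id for i in items
                   if i.status == "Closed" and i.closed_date is not None
                   and cycle_start <= i.closed_date <= today],
        "in_progress": [i.id for i in items if i.status == "In Progress"],
        "blocked": [i.id for i in items if i.status == "Blocked"],
        "open_p0": sum(1 for i in open_items if i.priority == "P0"),
        "open_p1": sum(1 for i in open_items if i.priority == "P1"),
        # Open P0 with no progress for more than 30 days goes to the sponsor.
        "escalate": [i.id for i in open_items
                     if i.priority == "P0"
                     and (today - i.last_progress).days > 30],
    }


backlog = [
    Item("B-001", "P0", "Open", date(2026, 5, 15)),
    Item("B-002", "P0", "In Progress", date(2026, 6, 20)),
    Item("B-010", "P1", "Closed", date(2026, 6, 10), date(2026, 6, 10)),
    Item("B-100", "P2", "Blocked", date(2026, 6, 5)),
]
summary = cycle_summary(backlog, date(2026, 6, 1), date(2026, 7, 1))
print(summary)
```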
---
## Relationship to the Risk Register
If the client has a working risk register, the backlog and the risk register coexist:
- The backlog is **operational** — it is where findings live while they are being worked
- The risk register is **strategic** — it captures the risk that the finding represents, the treatment decision, and the residual risk after treatment
- When a P0 or P1 item is closed, the consultant works with the client to create or update the corresponding risk register entry with the closure evidence
If the client does not have a working risk register, the backlog is effectively doing the risk register's job at a reduced level of formality. Do not pretend otherwise. If the client ever needs to demonstrate risk management to a regulator or auditor, the backlog — with its closure dates, ownership, and priority history — is defensible evidence. A GRC tool with empty fields is not.
For clients who want to build a proper risk register: the backlog entries, once they have closure evidence attached, become the input for the risk register's treatment and closure records. The backlog is not wasted effort — it is the work that feeds the register.
---
## Backlog Health Indicators
These are warning signs that the backlog is not functioning:
| Indicator | What it means |
|-----------|--------------|
| P0 items open for more than 30 days with no progress | Executive escalation required; a P0 that nobody is moving is a political problem, not a technical one |
| More than 20 items in the backlog with no owner | The backlog was populated but not handed over properly; go back and assign owners |
| No items closed in the last monthly cycle | The housekeeping stream is not running; find the responsible person and re-establish the cadence |
| All items are P2 | Priority inflation has happened in reverse; the consultant needs to revisit severity assignments |
| The backlog has not been updated since the last engagement | The backlog is a report, not a system; the client has reverted to treating findings as documentation rather than as work |
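These indicators are mechanical enough to check automatically at each cycle. The following sketch makes several assumptions: the field and warning names are hypothetical, "no progress" is approximated by the item's last update date, and the owner and all-P2 checks are applied to open items only.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Item:
    priority: str                  # "P0", "P1" or "P2"
    status: str                    # "Open", "In Progress", "Blocked", "Closed"
    owner: str                     # empty string means no owner assigned
    last_updated: date
    closed_date: Optional[date] = None


def health_warnings(items: List[Item], today: date, cycle_start: date,
                    last_engagement_end: date) -> List[str]:
    """Return the names of the warning signs that currently apply."""
    warnings = []
    open_items = [i for i in items if i.status != "Closed"]
    if any(i.priority == "P0" and (today - i.last_updated).days > 30
           for i in open_items):
        warnings.append("stale_p0")
    if sum(1 for i in open_items if not i.owner.strip()) > 20:
        warnings.append("unowned_items")
    if not any(i.closed_date is not None and i.closed_date >= cycle_start
               for i in items):
        warnings.append("nothing_closed")
    if open_items and all(i.priority == "P2" for i in open_items):
        warnings.append("all_p2")
    if items and max(i.last_updated for i in items) <= last_engagement_end:
        warnings.append("not_updated_since_engagement")
    return warnings


neglected = [Item("P2", "Open", "", date(2026, 3, 1))]
print(health_warnings(neglected, today=date(2026, 7, 1),
                      cycle_start=date(2026, 6, 1),
                      last_engagement_end=date(2026, 3, 15)))
```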
---
*Related: [Module Completion Report](module-completion-report.md) — each module updates the backlog as part of its completion package.*
*Related: [Antifragile Risk Register](antifragile-risk-register.md) — the formal risk register template; the backlog feeds into it.*
*Related: [Move Fast and Fix Things — Rule 4](../core/move-fast-and-fix-things.md#rule-4-run-housekeeping-as-a-permanent-stream) — the backlog is the queue that Rule 4 works from.*
*Related: [Engagement Model](../core/engagement-model.md) — backlog setup is part of every module kickoff.*
@@ -211,11 +211,11 @@ Each module documents its prerequisites in detail in [Modular Engagements](modul
| Stage | Deliverable | | Stage | Deliverable |
|-------|-------------| |-------|-------------|
| Brownhat Diagnostic | Current-state assessment report (6-domain, kill chain, quick wins) + prioritised module roadmap | | Brownhat Diagnostic | Current-state assessment report (6-domain, kill chain, quick wins) + prioritised module roadmap + **findings backlog opened** (all diagnostic findings entered with P0/P1/P2 priority and named owner) |
| Module kickoff | Written scope agreement; access checklist confirmation; communication channel setup | | Module kickoff | Written scope agreement; access checklist confirmation; communication channel setup; backlog reviewed and updated with module-specific prerequisites |
| Weekly (during delivery) | Change log update; check-in summary (decisions made, items pending, risks) | | Weekly (during delivery) | Change log update; check-in summary (decisions made, items pending, risks); backlog items resolved this week noted |
| Module completion | Configuration baseline document; scripts/rules in client repository; operating runbooks; risk register update; metrics baseline; next-step recommendation | | Module completion | Configuration baseline document; scripts/rules in client repository; operating runbooks; **backlog updated** (items closed with evidence, new findings added); risk register update where one exists; metrics baseline; next-step recommendation |
| Retained relationship | Monthly advisory summary or capability review report | | Retained relationship | Monthly advisory summary or capability review report; **backlog health review** (P0 count, items closed this cycle, blockers) |
**Ownership**: Every script, detection rule, query, configuration file, and document produced during an engagement belongs to the client permanently. We do not retain privileged access to client environments after an engagement closes. We do not license anything we build — it is yours. **Ownership**: Every script, detection rule, query, configuration file, and document produced during an engagement belongs to the client permanently. We do not retain privileged access to client environments after an engagement closes. We do not license anything we build — it is yours.
@@ -107,7 +107,7 @@ Housekeeping is not janitorial work. It is attack surface reduction at a structu
- Firewall rules added for temporary access that became permanent - Firewall rules added for temporary access that became permanent
- Old GPOs, old admin rights, old certificates - Old GPOs, old admin rights, old certificates
**The engagement implication**: Every module scoping conversation must include a housekeeping component. It is not optional and not deferrable. The client names a resource, a cadence (minimum monthly), and a queue. The queue is populated from module findings and from continuous discovery. Progress is tracked and reviewed at every steering committee. If there is no resourcing for housekeeping, the engagement model must reflect that — because every fix we make will be partially undone within 18 months by new accumulation if the stream does not exist. **The engagement implication**: Every module scoping conversation must include a housekeeping component. It is not optional and not deferrable. The client names a resource, a cadence (minimum monthly), and a queue. The queue is the [Findings Backlog](../assessment-templates/findings-backlog.md) — the single place where every finding from every diagnostic and module lands, prioritised, owned, and tracked to closure. The backlog is populated from module findings and from continuous discovery tools (ASTRAL drift, PULSAR alerts, quarterly BloodHound and Elysium runs). Progress is tracked and reviewed at every steering committee. If there is no resourcing for housekeeping, the engagement model must reflect that — because every fix we make will be partially undone within 18 months by new accumulation if the stream does not exist.
--- ---
@@ -59,6 +59,7 @@ Operational and persuasion documents used in engagements. **Start every new clie
| [AD and Endpoint Hardening](playbooks/ad-endpoint-hardening.md) | On-prem AD, Windows endpoints, hybrid identity | Infrastructure Consultants, Security Engineers | | [AD and Endpoint Hardening](playbooks/ad-endpoint-hardening.md) | On-prem AD, Windows endpoints, hybrid identity | Infrastructure Consultants, Security Engineers |
| [Zero-Budget Hardening](playbooks/zero-budget-hardening.md) | Maximize existing tools, minimize new purchases | Consultants, CISOs, IT Managers | | [Zero-Budget Hardening](playbooks/zero-budget-hardening.md) | Maximize existing tools, minimize new purchases | Consultants, CISOs, IT Managers |
| [Implementation Playbook](playbooks/implementation-playbook.md) | Tactical step-by-step delivery guide | Technical Leads, Security Engineers | | [Implementation Playbook](playbooks/implementation-playbook.md) | Tactical step-by-step delivery guide | Technical Leads, Security Engineers |
| [Sample Engagement: Mid-Market Hybrid](playbooks/sample-engagement-mid-market.md) | Complete worked example: 500 employees, AD+M365 E3, NIS2 scope — findings, kill chain, module sequence, Day 30/90/180 deliverables, populated backlog | Consultants, New Hires |
| [CQRE Product Suite](playbooks/cqre-product-suite.md) | ASTRAL, PULSAR, and AURORA: product details, framework alignment, deployment, and positioning | Consultants, Account Managers | | [CQRE Product Suite](playbooks/cqre-product-suite.md) | ASTRAL, PULSAR, and AURORA: product details, framework alignment, deployment, and positioning | Consultants, Account Managers |
| [Sovereign Tool Stack](playbooks/sovereign-tool-stack.md) | Full arsenal: Prowler, BloodHound, CISO Assistant, ASTRAL, PULSAR, AURORA, Wazuh, Shuffle | Consultants, CTOs, CISOs | | [Sovereign Tool Stack](playbooks/sovereign-tool-stack.md) | Full arsenal: Prowler, BloodHound, CISO Assistant, ASTRAL, PULSAR, AURORA, Wazuh, Shuffle | Consultants, CTOs, CISOs |
| [Privileged Access Architecture](playbooks/privileged-access-architecture.md) | PAM design: Teleport, Tailscale/Headscale, JIT access, vendor access governance | Security Architects, Infrastructure Consultants, OT Leads | | [Privileged Access Architecture](playbooks/privileged-access-architecture.md) | PAM design: Teleport, Tailscale/Headscale, JIT access, vendor access governance | Security Architects, Infrastructure Consultants, OT Leads |
@@ -84,6 +85,8 @@ Operational and persuasion documents used in engagements. **Start every new clie
| Document | Purpose | Audience | | Document | Purpose | Audience |
|----------|---------|----------| |----------|---------|----------|
| [Assessment Team Guide](assessment-templates/assessment-team-guide.md) | Technical execution guide for the Brownhat Diagnostic: tool sequence, what to run, what to look for, kill chain synthesis, report structure | Assessors, Technical Consultants |
| [Findings Backlog](assessment-templates/findings-backlog.md) | Single source of truth for all findings across every engagement; input queue for the housekeeping stream; pragmatic alternative to a formal risk register | Consultants, IT Leads, Client Teams |
| [NIST CSF 2.0 Baseline Assessment](assessment-templates/nist-csf-baseline.md) | The Brownhat Diagnostic: structured 2-half-day workshop, gap analysis, prioritised module roadmap | Consultants, CISOs, IT Managers | | [NIST CSF 2.0 Baseline Assessment](assessment-templates/nist-csf-baseline.md) | The Brownhat Diagnostic: structured 2-half-day workshop, gap analysis, prioritised module roadmap | Consultants, CISOs, IT Managers |
| [NIST CSF 2.0 — česká verze](assessment-templates/nist-csf-baseline-cs.md) | Brownhat Diagnostika: dotazníky a průvodce workshopem v češtině | Consultants running Czech-language workshops | | [NIST CSF 2.0 — česká verze](assessment-templates/nist-csf-baseline-cs.md) | Brownhat Diagnostika: dotazníky a průvodce workshopem v češtině | Consultants running Czech-language workshops |
| [Module Completion Report](assessment-templates/module-completion-report.md) | Template for the deliverable package at the end of every module | Consultants | | [Module Completion Report](assessment-templates/module-completion-report.md) | Template for the deliverable package at the end of every module | Consultants |
@@ -14,9 +14,9 @@ This template provides a reusable structure for building financial justification
| Element | Content | | Element | Content |
|---------|---------| |---------|---------|
| **Investment ask** | €[X] over 180 days, phase-gated with go/no-go decisions at days 30, 60, 90 | | **Investment ask** | €[X] over 180 days, phase-gated with go/no-go decisions at days 60, 120, 180 |
| **Primary return** | Reduction of existential cyber risk; regulatory compliance evidence; competitive differentiation through AI sovereignty | | **Primary return** | Reduction of existential cyber risk; regulatory compliance evidence; operational resilience demonstrable to auditors and insurers |
| **Break-even** | Day 90 (via avoided regulatory fine exposure, reduced insurance premiums, or operational resilience) | | **Break-even** | 12–18 months post-programme: insurance premium reductions take one renewal cycle; regulatory evidence value accumulates from day 1; incident avoidance value is probabilistic but compounding |
| **Risk of inaction** | Quantified below; summary: [X]% probability of material incident within 24 months at estimated cost of €[Y] | | **Risk of inaction** | Quantified below; summary: [X]% probability of material incident within 24 months at estimated cost of €[Y] |
### Page 2: Cost of Inaction ### Page 2: Cost of Inaction
@@ -27,11 +27,11 @@ This template provides a reusable structure for building financial justification
| Risk Category | Probability (Client-Specific) | Average Industry Cost | Expected Value | | Risk Category | Probability (Client-Specific) | Average Industry Cost | Expected Value |
|--------------|------------------------------|----------------------|----------------| |--------------|------------------------------|----------------------|----------------|
| Ransomware incident (recovery + downtime) | [X]% | €4.5M | €[X * 4.5M] | | Ransomware incident (recovery + downtime) | [X]% | €4.5M average (IBM 2024) | €[X * 4.5M] |
| Regulatory fine (DORA / NIS2 / national) | [X]% | 1-2% global turnover | €[X * % GT] | | Regulatory fine (DORA / NIS2 / national) | [X]% | Up to 2% global turnover (NIS2); up to 1% daily (DORA) | €[X * % GT] |
| Data breach notification and remediation | [X]% | €3.8M (per IBM Cost of Data Breach Report) | €[X * 3.8M] | | Data breach notification and remediation | [X]% | €3.8M average (IBM Cost of Data Breach 2024) | €[X * 3.8M] |
| Cloud AI vendor price increase / lock-in | [X]% | 200-500% price shock | €[X * shock] | | Incident response and forensics | [X]% | €150K-500K (external IR firm + legal + crisis comms, independent of breach cost) | €[X * 325K] |
| Competitive intelligence loss (cloud AI training) | [X]% | Unquantifiable but existential | High | | Business interruption during recovery | [X]% | €[daily revenue] × [estimated downtime days] — client-specific | €[X * daily] |
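The Expected Value column in the revised table is probability multiplied by cost, summed across categories. A minimal Python sketch of that arithmetic follows. The probabilities, the daily revenue figure, and the downtime days are placeholder inputs invented for illustration; the €4.5M, €3.8M, and €325K (midpoint of €150K-500K) figures come from the table. The regulatory fine row is left out because it depends on client turnover. Summing the rows treats the categories as additive, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # client-specific probability over 24 months, 0.0 to 1.0
    cost_eur: float     # cost in euros if the event occurs


def expected_loss(risks):
    """Sum of probability x cost across all risk categories."""
    return sum(r.probability * r.cost_eur for r in risks)


# Placeholder inputs for illustration only; replace with client-specific estimates.
daily_revenue_eur = 50_000   # hypothetical
downtime_days = 10           # hypothetical

risks = [
    Risk("Ransomware incident (recovery + downtime)", 0.10, 4_500_000),
    Risk("Data breach notification and remediation", 0.10, 3_800_000),
    Risk("Incident response and forensics", 0.20, 325_000),
    Risk("Business interruption during recovery", 0.10,
         daily_revenue_eur * downtime_days),
]

total = expected_loss(risks)
print(f"Expected loss over 24 months: EUR {total:,.0f}")
```

Keeping each row as its own record makes it easy to show the board which category drives the total before presenting the summary figure.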
**Calculation**: **Calculation**:
@@ -58,11 +58,11 @@ Present this as: *"Without intervention, the organization faces an expected loss
| Phase | Timeline | Primary Activity | Estimated Cost | Go/No-Go Gate | | Phase | Timeline | Primary Activity | Estimated Cost | Go/No-Go Gate |
|-------|----------|-----------------|----------------|---------------| |-------|----------|-----------------|----------------|---------------|
| **1. Hygiene** | Days 0-30 | Configuration of existing tools; identity cleanse; visibility | €[X] (primarily labor) | Day 30: Demonstrate risk reduction or stop | | **1. Visibility** | Days 0-60 | Kill chain mapping; T0 identity hardening; ASTRAL/PULSAR deployment; T0 backup verified | €[X] (primarily labor) | Day 60: Kill chain documented and T0 hardening complete |
| **2. Control** | Days 30-60 | ASR, MFA enforcement, network segmentation, vendor lockdown | €[X] (labor + minimal tooling) | Day 60: Validate control effectiveness | | **2. Control** | Days 60-120 | MFA for all users; CA baseline; attack surface reduction; vendor hardening | €[X] (labor + minimal tooling) | Day 120: MFA enforced 100%; P0/P1 vulnerabilities closed |
| **3. Sovereignty** | Days 60-90 | Local AI pilot; recovery drills; T0 asset protection | €[X] (labor + local inference hardware if needed) | Day 90: Prove local AI viability | | **3. Signal** | Days 120-180 | Detection rules; alert runbooks; knowledge transfer; housekeeping stream operational | €[X] (labor) | Day 180: Client operates independently; housekeeping running |
| **4. Antifragility** | Days 90-180 | Chaos engineering; red team; continuous improvement | €[X] (labor + external testing) | Day 180: Maturity assessment and next-phase planning | | **4. Retained capability** | Ongoing | Quarterly retained scope; detection engineering; housekeeping; structural improvements | €[X]/quarter | Ongoing: measurable queue reduction; annual BloodHound/Elysium |
| **Total** | 180 days | | **€[X]** | | | **Total (180-day programme)** | 180 days | | **€[X]** | |
#### Cost Categories #### Cost Categories
@@ -78,11 +78,11 @@ Present this as: *"Without intervention, the organization faces an expected loss
| Alternative Approach | Cost | Timeline | Risk | | Alternative Approach | Cost | Timeline | Risk |
|---------------------|------|----------|------| |---------------------|------|----------|------|
| **Do nothing** | €0 | — | Expected loss €[X] over 24 months | | **Do nothing** | €0 | — | Expected loss €[X] over 24 months; growing regulatory exposure |
| **Traditional security audit** | €[X] | 90 days | Produces report; no structural change | | **Traditional security audit** | €[X] | 90 days | Produces report; no structural change; findings age immediately |
| **Full E5 licensing upgrade** | €[X]/user/year | 30 days | Solves some gaps; does not address architecture or AI sovereignty | | **Full E5 licensing upgrade** | €[X]/user/year | 30 days | Solves tooling gaps; does not address architecture, process, or accumulated technical debt |
| **Managed security service (MSSP)** | €[X]/month | Ongoing | Outsources detection; does not reduce structural fragility | | **Managed security service (MSSP)** | €[X]/month | Ongoing | Outsources detection; does not reduce structural fragility; dependency without capability transfer |
| **Antifragile program (this proposal)** | €[X] | 180 days | Structural change, regulatory evidence, AI sovereignty, measurable resilience | | **Antifragile programme (this proposal)** | €[X] | 180 days + retained | Structural change, regulatory evidence, measurable kill chain closure, client operational independence |
--- ---
@@ -97,7 +97,7 @@ Present this as: *"Without intervention, the organization faces an expected loss
| Avoided ransomware recovery | Probability reduction × €4.5M | €[X] | €[Y] | | Avoided ransomware recovery | Probability reduction × €4.5M | €[X] | €[Y] |
| Avoided regulatory fine | Probability reduction × % GT | €[X] | €[Y] | | Avoided regulatory fine | Probability reduction × % GT | €[X] | €[Y] |
| Insurance premium reduction | 10-20% reduction on cyber premium | €[X] | €[Y] | | Insurance premium reduction | 10-20% reduction on cyber premium | €[X] | €[Y] |
| Cloud AI cost stabilization | Shift from variable API costs to fixed infra | €[X] | €[Y] | | Audit preparation time reduction | ASTRAL Git trail replaces manual evidence gathering for ISO 27001, NIS2, DORA | €[X] | €[Y] |
| Reduced incident response cost | Faster detection and containment | €[X] | €[Y] | | Reduced incident response cost | Faster detection and containment | €[X] | €[Y] |
| **Total Quantifiable Return** | | **€[X]** | **€[Y]** | | **Total Quantifiable Return** | | **€[X]** | **€[Y]** |
@@ -105,7 +105,7 @@ Present this as: *"Without intervention, the organization faces an expected loss
| Return Category | Description | | Return Category | Description |
|----------------|-------------| |----------------|-------------|
| **Competitive moat** | Proprietary data improves only your models; competitors cannot replicate your operational intelligence | | **Regulatory agility** | Demonstrable continuous controls accelerate regulatory approvals, certification audits, and partnership due diligence |
| **Regulatory agility** | Demonstrable resilience accelerates regulatory approvals, market entries, and partnership discussions | | **Regulatory agility** | Demonstrable resilience accelerates regulatory approvals, market entries, and partnership discussions |
| **Talent retention** | Engineers and security professionals prefer organizations that invest in durability over firefighting | | **Talent retention** | Engineers and security professionals prefer organizations that invest in durability over firefighting |
| **M&A readiness** | Clean identity architecture, tested recovery, and documented controls increase valuation and reduce due-diligence friction | | **M&A readiness** | Clean identity architecture, tested recovery, and documented controls increase valuation and reduce due-diligence friction |
@@ -139,17 +139,18 @@ Present as: *"This program delivers a [X]% return in year one, rising to [Y]% in
| Scenario | Investment Adjustment | Outcome | | Scenario | Investment Adjustment | Outcome |
|----------|----------------------|---------| |----------|----------------------|---------|
| **Best case** | No additional tooling needed | Program completes under budget; all value from configuration | | **Best case** | No additional tooling needed; client IT team engaged and responsive | Programme completes on timeline; all value from configuration; client operational independence achieved at day 180 |
| **Base case** | Local AI hardware required for pilot | Slight budget increase; sovereign intelligence proven | | **Base case** | Minor tooling additions; moderate IT team availability; some change management friction | Programme completes with 2-4 week slippage on Phase 2 (MFA rollout change management is the usual bottleneck); strong kill chain closure and detection capability |
| **Worst case** | Deeper technical debt than anticipated | Extend Phase 1 by 30 days; additional labor cost; still cheaper than incident | | **Challenging** | Significant technical debt discovered in Phase 1; IT team constrained; change windows infrequent | Phase 1 extended by 4-6 weeks; Phase 2 scope narrowed to kill chain critical path; programme value is still genuine — the findings alone are worth the investment; honest client conversation required at day 60 gate |
| **Abort condition** | Executive sponsor departure; IT team fully occupied by another major project; scope fundamentally different from discovery call | Programme paused or stopped at the next gate. Partial phases produce partial value — ASTRAL/PULSAR deployed, kill chain documented. Better to stop honestly than to produce a report that nobody acts on. |
--- ---
### Page 6: Recommendation and Next Steps ### Page 6: Recommendation and Next Steps
**The Ask (Full Program)**: **The Ask (Full Programme)**:
> *"We recommend approval of a 180-day antifragile enterprise program, structured in four 30-60-90-180 day phases with hard go/no-go gates. The initial 30-day investment is €[X] with a defined deliverable: identification and initial closure of the organizational kill chain. If measurable risk reduction is not demonstrated by Day 30, the program stops with no further obligation."* > *"We recommend approval of a 180-day antifragile enterprise programme with three hard milestones. By Day 30: your kill chain is documented, ASTRAL and PULSAR are live, and your most privileged accounts are hardened. By Day 90: MFA covers the entire organisation, your kill chain is closed, and you have detection capability on M365. By Day 180: your team operates the systems independently, housekeeping is running as a permanent stream, and everything we built is in your repository. That is the 180-day programme. What comes after is a retained scope — scoped separately, renewed quarterly."*
**The Ask (Modular Alternative)**: **The Ask (Modular Alternative)**:
@@ -4,20 +4,104 @@
## For the Executive Reader ## For the Executive Reader
This is not a three-year digital transformation. It is a **180-day strategic reset** with measurable business outcomes at each phase gate. This is not a three-year digital transformation. It is a **180-day foundation programme** with measurable progress at each phase gate.
| Phase | Timeline | What the Board Sees | | Phase | Timeline | What the Board Sees |
|-------|----------|---------------------| |-------|----------|---------------------|
| **Hygiene** | Days 0-30 | Visibility. For the first time, we know every identity, asset, and gap that could end the company. | | **Visibility** | Days 0-60 | We know the kill chain. T0 assets are identified, critical privileges are mapped, and logging is operational. |
| **Control** | Days 30-60 | Containment. The highest-risk exposures are closed using tools already owned. | | **Control** | Days 60-120 | The highest-risk kill chain nodes are closed. MFA is enforced on privileged accounts. Critical gaps have evidence-backed remediation. |
| **Sovereignty** | Days 60-90 | Ownership. Proprietary intelligence is reclaimed. Recovery from disaster is proven, not assumed. | | **Signal** | Days 120-180 | Detection capability is built on the hardened foundation. Housekeeping is running as a permanent stream. The organisation can operate and maintain what was built. |
| **Antifragility** | Days 90-180 | Advantage. The organization learns faster from disruption than competitors do. | | **Antifragility** | Ongoing | Structural improvement, retained capability, and progressive reduction of technical debt. This phase does not end. |
**What 180 days delivers**: A hardened foundation, closed kill chain, operational detection capability, and the processes to sustain them. Not a complete transformation — a credible, maintained starting point.
**What 180 days does not deliver**: Elimination of all technical debt (that takes years), full AI sovereignty (that is a multi-year journey), or zero vendor dependencies (that is an ongoing programme). Promising otherwise is dishonest and destroys client trust when reality arrives.
**Investment principle**: Configuration first. Procurement only if justified. Most value is extracted from existing tools before any new purchase is discussed. **Investment principle**: Configuration first. Procurement only if justified. Most value is extracted from existing tools before any new purchase is discussed.
**Governance**: Weekly steering committee. Monthly board update. Quarterly antifragility assessment. Hard go/no-go gates at days 30, 60, and 90. **Governance**: Weekly check-in with named client lead. Monthly steering committee. Hard go/no-go gates at days 30, 90, and 180.
**Modularity**: While this document presents the full 180-day program, every phase can be delivered as an independent, fixed-scope module. See [Modular Engagements](../core/modular-engagements.md) for the menu of standalone engagements. **Modularity**: Every phase can be delivered as an independent, fixed-scope module. See [Modular Engagements](../core/modular-engagements.md) for the standalone engagement menu.
---
## Milestone Deliverables: What You Hold in Your Hands
The three milestone dates — Day 30, Day 90, Day 180 — are not arbitrary progress checkpoints. Each produces a specific, verifiable set of deliverables. A client who stops at Day 30 still holds something of lasting value. A client who reaches Day 180 holds everything below in a form they can operate without us.
### Day 30: Intelligence and Immunity
*Precondition: Brownhat Diagnostic complete, access provisioned by kickoff.*
| # | Deliverable | Verified by |
|---|-------------|-------------|
| 1 | **Brownhat Diagnostic report** — kill chain identified, up to 5 immediate quick wins, prioritised module roadmap | Delivered document |
| 2 | **ASTRAL deployed** — complete M365 tenant configuration snapshot committed to Git; drift detection running; event-driven change probe active | First drift PR visible in ADO |
| 3 | **PULSAR deployed** — all M365 admin audit events ingesting; logs searchable from day 1 forward; 12-month retention accumulating | Oldest log entry confirmed in UI |
| 4 | **T0 accounts hardened** — every Global Admin, Domain Admin, and high-privilege service principal identified; MFA enforced; documented with owner | CA sign-in logs show MFA enforced for T0 accounts |
| 5 | **Public attack surface report** — all internet-facing assets enumerated; P0 findings (internet-exposed + critical CVE) identified and prioritised | Delivered report |
| 6 | **Quick wins closed** — up to 5 immediate improvements from Brownhat findings, using existing tools, zero procurement | Closed items documented in change log |
| 7 | **Findings backlog opened** — all Brownhat Diagnostic findings entered with P0/P1/P2 priority, owner assigned per item, monthly cadence confirmed; this is the input queue for the housekeeping stream | Backlog visible in agreed system; all findings from items 1-6 above entered |
**The Day 30 value**: You know your kill chain. Your M365 configuration is under version control and your audit logs are being retained — permanently, from this day. Your most privileged accounts are hardened. This stands on its own regardless of what follows.
*Day 30 is a hard gate. If ASTRAL and PULSAR are not deployed and T0 accounts are not confirmed as MFA-enforced by day 30, the engagement has a resourcing or access problem that must be resolved before proceeding.*
---
### Day 90: Kill Chain Closed
*Everything from Day 30, plus:*
| # | Deliverable | Verified by |
|---|-------------|-------------|
| 8 | **MFA enforced for all users** — not just enrolled; enforced via Conditional Access policy | CA sign-in logs: zero successful authentications without MFA for in-scope users |
| 9 | **Legacy authentication blocked tenant-wide** | CAExporter export + sign-in logs: zero legacy auth sign-ins in past 7 days |
| 10 | **Conditional Access baseline deployed** — device compliance, sign-in risk, location policies active and tested | CA policy set exported by CAExporter; test sign-in matrix documented |
| 11 | **P0 and P1 vulnerabilities closed** — from Day 30 attack surface report | Rescan confirming closure; residual items in risk register |
| 12 | **AD attack paths reduced** — BloodHound before/after comparison showing measurable reduction in paths to Domain Admin | BloodHound report: path count comparison |
| 13 | **Vendor remote access hardened** — time-bounded, MFA-required, session-recorded for all third-party access | Vendor access log showing new controls enforced |
| 14 | **T0 backup integrity verified** — at least one successful restore per T0 system, timed and documented | Backup test report per T0 system |
| 15 | **ASTRAL: first restore drill** — a rejected drift PR has triggered the restore pipeline; restore validated against a real change | ADO restore pipeline run log |
| 16 | **PULSAR: top 5 alert rules operational** — rules written, test-triggered, runbooks drafted for each | Alert rule set visible; test trigger documented |
**The Day 90 value**: Your kill chain is closed. MFA covers the entire organisation. The highest-risk attack paths are measurably reduced. Any incident from this point has a detection and response capability behind it — and your configuration is auditable back to day 1.
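Deliverables 8 and 9 are verified from sign-in logs: zero successful authentications without MFA for in-scope users, and zero legacy authentication sign-ins in the past 7 days. The sketch below shows one way such a check could be scripted over exported sign-in records. The field names (`userPrincipalName`, `status.errorCode`, `authenticationRequirement`, `clientAppUsed`) are modelled on the Microsoft Graph sign-in log shape and should be confirmed against the actual export; the set of client apps treated as modern authentication and the sample records are assumptions for illustration, not output from CAExporter or PULSAR.

```python
# Client app values treated as modern authentication.
# Assumption: confirm these against the values present in the tenant's own logs.
MODERN_CLIENTS = {"Browser", "Mobile Apps and Desktop clients"}


def successful(signin):
    """A sign-in counts as successful when its error code is 0."""
    return signin["status"]["errorCode"] == 0


def signins_without_mfa(signins, in_scope_users):
    """Successful sign-ins by in-scope users where MFA was not required."""
    return [
        s for s in signins
        if successful(s)
        and s["userPrincipalName"] in in_scope_users
        and s["authenticationRequirement"] != "multiFactorAuthentication"
    ]


def legacy_auth_signins(signins):
    """Successful sign-ins that used a client outside the modern set."""
    return [
        s for s in signins
        if successful(s) and s["clientAppUsed"] not in MODERN_CLIENTS
    ]


# Hypothetical records standing in for a 7-day export.
sample = [
    {"userPrincipalName": "alice@example.com", "status": {"errorCode": 0},
     "authenticationRequirement": "multiFactorAuthentication",
     "clientAppUsed": "Browser"},
    {"userPrincipalName": "bob@example.com", "status": {"errorCode": 0},
     "authenticationRequirement": "singleFactorAuthentication",
     "clientAppUsed": "IMAP4"},
    {"userPrincipalName": "carol@example.com", "status": {"errorCode": 50126},
     "authenticationRequirement": "singleFactorAuthentication",
     "clientAppUsed": "Browser"},
]
in_scope = {"alice@example.com", "bob@example.com", "carol@example.com"}

no_mfa = signins_without_mfa(sample, in_scope)
legacy = legacy_auth_signins(sample)
print(f"Successful sign-ins without MFA: {len(no_mfa)}")
print(f"Successful legacy auth sign-ins: {len(legacy)}")
```

Both lists must be empty for the deliverable to pass. Failed sign-ins are excluded on purpose, since a blocked legacy attempt is evidence that the control is working.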
---
### Day 180: Operational Independence
*Everything from Day 90, plus:*
| # | Deliverable | Verified by |
|---|-------------|-------------|
| 17 | **Alert runbooks complete** — documented response procedure for every active PULSAR alert rule; escalation paths defined | Runbook set reviewed and signed off by client IT lead |
| 18 | **Custom detection rules** — at least 3 rules written for client-specific TTPs identified in Phase 1 kill chain | Rules deployed; test-triggered and confirmed |
| 19 | **Client IT lead operational independence** — client IT lead demonstrates ability to: review ASTRAL drift PRs, search PULSAR events, trigger and verify an alert rule | Live walkthrough completed without consultant prompting |
| 20 | **Housekeeping stream running** — 3 consecutive monthly cycles completed; accounts resolved per cycle tracked | Queue status report showing 3 cycles; measurable reduction |
| 21 | **Module completion packages delivered** — every runbook, script, configuration file, and detection rule in client's own repository | Repository contents confirmed; client confirms ownership |
| 22 | **Risk register closure evidence** — before/after comparison for every risk addressed during the programme; residual risks documented | Risk register delivered and reviewed with executive sponsor |
| 23 | **Retained capability scope agreed** — written scope for continuation: cadence, activities, named owner | Signed retained scope or explicit decision to defer |
**The Day 180 value**: You are no longer dependent on us. The systems run, the detections fire, the housekeeping happens on schedule. What continues in the retained scope is enhancement — not maintenance of what we built.
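Deliverable 20 is verified by a queue status report showing 3 cycles and a measurable reduction. A minimal sketch of that check is below, assuming the report can be reduced to a list of open-item counts: one count taken before the first cycle, then one after each monthly cycle. The function name and the definition of "measurable reduction" as a net decrease over the last three cycles are assumptions for illustration; the source does not define a threshold.

```python
def housekeeping_on_track(queue_sizes, required_cycles=3):
    """Check that enough cycles have run and the queue has shrunk.

    queue_sizes: open-item counts, starting with the count before the
    first cycle, followed by the count after each completed cycle.
    """
    completed_cycles = len(queue_sizes) - 1
    if completed_cycles < required_cycles:
        return False
    # Compare the latest count with the count before the last N cycles.
    window_start = queue_sizes[-(required_cycles + 1)]
    return queue_sizes[-1] < window_start


# Hypothetical counts: 120 open items at the start, then three monthly cycles.
history = [120, 104, 97, 81]
print(housekeeping_on_track(history))
```

A stricter variant could require a decrease in every cycle, but new findings entering the backlog can make a single month flat without the stream having stalled.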
---
### What These Milestones Assume
These deliverables are based on a typical M365-primary environment with:
- A named client lead with 30-40% availability
- Access provisioned before kickoff (accounts, MFA, Global Admin for ASTRAL/PULSAR bootstrap)
- An IT admin who can execute configuration changes with guidance
- Weekly check-ins not cancelled
**What shifts the timeline:**
- Large or complex AD environments add 2-4 weeks to Day 90 work
- High user count (500+) adds 2-4 weeks to MFA rollout change management
- Constrained IT team availability is the single most common cause of slippage — budget for it honestly in the scope
- OT environments: see the [Critical Infrastructure Adaptation](move-fast-and-fix-things.md#the-critical-infrastructure-adaptation); Day 90 timelines for network segmentation work are longer
**What does not shift the Day 30 milestone:** ASTRAL and PULSAR deploy in hours. The Brownhat Diagnostic is 2 half-day workshops. T0 account hardening is 1-2 weeks of focused work. If Day 30 deliverables are not met, the bottleneck is access provisioning or client availability — both of which are addressable before kickoff.
*For the business case and financial justification, see [Business Case Template](business-case-template.md).* *For the business case and financial justification, see [Business Case Template](business-case-template.md).*
*For board conversation guidance, see [C-Suite Conversation Guide](../core/c-suite-conversation-guide.md).* *For board conversation guidance, see [C-Suite Conversation Guide](../core/c-suite-conversation-guide.md).*
@@ -26,295 +110,290 @@ This is not a three-year digital transformation. It is a **180-day strategic res
## For the Practitioner ## For the Practitioner
This playbook provides a **time-boxed, phase-gated roadmap** for transforming a fragile enterprise into an antifragile one. It is designed for immediate deployment in consulting engagements and can be adapted to organizational size, industry, and regulatory context. ### What This Plan Is Not
The plan is structured in **four phases**: Hygiene (30 days), Control (60 days), Sovereignty (90 days), and Antifragility (180 days). Each phase builds on the previous. Skipping phases creates the illusion of progress while leaving structural fragility intact. Before using this roadmap with a client, be honest about what it commits to.
> **Core tenet**: Before any new purchase is discussed, exhaust the capabilities of existing tooling. See the [Zero-Budget Hardening Playbook](zero-budget-hardening.md) for the tactical expression of this principle. **Not a sprint.** The most common failure mode is treating security modernisation as a project that ends. It does not end. The 180-day programme establishes processes and capabilities that must run permanently. If the client does not have the internal resources to continue what we build, we need to have that conversation before we start.
**Not a full audit.** Phase 1 does not produce a complete identity inventory, a comprehensive vulnerability assessment, or an exhaustive compliance gap analysis. It produces a kill chain map and enough visibility to close existential risks. The full audit takes months and tends to produce reports that paralyse rather than mobilise.
**Not compatible with staff paralysis.** Organisations dealing with active incidents, leadership changes, or major concurrent projects cannot execute this plan on the stated timeline. The timeline is predicated on a named client lead with 30-40% availability and access provisioned before day 1.
**Not vendor-agnostic in execution.** The plan references Microsoft 365 environments as the primary context because that is most clients' reality. Non-Microsoft environments follow the same logic but require different specific tools. See the Platform Adaptation appendix in [Modular Engagements](../core/modular-engagements.md).
--- ---
## Phase 1: Hygiene (Days 0-30) ## Phase 1: Visibility (Days 0-60)
**Theme**: *You cannot defend what you cannot see.* **Theme**: *You cannot defend what you cannot see. You cannot fix what you cannot prioritise.*
The first 30 days are aggressive, disruptive, and non-negotiable. The goal is not perfection; it is **visibility**. Every unknown identity, unmapped dependency, and unmonitored access path is a latent failure waiting to happen. The first 60 days are about **kill chain mapping and critical visibility** — not about fixing everything. The goal is a clear, ranked picture of what would end the organisation, and initial closure of the most accessible existential gaps.
### Week 1-2: Identity and Access Blitz > **Why 60 days, not 30**: A 30-day identity blitz sounds fast. It is also the fastest path to disabling a service account that runs payroll at 2 AM on Friday. Week 1 is documentation and baseline. Fixes require understanding the environment first. See the engagement model's week 1 discipline — it applies to every phase of this plan.
**Tool strategy**: Use existing AD / Entra ID / IAM. No new purchases. ### Weeks 1-2: Baseline and Kill Chain Mapping
| Action | Owner | Deliverable | Existing Tool Leverage | **No changes in week 1.** Document and understand.
|--------|-------|-------------|------------------------|
| Aggressive identity audit | IAM / Security | Complete inventory of all human and non-human identities | ADUC, Entra ID portal, AWS IAM console |
| Disable all unknown / unused accounts | IAM | List of disabled accounts with business justification for exceptions | Existing IAM + PowerShell / CLI scripts |
| Rotate all critical passwords and shared secrets | Security Ops | Rotation log with verification | Existing IAM + LAPS (free from Microsoft) |
| Target: admin accounts, service accounts, krbtgt equivalents | AD / Cloud IAM | Documentation of every privileged account | Existing directory services |
| Implement password hygiene (minimum: audit) | IAM | Baseline report on password policy compliance | Native password policies + audit logs |
### Week 2-3: Perimeter and Communication Mapping | Action | Owner | Deliverable |
|--------|-------|-------------|
| Export current identity state: all accounts, groups, privilege assignments | IAM / Security | Identity inventory — stale, active, privileged, service |
| Run BloodHound collection; run Elysium password audit | Security | AD attack path map; compromised credential list |
| Run CAExporter for Conditional Access documentation | Security | Human-readable CA policy register with gaps highlighted |
| Deploy ASTRAL for M365 configuration baseline | Security | Committed tenant baseline; first drift detection operational |
| Map all public-facing assets | Security | External attack surface register with P0 classification |
| Identify the kill chain: shortest path from "nothing bad" to "organisation fails" | Security Architect | Kill chain document — maximum 2 pages; reviewed with executive sponsor |
**Tool strategy**: Use native firewall management, open-source scanners, and manual audit before purchasing new NDR/VM platforms. ### Weeks 3-4: T0 Identity Hardening
| Action | Owner | Deliverable | Existing Tool Leverage | Target: privileged accounts only. Not all accounts.
|--------|-------|-------------|------------------------|
| Audit all vendor / supplier access paths | Security / Procurement | Inventory of VPN, RDP, Citrix, SSH, FTP, SCP, API keys | Existing IAM, VPN logs, firewall logs |
| Review and document firewall rules | Network Team | Rule set with business justification for each | Native firewall management interfaces |
| Map public-facing assets from external perspective | Security | Attack surface report with P0 classification | Free/open-source: Shodan, certificate transparency logs, nmap |
| Implement aggressive vulnerability scanning | Security | Weekly scan results with trending | Existing scanner, Microsoft Defender Vulnerability Management, or OpenVAS |
### Week 3-4: Visibility and Monitoring Baseline | Action | Owner | Deliverable |
|--------|-------|-------------|
| Force-reset accounts identified as compromised by Elysium (P0) | IAM | Password reset log with verification |
| Enforce MFA on all T0 accounts: Global Admins, Domain Admins, backup admins, service principals with high privilege | IAM | MFA coverage report for T0 accounts |
| Review and disable accounts that are clearly orphaned: departed employees confirmed by HR | IAM | Disable log — only accounts with confirmed ownership resolution |
| Rotate KRBTGT and critical service account passwords | AD | Rotation log; tested without service disruption |
| Review and remove direct Global Admin assignments; move toward PIM or named individual accounts | IAM | Privilege assignment review |
**Tool strategy**: Maximize existing EDR/SIEM before considering new platforms. A spreadsheet CMDB is infinitely better than no CMDB. > **What we do not do in weeks 3-4**: We do not attempt to disable all unknown accounts. We do not attempt to resolve all service account ownership. We do not attempt to achieve 100% MFA on all users. These are Phase 2 activities, started after the kill chain is closed and the environment is understood.
| Action | Owner | Deliverable | Existing Tool Leverage | ### Weeks 5-6: Logging, Perimeter, and Critical Asset Inventory
|--------|-------|-------------|------------------------|
| Deploy endpoint detection on all managed devices | SOC / MDE | Coverage report: % of estate monitored | Existing EDR (Defender, CrowdStrike, SentinelOne) | | Action | Owner | Deliverable |
| Establish log aggregation for critical systems | Security | Centralized logging for T0 and T1 assets | Existing SIEM, syslog server, or cloud native logging (Sentinel, CloudWatch, Cloud Logging) | |--------|-------|-------------|
| Create initial CMDB seed for critical systems | IT / Security | CMDB populated with crown jewels | Existing ITAM, ServiceNow, or spreadsheet | | Deploy PULSAR for M365 audit log ingestion | Security | Audit events ingested; watermarks established; search operational |
| Document "kill chain": shortest path to organizational failure | Security Architect | Threat model and mitigation map | Manual analysis + stakeholder interviews | | Enable logging for T0 systems where it is missing | Security | Logging coverage report for T0/T1 assets |
| Audit all vendor and third-party remote access paths | Security / Procurement | Vendor access inventory with remove/restrict list |
| Scan public-facing assets for critical CVEs | Security | Prioritised findings: P0 (internet-facing, critical CVE), P1, P2 |
| Seed CMDB with T0 assets | IT / Security | T0 asset register with ownership, backup status, recovery procedure |
| Validate backup integrity for T0 assets | Backup Admin | Backup test report — at least one successful restore per T0 system |
### Weeks 7-8: Kill Chain Closure and Phase 1 Wrap
| Action | Owner | Deliverable |
|--------|-------|-------------|
| Close P0 vulnerabilities identified in week 5-6 scan | Security | Remediation log with verification |
| Restrict or close the highest-risk vendor access paths | Security / Procurement | Vendor access changes confirmed |
| Implement basic network segmentation between IT and OT (if applicable) | Network / OT | Segmentation policy; validated firewall rules |
| Phase 1 review: re-run BloodHound and Elysium against week 1 baseline | Security | Before/after comparison; revised kill chain assessment |
| Establish findings backlog: all Phase 1 findings entered with priority, owner, target date | IAM / Security | [Findings Backlog](../assessment-templates/findings-backlog.md) populated; named owner; monthly housekeeping cadence confirmed |
### Phase 1 Exit Criteria ### Phase 1 Exit Criteria
- [ ] 100% of identities known and validated - [ ] Kill chain documented and reviewed with executive sponsor
- [ ] 100% of privileged access reviewed - [ ] T0 accounts: MFA enforced, privilege reviewed, compromised credentials reset
- [ ] All public-facing assets identified and scanned - [ ] P0 vulnerabilities (internet-facing, critical CVE) closed
- [ ] Centralized logging operational for critical systems - [ ] ASTRAL deployed; M365 baseline committed
- [ ] CMDB seeded with T0/T1 assets - [ ] PULSAR deployed; M365 audit logs ingesting
- [ ] Initial "kill chain" documented - [ ] T0 asset CMDB complete with backup integrity verified
- [ ] Vendor access inventory complete; highest-risk paths closed
- [ ] Housekeeping stream established: named owner, cadence, populated queue
### Phase 1 Mantra **What "complete" does not mean at day 60**: All identities validated. All shared accounts eliminated. MFA on 100% of users. Zero legacy protocols. These are legitimate targets — they belong in the housekeeping queue and Phase 2 work, tracked, resourced, and given realistic timescales.
> *"Do not be afraid to break things temporarily. Disable first, justify second. Visibility before permission."*
---
## Phase 2: Control (Days 30-60) ## Phase 2: Control (Days 60-120)
**Theme**: *What we have seen, we must now contain.* **Theme**: *Close the kill chain. Build on what is understood, not what is assumed.*
With visibility established, the next 30 days focus on **closing the highest-risk gaps** without introducing operational paralysis. This is the phase of quick wins and surface reduction. Phase 2 takes the kill chain map from Phase 1 and systematically closes the structural gaps. The work is less about discovery and more about verified remediation with proper change management.
### Week 5-6: Attack Surface Reduction (ASR) ### Weeks 9-10: MFA and Identity Hardening (Broad Rollout)
**Tool strategy**: ASR rules and PAWs are native Microsoft capabilities. For non-Microsoft environments, use existing endpoint management. Phase 1 hardened T0. Phase 2 extends to all users — with proper change management.
| Action | Owner | Deliverable | Existing Tool Leverage | | Action | Owner | Deliverable |
|--------|-------|-------------|------------------------| |--------|-------|-------------|
| Eliminate shared accounts where possible | IAM | Reduction metric: % of shared accounts decommissioned | Existing IAM + access review process | | Enforce MFA on all remote access: not just T0, but all users | IAM | MFA coverage report (% of users) — target 100% enforced, not just enrolled |
| Implement Attack Surface Reduction rules on endpoints | Endpoint Security | ASR policy deployed and compliance measured | Microsoft Defender ASR (already owned in E3/E5) | | Block legacy authentication protocols tenant-wide | IAM | Legacy auth block confirmed via CAExporter and sign-in log review |
| Harden admin access: dedicated PAWs, no browsing, no email | Security | PAW architecture documented and deployed | Existing Windows / Intune / GPO | | Deploy Conditional Access baseline: device compliance, location, sign-in risk | IAM | CA policy set deployed and tested; rollback documented |
| Review and minimize permissions across all platforms | IAM / App Owners | Permission matrix with least-privilege gaps identified | Native IAM interfaces + scripts | | Continue housekeeping queue: first monthly cycle | IAM | Accounts resolved this cycle; queue status report |
### Week 6-7: Network and DNS Security > **Change management is the constraint here, not technical complexity.** MFA rollout for 500 users requires helpdesk preparation, communication, exception handling, and at minimum two weeks of lead time. Scope this honestly. A rollout that generates 200 support tickets and forces an exception for the CEO because his phone broke is a rollout that gets walked back.
**Tool strategy**: Use existing DNS infrastructure, firewall segmentation, and open-source sensors (Zeek/Suricata) before buying NDR. ### Weeks 11-12: Attack Surface Reduction
| Action | Owner | Deliverable | Existing Tool Leverage | | Action | Owner | Deliverable |
|--------|-------|-------------|------------------------| |--------|-------|-------------|
| Deploy DNS security (filtering, logging, anomaly detection) | Network | DNS security coverage report | Existing DNS infrastructure, Quad9/Cloudflare free tiers, Microsoft DNS security | | Deploy Intune compliance policies; enforce device compliance in CA | Endpoint / IAM | Compliance policy set; non-compliant device access blocked |
| Segment IT/OT networks where they intersect | Network / OT | Network segmentation diagram and policy | Existing firewalls and VLANs | | Harden admin access: dedicated admin accounts, PAW where feasible | Security | Admin account architecture; PAW deployed for T0 admins |
| Deploy network sensors at critical boundaries | SOC | Sensor coverage map with alerting validated | Zeek or Suricata (open-source) or existing IDS/IPS | | Implement ASR rules on all managed endpoints | Endpoint Security | ASR policy deployed; compliance measured |
| Review and remove excessive application permissions (OAuth grants, service principals) | IAM | App permission audit; high-risk grants reviewed and reduced |
### Week 7-8: Multi-Factor Authentication and Conditional Access ### Weeks 13-14: Network Hardening and Vendor Governance
**Tool strategy**: MFA and conditional access are native capabilities of Entra ID, Okta, and cloud IAM. No additional purchase required. | Action | Owner | Deliverable |
|--------|-------|-------------|
| Implement DNS security: filtering and logging | Network | DNS security coverage report |
| Harden vendor remote access: time-bounded, MFA, session recording | Security / Procurement | Vendor access gateway operational; access policy enforced |
| Patch P1 vulnerabilities from Phase 1 scan | Security | Remediation log; rescan confirming closure |
| Establish change window discipline: all production changes through approved process | IT / Security | Change management process documented and operational |
| Action | Owner | Deliverable | Existing Tool Leverage | ### Weeks 15-16: Verification and Phase 2 Wrap
|--------|-------|-------------|------------------------|
| Enforce MFA on all remote access paths | IAM | MFA coverage: 100% of remote access | Entra ID, Okta, Duo, or native cloud IAM MFA | | Action | Owner | Deliverable |
| Implement conditional access policies | IAM / Cloud | Policy set: device compliance, location, risk score | Entra ID Conditional Access, AWS IAM, GCP IAM | |--------|-------|-------------|
| Review and harden M365 / Google Workspace security | Cloud Team | Cloud security posture report | Microsoft Secure Score, Google Security Health Analytics | | Re-run BloodHound, Elysium, and CAExporter against Phase 1 baseline | Security | Attack path reduction report; before/after metrics |
| Run Purple Knight / E8-CAT against AD and M365 | Security | Security score comparison; residual findings list |
| Review ASTRAL drift log for Phase 1-2 period | Security | Configuration change audit; unauthorised drift incidents |
| Review PULSAR audit log: anomalous events flagged, investigated, resolved | Security | Audit review report |
| Update risk register: what Phase 1-2 closed, what remains open, what Phase 3 addresses | Security | Updated risk register signed off by client lead |
| Housekeeping queue: second monthly cycle | IAM | Queue status; cumulative accounts resolved |
### Phase 2 Exit Criteria
- [ ] Shared accounts reduced by minimum 50% - [ ] MFA enforced for 100% of users (not just enrolled — enforced via CA policy)
- [ ] Legacy authentication blocked tenant-wide
- [ ] CA baseline deployed and tested
- [ ] ASR rules active on all managed endpoints
- [ ] MFA enforced on 100% of remote and privileged access - [ ] P1 vulnerabilities from Phase 1 scan closed
- [ ] DNS security operational - [ ] Vendor remote access hardened and inventoried
- [ ] Network segmentation policy defined and initial segments implemented - [ ] Attack path reduction measurable against Phase 1 BloodHound baseline
- [ ] Conditional access policies active for cloud workloads - [ ] Housekeeping queue running; two cycles completed
### Phase 2 Mantra
> *"The goal is not to block everything. It is to ensure that every allowed path is known, justified, and monitored."*
---
## Phase 3: Sovereignty (Days 60-90) ## Phase 3: Signal and Retained Capability (Days 120-180)
**Theme**: *Reclaim what should never have been rented.* **Theme**: *Build detection on the hardened foundation. Build the capability to sustain what was built.*
This is where the antifragile approach diverges sharply from conventional hardening. The focus shifts from defending the perimeter to **owning the intelligence** that drives the organization. Phase 3 starts only after Phase 2 exit criteria are met. Detection engineering on an unhardened environment is waste — the signal-to-noise ratio is too low to produce actionable intelligence.
### Week 9-10: AI Sovereignty Assessment > **Why not AI Sovereignty in Phase 3**: AI sovereignty (local models, owned inference infrastructure, sovereign cognitive capability) is a multi-year programme, not a 30-day sprint. Hardware procurement alone typically takes 6-12 weeks. Claiming it as a Phase 3 deliverable sets up the engagement to fail. AI sovereignty begins with the audit work in Phase 1 (AI usage inventory, classification, assessment of vendor terms) and continues as a separate parallel initiative. The Azure OpenAI Sovereignty Bridge is the appropriate near-term stepping stone. See [AI Sovereignty Framework](../core/ai-sovereignty-framework.md) and [Azure OpenAI Sovereignty Bridge](../core/azure-openai-sovereignty-bridge.md).
**Tool strategy**: Discovery requires interviews and proxy log analysis. No purchase needed for assessment. ### Weeks 17-18: Detection Engineering Foundation
| Action | Owner | Deliverable | Existing Tool Leverage | | Action | Owner | Deliverable |
|--------|-------|-------------|------------------------| |--------|-------|-------------|
| Inventory all AI usage: approved and shadow | Security / AI Lead | AI usage map with data classification | Proxy logs, SaaS billing review, employee interviews | | Write initial PULSAR alert rules: CA policy changes, new Global Admin assignments, bulk mailbox export, app permission grants outside change window | Security | Alert rule set deployed; test-triggered and validated |
| Classify AI workloads by sovereignty requirement | Security Architect | T0/T1/T2 AI asset classification | Existing data classification framework | | Review SIEM coverage: which T0 events generate alerts, which do not | Security | Detection coverage map against MITRE ATT&CK top 10 for M365 |
| Identify highest-value local AI pilot candidate | AI Lead / Business | Pilot scope document with success criteria | Business stakeholder interviews | | Tune ASTRAL rolling PRs: configure reviewer notification, test reject/restore flow | Security | ASTRAL review workflow operational; first restore test completed |
| Assess vendor AI terms: data usage, training, termination | Legal / Security | Risk register for each AI provider | Legal review of existing contracts | | Establish alert response runbooks: who gets notified, what they do, what they escalate | Security / Client Lead | Runbooks for top 5 alert types |
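The alert conditions named in the initial rule set (CA policy changes, new Global Admin assignments, bulk mailbox export, app permission grants outside the change window) can be sketched as plain logic. This is an illustration only and is not PULSAR's rule format, which this document does not specify. The event type names, the `CHANGE_WINDOW` hours, and the `should_alert` function are all assumptions made for the example.

```python
from datetime import datetime, time

# Hypothetical approved change window (local time); agree the real one with the client.
CHANGE_WINDOW = (time(18, 0), time(22, 0))

# Hypothetical event type names for changes that should always raise an alert.
ALWAYS_ALERT = {"ca_policy_change", "global_admin_assigned", "bulk_mailbox_export"}


def should_alert(event_type: str, when: datetime) -> bool:
    """Return True if the event should raise an alert under the sketched rules."""
    if event_type in ALWAYS_ALERT:
        return True
    if event_type == "app_permission_grant":
        start, end = CHANGE_WINDOW
        # Grants are expected only inside the change window; anything else alerts.
        return not (start <= when.time() <= end)
    return False
```

Each rule should be test-triggered before it is counted as deployed, which is the same check the deliverable column asks for.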
### Week 10-11: Local AI Infrastructure Deployment ### Weeks 19-20: Endpoint and Identity Detection
**Tool strategy**: Start with existing hardware or low-cost sovereign cloud. Use open-source inference servers (Ollama, vLLM, llama.cpp). | Action | Owner | Deliverable |
|--------|-------|-------------|
| Deploy Wazuh or verify existing EDR coverage for on-premise systems | Security | Endpoint detection coverage report |
| Write custom detection rules for kill chain-specific TTPs identified in Phase 1 | Security | Custom rule set tuned to client environment |
| Establish weekly threat review cadence: PULSAR event summary + ASTRAL drift review | Security / Client Lead | First weekly review completed; format agreed |
| AI usage audit: classify current AI workflows by data sensitivity and vendor agreement | Security / Legal | AI usage register; high-risk workflows flagged for remediation |
| Action | Owner | Deliverable | Existing / Low-Cost Tool Leverage | ### Weeks 21-24: Knowledge Transfer and Handover
|--------|-------|-------------|----------------------------------|
| Deploy local inference infrastructure (on-prem or sovereign cloud) | Infrastructure | Operational inference cluster | Underutilized servers, retired workstations, or sovereign cloud VM |
| Establish model versioning and artifact management | MLOps / Security | Model registry with provenance tracking | Git + DVC or simple artifact storage |
| Implement access controls for model weights and training data | Security | T0-class protection for AI assets | Existing file servers, encryption, IAM |
| Deploy initial pilot: RAG or fine-tuned model on proprietary data | AI Team | Working pilot with performance baseline | Ollama, llama.cpp, or vLLM (open-source) + quantized open models |
### Week 11-12: Backup, Recovery, and Validation The most important deliverable of Phase 3 is **the client's ability to operate everything without us.**
**Tool strategy**: Use existing backup and DR infrastructure. The goal is to test and document, not to buy. | Action | Owner | Deliverable |
|--------|-------|-------------|
| Action | Owner | Deliverable | Existing Tool Leverage | | Runbook completion: every system built or modified has an operating runbook | Security / Client Team | Runbook set reviewed and signed off by client IT lead |
|--------|-------|-------------|------------------------| | Client training: ASTRAL drift review workflow, PULSAR event search, alert response | Security | Training delivered; client IT lead can demonstrate competency |
| Perform full recovery drill of one critical system from backup | IT / Security | Recovery time documented, gaps identified | Existing backup solution | | Housekeeping queue: third and fourth monthly cycles | IAM | Queue status; cumulative resolution metrics |
| Validate backup integrity for all T0 assets | Backup Admin | Integrity report with sample restorations | Existing backup solution + integrity scripts | | Document what was built: configuration baseline document for every module | Security | Module completion package delivered |
| Test local AI pilot under degraded network conditions | AI / Infrastructure | Resilience validation report | Existing network infrastructure + manual testing | | Phase 3 review: risk register update, metrics summary, Phase 4 / retained capability recommendation | Security | Final 180-day programme review with executive sponsor |
| Document and exercise incident response for AI-specific threats | SOC / Security | Runbook: model poisoning, data exfiltration, adversarial input | Existing IR framework + internal knowledge |
### Phase 3 Exit Criteria
- [ ] All AI usage inventoried and classified - [ ] PULSAR alert rules operational for top 5 M365 risk scenarios
- [ ] Local inference infrastructure operational - [ ] ASTRAL drift review workflow operational; first restore tested
- [ ] One high-value AI pilot deployed and measured - [ ] Custom detection rules written for client-specific TTPs
- [ ] T0 protection applied to model weights and training data - [ ] Weekly threat review cadence established and running
- [ ] Critical system recovery drill completed successfully - [ ] All runbooks completed and signed off by client IT lead
- [ ] AI-specific incident response runbook created - [ ] Client IT lead can operate ASTRAL and PULSAR without consultant support
- [ ] AI usage registered and high-risk workflows flagged
### Phase 3 Mantra - [ ] Housekeeping queue: four consecutive cycles completed
> *"We are moving from being consumers of intelligence to manufacturers of our own. The vault is built; now we fill it."*
---
## Phase 4: Antifragility (Days 90-180) ## Phase 4: Antifragility (Ongoing)
**Theme**: *Build systems that grow stronger from disruption.* **Theme**: *The programme does not end. The organisation learns faster from disruption than competitors do.*
The final phase converts the hardened foundation into an adaptive, learning organization. This is where antifragility becomes operational reality. Phase 4 is not a 30-day sprint. It is an ongoing operational posture. The 180-day programme establishes the foundation; Phase 4 is what happens when that foundation is maintained and extended over months and years.
### Month 4: Structural Decoupling and Optionality **Phase 4 activities** (initiated at 180 days; sustained indefinitely):
**Tool strategy**: Documentation, architecture, and open-source chaos tools (Chaos Mesh, Gremlin free tier, custom scripts). Work, not purchases. - **Retained capability**: Monthly ASTRAL drift review, PULSAR event summaries, quarterly Elysium/BloodHound scans, housekeeping queue advancement
- **Detection engineering**: Progressive extension of alert rule coverage; tuning based on real events; quarterly rule review
- **Structural improvement**: Exit architectures for vendor dependencies, progressive elimination of legacy systems, planned OT technology refresh
- **Chaos engineering**: Controlled failure exercises — starting with non-production, progressing to production once detection and recovery capability is confirmed
- **Red team exercises**: Annual structured adversarial testing — not before Phase 2 is complete and detection is operational
- **AI sovereignty programme**: Local inference infrastructure, where justified by workload and capability; AURORA deployment for M365 governance intelligence; sovereign AI as a parallel multi-year initiative
- **Greenfield capability building**: Configuration as code for all managed systems; tested migration procedures; documented rebuild path
| Action | Owner | Deliverable | Existing / Free Tool Leverage | **What makes Phase 4 real**: A named person who owns the housekeeping queue. A calendar-blocked weekly threat review. A quarterly retained capability scope. Without these, Phase 4 does not happen — and everything built in 180 days begins to rot.
|--------|-------|-------------|------------------------------|
| Document exit architecture for all major platform dependencies | Enterprise Architecture | 90-day exit plan per critical vendor | Architecture documentation, existing runbooks |
| Implement abstraction layers for proprietary integrations | Engineering | Interface documentation and migration test | Existing development tools and frameworks |
| Establish dual-vendor readiness for one critical category | Procurement / Engineering | Technical proof of capability | Existing engineering capacity, open standards |
| Deploy chaos engineering: simulate critical dependency failure | Resilience Team | Chaos experiment report with findings | Chaos Mesh (open-source), custom scripts, Gremlin free tier |
### Month 5: Stress-to-Signal Conversion
**Tool strategy**: Process and culture changes require no licensing. Use existing EDR/SIEM for detection validation.
| Action | Owner | Deliverable | Existing Tool Leverage |
|--------|-------|-------------|------------------------|
| Implement blameless post-mortem process with structural mandates | Culture / Security | Post-mortem template and governance | Existing collaboration tools (Confluence, SharePoint, Notion) |
| Deploy production chaos engineering with automated rollback | Resilience Team | Monthly chaos experiment schedule | Existing orchestration + open-source chaos tools |
| Create feedback loop: incident findings → architecture changes | Security Architect | Closed-loop metrics: mean time to structural fix | Existing ticketing system (Jira, ServiceNow) |
| Launch "red team as a service": continuous adversarial testing | Security | Monthly red team report | Internal team + existing EDR/SIEM for detection validation |
### Month 6: Defensive AI and Continuous Modernisation
**Tool strategy**: Defensive AI runs on the local inference infrastructure already deployed. Posture measurement uses existing APIs and open-source dashboards.
| Action | Owner | Deliverable | Existing / Low-Cost Tool Leverage |
|--------|-------|-------------|----------------------------------|
| Expand local AI to defensive use cases: anomaly detection, code review, vulnerability prioritization | AI / Security | Defensive AI capability map | Local AI cluster deployed in Phase 3 |
| Implement automated security posture measurement | Security | Continuous compliance dashboard | Existing APIs (Microsoft Graph, AWS APIs) + Grafana or open-source dashboard |
| Evaluate and migrate additional AI workloads to local infrastructure | AI Lead | Migration roadmap with quarterly targets | Local AI infrastructure + business case templates |
| Conduct first antifragility maturity assessment | Consultant / Security | Baseline maturity score with gap analysis | Spreadsheet or existing GRC tool |
| Pilot organizational integration: embed security in one product team | Consultant / Engineering | Shift-left pilot metrics | Existing team structure + collaboration tools |
| **Deploy AI-assisted TVM operationalization** | AI / Security | AI TVM dashboard; <48h critical CVE response | Defender Exposure Management + Azure OpenAI or local LLM; see [AI-Assisted TVM Blueprint](ai-assisted-tvm.md) |
### Phase 4 Exit Criteria
- [ ] Exit architectures documented for top 5 vendor dependencies
- [ ] Chaos engineering operational in production
- [ ] Mean time to structural fix < 14 days from incident
- [ ] Defensive AI pilot operational
- [ ] First antifragility maturity assessment completed
- [ ] Quarterly antifragility review calendar established
### Phase 4 Mantra
> *"We do not want fewer incidents. We want incidents that teach us something we could not have learned any other way."*
---
## Governance and Cadence
### Weekly Steering Committee
- Review blockers and escalations
- Validate phase exit criteria
- Adjust scope based on organizational readiness
### Monthly Board Update
- Risk reduction metrics
- Antifragility maturity trend
- Investment vs. risk-exposure reduction
- Strategic narrative: "This is not a cost centre; it is optionality insurance"
### Quarterly Retrospective
- What failed that taught us something?
- What assumptions have been invalidated?
- What new dependencies have emerged?
- What can be simplified or removed?
---
## Success Metrics
| Dimension | Metric | Target | | Dimension | Metric | Realistic Target |
|-----------|--------|--------| |-----------|--------|-----------------|
| **Visibility** | % of assets in CMDB | 100% of T0/T1 within 30 days | | **Kill chain** | Kill chain nodes closed | 100% of P0 nodes closed by day 120 |
| **Control** | Mean time to contain new identity | < 1 hour | | **Identity** | MFA enforcement on privileged accounts | 100% of T0 accounts by day 60; 100% of all accounts by day 120 |
| **Sovereignty** | % of proprietary AI workloads local | 100% of T0-class within 90 days | | **Configuration** | ASTRAL drift detected and reviewed | Weekly; 100% of unauthorised drift investigated within 48h |
| **Resilience** | Recovery time for critical system | < 4 hours | | **Audit trail** | PULSAR retention operational | 12+ months of M365 audit events retained by day 60 |
| **Learning** | Structural fixes per incident | ≥ 1 | | **Housekeeping** | Stale accounts resolved per quarter | Measurable queue reduction each cycle; not a fixed % target |
| **Optionality** | Vendor dependencies without exit plan | 0 | | **Recovery** | T0 system recovery test completed | At least one per T0 system within 180 days |
| **Handover** | Client IT lead operational independence | All built systems operable without consultant by day 180 |
> **On metrics and honesty**: Avoid targets that sound like achievements but are not verifiable. "100% of identities validated" cannot be verified in 180 days in any organisation with meaningful history. "All T0 accounts with MFA enforced and verified via CA sign-in logs" is verifiable. Write metrics you can prove, not metrics that sound ambitious.
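To make the "verifiable" half of that contrast concrete, here is a minimal Python sketch of how the T0 MFA metric could be checked against exported sign-in records. The `SignIn` record shape and the `mfa_enforcement_gaps` function are hypothetical; they do not mirror the actual Entra ID sign-in log schema, which would need to be mapped onto these fields first.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SignIn:
    user: str            # account identifier (hypothetical field name)
    mfa_satisfied: bool  # True if this sign-in satisfied an MFA requirement


def mfa_enforcement_gaps(t0_accounts: set[str], sign_ins: Iterable[SignIn]) -> list[str]:
    """Return T0 accounts that cannot be shown to have MFA enforced.

    An account passes only if it has at least one sign-in in the sample
    and every one of its sign-ins satisfied MFA. Accounts with no
    sign-ins are reported as gaps because nothing proves enforcement.
    """
    all_mfa: dict[str, bool] = {}
    for s in sign_ins:
        if s.user in t0_accounts:
            all_mfa[s.user] = all_mfa.get(s.user, True) and s.mfa_satisfied
    return sorted(a for a in t0_accounts if not all_mfa.get(a, False))
```

An empty result is the evidence behind "100% of T0 accounts"; a non-empty result is the list of accounts to investigate. Treating "no sign-ins observed" as a gap is a deliberate choice: an unused privileged account is unproven, not compliant.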
---
## Governance and Cadence
### Weekly Check-In (30 minutes, every week)
- Change log review: what was completed, what is blocked
- Client decisions required this week
- Risks and open items
*If this meeting is consistently cancelled by the client, the engagement pauses until it resumes.*
### Monthly Steering Committee (60 minutes)
- Phase progress against exit criteria
- Risk register review
- Housekeeping queue status
- Budget and scope review
- Next phase / retained capability planning
### Phase Gate Reviews (Days 60, 120, 180)
Hard go/no-go decisions. Not formalities. If phase exit criteria are not met, the programme does not advance — it addresses the gaps.
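The gate rule is strict enough to express in a few lines. The sketch below is illustrative only; `phase_gate` and the criterion labels are hypothetical and not part of any tool named in this plan.

```python
def phase_gate(criteria: dict[str, bool]) -> tuple[str, list[str]]:
    """Return the gate decision and the exit criteria that are still unmet.

    Any single unmet criterion yields "no-go": the programme stays in the
    current phase and works the listed gaps.
    """
    unmet = [name for name, met in criteria.items() if not met]
    return ("go" if not unmet else "no-go", unmet)


# Example with abbreviated, hypothetical criterion labels.
decision, gaps = phase_gate({
    "MFA enforced for all users": True,
    "Legacy authentication blocked": False,
    "CA baseline deployed and tested": True,
})
```

The useful output is the list of gaps, not the decision itself: it becomes the agenda for the period before the gate is re-run.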
---
## Adaptation Guide
### Small Organizations (< 100 employees) ### Small Organisations (< 100 employees)
- Compress Phases 1-2 into 30 days - Phase 1 focus: kill chain, T0 accounts, ASTRAL/PULSAR deployment. Skip broad identity audit — it is not necessary for small populations.
- Use managed sovereign cloud for local AI instead of on-premises hardware - Phase 2 focus: MFA for all users (achievable quickly at small scale), basic CA, device compliance.
- Focus on identity, backup, and one high-value AI pilot - Phase 3 focus: runbooks and handover. Detection engineering is proportional to environment complexity.
- Leverage Microsoft Business Premium or Google Workspace security features fully before any additional purchase - **Do not compress the timeline further.** The bottleneck at small organisations is almost always IT resource availability and change management, not technical complexity.
### Regulated Industries (Finance, Healthcare, Critical Infrastructure)
- Extend Phase 1 to 45 days for compliance mapping - Extend Phase 1 to 90 days where regulatory mapping and OT inventory are required.
- Integrate regulatory requirements into T0 classification - Add compliance validation gates at each phase exit — specific evidence requirements for NIS2/DORA/GDPR.
- Add compliance validation gates at each phase exit - The housekeeping stream is non-negotiable for regulators who require demonstrable continuous control.
### Highly Distributed Organizations ### Organisations with Heavy Technical Debt
- Prioritize network segmentation and DNS security in Phase 1 - Accept explicitly, in writing, that 20 years of debt will not be cleared in 180 days.
- Deploy edge inference nodes in Phase 3 instead of central cluster - Phase 1 focus is kill chain only. The full debt picture goes into the housekeeping queue and the Phase 4 backlog.
- Emphasize operational resilience and disconnected operations - The rapid modernisation plan addresses existential risk. The housekeeping stream addresses accumulated risk over time. Both are necessary; neither replaces the other.
- Adjust Phase 2 exit criteria to reflect the realistic pace of MFA rollout in high-debt environments — legacy systems often require extended exception handling.
### Organizations with Heavy Technical Debt ### OT/Critical Infrastructure Environments
- Accept that 20 years of debt cannot be cleared in 180 days - Phase 1 must include OT asset inventory and IT/OT connection map.
- Use defensive AI in Phase 4 to accelerate debt identification and prioritization - Phase 2 segmentation work (IT/OT boundary) is the primary kill chain closure, not identity hardening.
- Focus on "kill chain" protection rather than comprehensive cleanup - See [Vertical: Power and Utilities](../reference/vertical-power-utilities.md) and the Critical Infrastructure Adaptation in [Move Fast and Fix Things](move-fast-and-fix-things.md#the-critical-infrastructure-adaptation).
- Map every action to CIS IG1 to show standards alignment without additional framework investment
---
@@ -0,0 +1,334 @@
# Sample Engagement: Mid-Market Hybrid Organisation
> *This document is a calibration reference for consultants. It walks through a realistic engagement for a specific client profile from first contact through Day 180. Use it to calibrate your own scope estimates, find comparable findings for risk register entries, and understand what a complete engagement looks like for this type of organisation.*
---
## Client Profile: Nexus Operations s.r.o.
**Fictional client. All details are representative of a real mid-market profile.**
| Attribute | Detail |
|-----------|--------|
| **Size** | 500 employees, 10 IT/admin staff |
| **Sector** | Professional services (management consulting + outsourced IT services) — NIS2 **important entity** under digital infrastructure provisions |
| **Identity** | Active Directory (on-premises, single forest, two domains — legacy acquisition) + Entra ID (hybrid join, Azure AD Connect sync) |
| **M365 licensing** | E3 — includes Entra ID P1 (Conditional Access), Defender for Endpoint Plan 1, Intune, Exchange Online, SharePoint, Teams. No E5 features: no PIM, no Defender for Identity, no Sentinel, no Purview advanced. |
| **Endpoint management** | Intune deployed 18 months ago; ~70% Windows enrollment, ~30% macOS enrollment; no iOS/Android policy; Intune used primarily for app deployment, not compliance enforcement |
| **Third-party tools** | Jira (cloud), GitHub (cloud, mix of org/personal accounts), Confluence (cloud), a legacy on-prem ERP (SAP), an on-prem file server (Windows Server 2016), a CRM (Salesforce), and approximately 12 other SaaS tools identified in procurement; shadow IT suspected |
| **Infrastructure** | Three offices (Prague HQ, Brno, Warsaw); hybrid work standard; ~80 external contractors at any given time; site-to-site VPN between offices; split DNS; no SD-WAN |
| **Current security** | No dedicated security tool beyond Defender AV. Microsoft Secure Score: 42%. No SIEM. No SOC. Previous pentest 2 years ago (report available). Previous ISO 27001 attempt abandoned 18 months ago. |
| **NIS2 status** | In-scope as important entity; national transposition deadline passed; supervisory authority has sent initial questionnaire; response due in 90 days |
| **Trigger** | NIS2 questionnaire received; CTO has seen the [Brownhat Diagnostic](../assessment-templates/nist-csf-baseline.md) approach referenced by a peer; CISO role vacant (they are looking) |
---
## Engagement Context
### Why They Called
The NIS2 questionnaire is the proximate trigger but not the underlying problem. The CTO's real concern, surfaced in the discovery call: "We have been growing fast, the acquisition two years ago added a lot of mess, and I genuinely do not know what we would do if we had a serious incident. We have contractors everywhere and I am not sure all of them are properly offboarded when their engagement ends."
This is a common and honest framing. The NIS2 deadline creates a compliance urgency, but the actual risk is operational — undocumented access, accumulated technical debt from the acquisition, and no detection capability.
### What the Discovery Call Revealed
**The trigger question** ("What happened recently that made you call us?") produced: the NIS2 questionnaire, plus a near-miss three months ago. A contractor who had left six months previously used their still-active account to access a SharePoint site. Nobody noticed until the contractor themselves mentioned it to their former manager. No data exfiltration was confirmed, but its absence was never verified.
**The accountability question**: Named IT lead is the senior sysadmin, Ondřej Blaha. CTO is the executive sponsor. CISO role vacant — the IT lead is acting as de facto security lead without the title or dedicated time.
**The tools question**: E3 confirmed. Intune confirmed but underutilised. No SIEM. Previous pentest report available (2 years old). Defender AV on all Windows endpoints; coverage on macOS "mostly."
**The success question**: "Pass the NIS2 questionnaire. Know that if something happens, we can respond. And if I hire a CISO in six months, I want there to be something to hand over."
This is an excellent brief. Concrete, honest, achievable.
### What Disqualifies This Client?
Nothing. All green lights:
- Named executive sponsor with budget authority (CTO)
- Named IT lead with operational access (Ondřej)
- Real trigger with a deadline (NIS2 response in 90 days)
- Honest assessment of current state
- Realistic success criteria
**One flag to manage**: The NIS2 questionnaire response is due in 90 days. This creates urgency that may pressure the client to skip the Brownhat Diagnostic and go straight to "give us a report for the regulator." Resist this. The diagnostic *is* the report — it produces evidence directly usable in the NIS2 response. Skipping it produces a worse outcome for both the client and the regulator.
---
## Brownhat Diagnostic Findings
*What a competent two-day diagnostic would find in this environment. Presented as the consultant would present it to the CTO.*
### Kill Chain Assessment
The shortest path from "nothing bad has happened yet" to "Nexus Operations cannot operate" runs through identity.
```
Compromised contractor credential (still active after offboarding)
→ Access to M365 (no MFA enforced, or legacy auth bypasses MFA)
→ Access to SharePoint / Teams (all data)
→ Access to Exchange (all email, calendar, contacts)
→ Password spray against Entra ID → escalate to admin account
→ Domain Admin via Entra ID Connect sync account
→ Full AD compromise → all on-prem systems
→ ERP (SAP) → financial data, operational disruption
```
This is not theoretical. The near-miss with the departed contractor's still-active account shows the first link already exists; one sprayed or reused credential is enough to start this chain.
**Secondary kill chain** (on-prem):
```
Internet-facing VPN endpoint (legacy firmware, no MFA)
→ Internal network access
→ Lateral movement via NTLM relay (expected: NTLM not disabled)
→ File server → ERP → AD
```
### Findings by Priority
#### P0 — Kill Chain Nodes
| ID | Finding | Evidence |
|----|---------|----------|
| P0-001 | **No MFA enforced for remote access or M365** | Entra ID sign-in logs show 34% of sign-ins in past 30 days without MFA; Conditional Access policies exist but are in Report-Only mode, never activated |
| P0-002 | **Active contractor accounts: 23 confirmed stale** | Elysium identifies 23 accounts with last login > 90 days owned by contractors whose engagements are confirmed ended in HR system; 6 have been inactive for > 6 months |
| P0-003 | **KRBTGT password never rotated** | Password age: 847 days, unchanged since domain creation. A forged Golden Ticket stays valid across user credential resets until KRBTGT is rotated twice. |
| P0-004 | **Azure AD Connect sync account has excessive privilege** | The sync service account has DCSync rights on the on-premises domain. Compromise of Entra ID admin → on-prem domain compromise via this account. |
| P0-005 | **VPN endpoint: no MFA, outdated firmware** | Cisco ASA, firmware 18 months out of date; no MFA for VPN authentication; used by all contractors and remote employees |
| P0-006 | **No tested backup restore** | Backups run nightly (confirmed); no restore has ever been tested; ERP backup destination is on the same network segment as the ERP server |
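
The KRBTGT age in P0-003 can be re-derived from the account's `pwdLastSet` attribute, which Active Directory stores as a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC). A minimal Python sketch, assuming the raw integer has already been exported from AD; the function names and the 180-day threshold are illustrative assumptions, not part of any tool named in this document:

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a raw pwdLastSet value to an aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

def password_age_days(pwd_last_set: int, now: datetime) -> int:
    """Whole days elapsed since the password was last set."""
    return (now - filetime_to_datetime(pwd_last_set)).days

def needs_rotation(pwd_last_set: int, now: datetime, max_age_days: int = 180) -> bool:
    """True when the password is older than the (assumed) rotation threshold."""
    return password_age_days(pwd_last_set, now) > max_age_days
```

Recording the computed age next to the raw attribute value in the evidence column makes the finding reproducible by the client's own team.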
#### P1 — Material Risk
| ID | Finding | Evidence |
|----|---------|----------|
| P1-001 | **Legacy authentication not blocked** | Sign-in logs: 847 legacy auth attempts in past 30 days from 34 unique accounts; these bypass MFA regardless of CA policy |
| P1-002 | **Domain Admins using workstations for email and browsing** | BloodHound: 4 of 5 Domain Admin accounts have interactive logon events from standard workstations; no PAW architecture |
| P1-003 | **Service accounts: 31 with non-expiring passwords, 12 with unknown owners** | AD audit; 7 service accounts have Domain Admin-equivalent rights with no documented purpose |
| P1-004 | **Intune compliance not enforced in Conditional Access** | Compliant device requirement is in CA policy, but the "AllUsers_ExceptionGroup" exclusion group contains 489 of 500 users, so the requirement applies to almost nobody |
| P1-005 | **Third-party SaaS access not reviewed** | 12 known SaaS tools; Entra ID app registrations show 47 enterprise applications with consent grants; 11 have "Mail.ReadWrite" or equivalent scopes from unidentified sources |
| P1-006 | **No MFA on GitHub** | GitHub org admin accounts without MFA enforced at org level; mix of personal and managed accounts; no SSO integration with Entra |
| P1-007 | **SAP ERP on-prem: default admin credentials not changed on secondary instance** | Confirmed during document review of previous pentest report |
| P1-008 | **No logging beyond M365 default 90-day retention** | No SIEM; no secondary log retention; M365 audit log at 90-day E3 default; ERP and file server logs local only, 30-day retention |
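
The P1-001 figures (attempt count and unique accounts) can be reproduced from an exported sign-in log. A minimal Python sketch, assuming each record is a dict carrying the `clientAppUsed` and `userPrincipalName` fields of an Entra ID sign-in export; the set of legacy client labels below is an assumption and should be checked against the values that actually appear in the tenant:

```python
from collections import Counter

# Assumed labels for legacy protocols; verify against the tenant's own sign-in data.
LEGACY_CLIENT_APPS = {
    "Exchange ActiveSync", "IMAP4", "POP3", "Authenticated SMTP", "Other clients",
}

def summarise_legacy_auth(sign_ins):
    """Count legacy-auth sign-ins and the distinct accounts behind them."""
    legacy = [s for s in sign_ins if s.get("clientAppUsed") in LEGACY_CLIENT_APPS]
    # Lower-case the UPN so the same account is not counted twice.
    per_account = Counter(s["userPrincipalName"].lower() for s in legacy)
    return {
        "attempts": len(legacy),
        "unique_accounts": len(per_account),
        "top_accounts": per_account.most_common(5),
    }
```

The `top_accounts` list is a reasonable starting point for the "test first with sign-in log review" step, since those accounts are the ones most likely to break when legacy authentication is blocked.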
#### P2 — Housekeeping Queue
| ID | Finding |
|----|---------|
| P2-001 | NTLM not disabled; NTLMv1 still permitted in GPO |
| P2-002 | Basic authentication still enabled for Exchange (in addition to legacy auth block needed above) |
| P2-003 | 89 stale AD accounts (not contractors — former employees; some date to 2019) |
| P2-004 | DNS records for 14 decommissioned services still exist |
| P2-005 | Firewall ruleset last reviewed 3 years ago; 23 rules with "any/any" destination |
| P2-006 | macOS endpoints: Defender coverage patchy; 31 devices not enrolled in Intune |
| P2-007 | No documented vendor access procedure; contractors provisioned ad hoc |
| P2-008 | Windows Server 2016 file server: extended support ends January 2027 |
| P2-009 | Jira/Confluence: 67 former employee accounts still active |
| P2-010 | SharePoint external sharing enabled globally with no policy; 14 sites have external links active |
### Quick Wins (Closeable Before Day 30)
1. **Activate CA policies** — already in Report-Only; switch to Enabled. MFA enforcement for all sign-ins with zero new tooling. (2 hours)
2. **Disable 23 confirmed stale contractor accounts** — HR-confirmed departures; disable immediately. (1 hour; HR sign-off already obtained)
3. **Remove AllUsers_ExceptionGroup from CA compliance policy** — 489 users are excepted from device compliance for no documented reason. Remove the exception. (30 minutes)
4. **Block legacy authentication** — CA policy for legacy auth block already exists in the tenant (Microsoft provides a template); activate it. Test first with sign-in log review. (4 hours including testing)
5. **Enforce MFA on GitHub org** — Organisation setting, 2 minutes to enable; will force any admin without MFA to enrol at next login. (5 minutes)
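
The selection behind quick win 2 (and finding P0-002) is a join between the directory and the HR record followed by an idle-time filter. A minimal Python sketch, assuming the two exports have already been merged into one record per account; the `Account` class and its field names are hypothetical, and the 90-day threshold is taken from the finding:

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

@dataclass
class Account:
    name: str
    last_login: Optional[date]   # None means the account never signed in
    engagement_ended: bool       # hypothetical flag joined in from the HR export
    enabled: bool = True

def stale_contractors(accounts: List[Account], today: date,
                      max_idle_days: int = 90) -> List[Account]:
    """Enabled accounts of departed contractors with no login inside the window."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        a for a in accounts
        if a.enabled and a.engagement_ended
        and (a.last_login is None or a.last_login < cutoff)
    ]
```

Note that an account whose engagement has ended but which signed in recently, as in the near-miss, falls outside this filter. It is arguably the more urgent case and needs its own check.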
---
## Module Recommendation and Rationale
### Recommended Sequence
```
Brownhat Diagnostic + Quick Wins (Weeks 1-4)
Module 2: M365 Identity Security (Weeks 4-10) ← Primary kill chain
Module 6: On-Premise AD Hardening (Weeks 8-14) ← Runs in parallel from week 8
Module 1: Endpoint Management (Weeks 14-18) ← Hardens existing Intune
Module 7: Recovery & Resilience (Weeks 16-20) ← Runs in parallel from week 16
```
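
The parallel windows in this sequence can be checked mechanically before they go into a proposal. A minimal Python sketch, assuming each entry is an inclusive-start, exclusive-end week range copied from the plan above; the `overlaps` helper is illustrative only:

```python
# Week ranges as listed in the recommended sequence.
schedule = {
    "Brownhat Diagnostic + Quick Wins": (1, 4),
    "Module 2": (4, 10),
    "Module 6": (8, 14),
    "Module 1": (14, 18),
    "Module 7": (16, 20),
}

def overlaps(plan):
    """Return {(a, b): (start, end)} for every pair of phases that run in parallel."""
    names = list(plan)
    result = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            start = max(plan[a][0], plan[b][0])
            end = min(plan[a][1], plan[b][1])
            if start < end:  # a shared boundary week counts as a handover, not overlap
                result[(a, b)] = (start, end)
    return result

parallel = overlaps(schedule)
last_week = max(end for _, end in schedule.values())
```

Under this reading, only Module 2/Module 6 (weeks 8 to 10) and Module 1/Module 7 (weeks 16 to 18) run concurrently, and the plan ends in week 20, inside the 180-day programme.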
### Rationale
**Why Module 2 first**: The kill chain runs through identity. P0-001 (no MFA enforced), P0-002 (stale contractor accounts), and P1-001 (legacy auth) are all Module 2 work. These are also the fastest path to demonstrable NIS2 evidence: Article 21(2)(i) and 21(2)(j) name access control and MFA explicitly.
**Why Module 6 second, partially parallel**: P0-003 (KRBTGT rotation), P0-004 (AD Connect privilege), and P1-002 (Domain Admins on standard workstations) require AD access and change windows. This work can start in week 8 as Module 2 is closing; by then the identity team is already engaged and the change management process is established.
**Why Module 1 third, not first**: Intune is already deployed and roughly functional. It is not the kill chain. Hardening Intune (compliance policies, CA integration, full macOS enrollment) is important but secondary to closing the identity gaps. It belongs in Week 14 when identity work is complete.
**Why Module 7 matters here**: The ERP backup (P0-006) is a kill chain node. Recovery and Resilience validates backup integrity and produces the restore test evidence that NIS2 business continuity requirements directly demand. Starting Module 7 in parallel with Module 1 from Week 16 gets this done within 180 days.
**Not recommended in this engagement**:
- Module 5 (AI Sovereignty Bridge): not in the kill chain; deferred to Phase 4
- Module 10 (Red Team): requires a hardened foundation; schedule at 12 months post-engagement
- Module 12 (Blue/Purple Team): requires detection infrastructure not yet deployed; follow-on engagement
- Module 8 (OT): not applicable — no OT environment
---
## Day 30 / Day 90 / Day 180: This Specific Client
### Day 30 Deliverables
| # | Deliverable | Nexus-specific detail |
|---|-------------|----------------------|
| 1 | Brownhat Diagnostic report | Kill chain documented (identity → AD → ERP); 5 quick wins; module roadmap |
| 2 | ASTRAL deployed | Intune + Entra ID baseline committed; Azure DevOps project `ASTRAL-Nexus` created; drift detection live |
| 3 | PULSAR deployed | M365 audit events ingesting; Ondřej confirmed as reviewer; Teams tab pinned in IT channel |
| 4 | T0 accounts hardened | 3 Global Admins: MFA enforced, dedicated admin accounts separated from daily-use accounts |
| 5 | Attack surface report | VPN endpoint flagged (P0-005); external-facing services enumerated |
| 6 | Quick wins closed | CA policies activated; 23 contractor accounts disabled; legacy auth blocked; GitHub MFA enforced; Intune compliance exception removed |
| 7 | Findings backlog opened | All diagnostic findings entered in ADO Work Items; Ondřej named as owner for P0/P1; CTO briefed on P0 count (6) and quick wins status |
> **NIS2 value at Day 30**: The Brownhat Diagnostic report and the quick wins closure log constitute direct evidence for NIS2 Article 21 (access control, MFA, asset management). PULSAR starts accumulating the audit log retention the questionnaire will ask about.
---
### Day 90 Deliverables
| # | Deliverable | Nexus-specific detail |
|---|-------------|----------------------|
| 8 | MFA for all users enforced | CA policy covering all 500 users; verified via sign-in logs; helpdesk prepared for exceptions (expected: ~15 users requiring assisted enrolment) |
| 9 | Legacy auth blocked | Verified: zero legacy auth sign-ins in past 7 days in PULSAR |
| 10 | CA baseline deployed | Device compliance required; location-based policies for Warsaw office (different risk profile); sign-in risk policy active |
| 11 | P0 vulnerabilities closed | P0-002 (contractors) ✓ Day 30; P0-003 (KRBTGT) rotated with two-rotation process; P0-004 (AD Connect account) de-privileged; P0-005 (VPN MFA) enforced |
| 12 | AD attack path reduction | BloodHound before/after: paths to Domain Admin reduced from 847 to <50; service accounts with Domain Admin rights reduced from 7 to 0 |
| 13 | Vendor access hardened | Contractor provisioning procedure documented; offboarding checklist created and linked to HR process; Ondřej named as monthly reviewer |
| 14 | T0 backup integrity | ERP backup tested and restored to isolated environment; restore time documented (target: <4 hours); backup destination moved off same network segment |
| 15 | ASTRAL: first restore drill | Intentional test change made and restored via pipeline; process documented |
| 16 | PULSAR: top 5 alert rules | CA policy modification; new Global Admin assignment; bulk mailbox export; new high-privilege app consent; VPN authentication failure spike |
> **NIS2 value at Day 90**: MFA enforcement (Article 21(2)(j)), access control and account management (Article 21(2)(i)), audit log retention accumulating since Day 30 (supporting incident handling, Article 21(2)(b)), backup integrity evidence (Article 21(2)(c), business continuity). Sufficient to respond to the NIS2 questionnaire with evidence, not assertions.
---
### Day 180 Deliverables
| # | Deliverable | Nexus-specific detail |
|---|-------------|----------------------|
| 17 | Alert runbooks | 5 PULSAR alert runbooks signed off by Ondřej; escalation path to CTO documented |
| 18 | Custom detection rules | Contractor account creation outside HR-approved window; SAP admin login outside business hours; bulk SharePoint download |
| 19 | Client independence | Ondřej completes live walkthrough: reviews ASTRAL PR, investigates a PULSAR event, resets a compromised Elysium-flagged account |
| 20 | Housekeeping: 3 cycles | Cycles 1-3 completed; 67 Jira/Confluence accounts resolved; 89 stale AD accounts processed (disabled with justification per account); DNS cleanup in progress |
| 21 | Module completion packages | Module 2, Module 6, Module 1 completion packages delivered to `nexus-security` ADO repository |
| 22 | Risk register closure | Before/after comparison: P0 count 6 → 0; P1 count 8 → 2 (P1-007 SAP default credentials and P1-005 app consent review in housekeeping queue) |
| 23 | Retained capability scope | Agreed quarterly scope: monthly ASTRAL drift review, quarterly BloodHound + Elysium run, PULSAR health check, housekeeping queue advancement |
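
Deliverable 18's "SAP admin login outside business hours" rule reduces to a time-window predicate. A minimal Python sketch, assuming events arrive as dicts with hypothetical `user`, `is_admin`, and `timestamp` keys (local time) and assuming business hours of 08:00 to 18:00, Monday to Friday. PULSAR's actual rule format is not shown in this document, so this illustrates the logic only:

```python
from datetime import datetime, time

BUSINESS_START = time(8, 0)   # assumed opening time
BUSINESS_END = time(18, 0)    # assumed closing time

def is_outside_business_hours(ts: datetime) -> bool:
    """True on weekends, or on weekdays before opening / at or after closing."""
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (BUSINESS_START <= ts.time() < BUSINESS_END)

def flag_sap_admin_logins(events):
    """Return admin logins that fall outside the assumed business hours."""
    return [
        e for e in events
        if e["is_admin"] and is_outside_business_hours(e["timestamp"])
    ]
```

The business-hours window is worth agreeing with the ERP team before the rule goes live; month-end close or scheduled batch jobs would otherwise generate routine false positives.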
---
## Findings Backlog — Initial Population
*Pre-populated from the Brownhat Diagnostic. Consultants: adapt IDs and details to your actual findings.*
**ADO Work Items project**: `ASTRAL-Nexus` (same project as ASTRAL deployment)
**Owner**: Ondřej Blaha
**Cadence**: Monthly housekeeping review, first Thursday of each month
### P0 — Kill Chain (all closed by Day 90)
| ID | Finding | Source | Owner | Status | Target |
|----|---------|--------|-------|--------|--------|
| B-001 | No MFA enforced: 34% of sign-ins without MFA | Brownhat | Ondřej | **Closed** Day 30 | Day 30 |
| B-002 | 23 stale contractor accounts with valid credentials | Elysium | Ondřej | **Closed** Day 30 | Day 30 |
| B-003 | KRBTGT password 847 days old | BloodHound | Ondřej | **Closed** Day 75 | Day 60 |
| B-004 | AD Connect sync account has DCSync rights | BloodHound | Ondřej | **Closed** Day 70 | Day 60 |
| B-005 | VPN: no MFA, firmware 18 months outdated | Brownhat | Ondřej | **Closed** Day 80 | Day 90 |
| B-006 | No tested ERP backup restore | Brownhat | Ondřej | **Closed** Day 85 | Day 90 |
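
The gap between the Status and Target columns is worth tracking explicitly, since late P0 closures are what the CTO will ask about. A minimal Python sketch using the six P0 rows above; the `BacklogItem` class and `slippage` helper are illustrative and not part of the ADO Work Items schema:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class BacklogItem:
    item_id: str
    target_day: int
    closed_day: Optional[int] = None  # None while the item is still open

def slippage(items: List[BacklogItem]) -> Dict[str, int]:
    """Days late for every item that closed after its target day."""
    return {
        i.item_id: i.closed_day - i.target_day
        for i in items
        if i.closed_day is not None and i.closed_day > i.target_day
    }

# Target and closure days copied from the P0 table.
p0_items = [
    BacklogItem("B-001", 30, 30),
    BacklogItem("B-002", 30, 30),
    BacklogItem("B-003", 60, 75),
    BacklogItem("B-004", 60, 70),
    BacklogItem("B-005", 90, 80),
    BacklogItem("B-006", 90, 85),
]
late = slippage(p0_items)
```

Here only B-003 and B-004 slipped (by 15 and 10 days). Both are AD changes that need change windows, which matches the Module 6 scheduling in this engagement.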
### P1 — Material Risk
| ID | Finding | Source | Owner | Status | Target |
|----|---------|--------|-------|--------|--------|
| B-010 | Legacy auth not blocked: 847 sign-ins in 30 days | PULSAR | Ondřej | **Closed** Day 30 | Day 30 |
| B-011 | Domain Admins using standard workstations | BloodHound | Ondřej | **Closed** Day 65 | Day 60 |
| B-012 | 7 service accounts with Domain Admin rights, no documented purpose | AD audit | Ondřej | **Closed** Day 72 | Day 60 |
| B-013 | Intune compliance exception covers 489/500 users | ASTRAL | Ondřej | **Closed** Day 30 | Day 30 |
| B-014 | 47 Entra enterprise applications with consent grants; 11 with Mail.ReadWrite or equivalent scope from unidentified sources | Entra audit | Ondřej | In Progress | Day 120 |
| B-015 | GitHub org: no MFA enforcement, personal/managed account mix | Brownhat | Ondřej | **Closed** Day 30 | Day 30 |
| B-016 | SAP secondary instance: default admin credentials not changed | Pentest report | IT Lead (SAP) | Open | Day 90 |
| B-017 | No audit log retention beyond 90 days | Brownhat | Ondřej | **Closed** Day 1 (PULSAR) | Day 30 |
### P2 — Housekeeping Queue
| ID | Finding | Source | Owner | Status | Target |
|----|---------|--------|-------|--------|--------|
| B-100 | NTLM not disabled; NTLMv1 permitted | AD audit | Ondřej | Open | Q3 |
| B-101 | 89 stale AD accounts from former employees | Elysium | Ondřej | In Progress (Cycle 2) | Q3 |
| B-102 | 14 DNS records for decommissioned services | AD audit | Ondřej | Open | Q3 |
| B-103 | 23 firewall rules with any/any destination | Firewall review | Network | Open | Q4 |
| B-104 | 31 macOS devices not enrolled in Intune | ASTRAL/Intune | Ondřej | In Progress (Module 1) | Day 180 |
| B-105 | No documented vendor access procedure | Brownhat | Ondřej | **Closed** Day 85 | Day 90 |
| B-106 | Windows Server 2016 file server: extended support ends Jan 2027 | Brownhat | CTO | Open | Oct 2026 |
| B-107 | 67 former employee accounts in Jira/Confluence | Brownhat | Ondřej | In Progress (Cycle 1) | Q3 |
| B-108 | SharePoint external sharing: 14 sites with active external links | ASTRAL | Ondřej | Open | Q3 |
| B-109 | Basic auth still enabled for Exchange | Brownhat | Ondřej | Open | Q2 |
---
## NIS2 Article 21 Compliance Map
*Evidence produced by this engagement against the Article 21 measures. Use this table in the NIS2 questionnaire response.*
| Article 21 Measure | Requirement | Evidence from this engagement |
|--------------------|-------------|-------------------------------|
| **21(2)(a)** Policies on risk analysis and information security | Documented policies | Brownhat Diagnostic report; module completion packages; risk register |
| **21(2)(b)** Incident handling | Detection and response capability | PULSAR alert rules + runbooks; incident escalation procedure |
| **21(2)(c)** Business continuity, backup, DR | Tested backup and recovery | Module 7: ERP backup restore test report; Recovery Time documented |
| **21(2)(d)** Supply chain security | Vendor/supplier risk management | Contractor access procedure; vendor access inventory; offboarding checklist |
| **21(2)(e)** Security in acquisition, development | Secure development and procurement | (Partial — addressed in Phase 4; not covered in 180-day programme) |
| **21(2)(f)** Policies to assess effectiveness | Metrics and review cadence | ASTRAL drift history; PULSAR event summaries; quarterly BloodHound/Elysium; housekeeping cycle reports |
| **21(2)(g)** Cyber hygiene and training | Basic hygiene and awareness | MFA enforcement; CA policies; device compliance; housekeeping stream |
| **21(2)(h)** Cryptography and encryption | Encryption standards | (Addressed via CA device compliance and baseline — documented) |
| **21(2)(i)** HR security, access control, asset management | Identity governance, privileged access | Module 2: MFA, CA, privileged account management; Module 6: AD hardening; stale account process |
| **21(2)(j)** Authentication, MFA | MFA for all users | CA policy enforced for all 500 users; verified via sign-in log (Day 90 deliverable #8) |
**For the supervisory authority questionnaire**: The strongest evidence package is: (1) the Brownhat Diagnostic report showing risk analysis was conducted, (2) the ASTRAL baseline showing configuration management is operational, (3) the PULSAR deployment showing logging and monitoring is in place, and (4) the Day 90 MFA enforcement verification via sign-in logs. These four items directly answer the most common questions in NIS2 supervisory questionnaires.
---
## Investment Estimate
*Effort ranges using the module investment levels from [Modular Engagements](../core/modular-engagements.md). Day rates applied per engagement proposal.*
| Phase | Activity | Estimated Effort |
|-------|----------|-----------------|
| Brownhat Diagnostic | 2-day workshop + report | 16-20 consultant hours |
| Quick wins implementation | CA policies, account disables, GitHub MFA | 8-12 hours (same week as diagnostic) |
| Module 2: M365 Identity Security | MFA rollout (500 users, 10 admins, contractors), CA baseline, legacy auth block, app consent review, ASTRAL/PULSAR deployment | **Low to medium** (20-30 consultant days) |
| Module 6: On-Premise AD Hardening | KRBTGT rotation, service account cleanup, PAW for admins, BloodHound remediation, AD Connect de-privilege | **Low to medium** (15-25 consultant days) |
| Module 1: Endpoint Management | Intune compliance baseline, macOS enrollment, CA integration, ASTRAL hardening | **Low** (8-15 consultant days) |
| Module 7: Recovery & Resilience | Backup integrity testing, ERP restore drill, DR runbooks | **Low** (8-12 consultant days) |
| **Total 180-day programme** | | **~55-80 consultant days** |
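
The programme total can be cross-checked by summing the per-phase ranges. A minimal Python sketch, taking the ranges as 16-20 and 8-12 hours for the diagnostic and quick wins and 20-30, 15-25, 8-15 and 8-12 days for the four modules, and assuming an 8-hour consultant day (the conversion factor is an assumption, not stated in this document):

```python
HOURS_PER_DAY = 8  # assumption: one consultant day is 8 hours

# (low, high) ranges per phase, in days and in hours respectively.
effort_days = {
    "Module 2": (20, 30),
    "Module 6": (15, 25),
    "Module 1": (8, 15),
    "Module 7": (8, 12),
}
effort_hours = {
    "Brownhat Diagnostic": (16, 20),
    "Quick wins": (8, 12),
}

def programme_total(days, hours, hours_per_day=HOURS_PER_DAY):
    """Sum the low and high bounds, converting hour-based phases to days."""
    low = sum(lo for lo, _ in days.values()) + sum(lo for lo, _ in hours.values()) / hours_per_day
    high = sum(hi for _, hi in days.values()) + sum(hi for _, hi in hours.values()) / hours_per_day
    return low, high

low, high = programme_total(effort_days, effort_hours)
print(f"{low:g}-{high:g} consultant days")
```

Under these assumptions the sum comes to 54-86 consultant days. The quoted total is therefore a rounded band, and a programme in which every module runs to its upper bound would exceed the quoted 80 days.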
**Infrastructure costs** (passed through at cost):
- PULSAR hosting: €10-20/month (VPS or Azure Container Apps), or hosted on the client's existing infrastructure
- ASTRAL: no additional cost (Azure DevOps pipelines within E3/Microsoft Partner allocation)
**Retained capability** (post-180 days, quarterly):
- Monthly ASTRAL drift review and PULSAR health check
- Quarterly BloodHound + Elysium run + housekeeping cycle
- Estimated: 3-5 consultant days per quarter
---
## Consultant Notes
**The CISO handover opportunity**: The CTO mentioned they want something to hand over when they hire a CISO. Structure the Day 180 deliverables explicitly as a CISO onboarding package: the backlog, the ASTRAL history, the PULSAR event summary, the module completion packages, and the retained scope. A new CISO who inherits a cleaned AD, enforced MFA, running detection, and a maintained backlog is in a position to build — not to firefight.
**Managing the NIS2 timeline pressure**: The questionnaire is due in 90 days. The Day 90 deliverables are specifically designed to produce the four evidence items (diagnostic, ASTRAL, PULSAR, MFA enforcement) needed to answer the questionnaire. Do not let the regulatory deadline distort the sequence — the diagnostic first, then module work. A questionnaire answered with ASTRAL drift logs and CA sign-in evidence is stronger than one answered with a Word document and good intentions.
**The two-domain AD**: The acquisition-created second domain adds complexity to Module 6. Scope it explicitly in the kickoff: which domain gets the KRBTGT rotation first? Are there forest-level trusts? BloodHound collection needs to cover both. Add 5-7 days to the Module 6 estimate if the trust relationship is poorly documented.
**SAP credentials (P1-007, backlog B-016)**: This finding is outside the standard M365/AD scope. It requires SAP admin access and coordination with the ERP team (who may not report to Ondřej). Flag it as an explicit dependency at kickoff; it will slip past Day 90 without an owner from the ERP side.
**Contractors**: 80 contractors at any given time means the offboarding process is a permanent operational concern, not a one-time fix. The contractor provisioning and offboarding procedure (B-105) must name an owner in HR, not just IT. If HR does not send a termination notification, IT cannot offboard. This is a process dependency that the engagement alone cannot fix — it requires a management conversation.
---
*This sample engagement is based on composite real-world findings from mid-market AD+M365 environments. All company names and individual details are fictional.*
*Related: [Brownhat Diagnostic](../assessment-templates/nist-csf-baseline.md) · [Module Menu](../core/modular-engagements.md) · [Findings Backlog](../assessment-templates/findings-backlog.md) · [NIS2 Mapping](../reference/nist-csf-mapping.md) · [Risk Register Example](../assessment-templates/risk-register-example.md)*