# Assessment Team Guide: Technical Execution of the Brownhat Diagnostic
> *"The workshop tells you what the client thinks is happening. The tools tell you what is actually happening. Run the tools before the second session — the findings change the conversation."*
This guide covers the technical execution of the Brownhat Diagnostic. It is the companion to the [NIST CSF 2.0 Baseline Assessment](nist-csf-baseline.md), which covers the workshop methodology. Read both before your first diagnostic.
**Division of labour**: The workshop facilitator runs the NIST CSF sessions and manages the client conversation. The technical assessor runs the tools, collects evidence, and builds the findings. One person can do both in smaller engagements, but if you have two people, split the roles: the findings from Day 2 tool runs should inform the workshop conversation, not interrupt it.
---
## Before You Arrive: Pre-Engagement Preparation
### Access to Request (Before Kickoff)
Send this checklist to the client IT lead at least 5 business days before Day 1. Missing access on Day 1 is the most common cause of diagnostic delay.
**M365 / Entra ID**:
- [ ] Global Reader role in Entra ID (read-only; sufficient for most checks)
- [ ] Entra ID audit log access (to verify logging is enabled before PULSAR deploys)
- [ ] Exchange admin centre read access
- [ ] SharePoint admin centre read access
- [ ] Intune read access (Device Management / Endpoint Manager)
- [ ] Microsoft Secure Score access
- [ ] Conditional Access policies read access
**Active Directory**:
- [ ] Domain User account on the domain(s) — BloodHound collection only needs this
- [ ] Read access to ADUC (Active Directory Users and Computers)
- [ ] Ability to run PowerShell on a domain-joined machine (for BloodHound collector and Elysium — see notes below on Elysium privilege requirements)
**Network / Infrastructure**:
- [ ] Access to firewall management interface (read-only; to review ruleset)
- [ ] VPN access or on-site working arrangement for Day 2 tool runs
- [ ] Previous pentest or audit reports (if any exist)
**Documents to request**:
- [ ] Network diagram (any version, however outdated — better than none)
- [ ] Asset inventory or CMDB export (even a spreadsheet)
- [ ] Previous security audit or pentest report
- [ ] List of third-party SaaS tools (from procurement or IT)
- [ ] Organisational chart for IT/security team
> **What to do if access is not ready**: Do not delay the workshop waiting for full access. Start Session 1 with what you have. Deploy ASTRAL and PULSAR as soon as any M365 access is confirmed, because they produce value from minute one. Tool runs that need AD access can happen on Day 2-3 once an account is provisioned.
### Tool Preparation
Have these ready to deploy before Day 1. Do not learn a tool at a client's expense.
| Tool | Preparation | Time to deploy |
|------|-------------|---------------|
| ASTRAL | ADO project created; pipeline YAMLs ready; `bootstrap-tenant.ps1` reviewed | 2-4 hours on-site |
| PULSAR | Docker Compose environment ready; `bootstrap-tenant.ps1` reviewed | 1-2 hours on-site |
| BloodHound CE | Installed on assessment laptop; SharpHound collector downloaded | 15 minutes |
| Elysium | Cloned and tested in lab; KHDB download initiated (large file; download before arriving) | 30 min setup; KHDB download 30-60 min |
| CAExporter | Downloaded and tested | 10 minutes |
| Purple Knight | Downloaded from Semperis (free, requires registration) | 15 minutes |
| E8-CAT | Downloaded and tested (for Australian clients or E8-aligned clients) | 10 minutes |
| Nmap / Shodan | Nmap installed; Shodan account active (free tier sufficient) | Ready |
---
## Day 1: Deploy First, Ask Questions Later
The single most important discipline: **deploy ASTRAL and PULSAR before the first workshop session begins.** The baseline they capture is a point-in-time snapshot. If you wait until after the workshops, the baseline may already reflect changes the client made in response to your questions.
### Morning: Deploy Listening Tools
Before Session 1 starts, or during the first 30-minute introductions slot:
**Step 1 — ASTRAL deployment** (~2 hours, can run in background)
```powershell
# On the client's Azure DevOps or your assessment instance
.\deploy\bootstrap-tenant.ps1 -TenantName "<client>.onmicrosoft.com"
```
Follow the [ASTRAL onboarding runbook](https://github.com/cqrenet/astral/blob/main/deploy/onboarding-runbook.md). The initial full backup pipeline run captures the complete M365 configuration baseline. This is your "before" snapshot — everything you find during the assessment is measured against this.
**What ASTRAL captures on first run**:
- All Intune profiles, policies, compliance policies, applications, scripts
- All Conditional Access policies (with full named-object resolution via CAExporter integration)
- All Entra ID app registrations and enterprise applications
- All authentication methods and named locations
- Produces HTML/PDF as-built documentation automatically
**Step 2 — PULSAR deployment** (~1 hour, can run in background)
```bash
cp .env.example .env
# Fill in CLIENT_ID, CLIENT_SECRET, TENANT_ID from bootstrap output
docker compose up --build -d
```
Once running, trigger a manual fetch to confirm audit log ingestion is working:
```
GET http://localhost:8000/api/fetch-audit-logs
```
**What PULSAR captures immediately**: All M365 admin audit events from the Management Activity API (Exchange, SharePoint, Teams, Entra, Intune). Retention starts from this moment — every admin action from here forward is permanently searchable. For clients with no prior log retention, this is instant value.
**Step 3 — Microsoft Secure Score baseline** (10 minutes)
Navigate to `security.microsoft.com → Secure Score`. Screenshot the current score and the top 10 recommended actions. This is a quick reference point for the workshop conversation and gives the client a number they immediately understand.
**Step 4 — External scan** (runs in background during workshop; Nmap service detection is active scanning, so confirm it is in the agreed scope)
```bash
# From your assessment machine
nmap -sV --open -p 80,443,8080,8443,3389,22,21,25,993,995 [client-public-IPs]
# Shodan CLI for organisation-based discovery
shodan search "org:[client-org-name]" --fields ip_str,port,banner,product
```
Also check:
- Certificate transparency logs: `crt.sh/?q=[client-domain]` — reveals subdomains, expired certs, shadow IT domains
- Shodan for the VPN endpoint specifically: firmware version, known CVEs
- `whois` and reverse DNS for all IP ranges the client mentions
---
## During the Workshops: What to Listen For
The [NIST CSF Baseline](nist-csf-baseline.md) has the full question set. Below are the specific signals to listen for that indicate P0/P1 findings. Note these immediately — they feed the technical checklist for Day 2.
| What the client says | What it likely means | Check on Day 2 |
|---------------------|---------------------|----------------|
| "We haven't tested our backups recently" | No restore has ever been done | Recovery drill required; check backup destination |
| "We use shared admin accounts" | Multiple people using one credential | Elysium; AD audit; no MFA possible on shared account |
| "Contractors have the same access as employees" | Likely no offboarding process; stale accounts | Elysium; AD account audit; HR cross-reference |
| "We have MFA but I think some people have exemptions" | CA policies in report-only or with large exclusion groups | CAExporter; Entra ID CA policy review |
| "The acquisition brought in a second AD" | Forest trusts; uncharted attack paths; duplicate admin accounts | BloodHound must cover both domains |
| "We use [legacy on-prem system] with its own accounts" | Shadow identity; service accounts not in scope of central IAM | Manual AD service account audit |
| "IT handles offboarding when HR tells us" | Offboarding depends on HR notification — often delayed | Elysium; compare AD accounts to HR list |
| "I'm not sure who all has admin access" | No privileged access inventory | BloodHound; ADUC privileged group audit |
| "We have a firewall but nobody has reviewed the rules in years" | Accumulated rules; likely any/any entries; retired services still open | Firewall rule export and review |
| "Some of our developers have direct access to production" | Uncontrolled privileged access to production systems | Scope question for Module 6 |
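The "compare AD accounts to HR list" check in the table above reduces to a set difference. A minimal sketch in Python, assuming you have two plain lists of account names (one exported from AD, one supplied by HR). The function name and the sample values are hypothetical and not part of Elysium or any other tool named in this guide.

```python
def accounts_without_hr_record(ad_accounts, hr_accounts):
    """Return AD account names that have no matching HR record."""
    # Normalise case and whitespace so "JSmith " and "jsmith" compare equal.
    hr = {name.strip().lower() for name in hr_accounts}
    ad = {name.strip().lower() for name in ad_accounts}
    return sorted(ad - hr)


# Hypothetical sample data for illustration only.
ad_export = ["jsmith", "ANovak", "contractor01", "svc-backup"]
hr_list = ["JSmith", "anovak"]
print(accounts_without_hr_record(ad_export, hr_list))
```

Service accounts will show up in the result because HR does not know about them; review those by hand rather than treating every hit as a stale user.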
---
## Day 2-3: Technical Tool Runs
Run tools in this order. Earlier tools inform later ones.
### 1. CAExporter — Conditional Access Baseline (30 minutes)
Run first. The CA policy export reveals whether MFA is actually enforced or just configured. This is consistently the most surprising finding in M365 environments.
```powershell
# Requires Entra ID reader access
.\CAExporter.ps1 -TenantId <tenant-id> -OutputPath .\ca-export\
```
**What to look for**:
- Policies in **Report-Only** mode (not enforced — common; clients assume they are protected when they are not)
- Large **exclusion groups** containing most users ("AllUsers_ExceptionGroup" type)
- Policies that claim to block legacy authentication but have exclusions that defeat the purpose
- No policy enforcing device compliance
- Multiple overlapping policies with unclear precedence
**Output**: Excel workbook with one row per policy, conditions and controls expanded, groups and apps named rather than showing GUIDs. This is the CA baseline document.
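The Excel workbook is the deliverable, but the report-only and exclusion checks listed above can also be pre-screened from the raw policy objects. A sketch in Python, assuming the policies have been saved as JSON in the shape returned by Microsoft Graph (`/identity/conditionalAccess/policies`), where `state` is `enabled`, `disabled`, or `enabledForReportingButNotEnforced`. The function name and sample data are hypothetical, and CAExporter itself is not involved.

```python
def triage_ca_policies(policies):
    """Return (policy name, issue) pairs that deserve a manual look."""
    issues = []
    for policy in policies:
        name = policy.get("displayName", "<unnamed>")
        state = policy.get("state")
        if state == "enabledForReportingButNotEnforced":
            issues.append((name, "report-only, not enforced"))
        elif state == "disabled":
            issues.append((name, "disabled"))
        # Count exclusions; their real impact depends on group membership size.
        users = (policy.get("conditions") or {}).get("users") or {}
        excluded = len(users.get("excludeGroups") or []) + len(users.get("excludeUsers") or [])
        if excluded:
            issues.append((name, f"{excluded} excluded users/groups, check membership size"))
    return issues


# Hypothetical sample shaped like a Graph conditionalAccessPolicy list.
sample = [
    {"displayName": "Require MFA", "state": "enabledForReportingButNotEnforced",
     "conditions": {"users": {"excludeGroups": ["<group-guid>"], "excludeUsers": []}}},
    {"displayName": "Block legacy auth", "state": "enabled",
     "conditions": {"users": {"excludeGroups": [], "excludeUsers": []}}},
]
for name, issue in triage_ca_policies(sample):
    print(f"{name}: {issue}")
```

The script only counts exclusions. Whether an exclusion group defeats the policy still has to be judged by looking at who is in it.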
---
### 2. BloodHound — AD Attack Path Analysis (1-2 hours collection + analysis)
```powershell
# Run SharpHound from a domain-joined machine using the assessor domain account
.\SharpHound.exe -c All --zipfilename nexus-bloodhound.zip
```
Copy the zip to your assessment machine and import into BloodHound CE.
**Required queries** (run these first, every engagement):
```cypher
// Replace DOMAIN.LOCAL with the client's domain in every query below
// Shortest paths to Domain Admin from all non-admin users
MATCH p=shortestPath((u:User {admincount:false})-[*1..]->(g:Group {name:"DOMAIN ADMINS@DOMAIN.LOCAL"})) RETURN p
// All Domain Admin members with direct login sessions on workstations
// (HasSession edges run from Computer to User; the "DC" name filter is a heuristic)
MATCH (u:User)-[:MemberOf]->(g:Group {name:"DOMAIN ADMINS@DOMAIN.LOCAL"})
MATCH (c:Computer)-[:HasSession]->(u) WHERE NOT c.name CONTAINS "DC" RETURN u.name, c.name
// Kerberoastable accounts with high privilege
MATCH (u:User {hasspn:true}) WHERE u.admincount=true RETURN u.name, u.serviceprincipalnames
// ASREPRoastable accounts (no Kerberos pre-auth)
MATCH (u:User {dontreqpreauth:true}) RETURN u.name, u.enabled
// Service accounts with paths to Domain Admin
MATCH p=shortestPath((u:User)-[*1..5]->(g:Group {name:"DOMAIN ADMINS@DOMAIN.LOCAL"}))
WHERE u.name CONTAINS "$" OR u.name CONTAINS "SVC" OR u.name CONTAINS "SERVICE"
RETURN p
```
**What to document**:
- Number of paths to Domain Admin from non-admin users (the "847 paths" number from the sample)
- Shortest path length and the specific nodes on it — this is your kill chain
- Domain Admins with sessions on non-DC workstations — P1 finding in almost every environment
- Any service accounts that are Kerberoastable and have high privilege — often P0
- KRBTGT last password set date (check in ADUC or PowerShell)
```powershell
# KRBTGT last password set
Get-ADUser krbtgt -Properties PasswordLastSet | Select PasswordLastSet
```
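To turn that date into a number for the report, here is a small sketch in Python, assuming the `PasswordLastSet` value has been copied out as an ISO 8601 timestamp. The helper name and the dates are hypothetical. This guide does not prescribe a rotation threshold, so none is hard-coded.

```python
from datetime import datetime, timezone


def password_age_days(last_set_iso, now=None):
    """Days elapsed since an ISO 8601 timestamp (naive values are treated as UTC)."""
    last_set = datetime.fromisoformat(last_set_iso)
    if last_set.tzinfo is None:
        last_set = last_set.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - last_set).days


# Hypothetical values: KRBTGT last reset in March 2019, assessed in March 2024.
assessed = datetime(2024, 3, 14, 9, 30, tzinfo=timezone.utc)
print(password_age_days("2019-03-14T09:30:00+00:00", now=assessed))
```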
---
### 3. Elysium — Password Audit (2-4 hours, requires elevated AD access)
> **Privilege requirement**: Elysium requires Domain Admin or equivalent (DSInternals needs to read password hashes). Confirm this access before scheduling. If it cannot be arranged during the diagnostic, schedule it for week 1 of Module 6.
```powershell
# Run from a domain controller or with delegated rights
.\Elysium.ps1 -Domain <domain-fqdn> -OutputPath .\elysium-output\
```
**What Elysium finds**:
- Accounts matching known-breached password hashes (from the KHDB — download before arriving)
- Accounts with blank passwords
- Accounts with passwords matching dictionary patterns
- Duplicate passwords across accounts (shared credential detection)
**Output to document**:
- Total accounts audited
- Accounts matching KHDB (breached) — split by privileged vs non-privileged
- Accounts with common passwords
- Any privileged account with a compromised or weak password → immediate P0
**Privacy handling**: Elysium does not transmit usernames or plaintext passwords. The KHDB comparison is local. The output is a list of SAMAccountNames to reset — not passwords. Communicate this clearly to the client before running.
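The privileged versus non-privileged split in the output list can be produced mechanically once you have two lists. A sketch in Python, assuming the flagged SAMAccountNames have been copied from the Elysium output and the privileged set comes from your BloodHound or ADUC group audit. Both input formats and all names are assumptions for illustration; this does not read Elysium's actual output files.

```python
def split_by_privilege(flagged, privileged):
    """Partition flagged account names into privileged and non-privileged."""
    priv = {name.lower() for name in privileged}
    flagged_priv = sorted(n for n in flagged if n.lower() in priv)
    flagged_std = sorted(n for n in flagged if n.lower() not in priv)
    return {
        "privileged": flagged_priv,
        "non_privileged": flagged_std,
        # A privileged account with a compromised or weak password is an immediate P0.
        "immediate_p0": bool(flagged_priv),
    }


# Hypothetical sample data.
flagged_accounts = ["jsmith", "svc-sql", "reception"]
privileged_accounts = ["Administrator", "SVC-SQL"]
print(split_by_privilege(flagged_accounts, privileged_accounts))
```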
---
### 4. Purple Knight — AD Security Scoring (30 minutes)
Purple Knight (Semperis, free) runs a broad checklist of AD security misconfigurations. Run it from any domain-joined machine.
```powershell
# Purple Knight is distributed as a GUI executable; adjust the name if your download differs
.\PurpleKnight.exe
```
The report scores against ~100 indicators. **Focus on**:
- LDAP signing and channel binding status
- AdminSDHolder unusual members
- Protected Users group membership (or absence of it for admins)
- Reversible encryption enabled accounts
- Unconstrained delegation (computers and users)
- Machine account quota (default 10 — often abused for relay attacks)
- Exchange permissions on AD objects (if Exchange exists on-prem)
Cross-reference Purple Knight findings with BloodHound. Purple Knight finds the indicators; BloodHound shows how they chain together into attack paths.
---
### 5. Entra ID Manual Checks (1 hour)
These cannot be automated — they require visual inspection in the Entra admin centre.
**App registrations and enterprise applications**:
- Navigate to: `Entra ID → App registrations → All applications`
- Filter by: "High privilege permissions" — look for `Mail.ReadWrite`, `Directory.ReadWrite.All`, `User.ReadWrite.All`
- Note any apps with these permissions that are: (a) published by unknown parties, (b) have no documented owner, (c) were consented to by users rather than admins
- This is consistently where the most surprising findings live — OAuth consent abuse is underdetected in every mid-market environment
**Guest accounts**:
- Navigate to: `Entra ID → Users → Filter: User type = Guest`
- How many guests are there? When was their last sign-in? Are any of them former contractors?
**MFA registration status**:
- Navigate to: `Entra ID → Users → Per-user MFA` (legacy view) OR `Identity Protection → Monitoring → Authentication methods → User registration details`
- What % of users have MFA registered? What % have it enforced?
- Are there any break-glass accounts? Are they properly protected and audited?
**Entra ID Connect sync account** (hybrid environments only):
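- Navigate to: `Entra ID → Users` and search for the sync account (typically a user named `Sync_<server>_<id>`; if it is not there, check `App registrations`, since the identity type depends on the Entra Connect version)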
- Check what rights it has in Entra ID
- Cross-reference with on-prem AD: does this account have DCSync rights? (BloodHound query: search for the account name and check its paths)
---
### 6. Intune / Endpoint Check (30 minutes — via ASTRAL output or direct)
ASTRAL's first run will have produced an Intune inventory. Review:
- **Enrollment rate**: What % of devices are enrolled? What platforms?
- **Compliance policy coverage**: Is there a compliance policy? What does it enforce? Is it assigned to all devices?
- **Conditional Access integration**: Is the "Require compliant device" CA policy active — or in report-only?
- **Stale devices**: Devices with last check-in > 90 days are likely personal devices or ghost entries. Note the count.
- **Script inventory**: What PowerShell scripts are deployed via Intune? Any that look unfamiliar?
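The stale-device count can be derived from a device export. A sketch in Python, assuming each device record carries `deviceName` and `lastSyncDateTime` as in the Microsoft Graph `managedDevice` resource. How ASTRAL lays out its inventory is not specified here, so treat the input shape and the sample records as assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_devices(devices, now, max_age_days=90):
    """Return names of devices whose last check-in is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for device in devices:
        # Graph timestamps end in "Z"; older Python versions need an explicit offset.
        last_sync = datetime.fromisoformat(device["lastSyncDateTime"].replace("Z", "+00:00"))
        if last_sync < cutoff:
            stale.append(device["deviceName"])
    return sorted(stale)


# Hypothetical inventory records.
inventory = [
    {"deviceName": "LT-0042", "lastSyncDateTime": "2024-05-30T08:15:00Z"},
    {"deviceName": "LT-0007", "lastSyncDateTime": "2023-11-02T16:40:00Z"},
]
as_of = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(stale_devices(inventory, now=as_of))
```

Passing `now` explicitly keeps the count reproducible when the report is written a few days after the export.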
---
### 7. External Attack Surface (30-60 minutes)
By Day 2, the Nmap and Shodan scans from Day 1 should have results.
**Review**:
- Any RDP (3389) exposed to internet → P0 in almost every context
- Any management interfaces (firewalls, switches, VPN management) accessible from internet
- Any services with outdated banners suggesting old software versions
- Certificate expiry on any internet-facing service
- VPN endpoint firmware version → check against vendor advisory for known CVEs
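If the Day 1 scan was saved in Nmap's grepable format (the `-oG` option, which the command shown earlier does not include, so this assumes you added it), the open ports can be screened with a few lines of Python. The function names and the sample output are illustrative. The parser is deliberately simple: it assumes entries in the `Ports:` field are separated by `, ` and that each has the usual slash-delimited fields.

```python
def parse_grepable(text):
    """Return (ip, port, service, version) for every open port in -oG output."""
    findings = []
    for line in text.splitlines():
        if not line.startswith("Host:") or "Ports:" not in line:
            continue
        ip = line.split()[1]
        # The Ports field ends at the next tab-separated section, if any.
        ports_field = line.split("Ports:", 1)[1].split("\t", 1)[0]
        for entry in ports_field.split(", "):
            parts = entry.strip().split("/")
            # Fields: port/state/protocol/owner/service/rpcinfo/version/
            if len(parts) >= 7 and parts[1] == "open":
                findings.append((ip, int(parts[0]), parts[4], parts[6]))
    return findings


def rdp_exposed(text):
    """Return the hosts with RDP (3389) open, the P0 case in the list above."""
    return sorted({ip for ip, port, _svc, _ver in parse_grepable(text) if port == 3389})


# Hypothetical scan output using documentation-range addresses.
SAMPLE = (
    "Host: 203.0.113.10 ()\tPorts: 443/open/tcp//https//nginx/, "
    "3389/open/tcp//ms-wbt-server///\tIgnored State: closed (8)\n"
    "Host: 203.0.113.11 ()\tPorts: 22/open/tcp//ssh//OpenSSH 8.2p1/\n"
)
print(rdp_exposed(SAMPLE))
```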
**Additional check — subdomain enumeration**:
```bash
# Deduplicate the hostnames saved from crt.sh (one per line)
grep -F "<client-domain>" crt-sh-results.txt | sort -u
# For each subdomain found: what is it? Is it documented? Is it still active?
```
Undocumented subdomains pointing to forgotten services are a regular P1 finding.
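crt.sh can also return JSON (append `&output=json` to the query), with certificate names in a `name_value` field that may hold several newline-separated names. A sketch in Python for reducing that to a unique subdomain list, assuming the response has already been downloaded and parsed. The function name and sample entries are hypothetical.

```python
def subdomains_from_crtsh(entries, domain):
    """Return the unique hostnames under `domain` found in crt.sh JSON entries."""
    domain = domain.lower().rstrip(".")
    found = set()
    for entry in entries:
        for name in entry.get("name_value", "").splitlines():
            # Drop wildcard prefixes such as "*.example.com".
            name = name.strip().lower().lstrip("*.")
            # Match the domain itself or a true subdomain, not a lookalike suffix.
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


# Hypothetical entries; the last one is a lookalike that must not match.
entries = [
    {"name_value": "www.example.com\n*.example.com"},
    {"name_value": "VPN.example.com"},
    {"name_value": "example.com.attacker.test"},
]
print(subdomains_from_crtsh(entries, "example.com"))
```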
---
### 8. Firewall Rule Review (30-60 minutes)
Request an export of the firewall ruleset. Most firewall platforms support CSV or XML export.
**What to look for**:
- Rules with `source = ANY` and `destination = ANY` (any/any) → almost always P2 but sometimes P1 if it covers a sensitive segment
- Rules allowing direct internet access from server VLANs → P1
- Rules created for a specific project that are still active years later → P2
- Rules referencing IP addresses that no longer correspond to live systems
- No rule for blocking outbound traffic (egress filtering absent) → P1 for environments with sensitive data
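The any/any check is easy to script against a CSV export. A sketch in Python, assuming the export has columns named `name`, `source`, `destination`, and `action`. Real column names and the spelling of "any" differ by vendor, so the column names, the `ANY_VALUES` set, and the sample rules are all assumptions to adapt.

```python
import csv
import io

# Spellings of "any" and "allow" vary by vendor; extend these sets as needed.
ANY_VALUES = {"any", "all", "*", "0.0.0.0/0"}
ALLOW_ACTIONS = {"allow", "permit", "accept"}


def find_any_any(csv_text):
    """Return the names of allow rules whose source and destination are both 'any'."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row["source"].strip().lower()
        destination = row["destination"].strip().lower()
        action = row["action"].strip().lower()
        if action in ALLOW_ACTIONS and source in ANY_VALUES and destination in ANY_VALUES:
            flagged.append(row["name"])
    return flagged


# Hypothetical export.
sample_export = """name,source,destination,service,action
temp-migration-2019,ANY,ANY,ANY,allow
web-inbound,ANY,10.0.20.15,tcp/443,allow
default-deny,ANY,ANY,ANY,deny
"""
print(find_any_any(sample_export))
```

Deny rules are skipped on purpose: an any/any deny at the bottom of the ruleset is the expected default, not a finding.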
---
### 9. Backup and Recovery Spot Check (30 minutes)
Ask the IT lead to show you, live:
- Where backups are stored (destination)
- When the last backup ran and whether it completed successfully
- Whether the backup destination is on the same network segment as the system being backed up
- Whether anyone has ever triggered a test restore and what the result was
> **The standard answer**: "Backups run every night and we get a green tick." The right follow-up: "Show me the most recent successful restore test." In most environments, one has never been performed.
Document: backup target, last run, completion status, last restore test (date or "never").
---
## Synthesising Findings: From Data to Kill Chain
After tool runs are complete, before writing the report, do this step explicitly. Sit with your notes and answer one question:
**"What is the shortest sequence of steps an adversary with no prior access could take to cause the organisation to fail to operate?"**
Build the kill chain step by step:
1. Start from the outside (what can be accessed without credentials?)
2. What is the first credential gain? (phishing, password spray against legacy auth, VPN without MFA)
3. What does that credential give access to? (M365 if MFA is not enforced; VPN if no MFA there)
4. What can you do with M365 access? (read all email, access SharePoint, escalate via app permissions)
5. What is the path from M365 access to domain admin? (Entra ID admin → AD Connect sync account → DCSync)
6. What does domain admin give you? (everything on-prem, including ERP, backup servers)
7. What is the impact? (data exfiltration, ransomware, operational disruption)
Write this as a chain, not a list. The [sample engagement kill chain](../playbooks/sample-engagement-mid-market.md#kill-chain-assessment) shows the format.
---
## Finding Triage and Priority Assignment
For every finding, apply the kill chain test:
| Question | Priority |
|----------|----------|
| Is this a node on the kill chain? | **P0** — fix before anything else |
| If exploited, does material harm result even if not on the kill chain? | **P1** — fix this engagement |
| Real finding, real risk, but not on the kill chain and not immediately material? | **P2** — housekeeping queue |
| Best practice recommendation with no exploitable risk? | **Observation** — note in report, do not count as a finding |
**Common priority inflation mistakes**:
- Marking "no security awareness training programme" as P0 — it is P2 at most
- Marking every missing patch as P0 — only patches for internet-facing or kill-chain systems
- Marking "weak password policy" as P0 when Elysium shows no actual weak passwords — the policy is P2; actual weak credentials on privileged accounts are P0
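The kill chain test is a short decision ladder, and writing it out makes the ordering explicit. A sketch in Python; the function and its three boolean inputs are a hypothetical encoding of the table above, and the judgement behind each input still belongs to the assessor.

```python
def assign_priority(on_kill_chain, material_harm, exploitable_risk):
    """Apply the triage table top to bottom; the first matching row wins."""
    if on_kill_chain:
        return "P0"
    if material_harm:
        return "P1"
    if exploitable_risk:
        return "P2"
    return "Observation"


# A weak password policy where Elysium found no actual weak passwords:
print(assign_priority(on_kill_chain=False, material_harm=False, exploitable_risk=True))
# A privileged account with a breached password that sits on the kill chain:
print(assign_priority(on_kill_chain=True, material_harm=True, exploitable_risk=True))
```

Because the checks run in order, a finding that is on the kill chain is P0 regardless of how the other two questions are answered.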
---
## Quick Wins Identification
A quick win must pass three tests:
1. **Closeable in hours or days, not weeks** — requires no procurement, no change window longer than one day, no significant testing
2. **Uses only existing tools and permissions** — no new purchase, no new deployment
3. **Meaningfully reduces risk** — not cosmetic
For M365/AD environments, the standard quick wins checklist:
- [ ] Activate CA policies already in Report-Only mode
- [ ] Remove large exception groups from CA compliance policies
- [ ] Block legacy authentication (CA policy template exists in every tenant)
- [ ] Enforce MFA at organisation level in GitHub / other SaaS tools
- [ ] Disable accounts confirmed as departed contractors (HR-verified, scripted disable)
- [ ] Enable audit logging where it is off (often disabled on legacy servers to save disk)
- [ ] Revoke suspicious OAuth app permissions (for obvious unknowns with high privilege)
- [ ] Change default credentials on any system where they are confirmed unchanged
---
## Report Structure
The Brownhat Diagnostic report has five sections. Target length: 15-25 pages, no more. A longer report will not be read.
### 1. Executive Summary (2 pages)
- Current state in one paragraph — honest, not alarming
- Kill chain: the specific path, named, diagrammed if possible
- P0 count, P1 count, P2 count
- Quick wins: what was closed immediately (if Day 1 quick wins were executed)
- Recommended first module and rationale
- NIS2 compliance gap summary (if applicable): which Article 21 measures have evidence, which do not
### 2. Methodology (0.5 pages)
- Workshop dates, attendees
- Tools used (ASTRAL, PULSAR, BloodHound, Elysium, Purple Knight, CAExporter, external scan)
- Access used (read-only Entra ID, domain user for BloodHound, domain admin for Elysium)
- What was NOT assessed (explicitly scoped out — sets expectations)
### 3. Findings (8-15 pages)
Organise by priority tier, not by domain.
**P0 — Kill Chain Nodes**: Each finding gets a half-page: the finding in one sentence, the evidence, the business impact in non-technical language, and the remediation. Name the specific accounts, policies, or systems involved. "Admin accounts lack MFA" is a weak finding. "3 of 5 Global Administrator accounts — `admin@nexus.onmicrosoft.com`, `it-admin@nexus.onmicrosoft.com`, and the break-glass account — can authenticate without MFA because the Conditional Access policy 'Require MFA' is in Report-Only mode" is a finding.
**P1 — Material Risk**: Same format, briefer. One paragraph per finding.
**P2 — Housekeeping Queue**: Table format only. ID, finding, why it matters in one sentence.
### 4. Module Recommendation (2 pages)
- Recommended sequence with rationale
- What each module closes (map to specific P0/P1 findings)
- Timeline estimate
- Investment estimate (effort ranges, not day rates — rates go in the proposal)
### 5. Quick Wins Closed (0.5 pages)
List what was already fixed during the diagnostic. This is the most important page for client confidence — they paid for the diagnostic and something is already better.
---
## Backlog Population
Before leaving the client site (or within 24 hours):
1. Create the ADO Work Items project (or agree on the tool with Ondřej)
2. Enter every finding as a Work Item: ID, finding text (one sentence), source (Brownhat Diagnostic), priority (P0/P1/P2), owner (named person)
3. Move quick wins to Closed with the date they were resolved
4. Brief the named IT lead on the backlog: where it lives, how the monthly cycle works, who owns what
5. Pin the ADO board as a Teams tab if applicable
The backlog handover is not optional. A diagnostic that produces a report but no maintained tracking system has a half-life of one steering committee meeting.
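Before the handover, it helps to check that no P0 is missing an owner. A sketch in Python using the Work Item fields listed in step 2. The `Finding` class, the `unowned` helper, and the sample entries are hypothetical and are not tied to the Azure DevOps API.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    id: str
    text: str               # the finding in one sentence
    priority: str           # "P0", "P1" or "P2"
    owner: str = ""         # named person; empty means unassigned
    source: str = "Brownhat Diagnostic"


def unowned(findings, priority="P0"):
    """Return the IDs of findings at the given priority that have no owner."""
    return [f.id for f in findings if f.priority == priority and not f.owner.strip()]


# Hypothetical backlog entries.
backlog = [
    Finding("P0-001", "Conditional Access policy 'Require MFA' is in Report-Only mode", "P0", owner="J. Smith"),
    Finding("P0-002", "RDP is exposed to the internet on one public IP", "P0"),
    Finding("P2-001", "Firewall rule for a finished project is still active", "P2"),
]
print(unowned(backlog))
```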
---
## ASTRAL and PULSAR Handover
By the end of the diagnostic engagement:
**ASTRAL**:
- First full backup has run and committed to the ADO repository
- Client IT lead can access the ADO project and review the baseline
- Drift detection is live — the first drift PR, if one occurs, should be reviewed together with the client as a training exercise
- Reviewer notification configured to email or Teams-notify Ondřej
**PULSAR**:
- Audit events ingesting and searchable
- Teams tab pinned in the IT channel
- Basic search walkthrough done with client IT lead: show them how to find a specific event, how to filter by actor and operation
- No alert rules yet — those come in Module 2/3 when there is a hardened baseline to alert against
---
## Common Mistakes in Assessment Execution
**Starting tool runs before access is confirmed.** Tool runs that fail eat time and erode confidence. Confirm credentials work before you need them.
**Running Elysium without telling the client what it does.** "We are going to compare your password hashes against a database of known-compromised credentials" needs to be explained before it happens. Most clients are fine with it once they understand the privacy model. Zero clients want a surprise.
**Presenting findings before you have run BloodHound.** The kill chain often only becomes clear once BloodHound has shown how the pieces connect. Do not anchor the client on an incomplete kill chain in Session 2 and then have to walk it back.
**Marking everything P0.** If you present 15 P0 findings, the client has no way to act. Real P0 items are rare, typically 3-8 in a first diagnostic. If you have more, re-examine your priority assignments.
**Leaving without a named owner for every P0.** The diagnostic ends. The report goes out. Nobody fixes the P0 items because nobody has their name on them. Get owner names in the room before you leave.
**Forgetting to document what you ran and what access you used.** The methodology section of the report should be written from notes taken during the assessment, not reconstructed from memory three days later.
---
## Post-Assessment Checklist
Before submitting the report:
- [ ] Kill chain written as a chain, not a list
- [ ] Every P0 finding has: evidence citation, specific named assets, remediation steps, named owner
- [ ] Quick wins section lists what was already fixed
- [ ] Module recommendation is tied to specific findings ("Module 2 closes P0-001, P0-002, P1-001, P1-004")
- [ ] ASTRAL baseline committed and accessible to client
- [ ] PULSAR ingesting and accessible to client
- [ ] Findings backlog populated in agreed tool, owners assigned
- [ ] Report reviewed for any claim that is an assertion rather than evidence (replace with what was found)
- [ ] NIS2 compliance map completed if client is in scope
- [ ] Next steps section includes: module recommendation, first meeting date, decision required from client
---
*Companion documents:*
*[NIST CSF 2.0 Baseline Assessment](nist-csf-baseline.md) — workshop methodology and questionnaires*
*[Sample Engagement: Mid-Market Hybrid](../playbooks/sample-engagement-mid-market.md) — calibration reference for findings and recommendations*
*[Findings Backlog](findings-backlog.md) — where findings land and how the housekeeping stream works*
*[Sovereign Tool Stack](../playbooks/sovereign-tool-stack.md) — full tool reference with deployment guidance*
*[Module Menu](../core/modular-engagements.md) — module selection after the diagnostic*