# Field Guide — Adversarial Validation
> *"It's a nice compliance dashboard you have here."*
**Last updated:** June 2026
**Companion to:** [Field Guide — 2026 Edition](field-guide-2026.md) · Books I–VI
**Engagement type:** Phase 2 — for clients who have done the foundational work
**Checklist:** [Adversarial Validation Checklist](../assessment-templates/adversarial-validation-checklist.md)
**Next review:** January 2027
---
## The premise
The client has MFA. They have Conditional Access. They have Intune. They have a SIEM. Their CIS score is in the seventies or eighties. Their audit passed. The dashboard is green.
This is the most dangerous estate to walk into — not because it is badly configured, but because everyone in the room believes it works. That belief is the fragility. Book I calls it directly: *"Green dashboards, untested reality — the most dangerous estate of all, because it feels safe."*
The foundational field guide tells you how to build controls. This engagement is about finding out which of the client's existing controls are real and which are representations — configurations that *display* correctly but *enforce* nothing, backups that exist but have never been restored, detection that fires into a queue nobody reads, attack paths to Domain Admin that nobody has mapped because the BloodHound licence expired.
**What you are doing in this engagement:** Systematically converting claimed security into observed security, domain by domain, and producing a structural change for every gap found. Not a pentest. Not a red team. A constructive adversarial validation — you are working with the client, with full authorization, with the explicit goal of finding what breaks before an attacker does.
**What you are not doing:** Adding more controls. This engagement deliberately does not recommend new tooling or new policies. If a control exists and does not work, the finding is that the control does not work — not that a different control is needed. Via negativa applies here too: the fragility is almost always that the existing controls have too many exceptions, too little monitoring, and have never been tested.
---
## Before you start
### Authorization scope
Before any test in this engagement, confirm written authorization covering:
- Simulating attacks against identity (Kerberoasting, DCSync simulation, PIM bypass attempts)
- Triggering security alerts deliberately (break-glass sign-in, impossible-travel simulation, fake consent grant)
- Testing compliance controls on managed devices (rooting a test device, forcing a non-compliant state)
- Attempting data exfiltration through DLP and labeling controls (on test data, to controlled test destinations)
- Restoring from backup in a test environment
Authorization is not "we told them verbally." It is a document signed by the named executive sponsor covering the scope of tests. Scope the authorization to the test accounts, test devices, and test data used — do not test on production privileged accounts or production data unless explicitly scoped.
### Baseline capture before anything changes
On day one, before any test or change:
1. Export all CA policies to JSON (CAExporter or Graph API). This is the declared state you will test against and the known-good you will compare the close-of-engagement state to.
2. Run BloodHound and capture the full attack graph. The number of paths to Domain Admin at T+0 is your opening metric.
3. Pull the Entra role assignment list — who holds what role, eligible vs. active.
4. Pull the service principal inventory with their Graph permissions.
5. Export Intune compliance and configuration policy assignments.
6. Run `Get-ADUser krbtgt -Properties PasswordLastSet`, `Get-ADComputer AZUREADSSOACC -Properties PasswordLastSet`, and document both.
7. Count sign-in log distinct device IDs for the last 30 days. Compare to Intune enrolled device count. Record the gap.
These numbers are your before-state. Every structural change produced by this engagement is measured against them.
### The opening conversation
This engagement starts with a single question asked out loud, to the most senior technical person in the room:
> *"Can you show me one control in this estate that you are certain works — not because the portal says so, but because you have watched it fire under real conditions?"*
The answer tells you everything. A person who can point to a specific tested control on a specific date has a security programme. A person who gestures at the dashboard has a compliance programme. Both deserve good consulting — but they need different things.
---
## 1. Identity — proving the wall is real
### The firebreak claim
The client almost certainly claims that cloud privilege is separated from on-prem compromise. Test the claim, don't accept it.
**Draw the full graph, out loud:**
Starting from Domain Admin (or a simulated compromise of the sync server), trace every path that reaches a cloud privileged role:
- Are any GAs synced from on-prem? (They claim no — verify.)
- Can the sync server connector account be used to tamper with cloud objects?
- Do any admins use the same device for Tier 0 and cloud admin work?
- Is there a PTA agent that could be compromised to intercept credentials?
- Does any MFA for cloud admin rely on an authenticator app on a device that is also used for email? (The MFA device is Tier 2. The admin role is cloud Tier 0. That is a tier violation across the MFA layer.)
**Verify cloud-only GAs are actually cloud-only:**
```powershell
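# Role members come back with a default property set that omits onPremisesSyncEnabled,
# so each user is fetched again with that property requested explicitly.
$gaRoleId = (Get-MgDirectoryRole -Filter "displayName eq 'Global Administrator'").Id
Get-MgDirectoryRoleMember -DirectoryRoleId $gaRoleId -All |
    Where-Object { $_.AdditionalProperties['@odata.type'] -eq '#microsoft.graph.user' } |
    ForEach-Object { Get-MgUser -UserId $_.Id -Property UserPrincipalName, OnPremisesSyncEnabled } |
    Select-Object UserPrincipalName, OnPremisesSyncEnabled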
```
`onPremisesSyncEnabled: true` on any GA is a P0 finding. "We moved them to cloud-only" is the claim; this is the verification.
**Test the break-glass is actually independent:**
With the client present: sign in to the break-glass account. Does it succeed? Does an alert fire? Does the person named as the responder to that alert actually receive it and acknowledge it within the agreed SLA? An alert rule that exists but routes to an unmonitored inbox is a ghost detection.
### AD FS: is the token-signing key actually monitored?
If AD FS is still running (and in a "mature" estate it often is, "migration is on the roadmap"):
```powershell
# NotAfter and NotBefore live on the nested X509 certificate object
Get-AdfsCertificate -CertificateType Token-Signing |
    Select-Object Thumbprint, @{N='NotAfter';E={$_.Certificate.NotAfter}},
        @{N='DaysSinceRotation';E={((Get-Date) - $_.Certificate.NotBefore).Days}}
```
Then ask: if an attacker obtained the private key for this certificate right now, what would you see in your logs? Walk through the scenario. In almost every case the honest answer is "nothing — a Golden SAML token is indistinguishable from a legitimate one." That is the finding. The migration is no longer a roadmap item.
### PIM: test the activation path, not the configuration
The client has PIM. But:
- **What MFA method is required on activation?** Navigate to PIM > Settings for Global Administrator role > Require MFA on activation. Then confirm the MFA method registered for each eligible GA. Push-approve MFA + PIM activation = phishable PIM. The control is not what it appears.
- **Test an activation:** Have a test user with an eligible GA role activate it. Time the process. Observe: does the approval notification reach the approver? Does the approver know what they are approving, or does it arrive as a blind "approve this"? An approval workflow where approvers routinely click approve without context is not an approval workflow.
- **Check for standing GA assignments that are supposed to be eligible-only.** `Get-MgDirectoryRoleMember` for GA — any user with no corresponding PIM eligible assignment has a permanent standing assignment that exists outside PIM, whether intentionally or by configuration drift.
- **Check the maximum activation time box.** 24-hour activation windows are common in "we have PIM" deployments. An activation window that covers an entire working day is functionally standing privilege during business hours.
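The standing-assignment check above is a set difference: active role holders, minus those with a PIM eligible assignment, minus the break-glass accounts. The following is a minimal sketch of that comparison only. The function name and parameter names are hypothetical, and it assumes you have already collected the principal IDs (active members from `Get-MgDirectoryRoleMember`, eligible assignments from the Graph `roleEligibilitySchedules` endpoint). It does not catch a principal that holds both an eligible and a separate permanent assignment.

```powershell
# Sketch: find role holders whose assignment exists outside PIM.
# Function and parameter names are illustrative, not part of any module.
function Get-StandingAssignment {
    [CmdletBinding()]
    param(
        # Principal IDs currently listed as active members of the role
        [string[]]$ActivePrincipalId = @(),
        # Principal IDs with a PIM eligible assignment for the same role
        [string[]]$EligiblePrincipalId = @(),
        # Break-glass accounts, which are expected to be standing
        [string[]]$BreakGlassPrincipalId = @()
    )
    $ActivePrincipalId |
        Where-Object { $_ -notin $EligiblePrincipalId -and $_ -notin $BreakGlassPrincipalId } |
        Sort-Object -Unique
}

# Example with placeholder IDs
$sample = @{
    ActivePrincipalId     = 'id-alice', 'id-bob', 'id-breakglass'
    EligiblePrincipalId   = 'id-bob'
    BreakGlassPrincipalId = 'id-breakglass'
}
Get-StandingAssignment @sample
```

Anything this returns is a candidate, not a verdict: confirm each one against the PIM blade before reporting it as standing privilege.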
### The connector account as a canary
Configure the canary: any sign-in by the Entra connector account (Directory Synchronization Accounts role) from any host other than the sync server should fire an alert. Then test it: simulate a sign-in from an unexpected host. Does the alert fire? Does someone respond?
If the answer is "we have an alert rule," test it. "We have an alert rule" is a declaration. A firing alert reaching a responding human is an observation. The handbook's hardest rule applies here: verify by observation, never by inspection.
---
## 2. Privilege — attack paths the client has not mapped
### BloodHound as a metric, not a one-time scan
The client's mature estate almost certainly has attack paths to Domain Admin that nobody has counted since the last pentest, if ever. Run BloodHound, capture the full graph, and count:
- **Total paths to Domain Admin** (all principals)
- **Paths reachable from standard user compromise** (the realistic starting point for a phishing attack)
- **Paths involving Kerberoastable service accounts** specifically
- **Paths involving ADCS** (add `-CollectionMethod ACL,ObjectProps,Trusts` to catch certificate-based escalation)
Present the number. Do not present it as "you have X findings." Present it as: *"From a single compromised standard user account, there are N independent routes to Domain Admin. Each route is a path through controls the attacker does not need to break because they route around them."* Then pick the three shortest paths and show them concretely.
This number is now a tracked metric. The engagement is not complete until it is going down.
### Kerberoast it — don't ask if it's possible
Run the attack:
```powershell
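# Using Rubeus or Invoke-Kerberoast in an authorized test context.
# Emit only the Hash property so the file is directly usable by Hashcat.
Invoke-Kerberoast -OutputFormat Hashcat | Select-Object -ExpandProperty Hash |
    Out-File -Encoding ascii kerberoast_hashes.txt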
```
The question is not "are there Kerberoastable accounts" (there are) — the question is: **did anything detect it?** A Kerberoast produces distinctive TGS request patterns. If Defender for Identity, Microsoft Sentinel, or any SIEM is watching, it should alert. If it does not, you have found a detection gap more important than the accounts themselves.
Then attempt to crack the hashes offline (with explicit authorization, on a controlled device). Report which accounts crack and in what time. Most clients are surprised. The service account from 2019 with the password that was "rotated" to `ServiceAcc0unt!2019` cracks in minutes.
### ADCS: the forgotten Tier 0 target
Run a basic ESC vulnerability enumeration:
```
certipy find -u <test-account>@domain.com -p <password> -dc-ip <DC-IP> -stdout
```
Or Certify if a Windows test host is more convenient:
```
Certify.exe find /vulnerable
```
In a mature estate, the ADCS server has been running for years, was configured for a specific purpose in 2018, and has never been audited against the ESC series. ESC1 (supply subject in request + broad enrollment rights) in particular is common and catastrophic: it allows any user with enrollment rights on the template to obtain a certificate for any principal, including Domain Admins. Find it, show the exploit path, and document that the ADCS server is being treated as Tier 1 when it is Tier 0.
### Service principal dark matter
The client's mature estate has app registrations. Some of them have permissions that were granted for a reason that nobody in the room can explain. Find the escalation-grade ones:
```powershell
# Application permissions (not delegated; these run without a user)
$dangerousPermissions = @(
    "9e3f62cf-ca93-4989-b6ce-bf83c28f9fe8", # RoleManagement.ReadWrite.Directory
    "06b708a9-e830-4db3-a914-8e69da51d44f", # AppRoleAssignment.ReadWrite.All
    "1bfefb4e-e0b5-418b-a88f-73c46d2cc8e9", # Application.ReadWrite.All
    "19dbc75e-c2e2-444c-a770-ec69d8559fc7"  # Directory.ReadWrite.All
)
Get-MgServicePrincipal -All | ForEach-Object {
    $sp = $_
    Get-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id |
        Where-Object { $_.AppRoleId -in $dangerousPermissions } |
        ForEach-Object {
            [PSCustomObject]@{
                ServicePrincipal = $sp.DisplayName
                Permission       = $_.AppRoleId
                GrantedDate      = $_.CreatedDateTime
            }
        }
} | Sort-Object GrantedDate
```
For each result: ask the room who created this app registration, what it does, and whether the permission is still needed. The answer to all three is usually "I don't know." That is the finding.
Then go further: check which of these service principals have non-expiring client secrets and which have never been used (check the sign-in logs for the service principal's `lastSignInDateTime`). A service principal that has not authenticated in 180 days with a never-expiring secret holding escalation-grade Graph permissions is a standing credential an attacker can use indefinitely without triggering a human sign-in.
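The dormant-credential triage described above can be sketched as a filter over an inventory you have already assembled. This is an illustration under stated assumptions: the function name, the input property names (`DisplayName`, `LastSignIn`, `SecretExpiry`) and the 730-day cut-off for "effectively non-expiring" are all hypothetical choices, and only the 180-day dormancy figure comes from the text.

```powershell
# Sketch: flag service principals that are dormant AND hold a long-lived secret.
# Input objects are assumed to carry DisplayName, LastSignIn (may be $null)
# and SecretExpiry; these property names are illustrative.
function Get-DormantCredentialFinding {
    [CmdletBinding()]
    param(
        [object[]]$ServicePrincipal = @(),
        [datetime]$AsOf = (Get-Date),
        [int]$DormantDays = 180,
        # Assumption: a secret with this much validity left is treated as non-expiring
        [int]$LongLivedDays = 730
    )
    foreach ($sp in $ServicePrincipal) {
        # A missing last sign-in is treated as "never used"
        $neverUsed = $null -eq $sp.LastSignIn
        $idleDays = if ($neverUsed) { $null } else { [math]::Floor(($AsOf - [datetime]$sp.LastSignIn).TotalDays) }
        $secretDaysLeft = [math]::Floor(([datetime]$sp.SecretExpiry - $AsOf).TotalDays)
        $dormant = $neverUsed -or $idleDays -ge $DormantDays
        if ($dormant -and $secretDaysLeft -ge $LongLivedDays) {
            [pscustomobject]@{
                DisplayName    = $sp.DisplayName
                NeverUsed      = $neverUsed
                IdleDays       = $idleDays
                SecretDaysLeft = $secretDaysLeft
            }
        }
    }
}
```

Run it only over the principals the permission query surfaced; the combination of dormancy, long-lived secret, and escalation-grade permission is what makes the finding.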
### Standing privilege check: the PIM compliance gap
Ask for the full current list of active (not eligible) privileged role assignments. For each one:
- Is it a break-glass account? If not, it should not be standing.
- Is it a service account that cannot use PIM? Document and scope the managed-identity migration.
- Is it an account someone added "temporarily" and forgot?
In most mature tenants, the list of active non-break-glass assignments is longer than anyone expects, because PIM was deployed and the existing standing assignments were not cleaned up at the time.
---
## 3. Devices — the compliance signal gap
### The ghost CA policy protocol
Apply this to every CA policy the client considers important (not every policy — prioritize the ones that block legacy auth, enforce device compliance, and gate privileged sign-in):
**Before testing any policy:**
Write down the expected outcome: *"User [X], device [Y], from location [Z], accessing [App] → MUST be [blocked / MFA-prompted / compliant-device-required]."* Write this before looking at the policy configuration. This prevents rationalizing whatever you observe.
**The tests to run:**
1. **Legacy auth block:** Use a mail client that supports Basic Auth (older Outlook, curl with basic auth headers to Exchange Online) from a test account. Expected: blocked. If it succeeds, the CA policy that blocks legacy auth either has an exclusion, is in report-only, or is a ghost.
2. **Compliant device gate:** Sign in from a device that is known to be non-compliant (a personal device, or a managed device you have taken out of compliance by disabling BitLocker or removing an agent). Expected: blocked from sensitive workloads. If access is granted, either the CA policy is not evaluating correctly or the compliance signal is stale.
3. **Admin sign-in from non-PAW:** Attempt to activate a PIM role from a standard workstation or a personal device. Expected: blocked if there is a CA policy restricting admin access to compliant or named devices. If it succeeds, the PAW policy is a claim.
4. **The ghost test:** If any policy above fails to enforce despite its configuration appearing correct — recreate the policy from scratch with identical parameters. Re-test. If the recreated policy enforces and the original did not, you have found a ghost policy. Document the specific policy name, the discrepancy, the recreation, and the re-test result.
**Important:** Do not re-edit a failing policy to fix it. Recreate it. A ghost policy carries its corruption forward through edits.
### Compliance signal spoofing: measure the lag
Take a test enrolled device (a managed device you have authorization to modify):
1. Root/jailbreak it, or manually induce a non-compliant state (disable encryption, disable the screen lock, install a prohibited app — whatever the compliance policy checks).
2. Record the timestamp.
3. Watch Intune and Entra ID: when does the compliance state flip to non-compliant?
4. When does Conditional Access revoke the session token?
5. Is Continuous Access Evaluation (CAE) in place for the workloads that matter? If yes, token revocation should be near-real-time for supported apps. If no, the window is bounded by the token lifetime.
The gap between step 2 and step 4 is the attacker's window after compromising a compliant device. Present it in minutes, not as "the token may be stale." Most clients have never measured it.
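To present the window in minutes, the three timestamps from the steps above can be reduced to three durations. A minimal sketch follows; the function and parameter names are hypothetical, and the timestamps are whatever you recorded by hand or pulled from the portal during the test.

```powershell
# Sketch: turn the recorded timestamps into the numbers the client needs to see.
function Measure-ComplianceLag {
    [CmdletBinding()]
    param(
        # Step 2: when the device was made non-compliant
        [Parameter(Mandatory)] [datetime]$NonCompliantAt,
        # Step 3: when Intune / Entra ID showed the device as non-compliant
        [Parameter(Mandatory)] [datetime]$StateFlippedAt,
        # Step 4: when access was actually cut off
        [Parameter(Mandatory)] [datetime]$AccessRevokedAt
    )
    [pscustomobject]@{
        MinutesToDetect       = [math]::Round(($StateFlippedAt - $NonCompliantAt).TotalMinutes, 1)
        MinutesToRevoke       = [math]::Round(($AccessRevokedAt - $StateFlippedAt).TotalMinutes, 1)
        AttackerWindowMinutes = [math]::Round(($AccessRevokedAt - $NonCompliantAt).TotalMinutes, 1)
    }
}

# Example with placeholder times
Measure-ComplianceLag -NonCompliantAt '2026-06-01T09:00:00' `
    -StateFlippedAt '2026-06-01T10:30:00' -AccessRevokedAt '2026-06-01T11:15:00'
```

Splitting the window into detect and revoke shows the client which half of the lag is theirs to shorten.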
### Reconcile the real fleet
Pull four numbers and compare them:
| Source | Count |
|--------|-------|
| Intune managed devices | |
| Entra registered/joined devices | |
| Distinct device IDs in sign-in logs (last 30 days) | |
| Distinct device IDs signing in with "Device compliant: No" or "Device managed: No" | |
The gap between rows 1 and 2 combined and row 3 is the shadow population. The number in row 4 is the unmanaged population actively accessing data. Neither is a hypothetical risk; both are current, observable facts about who is accessing the tenant right now.
For every device in row 4: what data can it reach, and what Conditional Access policy, if any, applies to it?
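One way to fill in the table is to reduce each source to a list of device IDs and compare the sets. The sketch below assumes the shadow population means device IDs seen in sign-in logs that appear in neither inventory; that reading, the function name, and the parameter names are assumptions of this example. Compare on the Entra device ID throughout (Intune exposes it as `azureADDeviceId` on a managed device), because Intune's own device ID will never match a sign-in record.

```powershell
# Sketch: reconcile four exported ID lists into the counts for the table.
function Get-FleetReconciliation {
    [CmdletBinding()]
    param(
        [string[]]$IntuneDeviceId = @(),
        [string[]]$EntraDeviceId = @(),
        [string[]]$SignInDeviceId = @(),
        [string[]]$UnmanagedSignInDeviceId = @()
    )
    # Empty IDs are dropped: sign-ins without a device ID cannot be reconciled,
    # so the shadow count here is a lower bound.
    $intune    = @($IntuneDeviceId | Where-Object { $_ } | Sort-Object -Unique)
    $entra     = @($EntraDeviceId | Where-Object { $_ } | Sort-Object -Unique)
    $known     = @($intune + $entra | Sort-Object -Unique)
    $seen      = @($SignInDeviceId | Where-Object { $_ } | Sort-Object -Unique)
    $unmanaged = @($UnmanagedSignInDeviceId | Where-Object { $_ } | Sort-Object -Unique)
    # Seen in sign-in logs but present in neither inventory (comparison is case-insensitive)
    $shadow    = @($seen | Where-Object { $_ -notin $known })

    [pscustomobject]@{
        IntuneCount          = $intune.Count
        EntraCount           = $entra.Count
        SignInCount          = $seen.Count
        UnmanagedSignInCount = $unmanaged.Count
        ShadowCount          = $shadow.Count
        ShadowDeviceId       = $shadow
    }
}
```

The `ShadowDeviceId` list is the starting point for the question above: for each ID, which user signed in from it and what did they reach.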
### Legacy auth: find the surviving flows
Even with a "block legacy auth" CA policy in place, find the exceptions:
```
Sign-in logs → Add filter → Client App → select all non-modern entries:
Exchange ActiveSync
Exchange Online PowerShell
Exchange Web Services
IMAP4
MAPI Over HTTP
Other clients
POP3
Reporting Web Services
SMTP
```
Export the results, filtered to successful sign-ins (a blocked attempt is the policy working). Every remaining entry is a legacy auth flow that either bypasses the CA policy (via an exclusion you should examine) or is a service account using a protocol that will break when the exclusion is removed. Build the map. The goal is zero, but the path to zero requires knowing what is currently there.
---
## 4. Data — does protection actually travel
### Exfiltrate a labelled document
With authorization, take a test document labelled at the highest sensitivity tier available (Highly Confidential, or equivalent):
1. Forward it as an email attachment to a personal test email address outside the tenant. Does DLP intercept it? Does the label encryption hold on the received document?
2. Download it to an unmanaged device (one that is not Intune-enrolled). Open it. Does encryption require authentication to the tenant?
3. Share it via an anonymous "Anyone with the link" URL (if anonymous sharing is still permitted). Access the link from a browser with no tenant authentication. Does it open?
4. Copy and paste the content from the document into an unmanaged app (on a device where the MAM boundary applies). Does the block work?
5. Open it in a browser through Conditional Access App Control session policy. Attempt to download. Does the block work?
Document which paths hold and which do not. The ones that do not hold are the exfiltration routes an attacker (or a careless employee) will actually use. Every failed block is a finding; the label configuration that passed in the policy screen is the ghost, and the exfiltrated file is the fact.
### Enumerate the anonymous link population
The tenant sharing setting may say "restricted." That setting controls new links. It does not remove existing ones. Run:
```powershell
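# PnP PowerShell: requires Site Collection Admin on each site.
# Assumption: Get-PnPSharingLinks stands in for whatever link-enumeration cmdlet your
# PnP.PowerShell version provides; confirm with Get-Command -Module PnP.PowerShell *SharingLink*.
Get-PnPTenantSite | ForEach-Object {
    Connect-PnPOnline -Url $_.Url -Interactive
    Get-PnPSharingLinks | Where-Object { $_.SharingLinkType -eq "Anonymous" }
} | Export-Csv anonymous_links.csv -NoTypeInformation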
```
Present the count. In mature tenants, the anonymous link population predates the current tenant sharing settings by years. The setting was changed; the links were not revoked. Every entry is an active bearer token for data that predates the restriction.
### The BEC forward rule: simulate it
With a test account (not an executive, not a privileged account):
1. Create an Inbox rule forwarding all email to an external test address you control.
2. Wait to see whether anything detects it and when.
3. Check whether the global block on external auto-forwarding (`Get-RemoteDomain Default | Select-Object AutoForwardEnabled`) actually blocks this test rule from executing.
4. Confirm: does the transport rule block the forwarding, or does the block only apply to Outlook/OWA auto-forwarding (not to manually-created Inbox rules)?
There is a documented distinction: the transport-level `AutoForwardEnabled: false` on Remote Domains blocks transport-rule-level forwarding and OWA Auto-Reply forwarding, but Inbox rules created in Outlook/OWA by the user may still forward depending on the specific configuration. Test this on the client's environment. Do not assume.
### Crown jewel access review
For the data sets the client has identified as crown jewels (if they have not identified them, that is the first finding; go back to the foundational engagement):
1. Pull the access list for the crown-jewel SharePoint sites and OneDrive locations.
2. Pull the audit log for access events on those locations over the last 30 days.
3. Identify: who accessed them, how frequently, from what devices?
4. Find: any access from unmanaged devices. Any access from accounts that should not have visibility. Any bulk download events.
5. Specifically check for guest access to the crown-jewel locations — guests whose project has concluded but whose access persists.
The audit log review is also a test of the audit infrastructure: can you produce a coherent forensic reconstruction of who accessed what, when, from where, over the last 30 days? If the answer is "we would need to run several different reports and correlate them manually," that is an incident response readiness finding.
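The bulk-download check in step 4 can be sketched as a sliding-window count over exported audit records. This assumes each record exposes `UserId`, `Operation` and `CreationTime`, and that downloads are logged as `FileDownloaded`; verify both against the client's export. The function name, the default threshold of 50 files, and the 10-minute window are illustrative choices.

```powershell
# Sketch: flag users whose downloads peak at or above a threshold inside a time window.
function Get-BulkDownloadCandidate {
    [CmdletBinding()]
    param(
        [object[]]$AuditRecord = @(),
        [int]$Threshold = 50,
        [int]$WindowMinutes = 10
    )
    $AuditRecord |
        Where-Object { $_.Operation -eq 'FileDownloaded' } |
        Group-Object UserId |
        ForEach-Object {
            $group = $_
            $times = @($group.Group | ForEach-Object { [datetime]$_.CreationTime } | Sort-Object)
            # Two-pointer sliding window: $start trails $end by at most $WindowMinutes
            $start = 0
            $peak = 0
            for ($end = 0; $end -lt $times.Count; $end++) {
                while (($times[$end] - $times[$start]).TotalMinutes -gt $WindowMinutes) { $start++ }
                $peak = [math]::Max($peak, $end - $start + 1)
            }
            if ($peak -ge $Threshold) {
                [pscustomobject]@{
                    UserId        = $group.Name
                    PeakDownloads = $peak
                    WindowMinutes = $WindowMinutes
                }
            }
        }
}
```

If producing the input for this function takes several manual exports, that difficulty is itself the incident response readiness finding described above.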
---
## 5. Detection — does it fire, does anyone act
This section is the difference between robustness and antifragility. Everything before this is about whether controls hold. This section is about whether the organization learns when they do not.
### The eight simulations
For each of these, run the simulation with authorization, observe the outcome, and measure the time from event to human acknowledgment. The SLA the client believes they have is the declared state. The measured time is the observed state.
**Simulation 1 — Break-glass sign-in:**
Sign in to the break-glass Global Admin account. This should trigger an immediate, high-priority alert routed to a named responder. Measure: how long from sign-in to human acknowledgment? If the answer is longer than 15 minutes, the break-glass is not monitored at the level it needs to be.
**Simulation 2 — New Global Admin assigned:**
Assign GA to a test account. Observe: does an alert fire in Microsoft Sentinel, Microsoft Defender, or the configured SIEM? Who receives it? When? Revoke the assignment after the test.
**Simulation 3 — DCSync simulation:**
From a non-DC host with a test account that has the relevant permissions (or using Mimikatz in an authorized test context), simulate a DCSync operation. Defender for Identity should alert on `Directory Services Replication`. Does it? Does the alert reach a human? Most mature clients have DfI deployed; fewer have confirmed the specific alert fires and routes correctly.
**Simulation 4 — Kerberoasting (detection, not just the attack):**
Run the Kerberoast from section 2 again, now with the explicit goal of measuring detection. Did the TGS request pattern generate an alert? The attack was run earlier to find the vulnerable accounts; run it again now to find the detection gap.
**Simulation 5 — Impossible travel for an admin account:**
Using a VPN exit node or a cloud VM in a geographically distant region, sign in as a test user who recently signed in from the client's location. Entra ID Protection should flag this as a risky sign-in. Does the user risk policy elevate the risk? Does a CA policy enforce remediation (MFA challenge or block)? Does an alert fire to the SOC? For admin accounts specifically, this should be a high-priority signal.
**Simulation 6 — External auto-forward rule:**
From the data section — did anything alert when the test Inbox rule was created? If no detection fired during that test, that is a finding: BEC persistence can be established without triggering a single alert.
**Simulation 7 — Mass download from SharePoint:**
With a test account that has access to a document library, download 50+ files in rapid succession. Does Defender for Cloud Apps or Microsoft Purview generate an unusual-download alert? Does anything block or throttle it?
**Simulation 8 — OAuth consent grant:**
Register a test app requesting `Mail.Read` and `Files.ReadWrite.All` permissions. Grant it on behalf of a test user (simulating a user who clicks "Accept" on a consent prompt). Does anything alert on the grant event? Is user consent for this class of permission blocked by policy, or can users grant it freely?
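Recording each simulation as a trigger time and an acknowledgment time makes the declared-versus-observed comparison mechanical. The sketch below is one way to score the results. The function name and input property names (`Name`, `TriggeredAt`, `AcknowledgedAt`) are hypothetical, and the 15-minute default is borrowed from the break-glass simulation; substitute the SLA the client has actually declared for each signal.

```powershell
# Sketch: score each simulation by time from event to human acknowledgment.
function Get-DetectionResult {
    [CmdletBinding()]
    param(
        [object[]]$Simulation = @(),
        [int]$SlaMinutes = 15
    )
    foreach ($s in $Simulation) {
        # No acknowledgment timestamp means nobody responded during the test
        $acked = $null -ne $s.AcknowledgedAt
        $minutes = if ($acked) {
            [math]::Round(([datetime]$s.AcknowledgedAt - [datetime]$s.TriggeredAt).TotalMinutes, 1)
        } else { $null }
        $outcome = if (-not $acked) { 'NoResponse' }
                   elseif ($minutes -le $SlaMinutes) { 'WithinSla' }
                   else { 'SlaBreached' }
        [pscustomobject]@{
            Name         = $s.Name
            MinutesToAck = $minutes
            Outcome      = $outcome
        }
    }
}
```

`NoResponse` and `SlaBreached` are kept separate on purpose: an alert that arrives late is a tuning problem, while an alert nobody acknowledges is a ghost detection.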
### Alert fatigue: measure it honestly
Pull the alert volume from the last 30 days (from Sentinel, Defender XDR, or wherever alerts are collected). Calculate:
- Total alerts generated
- Alerts closed as "true positive" with a documented response
- Alerts closed as "false positive"
- Alerts that have sat open for more than 48 hours
- Alerts that were suppressed or auto-closed without human review
The ratio of responded-to versus everything else is the real detection efficacy rate. Most mature clients discover that their effective detection rate is single-digit percentages of generated alerts. Present the number; it is a more honest metric than "we have Sentinel."
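As a worked illustration of the ratio, the sketch below computes the efficacy rate from the five counts listed above. The function and parameter names are hypothetical, and treating "open over 48 hours" plus "auto-closed" as the unreviewed share is this example's assumption about how to group the buckets.

```powershell
# Sketch: express 30 days of alert handling as two percentages.
function Get-AlertEfficacy {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)] [int]$Total,
        [int]$TruePositiveWithResponse = 0,
        [int]$FalsePositive = 0,
        [int]$OpenOver48Hours = 0,
        [int]$AutoClosed = 0
    )
    if ($Total -le 0) { throw 'Total must be greater than zero.' }
    [pscustomobject]@{
        # Alerts that ended in a documented response, as a share of everything generated
        EfficacyPercent      = [math]::Round(100.0 * $TruePositiveWithResponse / $Total, 1)
        # Alerts no human looked at in a useful time frame
        UnreviewedPercent    = [math]::Round(100.0 * ($OpenOver48Hours + $AutoClosed) / $Total, 1)
        FalsePositivePercent = [math]::Round(100.0 * $FalsePositive / $Total, 1)
    }
}

# Example with made-up counts
Get-AlertEfficacy -Total 4000 -TruePositiveWithResponse 120 -FalsePositive 900 `
    -OpenOver48Hours 600 -AutoClosed 2380
```

With these made-up counts the efficacy rate is 3%, which is the kind of single-digit figure the paragraph above describes.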
### The structural change test
Pull the last five security incidents or alerts that resulted in a closed ticket. For each:
- What was the incident?
- What was the response?
- What structural change resulted — what was removed, severed, restricted, or reconfigured because of this incident?
If the answer to the third question is "we sent a reminder," "we noted it in the risk register," or "we trained the affected user" — the feedback loop is broken. Pain that closes a ticket without changing the architecture is wasted pain. Present the count of structural changes from the last five incidents. If it is zero, that is the most important finding in the report.
---
## 6. Recovery — is the exit ramp real
### Restore something
Before the engagement closes, restore a real dataset from backup. Not a test restore of a test file — a production dataset (authorized, scoped, non-disruptive) or the clearest approximation the client can authorize.
Time it. Record the actual MTTR. Compare it to the RTO written in the policy document.
If the actual MTTR is longer than the RTO in the policy, the policy is fiction. Present the observed time as the finding. The goal is not to shame the recovery team; it is to replace a comfortable fiction with a useful truth.
**For M365 specifically:** Restore a mailbox or a SharePoint document library item from the third-party backup (if one exists). If no third-party backup exists in a mature estate, that is a P0 — it means the client has delegated recovery to Microsoft's recycle bin, which is not a backup posture.
### AD forest recovery readiness
Ask the client to produce their AD forest recovery runbook. Three things to verify:
1. **Is the runbook stored where it can be accessed when AD is down?** Not in SharePoint. Not in an AD-authenticated file share. Not in a password manager that authenticates against the domain. Paper, or a system outside the recovery domain, or both.
2. **Has anyone ever run the procedure?** Not a tabletop — an actual restore, even in a lab. The first time you perform AD forest recovery must not be during the real disaster.
3. **Does the runbook account for the double-KRBTGT rotation, metadata cleanup, and trust resets?** If it says "restore the DC from backup and you're done," it is incomplete.
If the answer to question 2 is no, scope a recovery rehearsal. This is the finding: the organization is one ransomware incident away from performing the hardest IT operation in existence for the first time, under maximum pressure, with incomplete runbooks.
### Configuration drift from the known-good
Compare the CA policy export from the beginning of this engagement against the current state. In any mature estate where CA policies are managed by multiple people without change control, there will be differences. For each difference:
- Was it intentional? Is there a change record?
- Does the difference make the policy more or less restrictive?
- If a policy was modified by someone without change authorization, how long ago and how would it have been detected?
The absence of a known-good baseline means the client cannot answer these questions. The presence of a known-good baseline and a diff is the beginning of drift detection. If the diff reveals changes made outside the change window or without documentation, that is a control failure independent of whether the change was malicious.
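The diff itself can be sketched as a comparison of two policy exports keyed on policy `id`. This assumes both exports have been loaded into arrays of objects (for example with `Get-Content <file> -Raw | ConvertFrom-Json`) and that each policy carries `id` and `displayName`, as Graph-style exports do. The function name is hypothetical. Policies are compared by their serialized JSON, so a property-order difference between export tools would show up as a false "Modified"; use the same exporter for both snapshots.

```powershell
# Sketch: classify each CA policy as Added, Removed, or Modified since the baseline.
function Compare-PolicyExport {
    [CmdletBinding()]
    param(
        [object[]]$Baseline = @(),
        [object[]]$Current = @()
    )
    $baselineById = @{}
    foreach ($p in $Baseline) { $baselineById[$p.id] = $p }

    $seen = @{}
    foreach ($p in $Current) {
        $seen[$p.id] = $true
        if (-not $baselineById.ContainsKey($p.id)) {
            [pscustomobject]@{ Id = $p.id; DisplayName = $p.displayName; Change = 'Added' }
        }
        else {
            # Case-sensitive comparison of the full serialized policy
            $before = ConvertTo-Json -InputObject $baselineById[$p.id] -Depth 20 -Compress
            $after  = ConvertTo-Json -InputObject $p -Depth 20 -Compress
            if ($before -cne $after) {
                [pscustomobject]@{ Id = $p.id; DisplayName = $p.displayName; Change = 'Modified' }
            }
        }
    }
    foreach ($p in $Baseline) {
        if (-not $seen.ContainsKey($p.id)) {
            [pscustomobject]@{ Id = $p.id; DisplayName = $p.displayName; Change = 'Removed' }
        }
    }
}
```

Each row it returns maps to the three questions above; the sketch says that something changed, not whether the change was authorized.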
---
## The close
### What changes structurally
At the end of this engagement, for every finding that was verified by observation (not just inspected), produce a specific structural change:
| Finding type | Structural change target |
|---|---|
| Ghost CA policy found | Policy recreated, re-tested, documented |
| PIM activation MFA is push-approve | Migration to phishing-resistant MFA scoped |
| Kerberoasting not detected | Detection rule created, tested end-to-end |
| Standing GA outside PIM | Account removed from role; break-glass confirmed working |
| Anonymous links not revoked | Links enumerated and revoked; expiration policy applied |
| BEC rule creation not detected | Exchange alert configured, tested |
| Alert queue not triaged | Alert owner named, SLA defined, volume reduced |
| Backup MTTR exceeds policy | Policy updated to observed time; rehearsal scheduled |
The engagement deliverable is not the report. The deliverable is the list of structural changes, plus the metrics: BloodHound path count before and after, standing privilege account count before and after, confirmed-working detection count, and measured MTTR.
### Metrics to deliver at close
| Metric | Before | After |
|--------|--------|-------|
| BloodHound paths to Domain Admin (from standard user) | | |
| Standing (non-break-glass) Global Admin count | | |
| Standing (non-break-glass) Domain Admin count | | |
| CA policies verified to enforce by observation | | |
| Detection signals tested end-to-end and confirmed working | | |
| Anonymous link count (existing) | | |
| Unmanaged devices in sign-in logs (% of total) | | |
| Actual MTTR from backup restore drill | | |
| Structural changes from last 5 incidents (before) | | |
These numbers are the honest alternative to a compliance score. None of them can be faked by clicking a toggle. All of them represent something an attacker either can or cannot do.
---
## 7. The leave-behind
The engagement ends. The admin has to operate the estate alone until the next engagement. This section is what you set up during the engagement so they can do that.
### The self-service cadence document
Every adversarial validation engagement closes with a filled-in [Self-Service Cadence](../assessment-templates/self-service-cadence.md) document, customized for the client. The template becomes their recurring runbook — monthly portal checks, quarterly tool runs, and a clear list of "call us if you see this" triggers.
Spend the last session of the engagement walking through the document with the named admin. Run the first quarterly check together, with them driving. The goal is not to hand over a PDF — it is to verify they can execute it without you in the room.
### Tools to leave installed and working
Before you leave, confirm these are installed and the admin has run each at least once:
| Tool | Confirm working | Leave-behind |
|------|----------------|--------------|
| PingCastle | Run a healthcheck scan, admin can read the output | HTML report from today as the baseline |
| Purple Knight | Run a full scan, admin can read the indicators | PDF report from today as the baseline |
| CAExporter | Exported today's CA policies, stored in agreed location | JSON files from today as the known-good |
| Graph PowerShell module | Admin can connect and run the scripts in the cadence document | Scripts saved to the agreed local path |
| PnP PowerShell | Admin can connect to SharePoint admin and run the anonymous link export | Confirmed connected during the session |
Do not leave a tool installed that the admin has never run. An unfamiliar tool is not a capability — it is a task that will not get done.
### The baseline numbers
At close of engagement, record the opening and closing metrics in the tracking spreadsheet you set up with the admin. These are the numbers their quarterly PingCastle and Purple Knight runs will be compared against. Without a baseline, a quarterly scan is a point in time with no direction — with a baseline, it tells a story.
| Metric | Value at close of engagement |
|--------|------------------------------|
| PingCastle score | |
| Purple Knight: Critical indicators | |
| BloodHound paths to DA (standard user) | |
| Standing GA count (non-break-glass) | |
| Anonymous link count | |
| Stale guest count (90+ days inactive) | |
| CA policies verified to enforce | |
| Detection signals confirmed working | |
### "Call us" triggers — agree them explicitly
From the [cadence document](../assessment-templates/self-service-cadence.md), go through the trigger list out loud with the admin and confirm they understand each one. The list exists so they do not have to judge whether something is important enough to contact you — the bar is already defined.
The most important part of this conversation: *"When in doubt, contact us. We would rather look at a false alarm than hear about a real incident that sat for two weeks because you were not sure if it was worth mentioning."*
---
## What this engagement is not
**Not a red team.** The client knows you are here. You are working with them, not against them. When a simulation fires an alert, you tell the responder it is a test. The goal is to calibrate the detection, not to prove that you can evade it.
**Not a vulnerability scan.** You are not looking for unpatched CVEs or misconfigured services in bulk. You are validating the specific controls the client believes are in place.
**Not a compliance audit.** You will not produce a CIS score or a NIST gap report at the end. You will produce a list of controls that work and a list of controls that do not, measured by observation, with structural changes attached to the ones that do not.
**Not additive.** You are not recommending new tools, new policies, or new products. If something does not work, the fix is almost always to remove the exception, test the existing control, or eliminate the coupling — not to add a compensating control on top of the broken one.
---
*Field Guide — Adversarial Validation. Updated June 2026. Review alongside the main field guide — January 2027.*