Adversarial Validation Checklist
For clients who have done the foundational work. Everything here is tested, not inspected.
Last updated: June 2026
Engagement type: Phase 2 (mature estates)
Field guide: Adversarial Validation Field Guide
Next review: January 2027
How to use this
This checklist assumes the foundational controls are in place. The question is not "does this control exist" but "does this control work." Every item is a test. If an item cannot be tested in the current engagement window, mark it as untested and note it as a finding: an untested control is a broken control; you simply do not know it yet.
Before any test: confirm written authorization. Before the first test: capture baseline metrics (BloodHound path count, Entra role assignment export, CA policy JSON export). After the engagement: record the "after" metrics.
Notation:
[VERIFY] — confirm the claim against observed behavior
[SIMULATE] — run the attack or failure scenario, authorized and controlled
[MEASURE] — produce a number; the number is the finding, not pass/fail
Opening metrics (capture before first test)
- [MEASURE] BloodHound paths to Domain Admin (all paths; then filtered to paths reachable from standard user compromise)
- [MEASURE] Count of active (non-eligible) Global Admin assignments excluding break-glass
- [MEASURE] Count of active (non-eligible) Domain Admin assignments
- [MEASURE] Service principals with escalation-grade Graph permissions (application permissions)
- [MEASURE] CA policies verified to enforce (by prior observation) vs. total CA policies in scope
- [MEASURE] Distinct device IDs in sign-in logs (last 30 days) vs. Intune enrolled device count
- [MEASURE] Alert volume per day (last 30 days) vs. alerts with documented human response
- [MEASURE] Structural changes produced by the last five closed security incidents or alerts
- [MEASURE] Anonymous link count across SharePoint/OneDrive (existing, regardless of current tenant setting)
- [MEASURE] Backup MTTR from last documented restore (if any; if none, record "never tested")
Section 1 — Identity: the wall
1.1 Firebreak integrity
- [VERIFY] Pull all Global Admin members and check `onPremisesSyncEnabled` for each. Any `true` value is a P0. "We moved them to cloud-only" is the claim; this is the verification.
- [VERIFY] Trace every path from a simulated on-prem compromise (sync server connector account) to a cloud privileged role. Draw the graph. Each path is a hole in the wall.
- [VERIFY] For each cloud admin: what MFA device are they using, and is that device also used for email and browsing? A Tier 2 device authenticating a Tier 0 role is a tier violation through the MFA layer.
- [VERIFY] Does any admin's MFA authenticator app depend on a phone number or device that is outside the client's MDM? (MFA backup codes stored in iCloud are a personal device dependency for a privileged role.)
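The first check in this list can be run offline against an export of the role members. The following is a minimal sketch, assuming the members were exported as a list of records that include `userPrincipalName` and `onPremisesSyncEnabled` (the export must request that property explicitly); the function name and sample accounts are hypothetical.

```python
from typing import Iterable


def synced_global_admins(members: Iterable[dict]) -> list[str]:
    """Return UPNs of role members whose accounts are synced from on-prem AD."""
    flagged = []
    for member in members:
        # The property is true, false, or null; null means never synced.
        # Only an explicit True is treated as a finding.
        if member.get("onPremisesSyncEnabled") is True:
            flagged.append(member.get("userPrincipalName", "<unknown>"))
    return sorted(flagged)


# Hypothetical export of Global Admin members.
members = [
    {"userPrincipalName": "ga1@contoso.example", "onPremisesSyncEnabled": None},
    {"userPrincipalName": "ga2@contoso.example", "onPremisesSyncEnabled": True},
    {"userPrincipalName": "bg1@contoso.example", "onPremisesSyncEnabled": False},
]

print(synced_global_admins(members))
```

Any account in the returned list is a P0 under the rule above. An empty list only confirms the property value; it does not replace the path tracing in the next item.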
1.2 Break-glass: real test
- [SIMULATE] Sign in to the break-glass Global Admin account.
- [MEASURE] Time from sign-in to alert received by named responder.
- [VERIFY] Alert reaches the named responder (not just fires into a queue). Responder acknowledges.
- [VERIFY] Break-glass sign-in works with zero on-prem dependency (test while sync is stopped, or while on a network with no DC visibility).
- [VERIFY] Break-glass credentials can be retrieved from their storage location without the systems they are recovering (test retrieval physically or procedurally).
1.3 PIM enforcement
- [VERIFY] For Global Administrator role PIM settings: what is the MFA method required on activation? Confirm it is phishing-resistant (FIDO2 or certificate). Push-approve is a finding.
- [SIMULATE] Activate an eligible GA role from a personal device or a non-compliant device. Is it blocked by a CA policy scoped to role activation?
- [SIMULATE] Request activation requiring approval. Does the approval notification reach the approver with meaningful context (what role, for whom, what justification)? Does the approver act within SLA?
- [MEASURE] Maximum activation time box for GA and Privileged Role Admin. Record in hours. 24-hour window = functionally standing privilege during business hours.
- [VERIFY] Are there any GA assignments that are active (permanent) and are not break-glass accounts? Pull the list; any result is a PIM compliance gap from configuration drift.
1.4 AD FS (if still running)
- [MEASURE] Token-signing certificate age in days since last rotation.
- [SIMULATE] Golden SAML tabletop: if the private key were obtained, what alert (if any) would fire? Walk through the detection path. Document what is visible and what is not.
- [VERIFY] Is there a signed migration plan with a named date? If not, document as P0 finding. Migration tooling is mature; absence of a plan is a decision, not a default.
1.5 Connector account monitoring
- [SIMULATE] Authenticate as the Entra connector account (Directory Synchronization Accounts) from a host other than the sync server. Does an alert fire?
- [MEASURE] Time from test authentication to alert receipt.
- [VERIFY] If no alert fires: the most DCSync-capable account in the estate is unmonitored. Document as P0.
1.6 Seamless SSO / AZUREADSSOACC
- [VERIFY] `Get-ADComputer AZUREADSSOACC -Properties PasswordLastSet`: compare to approximate tenant go-live date. If matching: never rotated.
- [VERIFY] If Seamless SSO is not needed for the current device estate (Entra-joined devices on modern auth): document removal as a quick win.
Section 2 — Privilege: attack paths
2.1 BloodHound / attack path analysis
- [MEASURE] Total BloodHound paths to Domain Admin.
- [MEASURE] Shortest path (fewest hops) to Domain Admin from a standard user account. Enumerate the specific path.
- [MEASURE] Number of paths involving Kerberoastable service accounts.
- [MEASURE] Number of paths involving ADCS templates (add ACL collection to BloodHound run).
- [VERIFY] Has anyone on the client team reviewed BloodHound output in the last 90 days? If not, the path count from the last review is the stale baseline, not the current state.
2.2 Kerberoasting: attack and detection
- [SIMULATE] Run `Invoke-Kerberoast` or `Rubeus kerberoast` (authorized, test account as origin).
- [VERIFY] Did Defender for Identity, Sentinel, or any SIEM alert on the TGS request pattern?
- [MEASURE] Time from attack to alert receipt (if alert fires).
- [SIMULATE] Attempt to crack the harvested hashes offline. Record which accounts crack and approximate crack time.
- Finding: accounts that crack quickly + no detection = P0 on both the account and the detection gap.
2.3 ADCS
- [VERIFY] Run `certipy find` or `Certify.exe find /vulnerable` against the CA. Document any ESC findings.
- [VERIFY] Is the ADCS server on a dedicated Tier 0 or hardened host, or on a standard server? Check who has local admin access.
- [VERIFY] Are there published certificate templates with "Supply subject in request" and enrollment permissions broader than the intended service? (ESC1 pattern)
- [SIMULATE] If ESC1 is found: demonstrate the exploit path (in authorized test context: enroll a cert for a test admin account using the vulnerable template). Show the client the domain admin cert in hand.
2.4 Service principal dark matter
- [VERIFY] For each service principal with escalation-grade application permissions: ask the room to identify the current owner and current use case. Document every "I don't know."
- [VERIFY] For each: check `lastSignInDateTime` for the service principal. Unused principal + dangerous permissions + non-expiring secret = standing credential that can be activated any time.
- [VERIFY] Are there app registrations with admin consent granted for `Mail.Read`, `Files.ReadWrite.All`, or equivalent, where the granting user or admin is no longer at the organization?
- [SIMULATE] Attempt to use a service principal with dangerous Graph permissions to escalate: assign a role, add an app role assignment, or read all users. Confirm the permission is real and enforced (not just declared).
2.5 Standing privilege beyond PIM
- [VERIFY] Pull active (not eligible) role assignments for GA, PRA, Security Admin, Exchange Admin. Any active assignment not in the break-glass inventory is a drift finding.
- [VERIFY] Pull Domain Admins and Enterprise Admins. Count them. Ask the client how many they believe exist. Present the actual count. In most estates, the actual count exceeds the belief.
- [VERIFY] Are there administrator accounts with no associated human: service accounts running with Domain Admin because "it was easier at the time"?
2.6 Local privilege on endpoints
- [VERIFY] Pull local Administrators group membership across a sample of endpoints (10+ devices). Are there accounts beyond the expected (LAPS-managed local admin, Entra-joined device admin, EPM)?
- [VERIFY] Is Windows LAPS deployed and confirmed working? Retrieve a LAPS password for a test device through Intune or the AD attribute. Confirm rotation has occurred (password age < 30 days or per policy).
- [VERIFY] If EPM is deployed: test an elevation request for a controlled binary. Is it logged? Is the log reviewed by anyone?
Section 3 — Devices: compliance signal gap
3.1 CA policy enforcement (test each separately)
For each CA policy in scope, write the expected outcome before looking at the configuration. Then test:
- [SIMULATE] Legacy auth block: Authenticate using Basic Auth from a test account (Exchange ActiveSync, SMTP auth, or equivalent). Expected: blocked. Result: ___
- [SIMULATE] Compliant device gate: Sign in from a known non-compliant device (personal device, or a managed device taken out of compliance). Expected: blocked from sensitive workloads. Result: ___
- [SIMULATE] Admin sign-in location gate: Attempt a PIM role activation from a device outside the named compliant/PAW scope. Expected: blocked. Result: ___
- [SIMULATE] MFA enforcement: Sign in as a test user from a new device with no registered session. Expected: MFA challenged. Confirm the MFA method that fires (push-approve vs. FIDO2). Result: ___
- [VERIFY] For any policy that fails to enforce despite correct displayed configuration: recreate from scratch, re-test. Document if ghost policy confirmed.
- [VERIFY] Are there CA policies in report-only mode that should be enabled? Report-only is a test state, not a permanent posture.
- [VERIFY] Break-glass accounts excluded from blocking policies: test the break-glass sign-in path specifically under the conditions a blocking policy would normally fire.
3.2 Compliance signal quality
- [SIMULATE] Induce a non-compliant state on a test managed device. Record the timestamp.
- [MEASURE] Time from non-compliance induction to Intune state update.
- [MEASURE] Time from non-compliance induction to CA token revocation / session block.
- [VERIFY] Is CAE (Continuous Access Evaluation) active for critical workloads? If yes, measure revocation time for a CAE-supported app vs. a non-CAE app. Present the gap.
- [SIMULATE] Root / jailbreak a test device. Does the jailbreak detection in the compliance policy trigger? How long does it take?
3.3 Fleet reality check
- [MEASURE] Distinct device IDs in sign-in logs (last 30 days).
- [MEASURE] Intune enrolled device count.
- [MEASURE] Devices in sign-in logs with device compliance state "non-compliant" or "unknown."
- [VERIFY] Are there legacy-auth sign-ins in the logs that bypass device compliance evaluation entirely? Filter by Client App = non-modern entries. Each entry is a device control bypass.
- [VERIFY] Pick 5 devices from the sign-in log that are not in Intune. What data do they have access to? What CA policy, if any, applies to them?
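The first two measurements reduce to a set difference once both lists are exported. The sketch below assumes both exports use the same identifier for a device (typically the Entra device ID; Intune's own device ID is a different value) and that sign-ins from unregistered devices carry an empty device ID. The function and field names are hypothetical.

```python
def fleet_gap(signin_device_ids, enrolled_device_ids):
    """Compare devices seen in sign-in logs against the enrolled fleet."""
    ids = list(signin_device_ids)
    # Sign-ins with no device ID come from unregistered devices; count them
    # separately instead of silently dropping them.
    blank = sum(1 for d in ids if not d)
    seen = {d for d in ids if d}
    enrolled = set(enrolled_device_ids)
    unmanaged = sorted(seen - enrolled)
    pct = round(100 * len(unmanaged) / len(seen), 1) if seen else 0.0
    return {
        "distinct_seen": len(seen),
        "enrolled": len(enrolled),
        "unmanaged": unmanaged,
        "unmanaged_pct": pct,
        "signins_without_device_id": blank,
    }


# Hypothetical exports: one ID per sign-in row, one ID per enrolled device.
result = fleet_gap(["a", "b", "b", "c", ""], ["a", "b", "d"])
print(result)
```

The `unmanaged` list is a natural source for the "pick 5 devices" item above. Devices that are enrolled but never appear in sign-in logs (here `d`) are a separate question and are not reported by this sketch.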
3.4 Update rings and rollback
- [VERIFY] Are update rings configured with a named pilot group and a broad group with deferral?
- [VERIFY] Is there a named person with the process to halt a broad ring update push? Do they know the procedure? Have they tested it?
- [SIMULATE] (If authorized and non-disruptive) Push a test configuration change to the pilot ring only. Confirm it stays in the pilot ring and does not propagate to broad without explicit promotion.
3.5 MAM boundary (per platform)
- [SIMULATE] On iOS: copy text from managed Outlook to an unmanaged app. Blocked or not?
- [SIMULATE] On Android: same test. (Do separately; behavior is not symmetric.)
- [SIMULATE] On iOS: "Open in" from a managed email attachment to Files app or an unmanaged viewer.
- [SIMULATE] On either platform: save to local storage or backup to iCloud/Google Drive.
- [VERIFY] For any gap found: confirm it reproduces after device reset. If it does, escalate to vendor. If it does not, investigate configuration.
Section 4 — Data: does protection travel
4.1 Label encryption in the wild
- [SIMULATE] Forward a Highly Confidential test document to an external test email address. Open it from a mail client with no tenant authentication. Does encryption prevent access?
- [SIMULATE] Download the same document to an unmanaged device. Does encryption require re-authentication to the tenant?
- [SIMULATE] Share the document via an anonymous link. Access from an unauthenticated browser. Does it open?
- [SIMULATE] Copy/paste content from the document on a managed device under a MAM policy. Is it blocked?
- [VERIFY] For any path where the document opens without authentication: this is an exfiltration route. Document the specific path, the expected control that should have blocked it, and the observed result.
4.2 DLP enforcement
- [SIMULATE] Send an email from a test account containing content matching a high-value DLP rule (credit card number pattern, national ID format, or the client's custom regex for crown-jewel content). Does DLP intercept it? What action fires (block, override, audit-only)?
- [SIMULATE] Upload the same content to a personal OneDrive or cloud storage from a managed device. Does DLP fire?
- [VERIFY] For DLP rules that fire in audit-only mode: what happens to the audit events? Are they reviewed? By whom? How often?
- [VERIFY] What is the false positive rate for high-sensitivity DLP rules? High false positive rates mean users have learned to override; the rule is not a control.
4.3 Anonymous links (existing population)
- [MEASURE] Full count of anonymous links across the tenant. (Not the current sharing setting; the existing links that predate any restriction.)
- [VERIFY] Confirm at least one existing anonymous link resolves from an unauthenticated browser. It almost certainly does. This proves the declared sharing restriction is forward-looking, not retroactive.
- [VERIFY] Can the client produce the anonymous link list and revoke all entries in under 30 minutes? Test the revocation capability, not just the list.
4.4 Email exfiltration paths
- [SIMULATE] Create a test Inbox rule on a test account forwarding to an external test address. Does anything alert? When?
- [VERIFY] `Get-RemoteDomain Default | Select-Object AutoForwardEnabled`: if False, test whether the Inbox rule still forwards. Document the result (transport-level and client-rule forwarding behave differently).
- [VERIFY] `Get-TransportRule`: check for any rules with external redirect or blind copy. For each: who created it, when, and is there a documented owner?
- [MEASURE] Time from Inbox rule creation to detection alert (if any).
4.5 Guest access and reshare chain
- [MEASURE] Total guest count. Guests not signed in for 90+ days. Ratio of stale to active.
- [VERIFY] Do guests have access beyond their original project scope? Pick 5 random active guests and enumerate their group and site memberships.
- [SIMULATE] Share a test document to a test external guest. Have the guest reshare to a second external test account. Can the client observe the second hop? Can they revoke it?
- [VERIFY] Are access reviews running for guests? What is the default action on reviewer non-response?
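The guest count, stale count, and ratio in the first item can be computed from a guest export in one pass. This is a sketch under stated assumptions: each record has a hypothetical flat `lastSignIn` field holding an ISO 8601 timestamp or `None`, and a guest who has never signed in is counted as stale.

```python
from datetime import datetime, timedelta, timezone


def guest_staleness(guests, now, stale_after_days=90):
    """Count stale vs. active guests and the stale-to-active ratio."""
    cutoff = now - timedelta(days=stale_after_days)
    stale = active = 0
    for guest in guests:
        raw = guest.get("lastSignIn")
        if raw is None:
            stale += 1  # never signed in: treated as stale (assumption)
            continue
        # Normalise a trailing "Z" so fromisoformat works on older Pythons.
        seen = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if seen < cutoff:
            stale += 1
        else:
            active += 1
    ratio = stale / active if active else None
    return {"total": stale + active, "stale": stale,
            "active": active, "stale_to_active": ratio}


# Hypothetical guest export, evaluated as of 30 June 2026 (UTC).
guests = [
    {"lastSignIn": "2026-06-01T08:00:00Z"},
    {"lastSignIn": "2025-12-01T08:00:00Z"},
    {"lastSignIn": None},
    {"lastSignIn": "2026-04-15T00:00:00Z"},
]
summary = guest_staleness(guests, datetime(2026, 6, 30, tzinfo=timezone.utc))
print(summary)
```

Passing `now` explicitly keeps the measurement reproducible: the same export scored on the same date always yields the same number for the closing comparison.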
4.6 Audit log forensics readiness
- [VERIFY] Confirm audit logging is enabled (Purview > Audit: look for the "Start recording" banner; if it appears, logging is off).
- [SIMULATE] Run a forensic reconstruction: given a specific test user account, reconstruct everywhere they accessed data in the last 7 days. Can you produce a coherent picture from the audit log alone?
- [MEASURE] How far back does the audit log extend for the current licensing tier? Test by querying for a known event at the boundary date.
- [VERIFY] Are admin operations (CA policy changes, role assignments, app consent grants) present in the audit log? Run a query for admin events from the last 30 days and spot-check for completeness.
Section 5 — Detection: the eight simulations
For each simulation: run it, record whether the alert fired, record the time from event to human acknowledgment, and record whether the responder acted. The SLA comparison is the finding.
| Simulation | Alert fires? | Time to human | Action taken | Finding |
|---|---|---|---|---|
| Break-glass sign-in | | | | |
| New Global Admin assigned | | | | |
| DCSync from non-DC host | | | | |
| Kerberoasting (TGS pattern) | | | | |
| Impossible travel (admin account) | | | | |
| External auto-forward rule created | | | | |
| Mass download from SharePoint | | | | |
| OAuth consent grant (sensitive scope) | | | | |
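Filling the "Time to human" and "Finding" columns is simple arithmetic on two timestamps and the client's SLA. A minimal sketch, assuming the event time and the human acknowledgment time were recorded during the simulation and the SLA is expressed in minutes; the function name, finding labels, and sample values are hypothetical.

```python
from datetime import datetime


def score_simulation(name, event_at, acked_at, sla_minutes):
    """Turn one simulation's timestamps into a row for the table."""
    if acked_at is None:
        # The alert may have fired, but no human ever picked it up.
        return {"simulation": name, "minutes_to_human": None,
                "finding": "no human acknowledgment"}
    minutes = (acked_at - event_at).total_seconds() / 60
    finding = "within SLA" if minutes <= sla_minutes else "SLA missed"
    return {"simulation": name, "minutes_to_human": minutes,
            "finding": finding}


# Hypothetical results from two of the eight simulations.
row_a = score_simulation(
    "Break-glass sign-in",
    datetime(2026, 6, 10, 9, 0),
    datetime(2026, 6, 10, 9, 47),
    sla_minutes=30,
)
row_b = score_simulation(
    "DCSync from non-DC host",
    datetime(2026, 6, 10, 11, 0),
    None,
    sla_minutes=30,
)
print(row_a)
print(row_b)
```

An alert that fired but was never acknowledged is scored separately from one acknowledged late, since the two point to different structural fixes (ownership versus response time).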
5.1 Alert queue health
- [MEASURE] Alert volume per day (last 30 days).
- [MEASURE] Alerts with documented human response.
- [MEASURE] Alerts suppressed or auto-closed without human review.
- [MEASURE] Alerts open for more than 48 hours.
- [VERIFY] For every alert category: is there a named owner? An alert category with no named owner is an unread alert category.
- [VERIFY] Pick 5 alerts from the last 30 days that were closed. For each: what action was taken, and what structural change resulted?
5.2 The feedback loop test
- [MEASURE] Last 5 closed security incidents: structural changes produced (count removals, access reductions, severed couplings; not reminders, training, or "noted in risk register").
- [VERIFY] Is there a post-incident process that explicitly asks: "what structural thing changes as a result of this?"
- [VERIFY] Is the post-incident process blameless on people (encouraging surfacing) and ruthless on structure (demanding a removal or change)?
Section 6 — Recovery
6.1 Backup: restore something
- [SIMULATE] Restore a mailbox (or a mailbox item set) from the third-party backup. Time the operation.
- [MEASURE] Actual MTTR from test restore vs. policy-declared RTO.
- [VERIFY] If the actual MTTR exceeds the policy RTO: the policy is a fiction. Document the observed time as the operative figure.
- [VERIFY] Are backups isolated from the estate they protect? Can a Global Admin delete the backup copies?
- [VERIFY] Is there a third-party M365 backup at all? If not: M365 native recycle bin + version history is the only recovery mechanism, and this is a P0 for any organization with business-critical M365 data.
6.2 AD forest recovery
- [VERIFY] Does a written AD forest recovery runbook exist?
- [VERIFY] Is it stored where it can be retrieved when AD is down? (Not SharePoint. Not AD-authenticated storage.)
- [VERIFY] Has anyone on the team run the procedure (not a tabletop; an actual restore, even in a lab)?
- [VERIFY] Does the runbook include: DC restore sequence, metadata cleanup, double KRBTGT rotation, trust resets?
- Finding if all above are no: the first time AD forest recovery is performed will be during the real disaster. Document as a rehearsal scope item.
6.3 Configuration known-good
- [VERIFY] Export current CA policies to JSON. Diff against the opening-of-engagement export. For every difference: is there a change record?
- [VERIFY] Are there CA policies that changed since the last documented review without a corresponding change order?
- [VERIFY] If a CA policy was silently modified (intentionally or not), what mechanism would have detected it and when?
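The diff in the first item can be done structurally instead of by eye. The sketch below assumes each export is a list of policy objects keyed by a stable `id`, and it ignores `modifiedDateTime` so that the report names the fields that actually changed; the function name and sample policies are hypothetical.

```python
# Fields that change on every edit and would otherwise mask the real diff.
IGNORED_KEYS = {"modifiedDateTime"}


def diff_policies(before, after):
    """Report policies added, removed, or changed between two exports."""
    b = {p["id"]: p for p in before}
    a = {p["id"]: p for p in after}
    changed = {}
    for pid in sorted(a.keys() & b.keys()):
        keys = (set(a[pid]) | set(b[pid])) - IGNORED_KEYS
        differing = sorted(k for k in keys if a[pid].get(k) != b[pid].get(k))
        if differing:
            changed[pid] = differing
    return {
        "added": sorted(a.keys() - b.keys()),
        "removed": sorted(b.keys() - a.keys()),
        "changed": changed,
    }


# Hypothetical opening and current exports (normally loaded with json.load).
opening = [
    {"id": "p1", "displayName": "Block legacy auth", "state": "enabled"},
    {"id": "p2", "displayName": "Require compliant device", "state": "enabled"},
]
current = [
    {"id": "p1", "displayName": "Block legacy auth",
     "state": "enabledForReportingButNotEnforced"},
    {"id": "p3", "displayName": "Require MFA for admins", "state": "enabled"},
]
report = diff_policies(opening, current)
print(report)
```

Each entry in `added`, `removed`, and `changed` should map to a change record. In this sample, `p1` silently moving from enforced to report-only is exactly the kind of drift the third item asks about.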
6.4 Break-glass independence
- [VERIFY] Cloud admin recovery path works with no on-prem dependency. Confirm by testing while sync is stopped or from a network with no DC visibility.
- [VERIFY] If the primary MFA infrastructure (Microsoft Authenticator, FIDO2 key) is unavailable, is there a recovery path for privileged access that does not itself require privileged access?
Closing metrics (capture after engagement)
| Metric | Before | After | Delta |
|---|---|---|---|
| BloodHound paths to DA (from standard user) | | | |
| Active (non-break-glass) Global Admin assignments | | | |
| Active (non-break-glass) Domain Admin assignments | | | |
| CA policies verified by observation (working) | | | |
| Detection signals tested end-to-end (working) | | | |
| Anonymous link count | | | |
| Unmanaged device sign-in % of total | | | |
| Actual backup MTTR (minutes) | | | |
| Structural changes from last 5 incidents (before) | | | |
| Structural changes produced this engagement | | | |
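The Delta column can be filled mechanically from the opening and closing captures. A minimal sketch, assuming each capture is a mapping from metric name to a number, with `None` standing in for a metric that was never measured (for example a backup MTTR recorded as "never tested"); the function name and sample values are hypothetical.

```python
def metric_deltas(before, after):
    """Build (metric, before, after, delta) rows for the closing table."""
    rows = []
    for name, b in before.items():
        a = after.get(name)
        # A delta only exists when both sides were actually measured.
        delta = a - b if b is not None and a is not None else None
        rows.append((name, b, a, delta))
    return rows


# Hypothetical opening and closing captures.
opening_metrics = {
    "BloodHound paths to DA (from standard user)": 41,
    "Anonymous link count": 1200,
    "Actual backup MTTR (minutes)": None,  # never tested before engagement
}
closing_metrics = {
    "BloodHound paths to DA (from standard user)": 6,
    "Anonymous link count": 75,
    "Actual backup MTTR (minutes)": 190,
}
rows = metric_deltas(opening_metrics, closing_metrics)
for row in rows:
    print(row)
```

Leaving the delta empty when the opening value is missing keeps "never tested" visible in the table instead of presenting a first measurement as an improvement.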
Engagement close verification
Before marking the engagement complete:
- Every finding that was verified by observation has a structural change attached (not a risk register entry — a change).
- The closing metrics have been calculated and compared to the opening metrics.
- The break-glass has been tested and works.
- At least one backup restore has been timed and the MTTR recorded.
- At least one CA policy has been verified to enforce by a real sign-in with pre-written expected outcomes.
- At least one detection signal has been tested end-to-end to a human responder.
- The configuration-as-code export (CA policies, role assignments) has been stored and the client has it.
- A named date exists for the next adversarial validation cycle.
The engagement is not complete when the list is walked. It is complete when every finding from observation has become a structural change or a named, dated, owned commitment.
Adversarial Validation Checklist. Updated June 2026. Review alongside the field guide — January 2027.