feat: Add engagement checklist, adversarial validation, and self-service cadence

2026-06-09 11:48:07 +02:00
parent 0d52474c30
commit 3226e53f95
4 changed files with 1080 additions and 0 deletions
@@ -8,6 +8,9 @@ This directory contains diagnostic tools, maturity models, and assessment resour
 | Template | Purpose |
 |----------|---------|
 | [Engagement Checklist](engagement-checklist.md) | **Point-in-time, regularly updated.** Controls to inspect on every M365+AD engagement, organized by domain. Not scored — a structured inspection list. Review January 2027. |
 | [Adversarial Validation Checklist](adversarial-validation-checklist.md) | **Phase 2 — mature estates.** Every item is a test, not an inspection. Opening/closing metrics, eight detection simulations, CA ghost policy tests, attack path verification. Review January 2027. |
 | [Self-Service Cadence](self-service-cadence.md) | **Client leave-behind.** Monthly portal checks and quarterly tool runs (PingCastle, Purple Knight, CAExporter, PowerShell scripts) an admin can run between engagements. Includes "call us" triggers. Customise per client before handing over. |
 | [Assessment Team Guide](assessment-team-guide.md) | Technical execution guide for the Brownhat Diagnostic: tool sequence (ASTRAL, PULSAR, BloodHound, Elysium, Purple Knight, CAExporter), what to look for, kill chain synthesis, report structure, common mistakes. |
 | [Findings Backlog](findings-backlog.md) | Single source of truth for all findings across every module and diagnostic. The input queue for the housekeeping stream. Pragmatic alternative to a formal risk register for organisations that do not have one. |
 | [NIST CSF 2.0 Baseline Assessment](nist-csf-baseline.md) | The Brownhat Diagnostic: structured workshop run over two half-days, gap analysis, kill chain identification |
@@ -0,0 +1,319 @@
 # Adversarial Validation Checklist
 > *For clients who have done the foundational work. Everything here is tested, not inspected.*
 **Last updated:** June 2026
 **Engagement type:** Phase 2 — mature estates
 **Field guide:** [Adversarial Validation Field Guide](../books/field-guide-adversarial-validation.md)
 **Next review:** January 2027
 ---
 ## How to use this
 This checklist assumes the foundational controls are in place. The question is not "does this control exist" but "does this control work." Every item is a test. If an item cannot be tested in the current engagement window, mark it as untested and note it as a finding: **an untested control is a broken control; you simply do not know it yet.**
 Before any test: confirm written authorization. Before the first test: capture baseline metrics (BloodHound path count, Entra role assignment export, CA policy JSON export). After the engagement: record the "after" metrics.
 **Notation:**
 `[VERIFY]` — confirm the claim against observed behavior
 `[SIMULATE]` — run the attack or failure scenario, authorized and controlled
 `[MEASURE]` — produce a number; the number is the finding, not pass/fail
 ---
 ## Opening metrics (capture before first test)
 - `[MEASURE]` BloodHound paths to Domain Admin (all paths; then filtered to paths reachable from standard user compromise)
 - `[MEASURE]` Count of active (non-eligible) Global Admin assignments excluding break-glass
 - `[MEASURE]` Count of active (non-eligible) Domain Admin assignments
 - `[MEASURE]` Service principals with escalation-grade Graph permissions (application permissions)
 - `[MEASURE]` CA policies verified to enforce (by prior observation) vs. total CA policies in scope
 - `[MEASURE]` Distinct device IDs in sign-in logs (last 30 days) vs. Intune enrolled device count
 - `[MEASURE]` Alert volume per day (last 30 days) vs. alerts with documented human response
 - `[MEASURE]` Structural changes produced by the last five closed security incidents or alerts
 - `[MEASURE]` Anonymous link count across SharePoint/OneDrive (existing, regardless of current tenant setting)
 - `[MEASURE]` Backup MTTR from last documented restore (if any; if none, record "never tested")
 ---
 ## Section 1 — Identity: the wall
 ### 1.1 Firebreak integrity
 - `[VERIFY]` Pull all Global Admin members and check `onPremisesSyncEnabled` for each. Any `true` value is a P0. "We moved them to cloud-only" is the claim; this is the verification.
 - `[VERIFY]` Trace every path from a simulated on-prem compromise (sync server connector account) to a cloud privileged role. Draw the graph. Each path is a hole in the wall.
 - `[VERIFY]` For each cloud admin: what MFA device are they using, and is that device also used for email and browsing? A Tier 2 device authenticating a Tier 0 role is a tier violation through the MFA layer.
 - `[VERIFY]` Does any admin's MFA authenticator app depend on a phone number or device that is outside the client's MDM? (MFA backup codes stored in iCloud are a personal device dependency for a privileged role.)
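
The first check in 1.1 can be run offline against an exported member list. The sketch below is a minimal illustration, assuming the Global Admin members have already been exported to a list of records carrying `userPrincipalName` and `onPremisesSyncEnabled`; the sample accounts are hypothetical, and cloud-only accounts are assumed to report a null (`None`) value.

```python
def synced_admins(members):
    """Return the UPNs of role members that are synced from on-prem AD."""
    return [
        m["userPrincipalName"]
        for m in members
        if m.get("onPremisesSyncEnabled") is True
    ]


# Hypothetical export of Global Admin members (not real data).
members = [
    {"userPrincipalName": "ga-cloud@contoso.example", "onPremisesSyncEnabled": None},
    {"userPrincipalName": "j.doe@contoso.example", "onPremisesSyncEnabled": True},
]

# Every UPN printed here is a P0 under 1.1.
print(synced_admins(members))
```

Comparing with `is True` rather than truthiness keeps missing or null values out of the result, so only accounts that positively report as synced are flagged.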
 ### 1.2 Break-glass: real test
 - `[SIMULATE]` Sign in to the break-glass Global Admin account.
 - `[MEASURE]` Time from sign-in to alert received by named responder.
 - `[VERIFY]` Alert reaches the named responder (not just fires into a queue). Responder acknowledges.
 - `[VERIFY]` Break-glass sign-in works with zero on-prem dependency (test while sync is stopped, or while on a network with no DC visibility).
 - `[VERIFY]` Break-glass credentials can be retrieved from their storage location without the systems they are recovering (test retrieval physically or procedurally).
 ### 1.3 PIM enforcement
 - `[VERIFY]` For Global Administrator role PIM settings: what is the MFA method required on activation? Confirm it is phishing-resistant (FIDO2 or certificate). Push-approve is a finding.
 - `[SIMULATE]` Activate an eligible GA role from a personal device or a non-compliant device. Is it blocked by a CA policy scoped to role activation?
 - `[SIMULATE]` Request activation requiring approval. Does the approval notification reach the approver with meaningful context (what role, for whom, what justification)? Does the approver act within SLA?
 - `[MEASURE]` Maximum activation time box for GA and Privileged Role Admin. Record in hours. 24-hour window = functionally standing privilege during business hours.
 - `[VERIFY]` Are there any GA assignments that are active (permanent) and are not break-glass accounts? Pull the list; any result is a PIM compliance gap from configuration drift.
 ### 1.4 AD FS (if still running)
 - `[MEASURE]` Token-signing certificate age in days since last rotation.
 - `[SIMULATE]` Golden SAML tabletop: if the private key were obtained, what alert (if any) would fire? Walk through the detection path. Document what is visible and what is not.
 - `[VERIFY]` Is there a signed migration plan with a named date? If not, document as P0 finding — migration tooling is mature; absence of a plan is a decision, not a default.
 ### 1.5 Connector account monitoring
 - `[SIMULATE]` Authenticate as the Entra connector account (Directory Synchronization Accounts) from a host other than the sync server. Does an alert fire?
 - `[MEASURE]` Time from test authentication to alert receipt.
 - `[VERIFY]` If no alert fires: a synchronization account with directory-wide reach is unmonitored (its on-prem counterpart typically holds DCSync-grade replication rights). Document as P0.
 ### 1.6 Seamless SSO / AZUREADSSOACC
 - `[VERIFY]` `Get-ADComputer AZUREADSSOACC -Properties PasswordLastSet` — compare to approximate tenant go-live date. If matching: never rotated.
 - `[VERIFY]` If Seamless SSO is not needed for the current device estate (Entra-joined devices on modern auth): document removal as a quick win.
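
The date comparison in 1.6 is simple arithmetic once `PasswordLastSet` has been read. This is a rough sketch, assuming both dates are already known; the 30-day tolerance for "approximately matching" is an illustrative assumption, not a figure from this checklist.

```python
from datetime import date


def never_rotated(password_last_set, go_live, tolerance_days=30):
    """True if the key was last set within tolerance of tenant go-live."""
    return abs((password_last_set - go_live).days) <= tolerance_days


def key_age_days(password_last_set, today):
    """Age of the AZUREADSSOACC key in whole days."""
    return (today - password_last_set).days


# Hypothetical values for illustration only.
print(never_rotated(date(2021, 3, 4), date(2021, 3, 1)))
print(key_age_days(date(2026, 5, 10), date(2026, 6, 9)))
```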
 ---
 ## Section 2 — Privilege: attack paths
 ### 2.1 BloodHound / attack path analysis
 - `[MEASURE]` Total BloodHound paths to Domain Admin.
 - `[MEASURE]` Shortest path (fewest hops) to Domain Admin from a standard user account. Enumerate the specific path.
 - `[MEASURE]` Number of paths involving Kerberoastable service accounts.
 - `[MEASURE]` Number of paths involving ADCS templates (add ACL collection to BloodHound run).
 - `[VERIFY]` Has anyone on the client team reviewed BloodHound output in the last 90 days? If not, the path count from the last review is the stale baseline, not the current state.
 ### 2.2 Kerberoasting: attack and detection
 - `[SIMULATE]` Run Invoke-Kerberoast or Rubeus kerberoast (authorized, test account as origin).
 - `[VERIFY]` Did Defender for Identity, Sentinel, or any SIEM alert on the TGS request pattern?
 - `[MEASURE]` Time from attack to alert receipt (if alert fires).
 - `[SIMULATE]` Attempt to crack the harvested hashes offline. Record which accounts crack and approximate crack time.
 - Finding: accounts that crack quickly + no detection = P0 on both the account and the detection gap.
 ### 2.3 ADCS
 - `[VERIFY]` Run `certipy find` or `Certify.exe find /vulnerable` against the CA. Document any ESC findings.
 - `[VERIFY]` Is the ADCS server on a dedicated Tier 0 or hardened host, or on a standard server? Check who has local admin access.
 - `[VERIFY]` Are there published certificate templates with "Supply subject in request" and enrollment permissions broader than the intended service? (ESC1 pattern)
 - `[SIMULATE]` If ESC1 is found: demonstrate the exploit path (in authorized test context — enroll a cert for a test admin account using the vulnerable template). Show the client the domain admin cert in hand.
 ### 2.4 Service principal dark matter
 - `[VERIFY]` For each service principal with escalation-grade application permissions: ask the room to identify the current owner and current use case. Document every "I don't know."
 - `[VERIFY]` For each: check `lastSignInDateTime` for the service principal. Unused principal + dangerous permissions + non-expiring secret = standing credential that can be activated any time.
 - `[VERIFY]` Are there app registrations with admin consent granted for `Mail.Read`, `Files.ReadWrite.All`, or equivalent — where the granting user or admin is no longer at the organization?
 - `[SIMULATE]` Attempt to use a service principal with dangerous Graph permissions to escalate: assign a role, add an app role assignment, or read all users. Confirm the permission is real and enforced (not just declared).
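
The "unused principal + dangerous permissions" triage in 2.4 can be approximated offline. The sketch below assumes a pre-built export with hypothetical field names (`displayName`, `appPermissions`, `lastSignInDateTime` as a `datetime` or `None`); the permission names come from the engagement checklist's escalation-grade list, and the 90-day staleness window is an assumption. It does not check secret expiry, which would need a separate column.

```python
from datetime import datetime, timedelta, timezone

# Escalation-grade application permissions named in the engagement checklist.
ESCALATION_GRADE = {
    "RoleManagement.ReadWrite.Directory",
    "AppRoleAssignment.ReadWrite.All",
    "Application.ReadWrite.All",
    "Directory.ReadWrite.All",
}


def dark_matter(principals, now, stale_after_days=90):
    """Names of principals with escalation-grade permissions and no recent sign-in."""
    cutoff = now - timedelta(days=stale_after_days)
    flagged = []
    for sp in principals:
        dangerous = ESCALATION_GRADE & set(sp["appPermissions"])
        last = sp["lastSignInDateTime"]  # datetime, or None if never seen
        if dangerous and (last is None or last < cutoff):
            flagged.append(sp["displayName"])
    return flagged


# Hypothetical export (not real data).
now = datetime(2026, 6, 9, tzinfo=timezone.utc)
principals = [
    {"displayName": "legacy-hr-sync",
     "appPermissions": ["Directory.ReadWrite.All"],
     "lastSignInDateTime": None},
    {"displayName": "backup-agent",
     "appPermissions": ["Application.ReadWrite.All"],
     "lastSignInDateTime": datetime(2026, 6, 1, tzinfo=timezone.utc)},
    {"displayName": "report-reader",
     "appPermissions": ["User.Read.All"],
     "lastSignInDateTime": None},
]
print(dark_matter(principals, now))
```

The output is a shortlist for the "who owns this" conversation in the room, not a finding by itself.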
 ### 2.5 Standing privilege beyond PIM
 - `[VERIFY]` Pull active (not eligible) role assignments for GA, PRA, Security Admin, Exchange Admin. Any active assignment not in the break-glass inventory is a drift finding.
 - `[VERIFY]` Pull Domain Admins and Enterprise Admins. Count them. Ask the client how many they believe exist. Present the actual count. In most estates, the actual count exceeds the belief.
 - `[VERIFY]` Are there administrator accounts with no associated human — service accounts running with Domain Admin because "it was easier at the time"?
 ### 2.6 Local privilege on endpoints
 - `[VERIFY]` Pull local Administrators group membership across a sample of endpoints (10+ devices). Are there accounts beyond the expected (LAPS-managed local admin, Entra-joined device admin, EPM)?
 - `[VERIFY]` Is Windows LAPS deployed and confirmed working? Retrieve a LAPS password for a test device through Intune or the AD attribute. Confirm rotation has occurred (password age < 30 days or per policy).
 - `[VERIFY]` If EPM is deployed: test an elevation request for a controlled binary. Is it logged? Is the log reviewed by anyone?
 ---
 ## Section 3 — Devices: compliance signal gap
 ### 3.1 CA policy enforcement (test each separately)
 For each CA policy in scope, write the expected outcome before looking at the configuration. Then test:
 - `[SIMULATE]` **Legacy auth block:** Authenticate using Basic Auth from a test account (Exchange ActiveSync, SMTP auth, or equivalent). Expected: blocked. Result: ___
 - `[SIMULATE]` **Compliant device gate:** Sign in from a known non-compliant device (personal device, or a managed device taken out of compliance). Expected: blocked from sensitive workloads. Result: ___
 - `[SIMULATE]` **Admin sign-in location gate:** Attempt a PIM role activation from a device outside the named compliant/PAW scope. Expected: blocked. Result: ___
 - `[SIMULATE]` **MFA enforcement:** Sign in as a test user from a new device with no registered session. Expected: MFA challenged. Confirm the MFA method that fires (push-approve vs. FIDO2). Result: ___
 - `[VERIFY]` For any policy that fails to enforce despite correct displayed configuration: recreate from scratch, re-test. Document if ghost policy confirmed.
 - `[VERIFY]` Are there CA policies in report-only mode that should be enabled? Report-only is a test state, not a permanent posture.
 - `[VERIFY]` Break-glass accounts excluded from blocking policies — test the break-glass sign-in path specifically under the conditions a blocking policy would normally fire.
 ### 3.2 Compliance signal quality
 - `[SIMULATE]` Induce a non-compliant state on a test managed device. Record the timestamp.
 - `[MEASURE]` Time from non-compliance induction to Intune state update.
 - `[MEASURE]` Time from non-compliance induction to CA token revocation / session block.
 - `[VERIFY]` Is CAE (Continuous Access Evaluation) active for critical workloads? If yes, measure revocation time for a CAE-supported app vs. a non-CAE app. Present the gap.
 - `[SIMULATE]` Root / jailbreak a test device. Does the jailbreak detection in the compliance policy trigger? How long does detection take?
 ### 3.3 Fleet reality check
 - `[MEASURE]` Distinct device IDs in sign-in logs (last 30 days).
 - `[MEASURE]` Intune enrolled device count.
 - `[MEASURE]` Devices in sign-in logs with device compliance state "non-compliant" or "unknown."
 - `[VERIFY]` Are there legacy-auth sign-ins in the logs that bypass device compliance evaluation entirely? Filter the Client App column to legacy authentication clients. Each entry is a device control bypass.
 - `[VERIFY]` Pick 5 devices from the sign-in log that are not in Intune. What data do they have access to? What CA policy, if any, applies to them?
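
The fleet comparison in 3.3 reduces to a set difference once both lists are exported. A minimal sketch, assuming device IDs have been pulled from the sign-in log and from Intune into plain lists; the IDs shown are hypothetical, and sign-ins with no device ID are counted separately because they cannot be matched to any enrolled device.

```python
def fleet_gap(signin_device_ids, enrolled_device_ids):
    """Compare devices seen in sign-in logs with Intune-enrolled devices."""
    seen = set(signin_device_ids) - {"", None}
    enrolled = set(enrolled_device_ids)
    return {
        "distinct_in_logs": len(seen),
        "enrolled": len(enrolled),
        "unmanaged": sorted(seen - enrolled),
        "signins_without_device_id": sum(1 for d in signin_device_ids if not d),
    }


# Hypothetical exports (not real data).
signins = ["dev-a", "dev-b", "dev-b", "", "dev-c"]
enrolled = ["dev-a", "dev-d"]
print(fleet_gap(signins, enrolled))
```

The `unmanaged` list is a natural source for the five devices to investigate in the last item above.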
 ### 3.4 Update rings and rollback
 - `[VERIFY]` Are update rings configured with a named pilot group and a broad group with deferral?
 - `[VERIFY]` Is there a named person who owns the process to halt a broad ring update push? Do they know the procedure? Have they tested it?
 - `[SIMULATE]` (If authorized and non-disruptive) Push a test configuration change to the pilot ring only. Confirm it stays in the pilot ring and does not propagate to broad without explicit promotion.
 ### 3.5 MAM boundary (per platform)
 - `[SIMULATE]` On iOS: copy text from managed Outlook to an unmanaged app. Blocked or not?
 - `[SIMULATE]` On Android: same test. (Do separately — behavior is not symmetric.)
 - `[SIMULATE]` On iOS: "Open in" from a managed email attachment to Files app or an unmanaged viewer.
 - `[SIMULATE]` On either platform: save to local storage or backup to iCloud/Google Drive.
 - `[VERIFY]` For any gap found: confirm it reproduces after device reset. If it does, escalate to vendor. If it does not, investigate configuration.
 ---
 ## Section 4 — Data: does protection travel
 ### 4.1 Label encryption in the wild
 - `[SIMULATE]` Forward a Highly Confidential test document to an external test email address. Open it from a mail client with no tenant authentication. Does encryption prevent access?
 - `[SIMULATE]` Download the same document to an unmanaged device. Does encryption require re-authentication to the tenant?
 - `[SIMULATE]` Share the document via an anonymous link. Access from an unauthenticated browser. Does it open?
 - `[SIMULATE]` Copy/paste content from the document on a managed device under a MAM policy. Is it blocked?
 - `[VERIFY]` For any path where the document opens without authentication: this is an exfiltration route. Document the specific path, the expected control that should have blocked it, and the observed result.
 ### 4.2 DLP enforcement
 - `[SIMULATE]` Send an email from a test account containing content matching a high-value DLP rule (credit card number pattern, national ID format, or the client's custom regex for crown-jewel content). Does DLP intercept it? What action fires (block, override, audit-only)?
 - `[SIMULATE]` Upload the same content to a personal OneDrive or cloud storage from a managed device. Does DLP fire?
 - `[VERIFY]` For DLP rules that fire in audit-only mode: what happens to the audit events? Are they reviewed? By whom? How often?
 - `[VERIFY]` What is the false positive rate for high-sensitivity DLP rules? High false positive rates mean users have learned to override; the rule is not a control.
 ### 4.3 Anonymous links (existing population)
 - `[MEASURE]` Full count of anonymous links across the tenant. (Not the current sharing setting — the existing links that predate any restriction.)
 - `[VERIFY]` Confirm at least one existing anonymous link resolves from an unauthenticated browser. It almost certainly will, which shows that the declared sharing restriction is forward-looking, not retroactive.
 - `[VERIFY]` Can the client produce the anonymous link list and revoke all entries in under 30 minutes? Test the revocation capability, not just the list.
 ### 4.4 Email exfiltration paths
 - `[SIMULATE]` Create a test Inbox rule on a test account forwarding to an external test address. Does anything alert? When?
 - `[VERIFY]` `Get-RemoteDomain Default | Select-Object AutoForwardEnabled` — if False, test whether the Inbox rule still forwards. Document the result (transport-level and client-rule forwarding behave differently).
 - `[VERIFY]` `Get-TransportRule` for any rules with external redirect or blind copy. For each: who created it, when, and is there a documented owner?
 - `[MEASURE]` Time from Inbox rule creation to detection alert (if any).
 ### 4.5 Guest access and reshare chain
 - `[MEASURE]` Total guest count. Guests not signed in for 90+ days. Ratio of stale to active.
 - `[VERIFY]` Do guests have access beyond their original project scope? Pick 5 random active guests and enumerate their group and site memberships.
 - `[SIMULATE]` Share a test document to a test external guest. Have the guest reshare to a second external test account. Can the client observe the second hop? Can they revoke it?
 - `[VERIFY]` Are access reviews running for guests? What is the default action on reviewer non-response?
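
The guest staleness figure in 4.5 can be computed from an export of guest accounts. This is a sketch under stated assumptions: the field names (`mail`, `lastSignIn`) and sample guests are hypothetical, and a guest with no recorded sign-in is treated as stale.

```python
from datetime import datetime, timedelta, timezone


def guest_staleness(guests, now, stale_after_days=90):
    """Count stale vs. active guests and the stale-to-active ratio."""
    cutoff = now - timedelta(days=stale_after_days)
    stale = sum(
        1 for g in guests
        if g["lastSignIn"] is None or g["lastSignIn"] <= cutoff
    )
    active = len(guests) - stale
    ratio = stale / active if active else None  # undefined with no active guests
    return {"total": len(guests), "stale": stale, "active": active,
            "stale_to_active": ratio}


# Hypothetical export (not real data).
now = datetime(2026, 6, 9, tzinfo=timezone.utc)
guests = [
    {"mail": "a@partner.example", "lastSignIn": datetime(2026, 6, 1, tzinfo=timezone.utc)},
    {"mail": "b@partner.example", "lastSignIn": datetime(2025, 1, 15, tzinfo=timezone.utc)},
    {"mail": "c@vendor.example", "lastSignIn": None},
    {"mail": "d@vendor.example", "lastSignIn": datetime(2026, 5, 20, tzinfo=timezone.utc)},
]
print(guest_staleness(guests, now))
```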
 ### 4.6 Audit log forensics readiness
 - `[VERIFY]` Confirm audit logging is enabled (Purview > Audit — look for the "Start recording" banner; if it appears, logging is off).
 - `[SIMULATE]` Run a forensic reconstruction: given a specific test user account, reconstruct everywhere they accessed data in the last 7 days. Can you produce a coherent picture from the audit log alone?
 - `[MEASURE]` How far back does the audit log extend for the current licensing tier? Test by querying for a known event at the boundary date.
 - `[VERIFY]` Are admin operations (CA policy changes, role assignments, app consent grants) present in the audit log? Run a query for admin events from the last 30 days and spot-check for completeness.
 ---
 ## Section 5 — Detection: the eight simulations
 For each simulation: run it, record whether the alert fired, record the time from event to human acknowledgment, and record whether the responder acted. The SLA comparison is the finding.
 | Simulation | Alert fires? | Time to human | Action taken | Finding |
 |---|---|---|---|---|
 | Break-glass sign-in | | | | |
 | New Global Admin assigned | | | | |
 | DCSync from non-DC host | | | | |
 | Kerberoasting (TGS pattern) | | | | |
 | Impossible travel (admin account) | | | | |
 | External auto-forward rule created | | | | |
 | Mass download from SharePoint | | | | |
 | OAuth consent grant (sensitive scope) | | | | |
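
Once the table is filled in, the SLA comparison can be derived mechanically. The sketch below assumes each row has been recorded as a small record with hypothetical fields `alert_fired` and `minutes_to_human`; the sample values and the 30-minute SLA are illustrative, not results from any engagement.

```python
def sla_findings(results, sla_minutes):
    """Map each failing simulation to a short finding string."""
    findings = {}
    for name, r in results.items():
        if not r["alert_fired"]:
            findings[name] = "no alert"
        elif r["minutes_to_human"] is None:
            findings[name] = "alert fired, no human acknowledgment"
        elif r["minutes_to_human"] > sla_minutes:
            over = r["minutes_to_human"] - sla_minutes
            findings[name] = f"acknowledged {over} min over SLA"
    return findings


# Hypothetical recorded results (not real data).
results = {
    "Break-glass sign-in": {"alert_fired": True, "minutes_to_human": 12},
    "DCSync from non-DC host": {"alert_fired": False, "minutes_to_human": None},
    "Kerberoasting (TGS pattern)": {"alert_fired": True, "minutes_to_human": 95},
    "Mass download from SharePoint": {"alert_fired": True, "minutes_to_human": None},
}
for name, finding in sla_findings(results, sla_minutes=30).items():
    print(f"{name}: {finding}")
```

Simulations that met the SLA are omitted from the output, so the result reads directly as the Finding column for the rows that failed.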
 ### 5.1 Alert queue health
 - `[MEASURE]` Alert volume per day (last 30 days).
 - `[MEASURE]` Alerts with documented human response.
 - `[MEASURE]` Alerts suppressed or auto-closed without human review.
 - `[MEASURE]` Alerts open for more than 48 hours.
 - `[VERIFY]` For every alert category: is there a named owner? An alert category with no named owner is an unread alert category.
 - `[VERIFY]` Pick 5 alerts from the last 30 days that were closed. For each: what action was taken, and what structural change resulted?
 ### 5.2 The feedback loop test
 - `[MEASURE]` Last 5 closed security incidents: structural changes produced (count removals, access reductions, severed couplings — not reminders, training, or "noted in risk register").
 - `[VERIFY]` Is there a post-incident process that explicitly asks: "what structural thing changes as a result of this?"
 - `[VERIFY]` Is the post-incident process blameless on people (encouraging surfacing) and ruthless on structure (demanding a removal or change)?
 ---
 ## Section 6 — Recovery
 ### 6.1 Backup: restore something
 - `[SIMULATE]` Restore a mailbox (or a mailbox item set) from the third-party backup. Time the operation.
 - `[MEASURE]` Actual MTTR from test restore vs. policy-declared RTO.
 - `[VERIFY]` If the actual MTTR exceeds the policy RTO: the policy is a fiction. Document the observed time as the operative figure.
 - `[VERIFY]` Are backups isolated from the estate they protect? Can a Global Admin delete the backup copies?
 - `[VERIFY]` Is there a third-party M365 backup at all? If not: M365 native recycle bin + version history is the only recovery mechanism, and this is a P0 for any organization with business-critical M365 data.
 ### 6.2 AD forest recovery
 - `[VERIFY]` Does a written AD forest recovery runbook exist?
 - `[VERIFY]` Is it stored where it can be retrieved when AD is down? (Not SharePoint. Not AD-authenticated storage.)
 - `[VERIFY]` Has anyone on the team run the procedure — not a tabletop, an actual restore, even in a lab?
 - `[VERIFY]` Does the runbook include: DC restore sequence, metadata cleanup, double KRBTGT rotation, trust resets?
 - Finding if every answer above is no: the first AD forest recovery the team performs will be the real disaster. Document as a rehearsal scope item.
 ### 6.3 Configuration known-good
 - `[VERIFY]` Export current CA policies to JSON. Diff against the opening-of-engagement export. For every difference: is there a change record?
 - `[VERIFY]` Are there CA policies that changed since the last documented review without a corresponding change order?
 - `[VERIFY]` If a CA policy was silently modified (intentionally or not), what mechanism would have detected it and when?
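
The export-and-diff step in 6.3 can be sketched as a comparison of two policy lists keyed by policy `id`. This assumes both exports have already been loaded from JSON into lists of dictionaries; the sample policies are hypothetical. Fields that change on every save (timestamps, for example) will also register as a change, so strip them first if only semantic drift matters.

```python
def diff_policies(before, after):
    """Compare two CA policy exports and report added, removed, changed ids."""
    b = {p["id"]: p for p in before}
    a = {p["id"]: p for p in after}
    return {
        "added": sorted(a.keys() - b.keys()),
        "removed": sorted(b.keys() - a.keys()),
        "changed": sorted(i for i in a.keys() & b.keys() if a[i] != b[i]),
    }


# Hypothetical opening and closing exports (not real data).
before = [
    {"id": "p1", "displayName": "Block legacy auth", "state": "enabled"},
    {"id": "p2", "displayName": "Require compliant device", "state": "enabled"},
]
after = [
    {"id": "p1", "displayName": "Block legacy auth",
     "state": "enabledForReportingButNotEnforced"},
    {"id": "p3", "displayName": "Admin PAW gate", "state": "enabled"},
]
print(diff_policies(before, after))
```

Each id in the result should map to a change record; any that does not is the finding.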
 ### 6.4 Break-glass independence
 - `[VERIFY]` Cloud admin recovery path works with no on-prem dependency — confirm by testing while sync is stopped or from a network with no DC visibility.
 - `[VERIFY]` If the primary MFA infrastructure (Microsoft Authenticator, FIDO2 key) is unavailable, is there a recovery path for privileged access that does not itself require privileged access?
 ---
 ## Closing metrics (capture after engagement)
 | Metric | Before | After | Delta |
 |--------|--------|-------|-------|
 | BloodHound paths to DA (from standard user) | | | |
 | Active (non-break-glass) Global Admin assignments | | | |
 | Active (non-break-glass) Domain Admin assignments | | | |
 | CA policies verified by observation (working) | | | |
 | Detection signals tested end-to-end (working) | | | |
 | Anonymous link count | | | |
 | Unmanaged device sign-in % of total | | | |
 | Actual backup MTTR (minutes) | | | |
 | Structural changes from last 5 incidents (before) | | | |
 | Structural changes produced this engagement | | | |
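
The Delta column can be filled from the two captured metric sets. A minimal sketch, assuming each set is a mapping from metric name to a number, with `None` standing in for "never tested"; all figures are hypothetical.

```python
def deltas(before, after):
    """Return after minus before per metric; None where either side is missing."""
    out = {}
    for metric, b in before.items():
        a = after.get(metric)
        out[metric] = None if a is None or b is None else a - b
    return out


# Hypothetical opening and closing metrics (not real data).
before = {
    "BloodHound paths to DA (from standard user)": 412,
    "Anonymous link count": 1830,
    "Actual backup MTTR (minutes)": None,  # never tested before the engagement
}
after = {
    "BloodHound paths to DA (from standard user)": 37,
    "Anonymous link count": 0,
    "Actual backup MTTR (minutes)": 215,
}
for metric, delta in deltas(before, after).items():
    print(f"{metric}: {delta}")
```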
 ---
 ## Engagement close verification
 Before marking the engagement complete:
 - Every finding that was verified by observation has a structural change attached (not a risk register entry — a change).
 - The closing metrics have been calculated and compared to the opening metrics.
 - The break-glass has been tested and works.
 - At least one backup restore has been timed and the MTTR recorded.
 - At least one CA policy has been verified to enforce by a real sign-in with pre-written expected outcomes.
 - At least one detection signal has been tested end-to-end to a human responder.
 - The configuration-as-code export (CA policies, role assignments) has been stored and the client has it.
 - A named date exists for the next adversarial validation cycle.
 The engagement is not complete when the list is walked. It is complete when every finding from observation has become a structural change or a named, dated, owned commitment.
 ---
 *Adversarial Validation Checklist. Updated June 2026. Review alongside the field guide — January 2027.*
@@ -0,0 +1,378 @@
 # M365 + AD Engagement Checklist
 > *Not a benchmark. Not scored. A structured inspection list for consultants on active engagements.*
 **Last updated:** June 2026
 **Companion to:** [Field Guide 2026](../books/field-guide-2026.md) · [Books I–VI](../books/)
 **Next review:** January 2027
 ---
 ## How to use this
 Work through the relevant sections during the Brownhat Diagnostic or at the start of a module engagement. Each item is a control area — something to inspect and a question to answer honestly. Mark items that surface findings. Mark items that are verified clean. If an item is not applicable, note why.
 This is not a scoring tool. "Found" and "clean" are the only states that matter. A clean item with no evidence of testing is the same as not checked.
 **Notation used below:**
 - `[LOOK AT]` — inspect and document current state
 - `[TEST]` — verify by observation, not by reading the config
 - `[ASK]` — a question that requires a conversation, not just a portal check
 Nothing here replaces the governing question from Book I:
 > **If this is owned tonight, what is the largest thing an attacker reaches before hitting a wall — and can I draw that wall?**
 ---
 ## Section A — Hybrid Identity
 ### A1. Authentication Method
 - `[LOOK AT]` Which authentication method is actually in use: PHS, PTA, or Federation (AD FS)?
 - `[LOOK AT]` Does the method shown in the Entra portal match what is documented and what IT staff believe to be true?
 - `[TEST]` If on-prem AD is simulated as unavailable (pull the sync server), does cloud authentication survive? Which auth method does this actually prove?
 - `[LOOK AT]` Is PHS running alongside PTA as a failover? (Optionality — cheap insurance)
 - `[LOOK AT]` If on PTA: how many PTA agents are deployed, and what host/network tier are they on?
 ### A2. Sync Engine (Entra Connect / Cloud Sync)
 - `[LOOK AT]` Which sync engine is running: Entra Connect Sync or Entra Cloud Sync?
 - `[LOOK AT]` What server hosts the sync engine, and what domain/tier is it joined to?
 - `[LOOK AT]` What account runs the on-prem connector service, and does it have `Replicating Directory Changes All` (DCSync capability)?
 - `[LOOK AT]` What is the patch / update level of the sync server (OS and sync software)?
 - `[LOOK AT]` Who has local administrator rights on the sync server?
 - `[LOOK AT]` What does the Entra connector account (Directory Synchronization Accounts role) have permission to do in the cloud?
 - `[TEST]` If the connector account is monitored: does an alert fire when it authenticates from an unexpected host?
 - `[LOOK AT]` Are there active alerts or errors in the sync engine health dashboard?
 ### A3. AD FS
 - `[LOOK AT]` Is AD FS deployed and active?
 - `[ASK]` If yes: why is it still running? What relying party trusts require it, and is there a migration plan?
 - `[LOOK AT]` When was the token-signing certificate last rotated? Where is the private key stored?
 - `[LOOK AT]` Is the rollover certificate about to expire?
 - `[LOOK AT]` Which servers host AD FS, and what network tier and patching cadence do they have?
 - `[TEST]` Golden SAML tabletop: if the token-signing key were obtained, what would detection see, and how fast could the cert be rotated? Is the procedure written and tested?
 - `[ASK]` Is there an Entra staged rollout in progress or planned to migrate away from federation?
 ### A4. Privileged Account Sync
 - `[LOOK AT]` Are any Domain Admins, Enterprise Admins, or other Tier 0 accounts synced to Entra ID (i.e., present as cloud objects)?
 - `[LOOK AT]` Are Global Admins or other Entra privileged role holders cloud-only accounts, or synced from on-prem?
 - `[LOOK AT]` Are admin accounts (on-prem or cloud) using the same device for privileged work as for daily tasks (email, browsing)?
 ### A5. Writebacks
 - `[LOOK AT]` Which writebacks are enabled: password writeback, group writeback, device writeback?
 - `[ASK]` For each: who owns the decision, and is the reverse blast radius (cloud compromise → on-prem impact) documented?
 - `[LOOK AT]` Is group writeback (v2) enabled? If so, which cloud groups write into AD, and what on-prem resources do they gate?
 ### A6. Seamless SSO
 - `[LOOK AT]` Is Seamless SSO enabled?
 - `[LOOK AT]` When was the `AZUREADSSOACC` Kerberos key last rotated? (`Get-ADComputer AZUREADSSOACC -Properties PasswordLastSet`)
 - `[ASK]` Is Seamless SSO actually needed, or can it be removed (Entra-joined devices + modern auth typically do not require it)?
 ### A7. Sync Scope
 - `[LOOK AT]` Is sync scoped to specific OUs, or is "sync everything" the default?
 - `[LOOK AT]` Are there synced objects that serve no cloud purpose (decommissioned systems, service accounts, administrative accounts)?
 ### A8. Breach Optionality
 - `[ASK]` Is there a written, accessible runbook for severing the AD↔Entra bridge under breach conditions?
 - `[TEST]` Is the runbook stored somewhere accessible when both AD and SharePoint are unavailable?
 - `[ASK]` Has anyone walked through the "kill the sync" procedure, and does the team know what breaks per auth method?
 - `[LOOK AT]` Does the cloud admin path (break-glass Global Admin) work with zero on-prem dependency?
 ---
 ## Section B — Privileged Access
 ### B1. Standing Privilege Inventory
 - `[LOOK AT]` How many identities hold standing (permanent, active) privilege: Global Admin, Privileged Role Admin, Domain Admin, Enterprise Admin?
 - `[LOOK AT]` Are there any standing Global Admin assignments that are not break-glass accounts? (Should be zero)
 - `[LOOK AT]` How many Domain Admins and Enterprise Admins exist, and are they all justified with named owners?
 - `[ASK]` When was the privileged account list last reviewed, and by whom?
 ### B2. PIM / JIT
 - `[LOOK AT]` Is Entra PIM deployed and enforced for Entra administrative roles?
 - `[LOOK AT]` Are Entra roles set to eligible (not active) by default?
 - `[LOOK AT]` Does PIM activation require phishing-resistant MFA (FIDO2 / certificate), or just push-approve?
 - `[LOOK AT]` Do crown roles (Privileged Role Administrator, Global Administrator) require approval workflow on PIM activation?
 - `[LOOK AT]` What is the maximum activation time-box configured? (Should be justified and bounded — 8 hours maximum for a working day)
 - `[LOOK AT]` Is PIM alert configuration enabled (Roles activated without MFA, Redundant assignments, etc.)?
 - `[ASK]` For on-prem DA/EA: is there any JIT or time-limited elevation mechanism in place?
 ### B3. Service Accounts (On-Prem)
 - `[LOOK AT]` Are there service accounts with SPNs and static passwords older than 12 months? (Kerberoastable)
 - `[LOOK AT]` Which service accounts are over-permissioned (e.g., Domain Admin, local admin on all servers)?
 - `[LOOK AT]` Which service accounts have been migrated to gMSA?
 - `[LOOK AT]` Are there service accounts nobody can identify a current owner for?
 - `[TEST]` Run a Kerberoast simulation: do ticket requests for service account SPNs generate any detection?
 ### B4. Service Principals & App Registrations (Cloud)
 - `[LOOK AT]` Which app registrations hold escalation-grade Graph permissions (application permissions): `RoleManagement.ReadWrite.Directory`, `AppRoleAssignment.ReadWrite.All`, `Application.ReadWrite.All`, `Directory.ReadWrite.All`?
 - `[LOOK AT]` Which app registrations have non-expiring client secrets?
 - `[LOOK AT]` Are there orphaned app registrations with no current owner?
 - `[LOOK AT]` Which apps have tenant-wide admin consent, and is each justified and reviewed?
 - `[LOOK AT]` Which Azure workloads use client secrets instead of managed identities where managed identities are available?
 ### B5. Tier Model / Clean Source
 - `[LOOK AT]` Do Domain Admins / Enterprise Admins authenticate from standard workstations used for email and browsing?
 - `[LOOK AT]` Is ADCS (Active Directory Certificate Services) deployed? If so, is it on a Tier 0 or hardened host, or on a standard server?
 - `[LOOK AT]` Are there shared administrative jump boxes that cross tier boundaries (used for both Tier 0 and Tier 1 work)?
 - `[LOOK AT]` Do cloud admins use the same device for privileged Entra work as for daily activity?
 ### B6. Escalation Paths
 - `[LOOK AT]` Are there accounts with `GenericAll`, `WriteDACL`, or `WriteOwner` on high-value AD objects (domain root, DCs, admin groups) that are not themselves Tier 0?
 - `[LOOK AT]` Are there computers with unconstrained delegation enabled (excluding DCs)?
 - `[LOOK AT]` When was KRBTGT last rotated? (`Get-ADUser krbtgt -Properties PasswordLastSet`)
 - `[LOOK AT]` Is LAPS (Windows LAPS preferred) deployed across all workstations and servers? What is the coverage percentage?
 - `[TEST]` Run BloodHound (or equivalent) and count attack paths to Domain Admin. Note the number as a baseline. Is it going up or down over time?
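 The unconstrained delegation question in this list can be scripted. This is a rough sketch, assuming the RSAT ActiveDirectory module for the commented usage line. `Select-UnconstrainedDelegation` is a hypothetical helper; primary group IDs 516 and 521 are the well-known Domain Controllers and Read-only Domain Controllers groups.
 ```powershell
 # Hypothetical helper: from computer objects, keep those trusted for
 # unconstrained delegation that are not domain controllers.
 function Select-UnconstrainedDelegation {
     param([Parameter(ValueFromPipeline = $true)] $Computer)
     process {
         # 516 = Domain Controllers, 521 = Read-only Domain Controllers
         $isDc = $Computer.PrimaryGroupID -in 516, 521
         if ($Computer.TrustedForDelegation -and -not $isDc) {
             [PSCustomObject]@{ Name = $Computer.Name; DNSHostName = $Computer.DNSHostName }
         }
     }
 }
 # Usage against a live domain (requires the ActiveDirectory module):
 # Get-ADComputer -Filter 'TrustedForDelegation -eq $true' -Properties TrustedForDelegation, PrimaryGroupID | Select-UnconstrainedDelegation
 ```
 Any row returned is a host where a captured TGT can be reused, so each one needs a named justification or removal of the flag.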
 ### B7. Break-Glass
 - `[LOOK AT]` Do cloud-only break-glass Global Admin accounts exist?
 - `[LOOK AT]` Is phishing-resistant authentication (FIDO2 or certificate) configured on break-glass accounts?
 - `[LOOK AT]` Are break-glass accounts excluded from the CA policies that would otherwise enforce device compliance or block sign-in?
 - `[LOOK AT]` Does any use of the break-glass account trigger an immediate, monitored alert?
 - `[TEST]` Sign in to the break-glass account in a controlled drill. Does it work? Does the alert fire? Does someone respond?
 - `[ASK]` Where are the break-glass credentials stored, and can they be retrieved without the systems they recover?
 ### B8. Phishing-Resistant MFA for Admins
 - `[LOOK AT]` What MFA method is enforced for Global Admins: FIDO2, certificate-based auth, or push/SMS?
 - `[LOOK AT]` Is push-approve or SMS in use for any administrative account? Neither is acceptable for admins; any use is a P0.
 - `[LOOK AT]` Is there a CA policy restricting privileged role activation to compliant/managed devices or named PAWs?
 ---
 ## Section C — Devices & Endpoint
 ### C1. Fleet Reality
 - `[LOOK AT]` Reconcile: Intune enrolled devices vs. Entra registered devices vs. sign-in log device population. What is the gap?
 - `[LOOK AT]` How many sign-in events in the last 30 days came from non-compliant or unmanaged devices (device compliance state = unknown or non-compliant in sign-in logs)?
 - `[LOOK AT]` Are there legacy-protocol sign-ins (Basic Auth) that cannot perform MFA and so sidestep MFA-based Conditional Access? (Sign-in logs, filter Client App = "Exchange ActiveSync", "Other clients")
 - `[LOOK AT]` How many BYOD / personal devices are accessing corporate data through the web client or OWA (known-unmanaged population)?
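 The reconciliation in the first C1 item is a set comparison on device IDs. A minimal sketch: `Get-DeviceGap` and its output property names are hypothetical, and the commented usage assumes the Microsoft Graph PowerShell cmdlets and property names shown, which should be confirmed against the installed module version.
 ```powershell
 # Hypothetical helper: compares three lists of Entra device IDs and reports the gaps.
 function Get-DeviceGap {
     param(
         [string[]] $IntuneDeviceIds,
         [string[]] $EntraDeviceIds,
         [string[]] $SignInDeviceIds
     )
     [PSCustomObject]@{
         # Known to Entra but not enrolled in Intune
         RegisteredButUnmanaged  = @($EntraDeviceIds | Where-Object { $_ -and $_ -notin $IntuneDeviceIds } | Sort-Object -Unique)
         # Seen in sign-in logs with an ID that Entra does not list
         SignedInButUnregistered = @($SignInDeviceIds | Where-Object { $_ -and $_ -notin $EntraDeviceIds } | Sort-Object -Unique)
         # Sign-ins that carried no device ID at all (unregistered devices)
         SignInsWithoutDeviceId  = @($SignInDeviceIds | Where-Object { -not $_ }).Count
     }
 }
 # Usage (assumed cmdlets and properties from Microsoft Graph PowerShell):
 # $intune = (Get-MgDeviceManagementManagedDevice -All).AzureAdDeviceId
 # $entra  = (Get-MgDevice -All).DeviceId
 # $signIn = (Get-MgAuditLogSignIn -All).DeviceDetail.DeviceId
 # Get-DeviceGap -IntuneDeviceIds $intune -EntraDeviceIds $entra -SignInDeviceIds $signIn
 ```
 The three numbers together are the "gap" the item asks about: what is registered but unmanaged, what signs in but is unknown, and what signs in with no device identity.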
 ### C2. Join State and Management Mode
 - `[LOOK AT]` Are devices Entra-joined, hybrid Entra-joined, or Entra-registered (BYOD)?
 - `[LOOK AT]` Is hybrid Entra join still in use? If so, which on-prem dependencies actually require it?
 - `[LOOK AT]` Is there a roadmap to go cloud-native (Entra join + Intune) for devices currently on hybrid join?
 - `[LOOK AT]` Are there GPO and Intune co-management conflicts producing inconsistent configuration?
 ### C3. Conditional Access Enforcement
 - `[TEST]` For every CA policy that enforces device compliance or blocks legacy auth: run real sign-ins with expected outcomes written down beforehand. Does the observed result match?
 - `[TEST]` If a policy looks correct but does not enforce: recreate from scratch, re-test. Document ghost policy findings.
 - `[LOOK AT]` Is there a CA policy blocking legacy authentication protocols across all apps? (This is the single highest-leverage CA policy — if not in place, that is P0)
 - `[LOOK AT]` Is there a CA policy requiring MFA for all admin role activations?
 - `[LOOK AT]` Is there a CA policy requiring compliant or managed device for access to sensitive workloads?
 - `[LOOK AT]` Are break-glass accounts and emergency service accounts correctly excluded from blocking CA policies?
 - `[TEST]` Lock yourself out in report-only mode (simulate a compliance failure on an admin account). Confirm break-glass bypasses the policy. Confirm a legitimate admin gets the expected failure and knows the escalation path.
 ### C4. Compliance Signal Quality
 - `[LOOK AT]` What is the compliance check-in cadence? (The window where a fallen-out device still holds a "compliant" token)
 - `[LOOK AT]` Is Continuous Access Evaluation (CAE) enabled for workloads that support it? (Narrows the stale-token window)
 - `[ASK]` Is root/jailbreak detection in compliance policy, and how is it treated — as a hard block or a risk signal? Is it believed to be a wall or a tripwire?
 - `[TEST]` Break compliance on a test device (for example, root or jailbreak it). How long until the signal flips? Does CA revoke access?
 ### C5. Endpoint Privilege
 - `[LOOK AT]` Do standard users have standing local admin on their endpoints?
 - `[LOOK AT]` Is Endpoint Privilege Management (EPM) deployed, or is there a JIT elevation mechanism for tasks requiring admin rights?
 - `[LOOK AT]` Is Windows LAPS deployed across the fleet? Is legacy LAPS still in use (to be migrated)?
 - `[LOOK AT]` Are there shared local admin accounts with common passwords across multiple machines?
 ### C6. Update and Patch Velocity
 - `[LOOK AT]` Is Windows Autopatch in use (for update ring management)?
 - `[LOOK AT]` Are Intune update rings configured with pilot, broad, and deferral stages?
 - `[ASK]` Is there a named person with the authority and procedure to halt a broad update ring push? Has this been tested?
 - `[LOOK AT]` What is the current patch lag for the fleet (how many devices are 30+ days behind on OS updates)?
 ### C7. MAM / App Protection (BYOD)
 - `[TEST]` On iOS: attempt copy/paste from managed Outlook/Teams to an unmanaged app. Does it block?
 - `[TEST]` On Android: same test, separately — behavior is not symmetric with iOS.
 - `[TEST]` Attempt to "Open in" from a managed attachment to an unmanaged app on each platform.
 - `[TEST]` Attempt to save to local storage or sync to a personal cloud (iCloud, Google Drive).
 - `[LOOK AT]` Are managed browsers enforced for SharePoint/OWA access on BYOD, or can users access via any browser?
 ### C8. Autopilot and Enrollment Trust
 - `[LOOK AT]` Is the Autopilot device list audited? Are there stale or unknown device registrations?
 - `[LOOK AT]` Are enrollment restrictions in place to prevent unauthorized device enrollment?
 - `[TEST]` Time a wipe-and-reprovision on a corporate device via Autopilot. Is the "replaceable in an hour" claim accurate?
 - `[LOOK AT]` Is the PRT (Primary Refresh Token) TPM-bound on Windows devices?
 ---
 ## Section D — Data & Collaboration
 ### D1. Sharing Posture
 - `[LOOK AT]` What is the tenant-level external sharing setting in SharePoint Admin Center?
 - `[LOOK AT]` Are "Anyone with the link" anonymous shares enabled at the tenant level?
 - `[TEST]` Enumerate existing anonymous links across the tenant. Can you produce the list? How large is it?
 - `[LOOK AT]` Which sites still share at the tenant maximum, and which have been restricted below it? (A site's sharing setting can be more restrictive than the tenant setting, never more permissive.)
 - `[LOOK AT]` Are sharing expiration policies configured for anonymous and external links?
 - `[TEST]` Share a document to a test external guest and attempt to reshare onward. Can you track the second-hop share?
 ### D2. Guest Access
 - `[LOOK AT]` How many active guests exist in the tenant?
 - `[LOOK AT]` How many guests have not signed in for 90+ days?
 - `[LOOK AT]` Are access reviews configured for guest accounts? What is the review cadence and the default action on non-response?
 - `[LOOK AT]` Do guests have broader access than the project they were invited for (i.e., access to Teams/channels beyond their original scope)?
 - `[LOOK AT]` Are external identities governed by specific B2B collaboration settings, or is the default (all external domains) allowed?
 ### D3. Email Security
 - `[TEST]` Enumerate external auto-forwarding rules at the transport level (`Get-TransportRule`). Are there any active rules forwarding externally without a documented business owner?
 - `[TEST]` Enumerate Inbox rules on executive / privileged user mailboxes forwarding externally. (`Get-InboxRule`)
 - `[LOOK AT]` Is the global "allow automatic forwarding" setting disabled in Remote Domains for the Default domain?
 - `[LOOK AT]` Are anti-phishing policies configured? Is impersonation protection enabled for executives and key domains?
 - `[LOOK AT]` Is DKIM signing enabled for all sending domains?
 - `[LOOK AT]` Is DMARC configured (policy `reject` or `quarantine`), and is the SPF record current?
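 The DMARC item comes down to reading the `p=` tag of the domain's `_dmarc` TXT record. A minimal sketch: `Get-DmarcPolicy` is a hypothetical helper, `contoso.example` is a placeholder domain, and the commented lookup assumes the Windows `Resolve-DnsName` cmdlet.
 ```powershell
 # Hypothetical helper: extracts the p= policy from a DMARC TXT record string.
 # Returns 'invalid' when the string is not a DMARC record or has no p= tag.
 function Get-DmarcPolicy {
     param([string] $TxtRecord)
     if ($TxtRecord -notmatch '^\s*v=DMARC1\b') { return 'invalid' }
     # Match p= only at the start of a tag, so sp= (subdomain policy) is ignored
     if ($TxtRecord -match '(?:^|;)\s*p\s*=\s*(\w+)') { return $Matches[1].ToLower() }
     return 'invalid'
 }
 # Usage (placeholder domain):
 # $txt = (Resolve-DnsName -Name '_dmarc.contoso.example' -Type TXT).Strings -join ''
 # Get-DmarcPolicy $txt
 ```
 A result of `none` means DMARC is published but only monitoring, which does not meet the `reject` or `quarantine` bar in the item.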
 ### D4. Crown Jewels
 - `[ASK]` Can the client name the five data sets that, if exfiltrated, would cause the most damage?
 - `[LOOK AT]` Where do the crown jewels live (SharePoint sites, mailboxes, OneDrive, Teams channels)?
 - `[LOOK AT]` Who has access to the crown-jewel locations? Is access reviewed periodically?
 - `[LOOK AT]` Are the crown-jewel locations labeled with sensitivity labels that carry encryption?
 - `[LOOK AT]` Are audit logs turned on and retained long enough to reconstruct access to crown-jewel locations?
 ### D5. Sensitivity Labels and DLP
 - `[LOOK AT]` Are sensitivity labels deployed in the tenant? What is the coverage across the most-used content types (email, files)?
 - `[LOOK AT]` Are labels configured with encryption for the highest sensitivity tiers?
 - `[LOOK AT]` Is auto-labeling deployed for known crown-jewel content types (if licensed for M365 E5 Compliance)?
 - `[LOOK AT]` Is DLP deployed? Is it scoped to specific known-value patterns (regulated data, PII, crown-jewel keywords) or applied as a broad dragnet generating noise?
 - `[TEST]` Exfiltrate a labeled test document via email to an external address. Does DLP fire? Does the label encryption hold on the received document?
 ### D6. Collaboration Sprawl
 - `[LOOK AT]` Is there ungoverned self-service creation of Teams and SharePoint sites?
 - `[LOOK AT]` Are there orphaned or inactive Teams/sites that still hold data and have no active owner?
 - `[LOOK AT]` Are there Teams channels or SharePoint sites with "Everyone" or broad internal membership grants on sensitive data?
 - `[LOOK AT]` Is late-joiners' access to Team history governed (a user joining a Team today can read all prior messages by default)?
 ### D7. OAuth App Consent
 - `[LOOK AT]` Is user consent for OAuth apps restricted (users cannot consent to app permission requests without admin approval)?
 - `[LOOK AT]` Are there existing grants for apps holding `Mail.Read`, `Files.ReadWrite.All`, or equivalent sensitive scopes by non-first-party apps?
 - `[LOOK AT]` Is app governance (part of Microsoft Defender for Cloud Apps) enabled? Are risky app alerts configured?
 ### D8. Audit Logging
 - `[LOOK AT]` Is Unified Audit Logging enabled (confirm in Purview Compliance Center > Audit)?
 - `[LOOK AT]` What is the audit retention period, given the client's licensing?
 - `[TEST]` Run a sample audit query on a known recent activity and verify log entries are present. Do not assume the log is on without testing it.
 - `[LOOK AT]` Are admin operations (role assignment changes, app consent, CA policy changes) captured in the audit log?
 ---
 ## Section E — Recovery & Detection
 ### E1. Backup and Recovery
 - `[ASK]` What is the recovery path if a Global Admin deletes all Exchange Online mailboxes and SharePoint sites? Be specific about process, tool, and time estimate.
 - `[LOOK AT]` Is there a third-party M365 backup solution covering Exchange, SharePoint, OneDrive, and Teams?
 - `[LOOK AT]` Are M365 backups isolated from the estate they protect (immutable, separate authentication domain)?
 - `[TEST]` When was the last successful restore from backup, and how long did it take? Restore a test mailbox or a file share and time it. This is the MTTR.
 - `[LOOK AT]` Are on-prem AD backups (System State) taken regularly, stored offline, and verified?
 - `[TEST]` Can the current backup restore an AD domain if all DCs are destroyed? Has anyone run the forest recovery procedure, even in a lab?
 ### E2. Configuration-as-Code (Known-Good Baseline)
 - `[LOOK AT]` Have CA policies been exported to code/JSON (e.g., using CAExporter)?
 - `[LOOK AT]` Has the Entra role assignment state been captured as a document?
 - `[LOOK AT]` Has the Intune baseline configuration been exported?
 - `[LOOK AT]` Is there a diff between the opening state and current state for any changes made during the engagement?
 - `[ASK]` If the tenant CA policies were silently modified by an attacker, would anyone know? Is there drift detection against the known-good?
 ### E3. Recovery Path Independence
 - `[LOOK AT]` Does any part of the recovery runbook depend on the system it recovers (e.g., runbook stored in SharePoint, backup auth via the compromised AD)?
 - `[LOOK AT]` Are recovery credentials (break-glass, backup admin accounts) accessible independently of the estate?
 - `[LOOK AT]` Is the AD forest recovery runbook stored offline or in a location that survives domain destruction?
 - `[ASK]` If both AD and M365 were simultaneously unavailable, what is the recovery sequencing? Is that decision documented?
 ### E4. Detection: Signal Quality
 - `[LOOK AT]` Break-glass account use: is there an alert? Is it monitored by a named person?
 - `[LOOK AT]` New Global Admin assignment: does an alert fire?
 - `[LOOK AT]` DCSync from a non-DC host: is this detected (Defender for Identity or SIEM rule)?
 - `[LOOK AT]` Impossible-travel sign-in for admin accounts: is an Entra ID Protection sign-in risk policy configured and alerting?
 - `[LOOK AT]` External auto-forward rule creation: is this generating an alert?
 - `[LOOK AT]` Mass download from SharePoint/OneDrive: is there a Defender for Cloud Apps or Purview policy detecting it?
 - `[LOOK AT]` New OAuth consent grant to sensitive scopes: is this alerting?
 - `[LOOK AT]` PIM activation outside business hours: is this logged and reviewed?
 - `[TEST]` For each configured detection: simulate the event (in a controlled, authorized test context) and confirm the alert fires, is received by a named person, and generates a response within the expected SLA.
 ### E5. Detection: Noise and Action
 - `[ASK]` How many alerts does the monitoring system generate per day? How many are triaged vs. suppressed vs. missed?
 - `[ASK]` For the last three security incidents or notable alerts: what structural change resulted? If the answer is "we sent an awareness email" or "we noted it," the feedback loop is broken.
 - `[LOOK AT]` Is there a named owner for each alert category? An alert without a named owner is an unread alert.
 - `[ASK]` Is there a blameless post-incident process? Do people surface incidents, or do they bury them to avoid blame?
 ### E6. Game-Days and Drills
 - `[ASK]` When was the last deliberate test of recovery or detection (a drill, tabletop, or game-day)?
 - `[TEST]` Break-glass drill: sign in, confirm it works, confirm the alert fires. Document the test and the result.
 - `[TEST]` CA policy enforcement drill: force a non-compliant state on a test user. Confirm the expected outcome and that break-glass bypasses the gate.
 - `[ASK]` Has the client ever run a ransomware tabletop that assumes Tier 0 is owned? What did they find?
 ---
 ## Section F — Quick-Win Inventory
 Use this section to capture findings that can be addressed in the same session or within the engagement without additional scoping.
 Each of the following, if found to be the case, is a fix that typically takes under an hour and has immediate blast-radius reduction. Do not leave these open for the next engagement.
 | Control | Condition that makes it a quick win |
 |---------|-------------------------------------|
 | Tenant-level anonymous sharing | "Anyone" links enabled at tenant level — one toggle |
 | External auto-forwarding | Global block not set — one Exchange setting |
 | Legacy auth CA policy | No policy blocking legacy auth — deploy baseline CA policy |
 | Break-glass alert | Break-glass use not alerting — configure alert rule |
 | Global admins audit | Standing synced GAs — identify and initiate migration |
 | KRBTGT age | Password not set in 365+ days — document and schedule rotation |
 | Stale admin accounts | Inactive or unreviewed admin accounts — disable and document |
 | Audit log | Not enabled — turn on (one click in Purview) |
 | PIM not deployed | Entra ID P2 licensed but PIM off — scope activation as a P1 priority item |
 | No CA blocking admin sign-in from personal devices | Missing policy — create report-only immediately, test and enable |
 ---
 ## Engagement Close — Structural Change Verification
 At the close of each engagement or module, confirm:
 1. Which items above were found to be fragile?
 2. For each: what **structural change** was made (not documented, not accepted, but changed)?
 3. Which items were tested by observation (not just inspected)?
 4. Which items are open and in the risk register with a named owner and a timeline?
 5. Has the configuration-as-code baseline been exported and stored?
 6. Has the break-glass been tested?
 7. Is there a named date for the next review of this checklist?
 The work is not complete when the list is walked. It is complete when fragility found has become structure changed.
 ---
 *Engagement Checklist. Updated June 2026. Review and update alongside the Field Guide — January 2027.*
@@ -0,0 +1,380 @@
 # Self-Service Security Cadence
 > *What you run between our engagements. When something in here surprises you, that's when you call us.*
 **Last updated:** June 2026
 **Produced by:** [engagement name / consultant name]
 **For:** [client name] — [named admin / IT lead]
 **Next full engagement:** [date or "TBD"]
 **Next review of this document:** January 2027
 ---
 ## What this is
 We ran the adversarial validation. We fixed the structural issues we found. The work does not stop when we leave.
 This document is your recurring checklist — things you can run yourself, with the tools we set up, on a regular cadence. None of it requires a security background. Most of it takes under an hour per month. The point is to catch drift before it becomes a problem, and to know when to call us before it becomes a crisis.
 **The most important thing:** when something in here produces a result that surprises you, do not sit on it. Log it, screenshot it, and send it to us. The earlier we see a problem the cheaper it is to fix.
 ---
 ## Tools you need (all installed during the engagement)
 | Tool | What it does | Where to get it |
 |------|-------------|-----------------|
 | **PingCastle** | Scans Active Directory and produces a security report with a score and specific findings | [pingcastle.com](https://www.pingcastle.com) — free Community edition |
 | **Purple Knight** | Scans Active Directory for indicators of exposure — simpler output than PingCastle, good complement | [purple-knight.com](https://www.purple-knight.com) — free |
 | **CAExporter** | Exports all Conditional Access policies to JSON files you can compare over time | [github.com/vibecoding/CAExporter](https://github.com/vibecoding/CAExporter) |
 | **Microsoft Graph PowerShell** | The PowerShell module for the scripts in this document | `Install-Module Microsoft.Graph` |
 | **PnP PowerShell** | The PowerShell module for the SharePoint sharing-link check (Q7) | `Install-Module PnP.PowerShell` |
 | **Microsoft 365 Defender portal** | Your alert queue and Secure Score | [security.microsoft.com](https://security.microsoft.com) |
 | **Microsoft Entra portal** | Your identity dashboard | [entra.microsoft.com](https://entra.microsoft.com) |
 The scripts in this document are saved in `[location agreed during engagement — e.g., C:\SecurityRunbook\Scripts\]`.
 ---
 ## Monthly checks — 30 to 45 minutes, portal-based
 Do these on the first working day of each month. They require no special tools — just a browser logged in as a Global Admin or Security Reader.
 ---
 ### M1. Microsoft Secure Score
 **Where:** [Microsoft 365 Defender portal](https://security.microsoft.com) > Secure Score
 **What to do:**
 1. Note the current score.
 2. Compare to last month's score (the history graph shows it).
 3. Look at the "Recommended actions" tab — filter to "Not addressed."
 4. Any new items that appeared since last month? Note them.
 **What you are looking for:** Score going down month-over-month without a known reason. New recommended actions you did not create. Completed actions that have reverted to "not addressed" (this means configuration drifted back).
 **Call us if:** Score drops more than 5 points in a month without a documented reason, or if a completed action you remember implementing shows as "not addressed."
 ---
 ### M2. Entra ID Recommendations
 **Where:** [Entra portal](https://entra.microsoft.com) > Overview > Recommendations
 **What to do:**
 1. Look at all open recommendations.
 2. Note any that are new since last month.
 3. Note the impact rating (High / Medium / Low) on new ones.
 **What you are looking for:** New high-impact recommendations that appeared since last month. Specifically watch for anything related to admin accounts, Conditional Access, legacy authentication, or risky sign-ins.
 **Call us if:** Any new High-impact recommendation appears. We will help you assess whether to act immediately or schedule it.
 ---
 ### M3. Sign-in risk review
 **Where:** Entra portal > Identity Protection > Risky sign-ins
 **What to do:**
 1. Filter to the last 30 days.
 2. Look at sign-ins with risk level "High" that were not dismissed or remediated.
 3. For any admin account (Global Admin, Exchange Admin, Security Admin) with any risky sign-in event — investigate before dismissing.
 **What you are looking for:** Admin accounts appearing in the risky sign-in list. Any high-risk sign-in that auto-remediated (meaning the user passed an MFA challenge) where the geography or device does not make sense.
 **Call us if:** Any admin account has a risky sign-in event. Any high-risk event that was remediated from an unexpected location.
 ---
 ### M4. Alert queue health
 **Where:** Microsoft 365 Defender portal > Incidents & alerts > Alerts
 **What to do:**
 1. Filter to "New" and "In progress" alerts.
 2. How many are sitting open for more than 48 hours?
 3. Are there categories of alert that appear repeatedly? (Recurring alerts on the same user or asset are a pattern, not noise.)
 **What you are looking for:** Alert queue growing over time without being worked. The same alert firing repeatedly on the same account or resource. Any alert tagged as "High severity" that is more than 24 hours old without assignment.
 **Call us if:** A High-severity alert is more than 24 hours old and you do not know what to do with it. Or if the same alert keeps firing on the same account.
 ---
 ### M5. New admin assignments
 **Where:** Entra portal > Identity > Roles & admins > All roles > Global Administrator > Assignments
 **What to do:**
 1. Check the current member list against last month's.
 2. Any new members? Were they expected?
 3. Check at minimum: Global Administrator, Exchange Administrator, Security Administrator, SharePoint Administrator.
 **What you are looking for:** Anyone in a privileged role who should not be, or who appeared without a formal request.
 **Call us if:** Any new privileged role assignment you did not authorize or do not recognize.
 ---
 ### M6. Break-glass confirmation (30 seconds)
 **What to do:**
 1. Confirm the break-glass account credentials are still in the agreed storage location.
 2. Confirm the contact for "break-glass alert fired" is still the right person.
 Do not log in to the break-glass account during this check — any sign-in triggers an alert. Just confirm the credentials are accessible.
 **Call us if:** Credentials cannot be found. Or if the break-glass alert fires without a drill scheduled.
 ---
 ## Quarterly checks — 2 to 3 hours, tools required
 Do these in the first week of each quarter (January, April, July, October). These require running the installed tools and saving the output.
 ---
 ### Q1. PingCastle AD scan
 **How to run:**
 1. Log in to a domain-joined machine. PingCastle's health check reads the directory with ordinary domain-user rights, so a Domain Admin logon on a standard workstation is not needed.
 2. Run `PingCastle.exe --healthcheck --server <your-domain-FQDN>`.
 3. It produces an HTML report. Save it to `[agreed location]` with the date in the filename: `PingCastle-2026-Q3.html`.
 4. Open the report and note the score and any findings marked "Critical" or "High."
 5. Compare to the previous quarter's report. PingCastle scores risk from 0 to 100, so lower is better: is the score going up or down?
 **What you are looking for:** Score trending up quarter-over-quarter (risk is increasing). New Critical or High findings that were not present last quarter. Specifically watch the "Stale Objects" section (accounts nobody uses) and the "Privileged Accounts" section.
 **Call us if:** The score rises more than 10 points since last quarter. Any new Critical finding. Any finding in the "Privileged Accounts" category that was clean last quarter.
 ---
 ### Q2. Purple Knight AD scan
 **How to run:**
 1. Download and run Purple Knight on a domain-joined machine with Domain Admin credentials.
 2. It is a GUI tool — click through the scan, wait for it to finish.
 3. Save the PDF report with the date: `PurpleKnight-2026-Q3.pdf`.
 4. Look at the "Identity Security Indicators" with status "Exposed" or "Critical."
 5. Compare to the previous quarter.
 **What you are looking for:** New exposed indicators that did not appear last quarter. Any indicator flagged as Critical. The tool is organized by MITRE ATT&CK category — pay particular attention to "Credential Access" and "Privilege Escalation."
 **Call us if:** Any new Critical indicator. Or if the same Medium indicators keep appearing quarter after quarter without being resolved (this means the fix did not stick).
 ---
 ### Q3. KRBTGT and AZUREADSSOACC age check
 **How to run:** Open PowerShell as Domain Admin and run the following:
 ```powershell
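 # Requires the ActiveDirectory module (RSAT)
 Import-Module ActiveDirectory
 Write-Host "=== KRBTGT ===" -ForegroundColor Cyan
 Get-ADUser krbtgt -Properties PasswordLastSet |
  Select-Object @{N="Account";E={"krbtgt"}},
               PasswordLastSet,
               @{N="AgeDays";E={((Get-Date) - $_.PasswordLastSet).Days}}
 Write-Host "=== AZUREADSSOACC ===" -ForegroundColor Cyan
 # -Filter returns nothing when the account does not exist (Seamless SSO never set up);
 # an -Identity lookup raises an error that -ErrorAction SilentlyContinue does not suppress
 Get-ADComputer -Filter "Name -eq 'AZUREADSSOACC'" -Properties PasswordLastSet |
  Select-Object @{N="Account";E={"AZUREADSSOACC"}},
               PasswordLastSet,
               @{N="AgeDays";E={((Get-Date) - $_.PasswordLastSet).Days}}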
 ```
 Record the age in days in your tracking spreadsheet.
 **What you are looking for:** KRBTGT older than 365 days = P1 (schedule rotation with us). KRBTGT older than 180 days = note and plan. AZUREADSSOACC never rotated since initial sync setup = note.
 **Call us if:** KRBTGT is over 365 days old and there is no scheduled rotation. Or if either account shows a password age younger than expected (meaning someone rotated it without telling you — that is a finding too).
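 The thresholds in this check can be turned into a small classifier so the spreadsheet entry is consistent from quarter to quarter. A minimal sketch: `Get-KrbtgtAgeStatus`, its status labels, and the CSV path in the commented usage are hypothetical and should be adapted to the tracking location agreed during the engagement.
 ```powershell
 # Hypothetical helper: maps a KRBTGT password age in days to the action named in this check.
 function Get-KrbtgtAgeStatus {
     param([int] $AgeDays)
     if ($AgeDays -gt 365) { 'P1 - schedule rotation' }
     elseif ($AgeDays -gt 180) { 'Note and plan' }
     else { 'OK' }
 }
 # Usage (requires the ActiveDirectory module; the CSV path is a placeholder):
 # $age = ((Get-Date) - (Get-ADUser krbtgt -Properties PasswordLastSet).PasswordLastSet).Days
 # [PSCustomObject]@{ Date = Get-Date -Format 'yyyy-MM-dd'; Check = 'KRBTGT'; AgeDays = $age; Status = Get-KrbtgtAgeStatus -AgeDays $age } |
 #   Export-Csv -Path 'C:\SecurityRunbook\tracking.csv' -Append -NoTypeInformation
 ```
 Keeping the age history in one file also makes an unexpected reset visible: an age that is lower than last quarter's without a recorded rotation is the "younger than expected" case described above.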
 ---
 ### Q4. Cloud-only Global Admins check
 **How to run:**
 ```powershell
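 Connect-MgGraph -Scopes "Directory.Read.All"
 # Lists active role members only; PIM-eligible Global Admins do not appear here
 $gaRoleId = (Get-MgDirectoryRole -Filter "displayName eq 'Global Administrator'").Id
 $gaMembers = Get-MgDirectoryRoleMember -DirectoryRoleId $gaRoleId
 Write-Host "=== Global Admins ===" -ForegroundColor Cyan
 $gaMembers | ForEach-Object {
  $type = $_.AdditionalProperties['@odata.type']
  if ($type -eq '#microsoft.graph.user') {
    $user = Get-MgUser -UserId $_.Id -Property DisplayName,UserPrincipalName,OnPremisesSyncEnabled
    [PSCustomObject]@{
      Name            = $user.DisplayName
      UPN             = $user.UserPrincipalName
      SyncedFromAD    = [bool]$user.OnPremisesSyncEnabled  # cloud-only accounts return null, shown as False
    }
  } else {
    # Groups and service principals can hold the role too; list them rather than fail on Get-MgUser
    [PSCustomObject]@{ Name = $_.AdditionalProperties['displayName']; UPN = "($type)"; SyncedFromAD = $null }
  }
 } | Format-Table -AutoSize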
 ```
 Any row where `SyncedFromAD` is `True` is a P0 — call us immediately.
 **What you are looking for:** Any Global Admin that is synced from on-prem AD. Any new GA you did not create.
 **Call us if:** Any synced GA appears. Any GA you do not recognize.
 ---
 ### Q5. Service principal secrets check — expiring and never-expiring
 **How to run:**
 ```powershell
 Connect-MgGraph -Scopes "Application.Read.All"
 $today = Get-Date
 $warningDays = 60
 # Fetch app registrations once and flatten to one row per client secret
 $secrets = foreach ($app in Get-MgApplication -All) {
  foreach ($cred in $app.PasswordCredentials) {
    [PSCustomObject]@{ App = $app.DisplayName; Secret = $cred.DisplayName; Expires = $cred.EndDateTime }
  }
 }
 Write-Host "=== Non-expiring secrets ===" -ForegroundColor Red
 $secrets | Where-Object { $null -eq $_.Expires } |
  Select-Object App, Secret, @{N="Expires";E={"NEVER"}} | Format-Table
 # The date filter also catches secrets that have already expired
 Write-Host "=== Secrets expired or expiring within $warningDays days ===" -ForegroundColor Yellow
 $secrets | Where-Object { $null -ne $_.Expires -and $_.Expires -lt $today.AddDays($warningDays) } |
  Sort-Object Expires | Format-Table
 ```
 **What you are looking for:** Non-expiring secrets on any app registration. Secrets about to expire (these will break an application if not rotated — but they also need reviewing: is the app still needed?).
 **Call us if:** You find a non-expiring secret on an app you do not recognize. Or if you find an expiring secret and do not know which application or service it belongs to.
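 A secret with no end date is not the only "effectively non-expiring" case: a secret created through the API can carry an end date decades away. The sketch below flags those. `Select-LongLivedSecret` and the 730-day threshold are hypothetical choices made for illustration, not a Microsoft default.
 ```powershell
 # Hypothetical helper: flags client secrets whose remaining lifetime exceeds $MaxLifetimeDays.
 function Select-LongLivedSecret {
     param(
         [Parameter(ValueFromPipeline = $true)] $Application,
         [int] $MaxLifetimeDays = 730,
         [datetime] $Now = (Get-Date)
     )
     process {
         foreach ($cred in $Application.PasswordCredentials) {
             # Secrets without an end date are covered by the "NEVER" listing in the main script
             if ($null -eq $cred.EndDateTime) { continue }
             $daysLeft = ($cred.EndDateTime - $Now).Days
             if ($daysLeft -gt $MaxLifetimeDays) {
                 [PSCustomObject]@{ App = $Application.DisplayName; Secret = $cred.DisplayName; Expires = $cred.EndDateTime; DaysLeft = $daysLeft }
             }
         }
     }
 }
 # Usage (after Connect-MgGraph -Scopes "Application.Read.All"):
 # Get-MgApplication -All | Select-LongLivedSecret | Sort-Object DaysLeft -Descending | Format-Table
 ```
 Treat anything this returns the same way as a non-expiring secret: identify the owner, then shorten the lifetime or move the workload to a managed identity.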
 ---
 ### Q6. Stale guest review
 **How to run:**
 ```powershell
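 Connect-MgGraph -Scopes "User.Read.All", "AuditLog.Read.All"
 $cutoff = (Get-Date).AddDays(-90)
 # Keep guests who never signed in or whose last sign-in is older than the cutoff.
 # Sorting on the date itself avoids mixing numbers with the "Never" label:
 # never-signed-in guests come first, then the oldest sign-ins.
 Get-MgUser -Filter "userType eq 'Guest'" -All -Property DisplayName,Mail,CreatedDateTime,SignInActivity |
  Where-Object { -not $_.SignInActivity.LastSignInDateTime -or $_.SignInActivity.LastSignInDateTime -lt $cutoff } |
  ForEach-Object {
    $lastSignIn = $_.SignInActivity.LastSignInDateTime
    [PSCustomObject]@{
      Name        = $_.DisplayName
      Email       = $_.Mail
      Created     = $_.CreatedDateTime
      LastSignIn  = $lastSignIn
      DaysSinceSignIn = if ($lastSignIn) { ((Get-Date) - $lastSignIn).Days } else { "Never" }
    }
  } |
  Sort-Object LastSignIn |
  Format-Table -AutoSize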
 ```
 **What you are looking for:** Guests who have not signed in for 90+ days. Guests you do not recognize (external parties from concluded projects or former vendors).
 **Call us if:** The count of stale guests is growing quarter-over-quarter and nobody is pruning them. Or if a guest account appears that belongs to an external party from a concluded engagement and still has active access.
 ---
 ### Q7. Anonymous link count
 **How to run:** Connect using PnP PowerShell (installed during engagement):
 ```powershell
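 # Recent PnP.PowerShell releases require your own app registration for interactive sign-in:
 # if Connect-PnPOnline fails, add -ClientId <your app id> to each call
 Connect-PnPOnline -Url "https://[tenant]-admin.sharepoint.com" -Interactive
 $sites = Get-PnPTenantSite -IncludeOneDriveSites
 # Cmdlet availability varies by PnP.PowerShell version; confirm with: Get-Command Get-PnPSharingLinks
 # @() keeps .Count reliable when zero or one link is found
 $anonLinks = @(foreach ($site in $sites) {
  Connect-PnPOnline -Url $site.Url -Interactive
  Get-PnPSharingLinks | Where-Object { $_.SharingLinkType -eq "Anonymous" } |
    ForEach-Object { [PSCustomObject]@{ Site = $site.Url; Link = $_.ShareLink; Expires = $_.ExpirationDateTime } }
 })
 Write-Host "Total anonymous links: $($anonLinks.Count)" -ForegroundColor Yellow
 $anonLinks | Sort-Object Site | Format-Table
 # Save the export to the current directory for next quarter's comparison
 $anonLinks | Export-Csv -Path "AnonymousLinks-$(Get-Date -Format 'yyyy-MM-dd').csv" -NoTypeInformation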
 ```
 Record the count. Save the export.
 **What you are looking for:** Count increasing quarter-over-quarter (means new anonymous links are being created despite the policy). Links with no expiration date.
 **Call us if:** Count is increasing despite the restriction we put in place. Or if you find anonymous links on sites that hold sensitive data (HR, Finance, M&A).
 ---
 ### Q8. CA policy diff — detect drift
 **How to run:**
 ```powershell
 # CAExporter is set up from the engagement — run from its directory
 .\CAExporter.ps1 -ExportPath "C:\SecurityRunbook\CA-Exports\CA-$(Get-Date -Format 'yyyy-MM-dd')"
 ```
 Then compare this quarter's export folder to last quarter's using any file diff tool (WinMerge, or VS Code with the "compare folders" extension). For a quick list of policy files added or removed, `Compare-Object` on the file names is enough:
 ```powershell
 # Replace the two folder dates with last quarter's and this quarter's export dates
 $old = Get-ChildItem "C:\SecurityRunbook\CA-Exports\CA-2026-04-01" -File | Select-Object -ExpandProperty Name
 $new = Get-ChildItem "C:\SecurityRunbook\CA-Exports\CA-2026-07-01" -File | Select-Object -ExpandProperty Name
 Compare-Object $old $new
 ```
 The output lists only files present in one export and not the other. For policies present in both, compare the two JSON files side by side in the diff tool. The changed lines are the configuration drift.
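 Comparing file names surfaces added or removed policy files, but not files whose content changed. A minimal sketch of a content-level check follows, assuming CAExporter writes one file per policy with a stable file name and no per-run timestamp inside the file (check a sample export first; if a timestamp is embedded, every file will show as changed). `Compare-CAExport` is a hypothetical helper, not part of CAExporter:
 ```powershell
 # Hypothetical helper: classifies each policy file as Added, Removed or Changed
 # by comparing SHA-256 hashes of same-named files in two export folders.
 function Compare-CAExport {
     param(
         [Parameter(Mandatory)] [string] $OldPath,
         [Parameter(Mandatory)] [string] $NewPath
     )

     # Build file name -> hash lookups for both quarters
     $old = @{}
     Get-ChildItem -Path $OldPath -File | ForEach-Object {
         $old[$_.Name] = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
     }
     $new = @{}
     Get-ChildItem -Path $NewPath -File | ForEach-Object {
         $new[$_.Name] = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
     }

     $names = @($old.Keys) + @($new.Keys) | Sort-Object -Unique
     foreach ($name in $names) {
         if (-not $old.ContainsKey($name)) {
             $status = 'Added'
         } elseif (-not $new.ContainsKey($name)) {
             $status = 'Removed'
         } elseif ($old[$name] -ne $new[$name]) {
             $status = 'Changed'
         } else {
             continue   # identical in both exports
         }
         [PSCustomObject]@{ Policy = $name; Status = $status }
     }
 }
 # Example:
 # Compare-CAExport -OldPath "C:\SecurityRunbook\CA-Exports\CA-2026-04-01" `
 #                  -NewPath "C:\SecurityRunbook\CA-Exports\CA-2026-07-01" | Format-Table
 ```
 Every file reported as `Changed` is one to open in the diff tool.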
 **What you are looking for:** Policies deleted since last quarter. Policies whose parameters changed (exclusions added, scope narrowed, MFA grant changed to "grant without controls"). New policies in report-only mode that should have been enabled.
 **Call us if:** Any CA policy has changed without a corresponding change record. A policy that was enforcing is now in report-only mode. A new exclusion was added to a critical policy (legacy auth block, admin MFA, device compliance).
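 Two of these triggers (a state change, a new exclusion) can be checked mechanically for a single policy. A minimal sketch, assuming each exported file follows the Microsoft Graph `conditionalAccessPolicy` shape (`displayName`, `state`, `conditions.users.excludeUsers`, `conditions.users.excludeGroups`); CAExporter's actual output may differ, so check a sample file first. `Get-CAPolicyDrift` is a hypothetical helper:
 ```powershell
 # Hypothetical helper: compares two versions of one policy file and reports
 # a state change or newly added user/group exclusions.
 function Get-CAPolicyDrift {
     param(
         [Parameter(Mandatory)] [string] $OldFile,
         [Parameter(Mandatory)] [string] $NewFile
     )
     $old = Get-Content -LiteralPath $OldFile -Raw | ConvertFrom-Json
     $new = Get-Content -LiteralPath $NewFile -Raw | ConvertFrom-Json

     # e.g. enabled -> enabledForReportingButNotEnforced
     if ($old.state -ne $new.state) {
         [PSCustomObject]@{ Policy = $new.displayName; Change = "State: $($old.state) -> $($new.state)" }
     }

     foreach ($kind in 'excludeUsers', 'excludeGroups') {
         $before = @($old.conditions.users.$kind)
         $added  = @($new.conditions.users.$kind | Where-Object { $_ -and $_ -notin $before })
         if ($added.Count -gt 0) {
             [PSCustomObject]@{ Policy = $new.displayName; Change = "New ${kind}: $($added -join ', ')" }
         }
     }
 }
 ```
 This covers only state and user or group exclusions. Other parameters (applications, locations, grant controls) still need the manual diff.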
 ---
 ## "Call us" trigger list
 These are the situations where you stop, take a screenshot, and contact us — even outside a scheduled check:
 | What you see | How urgent | What to do first |
 |---|---|---|
 | Break-glass alert fires unexpectedly | Immediate | Revoke all active sessions for the break-glass account, then call us |
 | New Global Admin you did not create | Immediate | Do not remove it yet — screenshot first, then call us |
 | Synced account in Global Admin role | Same day | Do not change anything — screenshot and call us |
 | DCSync alert from Defender for Identity | Immediate | Isolate the source host from the network if possible, then call us |
 | External auto-forward rule found on any executive mailbox | Same day | Disable the rule, check what mail has already been forwarded, then call us |
 | PingCastle score drops more than 10 points | Within 48 hours | Send us the report alongside the previous quarter's |
 | Any High-severity alert open for more than 24 hours that you do not know how to triage | Within 24 hours | Screenshot, note what the alert says, call us |
 | Backup restore fails or produces corrupt data | Same day | Do not delete anything — call us |
 | Something that feels wrong but is not on this list | Use your judgement | A wrong feeling is data. Document what you noticed and send it. We will tell you if it is nothing. |
 ---
 ## Tracking spreadsheet columns
 Keep a simple spreadsheet (Excel or SharePoint list) with one row per check per quarter:
 | Date | Check | Result / Count | vs. Last Quarter | Action taken | Escalated to consultant? |
 |------|-------|---------------|-----------------|--------------|--------------------------|
 The trend matters more than any individual value. A metric that is consistently getting worse is a finding even if no single value crosses a threshold.
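 One way to make the trend visible is to flag any check whose count has risen at every one of the last few readings. A minimal sketch, assuming the spreadsheet is saved as CSV with the `Date`, `Check` and `Result / Count` columns above, dates in `yyyy-MM-dd` form, numeric counts, and checks where a higher count is worse; `Get-WorseningCheck` is a hypothetical helper:
 ```powershell
 # Hypothetical helper: lists checks whose count rose at each of the last N readings.
 function Get-WorseningCheck {
     param(
         [Parameter(Mandatory)] [string] $CsvPath,
         [int] $Quarters = 3   # number of most recent readings to examine
     )
     Import-Csv -Path $CsvPath | Group-Object -Property Check | ForEach-Object {
         $group = $_
         # yyyy-MM-dd dates sort correctly as text; keep the most recent readings
         $recent = @($group.Group | Sort-Object -Property Date | Select-Object -Last $Quarters)
         if ($recent.Count -lt $Quarters) { return }   # not enough history yet

         $values = @($recent | ForEach-Object { [double]$_.'Result / Count' })
         $rising = $true
         for ($i = 1; $i -lt $values.Count; $i++) {
             if ($values[$i] -le $values[$i - 1]) { $rising = $false; break }
         }
         if ($rising) {
             [PSCustomObject]@{ Check = $group.Name; Readings = $values -join ' -> ' }
         }
     }
 }
 # Example: Get-WorseningCheck -CsvPath .\tracking.csv -Quarters 3
 ```
 For checks where a lower value is worse (a PingCastle score drop, for example), the comparison would need to be inverted.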
 ---
 ## When to schedule the next full engagement
 Use this as a rule of thumb:
 - **Annual:** Full adversarial validation (the engagement that produced this document). Recommended even if the monthly and quarterly checks are clean — they catch drift, not adversarial paths.
 - **Triggered:** Any time an "Immediate" trigger from the list above fires, or PingCastle / Purple Knight produces a new Critical finding.
 - **Project-triggered:** Before any major change to the estate — AD migration, new cloud service onboarding, M365 license change, acquisition or merger, significant IT staff change.
 ---
 *Self-service cadence for [client name]. Produced June 2026. Review and update January 2027 alongside the field guide update.*