# Assignment: Intune Security Baseline
> *The device will be compromised. Compliant is not the same as secure, and the portal toggle is not the same as the device's behaviour. Build for the compromise, not against it.*
This is a **scoped assignment package** — a complete, principled delivery guide for one specific client brief. It closes the device-layer gap and activates the CA Layer 3 policies designed in [Assignment: CA Architecture](assignment-ca-architecture.md). It can be delivered standalone, but its full structural value is realised when CA Layer 3 is activated at the end.
---
## The Brief
Client requests that fall within this scope:
- *"Deliver a security baseline for our Intune-managed endpoints"*
- *"Set up Intune / we need device management"*
- *"We need compliant devices to be required for M365 access"*
- *"Our auditor wants evidence that devices are encrypted and patched"*
- *"We have Intune but nobody set up the security policies"*
- *"We're retiring SCCM and going cloud-native"* (if co-management migration is explicitly scoped)
This assignment does not require executive sponsorship. It requires one named IT lead with Intune Administrator access, tolerance for a grace period before enforcement, and an understanding that the enrollment rate at the start is almost never what the CMDB says.
---
## Scope Boundary
**In scope:**
- Device population mapping (what is actually authenticating, vs. what is enrolled, vs. what the CMDB says)
- Compliance policies: Windows, macOS, iOS, Android — as applicable to the fleet
- Device configuration profiles: Windows security baseline settings
- Windows Update rings (quality and feature updates)
- Windows LAPS (local admin password management)
- App Protection Policies for BYOD iOS and Android (MAM without MDM)
- Enrollment review and gaps (not a new enrollment deployment unless scoped separately)
- CA Layer 3 activation: connecting compliance state to Conditional Access
**Out of scope:**
- SCCM co-management migration → separate engagement (scope is complex and fleet-specific)
- Autopilot setup and Autopilot-based provisioning → separate deployment engagement
- EDR configuration: Defender for Endpoint advanced features, custom detection rules → separate or within E5 engagement
- WDAC / Smart App Control / application allowlisting → advanced application control engagement
- Driver and firmware update management → note as gap, recommend Windows Update for Business or third-party where Intune is insufficient
- GPO conflict resolution for hybrid-joined estates → flag; recommend cloud-native migration path
- Endpoint Privilege Management (JIT local admin elevation) → note as follow-on if standing local admin cannot be removed
When the client asks about SCCM migration or Autopilot, scope it separately. Co-management is a legitimate transitional architecture but it adds complexity that deserves its own scoped engagement with its own completion criteria.
---
## Before You Touch Anything
**1. Break-glass exclusion.**
Confirm that break-glass accounts are excluded from all device-compliance CA policies. A flaky compliance signal must never lock out tenant recovery. If CA Layer 3 is not yet designed, this step ensures the door is open when it is deployed.
**2. Four-population mapping.**
The CMDB is a claim. Authentication logs are facts. Before configuring compliance policies, build the real device picture from four sources:
| Population | Source |
|-----------|--------|
| **Enrolled (MDM)** | Intune device list |
| **Registered (Entra)** | Entra ID → Devices → All devices |
| **Authenticating** | Entra sign-in logs (30 days), filtered by device detail |
| **CMDB** | Whatever the client has |
Map the differences. Devices in sign-in logs but not in Intune are known-unmanaged — they reach data and you cannot apply compliance policies to them. Devices in the CMDB but not in sign-in logs may be retired equipment or offline devices that have never actually authenticated. The gap between enrolled and authenticating is the real finding, and it belongs in the leave-behind regardless of whether it is addressed in this engagement.
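Once all four exports share a join key, the delta analysis reduces to set arithmetic. The following is a minimal Python sketch under stated assumptions: each source has already been exported and normalised to one device identifier (for example the Entra device ID), and the function and field names are illustrative rather than part of any Intune or Graph API. CMDB records usually need mapping by serial number or hostname first, and sign-ins from unregistered devices may carry no device ID at all, so count those separately.

```python
from dataclasses import dataclass


@dataclass
class PopulationDelta:
    known_unmanaged: set          # authenticating, but not enrolled in Intune
    enrolled_silent: set          # enrolled, but no sign-in in the window
    cmdb_never_seen: set          # in the CMDB, absent from sign-in logs
    registered_not_enrolled: set  # registered in Entra, not MDM-enrolled


def map_populations(enrolled, registered, authenticating, cmdb):
    """Compare the four device populations, keyed on a shared device ID."""
    enrolled, registered = set(enrolled), set(registered)
    authenticating, cmdb = set(authenticating), set(cmdb)
    return PopulationDelta(
        known_unmanaged=authenticating - enrolled,
        enrolled_silent=enrolled - authenticating,
        cmdb_never_seen=cmdb - authenticating,
        registered_not_enrolled=registered - enrolled,
    )


# Hypothetical sample data: short strings stand in for device IDs.
delta = map_populations(
    enrolled={"a", "b"},
    registered={"a", "b", "c"},
    authenticating={"a", "c", "x"},
    cmdb={"a", "b", "z"},
)
print(sorted(delta.known_unmanaged))
```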
**3. Existing Intune policy audit.**
If Intune has been configured before — even partially — audit what exists before touching anything. Duplicate compliance policies, conflicting configuration profiles, and orphaned enrollment restrictions are common. A client who says "Intune is set up" often has one compliance policy created in 2021, three enrollment profiles nobody recognises, and a Windows security baseline applied to a group that no longer exists. Export the current state.
**4. CA Layer 3 status.**
Check whether `CA-AllUsers-AllApps-RequireCompliantDevice` exists in report-only mode from the CA Architecture assignment. If it does, this assignment ends by activating it. If it does not exist, design and deploy it in report-only mode as part of this assignment — but do not activate it until compliance coverage is proven.
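If the policy has to be created here, its shape can be sketched as a Microsoft Graph `conditionalAccessPolicy` request body. This is a hedged illustration, not the CA Architecture assignment's definitive design: the helper name and the break-glass object ID are placeholders, and the body should be checked against the current Graph documentation before it is posted anywhere. The sketch only builds the JSON; it makes no API call.

```python
import json

POLICY_NAME = "CA-AllUsers-AllApps-RequireCompliantDevice"


def build_layer3_policy(break_glass_ids, enforce=False):
    """Return a Conditional Access policy body requiring a compliant device."""
    if not break_glass_ids:
        # Never build the policy without the recovery accounts excluded.
        raise ValueError("break-glass exclusions are required")
    return {
        "displayName": POLICY_NAME,
        # Stay in report-only until compliance coverage is proven.
        "state": "enabled" if enforce else "enabledForReportingButNotEnforced",
        "conditions": {
            "users": {
                "includeUsers": ["All"],
                "excludeUsers": list(break_glass_ids),
            },
            "applications": {"includeApplications": ["All"]},
            "clientAppTypes": ["all"],
        },
        "grantControls": {"operator": "OR", "builtInControls": ["compliantDevice"]},
    }


# Placeholder object ID; replace with the tenant's real break-glass accounts.
policy = build_layer3_policy(["00000000-0000-0000-0000-000000000001"])
print(json.dumps(policy, indent=2))
```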
---
## Principles Applied
**Compliance is a signal, not a checkbox.**
A device marked compliant in Intune carries a staleness window: compliance is evaluated on check-in cadence, not continuously. A device can fall out of compliance — lose encryption, miss patches, be rooted — and still hold a valid compliant token and access grant for hours. Design around this: the compliance requirement at CA is a meaningful control that raises the cost of attack, not a guarantee of device integrity. Document what it is and what it isn't.
**Test on real devices, not portal configurations.**
A Conditional Access policy can show a perfectly correct configuration in the portal and enforce nothing. The same applies to compliance policies: a policy assigned to a group can appear active and produce no compliance results for enrolled devices whose group membership has drifted. And MAM/App Protection enforcement has documented gaps between the toggle and the actual device behaviour — gaps that vary by platform, OS build, and companion app version. For every control that matters, confirm it with a real device producing the expected result. Write the expected result down before you test, not after.
**Velocity with a brake.**
Update rings exist not to slow patching but to make patching safe at speed. An unbraked push to the entire fleet is one bad update away from a mass outage — the kind that stops production, not the kind that stops attackers. A canary ring with a real halt-and-rollback capability is the mechanism that lets the rest of the fleet patch fast and safely. The canary must be tested — an untested canary is just the first domino with a friendly name.
**The device is disposable; the data boundary is the protection.**
Every design decision in this assignment should ask: if this device is wiped and reprovisioned in an hour, does anything important break? A device that can be reprovisioned in an hour is antifragile. A device whose compromise is a crisis is fragile, regardless of how many compliance policies are applied to it. Build for reprovisionability: Autopilot, LAPS, application deployment from Intune, user profile from OneDrive. The compliance baseline hardens the device; the reprovision capability makes its loss survivable.
---
## Delivery Architecture
### Step 1 — Population Mapping and Audit (no changes)
| Action | Output |
|--------|--------|
| Four-population mapping (enrolled / registered / authenticating / CMDB) | Device population report: counts, deltas, known-unmanaged estimate |
| Existing compliance policy audit | Policy inventory: assignments, settings, mode, last modified |
| Existing configuration profile audit | Profile inventory: conflicts, orphaned assignments, platform coverage |
| Update ring inventory | Current rings or absence of rings |
| Sign-in log: device compliance state | What proportion of sign-ins carried a compliant device signal in the last 30 days |
| LAPS status | Whether Windows LAPS is deployed or legacy LAPS or neither |
Share the device population report with the named client lead before writing any policies. The finding is almost always the same: the managed fleet is smaller than assumed, the dark population is larger than assumed, and several CMDB entries have not authenticated in months. State it plainly.
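The sign-in log row in the table above can be computed from an exported set of sign-in records. This sketch assumes each record is a dictionary with a `deviceDetail` object carrying `deviceId` and `isCompliant`, which matches the Graph sign-in export as commonly seen; verify the field names against the actual export. The function name and sample records are hypothetical.

```python
def compliant_signin_share(sign_ins):
    """Summarise how many sign-ins carried a compliant device signal."""
    total = compliant = no_device = 0
    for record in sign_ins:
        detail = record.get("deviceDetail") or {}
        total += 1
        if not detail.get("deviceId"):
            # No device identity at all: the dark population.
            no_device += 1
        elif detail.get("isCompliant") is True:
            compliant += 1
    if total == 0:
        return {"total": 0, "compliant_pct": 0.0, "no_device_identity_pct": 0.0}
    return {
        "total": total,
        "compliant_pct": round(100 * compliant / total, 1),
        "no_device_identity_pct": round(100 * no_device / total, 1),
    }


# Hypothetical sample: two compliant, one noncompliant, one with no device ID.
sample = [
    {"deviceDetail": {"deviceId": "d1", "isCompliant": True}},
    {"deviceDetail": {"deviceId": "d2", "isCompliant": True}},
    {"deviceDetail": {"deviceId": "d3", "isCompliant": False}},
    {"deviceDetail": {"deviceId": "", "isCompliant": None}},
]
summary = compliant_signin_share(sample)
print(summary)
```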
---
### Step 2 — Compliance Policies (report mode first)
Deploy all compliance policies in report mode. Review results for 72 hours before activating noncompliance actions. The goal at this step is to see the real compliance state of the fleet — not to block anyone.
**Noncompliance action sequence (apply to all compliance policies):**
| Day | Action |
|-----|--------|
| 0 | Mark noncompliant (reporting only — this is immediate and always on) |
| 1 | Send email notification to user |
| 7 | Block access (activates when `CA-AllUsers-AllApps-RequireCompliantDevice` is enabled) |
| 30 | Retire device (for persistent noncompliance — confirm with client lead before activating) |
The 7-day grace window is not leniency — it is the window in which IT can identify and remediate legitimate noncompliance (device in repair, device offline, missed check-in) before a user is blocked. Without it, the first enforcement wave produces a support ticket flood. With it, enforcement is gradual and explainable.
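For the helpdesk runbook it helps to make the timeline computable. The sketch below models the schedule from the table above as plain data, so that anyone can answer "what has happened to this device so far, and how long until the user is blocked?" The action names and functions are illustrative only; they model this document's schedule, not Intune's internal behaviour.

```python
from datetime import date

# (day offset, action) pairs taken from the noncompliance action table.
ACTION_SCHEDULE = [
    (0, "mark_noncompliant"),
    (1, "email_user"),
    (7, "block_access"),
    (30, "retire_device"),
]


def actions_due(first_noncompliant, today, retire_approved=False):
    """List the actions that have fired for a device by `today`."""
    elapsed = (today - first_noncompliant).days
    due = [name for day, name in ACTION_SCHEDULE if elapsed >= day]
    if not retire_approved:
        # Retirement stays off until the client lead confirms it.
        due = [name for name in due if name != "retire_device"]
    return due


def days_until_block(first_noncompliant, today):
    """Remaining grace days before the day-7 block applies."""
    elapsed = (today - first_noncompliant).days
    return max(0, 7 - elapsed)


start = date(2025, 1, 1)
print(actions_due(start, date(2025, 1, 8)))
print(days_until_block(start, date(2025, 1, 4)))
```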
**Windows compliance policy — baseline settings:**
| Setting | Value | Rationale |
|---------|-------|-----------|
| BitLocker required | Yes | Unencrypted devices lose data on physical theft |
| OS minimum version | Windows 10 22H2 / Windows 11 22H2 | Below this: no Windows LAPS; OS in extended support only |
| Defender AV enabled | Yes | Baseline detection |
| Defender real-time protection | Yes | |
| Firewall enabled | Yes | |
| Secure boot enabled | Yes | Blocks bootkit-level compromise |
| TPM required | Yes (for new enrollments; consider exclusion group for legacy hardware) | PRT TPM-binding requires TPM |
| Password required | Yes | Minimum complexity, minimum length 8 |
| Maximum inactivity before screen lock | 15 minutes | |
Do not configure the compliance policy to evaluate Microsoft Defender for Endpoint risk score unless Defender for Endpoint P2 (E5) is licensed. Misconfiguring this setting against an E3 tenant produces false noncompliance for all devices.
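Before enforcement, the same baseline can be dry-run against an inventory export to predict who will fail and why. The sketch below is an offline approximation with hypothetical field names (`bitlocker`, `tpm`, and so on); it is not how Intune evaluates compliance. It assumes OS versions arrive as strings like `10.0.19045.3803`, treats builds from 22000 upward as Windows 11, and uses build 19045 and build 22621 as the 22H2 floors for Windows 10 and Windows 11 respectively.

```python
MIN_BUILD_WIN10 = 19045  # Windows 10 22H2
MIN_BUILD_WIN11 = 22621  # Windows 11 22H2

# Hypothetical inventory field names, one per boolean baseline setting.
REQUIRED_FLAGS = (
    "bitlocker",
    "defender_av",
    "realtime_protection",
    "firewall",
    "secure_boot",
    "tpm",
)


def os_build(version):
    """Extract the build number from a version string like 10.0.19045.3803."""
    return int(version.split(".")[2])


def baseline_failures(device, tpm_exempt=False):
    """Return the baseline settings this device record would fail."""
    failures = []
    for flag in REQUIRED_FLAGS:
        if flag == "tpm" and tpm_exempt:
            continue  # device sits in the legacy-hardware exclusion group
        if not device.get(flag, False):
            failures.append(flag)
    build = os_build(device["os_version"])
    minimum = MIN_BUILD_WIN11 if build >= 22000 else MIN_BUILD_WIN10
    if build < minimum:
        failures.append("os_version")
    return failures


legacy = {
    "os_version": "10.0.19044.1288",
    "bitlocker": True,
    "defender_av": True,
    "realtime_protection": True,
    "firewall": True,
    "secure_boot": True,
    "tpm": False,
}
print(baseline_failures(legacy))
```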
**macOS compliance policy (if fleet includes Macs):**
| Setting | Value |
|---------|-------|
| FileVault enabled | Yes |
| OS minimum version | macOS 13 (Ventura) or later |
| Password required | Yes |
| Firewall enabled | Yes |
| System Integrity Protection | Yes |
**iOS compliance policy:**
| Setting | Value |
|---------|-------|
| OS minimum version | iOS 16 or later |
| Passcode required | Yes |
| Jailbreak detection | Block jailbroken devices |
| Device threat level | Secured (no threat level tolerance) |
**Android compliance policy:**
| Setting | Value |
|---------|-------|
| OS minimum version | Android 12 or later |
| Device PIN required | Yes |
| Rooted devices | Block |
| Minimum security patch level | Within 90 days |
**The honest note on jailbreak/root detection:** detection is an arms race. A motivated attacker with a current tool bypasses it. Treat root detection as a tripwire that raises the cost of the attack, never as a barrier that stops it. Document this in the residual risk statement.
---
### Step 3 — Device Configuration Baseline
The Microsoft Windows Security Baseline (available in Intune → Endpoint security → Security baselines) is the starting point. It encodes Microsoft's recommended settings as an Intune profile that enforces continuously.
**Deployment approach:**
1. Deploy the Windows Security Baseline in **report mode** to a pilot group (10-20 devices, IT team first)
2. Review conflicts and configuration gaps for 48 hours
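3. Resolve any conflicts with existing policies before expanding. Overlapping profiles produce unpredictable results: Intune resolves overlapping compliance settings to the most restrictive value, but two configuration profiles that set different values for the same setting are reported as a conflict, and the setting may not apply at all.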
4. Expand to production groups
5. Monitor Intune reports for policy conflicts and noncompliance
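Conflict review during the pilot can be started offline from the exported policy inventory. The sketch below assumes each profile has been flattened into a mapping of setting name to value; that flattening, the profile names, and the setting names are all hypothetical. It flags any setting that two or more profiles define with different values.

```python
from collections import defaultdict


def find_conflicts(profiles):
    """Return settings that different profiles define with different values.

    `profiles` maps a profile name to a {setting: value} dictionary.
    """
    seen = defaultdict(dict)
    for profile_name, settings in profiles.items():
        for setting, value in settings.items():
            seen[setting][profile_name] = value
    return {
        setting: by_profile
        for setting, by_profile in seen.items()
        # repr() lets unhashable values such as lists be compared too.
        if len({repr(value) for value in by_profile.values()}) > 1
    }


# Hypothetical export: one stale profile disagrees with the baseline.
exported = {
    "Security Baseline": {"SmartScreen": "Enabled", "FirewallDomain": "On"},
    "Edge baseline": {"SmartScreen": "Enabled"},
    "Legacy 2021 profile": {"FirewallDomain": "Off"},
}
conflicts = find_conflicts(exported)
print(conflicts)
```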
**Additional configuration profiles (deploy after the security baseline is stable):**
| Profile | Purpose | Notes |
|---------|---------|-------|
| **BitLocker configuration** | Enable BitLocker silently, escrow recovery keys to Entra | Separate from compliance (compliance requires BitLocker; this profile configures how it's applied) |
| **Microsoft Defender AV** | Configure exclusions, scheduled scans, PUA protection | Do not configure AV exclusions broadly — each exclusion reduces coverage |
| **Firewall configuration** | Block inbound connections, logging | Complements compliance requirement |
| **Edge browser baseline** | SmartScreen, extension management, safe browsing, disable password manager sync | Applies to corporate Edge profile; test carefully — extension management can break legitimate workflows |
| **Windows Hello for Business** | Phishing-resistant authentication at device layer | If deploying phishing-resistant MFA (required by CA-Admins policy), WHfB is the most practical path |
---
### Step 4 — Update Rings
Update rings are the mechanism that makes patching fast and safe simultaneously. Deploy three rings minimum.
**Ring structure:**
| Ring | Assignment | Quality update deferral | Feature update deferral | Notes |
|------|-----------|------------------------|------------------------|-------|
| **Canary** | IT team (5-10 devices) | 0 days | 0 days | Takes every update immediately. Canary for production rings. Must include at least one machine that runs every critical business application. |
| **Pilot** | 10-15% of fleet, varied roles | 7 days | 30 days | Broad business representation. If Canary is clear after 7 days, Pilot proceeds. |
| **Production** | Remainder | 14 days | 90 days | Conservative deferral. If Pilot is clear after 7 days, Production proceeds. |
**Pause and rollback configuration:**
Configure Intune update rings with the pause capability enabled. Define in the client's runbook:
- Who has authority to pause an update ring (named person, not a committee)
- What the trigger is for pausing (Canary devices showing a known issue, not a vague "something might be wrong")
- Maximum pause duration before the pause is reviewed (7 days)
An untested pause capability is a fiction. Test it during the engagement: deploy an update to Canary, confirm it lands, pause the ring, confirm the pause holds, resume. This takes 30 minutes and is the only proof the mechanism works.
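For the runbook, the ring schedule and the pause review rule can be written down as a few lines of date arithmetic. The sketch below uses the quality update deferrals from the ring table and the 7-day maximum pause from the list above; the function names are illustrative, and it models the schedule only, not Intune's own scheduling.

```python
from datetime import date, timedelta

# Quality update deferral per ring, in days, from the ring structure table.
QUALITY_DEFERRAL_DAYS = {"Canary": 0, "Pilot": 7, "Production": 14}
MAX_PAUSE_DAYS = 7  # a pause older than this must be reviewed


def offer_dates(release):
    """Date on which each ring is first offered an update released on `release`."""
    return {
        ring: release + timedelta(days=days)
        for ring, days in QUALITY_DEFERRAL_DAYS.items()
    }


def pause_needs_review(paused_on, today):
    """True once a ring pause has reached the maximum unreviewed duration."""
    return (today - paused_on).days >= MAX_PAUSE_DAYS


schedule = offer_dates(date(2025, 3, 11))
for ring, offered in schedule.items():
    print(ring, offered.isoformat())
```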
---
### Step 5 — Windows LAPS
Standing local administrator accounts are the device-layer version of standing privilege. If the same local admin password is shared across the fleet (common in legacy environments), one compromised device yields lateral movement credentials for the entire estate.
**Windows LAPS (cloud-native):**
- Available on Windows 10 22H2+ and Windows 11 22H2+ with current patches
- Configure backup target: Entra ID (cloud-native; no on-prem infrastructure required)
- Rotation schedule: 30 days, plus rotate on device handoff
- Requires Entra ID P1 (included in E3)
**Deployment:**
1. Enable LAPS in Entra ID (Entra admin center → Devices → Device settings → Enable Microsoft Entra Local Administrator Password Solution)
2. Create an Intune LAPS policy (Endpoint security → Account protection → LAPS)
3. Assign to a pilot group; confirm password backup to Entra after check-in
4. Expand to production
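The pilot confirmation in step 3 can be made repeatable with a small check over exported backup timestamps. This sketch assumes a mapping of device name to the last LAPS password backup time (or `None` where no backup has ever been recorded); the export format, device names, and function name are hypothetical. It reports devices that never backed up and devices whose last backup is older than the 30-day rotation schedule.

```python
from datetime import datetime, timedelta, timezone

ROTATION_DAYS = 30  # rotation schedule from the LAPS settings above


def laps_gaps(last_backup_by_device, now):
    """Find devices with no LAPS backup or an overdue rotation."""
    limit = timedelta(days=ROTATION_DAYS)
    never = sorted(
        name for name, last in last_backup_by_device.items() if last is None
    )
    overdue = sorted(
        name
        for name, last in last_backup_by_device.items()
        if last is not None and now - last > limit
    )
    return {"never_backed_up": never, "rotation_overdue": overdue}


# Hypothetical pilot group export.
utc = timezone.utc
pilot = {
    "LT-001": datetime(2025, 6, 20, tzinfo=utc),
    "LT-002": datetime(2025, 5, 1, tzinfo=utc),
    "LT-003": None,
}
gaps = laps_gaps(pilot, now=datetime(2025, 6, 30, tzinfo=utc))
print(gaps)
```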
**For legacy LAPS (on-prem AD environments where Windows LAPS is not yet deployable):**
Legacy LAPS (the original Microsoft LAPS MSI) remains deployable via Intune for hybrid-joined devices. Flag this as a transitional state — cloud-native Windows LAPS is the destination.
**What this does not solve:** if standing Domain Admin or local admin is provided to specific IT staff outside of LAPS, that standing privilege is out of scope for this assignment. Log it in scope boundary signals.
---
### Step 6 — App Protection Policies (BYOD)
App Protection Policies (MAM without MDM) manage the data layer on personal devices without enrolling the device. This is the correct model for BYOD: wall the corporate data, not the device.
**The honest caveat, stated plainly:** App Protection Policy enforcement has gaps. The policy controls what managed apps should do; the actual enforcement is dependent on the app version, OS version, companion app (Company Portal on Android), and specific API support. "Block copy/paste to unmanaged apps" blocks in documented paths — it does not block screenshots, OS-level share sheet on some platforms, or every third-party clipboard manager. Test on real devices. Document what you verified and where the limits are.
**Deploy separate policies per platform.** iOS and Android are not symmetric. A policy that works on iOS may not produce the same behaviour on Android. Test both independently.
**iOS App Protection Policy — baseline settings:**
| Setting | Value |
|---------|-------|
| Prevent "Save As" to personal storage | Block |
| Restrict cut/copy/paste to managed apps only | Managed apps with paste in |
| Require PIN for app access | Yes (after 5 minutes inactivity) |
| Minimum OS version | iOS 16 |
| Offline grace period before access blocked | 720 hours (30 days) |
| Selective wipe after failed PIN attempts | Yes (after 10 attempts) |
| Minimum app version | Latest minus 1 (N-1; configure per app) |
| Jailbroken/rooted devices | Block |
Apply to: Outlook, Teams, Edge, OneDrive, SharePoint mobile. These are the apps through which corporate data flows on BYOD devices.
**Android App Protection Policy — same baseline settings.** Test enforcement independently — behaviour on Android differs, particularly clipboard controls and "open in" restrictions.
**Selective wipe verification:**
Test selective wipe on a real BYOD device before the engagement closes. Confirm that corporate data (email, files, Teams content) is removed and personal data (photos, personal apps) is not. This is the capability that makes MAM politically viable — if the user doesn't trust that it won't touch their personal data, enrollment fails. Document the test.
---
### Step 7 — CA Layer 3 Activation
This is the step that connects device compliance to access control. Everything before this point has been deploying and measuring; this step makes compliance matter for access.
**Prerequisites before activating:**
- [ ] Compliance policy deployed and returning results for ≥ 80% of the enrolled fleet
- [ ] 72 hours of report-only compliance results reviewed — no widespread false noncompliance identified
- [ ] Break-glass accounts confirmed excluded from device compliance CA policies
- [ ] Named client lead has approved activation in writing
- [ ] IT team briefed on noncompliance action timeline (users blocked after day 7 if noncompliant)
- [ ] Helpdesk runbook written: what to do when a user is blocked due to noncompliance
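The checklist above can double as a hard gate, so that activation is refused while any prerequisite is open. The sketch below encodes the six prerequisites as a dataclass and returns the open items; the class, field names, and sample numbers are illustrative assumptions, while the 80% coverage and 72-hour thresholds come from the checklist.

```python
from dataclasses import dataclass

MIN_COVERAGE = 0.80       # share of enrolled fleet returning compliance results
MIN_REPORT_ONLY_HOURS = 72


@dataclass
class ActivationEvidence:
    enrolled_devices: int
    devices_reporting: int
    report_only_hours: int
    widespread_false_noncompliance: bool
    break_glass_excluded: bool
    written_approval: bool
    it_team_briefed: bool
    helpdesk_runbook: bool


def blockers(e):
    """Return the prerequisites that still block CA Layer 3 activation."""
    open_items = []
    coverage = e.devices_reporting / e.enrolled_devices if e.enrolled_devices else 0.0
    if coverage < MIN_COVERAGE:
        open_items.append(f"compliance coverage {coverage:.0%} is below 80%")
    if e.report_only_hours < MIN_REPORT_ONLY_HOURS or e.widespread_false_noncompliance:
        open_items.append("report-only review incomplete or false noncompliance found")
    if not e.break_glass_excluded:
        open_items.append("break-glass exclusion not confirmed")
    if not e.written_approval:
        open_items.append("no written approval from client lead")
    if not e.it_team_briefed:
        open_items.append("IT team not briefed on day-7 block")
    if not e.helpdesk_runbook:
        open_items.append("helpdesk runbook missing")
    return open_items


# Hypothetical engagement state: coverage and review time both fall short.
evidence = ActivationEvidence(100, 70, 48, False, True, True, True, True)
for item in blockers(evidence):
    print("BLOCKED:", item)
```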
**Activation sequence:**
1. Switch `CA-AllUsers-AllApps-RequireCompliantDevice` from report-only to **enabled**
2. Monitor Intune compliance dashboard and Entra sign-in logs for 24 hours
3. Confirm: compliant devices are signing in successfully; noncompliant devices are being blocked at CA
4. Confirm: break-glass accounts are not blocked
Do not activate device-compliance CA policies on a Monday or before a public holiday. An unexpected compliance failure during a period of low IT staffing is a bad outcome that a one-day wait entirely prevents.
**After activation, the compliance signal is live.** A device that loses compliance — drops encryption, falls behind on patches, is rooted — will be blocked from M365 access within the 7-day noncompliance action window. This is the control working as designed.
---
## Structural Resilience Checklist
Controls that hold without ongoing human willingness after this engagement closes.
- [ ] Compliance policies deployed and returning results for enrolled devices
- [ ] Noncompliance action timer active (day 7 block — not just report)
- [ ] Windows Security Baseline profile active on production fleet
- [ ] Update rings deployed with Canary, Pilot, and Production separation
- [ ] Update ring pause tested at least once
- [ ] Windows LAPS deployed; local admin passwords backing up to Entra
- [ ] App Protection Policies active for iOS and Android BYOD (tested on real devices)
- [ ] Selective wipe tested on BYOD device
- [ ] `CA-AllUsers-AllApps-RequireCompliantDevice` **enabled** (not report-only)
- [ ] Break-glass accounts excluded from device compliance CA policies — confirmed with a real sign-in
---
## Kill Chain Contribution
**What this assignment closes (or significantly raises the cost of):**
| Attack vector | Control deployed |
|---------------|-----------------|
| Stolen credentials used from unmanaged/unknown device | CA Layer 3: compliant device required |
| Physical theft of unencrypted device | BitLocker compliance requirement |
| Lateral movement via shared local admin credentials | Windows LAPS: unique per-device passwords |
| Unpatched OS exploited at known CVE | Update rings: enforced patch cadence |
| BYOD personal device accessing corporate data without controls | App Protection Policies: data container on unmanaged device |
| Attacker persistence on device after credential reset | Noncompliance action: device retired after persistent noncompliance |
**What this assignment does not close:**
| Remaining gap | Addressed by |
|---------------|-------------|
| Session token theft post-compliance check (AiTM phishing) | Entra token protection (P2) + continuous access evaluation |
| Compromised but still-compliant device (stale signal window) | Defender for Endpoint device risk integration (E5) |
| App-layer data exfiltration through sanctioned apps | Collaboration and data security assignment |
| Advanced malware, post-exploitation on managed device | EDR: Defender for Endpoint P2 (E5) or Wazuh/Sysmon augmentation |
| Standing privilege on servers accessed from managed devices | Privileged access engagement |
| Dark access (legacy auth, long-lived tokens bypassing CA) | Legacy auth block (identity baseline) + token lifetime policies |
The most important gap to document plainly: a managed, compliant device that carries a stolen session token (issued after legitimate MFA) still has access. The compliance signal does not re-evaluate session tokens retroactively. Continuous Access Evaluation (CAE) narrows this window for supported apps — verify which apps in the client's environment support CAE, and document the remainder as residual risk.
---
## Leave-Behind Package
| Artifact | Description |
|----------|-------------|
| **Device population report** | Four-population map: enrolled, registered, authenticating, CMDB; delta analysis; known-unmanaged estimate |
| **Compliance policy documentation** | Every policy: settings, assignments, noncompliance action timeline, rationale |
| **Compliance dashboard export** | Compliance rates by policy and platform at engagement close |
| **Configuration profile documentation** | Security baseline and supplemental profiles: settings, assignments, conflict analysis |
| **Update ring documentation** | Ring structure, deferral schedule, pause/rollback procedure, pause test result |
| **LAPS deployment confirmation** | Devices with LAPS active; Entra backup confirmed; rotation schedule |
| **App Protection Policy documentation** | iOS and Android policies: settings, tested behaviours, documented gaps per platform |
| **Selective wipe test record** | Device tested, result, personal data confirmed intact |
| **CA Layer 3 activation confirmation** | Sign-in log showing compliant devices accessing successfully, noncompliant devices blocked |
| **Scope boundary log** | Every finding outside this scope, named and prioritized |
| **Residual risk statement** | What this assignment did not close: stale compliance signal, AiTM token theft, EDR gap, dark access |
---
## Scope Boundary Signals
| Signal | Points toward |
|--------|--------------|
| Shadow IT apps visible in Intune application inventory | Collaboration and data security assignment; shadow AI discovery |
| SCCM co-management active; GPO policies conflicting with Intune | Co-management migration engagement; AD hardening |
| Hybrid-joined devices that depend on line-of-sight to DC | Cloud-native migration path; hybrid identity engagement |
| No Defender for Endpoint P2; device risk signal not feeding CA | E5 licensing gap; E3 augmentation with Wazuh/Sysmon |
| Standing local admin accounts for IT staff outside LAPS scope | Privileged access engagement (Endpoint Privilege Management) |
| Autopilot not configured; device reprovision takes days not hours | Autopilot deployment engagement |
| Legacy devices below Windows 10 22H2 in the compliance-excluded group | Accelerate OS refresh; document as known risk with timeline |
| Audit log retention < 90 days | Detection baseline assignment |
| MAM enforcement gaps found during BYOD testing | Document with vendor; consider MDM enrollment for corporate-issued mobile |
---
## Buildable-On: What the Next Assignment Depends On
The Collaboration and Data Security assignment builds on the device posture deployed here. Specifically:
1. **`CA-AllUsers-UnmanagedDevice-AppEnforcedRestrictions` behaviour** is now testable against the real unmanaged device population. With enrolled and unmanaged devices mapped, you know which users will be affected by app-enforced restrictions and can design the policy accurately.
2. **The application inventory from Intune** surfaces the shadow IT picture that informs data security scope — what apps are running, what cloud storage is installed, whether consumer AI tools are present.
3. **Managed device as a data exfiltration boundary** — with compliant devices required for access, the remaining data risk is through sanctioned apps on managed devices. That is the scope of the next assignment.
---
*For the identity foundation, see [Assignment: Identity Baseline](assignment-identity-baseline.md).*
*For the CA Layer 3 policies this assignment activates, see [Assignment: CA Architecture](assignment-ca-architecture.md).*
*For the governing philosophy on device posture, see [Book IV — Devices & Endpoint](../books/03-devices-and-intune.md).*