The Antifragile Handbook for M365 & Active Directory
Book V — Data & Collaboration (Exchange, SharePoint, Teams, OneDrive)
Data is liquid. It leaves where you put it — copied, shared, forwarded, synced, linked. The question is never "is it locked down" but "where can it flow, who can reshare it, and can you see and reverse the flow?"
The governing question
Books II–IV protected the containers: identity, privilege, devices. This book is about the contents, and contents obey a different physics. You can perfectly secure a container and still lose the data, because data doesn't stay put — it's duplicated into an email, dropped in a Team, synced to a laptop, handed to a guest who reshares it to someone you've never heard of. Perimeter thinking dies here.
Every share is a copy of your blast radius handed to a party you don't control. Can you see where it went, and can you pull it back?
For most estates the honest answers are "no" and "no": nobody can enumerate the external shares, nobody reviews the guests, and a file shared to "Anyone with the link" three years ago is still reachable by anyone who ever held that link.
1. Fragility inventory — how data leaks
"Anyone" links: bearer tokens for your data
Anonymous "Anyone with the link" sharing in SharePoint/OneDrive is the single largest data-exposure fragility in M365. A link is a bearer token: whoever holds it has access, with no identity check, no MFA, no device check, and often no expiry, and the link itself is forwardable. Its blast radius is everyone the link ever reaches, forever, including the open web if it leaks into an email thread or gets picked up by a crawler. Conditional Access, compliant devices, all of Books II–IV: none of it applies to a bearer link. It's a hole punched clean through every wall you built.
Reshare, and the chain you can't see
Once data is shared — especially externally — the recipient can usually reshare, download, and copy it. You've handed your blast radius to an org (or a personal account) whose security posture you don't control and can't observe. Guests reshare to other guests. The chain of custody becomes invisible after the first hop. And the controls that govern this in Teams collaboration are split across several layers — Teams policy, SharePoint org- and site-level sharing, OneDrive, tenant sharing settings, and B2B/cross-tenant access — that interact in non-obvious ways and don't always agree. (More in §honest uncertainty; this is a place where the policy matrix and the observed behaviour routinely diverge.)
Guest sprawl: standing blast radius at the data layer
Guests accumulate and nobody prunes them. The guest invited for one project in 2022 still has a foothold. Each is an external identity governed by their security, not yours — the data-layer cousin of standing privilege (Book III) and shadow devices (Book IV). Unreviewed guest access is a slowly metastasising external attack surface, and most tenants cannot even produce the list of who has it and to what.
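As a rough illustration of "producing the list", here is a minimal Python sketch that flags stale guests from an exported guest inventory. The `Guest` record, its field names, and the 90-day idle threshold are all hypothetical choices made for this example; how you export guest accounts and their last sign-in data from your tenant is not shown and should be verified against current tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Guest:
    """Hypothetical export record for one guest account."""
    upn: str
    created: datetime
    last_sign_in: Optional[datetime]  # None means the guest never signed in


def stale_guests(guests, now, max_idle_days=90):
    """Return guests idle longer than max_idle_days, longest-idle first.

    A guest who never signed in is measured from the invitation date,
    so unused invitations age out like any other stale access.
    """
    cutoff = now - timedelta(days=max_idle_days)

    def last_seen(g):
        return g.last_sign_in or g.created

    stale = [g for g in guests if last_seen(g) < cutoff]
    return sorted(stale, key=last_seen)


if __name__ == "__main__":
    utc = timezone.utc
    inventory = [
        Guest("old@partner.example", datetime(2022, 3, 1, tzinfo=utc),
              datetime(2022, 6, 1, tzinfo=utc)),
        Guest("active@partner.example", datetime(2023, 1, 1, tzinfo=utc),
              datetime(2024, 12, 20, tzinfo=utc)),
    ]
    for g in stale_guests(inventory, now=datetime(2025, 1, 1, tzinfo=utc)):
        print("review or remove:", g.upn)
```

The output is a review queue, not an automatic deletion list: the point is that the list exists and someone owns it.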
Email: the oldest, most Lindy exfil channel
Auto-forwarding rules are the classic business-email-compromise move — a quiet hidden rule that copies all mail to an external address, persistent and invisible. Add attachment-save paths that escape policy, and mail remains the most reliable way data walks out the door. External auto-forward should be off by default, and its presence should scream.
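To make "its presence should scream" concrete, the following is a small Python sketch that scans exported inbox rules for forwarding or redirect targets outside your own domains. The `InboxRule` record and its fields are hypothetical stand-ins for whatever your rule export produces, and the domain list is an assumption you would supply; the sketch only shows the classification logic, not how the rules are collected.

```python
from dataclasses import dataclass, field


@dataclass
class InboxRule:
    """Hypothetical export record for one mailbox rule."""
    mailbox: str
    name: str
    forward_to: list = field(default_factory=list)
    redirect_to: list = field(default_factory=list)


def external_targets(rule, internal_domains):
    """Return the rule's forward/redirect addresses outside internal_domains.

    An address with no '@' cannot be matched to a domain, so it is
    treated as external (fail closed).
    """
    targets = rule.forward_to + rule.redirect_to
    return [t for t in targets
            if t.rsplit("@", 1)[-1].lower() not in internal_domains]


def find_external_forwards(rules, internal_domains):
    """Return (mailbox, rule name, external addresses) for each offending rule."""
    domains = {d.lower() for d in internal_domains}
    findings = []
    for rule in rules:
        ext = external_targets(rule, domains)
        if ext:
            findings.append((rule.mailbox, rule.name, ext))
    return findings


if __name__ == "__main__":
    rules = [
        InboxRule("cfo@contoso.example", "sync", forward_to=["drop@mail.example"]),
        InboxRule("pa@contoso.example", "delegate", redirect_to=["cfo@contoso.example"]),
    ]
    for mailbox, name, ext in find_external_forwards(rules, {"contoso.example"}):
        print(f"ALERT {mailbox}: rule {name!r} sends mail to {ext}")
```

A report like this complements a tenant-level block rather than replacing it: the block stops new forwards, the scan finds the ones already in place.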
The hybrid Exchange anchor (Book II at the data layer)
An on-prem Exchange server is a Tier-0-adjacent liability — historically one of the most catastrophic on-prem attack surfaces, where mailbox/management permissions can escalate toward AD. Hybrid Exchange drags that liability into the estate, and subtle functionality dependencies keep the last server alive long past its welcome. The via-negativa prize is decommissioning on-prem Exchange entirely (§2) — verify the current management/recipient tooling first.
Internal oversharing
External isn't the only blast radius. "Everyone," "All company," and "Everyone except external users" permissions on a site holding HR, finance, or M&A data mean one compromised internal account reaches it all. Default-open SharePoint sites and self-service site creation produce internal data sprawl that no one maps.
Collaboration sprawl by design
Every Team spins up a SharePoint site, an M365 group, a mailbox, and more — each with its own sharing and guest settings, each a potential leak. Self-service creation means ungoverned proliferation of data containers, and collaboration tools carry subtle data-visibility behaviours (who sees what history, what a late joiner can read) that surprise even experts. Sprawl nobody inventories is fragility nobody can see.
Illicit OAuth consent: data exfil through a "legitimate" app
A user clicks OK on an app requesting Mail.Read or Files.Read.All, and now a third party reads tenant data through a sanctioned-looking grant. This is the data-layer face of Book III's app-registration dark matter — exfil that needs no malware and trips no device control.
Retention as hoarded blast radius
Keeping everything forever makes every breach maximal: the attacker gets fifteen years of data instead of one. Over-retention is hoarding fragility — every byte you keep is a byte that can be stolen. (Its opposite, no recoverable copy at all, is Book VI's problem. The art is disposing of what you don't need while protecting what you do.)
2. Via negativa — what to remove
- Kill anonymous "Anyone" links. Default external sharing to authenticated, time-limited, least-permission (view, not edit). Remove the bearer token from your data entirely where you can.
- Decommission on-prem Exchange. Remove the Tier-0-adjacent liability; get off hybrid Exchange where the dependency can actually be cut (verify current tooling — §honest uncertainty).
- Block external auto-forwarding by default. Delete the quietest exfil channel there is.
- Prune guests ruthlessly. Access reviews, expiration, entitlement management. Stale external access gets removed, and new guest access expires by default. Treat guest sprawl like standing privilege: minimise and time-box it.
- Minimise retention. Dispose of stale data on a schedule. Shrink the prize so every breach is smaller. Data you no longer hold cannot be exfiltrated.
- Remove broad internal shares ("All company"/"Everyone") from anything sensitive. Sensitive data should live in few, known places with narrow access.
- Govern self-service creation and clean up the dead. Curb ungoverned Team/site/app creation; archive and delete orphaned, inactive containers.
- Restrict user consent and revoke illicit grants. Users shouldn't be able to hand tenant data to arbitrary apps; admin-consent workflow for anything sensitive, and sweep out the over-permissioned grants already there.
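The consent sweep in the last item can be sketched as a simple filter over exported delegated-permission grants. This is an illustrative Python sketch, not a definitive audit: the dictionary keys (`clientId`, `consentType`, `scope` as a space-separated string) are modelled on Microsoft Graph's delegated permission grant records as I understand them and should be checked against the current schema, and the high-risk scope list is an assumption you would tune for your own estate.

```python
# Assumed starting list of scopes that expose mail or file content.
HIGH_RISK_SCOPES = {
    "Mail.Read",
    "Mail.ReadWrite",
    "Files.Read.All",
    "Files.ReadWrite.All",
    "Sites.Read.All",
}


def risky_grants(grants, high_risk=HIGH_RISK_SCOPES):
    """Return grants holding any high-risk scope, tenant-wide grants first.

    Scope names are compared exactly as exported (case-sensitive).
    """
    findings = []
    for grant in grants:
        scopes = set(grant.get("scope", "").split())
        hits = sorted(scopes & high_risk)
        if hits:
            findings.append({
                "clientId": grant["clientId"],
                # "AllPrincipals" is assumed to mean consent for every user.
                "tenant_wide": grant.get("consentType") == "AllPrincipals",
                "scopes": hits,
            })
    # Tenant-wide grants sort first because their blast radius is largest.
    findings.sort(key=lambda f: (not f["tenant_wide"], f["clientId"]))
    return findings


if __name__ == "__main__":
    exported = [
        {"clientId": "app-b", "consentType": "Principal", "scope": "User.Read Mail.Read"},
        {"clientId": "app-a", "consentType": "AllPrincipals", "scope": "Files.Read.All"},
    ]
    for finding in risky_grants(exported):
        print(finding)
```

Sorting tenant-wide grants to the top reflects the barbell: review the few grants that can read everything before the many that can read one mailbox.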
3. The barbell — find the crown jewels, free the rest
Name the crown jewels. Which handful of data sets — the IP, the regulated data, the executive and M&A comms, the source of the company's value — would, if leaked, actually end the business? Most organisations cannot name them, and that inability is finding #1. You cannot protect asymmetrically until you know what the asymmetry is for.
Paranoid protection for the crown jewels:
- Sensitivity labels with encryption that travels with the file. This is the convex control of the data world (Book I, principle 7): one label protects the file everywhere it goes, forever — even after it leaves the tenant, lands on an unmanaged device, or is forwarded to a stranger. The protection is bound to the data, not the container. That's the only thing that survives data's liquidity.
- Restricted sites, no external sharing, tight access with recurring reviews.
- Conditional Access app control / session controls — browser-only, block-download for sensitive data on unmanaged devices (the Book IV boundary applied to content).
- Heightened monitoring on crown-jewel access (feeds Book VI).
Free everything else. Most collaboration data is low value and should flow fast — velocity is a feature (Book I creed). Don't lock the lunch-menu SharePoint with M&A-vault rigour. Spreading DLP and restriction evenly across all data is the concave failure: enormous maintenance, false positives that train users to click through, and the real exfil lost in the noise. DLP is a scalpel for known high-value patterns (card numbers, national IDs, the labelled crown jewels), not a dragnet over everything.
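One way to see the scalpel-versus-dragnet difference is card-number detection. A bare "13 to 19 digits" pattern fires on order numbers and tracking IDs; adding the Luhn checksum (the standard check-digit scheme for payment card numbers) discards most of that noise. The Python sketch below is illustrative only and is not how any particular DLP product implements its detectors; the function names are made up for this example.

```python
import re

# A digit, then 12 to 18 more digits, each optionally preceded by a
# single space or hyphen: 13 to 19 digits in total.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit strings in text that look like cards and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    sample = "Pay with 4111 1111 1111 1111; order ref 1234567890123456."
    print(find_card_numbers(sample))
```

In the sample, the widely used test number 4111 1111 1111 1111 passes the checksum while the 16-digit order reference does not, which is the false-positive reduction the scalpel approach depends on.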
4. Optionality & recovery — escape hatches, tested
- The label is the escape hatch. Because encryption travels with the file, a leaked crown-jewel document is still encrypted wherever it lands — you pre-paid for the data to survive being stolen. That is optionality bound into the byte.
- Fast share revocation. Can you, in 30 minutes, enumerate and kill every external share and anonymous link? If you can't produce the list, you can't pull it back — build the report and the revocation muscle before you need them.
- Audit and content forensics — switched on and retained. "Who accessed and downloaded what" is your post-incident truth, but only if audit logging is actually enabled and retained long enough to matter. Verify it's on; don't assume (§honest uncertainty).
- Guest access reviews as recurring pruning — the recovery loop for sprawl.
- Immutable/held copies of crown-jewel data — the bridge to Book VI backup.
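The share-revocation item above presupposes a report you can act on in order of risk. The following Python sketch shows one possible ordering over an exported share inventory: anonymous links first, then shares to external recipients, with non-expiring shares ahead of expiring ones. The `Share` record and its fields are hypothetical, the scope strings are assumed to resemble the sharing-link scopes your export produces, and treating an expired link as dead is itself an assumption worth testing on a live client.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Share:
    """Hypothetical export record for one sharing grant on one item."""
    item_path: str
    scope: str                 # assumed: "anonymous", "organization", "users"
    external: bool             # True if any recipient is outside the tenant
    expires: Optional[datetime] = None


def risk_rank(share):
    """Return a sortable rank (lower is riskier), or None for internal shares."""
    if share.scope == "anonymous":
        tier = 0
    elif share.external:
        tier = 1
    else:
        return None
    # Within a tier, shares that never expire come first (False < True).
    return (tier, share.expires is not None)


def revocation_queue(shares, now):
    """Return external and anonymous shares, riskiest first."""
    ranked = []
    for share in shares:
        if share.expires is not None and share.expires <= now:
            continue  # assumption: an expired link no longer grants access
        rank = risk_rank(share)
        if rank is not None:
            ranked.append((rank, share))
    ranked.sort(key=lambda pair: pair[0])
    return [share for _, share in ranked]


if __name__ == "__main__":
    inventory = [
        Share("/hr/payroll.xlsx", "users", external=True),
        Share("/finance/forecast.docx", "anonymous", external=True),
        Share("/eng/readme.md", "organization", external=False),
    ]
    for share in revocation_queue(inventory, now=datetime(2025, 1, 1)):
        print("revoke:", share.item_path, share.scope)
```

Timing how long it takes to build the inventory that feeds this function is the real test of the 30-minute target.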
5. Stressor — break it on purpose
- Exfiltrate a labelled crown-jewel file yourself. Email it externally, share it anonymously, download it through a Conditional Access app control session, open it on an unmanaged device. Does the label encryption hold? Does DLP fire? Does anything alert? You are testing the behaviour, not the policy screen (Book I corollary).
- Plant a canary document seeded with a detectable pattern and try to move it out every way you can. What catches it? What doesn't?
- Enumerate the external surface. Produce the full list of "Anyone" links, external guests, and externally-shared files. The exercise of trying usually reveals you can't — which is the finding.
- Simulate the BEC forward rule. Set a test external auto-forward. Is it blocked? Alerted? Silent? Silence is the BEC attacker's favourite answer.
- Test the reshare chain. Share to a test guest, have them reshare onward. Can you see it? Stop it? Pull it back?
- Reconcile declared vs enforced sharing. The tenant sharing setting says one thing; walk the actual per-site and per-link reality. They diverge — the ghost-policy cousin from Book IV, at the data layer.
Per Book I principle 6: every leak path found becomes a structural change — a killed link type, a pruned guest population, a label applied, a coupling removed — not a note in a spreadsheet.
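The declared-versus-enforced reconciliation from the list above can be sketched as a comparison between the posture the organisation says it has and the per-site settings actually observed. In this Python sketch the capability names follow SharePoint Online's sharing-capability values as I understand them; treat both the names and their ordering as assumptions to verify, and note that the per-site export itself is not shown.

```python
# Assumed ordering, least to most permissive.
PERMISSIVENESS = {
    "Disabled": 0,
    "ExistingExternalUserSharingOnly": 1,
    "ExternalUserSharingOnly": 2,
    "ExternalUserAndGuestSharing": 3,   # assumed to allow "Anyone" links
}


def sharing_drift(declared, observed_sites):
    """Return sites whose observed sharing is looser than the declared posture.

    declared: the capability the organisation believes it enforces.
    observed_sites: mapping of site URL to the capability actually found.
    Unrecognised values are reported rather than ignored, since an
    unknown setting is itself a finding.
    """
    limit = PERMISSIVENESS[declared]
    drift = {}
    for url, capability in observed_sites.items():
        level = PERMISSIVENESS.get(capability)
        if level is None:
            drift[url] = f"unknown capability {capability!r}"
        elif level > limit:
            drift[url] = capability
    return drift


if __name__ == "__main__":
    observed = {
        "https://contoso.example/sites/hr": "Disabled",
        "https://contoso.example/sites/marketing": "ExternalUserAndGuestSharing",
    }
    for url, finding in sharing_drift("ExternalUserSharingOnly", observed).items():
        print("drift:", url, "->", finding)
```

This checks configuration against configuration only. It does not replace walking real links on a real client, which remains the authoritative test.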
Honest uncertainty (the sharing matrix moves — test, don't trust it)
Stable and Lindy (teach with confidence): data is liquid; bearer links are exposure; protection must travel with the data; minimise the prize; DLP is a scalpel not a dragnet; guests are standing blast radius. None of that churns.
What moves, and what you must verify by testing rather than reading:
- External sharing enforcement is split across many interacting layers — Teams policy, SharePoint org/site sharing, OneDrive, tenant settings, B2B/cross-tenant access, and the Premium tiers — and they don't always agree. Enforcement can differ by client and platform, and the documented matrix and the observed behaviour diverge often enough that you should confirm the real behaviour on a real client, not from the policy screen. When you find an inconsistency that survives reconfiguration, that's a vendor escalation, not your error.
- On-prem Exchange decommissioning and the "last server for management" story — the tooling has evolved; verify the current supported path before promising the coupling can be cut.
- Purview / sensitivity labels / auto-labelling / DLP capabilities churn fast, including the branding. Verify current coverage and licensing.
- Cross-tenant access settings (B2B collaboration and direct connect) are comparatively new and evolving — verify current behaviour.
- Audit log retention defaults and licensing have changed over time. Confirm what's actually captured and for how long before you rely on it for forensics.
If a client's safety hinges on a specific sharing behaviour, test it on a live client and cite the current doc — and where the client behaviour contradicts the doc, believe the client.
Consolidated judgement prompts
- Can we name the crown jewels? If not, that's finding #1 — everything else is guesswork until we can.
- Can we enumerate every external share, anonymous link, and guest right now? Can we revoke them fast?
- Does protection travel with the crown-jewel data (labels/encryption), or only with the container it currently sits in?
- Where can this data flow — reshare, forward, sync, download, OAuth app — and is any of that flow visible or reversible?
- Are guests treated as standing blast radius (minimised, time-boxed, reviewed) or left to accumulate?
- Is DLP a scalpel on known high-value patterns, or a dragnet generating noise everyone clicks through?
- Is on-prem Exchange still anchoring the estate? What would it take to cut it?
- Is audit logging actually on and retained long enough to reconstruct an incident?
- Does the tenant's declared sharing posture match what the sites and links actually enforce?
Book V of the Antifragile Handbook. You cannot wall in a liquid. Name the few things that would end the company, bind protection to the data itself, shrink the prize, and make every flow visible and reversible. Move fast and fix things.