Sync from dev @ 252c1cf

Source: main (252c1cf)
Excluded: live tenant exports, generated artifacts, and dev-only tooling.
2026-04-17 15:57:35 +02:00
commit 17d745bdac
52 changed files with 15601 additions and 0 deletions

8
.gitignore vendored Normal file

@@ -0,0 +1,8 @@
.DS_Store
docs/share/
docs/security-review-package.pdf
docs/security-review-questionnaire.pdf
node_modules/
__pycache__/
*.py[cod]
*$py.class

198
AGENTS.md Normal file

@@ -0,0 +1,198 @@
# Agent Guidance: Intune / Entra Drift Backup
This repository tracks Git-based snapshots of Microsoft Intune and Entra ID configuration, generates review reports, and drives a rolling pull-request workflow for post-change review and remediation.
## Project Overview
The implementation is centered on three Azure DevOps pipelines:
- `azure-pipelines.yml`: hourly backup/export pipeline with rolling PR management.
- `azure-pipelines-review-sync.yml`: 20-minute reviewer-decision sync and post-merge remediation queue.
- `azure-pipelines-restore.yml`: manual or auto-queued restore pipeline for approved baseline rollback.
Workflow at a high level:
1. Export Intune and Entra configuration into `tenant-state/`.
2. Generate Markdown/CSV reports in `tenant-state/reports/`.
3. Filter known non-actionable drift noise before commit.
4. Commit workload drift to `drift/intune` and `drift/entra`.
5. Create or update one rolling PR per workload into `main`.
6. Refresh the PR description with deterministic change/risk summary and optional Azure OpenAI narrative.
7. Apply reviewer `/reject` or `/accept` decisions and queue restore when needed.
## Technology Stack
- **Python 3**: primary language for all automation scripts.
- **Azure DevOps Pipelines**: YAML-based CI/CD (`azure-pipelines.yml`, `azure-pipelines-review-sync.yml`, `azure-pipelines-restore.yml`).
- **PowerShell & Bash**: inline pipeline steps for Git operations, token retrieval, and conditional logic.
- **IntuneCD** (Python package, pinned to `2.5.0`): exports Intune configuration and restores baseline state.
- **Microsoft Graph API**: reads/writes tenant configuration and resolves references.
- **Node.js / md-to-pdf** (v5.2.5): generates HTML and PDF documentation artifacts from Markdown on full runs.
- **Azure OpenAI**: optional PR narrative generation.
## Repository Layout
```
.
├── azure-pipelines.yml # Main hourly backup pipeline
├── azure-pipelines-review-sync.yml # 20-minute review sync
├── azure-pipelines-restore.yml # Baseline restore pipeline
├── scripts/ # Python automation helpers
├── tests/ # unittest coverage for scripts
├── tenant-state/ # Committed JSON exports and reports
│   ├── intune/
│   ├── entra/
│   └── reports/
├── docs/ # Security review docs and roadmap
├── md2pdf/ # HTML/PDF styling and configs
├── prod-as-built.md # Generated as-built source
└── README.md # Operational overview for humans
```
### Key Scripts
- `export_entra_baseline.py`: Graph API export for Entra objects (Named Locations, Authentication Strengths, Conditional Access, App Registrations, Enterprise Applications).
- `commit_entra_drift.py`: commits Entra drift with author attribution from audit logs.
- `resolve_ca_references.py`: resolves Conditional Access GUID references to human-readable names.
- `filter_entra_enrichment_noise.py`: reverts JSON churn caused by best-effort Graph enrichment (owners, app roles).
- `filter_intune_partial_settings_noise.py`: reverts partial Settings Catalog exports.
- `generate_assignment_report.py`: produces Markdown and CSV assignment inventories.
- `generate_app_inventory_report.py`: produces Entra apps inventory CSV.
- `generate_object_inventory_reports.py`: produces per-category object inventory CSVs.
- `validate_backup_outputs.py`: asserts required files exist after export.
- `ensure_rolling_pr.py`: creates or updates one rolling drift PR per workload.
- `update_pr_review_summary.py`: refreshes PR descriptions with change counts, risk assessment, and optional AI narrative.
- `apply_reviewer_rejections.py`: processes `/reject` and `/accept` reviewer thread commands.
- `queue_post_merge_restore.py`: queues restore pipeline after merged PRs that contained `/reject` decisions.
## Code Style and Conventions
- Every Python file starts with `#!/usr/bin/env python3` and `from __future__ import annotations`.
- Type hints are used throughout (`typing.Any`, `argparse.Namespace`, etc.).
- Internal helper functions are prefixed with `_`.
- Common environment parsing helpers appear in multiple scripts (see the sketch after this list):
  - `_env_text(name, default="")` reads and sanitizes env vars, treating unresolved Azure DevOps macros `$(...)` as empty.
  - `_env_bool(name, default=False)` interprets `1`, `true`, `yes`, `on` as boolean true.
- Arguments use `argparse` with typed flags; pipeline variables are passed as env vars or CLI args.
- JSON is written with `indent=4` or `indent=5` and `ensure_ascii=False`.
- HTTP calls to Graph or Azure DevOps REST APIs use `urllib.request` (no external HTTP library).
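A minimal sketch of those shared helpers, following the conventions above (the exact bodies vary slightly per script):
```python
#!/usr/bin/env python3
from __future__ import annotations

import os


def _env_text(name: str, default: str = "") -> str:
    """Read an env var; treat unresolved Azure DevOps macros like $(Foo) as empty."""
    value = os.environ.get(name, default).strip()
    if value.startswith("$(") and value.endswith(")"):
        return default
    return value


def _env_bool(name: str, default: bool = False) -> bool:
    """Interpret 1/true/yes/on (any case) as True; fall back to the default when unset."""
    value = _env_text(name).strip().lower()
    if not value:
        return default
    return value in {"1", "true", "yes", "on"}
```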
## Testing
Tests are written with the Python standard library `unittest` framework. There is **no pytest configuration** (`pyproject.toml`, `setup.py`, or `pytest.ini` are absent). Modules are loaded dynamically in tests using `importlib.util.spec_from_file_location` so that scripts in `scripts/` do not need to be on `PYTHONPATH`.
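The loading pattern looks roughly like this (the helper name and layout are illustrative, not the tests' exact code):
```python
import importlib.util
import pathlib


def load_script_module(name: str):
    """Load scripts/<name>.py as a module without adding scripts/ to PYTHONPATH."""
    path = pathlib.Path(__file__).resolve().parent.parent / "scripts" / f"{name}.py"
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


validate = load_script_module("validate_backup_outputs")
```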
### Run Tests
```bash
python3 -m unittest discover -s tests -v
```
### Test Coverage Areas
- `test_ensure_rolling_pr.py`: rolling PR creation, draft publishing, merge strategy logic.
- `test_export_entra_baseline.py`: Entra export parsing, concurrent export behavior, error handling.
- `test_filter_entra_enrichment_noise.py`: enrichment-only churn detection and reversion.
- `test_filter_intune_partial_settings_noise.py`: partial Settings Catalog export filtering.
- `test_queue_post_merge_restore.py`: post-merge restore queueing logic.
- `test_update_pr_review_summary.py`: semantic diffing, AI thread management, PR description upserts.
- `test_validate_backup_outputs.py`: validation rules for Intune and Entra outputs.
## Build and Runtime Architecture
There is no traditional build step for the Python code. The pipelines install runtime dependencies on each run:
```bash
pip3 install "IntuneCD==2.5.0"
```
For local development, only a Python 3 interpreter is required; scripts use the standard library except for the optional IntuneCD package.
### Pipeline Jobs
- **Intune backup job** (`backup_intune`):
  1. Prepare `drift/intune` branch from `main`.
  2. Decide light vs full mode (configured full-run hour or `forceFullRun=true`; see the sketch after this list).
  3. Run `IntuneCD-startbackup`.
  4. Filter partial Settings Catalog exports.
  5. Resolve assignment group names from Graph.
  6. Generate assignment and object inventory reports.
  7. Validate outputs.
  8. Commit drift and update rolling PR.
- **Entra backup job** (`backup_entra`):
  1. Prepare `drift/entra` branch from `main`.
  2. Export selected categories with `export_entra_baseline.py`.
  3. Resolve Conditional Access references.
  4. Generate reports.
  5. Validate outputs.
  6. Filter enrichment noise and commit drift.
- **Review sync jobs** (`sync_intune_review_decisions`, `sync_entra_review_decisions`):
  1. Apply `/reject` decisions.
  2. Update automated PR summary.
  3. Queue post-merge restore when needed.
- **Restore job** (`restore_from_baseline`):
  1. Checkout approved baseline snapshot (branch, tag, or commit).
  2. Prepare restore scope (`full` or `selective`).
  3. Normalize payload JSON and strip display-only assignment labels.
  4. Run `IntuneCD-startupdate` with optional `--entraupdate`.
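The light-vs-full decision in the backup jobs reduces to a small predicate along these lines (a sketch; the pipeline computes `now` in its configured local timezone, and `FORCE_FULL_RUN` stands in for the `forceFullRun` queue parameter):
```python
import datetime
import os


def is_full_run(now: datetime.datetime, full_run_hour: int = 0) -> bool:
    """Full mode on the configured hour of the schedule, or when forced via the queue."""
    if os.environ.get("FORCE_FULL_RUN", "").strip().lower() == "true":
        return True
    return now.hour == full_run_hour
```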
## Security Considerations
- **Token handling**: Graph tokens are obtained via `Get-AzAccessToken` in PowerShell and passed as secret pipeline variables. The token payload is decoded to validate required application permissions before use (see the sketch after this list).
- **Service connection**: Azure DevOps service connection (e.g. `sc-astral-backup`) uses workload federated credentials.
- **Permissions**: read-only permissions for backup; read-write permissions (`...ReadWrite.All`) for restore. Missing roles are surfaced as pipeline errors before any Graph mutations occur.
- **Path traversal**: selective restore paths are normalized and validated against `..` segments before file copy.
- **Dry run**: restore pipeline defaults to `dryRun=true` and must be explicitly overridden to push changes.
- **Access token scope**: `System.AccessToken` is required for PR and thread management via Azure DevOps REST APIs.
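The permission check amounts to base64-decoding the JWT payload (no signature verification; the token is already trusted from `Get-AzAccessToken`) and comparing its `roles` claim against the required list. A Python equivalent of the PowerShell logic used in the pipelines:
```python
import base64
import json


def graph_token_roles(access_token: str) -> set[str]:
    """Return the application roles claim from a Graph JWT (no signature check)."""
    payload_b64 = access_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return set(claims.get("roles", []))


# missing = {"DeviceManagementConfiguration.ReadWrite.All"} - graph_token_roles(token)
```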
## Common Development Tasks
### Generate Entra export locally
```bash
python3 ./scripts/export_entra_baseline.py \
  --root ./tenant-state/entra \
  --token "$GRAPH_TOKEN" \
  --enterprise-app-workers 8
```
### Resolve Conditional Access references locally
```bash
python3 ./scripts/resolve_ca_references.py \
  --root ./tenant-state/entra \
  --token "$GRAPH_TOKEN"
```
### Generate assignment report locally
```bash
python3 ./scripts/generate_assignment_report.py \
  --root ./tenant-state/intune \
  --output-dir ./tenant-state/reports/intune
```
### Validate backup outputs locally
```bash
python3 ./scripts/validate_backup_outputs.py \
  --workload intune \
  --mode light \
  --root ./tenant-state/intune \
  --reports-root ./tenant-state/reports/intune
```
## Key Environment / Pipeline Variables
- `BASELINE_BRANCH` (default: `main`)
- `DRIFT_BRANCH_INTUNE` (default: `drift/intune`)
- `DRIFT_BRANCH_ENTRA` (default: `drift/entra`)
- `BACKUP_FOLDER` (default: `tenant-state`)
- `ENABLE_WORKLOAD_INTUNE` / `ENABLE_WORKLOAD_ENTRA`
- `ENABLE_PR_REVIEW_SUMMARY` / `ENABLE_PR_REVIEWER_DECISIONS`
- `AUTO_REMEDIATE_AFTER_MERGE` / `AUTO_REMEDIATE_DRY_RUN`
- `ENABLE_PR_AI_SUMMARY` + `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, `AZURE_OPENAI_API_KEY`
- `ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS` / `ROLLING_PR_MERGE_STRATEGY`
See the top of each pipeline YAML for the full variable list and defaults.

422
README.md Normal file

@@ -0,0 +1,422 @@
# Intune / Entra Drift Backup
This repository keeps Git-tracked snapshots of Microsoft Intune and selected Entra ID configuration, generates review reports, and drives a rolling pull-request workflow for post-change review and remediation.
> **Product name:** ASTRAL (Admin Security Through Review, Automation & Least-privilege)
## Getting Started
This repository is designed to be forked or downloaded into your own Azure DevOps organization. Each tenant gets its own project and pipeline instance.
Quick start:
1. Fork or import this repository into an Azure DevOps project.
2. Review `templates/variables-tenant.yml` and create a matching Azure DevOps Variable Group in your project (e.g. `vg-astral-tenant`).
3. Uncomment the variable group reference in the three pipeline YAMLs.
4. Run `deploy/bootstrap-tenant.ps1` to create the Azure AD app registration, assign Graph permissions, and configure the federated credential.
5. Create the Azure DevOps service connection using the app registration details from the bootstrap script.
6. Import the three pipelines (`azure-pipelines.yml`, `azure-pipelines-review-sync.yml`, `azure-pipelines-restore.yml`) into Azure DevOps.
7. Import and run the `deploy/validate-deployment.yml` validation pipeline to verify connectivity and permissions.
8. Set `AUTO_REMEDIATE_RESTORE_PIPELINE_ID` in your variable group after the restore pipeline is imported.
See [`deploy/onboarding-runbook.md`](deploy/onboarding-runbook.md) for the full step-by-step guide.
## What The Repository Does
The implementation is centered on three Azure DevOps pipelines:
- `azure-pipelines.yml`: hourly backup/export pipeline with rolling PR management.
- `azure-pipelines-review-sync.yml`: 20-minute reviewer-decision sync and post-merge remediation queue.
- `azure-pipelines-restore.yml`: manual or auto-queued restore pipeline for approved baseline rollback.
The main workflow is:
1. Export Intune and Entra configuration into `tenant-state/`.
2. Generate Markdown/CSV reports in `tenant-state/reports/`.
3. Filter known non-actionable drift noise before commit.
4. Commit workload drift to `drift/intune` and `drift/entra`.
5. Create or update one rolling PR per workload into `main`.
6. Refresh the PR description with deterministic change/risk summary and optional Azure OpenAI narrative.
7. Apply reviewer `/reject` or `/accept` decisions and queue restore when needed.
This is an ex-post change-management model: admins can change settings in the Microsoft admin portals, and the repo turns those changes into auditable Git drift with a review and rollback path.
## Current Baseline Coverage
Intune currently tracks:
- App Configuration
- App Protection
- Apple Push Notification
- Apple VPP Tokens
- Applications
- Compliance Policies
- Device Configurations
- Device Management Settings
- Enrollment Configurations
- Enrollment Profiles
- Filters
- Scope Tags
- Scripts
- Settings Catalog
Entra currently tracks:
- Named Locations
- Authentication Strengths
- Conditional Access
- App Registrations
- Enterprise Applications
Current scope behavior:
- Named Locations, Authentication Strengths, and Conditional Access run on hourly light runs and midnight full runs.
- App Registrations and Enterprise Applications are enabled in the pipeline but exported only on full runs.
- During light runs, the previous drift-branch snapshot of `App Registrations` and `Enterprise Applications` is preserved to avoid churn and heavy export cost.
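Conceptually, preserving the previous snapshot is just restoring those folders from the drift branch before committing. A sketch under assumed folder names (the real pipeline step may differ):
```python
import subprocess


def preserve_previous_export(drift_branch: str, paths: list[str]) -> None:
    """During light runs, keep the last full-run copy of heavy export folders."""
    for path in paths:
        # check=False tolerates the very first run, when the path does not yet exist.
        subprocess.run(
            ["git", "checkout", f"origin/{drift_branch}", "--", path],
            check=False,
        )


preserve_previous_export("drift/entra", ["tenant-state/entra/App Registrations"])
```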
## Repository Layout
- `README.md`: operational overview.
- `azure-pipelines.yml`: backup/export, report generation, drift commit, rolling PR, and docs/artifact flow.
- `azure-pipelines-review-sync.yml`: reviewer decision sync and post-merge remediation helper.
- `azure-pipelines-restore.yml`: baseline restore pipeline with full or selective scope.
- `docs/m365-baseline-roadmap.md`: expansion roadmap beyond current workload scope.
- `docs/security-review-package.md`: implementation-focused security review package.
- `docs/security-review-questionnaire.md`: short-form security review answers.
- `scripts/`: export, reporting, PR automation, validation, and remediation helpers.
- `tests/`: focused unit coverage for the Python helpers.
- `tenant-state/intune`: committed Intune JSON export.
- `tenant-state/entra`: committed Entra JSON export.
- `tenant-state/reports/intune`: Intune CSV/Markdown reports.
- `tenant-state/reports/entra`: Entra CSV/Markdown reports.
- `prod-as-built.md`: generated as-built document source.
- `md2pdf/`: HTML/PDF styling and config for documentation publish.
## Pipeline Model
### Main Backup Pipeline
`azure-pipelines.yml` runs hourly on `main`.
For Intune it:
1. Prepares `drift/intune` from `main`.
2. Chooses light vs full mode from the configured local timezone, with `forceFullRun=true` override.
3. Runs IntuneCD export.
4. Reverts partial Settings Catalog exports with `scripts/filter_intune_partial_settings_noise.py`.
5. Resolves assignment group names from Graph when needed.
6. Generates assignment and object inventory reports.
7. Validates outputs with `scripts/validate_backup_outputs.py`.
8. Commits drift and updates the rolling PR flow.
For Entra it:
1. Prepares `drift/entra` from `main`.
2. Chooses effective export scope per mode.
3. Exports selected categories with `scripts/export_entra_baseline.py`.
4. Resolves Conditional Access reference names with `scripts/resolve_ca_references.py`.
5. Generates assignment, app, and object inventory reports.
6. Validates outputs with `scripts/validate_backup_outputs.py`.
7. Reverts enrichment-only JSON churn with `scripts/filter_entra_enrichment_noise.py`.
8. Commits drift with `scripts/commit_entra_drift.py`.
### Review Sync Pipeline
`azure-pipelines-review-sync.yml` runs every 20 minutes on `main` and exists to shorten the reviewer feedback loop.
Per workload it can:
- apply reviewer `/reject` and `/accept` decisions with `scripts/apply_reviewer_rejections.py`
- refresh the automated PR summary
- queue restore after merged PRs that contained reviewer `/reject` decisions using `scripts/queue_post_merge_restore.py`
### Restore Pipeline
`azure-pipelines-restore.yml` restores from approved baseline (`main` by default) or from a historical branch, tag, or commit.
Supported restore modes:
- `full`: restore the full committed Intune baseline
- `selective`: restore only selected file paths
It also supports optional Entra update when restore automation is triggered for Entra review outcomes.
## Schedule And Run Modes
- Main backup schedule: hourly, `0 * * * *`, on `main`
- Review sync schedule: every 20 minutes, `*/20 * * * *`, on `main`
- Full mode: configured full-run hour (default 00:00) or manual queue with `forceFullRun=true`
- Light mode: all other scheduled hours
Full mode adds:
- full Entra scope, including App Registrations and Enterprise Applications
- Intune split-documentation generation
- HTML/PDF artifact generation when browser dependencies are available
- optional tagging and documentation publish steps
## Branch And PR Model
- Baseline branch: `main`
- Drift branches:
  - `drift/intune`
  - `drift/entra`
Each workload keeps one rolling PR open to `main`.
Key behavior:
- reports are generated in `tenant-state/reports/*` but excluded from rolling drift commits and PR diffs
- rolling PRs can be created as draft first, then published after automated summary generation when `ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS=true`
- merge strategy for the rolling PR is controlled by `ROLLING_PR_MERGE_STRATEGY` and defaults to `rebase`
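Creating the rolling PR is one call to the Azure DevOps Git REST API. A trimmed sketch of the creation path in `ensure_rolling_pr.py` (using `urllib.request`, as the scripts do; the update/reuse path and error handling are omitted, and the function shown is illustrative):
```python
import json
import os
import urllib.request


def create_rolling_pr(title: str, source_branch: str, target_branch: str) -> dict:
    """POST a new pull request from the drift branch into the baseline branch."""
    url = (
        f"{os.environ['SYSTEM_COLLECTIONURI']}{os.environ['SYSTEM_TEAMPROJECT']}"
        f"/_apis/git/repositories/{os.environ['BUILD_REPOSITORY_ID']}"
        "/pullrequests?api-version=7.1"
    )
    body = {
        "sourceRefName": f"refs/heads/{source_branch}",
        "targetRefName": f"refs/heads/{target_branch}",
        "title": title,
    }
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # System.AccessToken is accepted as a bearer token by Azure DevOps.
            "Authorization": f"Bearer {os.environ['SYSTEM_ACCESSTOKEN']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```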
## Review, Tickets, And Remediation
The PR automation currently supports:
- deterministic operation counts and risk assessment
- rename-aware semantic comparison
- stable change fingerprinting for idempotent summary refresh
- optional Azure OpenAI reviewer narrative
- optional per-file `Change Needed` review threads when `REQUIRE_CHANGE_TICKETS=true`
Reviewer thread commands:
- `/reject`: remove that file-level drift from the rolling PR by resetting it to baseline
- `/accept`: keep that file in PR scope
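Recognizing a decision is a matter of matching the leading token of a thread comment. A simplified sketch (the real script also tracks thread status and the file path each thread is attached to):
```python
from __future__ import annotations

import re

DECISION_RE = re.compile(r"^\s*/(reject|accept)\b", re.IGNORECASE)


def parse_decision(comment_text: str) -> str | None:
    """Return 'reject' or 'accept' when a comment starts with a decision command."""
    match = DECISION_RE.match(comment_text or "")
    return match.group(1).lower() if match else None


assert parse_decision("/reject stale CA policy change") == "reject"
```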
Supported remediation paths:
1. Reject and abandon a whole rolling PR.
   The next run can detect the matching rejected snapshot and queue restore automatically.
2. Reject selected files in ticket threads, then merge the accepted remainder.
   The review-sync pipeline can queue restore after merge so the tenant is reconciled to the merged baseline.
3. Queue `azure-pipelines-restore.yml` manually for full or selective historical rollback.
## Key Variables
Core repo and branch settings:
- `BASELINE_BRANCH`
- `DRIFT_BRANCH_INTUNE`
- `DRIFT_BRANCH_ENTRA`
- `ROLLING_PR_TITLE_INTUNE`
- `ROLLING_PR_TITLE_ENTRA`
- `BACKUP_FOLDER`
- `INTUNE_BACKUP_SUBDIR`
- `ENTRA_BACKUP_SUBDIR`
- `REPORTS_SUBDIR`
Workload toggles:
- `ENABLE_WORKLOAD_INTUNE`
- `ENABLE_WORKLOAD_ENTRA`
- `ENABLE_ENTRA_CONDITIONAL_ACCESS`
Intune behavior:
- `INTUNECD_VERSION`
- `EXCLUDE_SCRIPT_BACKUP`
- `INTUNE_EXCLUDE_CSV`
- `SPLIT_DOCUMENTATION`
Entra behavior:
- `ENTRA_INCLUDE_NAMED_LOCATIONS`
- `ENTRA_INCLUDE_AUTHENTICATION_STRENGTHS`
- `ENTRA_INCLUDE_CONDITIONAL_ACCESS`
- `ENTRA_INCLUDE_APP_REGISTRATIONS`
- `ENTRA_INCLUDE_ENTERPRISE_APPS`
- `ENTRA_ENTERPRISE_APP_WORKERS`
PR and reviewer automation:
- `ENABLE_PR_REVIEW_SUMMARY`
- `ENABLE_PR_REVIEWER_DECISIONS`
- `ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS`
- `ROLLING_PR_MERGE_STRATEGY`
- `REQUIRE_CHANGE_TICKETS`
- `CHANGE_TICKET_REGEX`
- `DEBUG_CHANGE_TICKET_THREADS`
Auto-remediation:
- `AUTO_REMEDIATE_ON_PR_REJECTION`
- `AUTO_REMEDIATE_AFTER_MERGE`
- `AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS`
- `AUTO_REMEDIATE_RESTORE_PIPELINE_ID`
- `AUTO_REMEDIATE_DRY_RUN`
- `AUTO_REMEDIATE_UPDATE_ASSIGNMENTS`
- `AUTO_REMEDIATE_REMOVE_OBJECTS`
- `AUTO_REMEDIATE_MAX_WORKERS`
- `AUTO_REMEDIATE_EXCLUDE_CSV`
Azure OpenAI integration:
- `ENABLE_PR_AI_SUMMARY`
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_DEPLOYMENT`
- `AZURE_OPENAI_API_KEY`
- `AZURE_OPENAI_API_VERSION`
- `PR_AI_PAYLOAD_MAX_BYTES`
- `PR_AI_MAX_TOKENS`
- `PR_AI_COMPACT_MAX_CHARS`
## Required Azure DevOps Permissions
The pipeline build identity should have repository permissions to:
- contribute
- create branch
- force push
- create and update pull requests
- create tag if tagging is enabled
For auto-queued restore, the same identity also needs the following on `azure-pipelines-restore.yml`:
- `View builds`
- `Queue builds`
- pipeline authorization if explicit pipeline permissions are enforced
Also enable script access to `System.AccessToken`.
## Required Microsoft Graph Application Permissions
Baseline read permissions used by the current implementation:
- `Device.Read.All`
- `DeviceManagementApps.Read.All`
- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementManagedDevices.Read.All`
- `DeviceManagementRBAC.Read.All`
- `DeviceManagementScripts.Read.All`
- `DeviceManagementServiceConfig.Read.All`
- `Group.Read.All`
- `Policy.Read.All`
- `Policy.Read.ConditionalAccess`
- `Policy.Read.DeviceConfiguration`
- `User.Read.All`
Additional read permissions used by the current Entra scope:
- `Application.Read.All`
- `RoleManagement.Read.Directory` or `Directory.Read.All` for richer name resolution
- `AuditLog.Read.All` for best-effort Entra drift author attribution
Restore pipeline write permissions:
- `DeviceManagementApps.ReadWrite.All`
- `DeviceManagementConfiguration.ReadWrite.All`
- `DeviceManagementManagedDevices.ReadWrite.All`
- `DeviceManagementRBAC.ReadWrite.All`
- `DeviceManagementScripts.ReadWrite.All`
- `DeviceManagementServiceConfig.ReadWrite.All`
- `Group.Read.All`
Additional restore permissions when `includeEntraUpdate=true`:
- `Policy.Read.All`
- `Policy.ReadWrite.ConditionalAccess`
## Outputs
Intune outputs:
- JSON backup under `tenant-state/intune/**`
- `tenant-state/reports/intune/policy-assignments.md`
- `tenant-state/reports/intune/policy-assignments.csv`
- `tenant-state/reports/intune/object-inventory-all.csv`
- `tenant-state/reports/intune/Object Inventory/*-inventory.csv`
Entra outputs:
- JSON backup under `tenant-state/entra/**`
- `tenant-state/reports/entra/policy-assignments.md`
- `tenant-state/reports/entra/policy-assignments.csv`
- `tenant-state/reports/entra/apps-inventory.csv`
- `tenant-state/reports/entra/object-inventory-all.csv`
- `tenant-state/reports/entra/Object Inventory/*-inventory.csv`
Full-run documentation artifacts:
- `prod-as-built-split-markdown`
- `prod-as-built-split-html`
- `prod-as-built-split-pdf`
## Local Script Usage
Generate Entra export locally:
```bash
python3 ./scripts/export_entra_baseline.py \
  --root ./tenant-state/entra \
  --token "$GRAPH_TOKEN" \
  --enterprise-app-workers 8
```
Resolve Conditional Access references:
```bash
python3 ./scripts/resolve_ca_references.py \
  --root ./tenant-state/entra \
  --token "$GRAPH_TOKEN"
```
Generate Intune assignment report:
```bash
python3 ./scripts/generate_assignment_report.py \
  --root ./tenant-state/intune \
  --output-dir ./tenant-state/reports/intune
```
Generate Entra assignment report:
```bash
python3 ./scripts/generate_assignment_report.py \
  --root ./tenant-state/entra \
  --output-dir ./tenant-state/reports/entra
```
Generate Entra apps inventory:
```bash
python3 ./scripts/generate_app_inventory_report.py \
  --root ./tenant-state/entra \
  --output-dir ./tenant-state/reports/entra
```
Generate workload object inventories:
```bash
python3 ./scripts/generate_object_inventory_reports.py \
  --root ./tenant-state/intune \
  --output-dir ./tenant-state/reports/intune
python3 ./scripts/generate_object_inventory_reports.py \
  --root ./tenant-state/entra \
  --output-dir ./tenant-state/reports/entra
```
Validate backup outputs:
```bash
python3 ./scripts/validate_backup_outputs.py \
  --workload intune \
  --mode light \
  --root ./tenant-state/intune \
  --reports-root ./tenant-state/reports/intune
```
## Tests
The repository includes focused unit tests for:
- Entra export behavior
- backup output validation
- rolling PR creation/update logic
- PR summary generation
- reviewer rejection processing
- post-merge restore queueing
- Intune partial-export noise filtering
- Entra enrichment-noise filtering

611
azure-pipelines-restore.yml Normal file

@@ -0,0 +1,611 @@
trigger: none
pr: none
parameters:
  - name: dryRun
    displayName: Dry run only (report, no changes pushed)
    type: boolean
    default: true
  - name: updateAssignments
    displayName: Update assignments
    type: boolean
    default: true
  - name: removeObjectsNotInBaseline
    displayName: Remove objects not present in baseline
    type: boolean
    default: false
  - name: includeEntraUpdate
    displayName: Include Entra updates
    type: boolean
    default: false
  - name: baselineBranch
    displayName: Baseline branch to restore from
    type: string
    default: main
  - name: baselineRef
    displayName: Optional historical git ref (branch/tag/commit) to restore from
    type: string
    default: ""
  - name: restoreMode
    displayName: Restore mode (`full` or `selective`)
    type: string
    default: full
  - name: restorePathsCsv
    displayName: Selective restore file paths (CSV; repo-relative or intune-relative)
    type: string
    default: ""
  - name: maxWorkers
    displayName: IntuneCD max workers
    type: number
    default: 10
  - name: excludeCsv
    displayName: Exclude object categories (comma-separated IntuneCD keys)
    type: string
    default: ""
variables:
  # Tenant-specific values are expected in a variable group (see templates/variables-tenant.yml).
  # Uncomment the line below after creating the group in your Azure DevOps project.
  # - group: vg-astral-tenant
  - template: templates/variables-common.yml
  - name: BACKUP_FOLDER
    value: tenant-state
  - name: INTUNE_BACKUP_SUBDIR
    value: intune
  - name: INTUNECD_VERSION
    value: 2.5.0
jobs:
  - job: restore_from_baseline
    displayName: Restore tenant from approved baseline
    pool:
      name: $(AGENT_POOL_NAME)
    steps:
      - checkout: self
        persistCredentials: true
      - task: Bash@3
        displayName: Checkout approved baseline snapshot
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            TARGET_REF_RAW="${{ parameters.baselineRef }}"
            TARGET_REF="$(echo "$TARGET_REF_RAW" | xargs)"
            TARGET_REF_LOWER="$(echo "$TARGET_REF" | tr '[:upper:]' '[:lower:]')"
            if echo "$TARGET_REF" | grep -Eq '^\$\([^)]+\)$'; then
              TARGET_REF=""
            elif [ "$TARGET_REF_LOWER" = "none" ] || [ "$TARGET_REF_LOWER" = "null" ] || [ "$TARGET_REF_LOWER" = "n/a" ] || [ "$TARGET_REF_LOWER" = "-" ] || [ "$TARGET_REF_LOWER" = "_none_" ]; then
              TARGET_REF=""
            fi
            if [ -z "$TARGET_REF" ]; then
              TARGET_REF="${{ parameters.baselineBranch }}"
            fi
            git fetch --quiet --tags origin
            git fetch --quiet origin "${{ parameters.baselineBranch }}"
            RESOLVED_REF=""
            if git rev-parse --verify --quiet "origin/$TARGET_REF^{commit}" >/dev/null; then
              RESOLVED_REF="origin/$TARGET_REF"
            elif git rev-parse --verify --quiet "$TARGET_REF^{commit}" >/dev/null; then
              RESOLVED_REF="$TARGET_REF"
            elif git fetch --quiet origin "$TARGET_REF" >/dev/null 2>&1; then
              RESOLVED_REF="FETCH_HEAD"
            fi
            if [ -z "$RESOLVED_REF" ]; then
              echo "##vso[task.logissue type=error]Unable to resolve baseline ref '$TARGET_REF'."
              echo "Checked local ref, origin/<ref>, and direct fetch origin <ref>."
              exit 1
            fi
            git checkout --force --detach "$RESOLVED_REF"
            RESOLVED_COMMIT="$(git rev-parse HEAD)"
            echo "Restore baseline snapshot selected: requested=$TARGET_REF resolved=$RESOLVED_REF commit=$RESOLVED_COMMIT"
            echo "##vso[task.setvariable variable=RESTORE_BASELINE_REF]$TARGET_REF"
            echo "##vso[task.setvariable variable=RESTORE_BASELINE_COMMIT]$RESOLVED_COMMIT"
            test -d "$(Build.SourcesDirectory)/$(BACKUP_FOLDER)/$(INTUNE_BACKUP_SUBDIR)"
      - task: Bash@3
        displayName: Install IntuneCD
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            pip3 install "IntuneCD==$(INTUNECD_VERSION)"
      - task: Bash@3
        displayName: Prepare restore payload scope
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 - <<'PY'
            import os
            import pathlib
            import shutil
            import sys

            repo_root = pathlib.Path(os.environ["BUILD_SOURCESDIRECTORY"]).resolve()
            backup_folder = os.environ["BACKUP_FOLDER"]
            intune_subdir = os.environ["INTUNE_BACKUP_SUBDIR"]
            restore_mode = os.environ.get("RESTORE_MODE", "").strip().lower() or "full"
            restore_paths_csv = os.environ.get("RESTORE_PATHS_CSV", "").strip()
            temp_root = pathlib.Path(os.environ["AGENT_TEMPDIRECTORY"]).resolve() / "restore-scope-intune"
            intune_root = repo_root / backup_folder / intune_subdir
            if not intune_root.is_dir():
                print(f"##vso[task.logissue type=error]Intune restore source root not found: {intune_root}")
                raise SystemExit(1)
            if restore_mode == "full":
                print(f"Restore mode: full (source={intune_root})")
                print(f"##vso[task.setvariable variable=RESTORE_SOURCE_PATH]{intune_root}")
                raise SystemExit(0)
            if restore_mode not in {"selective", "paths"}:
                print(f"##vso[task.logissue type=error]Unsupported restoreMode '{restore_mode}'. Use 'full' or 'selective'.")
                raise SystemExit(1)
            raw_items = [item.strip() for item in restore_paths_csv.replace("\n", ",").split(",")]
            raw_items = [item for item in raw_items if item]
            if not raw_items:
                print("##vso[task.logissue type=error]restoreMode=selective requires restorePathsCsv with at least one path.")
                raise SystemExit(1)
            backup_prefix = f"{backup_folder}/{intune_subdir}/"
            copied = []
            errors = []
            if temp_root.exists():
                shutil.rmtree(temp_root)
            temp_root.mkdir(parents=True, exist_ok=True)

            def normalize_to_intune_relative(path_text):
                # Reject absolute paths before trimming; lstrip("./") would strip any run
                # of leading '.' and '/' characters and silently accept absolute input.
                p = path_text.replace("\\", "/").strip()
                if p.startswith("/"):
                    return None
                while p.startswith("./"):
                    p = p[2:]
                if p.startswith(backup_prefix):
                    p = p[len(backup_prefix):]
                elif p.startswith(f"{intune_subdir}/"):
                    p = p[len(intune_subdir) + 1 :]
                return p.strip("/")

            for item in raw_items:
                rel = normalize_to_intune_relative(item)
                if not rel:
                    errors.append(f"Invalid path '{item}'")
                    continue
                parts = pathlib.PurePosixPath(rel).parts
                if any(part in {"..", ""} for part in parts):
                    errors.append(f"Path traversal is not allowed: '{item}'")
                    continue
                src = intune_root.joinpath(*parts)
                if not src.is_file():
                    errors.append(f"Path not found in selected baseline snapshot: '{item}' -> '{src}'")
                    continue
                dst = temp_root.joinpath(*parts)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
                copied.append(rel)
            if errors:
                for message in errors:
                    print(f"##vso[task.logissue type=error]{message}")
                raise SystemExit(1)
            if not copied:
                print("##vso[task.logissue type=error]No files were prepared for selective restore.")
                raise SystemExit(1)
            print(f"Restore mode: selective (files={len(copied)})")
            for rel in copied:
                print(f" - {backup_folder}/{intune_subdir}/{rel}")
            print(f"##vso[task.setvariable variable=RESTORE_SOURCE_PATH]{temp_root}")
            PY
        env:
          RESTORE_MODE: ${{ parameters.restoreMode }}
          RESTORE_PATHS_CSV: ${{ parameters.restorePathsCsv }}
      - task: AzurePowerShell@5
        displayName: Get Graph token for restore
        inputs:
          azureSubscription: $(SERVICE_CONNECTION_NAME)
          azurePowerShellVersion: LatestVersion
          ScriptType: inlineScript
          Inline: |
            $getTokenParams = @{
                ResourceTypeName = 'MSGraph'
                AsSecureString = $true
                ErrorAction = 'Stop'
            }
            $tokenCommand = Get-Command Get-AzAccessToken -ErrorAction Stop
            if ($tokenCommand.Parameters.ContainsKey('ForceRefresh')) {
                $getTokenParams['ForceRefresh'] = $true
            }
            $accessToken = ([PSCredential]::New('dummy', (Get-AzAccessToken @getTokenParams).Token).GetNetworkCredential().Password)
            $tokenParts = $accessToken.Split('.')
            if ($tokenParts.Length -lt 2) { throw "Invalid Graph access token format." }
            $payload = $tokenParts[1].Replace('-', '+').Replace('_', '/')
            switch ($payload.Length % 4) {
                2 { $payload += '==' }
                3 { $payload += '=' }
            }
            $payloadJson = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($payload))
            $claims = $payloadJson | ConvertFrom-Json
            $roles = @($claims.roles)
            $sortedRoles = $roles | Sort-Object
            Write-Host "Graph token roles for restore: $($sortedRoles -join ', ')"
            $missingRoles = @()
            $requiredIntuneWriteRoles = @(
                'DeviceManagementApps.ReadWrite.All',
                'DeviceManagementConfiguration.ReadWrite.All',
                'DeviceManagementManagedDevices.ReadWrite.All',
                'DeviceManagementRBAC.ReadWrite.All',
                'DeviceManagementScripts.ReadWrite.All',
                'DeviceManagementServiceConfig.ReadWrite.All',
                'Group.Read.All'
            )
            foreach ($role in $requiredIntuneWriteRoles) {
                if (-not ($roles -contains $role)) { $missingRoles += $role }
            }
            if ("${{ parameters.includeEntraUpdate }}" -eq "true") {
                $requiredEntraWriteRoles = @(
                    'Policy.Read.All',
                    'Policy.ReadWrite.ConditionalAccess'
                )
                foreach ($role in $requiredEntraWriteRoles) {
                    if (-not ($roles -contains $role)) { $missingRoles += $role }
                }
            }
            if ($missingRoles.Count -gt 0) {
                $missingRoles = $missingRoles | Select-Object -Unique | Sort-Object
                Write-Host "##vso[task.logissue type=error]Graph token is missing restore permissions: $($missingRoles -join ', ')"
                throw "Service connection token is missing required Graph application permissions for restore."
            }
            Write-Host "##vso[task.setvariable variable=accessToken;issecret=true]$accessToken"
      - task: Bash@3
        displayName: Run IntuneCD restore/update
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            echo "RESTORE_SCRIPT_VERSION=2026-03-12.8"
            to_lower() {
              echo "$1" | tr '[:upper:]' '[:lower:]'
            }
            DRY_RUN="$(to_lower "$DRY_RUN")"
            UPDATE_ASSIGNMENTS="$(to_lower "$UPDATE_ASSIGNMENTS")"
            REMOVE_UNMANAGED="$(to_lower "$REMOVE_UNMANAGED")"
            ENTRA_UPDATE="$(to_lower "$ENTRA_UPDATE")"
            if [ -z "$(RESTORE_SOURCE_PATH)" ]; then
              RESTORE_PATH="$(Build.SourcesDirectory)/$(BACKUP_FOLDER)/$(INTUNE_BACKUP_SUBDIR)"
            else
              RESTORE_PATH="$(RESTORE_SOURCE_PATH)"
            fi
            export RESTORE_PATH_ENV="$RESTORE_PATH"
            python3 - <<'PY'
            import base64
            import json
            import os
            import pathlib
            import re
            import urllib.parse
            import urllib.request

            root = pathlib.Path(os.environ["RESTORE_PATH_ENV"]).resolve()
            if not root.exists():
                print(f"Restore source folder not found; payload normalization skipped: {root}")
                raise SystemExit(0)
            graph_token = os.environ.get("GRAPH_TOKEN", "").strip()
            guid_re = re.compile(
                r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-5][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}$"
            )
            group_id_cache = {}

            def is_guid(value):
                return bool(guid_re.match(str(value or "").strip()))

            def resolve_group_id(group_name):
                name = str(group_name or "").strip()
                if not name or not graph_token:
                    return None
                cache_key = name.lower()
                if cache_key in group_id_cache:
                    return group_id_cache[cache_key]
                filter_value = name.replace("'", "''")
                query = urllib.parse.urlencode(
                    {
                        "$select": "id,displayName",
                        "$filter": f"displayName eq '{filter_value}'",
                    }
                )
                url = f"https://graph.microsoft.com/v1.0/groups?{query}"
                request = urllib.request.Request(
                    url,
                    headers={
                        "Authorization": f"Bearer {graph_token}",
                        "Accept": "application/json",
                    },
                    method="GET",
                )
                try:
                    with urllib.request.urlopen(request, timeout=30) as response:
                        payload = json.loads(response.read().decode("utf-8"))
                except Exception:
                    group_id_cache[cache_key] = None
                    return None
                values = payload.get("value", []) if isinstance(payload, dict) else []
                exact = [
                    item for item in values
                    if str(item.get("displayName", "")) == name and is_guid(item.get("id"))
                ]
                if len(exact) == 1:
                    resolved = exact[0]["id"]
                    group_id_cache[cache_key] = resolved
                    return resolved
                ids = [item.get("id") for item in values if is_guid(item.get("id"))]
                resolved = ids[0] if len(ids) == 1 else None
                group_id_cache[cache_key] = resolved
                return resolved

            def strip_assignment_display_labels(node):
                removed = 0
                allowed_assignment_target_keys = {
                    "@odata.type",
                    "groupId",
                    "collectionId",
                    "deviceAndAppManagementAssignmentFilterId",
                    "deviceAndAppManagementAssignmentFilterType",
                }
                if isinstance(node, dict):
                    odata_type = str(node.get("@odata.type", "") or "").lower()
                    is_assignment_target = "assignmenttarget" in odata_type
                    if is_assignment_target:
                        group_name = (
                            node.get("groupName")
                            or node.get("groupDisplayName")
                            or node.get("displayName")
                        )
                        group_id = str(node.get("groupId", "") or "").strip()
                        is_group_target = "groupassignmenttarget" in odata_type
                        if is_group_target and not is_guid(group_id):
                            resolved_group_id = resolve_group_id(group_name)
                            if resolved_group_id:
                                node["groupId"] = resolved_group_id
                                group_id = resolved_group_id
                        for key in list(node.keys()):
                            if key.startswith("@odata."):
                                continue
                            if key not in allowed_assignment_target_keys:
                                if key in node:
                                    node.pop(key, None)
                                    removed += 1
                        # Keep only valid group assignment targets for update payload.
                        if is_group_target and not is_guid(node.get("groupId")):
                            node["__drop_assignment_target__"] = True
                    elif "groupId" in node:
                        for key in ("groupDisplayName", "groupName", "displayName", "groupType"):
                            if key in node:
                                node.pop(key, None)
                                removed += 1
                    if "targetDisplayName" in node and isinstance(node.get("target"), dict):
                        node.pop("targetDisplayName", None)
                        removed += 1
                    for value in node.values():
                        removed += strip_assignment_display_labels(value)
                elif isinstance(node, list):
                    for item in node:
                        removed += strip_assignment_display_labels(item)
                return removed

            def prune_invalid_assignment_targets(node):
                removed = 0
                if isinstance(node, dict):
                    assignments = node.get("assignments")
                    if isinstance(assignments, list):
                        filtered = []
                        for assignment in assignments:
                            target = assignment.get("target") if isinstance(assignment, dict) else None
                            if isinstance(target, dict) and target.get("__drop_assignment_target__") is True:
                                removed += 1
                                continue
                            filtered.append(assignment)
                        if len(filtered) != len(assignments):
                            node["assignments"] = filtered
                    for value in node.values():
                        removed += prune_invalid_assignment_targets(value)
                elif isinstance(node, list):
                    for item in node:
                        removed += prune_invalid_assignment_targets(item)
                return removed

            def remove_internal_markers(node):
                removed = 0
                if isinstance(node, dict):
                    if "__drop_assignment_target__" in node:
                        node.pop("__drop_assignment_target__", None)
                        removed += 1
                    for value in node.values():
                        removed += remove_internal_markers(value)
                elif isinstance(node, list):
                    for item in node:
                        removed += remove_internal_markers(item)
                return removed

            normalized_payload_json = 0
            sanitized_assignment_labels = 0
            invalid_assignment_targets_removed = 0
            files_changed = 0
            for path in sorted(root.rglob("*.json")):
                try:
                    data = json.loads(path.read_text(encoding="utf-8"))
                except Exception:
                    continue
                file_changed = False
                # IntuneCD expects payloadJson as base64 string; backup may store dict/list.
                # Some exports can be list-root JSON, so only access payloadJson on dict roots.
                if isinstance(data, dict):
                    payload = data.get("payloadJson")
                    if isinstance(payload, (dict, list)):
                        payload_json = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
                        data["payloadJson"] = base64.b64encode(payload_json).decode("ascii")
                        normalized_payload_json += 1
                        file_changed = True
                removed = strip_assignment_display_labels(data)
                if removed > 0:
                    sanitized_assignment_labels += removed
                    file_changed = True
                dropped = prune_invalid_assignment_targets(data)
                if dropped > 0:
                    invalid_assignment_targets_removed += dropped
                    file_changed = True
                # Clean up internal markers used by prune flow.
                if remove_internal_markers(data) > 0:
                    file_changed = True
                if file_changed:
                    path.write_text(json.dumps(data, indent=4, ensure_ascii=False) + "\n", encoding="utf-8")
                    files_changed += 1
            print(
                "Restore payload normalization complete: "
                f"filesChanged={files_changed}, "
                f"appConfigPayloadJsonNormalized={normalized_payload_json}, "
                f"assignmentDisplayLabelsRemoved={sanitized_assignment_labels}, "
                f"invalidAssignmentTargetsRemoved={invalid_assignment_targets_removed}"
            )
            PY
            cmd=(
              IntuneCD-startupdate
              --token "$(accessToken)"
              --mode=1
              --path "$RESTORE_PATH"
              --exit-on-error
            )
            if [ "$DRY_RUN" = "true" ]; then
              cmd+=(--report)
            fi
            if [ "$UPDATE_ASSIGNMENTS" = "true" ]; then
              cmd+=(--update-assignments)
            fi
            if [ "$REMOVE_UNMANAGED" = "true" ]; then
              cmd+=(--remove)
            fi
            if [ "$ENTRA_UPDATE" = "true" ]; then
              cmd+=(--entraupdate)
            fi
            EXCLUDE_CSV_TRIMMED="$(echo "$EXCLUDE_CSV" | xargs)"
            EXCLUDE_CSV_NORMALIZED="$(echo "$EXCLUDE_CSV_TRIMMED" | tr '[:upper:]' '[:lower:]')"
            if [ "$EXCLUDE_CSV_NORMALIZED" = "none" ] || [ "$EXCLUDE_CSV_NORMALIZED" = "null" ] || [ "$EXCLUDE_CSV_NORMALIZED" = "n/a" ] || [ "$EXCLUDE_CSV_NORMALIZED" = "-" ] || [ "$EXCLUDE_CSV_NORMALIZED" = "_none_" ]; then
              EXCLUDE_CSV_TRIMMED=""
            fi
            exclude_items=()
            if [ -n "$EXCLUDE_CSV_TRIMMED" ]; then
              IFS=',' read -r -a raw_items <<< "$EXCLUDE_CSV_TRIMMED"
              for item in "${raw_items[@]}"; do
                trimmed="$(echo "$item" | xargs)"
                if [ -n "$trimmed" ]; then
                  exclude_items+=("$trimmed")
                fi
              done
            fi
            has_dms_exclude=0
            for item in "${exclude_items[@]}"; do
              if [ "$(echo "$item" | tr '[:upper:]' '[:lower:]')" = "devicemanagementsettings" ]; then
                has_dms_exclude=1
                break
              fi
            done
            if [ "$has_dms_exclude" -eq 0 ]; then
              exclude_items+=("DeviceManagementSettings")
              echo "Auto-excluding DeviceManagementSettings (IntuneCD update requires interactive auth for this category)."
            fi
            if [ "${#exclude_items[@]}" -gt 0 ]; then
              cmd+=(--exclude)
              cmd+=("${exclude_items[@]}")
            fi
            echo "Restore command mode: dryRun=$DRY_RUN updateAssignments=$UPDATE_ASSIGNMENTS remove=$REMOVE_UNMANAGED entraupdate=$ENTRA_UPDATE maxWorkers=$MAX_WORKERS sourcePath=$RESTORE_PATH"
            if [ "${#exclude_items[@]}" -gt 0 ]; then
              joined_excludes="$(IFS=,; echo "${exclude_items[*]}")"
              echo "Excluding categories: $joined_excludes"
            fi
            intunecd_log="${AGENT_TEMPDIRECTORY:-/tmp}/intunecd-restore.log"
            rm -f "$intunecd_log"
            set +e
            "${cmd[@]}" >"$intunecd_log" 2>&1
            intunecd_rc=$?
            set -e
            echo "IntuneCD exit code captured: $intunecd_rc"
            if [ "$intunecd_rc" -ne 0 ]; then
              echo "IntuneCD restore/update failed with exit code: $intunecd_rc"
              marker_pattern="error|\\[ERROR\\]|\\[CRITICAL\\]|request failed|failed with status|modelvalidationfailure|traceback|exception|error updating|failed after|unable to|forbidden|unauthorized"
              marker_count="$(grep -Eic "$marker_pattern" "$intunecd_log" || true)"
              echo "Detected error-marker lines: $marker_count"
              echo "Relevant markers from full output (line:number:text):"
              grep -Ein "$marker_pattern" "$intunecd_log" | tail -n 200 || true
              echo "First 80 lines of IntuneCD output:"
              head -n 80 "$intunecd_log" || true
              echo "Last 120 lines of IntuneCD output:"
              tail -n 120 "$intunecd_log" || true
              if [ "${marker_count:-0}" -eq 0 ]; then
                echo "##vso[task.logissue type=warning]IntuneCD returned non-zero without explicit error markers; treating as successful no-op."
                intunecd_rc=0
              fi
            else
              echo "Last 60 lines of IntuneCD output:"
              tail -n 60 "$intunecd_log" || true
            fi
            if [ "$intunecd_rc" -ne 0 ]; then
              exit "$intunecd_rc"
            fi
          failOnStderr: false
        env:
          DRY_RUN: ${{ parameters.dryRun }}
          UPDATE_ASSIGNMENTS: ${{ parameters.updateAssignments }}
          REMOVE_UNMANAGED: ${{ parameters.removeObjectsNotInBaseline }}
          ENTRA_UPDATE: ${{ parameters.includeEntraUpdate }}
          MAX_WORKERS: ${{ parameters.maxWorkers }}
          EXCLUDE_CSV: ${{ parameters.excludeCsv }}
          GRAPH_TOKEN: $(accessToken)

194
azure-pipelines-review-sync.yml Normal file

@@ -0,0 +1,194 @@
trigger: none
pr: none
schedules:
  - cron: "*/20 * * * *"
    displayName: "Review decision sync (every 20 minutes)"
    branches:
      include:
        - main
    always: true
    batch: true
variables:
  # Tenant-specific values are expected in a variable group (see templates/variables-tenant.yml).
  # Uncomment the line below after creating the group in your Azure DevOps project.
  # - group: vg-astral-tenant
  - template: templates/variables-common.yml
jobs:
  - job: sync_intune_review_decisions
    displayName: Sync Intune reviewer decisions
    condition: eq(variables['ENABLE_WORKLOAD_INTUNE'], 'true')
    pool:
      name: $(AGENT_POOL_NAME)
    steps:
      - checkout: self
        persistCredentials: true
      - task: Bash@3
        displayName: Apply reviewer /reject decisions (Intune)
        condition: eq(variables['ENABLE_PR_REVIEWER_DECISIONS'], 'true')
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 "$(Build.SourcesDirectory)/scripts/apply_reviewer_rejections.py" \
              --repo-root "$(Build.SourcesDirectory)" \
              --workload "intune" \
              --drift-branch "$(DRIFT_BRANCH_INTUNE)" \
              --baseline-branch "$(BASELINE_BRANCH)"
          workingDirectory: "$(Build.SourcesDirectory)"
          failOnStderr: false
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)
          SYSTEM_COLLECTIONURI: $(System.CollectionUri)
          SYSTEM_TEAMPROJECT: $(System.TeamProject)
          BUILD_REPOSITORY_ID: $(Build.Repository.ID)
      - task: Bash@3
        displayName: Update automated reviewer summary (Intune)
        condition: eq(variables['ENABLE_PR_REVIEW_SUMMARY'], 'true')
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 "$(Build.SourcesDirectory)/scripts/update_pr_review_summary.py" \
              --repo-root "$(Build.SourcesDirectory)" \
              --workload "intune" \
              --backup-folder "$(BACKUP_FOLDER)" \
              --reports-subdir "$(REPORTS_SUBDIR)" \
              --drift-branch "$(DRIFT_BRANCH_INTUNE)" \
              --baseline-branch "$(BASELINE_BRANCH)"
          workingDirectory: "$(Build.SourcesDirectory)"
          failOnStderr: false
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)
          SYSTEM_COLLECTIONURI: $(System.CollectionUri)
          SYSTEM_TEAMPROJECT: $(System.TeamProject)
          BUILD_REPOSITORY_ID: $(Build.Repository.ID)
          ENABLE_PR_AI_SUMMARY: $(ENABLE_PR_AI_SUMMARY)
          AZURE_OPENAI_ENDPOINT: $(AZURE_OPENAI_ENDPOINT)
          AZURE_OPENAI_DEPLOYMENT: $(AZURE_OPENAI_DEPLOYMENT)
          AZURE_OPENAI_API_KEY: $(AZURE_OPENAI_API_KEY)
          AZURE_OPENAI_API_VERSION: $(AZURE_OPENAI_API_VERSION)
          REQUIRE_CHANGE_TICKETS: $(REQUIRE_CHANGE_TICKETS)
          CHANGE_TICKET_REGEX: $(CHANGE_TICKET_REGEX)
          DEBUG_CHANGE_TICKET_THREADS: $(DEBUG_CHANGE_TICKET_THREADS)
          ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS: $(ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS)
      - task: Bash@3
        displayName: Queue post-merge remediation from reviewer /reject (Intune)
        condition: eq(variables['AUTO_REMEDIATE_AFTER_MERGE'], 'true')
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 "$(Build.SourcesDirectory)/scripts/queue_post_merge_restore.py" \
              --workload "intune" \
              --drift-branch "$(DRIFT_BRANCH_INTUNE)" \
              --baseline-branch "$(BASELINE_BRANCH)"
          workingDirectory: "$(Build.SourcesDirectory)"
          failOnStderr: false
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)
          SYSTEM_COLLECTIONURI: $(System.CollectionUri)
          SYSTEM_TEAMPROJECT: $(System.TeamProject)
          BUILD_REPOSITORY_ID: $(Build.Repository.ID)
          AUTO_REMEDIATE_AFTER_MERGE: $(AUTO_REMEDIATE_AFTER_MERGE)
          AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS: $(AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS)
          AUTO_REMEDIATE_RESTORE_PIPELINE_ID: $(AUTO_REMEDIATE_RESTORE_PIPELINE_ID)
          AUTO_REMEDIATE_DRY_RUN: $(AUTO_REMEDIATE_DRY_RUN)
          AUTO_REMEDIATE_UPDATE_ASSIGNMENTS: $(AUTO_REMEDIATE_UPDATE_ASSIGNMENTS)
          AUTO_REMEDIATE_REMOVE_OBJECTS: $(AUTO_REMEDIATE_REMOVE_OBJECTS)
          AUTO_REMEDIATE_MAX_WORKERS: $(AUTO_REMEDIATE_MAX_WORKERS)
          AUTO_REMEDIATE_EXCLUDE_CSV: $(AUTO_REMEDIATE_EXCLUDE_CSV)
          AUTO_REMEDIATE_INCLUDE_ENTRA_UPDATE: false
  - job: sync_entra_review_decisions
    displayName: Sync Entra reviewer decisions
    condition: eq(variables['ENABLE_WORKLOAD_ENTRA'], 'true')
    pool:
      name: $(AGENT_POOL_NAME)
    steps:
      - checkout: self
        persistCredentials: true
      - task: Bash@3
        displayName: Apply reviewer /reject decisions (Entra)
        condition: eq(variables['ENABLE_PR_REVIEWER_DECISIONS'], 'true')
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 "$(Build.SourcesDirectory)/scripts/apply_reviewer_rejections.py" \
              --repo-root "$(Build.SourcesDirectory)" \
              --workload "entra" \
              --drift-branch "$(DRIFT_BRANCH_ENTRA)" \
              --baseline-branch "$(BASELINE_BRANCH)"
          workingDirectory: "$(Build.SourcesDirectory)"
          failOnStderr: false
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)
          SYSTEM_COLLECTIONURI: $(System.CollectionUri)
          SYSTEM_TEAMPROJECT: $(System.TeamProject)
          BUILD_REPOSITORY_ID: $(Build.Repository.ID)
      - task: Bash@3
        displayName: Update automated reviewer summary (Entra)
        condition: eq(variables['ENABLE_PR_REVIEW_SUMMARY'], 'true')
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 "$(Build.SourcesDirectory)/scripts/update_pr_review_summary.py" \
              --repo-root "$(Build.SourcesDirectory)" \
              --workload "entra" \
              --backup-folder "$(BACKUP_FOLDER)" \
              --reports-subdir "$(REPORTS_SUBDIR)" \
              --drift-branch "$(DRIFT_BRANCH_ENTRA)" \
              --baseline-branch "$(BASELINE_BRANCH)"
          workingDirectory: "$(Build.SourcesDirectory)"
          failOnStderr: false
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)
          SYSTEM_COLLECTIONURI: $(System.CollectionUri)
          SYSTEM_TEAMPROJECT: $(System.TeamProject)
          BUILD_REPOSITORY_ID: $(Build.Repository.ID)
          ENABLE_PR_AI_SUMMARY: $(ENABLE_PR_AI_SUMMARY)
          AZURE_OPENAI_ENDPOINT: $(AZURE_OPENAI_ENDPOINT)
          AZURE_OPENAI_DEPLOYMENT: $(AZURE_OPENAI_DEPLOYMENT)
          AZURE_OPENAI_API_KEY: $(AZURE_OPENAI_API_KEY)
          AZURE_OPENAI_API_VERSION: $(AZURE_OPENAI_API_VERSION)
          REQUIRE_CHANGE_TICKETS: $(REQUIRE_CHANGE_TICKETS)
          CHANGE_TICKET_REGEX: $(CHANGE_TICKET_REGEX)
          DEBUG_CHANGE_TICKET_THREADS: $(DEBUG_CHANGE_TICKET_THREADS)
          ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS: $(ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS)
      - task: Bash@3
        displayName: Queue post-merge remediation from reviewer /reject (Entra)
        condition: eq(variables['AUTO_REMEDIATE_AFTER_MERGE'], 'true')
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            python3 "$(Build.SourcesDirectory)/scripts/queue_post_merge_restore.py" \
              --workload "entra" \
              --drift-branch "$(DRIFT_BRANCH_ENTRA)" \
              --baseline-branch "$(BASELINE_BRANCH)"
          workingDirectory: "$(Build.SourcesDirectory)"
          failOnStderr: false
        env:
          SYSTEM_ACCESSTOKEN: $(System.AccessToken)
          SYSTEM_COLLECTIONURI: $(System.CollectionUri)
          SYSTEM_TEAMPROJECT: $(System.TeamProject)
          BUILD_REPOSITORY_ID: $(Build.Repository.ID)
          AUTO_REMEDIATE_AFTER_MERGE: $(AUTO_REMEDIATE_AFTER_MERGE)
          AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS: $(AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS)
          AUTO_REMEDIATE_RESTORE_PIPELINE_ID: $(AUTO_REMEDIATE_RESTORE_PIPELINE_ID)
          AUTO_REMEDIATE_DRY_RUN: $(AUTO_REMEDIATE_DRY_RUN)
          AUTO_REMEDIATE_UPDATE_ASSIGNMENTS: $(AUTO_REMEDIATE_UPDATE_ASSIGNMENTS)
          AUTO_REMEDIATE_REMOVE_OBJECTS: $(AUTO_REMEDIATE_REMOVE_OBJECTS)
          AUTO_REMEDIATE_MAX_WORKERS: $(AUTO_REMEDIATE_MAX_WORKERS)
          AUTO_REMEDIATE_EXCLUDE_CSV: $(AUTO_REMEDIATE_EXCLUDE_CSV)
          AUTO_REMEDIATE_INCLUDE_ENTRA_UPDATE: true

2147
azure-pipelines.yml Normal file

File diff suppressed because it is too large.

46
deploy/RELEASE.md Normal file

@@ -0,0 +1,46 @@
# ASTRAL Public Release Checklist
Use this checklist before publishing a new sanitized version of ASTRAL to the public repository.
## Pre-release scan
Run the following commands from the repository root to ensure no tenant-specific data remains:
```bash
# Search for the original tenant identifiers.
# Note: the --include pattern is left unquoted so the shell brace-expands it into
# one --include flag per extension; grep itself does not expand {braces}.
grep -ri "cqre" --include=*.{yml,yaml,py,md,json,sh,ps1} . | grep -v node_modules | grep -v __pycache__ | grep -v .git
grep -ri "kracmar" --include=*.{yml,yaml,py,md,json,sh,ps1} . | grep -v node_modules | grep -v __pycache__ | grep -v .git
grep -ri "sc_intunebackup" --include=*.{yml,yaml,py,md,json,sh,ps1} . | grep -v node_modules | grep -v __pycache__ | grep -v .git
# Search for the original tenant ID (replace with your actual tenant ID)
grep -ri "0ec9f34c-17c8-4541-b084-7d64ecdcc997" --include=*.{yml,yaml,py,md,json,sh,ps1} . | grep -v node_modules | grep -v __pycache__ | grep -v .git
```
Expected result: **zero matches** outside of this release checklist.
## File verification
- [ ] `azure-pipelines.yml` contains no hardcoded tenant domain, email, or service connection name.
- [ ] `azure-pipelines-restore.yml` contains no hardcoded tenant domain, email, or service connection name.
- [ ] `azure-pipelines-review-sync.yml` contains no hardcoded tenant-specific values.
- [ ] `scripts/common.py` uses a generic fallback name (not `CQRE_Intune_Backupper`).
- [ ] `tenant-state/` contains only placeholder files (`.gitkeep`, `README.md`).
- [ ] `prod-as-built.md` has been deleted.
- [ ] All markdown documentation uses generic examples (`contoso.onmicrosoft.com`, `astral-backup@contoso.com`, `sc-astral-backup`).
## Test verification
- [ ] Unit tests pass: `python3 -m unittest discover -s tests -v`
## Publication steps
1. Ensure you are on a clean branch (e.g. `publish/v1.x`).
2. Run the pre-release scan above.
3. Commit any last-minute fixes.
4. Tag the release: `git tag -a v1.0.0 -m "ASTRAL v1.0.0"`
5. Push the tag.
6. Publish to the public repository (fresh clone or specific branch push).
## Note on Git history
If the original repository contained live tenant exports in its history, consider publishing from a **squashed or freshly initialized repository** rather than pushing the full private history. The public template does not benefit from historical tenant data, and a clean history avoids accidental exposure of old exports.

228
deploy/bootstrap-tenant.ps1 Normal file

@@ -0,0 +1,228 @@
#requires -Version 5.1
<#
.SYNOPSIS
Bootstraps an Azure AD app registration for ASTRAL with required Microsoft Graph permissions.
.DESCRIPTION
Creates a single-tenant app registration, assigns read (and optional write) Graph application permissions,
grants admin consent, and configures a workload federated credential for Azure DevOps.
.PARAMETER TenantName
The Microsoft 365 tenant domain, e.g. contoso.onmicrosoft.com.
.PARAMETER ServiceConnectionName
The intended Azure DevOps service connection name (used for the federated credential subject).
.PARAMETER AppDisplayName
Optional display name for the app registration. Default: "ASTRAL Backup Service".
.PARAMETER AdoOrganizationUrl
Optional Azure DevOps organization URL, e.g. https://dev.azure.com/contoso.
If provided, the script prints a one-liner to create the service connection via REST API.
.PARAMETER AddRestorePermissions
If specified, also adds write permissions for the restore pipeline.
.EXAMPLE
.\bootstrap-tenant.ps1 -TenantName "contoso.onmicrosoft.com" -ServiceConnectionName "sc-astral-backup"
#>
[CmdletBinding()]
param (
[Parameter(Mandatory = $true)]
[string]$TenantName,
[Parameter(Mandatory = $true)]
[string]$ServiceConnectionName,
[string]$AppDisplayName = "ASTRAL Backup Service",
[string]$AdoOrganizationUrl = "",
[switch]$AddRestorePermissions
)
$ErrorActionPreference = "Stop"
function Test-ModuleInstalled {
param ([string]$Name)
$mod = Get-Module -ListAvailable -Name $Name | Select-Object -First 1
if (-not $mod) {
Write-Host "Installing module: $Name" -ForegroundColor Cyan
Install-Module $Name -Scope CurrentUser -Force -AllowClobber
}
}
Test-ModuleInstalled "Microsoft.Graph.Applications"
Test-ModuleInstalled "Microsoft.Graph.Identity.SignIns"
Import-Module Microsoft.Graph.Applications
Import-Module Microsoft.Graph.Identity.SignIns
Write-Host "Connecting to Microsoft Graph..." -ForegroundColor Cyan
Connect-MgGraph -Scopes "Application.ReadWrite.All","AppRoleAssignment.ReadWrite.All","Directory.Read.All" -NoWelcome
$tenant = Get-MgOrganization | Select-Object -First 1
if (-not $tenant) {
throw "Unable to read tenant details. Ensure you are authenticated to the correct tenant."
}
Write-Host "Tenant: $($tenant.DisplayName) ($($tenant.Id))" -ForegroundColor Green
# Required read permissions
$readPermissions = @(
"Device.Read.All",
"DeviceManagementApps.Read.All",
"DeviceManagementConfiguration.Read.All",
"DeviceManagementManagedDevices.Read.All",
"DeviceManagementRBAC.Read.All",
"DeviceManagementScripts.Read.All",
"DeviceManagementServiceConfig.Read.All",
"Group.Read.All",
"Policy.Read.All",
"Policy.Read.ConditionalAccess",
"Policy.Read.DeviceConfiguration",
"User.Read.All",
"Application.Read.All"
)
$optionalReadPermissions = @(
"RoleManagement.Read.Directory",
"Directory.Read.All",
"AuditLog.Read.All"
)
$restorePermissions = @(
"DeviceManagementApps.ReadWrite.All",
"DeviceManagementConfiguration.ReadWrite.All",
"DeviceManagementManagedDevices.ReadWrite.All",
"DeviceManagementRBAC.ReadWrite.All",
"DeviceManagementScripts.ReadWrite.All",
"DeviceManagementServiceConfig.ReadWrite.All",
"Policy.Read.All",
"Policy.ReadWrite.ConditionalAccess"
)
$allPermissions = $readPermissions + $optionalReadPermissions
if ($AddRestorePermissions) {
$allPermissions += $restorePermissions
}
# Get Microsoft Graph SP to map permissions to AppRoles
$graphSp = Get-MgServicePrincipal -Filter "appId eq '00000003-0000-0000-c000-000000000000'"
if (-not $graphSp) {
throw "Microsoft Graph service principal not found in tenant."
}
$requiredResourceAccess = @()
$appRoles = @()
foreach ($permName in ($allPermissions | Select-Object -Unique)) {
$appRole = $graphSp.AppRoles | Where-Object { $_.Value -eq $permName } | Select-Object -First 1
if (-not $appRole) {
Write-Warning "Permission '$permName' not found in Microsoft Graph. Skipping."
continue
}
$appRoles += $appRole
}
if ($appRoles.Count -eq 0) {
throw "No valid Graph permissions resolved. Cannot continue."
}
$resourceAccess = @()
foreach ($ar in $appRoles) {
$resourceAccess += @{
id = $ar.Id
type = "Role"
}
}
$requiredResourceAccess = @(
@{
resourceAppId = $graphSp.AppId
resourceAccess = $resourceAccess
}
)
# Create or update app registration
$existingApp = Get-MgApplication -Filter "displayName eq '$AppDisplayName'" | Select-Object -First 1
if ($existingApp) {
Write-Host "Found existing app registration: $($existingApp.AppId)" -ForegroundColor Yellow
$app = $existingApp
Update-MgApplication -ApplicationId $app.Id -RequiredResourceAccess $requiredResourceAccess
Write-Host "Updated required resource access." -ForegroundColor Green
}
else {
Write-Host "Creating app registration: $AppDisplayName" -ForegroundColor Cyan
$app = New-MgApplication -DisplayName $AppDisplayName -SignInAudience "AzureADMyOrg" -RequiredResourceAccess $requiredResourceAccess
Write-Host "Created app registration. AppId: $($app.AppId)" -ForegroundColor Green
}
# Ensure service principal exists
$sp = Get-MgServicePrincipal -Filter "appId eq '$($app.AppId)'" | Select-Object -First 1
if (-not $sp) {
Write-Host "Creating service principal..." -ForegroundColor Cyan
$sp = New-MgServicePrincipal -AppId $app.AppId
}
# Grant admin consent
Write-Host "Granting admin consent..." -ForegroundColor Cyan
# Fetch existing assignments once instead of once per role.
$existingAssignments = Get-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id -All
foreach ($ar in $appRoles) {
if (-not ($existingAssignments | Where-Object { $_.AppRoleId -eq $ar.Id })) {
New-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id -PrincipalId $sp.Id -ResourceId $graphSp.Id -AppRoleId $ar.Id | Out-Null
}
}
Write-Host "Admin consent granted." -ForegroundColor Green
# Federated credential for Azure DevOps
$federatedCredentialName = "AstralAzureDevOps-$ServiceConnectionName"
$existingFedCred = Get-MgApplicationFederatedIdentityCredential -ApplicationId $app.Id | Where-Object { $_.Name -eq $federatedCredentialName }
if (-not $existingFedCred) {
Write-Host "Creating federated credential for Azure DevOps..." -ForegroundColor Cyan
# Subject identifier for Azure DevOps workload identity federation
# Format: sc://<ado-org>/<project>/<service-connection-name>
# The user supplies org/project interactively; parameters could also be used.
$adoOrg = Read-Host "Enter your Azure DevOps organization name (e.g. 'contoso')"
$adoProject = Read-Host "Enter your Azure DevOps project name (e.g. 'ASTRAL')"
# The issuer must include the Azure DevOps organization GUID, not the name.
# It is shown under Organization settings > Overview and in the manual
# service connection dialog in Azure DevOps.
$adoOrgId = Read-Host "Enter your Azure DevOps organization ID (GUID)"
$subject = "sc://$adoOrg/$adoProject/$ServiceConnectionName"
$params = @{
Name = $federatedCredentialName
Issuer = "https://vstoken.dev.azure.com/$adoOrgId"
Subject = $subject
Audiences = @("api://AzureADTokenExchange")
}
New-MgApplicationFederatedIdentityCredential -ApplicationId $app.Id -BodyParameter $params | Out-Null
Write-Host "Federated credential created. Subject: $subject" -ForegroundColor Green
}
else {
Write-Host "Federated credential already exists." -ForegroundColor Yellow
}
Write-Host ""
Write-Host "=== Bootstrap complete ===" -ForegroundColor Green
Write-Host "Tenant Name: $TenantName"
Write-Host "Tenant ID: $($tenant.Id)"
Write-Host "App Display Name: $AppDisplayName"
Write-Host "App ID: $($app.AppId)"
Write-Host "Service Connection: $ServiceConnectionName"
Write-Host ""
Write-Host "Next steps:" -ForegroundColor Cyan
Write-Host "1. In Azure DevOps, create a Workload Identity Federation service connection."
Write-Host " - Tenant ID: $($tenant.Id)"
Write-Host " - App ID: $($app.AppId)"
Write-Host " - Name: $ServiceConnectionName"
Write-Host ""
if ($AdoOrganizationUrl) {
# The service endpoint API is project-scoped, and guessing the project from the
# organization URL is unreliable, so ask explicitly.
$project = Read-Host "Enter your Azure DevOps project name"
$pat = Read-Host "Enter an Azure DevOps PAT with 'Service Connections: Read & manage' scope (input is hidden)" -AsSecureString
# Expose the PAT through an environment variable instead of echoing it to the console.
$env:ADO_PAT = [System.Net.NetworkCredential]::new("", $pat).Password
Write-Host ""
Write-Host "You can create the service connection via REST API using:"
Write-Host " curl -u :`$env:ADO_PAT -X POST -H 'Content-Type: application/json' "
Write-Host " -d '{ ... }' "
Write-Host " '$AdoOrganizationUrl/$project/_apis/serviceendpoint/endpoints?api-version=7.1'"
}
Disconnect-MgGraph | Out-Null

View File

@@ -0,0 +1,150 @@
# ASTRAL Onboarding Runbook
This guide walks through deploying ASTRAL into a new Azure DevOps organization and Microsoft 365 tenant.
## Prerequisites
- Azure DevOps organization and project created.
- Administrative access to the target Microsoft 365 tenant.
- Permission to create app registrations and grant admin consent in Entra ID.
- PowerShell 7+ or Windows PowerShell 5.1 with the `Microsoft.Graph` module (for the bootstrap script).
## Step 1: Import the repository
1. In Azure DevOps, create a new Git repository in your project.
2. Push the contents of this repository into it, or use **Import repository** from a public Git URL.
## Step 2: Create the tenant variable group
1. In Azure DevOps, go to **Pipelines > Library** and create a new Variable Group.
2. Recommended name: `vg-astral-tenant` (you can choose any name).
3. Add the variables from `templates/variables-tenant.yml`. Use your real tenant values:
| Variable | Example value | Notes |
| --- | --- | --- |
| `TENANT_NAME` | `contoso.onmicrosoft.com` | Your M365 tenant domain |
| `SERVICE_CONNECTION_NAME` | `sc-astral-backup` | Name you will use for the service connection |
| `USER_NAME` | `ASTRAL Backup Service` | Git committer name |
| `USER_EMAIL` | `astral-backup@contoso.com` | Git committer email |
| `AGENT_POOL_NAME` | `Azure Pipelines` | Change if using a self-hosted pool |
| `BACKUP_TIMEZONE` | `Europe/Prague` | Valid tz database name |
| `FULL_RUN_HOUR` | `00` | Hour that triggers full export |
| `AUTO_REMEDIATE_RESTORE_PIPELINE_ID` | *(leave empty)* | Filled in Step 8 |
4. If you plan to use Azure OpenAI summaries, also add:
- `ENABLE_PR_AI_SUMMARY` = `true`
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_DEPLOYMENT`
- `AZURE_OPENAI_API_KEY` *(mark as secret)*
## Step 3: Link the variable group to the pipelines
Open each pipeline YAML and uncomment the variable group line near the top:
```yaml
variables:
- group: vg-astral-tenant # <-- uncomment this line
- template: templates/variables-common.yml
```
Do this for:
- `azure-pipelines.yml`
- `azure-pipelines-review-sync.yml`
- `azure-pipelines-restore.yml`
Commit and push the changes.
## Step 4: Run the tenant bootstrap script
Run `deploy/bootstrap-tenant.ps1` in a PowerShell session authenticated to your target tenant.
```powershell
# Example
.\deploy\bootstrap-tenant.ps1 -TenantName "contoso.onmicrosoft.com" -ServiceConnectionName "sc-astral-backup"
```
The script will:
1. Create a single-tenant app registration.
2. Add required Microsoft Graph application permissions.
3. Grant admin consent.
4. Create a workload federated credential for Azure DevOps.
5. Print the App ID and instructions for creating the Azure DevOps service connection.
## Step 5: Create the Azure DevOps service connection
1. In Azure DevOps, go to **Project settings > Service connections**.
2. Click **New service connection > Azure Resource Manager > Workload identity federation (manual)**.
3. Fill in:
- **Subscription**: leave blank or select if you also want ARM access (not required).
- **Tenant ID**: your Microsoft 365 tenant ID.
- **Service Connection Name**: the same value you set in `SERVICE_CONNECTION_NAME` (e.g. `sc-astral-backup`).
- **App ID**: from the bootstrap script output.
4. Save the service connection.
## Step 6: Import the pipelines
1. Go to **Pipelines > Create pipeline > Azure Repos Git**.
2. Select your repository.
3. Choose **Existing Azure Pipelines YAML file**.
4. Import each of the three YAMLs one by one:
- `azure-pipelines.yml` (main backup)
- `azure-pipelines-review-sync.yml` (review sync)
- `azure-pipelines-restore.yml` (restore)
## Step 7: Grant repository permissions to the build identity
1. Go to **Project settings > Repositories**.
2. Select your repository.
3. Under **Security**, grant the **Build Service** account:
- Contribute
- Create branch
- Force push
- Create pull request
- Edit pull request
- Tag creation (if you enable tagging)
4. Under **Pipelines**, grant the build service **Queue builds** permission on `azure-pipelines-restore.yml` if you plan to use auto-remediation.
## Step 8: Set the restore pipeline definition ID
After importing `azure-pipelines-restore.yml`, find its definition ID:
1. Open the restore pipeline in Azure DevOps.
2. The URL contains `definitionId=XX`. Note the number.
3. Go back to your variable group (`vg-astral-tenant`) and set:
- `AUTO_REMEDIATE_RESTORE_PIPELINE_ID` = `XX`
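If you prefer to look up the ID programmatically, the following is a minimal sketch against the Build Definitions REST API (the function name and example values are illustrative, not part of this repository):

```python
import base64
import json
import urllib.parse
import urllib.request


def find_definition_id(org_url: str, project: str, pipeline_name: str, pat: str) -> int:
    """Return the definition ID of a pipeline, looked up by name."""
    url = (
        f"{org_url.rstrip('/')}/{urllib.parse.quote(project)}"
        f"/_apis/build/definitions?name={urllib.parse.quote(pipeline_name)}&api-version=7.1"
    )
    auth = base64.b64encode(f":{pat}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {auth}"})
    with urllib.request.urlopen(req) as resp:
        matches = json.load(resp)["value"]
    if not matches:
        raise LookupError(f"pipeline not found: {pipeline_name}")
    return matches[0]["id"]


# Example with hypothetical values:
# find_definition_id("https://dev.azure.com/contoso", "ASTRAL", "azure-pipelines-restore", "<PAT>")
```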
## Step 9: Validate the deployment
1. Import `deploy/validate-deployment.yml` as a one-time pipeline.
2. Run it.
3. Verify that all checks pass:
- Graph token acquisition
- Required roles present
- Test read from Graph
- Test PR creation and abandonment
## Step 10: Run the first backup
1. Queue a manual run of `azure-pipelines.yml`.
2. Set `forceFullRun=true` to get a complete initial snapshot.
3. Verify that `tenant-state/` is populated and a rolling PR is created.
## Optional: progressive feature rollout
| Phase | What to enable |
| --- | --- |
| Backup-only | `ENABLE_PR_REVIEW_SUMMARY=false`, `ENABLE_PR_REVIEWER_DECISIONS=false`, `AUTO_REMEDIATE_AFTER_MERGE=false` |
| Review package | `ENABLE_PR_REVIEW_SUMMARY=true`, `ENABLE_PR_REVIEWER_DECISIONS=true` |
| Full package | Also enable restore and set `AUTO_REMEDIATE_AFTER_MERGE=true` if desired |
| AI summaries | `ENABLE_PR_AI_SUMMARY=true` plus Azure OpenAI variables |
## Troubleshooting
| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Pipeline fails at "Get Graph Token" | Wrong service connection name or missing federated credential | Verify `SERVICE_CONNECTION_NAME` matches the service connection exactly |
| "Missing required Graph roles" | Admin consent not granted | Run bootstrap script again or grant consent manually in Entra ID |
| Rolling PR not created | Build identity lacks PR permissions | Add **Create pull request** and **Edit pull request** permissions |
| Restore pipeline queue fails | `AUTO_REMEDIATE_RESTORE_PIPELINE_ID` wrong or missing queue permission | Verify the ID and grant **Queue builds** on the restore pipeline |
| Empty `tenant-state/` after run | First run may have no data if Graph returns nothing; also check `BACKUP_FOLDER` path | Verify Graph permissions and re-run |

118
deploy/publish-public.yml Normal file
View File

@@ -0,0 +1,118 @@
trigger: none
pr: none

# Publisher pipeline: pushes a sanitized snapshot of the dev repo to the public template repo.
#
# Usage:
#   Queue this pipeline manually and optionally provide a tag name (e.g. v1.1.0).
#
# Prerequisites:
#   - PUBLIC_REPO_URL (pipeline variable)
#   - PUBLIC_REPO_USERNAME (pipeline variable)
#   - PUBLIC_REPO_PAT (secret pipeline variable)

parameters:
  - name: tagName
    displayName: Optional release tag (e.g. v1.1.0)
    type: string
    default: ""

variables:
  - template: ../templates/variables-common.yml

jobs:
  - job: publish_public_template
    displayName: Publish sanitized snapshot to public repo
    pool:
      name: $(AGENT_POOL_NAME)
    steps:
      - checkout: self
        persistCredentials: true
      - task: Bash@3
        displayName: Run sync-to-public
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            chmod +x "$(Build.SourcesDirectory)/deploy/sync-to-public.sh"
            TMP_DIR="$(mktemp -d)"
            trap 'rm -rf "$TMP_DIR"' EXIT
            # git does not read GIT_USERNAME/GIT_PASSWORD on its own; route them
            # through a temporary askpass helper so fetch and push can authenticate.
            ASKPASS="$TMP_DIR/askpass.sh"
            printf '%s\n' '#!/bin/sh' 'case "$1" in Username*) echo "$GIT_USERNAME" ;; *) echo "$GIT_PASSWORD" ;; esac' > "$ASKPASS"
            chmod +x "$ASKPASS"
            export GIT_ASKPASS="$ASKPASS"
            # Run the sync script; it clones the public repo into a temp subdir
            "$(Build.SourcesDirectory)/deploy/sync-to-public.sh" \
              "$(PUBLIC_REPO_URL)" \
              "${{ parameters.tagName }}"
            # Rather than parsing the sync script's clone path from its output,
            # rebuild the sanitized tree in a temp dir we control.
            PUBLIC_CLONE="$TMP_DIR/public"
            mkdir -p "$PUBLIC_CLONE"
            cd "$(Build.SourcesDirectory)"
            rsync -a \
              --exclude='.git' \
              --exclude='tenant-state' \
              --exclude='prod-as-built.md' \
              --exclude='node_modules' \
              --exclude='__pycache__' \
              --exclude='.DS_Store' \
              --exclude='deploy/sync-to-public.sh' \
              --exclude='deploy/publish-public.yml' \
              "$(Build.SourcesDirectory)/" "$PUBLIC_CLONE/"
            cd "$PUBLIC_CLONE"
            # Re-create empty tenant-state structure
            mkdir -p tenant-state/intune tenant-state/entra tenant-state/reports/intune tenant-state/reports/entra
            touch tenant-state/intune/.gitkeep tenant-state/entra/.gitkeep tenant-state/reports/intune/.gitkeep tenant-state/reports/entra/.gitkeep
            cat > tenant-state/README.md <<'EOF'
            # tenant-state
            This directory is populated automatically by the ASTRAL pipeline.
            Do not place manual files here; they will be overwritten on the next export.
            EOF
            git init
            git remote add origin "$(PUBLIC_REPO_URL)" 2>/dev/null || git remote set-url origin "$(PUBLIC_REPO_URL)"
            git config user.email "astral-publish@local"
            git config user.name "ASTRAL Publisher"
            # Fetch existing public main so we can diff against it
            git fetch origin main || true
            # Stage everything
            git add -A
            if git diff --cached --quiet; then
              echo "No changes to publish."
              exit 0
            fi
            DEV_SHA="$(git -C '$(Build.SourcesDirectory)' rev-parse --short HEAD)"
            DEV_BRANCH="$(git -C '$(Build.SourcesDirectory)' rev-parse --abbrev-ref HEAD)"
            git commit -m "Sync from dev @ ${DEV_SHA}

            Source: ${DEV_BRANCH} (${DEV_SHA})
            Excluded: live tenant exports, generated artifacts, and dev-only tooling."
            if [ -n "${{ parameters.tagName }}" ]; then
              git tag -a "${{ parameters.tagName }}" -m "Release ${{ parameters.tagName }}"
            fi
            # Push commit (and tag if provided)
            git push origin HEAD:main --force
            if [ -n "${{ parameters.tagName }}" ]; then
              git push origin "${{ parameters.tagName }}"
            fi
            echo "Publication complete."
            if [ -n "${{ parameters.tagName }}" ]; then
              echo "Tag: ${{ parameters.tagName }}"
            fi
        env:
          GIT_USERNAME: $(PUBLIC_REPO_USERNAME)
          GIT_PASSWORD: $(PUBLIC_REPO_PAT)

View File

@@ -0,0 +1,120 @@
trigger: none
pr: none

# One-time validation pipeline for ASTRAL onboarding.
# Import this pipeline, run it manually, and verify all checks pass.

variables:
  # Uncomment after creating your tenant variable group.
  # - group: vg-astral-tenant
  - template: ../templates/variables-common.yml

jobs:
  - job: validate_environment
    displayName: Validate ASTRAL deployment
    pool:
      name: $(AGENT_POOL_NAME)
    steps:
      - checkout: self
        persistCredentials: true
      - task: AzurePowerShell@5
        displayName: Validate Graph token acquisition
        inputs:
          azureSubscription: $(SERVICE_CONNECTION_NAME)
          azurePowerShellVersion: LatestVersion
          ScriptType: inlineScript
          Inline: |
            $getTokenParams = @{
              ResourceTypeName = 'MSGraph'
              AsSecureString = $true
              ErrorAction = 'Stop'
            }
            $tokenCommand = Get-Command Get-AzAccessToken -ErrorAction Stop
            if ($tokenCommand.Parameters.ContainsKey('ForceRefresh')) {
              $getTokenParams['ForceRefresh'] = $true
            }
            $accessToken = ([PSCredential]::New('dummy', (Get-AzAccessToken @getTokenParams).Token).GetNetworkCredential().Password)
            $tokenParts = $accessToken.Split('.')
            if ($tokenParts.Length -lt 2) { throw "Invalid Graph access token format." }
            $payload = $tokenParts[1].Replace('-', '+').Replace('_', '/')
            switch ($payload.Length % 4) {
              2 { $payload += '==' }
              3 { $payload += '=' }
            }
            $payloadJson = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($payload))
            $claims = $payloadJson | ConvertFrom-Json
            $roles = @($claims.roles)
            $sortedRoles = $roles | Sort-Object
            Write-Host "Graph token roles: $($sortedRoles -join ', ')"
            $requiredReadRoles = @(
              "Device.Read.All",
              "DeviceManagementApps.Read.All",
              "DeviceManagementConfiguration.Read.All",
              "DeviceManagementManagedDevices.Read.All",
              "DeviceManagementRBAC.Read.All",
              "DeviceManagementScripts.Read.All",
              "DeviceManagementServiceConfig.Read.All",
              "Group.Read.All",
              "Policy.Read.All",
              "Policy.Read.ConditionalAccess",
              "Policy.Read.DeviceConfiguration",
              "User.Read.All",
              "Application.Read.All"
            )
            $missing = $requiredReadRoles | Where-Object { $_ -notin $sortedRoles }
            if ($missing) {
              throw "Missing required Graph roles: $($missing -join ', ')"
            }
            Write-Host "All required read roles are present." -ForegroundColor Green
            # Export token for subsequent steps
            Write-Host "##vso[task.setvariable variable=GRAPH_TOKEN;issecret=true]$accessToken"
      - task: Bash@3
        displayName: Validate Graph read access
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            TOKEN="$(GRAPH_TOKEN)"
            # Single quotes keep bash from expanding the $select OData parameter.
            URL='https://graph.microsoft.com/v1.0/organization?$select=id,displayName'
            RESPONSE=$(curl -sf -H "Authorization: Bearer $TOKEN" "$URL")
            echo "Graph read test response: $RESPONSE"
            echo "Graph connectivity confirmed."
      - task: Bash@3
        displayName: Validate PR creation permission
        inputs:
          targetType: inline
          script: |
            set -euo pipefail
            TOKEN="$(System.AccessToken)"
            COLLECTION_URI="$(System.CollectionUri)"
            PROJECT="$(System.TeamProject)"
            REPO_ID="$(Build.Repository.ID)"
            # A PR cannot use the same ref as both source and target, so point a
            # throwaway branch at the current main commit first.
            ZERO_SHA="0000000000000000000000000000000000000000"
            MAIN_SHA="$(git rev-parse HEAD)"
            TEST_REF="refs/heads/astral-validation-test"
            REFS_API="${COLLECTION_URI%/}/${PROJECT}/_apis/git/repositories/${REPO_ID}/refs?api-version=7.1"
            curl -sf -u ":$TOKEN" -H "Content-Type: application/json" \
              -d "[{\"name\":\"${TEST_REF}\",\"oldObjectId\":\"${ZERO_SHA}\",\"newObjectId\":\"${MAIN_SHA}\"}]" \
              "$REFS_API" > /dev/null
            API="${COLLECTION_URI%/}/${PROJECT}/_apis/git/repositories/${REPO_ID}/pullrequests?api-version=7.1"
            BODY=$(cat <<EOF
            {
              "sourceRefName": "${TEST_REF}",
              "targetRefName": "refs/heads/main",
              "title": "ASTRAL validation test PR",
              "description": "This is a temporary PR created by the validate-deployment pipeline. It will be abandoned immediately.",
              "isDraft": true
            }
            EOF
            )
            echo "Creating test PR..."
            PR_RESPONSE=$(curl -sf -u ":$TOKEN" -H "Content-Type: application/json" -d "$BODY" "$API")
            PR_ID=$(echo "$PR_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['pullRequestId'])")
            echo "Created test PR #$PR_ID"
            ABANDON_API="${COLLECTION_URI%/}/${PROJECT}/_apis/git/repositories/${REPO_ID}/pullrequests/${PR_ID}?api-version=7.1"
            echo "Abandoning test PR #$PR_ID..."
            curl -sf -u ":$TOKEN" -H "Content-Type: application/json" -X PATCH -d '{"status":"abandoned"}' "$ABANDON_API"
            # Clean up the throwaway branch.
            curl -sf -u ":$TOKEN" -H "Content-Type: application/json" \
              -d "[{\"name\":\"${TEST_REF}\",\"oldObjectId\":\"${MAIN_SHA}\",\"newObjectId\":\"${ZERO_SHA}\"}]" \
              "$REFS_API" > /dev/null
            echo "PR creation and abandonment successful."

View File

@@ -0,0 +1,59 @@
<svg width="1200" height="320" viewBox="0 0 1200 320" fill="none" xmlns="http://www.w3.org/2000/svg" role="img" aria-labelledby="title desc">
<title id="title">ASTRAL logo</title>
<desc id="desc">ASTRAL wordmark with celestial shield, orbit, and star symbol.</desc>
<defs>
<linearGradient id="shieldGradient" x1="34" y1="28" x2="192" y2="230" gradientUnits="userSpaceOnUse">
<stop stop-color="#071A2E"/>
<stop offset="0.52" stop-color="#123B68"/>
<stop offset="1" stop-color="#0D6E8E"/>
</linearGradient>
<radialGradient id="coreGlow" cx="0" cy="0" r="1" gradientUnits="userSpaceOnUse" gradientTransform="translate(112 110) rotate(90) scale(88)">
<stop stop-color="#5EEAD4" stop-opacity="0.95"/>
<stop offset="0.45" stop-color="#38BDF8" stop-opacity="0.45"/>
<stop offset="1" stop-color="#0EA5E9" stop-opacity="0"/>
</radialGradient>
<linearGradient id="starGradient" x1="88" y1="82" x2="138" y2="154" gradientUnits="userSpaceOnUse">
<stop stop-color="#FFF3B0"/>
<stop offset="0.55" stop-color="#FFD166"/>
<stop offset="1" stop-color="#F59E0B"/>
</linearGradient>
<linearGradient id="orbitGradient" x1="62" y1="76" x2="160" y2="160" gradientUnits="userSpaceOnUse">
<stop stop-color="#BFF6FF" stop-opacity="0.95"/>
<stop offset="1" stop-color="#67E8F9" stop-opacity="0.35"/>
</linearGradient>
<linearGradient id="crescentGradient" x1="78" y1="66" x2="120" y2="106" gradientUnits="userSpaceOnUse">
<stop stop-color="#D9F99D"/>
<stop offset="1" stop-color="#7DD3FC"/>
</linearGradient>
<filter id="softShadow" x="0" y="0" width="260" height="260" filterUnits="userSpaceOnUse" color-interpolation-filters="sRGB">
<feDropShadow dx="0" dy="10" stdDeviation="10" flood-color="#102A43" flood-opacity="0.16"/>
</filter>
</defs>
<g filter="url(#softShadow)">
<path d="M112 22L188 52V120C188 173 155 221 112 244C69 221 36 173 36 120V52L112 22Z" fill="url(#shieldGradient)"/>
<path d="M112 46L168 68V120C168 162 145 199 112 219C79 199 56 162 56 120V68L112 46Z" fill="#F8FBFF" fill-opacity="0.08" stroke="#C9F3FF" stroke-opacity="0.42" stroke-width="2"/>
<circle cx="112" cy="112" r="72" fill="url(#coreGlow)"/>
<path d="M94 74C83 81 77 94 77 108C77 124 86 138 100 145C92 137 88 126 88 115C88 99 96 84 109 75C104 74 99 73 94 74Z" fill="url(#crescentGradient)" fill-opacity="0.9"/>
<ellipse cx="112" cy="112" rx="58" ry="34" transform="rotate(-18 112 112)" stroke="url(#orbitGradient)" stroke-width="3"/>
<circle cx="161" cy="95" r="4.5" fill="#CFFAFE"/>
<path d="M112 72L120.5 95.5L145.5 96.5L125.5 111.5L133 135L112 121L91 135L98.5 111.5L78.5 96.5L103.5 95.5L112 72Z" fill="url(#starGradient)"/>
<circle cx="79" cy="82" r="2" fill="#E0F2FE"/>
<circle cx="145" cy="69" r="1.7" fill="#FDE68A"/>
<circle cx="149" cy="144" r="1.7" fill="#BAE6FD"/>
<circle cx="69" cy="140" r="1.5" fill="#F0FDFA"/>
<path d="M149 56L151.5 62L158 64.5L151.5 67L149 73L146.5 67L140 64.5L146.5 62L149 56Z" fill="#F8FAFC" fill-opacity="0.92"/>
<circle cx="112" cy="112" r="44" stroke="#BEE3F8" stroke-opacity="0.28" stroke-width="1.5" stroke-dasharray="4 6"/>
</g>
<g transform="translate(250 62)">
<text x="0" y="78" fill="#102A43" font-family="Avenir Next, Montserrat, Inter, Arial, sans-serif" font-size="92" font-weight="800" letter-spacing="10">ASTRAL</text>
<text x="2" y="124" fill="#0B6E8A" font-family="Avenir Next, Montserrat, Inter, Arial, sans-serif" font-size="22" font-weight="600" letter-spacing="2.4">
Admin Security Through Review, Automation &amp; Least-privilege
</text>
<line x1="0" y1="146" x2="756" y2="146" stroke="#D9E2EC" stroke-width="2"/>
<text x="2" y="186" fill="#486581" font-family="Avenir Next, Montserrat, Inter, Arial, sans-serif" font-size="26" font-weight="500" letter-spacing="0.6">
Configuration drift review and remediation for Intune and Entra
</text>
</g>
</svg>


View File

@@ -0,0 +1,143 @@
# M365 Baseline Expansion Roadmap
This document tracks the repository from its current implemented state rather than from the original proposal.
## Current State
The repository already operates as a two-workload baseline system:
- Intune drift backup via IntuneCD
- Entra drift backup for Named Locations, Authentication Strengths, Conditional Access, App Registrations, and Enterprise Applications
The surrounding control loop is also implemented:
- hourly backup pipeline plus midnight Prague full run
- rolling PR per workload
- deterministic reviewer summary with optional Azure OpenAI narrative
- optional per-file change-ticket threads
- reviewer `/reject` and `/accept` decision sync every 20 minutes
- auto-remediation for rejected drift snapshots
- post-merge restore queue for partial-accept / partial-reject review flows
- selective historical restore from branch, tag, or commit
- output validation and drift-noise filtering before commit
## What Is Stable Today
Stable and part of the normal operating model:
- Intune export and reporting
- Entra Named Locations export
- Entra Authentication Strengths export
- Entra Conditional Access export with reference-name enrichment
- object inventory and assignment reporting for both workloads
- Entra app inventory reporting
- drift-branch commit and rolling PR update workflow
Implemented, but intentionally constrained:
- App Registrations export is full-run only
- Enterprise Applications export is full-run only
- light runs preserve the previous committed snapshot for those two Entra categories
Not yet implemented:
- Directory role templates and active directory roles
- Exchange Online, Teams, SharePoint, Purview, and Azure governance modules
## Current Gaps And Stabilization Backlog
1. Fix App Registrations light-run stability.
The current pipeline still disables hourly App Registrations export because some runs produce resolver-only churn. Re-enabling hourly export requires a deterministic light-run result.
2. Keep Enterprise Applications scoped as a heavy module unless runtime proves otherwise.
Enterprise Applications are already exported, but only on full runs. The current design assumes this category should remain bounded to the daily/full path unless runtime and diff quality support widening scope.
3. Add the next identity-baseline module only after Phase 1 is fully stable.
Directory roles are the next logical addition, but they should follow the same pattern: deterministic export, report generation, validation, reviewer-noise filtering, and tests.
## Design Rules For New Modules
Every expansion module should follow the conventions already used by Intune and Entra:
1. Store raw JSON under `tenant-state/<workload-or-module>/`.
2. Store human-review reports under `tenant-state/reports/<workload-or-module>/`.
3. Keep one object per file with deterministic naming and stable key ordering where possible (see the sketch after this list).
4. Validate expected outputs before drift commit.
5. Filter known non-config churn before PR creation.
6. Update permissions, README, and tests in the same change.
7. Start as daily/full-run scope unless there is evidence it is safe and cheap to run hourly.
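As a concrete sketch of rules 1 and 3, assuming a hypothetical helper (this is not the repository's actual exporter, and the slug convention is illustrative):

```python
import json
from pathlib import Path


def write_object(repo_root: Path, module: str, obj: dict) -> Path:
    """Write one exported object per file with deterministic naming and key order."""
    # Illustrative naming: displayName slug plus object id suffix.
    name = obj.get("displayName", "unnamed")
    slug = "".join(c if c.isalnum() else "-" for c in name).strip("-").lower()
    path = repo_root / "tenant-state" / module / f"{slug}-{obj['id']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys plus a fixed indent keeps diffs stable across repeated exports.
    path.write_text(json.dumps(obj, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path
```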
## Roadmap By Phase
### Phase 1: Identity Baseline
Completed:
- Entra Named Locations
- Entra Authentication Strengths
- Entra Conditional Access
- App Registrations exporter
- Enterprise Applications exporter
Remaining:
- stabilize App Registrations for light runs
- decide whether Enterprise Applications should remain full-run only
- add Directory Roles / Directory Role Templates
### Phase 2: Service Policy Baseline
Candidate modules:
- Exchange Online transport and mail flow rules
- Defender for Office policy configuration
- Teams policy families
- SharePoint and OneDrive tenant-level sharing controls
### Phase 3: Governance Baseline
Candidate modules:
- Purview DLP policies
- Purview retention labels and policies
- Azure policy assignments and initiatives
- Azure RBAC role assignments
## Proposed Future Repository Shape
Keep the existing lowercase workload structure and extend it consistently:
```text
tenant-state/
  intune/
  entra/
  exchange/
  teams/
  sharepoint/
  purview/
  azure-governance/
  reports/
    intune/
    entra/
    exchange/
    teams/
    sharepoint/
    purview/
    azure-governance/
```
## Recommended Execution Order
1. Finish Phase 1 stabilization.
2. Add Directory Roles as the next identity module.
3. Add one Phase 2 or Phase 3 module at a time.
4. Require several stable daily cycles before widening scope or adding the next module.
5. Promote a module from full-run only to hourly only after unchanged tenants produce clean, low-noise diffs.
## Acceptance Criteria
- No regression in current Intune or Entra backup success rate.
- Unchanged environments produce deterministic outputs across repeated runs.
- Reviewer PRs stay focused on configuration-effective drift, not enrichment noise.
- New modules document exact permissions and expected outputs.
- Restore and review workflows remain coherent as scope expands.

View File

@@ -0,0 +1,39 @@
# Security Review Email Draft
## Subject
Security review package for ASTRAL
## Email Body
Hello,
As discussed, I am sending the security review package for ASTRAL.
ASTRAL stands for Admin Security Through Review, Automation & Least-privilege.
Attached are:
- `security-review-package.pdf` - product security overview, architecture, deployment modes, permissions, data flows, and key security considerations
- `security-review-questionnaire.pdf` - short-form questionnaire answers for easier circulation within your security review process
A few points to highlight up front:
- the platform supports multiple deployment modes, from backup-only through full review and remediation workflows
- AI-assisted review summaries are optional and can be enabled or disabled independently of the backup and restore functions
- when AI is enabled, the intended model is a customer-controlled Azure OpenAI deployment rather than an unrelated public AI service
- the AI summary feature is advisory and is intended to help non-technical reviewers such as PMs or management understand technical Intune and Entra changes in plain language
The source repository is private because it contains operational implementation details and tenant-specific configuration material. If your review requires deeper technical evidence, we can provide a controlled walkthrough of the implementation, configuration, and pipeline behavior.
If useful, I can also provide:
- a live architecture walkthrough
- a permission-by-permission review of the Microsoft Graph access model
- a demonstration of deployment modes and AI-assisted review summaries
Please let me know if your team would like any additional material in a different format.
Best regards,
[Your Name]

View File

@@ -0,0 +1,402 @@
<img src="./assets/astral-logo.svg" alt="ASTRAL logo" width="760" />
# ASTRAL Security Review Package
Prepared: 2026-03-27
## Purpose
This document describes the security posture of ASTRAL, an Intune / Entra drift backup, review, and remediation platform implemented in this repository.
ASTRAL stands for:
- Admin Security Through Review, Automation & Least-privilege
The goal of the platform is to:
- export Microsoft Intune and selected Entra ID configuration from a production tenant,
- store approved configuration snapshots in Git,
- surface drift through rolling pull requests,
- optionally restore tenant configuration back to the approved baseline.
This package is intended for customer security review of the full product and its available deployment modes.
## Executive Summary
ASTRAL is an Azure DevOps pipeline-based administrative workflow, not a customer-facing application and not an endpoint agent.
Key characteristics:
- No inbound listener or public application endpoint is exposed by this repository.
- The normal operating mode is outbound-only scheduled jobs from Azure DevOps to Microsoft Graph and Azure DevOps APIs.
- The default backup/review path is read-oriented against Microsoft Graph.
- A separate restore path can write configuration back to the tenant, but only through the dedicated restore pipeline and only when enabled and authorized.
- AI-assisted PR summaries are optional and are not required for backup, review, or restore.
## Deployment Modes
The repository can be deployed progressively. It does not need to be introduced as an all-or-nothing package.
| Mode | Scope | Graph Access Profile | Azure DevOps Scope | AI |
| --- | --- | --- | --- | --- |
| Backup-only | Export tenant configuration, generate reports, retain Git-tracked snapshots | Read-only | Repository and scheduled pipeline only | Disabled |
| Review package | Backup-only plus rolling PR review, reviewer summaries, optional change-ticket threads, reviewer `/accept` and `/reject` processing | Read-only | Repository, PR workflows, review-sync pipeline | Optional |
| Full package | Review package plus restore pipeline, rollback support, selective remediation, and optional auto-remediation | Read + Write for restore path only | Repository, PR workflows, review-sync, restore pipeline | Optional |
Important clarifications:
- AI is an add-on, not a core dependency.
- Restore is a separate capability, not a requirement for backup or review.
- Organizations can adopt the platform progressively, starting with backup-only and adding review or restore capabilities later.
- AI can be enabled or disabled independently of the backup, review, and restore layers.
## System Overview
### In-Scope Components
| Component | Function | Security Relevance |
| --- | --- | --- |
| Azure DevOps pipeline `azure-pipelines.yml` | Scheduled backup, drift commit, rolling PR management, documentation artifact publishing | Main execution path |
| Azure DevOps pipeline `azure-pipelines-review-sync.yml` | Processes reviewer `/reject` and `/accept` decisions and refreshes PR summaries | Uses Azure DevOps API token |
| Azure DevOps pipeline `azure-pipelines-restore.yml` | Restores approved baseline to tenant | Write-capable path |
| Azure DevOps Git repository | Stores approved baseline, drift branches, JSON exports, reports, docs | Primary configuration store |
| Microsoft Graph | Source of Intune and Entra configuration; optional target for restore | Production tenant access |
| Azure DevOps REST APIs | PR creation/update, review thread sync, restore queueing | Change-management control plane |
| Optional Azure OpenAI | PR summary generation only | Optional data egress path |
### High-Level Flow
```mermaid
flowchart LR
A["Azure DevOps scheduled pipeline"] --> B["Federated service connection"]
B --> C["Microsoft Graph"]
A --> D["Git repo: main + drift branches"]
A --> E["Azure DevOps PR and thread APIs"]
A --> F["Build artifacts: markdown / HTML / PDF"]
A -. optional .-> G["Azure OpenAI"]
H["Reviewer in Azure DevOps"] --> E
E --> I["Rolling PR approval / rejection"]
I -. optional remediation .-> J["Restore pipeline"]
J --> C
```
## Deployment Model
### Backup and Review
The main pipeline runs hourly on `main`.
- Every hour: export Intune and Entra configuration, generate reports, commit drift to rolling workload branches, and update one rolling PR per workload.
- When delayed reviewer notifications are enabled, newly created rolling PRs are opened as Azure DevOps draft PRs, the automated summary is inserted, and the PR is then published for reviewer notification.
- At the configured full-run hour: perform the same work plus documentation artifact generation (Markdown, and optionally HTML/PDF if browser dependencies are available).
The workload branches are:
- `drift/intune`
- `drift/entra`
Reviewers approve or reject drift through Azure DevOps pull requests. The system is intentionally ex-post change management: admins may make changes in the Microsoft admin portals, and this system detects, records, and routes those changes for review.
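The create-or-update pattern for rolling PRs can be sketched as follows (illustrative only: the title and draft behavior here are assumptions, and the pipeline's real logic also refreshes descriptions and handles notification timing):

```python
import base64
import json
import urllib.parse
import urllib.request


def get_or_create_rolling_pr(collection_uri: str, project: str, repo_id: str, token: str, workload: str) -> int:
    """Find the active rolling PR for a drift branch, or open a new draft one."""
    base = f"{collection_uri.rstrip('/')}/{urllib.parse.quote(project)}/_apis/git/repositories/{repo_id}/pullrequests"
    headers = {
        "Authorization": "Basic " + base64.b64encode(f":{token}".encode()).decode(),
        "Content-Type": "application/json",
    }
    source = f"refs/heads/drift/{workload}"
    query = urllib.parse.urlencode({
        "searchCriteria.sourceRefName": source,
        "searchCriteria.status": "active",
        "api-version": "7.1",
    })
    with urllib.request.urlopen(urllib.request.Request(f"{base}?{query}", headers=headers)) as resp:
        existing = json.load(resp)["value"]
    if existing:
        return existing[0]["pullRequestId"]
    body = json.dumps({
        "sourceRefName": source,
        "targetRefName": "refs/heads/main",
        "title": f"Rolling drift review: {workload}",  # hypothetical title
        "isDraft": True,
    }).encode()
    req = urllib.request.Request(f"{base}?api-version=7.1", data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["pullRequestId"]
```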
### Review Sync
The review-sync pipeline runs every 20 minutes on `main`.
It can:
- refresh automated PR summaries,
- process reviewer `/reject` or `/accept` commands in policy threads,
- optionally queue remediation after merge if rejected items were merged out of the PR scope.
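The decision-parsing rule is "latest command wins". A minimal sketch of that rule (illustrative only; the real parsing lives in `scripts/apply_reviewer_rejections.py` and may differ):

```python
def latest_decision(comments: list[dict]) -> str | None:
    """Return 'accept' or 'reject' from the newest decision command in a thread."""
    decision = None
    # Assumes each comment dict carries an ISO-8601 publishedDate, so string sort works.
    for comment in sorted(comments, key=lambda c: c.get("publishedDate", "")):
        text = (comment.get("content") or "").strip().lower()
        if text.startswith("/accept"):
            decision = "accept"
        elif text.startswith("/reject"):
            decision = "reject"
    return decision
```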
### Restore
The restore pipeline is the only path that writes configuration back to the tenant.
It supports:
- full restore from `main`,
- selective restore of specific policy files,
- restore from a historical Git ref for rollback use cases,
- dry-run mode for report-only validation.
## Data Processed
### Data Categories
| Category | Examples | Source | Stored In |
| --- | --- | --- | --- |
| Intune configuration objects | compliance policies, device configurations, settings catalog, enrollment profiles, apps, scripts, filters, scope tags | Microsoft Graph / IntuneCD export | Git repo under `tenant-state/intune/**` |
| Entra configuration objects | conditional access, named locations, authentication strengths, app registrations, enterprise applications | Microsoft Graph | Git repo under `tenant-state/entra/**` |
| Generated reports | assignment inventories, object inventories, app inventories | Derived from exported configuration | `tenant-state/reports/**` and build artifacts |
| Documentation artifacts | split markdown, optional HTML/PDF | Derived from exported configuration | build artifacts |
| Review metadata | PR descriptions, review threads, accept/reject commands | Azure DevOps reviewers | Azure DevOps PR APIs |
| Optional AI summary payload | sampled changed paths, semantic change descriptions, deterministic summary, fingerprints | Derived from repo diff | Azure OpenAI request payload |
### Data Sensitivity Notes
- The system is designed for administrative configuration data, not end-user business content.
- The repository can still contain sensitive operational material, including policy logic, group names, app identifiers, script bodies, custom configuration payloads, and administrator email addresses present in tenant configuration.
- If tenant-authored scripts or custom payloads contain embedded secrets, those secrets would also be captured. This is a customer governance risk, not something the exporter can reliably prevent.
- For that reason, the repository, drift branches, build logs, and published artifacts should all be treated as confidential administrative data.
- The same sensitivity assumptions apply to any AI summary payload because it is derived from the same administrative configuration changes.
## Authentication and Authorization
### Azure to Microsoft Graph
The pipelines obtain a Microsoft Graph access token at runtime using the Azure DevOps service connection configured in `SERVICE_CONNECTION_NAME` (e.g. `sc-astral-backup`).
Observed controls in the implementation:
- token acquisition is performed at runtime with `Get-AzAccessToken`,
- token role claims are inspected before proceeding,
- the token is stored as a secret pipeline variable (`issecret=true`),
- missing required Graph roles cause early failure.
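A minimal Python equivalent of the role-claim inspection (payload decode only, no signature validation; suitable for diagnostics, not for trust decisions):

```python
import base64
import json


def graph_token_roles(access_token: str) -> list[str]:
    """Decode the JWT payload and return the granted application roles."""
    payload = access_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return sorted(claims.get("roles", []))
```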
### Azure DevOps API Access
The pipelines use `System.AccessToken` for:
- creating and updating rolling PRs,
- reading and updating PR threads,
- queuing the restore pipeline.
The repository permissions documented in the implementation are:
- contribute,
- create branch,
- force push,
- create/update pull requests,
- optional create tag.
If restore auto-queue is enabled, the pipeline identity also needs:
- `View builds`,
- `Queue builds`,
- explicit pipeline authorization when enforced by the project.
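For the restore auto-queue path, the shape of the call is a standard "Run Pipeline" REST request. A hedged sketch follows; the repository's actual implementation is `scripts/queue_post_merge_restore.py` and may differ:

```python
import base64
import json
import urllib.request


def queue_restore_run(collection_uri: str, project: str, pipeline_id: int, token: str) -> int:
    """Queue a pipeline run via the Azure DevOps 'Run Pipeline' REST endpoint."""
    url = f"{collection_uri.rstrip('/')}/{project}/_apis/pipelines/{pipeline_id}/runs?api-version=7.1"
    body = json.dumps({"resources": {"repositories": {"self": {"refName": "refs/heads/main"}}}}).encode()
    auth = base64.b64encode(f":{token}".encode()).decode()
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Basic {auth}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]
```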
### Graph Permissions by Mode
#### Backup / Review Mode
Read-oriented Graph application permissions documented in the repository:
- `Device.Read.All`
- `DeviceManagementApps.Read.All`
- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementManagedDevices.Read.All`
- `DeviceManagementRBAC.Read.All`
- `DeviceManagementScripts.Read.All`
- `DeviceManagementServiceConfig.Read.All`
- `Group.Read.All`
- `Policy.Read.All`
- `Policy.Read.ConditionalAccess`
- `Policy.Read.DeviceConfiguration`
- `User.Read.All`
- `Application.Read.All` for Entra app exports
- `RoleManagement.Read.Directory` or `Directory.Read.All` for richer enrichment
- `AuditLog.Read.All` if commit author attribution is desired
#### Restore Mode
Write-capable Graph application permissions documented in the repository:
- `DeviceManagementApps.ReadWrite.All`
- `DeviceManagementConfiguration.ReadWrite.All`
- `DeviceManagementManagedDevices.ReadWrite.All`
- `DeviceManagementRBAC.ReadWrite.All`
- `DeviceManagementScripts.ReadWrite.All`
- `DeviceManagementServiceConfig.ReadWrite.All`
- `Group.Read.All`
- `Policy.Read.All`
- `Policy.ReadWrite.ConditionalAccess` when Entra updates are included
## Security Controls Present in the Implementation
### Network Exposure
- No inbound application endpoint is created by this repository.
- The system is pipeline-driven and relies on outbound HTTPS calls.
- Required outbound destinations are:
- `graph.microsoft.com`
- Azure DevOps organization APIs
- optional Azure OpenAI endpoint
- Python package registry for `IntuneCD`
- npm registry for `md-to-pdf`
- optional OS package repositories when HTML/PDF conversion needs Chromium libraries
### Secrets Handling
- Graph tokens are obtained just-in-time rather than stored in the repository.
- The pipeline marks the Graph token as a secret variable.
- The implementation logs token claims and roles for diagnostics, but not the token value itself.
- Azure OpenAI uses a pipeline secret variable when enabled.
- The pipeline logic itself does not depend on repository-stored application secrets; separate secret scanning of exported tenant content is still recommended.
### Change Control
- Drift is committed to dedicated rolling branches rather than directly to `main`.
- Review happens through rolling pull requests into `main`.
- The implementation can delay reviewer notification by creating new rolling PRs as drafts until the automated summary block is present, reducing generic first-notification content.
- Optional file-level change tickets can be enforced through auto-created PR threads.
- Reviewers can explicitly accept or reject individual configuration files.
- Generated reports are excluded from drift commits and PR diffs to reduce review noise.
### Safety Checks
- Backup jobs validate expected outputs before committing drift.
- Intune backup logic checks for unauthorized Graph 403 responses and fails unless the failure is explicitly allowed by configuration.
- Entra export logic is configured to fail on requested export errors to avoid partial snapshots.
- Restore validates required write permissions before running.
- Selective restore sanitizes requested paths and rejects path traversal or missing-file conditions.
- Restore supports dry-run mode before any tenant change is applied.
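A sketch of the path-sanitization idea (illustrative; the real checks live in the restore pipeline scripts):

```python
from pathlib import Path


def sanitize_restore_path(repo_root: Path, requested: str) -> Path:
    """Resolve a requested restore path, rejecting traversal and missing files."""
    root = repo_root.resolve()
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes repository root: {requested}")
    if not candidate.is_file():
        raise FileNotFoundError(f"no such baseline file: {requested}")
    return candidate
```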
### Auditability
- Git history retains approved baseline snapshots.
- Rolling PR history provides reviewer decisions and rationale.
- Azure DevOps build history records pipeline runs and restoration events.
- Optional tags can be created for snapshots.
## Optional Azure OpenAI Integration
Azure OpenAI is used only for PR review narrative generation.
Important scoping facts from the implementation:
- the feature is optional and controlled by pipeline variables,
- the core backup/review/restore workflow does not depend on it,
- it can remain disabled in every deployment mode,
- only a reduced, budget-limited change payload is sent,
- the payload contains changed paths, semantic summaries, risk labels, fingerprints, and deterministic summary text,
- it does not need direct Microsoft Graph access,
- it can be disabled with `ENABLE_PR_AI_SUMMARY=false`.
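For illustration only, a payload of that scope could look like the following; the field names are hypothetical, and the real builder lives in `scripts/update_pr_review_summary.py`:

```python
# Hypothetical payload shape; the actual field names are not documented here.
example_payload = {
    "workload": "intune",
    "changed_paths": ["tenant-state/intune/Compliance Policies/windows-baseline.json"],
    "semantic_changes": ["Compliance policy 'windows-baseline': grace period reduced"],
    "risk_labels": ["policy-tightening"],
    "deterministic_summary": "1 modified, 0 added, 0 removed",
    "fingerprints": ["sha256:<policy-file-digest>"],
}
```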
### Intended AI Deployment Posture
The intended security posture for AI is not an opaque third-party black-box service. The implementation is designed to use a customer-controlled Azure OpenAI deployment defined by:
- `AZURE_OPENAI_ENDPOINT`
- `AZURE_OPENAI_DEPLOYMENT`
- `AZURE_OPENAI_API_KEY`
In the intended production design:
- AI requests are sent to the customer's Azure OpenAI resource,
- the model endpoint is explicitly configured by the customer,
- the AI service is a bounded summarization component rather than a system of record,
- Graph access remains with the pipeline and is not delegated to the model.
For formal security documentation, the safest statement is:
- the system is intended to use customer-managed Azure OpenAI infrastructure, typically within the same Azure tenant or controlled Azure environment, rather than an unrelated public AI service.
### AI Security Considerations
From a security perspective, the AI feature changes the system in these specific ways:
- it introduces an additional outbound destination: the configured Azure OpenAI endpoint,
- it sends a derived review payload based on configuration drift rather than raw tenant-wide exports,
- it does not grant the AI service direct credentials to Microsoft Graph or Azure DevOps,
- it is advisory only and does not approve, merge, reject, or restore changes by itself,
- it can be disabled independently of the rest of the platform.
### AI Business Purpose
The AI summaries exist to make technical Intune and Entra drift understandable to non-technical reviewers.
Their intended audience includes:
- project managers,
- delivery leads,
- security managers,
- customer management stakeholders,
- reviewers who own risk acceptance but do not work daily with raw policy JSON.
The purpose is not to replace technical review. The purpose is to provide a manager-readable explanation of:
- what changed,
- why it matters operationally,
- whether the change appears routine, risky, or potentially security-relevant,
- what a reviewer should verify before approval.
This allows management or PM stakeholders to participate meaningfully in review without needing to parse raw technical policy structures.
## Residual Risks and Customer Decisions
The following items are not fully solved by the repository alone and should be addressed in the customer deployment decision:
| Area | Current State | Recommended Position |
| --- | --- | --- |
| Restore capability | Supported by design; can change production tenant state | Keep restore manual only, or disable auto-remediation by default until operational controls are approved |
| Backup vs restore identity separation | Sample config uses the same service connection name in backup and restore pipelines | Use separate service principals: read-only for backup/review, write-enabled only for restore |
| Azure OpenAI egress | Optional and customer-configurable | Enable only when the organization approves the payload scope and Azure OpenAI deployment model |
| Artifact retention | Not defined in repo; inherited from Azure DevOps settings | Set explicit retention for builds, logs, and artifacts |
| Repo access model | Not defined in repo | Restrict repo and artifact access to administrators/reviewers only |
| Build agent hardening | Pool name exists, but agent type and hardening are deployment-specific | Prefer dedicated hardened agent or approved Microsoft-hosted configuration |
| Runtime package download | `pip`, `npm`, and sometimes `apt-get` are used during pipeline runs | Pre-bake dependencies into the agent image if customer forbids runtime internet package fetches |
| Secret content inside exported scripts | Possible if tenant admins embed secrets in Intune scripts or custom payloads | Review tenant script hygiene before onboarding |
## Recommended Deployment Configuration
For a conservative production deployment, use this profile:
1. Enable backup and review workflows.
2. Enable Azure OpenAI summaries only when a customer-controlled Azure OpenAI deployment is approved.
3. Disable automatic remediation queueing.
4. Do not authorize the restore pipeline for automatic queueing.
5. Use a read-only Graph application identity for backup/review.
6. Keep restore on a separate manual path with a separate write-enabled identity.
7. Apply Azure DevOps branch policies so `main` requires reviewer approval.
8. Set explicit retention and access-control policies for:
- Git repository
- build logs
- markdown/HTML/PDF artifacts
Suggested conservative variable posture:
```text
ENABLE_PR_AI_SUMMARY=<true|false according to approved deployment mode>
AUTO_REMEDIATE_ON_PR_REJECTION=false
AUTO_REMEDIATE_AFTER_MERGE=false
REQUIRE_CHANGE_TICKETS=true
```
## Out of Scope
This repository does not provide:
- endpoint malware protection,
- customer device telemetry collection,
- user authentication to a SaaS application,
- network ingress services,
- a standalone secrets vault,
- customer-managed key support within the application itself.
Those controls, where needed, come from Azure DevOps, Microsoft 365 / Entra, the chosen agent environment, and the customer's broader platform governance.
## Customer-Specific Items to Fill Before Sending
The following are deployment-specific and should be completed with the actual customer environment:
- Azure DevOps organization and project name
- whether the agent pool is Microsoft-hosted or self-hosted
- repo retention period
- build log retention period
- artifact retention period
- named reviewer groups and branch policies
- exact service principal names used for backup and restore
- which Azure OpenAI resource and deployment are used, if AI is enabled
- whether restore is manual-only or fully enabled
## Repository Evidence
The statements in this document are based on the implementation in:
- `README.md`
- `azure-pipelines.yml`
- `azure-pipelines-review-sync.yml`
- `azure-pipelines-restore.yml`
- `scripts/update_pr_review_summary.py`
- `scripts/apply_reviewer_rejections.py`
- `scripts/queue_post_merge_restore.py`
- `scripts/export_entra_baseline.py`

View File

@@ -0,0 +1,35 @@
<img src="./assets/astral-logo.svg" alt="ASTRAL logo" width="700" />
# ASTRAL Security Review Questionnaire
Prepared: 2026-03-27
This appendix is a shorter, copy/paste-friendly companion to the full ASTRAL security review package.
| Question | Answer |
| --- | --- |
| What is the system? | ASTRAL is an Azure DevOps pipeline workflow that exports Microsoft Intune and selected Entra ID configuration, stores approved baseline snapshots in Git, and raises configuration drift for review through rolling pull requests. |
| What deployment modes are supported? | The same repository can be operated in progressive modes: backup-only, review package, or full package with restore/remediation. AI is optional in all modes. |
| Is it a public-facing application? | No. It is an administrative pipeline workflow with no public UI or inbound application endpoint created by this repository. |
| Does it require inbound network access from the internet? | No. The implemented workflow is outbound-only over HTTPS. |
| What production systems does it access? | Microsoft Graph for Intune and Entra configuration, plus Azure DevOps APIs for pull request and pipeline operations. |
| Does it make production changes? | Backup and review pipelines are read-oriented against Microsoft Graph. The restore pipeline is write-capable and can apply approved baseline configuration back to the tenant when explicitly enabled and authorized. |
| What data is processed? | Administrative configuration data such as Intune policies, device configuration, enrollment profiles, apps, scripts, conditional access, named locations, authentication strengths, app registrations, and enterprise application metadata. |
| Does it process end-user business content? | It is not designed for business content. However, exported admin-authored scripts or custom payloads can contain sensitive operational data if the tenant already stores it there. |
| Where is data stored? | In the Azure DevOps Git repository, Azure DevOps pull requests/threads, build logs, and optional build artifacts such as markdown, HTML, and PDF documentation. |
| How does it authenticate to Microsoft Graph? | By obtaining a Microsoft Graph token at runtime through an Azure DevOps Azure service connection using workload identity / federated credential flow. |
| How does it authenticate to Azure DevOps APIs? | With `System.AccessToken` scoped to the pipeline identity. |
| Are long-lived secrets stored in the repository? | The pipeline logic does not require repository-stored application secrets. Runtime tokens are acquired during pipeline execution, but exported tenant content should still be treated as potentially sensitive and reviewed for embedded secrets in admin-authored scripts or custom payloads. |
| How are secrets handled in the pipeline? | The Graph access token is set as a secret pipeline variable. The implementation logs token claims and granted roles for diagnostics, but not the token value. |
| What minimum permissions are required? | Read-only Microsoft Graph application permissions for backup/review, and additional write permissions only for restore. Exact permissions are listed in the full package. |
| Is there separation between read and write access? | The code supports a safe separation model. For production, create separate read-only and write-enabled service principals/connections so backup and restore use different identities. |
| What change-control mechanism exists? | Drift is committed to dedicated workload branches and reviewed through rolling pull requests into `main`. New rolling PRs can be created as drafts until the automated summary is inserted, and optional per-file change-ticket threads and reviewer `/reject` commands are supported. |
| Can reviewers block or scope changes? | Yes. Reviewers can approve the rolling PR, reject it, or reject individual file-level drift items through PR threads when that feature is enabled. |
| Is rollback supported? | Yes. The restore pipeline supports full restore, selective restore by file path, historical restore by Git ref, and dry-run mode. |
| What external network destinations are required? | Microsoft Graph, Azure DevOps APIs, optional Azure OpenAI, Python package registry for `IntuneCD`, npm registry for `md-to-pdf`, and optionally OS package repositories when browser dependencies are installed for HTML/PDF generation. |
| Does the system send data to AI services? | Only if Azure OpenAI summary generation is explicitly configured. It is optional for the platform overall. |
| What AI service is intended? | A customer-controlled Azure OpenAI deployment configured through the Azure OpenAI endpoint and deployment variables, rather than an unrelated public AI service. |
| What data is sent to Azure OpenAI when enabled? | A reduced change-review payload containing changed paths, semantic summaries, deterministic summary text, and fingerprints derived from the repo diff. This is intended to support review summarization, not raw tenant-wide export ingestion. |
| Why is AI included? | The AI summary is meant to translate technical Intune and Entra drift into manager-readable language so PMs, management, and other non-specialist reviewers can understand impact and review intent without parsing raw policy JSON. |
| Recommended deployment posture? | Start with backup-only or review-package mode, enable Azure OpenAI only on a customer-controlled deployment when approved, keep auto-remediation disabled by default, and use separate read-only and write-enabled service principals if restore is enabled. |
| What customer-specific controls still need to be defined? | Agent type and hardening, repo/build/artifact retention, exact access groups, branch policies, and whether restore or Azure OpenAI are enabled in the target deployment. |

30
md2pdf/README.md Normal file
View File

@@ -0,0 +1,30 @@
# Automated Microsoft Intune backup and as-built
A template repository that you can clone to enable a Microsoft Intune tenant backup and as-built report using [IntuneCD](https://github.com/almenscorner/IntuneCD) and [md-to-pdf](https://github.com/simonhaenisch/md-to-pdf).
To learn how to use this repository, see these articles:
- [Automate Microsoft Intune As-Built Documentation on GitHub](https://stealthpuppy.com/automate-intune-documentation-github/)
- [Automate Microsoft Intune As-Built Documentation on Azure DevOps](https://stealthpuppy.com/automate-intune-documentation-azure/)
## Example report
The generated as-built documentation will look something like:
![As-built documentation screenshot](.img/asbuilt-sample.png)
## GitHub
After creating a new repository in GitHub based on this template, you'll need to enable Actions in the repository settings and add the secrets required by the workflows.
This template repository includes the following workflows:
* [`intune-backup.yml`](.github/workflows/intune-backup.yml) - performs the export from the Intune tenant to create a backup, generates a markdown version of the as-built document, and tags the release
* [`intune-release.yml`](.github/workflows/intune-release.yml) - generates PDF and HTML versions of the markdown document, creates a release, and adds the documents to the release as assets
* [`remove-releases.yml`](.github/workflows/remove-releases.yml) - prunes the release assets to keep the last 60 releases
## Azure DevOps
Clone this repository into GitHub or Azure DevOps, then import into a project and create a pipeline:
* [`intune-backup.yml`](.devops/intune-backup.yml) - performs the export from the Intune tenant to create a backup, generates a markdown version of the as-built document, tags the release, generates PDF and HTML versions of the markdown document, creates a release, and adds the documents to the release as assets

9
md2pdf/htmlconfig.json Normal file
View File

@@ -0,0 +1,9 @@
{
"stylesheet": [
"./md2pdf/htmlstyle.css"
],
"marked_options": {
"headerIds": false,
"smartypants": true
}
}

140
md2pdf/htmlstyle.css Normal file
View File

@@ -0,0 +1,140 @@
* {
box-sizing: border-box;
}
html {
font-size: 100%;
}
body {
font-family: 'Segoe UI', 'Roboto', 'Oxygen', 'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif;
line-height: 1.6;
font-size: 0.6875em; /* 11px at the 16px default root size */
color: #111;
margin: 0;
}
body > :first-child {
padding-top: 0;
margin-top: 0;
}
body > :last-child {
margin-bottom: 0;
padding-bottom: 0;
}
h1,
h2,
h3,
h4,
h5,
h6 {
margin: 0;
padding: 0.5em 0 0.25em;
text-transform: uppercase;
}
h5,
h6 {
padding: 0;
}
h5 {
font-size: 1em;
}
h6 {
font-size: 0.875em;
}
p {
margin: 0.25em 0 1em;
}
blockquote {
margin: 0.5em 0 1em;
padding-left: 0.5em;
padding-right: 1em;
border-left: 4px solid gainsboro;
font-style: italic;
}
ul,
ol {
margin: 0;
margin-left: 1em;
padding: 0 1.5em 0.5em;
}
pre {
white-space: pre-wrap;
}
h1 code,
h2 code,
h3 code,
h4 code,
h5 code,
h6 code,
p code,
li code,
pre code {
background-color: #f8f8f8;
padding: 0.1em 0.375em;
border: 1px solid #f8f8f8;
border-radius: 0.25em;
font-family: monospace;
font-size: 1.2em;
}
pre code {
display: block;
padding: 0.5em;
}
.page-break {
page-break-after: always;
}
img {
max-width: 100%;
margin: 1em 0;
}
table {
border-spacing: 0;
border-collapse: collapse;
margin: 0 0 1em;
display: block;
width: 100%;
overflow: auto;
table-layout: auto;
}
table th,
table td {
padding: 0.5em 1em;
border: 1px solid gainsboro;
}
table th {
font-weight: 600;
text-transform: uppercase;
}
table tr {
background-color: white;
border-top: 1px solid gainsboro;
}
table tr:nth-child(2n) {
background-color: whitesmoke;
}
section {
margin: 0 auto;
font-family: 'Segoe UI', 'Roboto', 'Oxygen', 'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif !important;
font-size: 9px;
}

17
md2pdf/pdfconfig.json Normal file
View File

@@ -0,0 +1,17 @@
{
"stylesheet": [
"./md2pdf/pdfstyle.css"
],
"marked_options": {
"headerIds": false,
"smartypants": true
},
"pdf_options": {
"format": "A4",
"margin": "15mm",
"printBackground": false,
"headerTemplate": "<style> section { margin: 0 auto; font-family: sans-serif !important; font-size: 9px; } </style><section><span>ASTRAL Documentation</span></section>",
"footerTemplate": "<style> section { margin: 0 auto; font-family: sans-serif !important; font-size: 9px; } </style><section><div>Page <span class='pageNumber'></span> of <span class='totalPages'></span></div></section>",
"displayHeaderFooter": true
}
}

179
md2pdf/pdfstyle.css Normal file
View File

@@ -0,0 +1,179 @@
* {
box-sizing: border-box;
}
@page {
margin: 18mm 14mm 18mm 14mm;
}
html {
font-size: 100%;
}
body {
font-family: 'Segoe UI', 'Roboto', 'Oxygen', 'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif;
line-height: 1.6;
font-size: 0.6875em; /* 11px at the 16px default root size */
color: #111;
margin: 0;
orphans: 3;
widows: 3;
}
body > :first-child {
padding-top: 0;
margin-top: 0;
}
body > :last-child {
margin-bottom: 0;
padding-bottom: 0;
}
h1,
h2,
h3,
h4,
h5,
h6 {
margin: 0;
padding: 0.5em 0 0.25em;
text-transform: uppercase;
page-break-after: avoid;
break-after: avoid-page;
page-break-inside: avoid;
break-inside: avoid-page;
}
h5,
h6 {
padding: 0;
}
h5 {
font-size: 1em;
}
h6 {
font-size: 0.875em;
}
p {
margin: 0.25em 0 1em;
}
blockquote {
margin: 0.5em 0 1em;
padding-left: 0.5em;
padding-right: 1em;
border-left: 4px solid gainsboro;
font-style: italic;
page-break-inside: avoid;
break-inside: avoid-page;
}
ul,
ol {
margin: 0;
margin-left: 1em;
padding: 0 1.5em 0.5em;
page-break-inside: avoid;
break-inside: avoid-page;
}
li {
page-break-inside: avoid;
break-inside: avoid-page;
}
pre {
white-space: pre-wrap;
page-break-inside: avoid;
break-inside: avoid-page;
}
h1 code,
h2 code,
h3 code,
h4 code,
h5 code,
h6 code,
p code,
li code,
pre code {
background-color: #f8f8f8;
padding: 0.1em 0.375em;
border: 1px solid #f8f8f8;
border-radius: 0.25em;
font-family: monospace;
font-size: 1.2em;
}
pre code {
display: block;
padding: 0.5em;
}
.page-break {
page-break-after: always;
}
img {
max-width: 100%;
margin: 0.5em 0 1em;
page-break-inside: avoid;
break-inside: avoid-page;
}
img[alt="ASTRAL logo"] {
max-width: 62%;
margin: 0 0 0.75em;
}
table {
border-spacing: 0;
border-collapse: collapse;
margin: 0 0 1em;
width: 100%;
overflow: auto;
table-layout: auto;
page-break-inside: avoid;
break-inside: avoid-page;
}
table th,
table td {
padding: 0.5em 1em;
border: 1px solid gainsboro;
vertical-align: top;
}
table th {
font-weight: 600;
text-transform: uppercase;
}
table tr {
background-color: white;
border-top: 1px solid gainsboro;
page-break-inside: avoid;
break-inside: avoid-page;
}
table tr:nth-child(2n) {
background-color: whitesmoke;
}
hr,
svg,
figure {
page-break-inside: avoid;
break-inside: avoid-page;
}
section {
margin: 0 auto;
font-family: 'Segoe UI', 'Roboto', 'Oxygen', 'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif !important;
font-size: 9px;
}

28
pyproject.toml Normal file
View File

@@ -0,0 +1,28 @@
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"
[project]
name = "intune-entra-drift-backup"
version = "0.0.0"
description = "Git-based snapshots of Microsoft Intune and Entra ID configuration"
requires-python = ">=3.11"
dependencies = [
"IntuneCD==2.5.0",
]
[tool.ruff]
line-length = 120
target-version = "py311"
[tool.ruff.lint]
select = ["E", "F", "I", "W", "UP", "B", "C4", "SIM"]
ignore = ["E501"]
[tool.mypy]
python_version = "3.11"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = false
check_untyped_defs = true
ignore_missing_imports = true

1
requirements.txt Normal file
View File

@@ -0,0 +1 @@
IntuneCD==2.5.0

View File

@@ -0,0 +1,316 @@
#!/usr/bin/env python3
"""Apply per-policy reviewer reject decisions on rolling drift PRs.
Reviewer decision format inside auto Change Needed threads:
- /reject -> remove this file-level drift from rolling PR (reset to baseline)
- /accept -> keep this file-level drift
Latest decision command in the thread wins.
"""
from __future__ import annotations
import argparse
import base64
import os
import re
import subprocess
import sys
import urllib.parse
from pathlib import Path
from typing import Any
# common.py lives in the same directory; ensure it can be imported when the
# script is executed directly.
_sys_path_inserted = False
if __file__:
_script_dir = str(Path(__file__).resolve().parent)
if _script_dir not in sys.path:
sys.path.insert(0, _script_dir)
_sys_path_inserted = True
import common
if _sys_path_inserted:
sys.path.pop(0)
_request_json = common.request_json
_run_git = common.run_git
_configure_git_identity = common.configure_git_identity
AUTO_TICKET_THREAD_PREFIX = "AUTO-CHANGE-TICKET:"
THREAD_STATUS_FIXED = 2
THREAD_STATUS_WONT_FIX = 3
THREAD_STATUS_CLOSED = 4
THREAD_STATUS_BY_DESIGN = 5
DECISION_RE = re.compile(r"(?im)^\s*(?:/|#)?(?P<decision>reject|accept)\b")
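# Illustrative decision matches (hypothetical comment bodies, not from thread data):
#   "/reject"           -> reject
#   "  #accept keep it" -> accept
#   "rejected earlier"  -> no match (the bare word must end at a word boundary)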
def _run_diff_name_only(repo_root: str, baseline_branch: str, drift_branch: str) -> str:
three_dot = f"origin/{baseline_branch}...origin/{drift_branch}"
two_dot = f"origin/{baseline_branch}..origin/{drift_branch}"
try:
return _run_git(repo_root, ["diff", "--name-only", three_dot])
except RuntimeError as exc:
stderr = str(exc).lower()
if "no merge base" not in stderr:
raise
print(
"WARNING: No merge base for rolling branches "
f"(origin/{baseline_branch}, origin/{drift_branch}); using direct diff."
)
return _run_git(repo_root, ["diff", "--name-only", two_dot])
def _git_path_exists(repo_root: str, treeish: str, path: str) -> bool:
proc = subprocess.run(
["git", "cat-file", "-e", f"{treeish}:{path}"],
cwd=repo_root,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
return proc.returncode == 0
def _normalize_branch_name(branch: str) -> str:
b = branch.strip()
if b.startswith("refs/heads/"):
return b[len("refs/heads/") :]
return b
def _thread_status_code(thread: dict[str, Any]) -> int:
status = thread.get("status")
if isinstance(status, int):
return status
if isinstance(status, str):
mapping = {
"fixed": THREAD_STATUS_FIXED,
"wontfix": THREAD_STATUS_WONT_FIX,
"closed": THREAD_STATUS_CLOSED,
"bydesign": THREAD_STATUS_BY_DESIGN,
}
return mapping.get(status.strip().lower(), 1)
return 1
def _is_thread_resolved(thread: dict[str, Any]) -> bool:
return _thread_status_code(thread) in (
THREAD_STATUS_FIXED,
THREAD_STATUS_WONT_FIX,
THREAD_STATUS_CLOSED,
THREAD_STATUS_BY_DESIGN,
)
def _ticket_path_from_content(content: str) -> str | None:
marker_re = re.compile(r"<!--\s*" + re.escape(AUTO_TICKET_THREAD_PREFIX) + r"(?P<id>[A-Za-z0-9_-]+)\s*-->")
match = marker_re.search(content or "")
if not match:
return None
encoded = match.group("id")
padding = "=" * ((4 - len(encoded) % 4) % 4)
try:
return base64.urlsafe_b64decode((encoded + padding).encode("ascii")).decode("utf-8")
except Exception:
return None
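# Round-trip sketch (hypothetical path for illustration): the thread marker embeds
# base64.urlsafe_b64encode(b"tenant-state/intune/policy.json").rstrip(b"=") inside
# "<!--AUTO-CHANGE-TICKET:...-->"; the padding is restored here before decoding.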
def _is_doc_like(path: str) -> bool:
    lp = path.lower()
    return lp.endswith(".md") or lp.endswith(".markdown") or "/docs/" in f"/{lp}"
def _is_report_like(path: str) -> bool:
    lp = path.lower()
    return "/reports/" in f"/{lp}" or "assignment report" in lp
def _latest_thread_decision(comments: list[dict[str, Any]]) -> str | None:
decision: str | None = None
def _comment_sort_key(c: dict[str, Any]) -> tuple[int, int]:
try:
cid = int(c.get("id", 0))
except Exception:
cid = 0
try:
parent = int(c.get("parentCommentId", 0))
except Exception:
parent = 0
return (cid, parent)
for comment in sorted(comments, key=_comment_sort_key):
content = str(comment.get("content", "") or "")
match = DECISION_RE.search(content)
if match:
decision = match.group("decision").lower()
return decision
def _post_thread_comment(repo_api: str, pr_id: int, thread_id: int, token: str, content: str) -> None:
_request_json(
f"{repo_api}/pullrequests/{pr_id}/threads/{thread_id}/comments?api-version=7.1",
token=token,
method="POST",
body={
"parentCommentId": 0,
"content": content,
"commentType": 1,
},
)
def main() -> int:
parser = argparse.ArgumentParser(description="Apply reviewer /reject decisions for rolling PR threads")
parser.add_argument("--repo-root", required=True)
parser.add_argument("--workload", required=True)
parser.add_argument("--drift-branch", required=True)
parser.add_argument("--baseline-branch", required=True)
args = parser.parse_args()
token = os.environ.get("SYSTEM_ACCESSTOKEN", "").strip()
if not token:
raise SystemExit("SYSTEM_ACCESSTOKEN is empty.")
collection_uri = os.environ["SYSTEM_COLLECTIONURI"].rstrip("/")
project = os.environ["SYSTEM_TEAMPROJECT"]
repository_id = os.environ["BUILD_REPOSITORY_ID"]
drift_branch = _normalize_branch_name(args.drift_branch)
baseline_branch = _normalize_branch_name(args.baseline_branch)
repo_api = f"{collection_uri}/{project}/_apis/git/repositories/{repository_id}"
source_ref = f"refs/heads/{drift_branch}"
target_ref = f"refs/heads/{baseline_branch}"
query = urllib.parse.urlencode(
{
"searchCriteria.status": "active",
"searchCriteria.sourceRefName": source_ref,
"searchCriteria.targetRefName": target_ref,
"api-version": "7.1",
},
quote_via=urllib.parse.quote,
safe="/",
)
payload = _request_json(f"{repo_api}/pullrequests?{query}", token=token)
prs = payload.get("value", []) if isinstance(payload, dict) else []
if not prs:
print("No active rolling PR found; skipping reviewer reject sync.")
return 0
pr = prs[0]
pr_id = int(pr.get("pullRequestId"))
_run_git(args.repo_root, ["fetch", "--quiet", "origin", baseline_branch, drift_branch])
diff_paths = _run_diff_name_only(args.repo_root, baseline_branch, drift_branch)
changed_paths = {
p.strip()
for p in diff_paths.splitlines()
if p.strip() and not _is_doc_like(p.strip()) and not _is_report_like(p.strip())
}
if not changed_paths:
print("No changed policy paths in rolling PR; nothing to auto-reject.")
return 0
threads_payload = _request_json(f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1", token=token)
threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
rejections: list[tuple[str, int]] = []
examined_ticket_threads = 0
for thread in threads:
comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
marker_path: str | None = None
for c in comments:
marker_path = _ticket_path_from_content(str(c.get("content", "") or ""))
if marker_path:
break
if not marker_path:
continue
examined_ticket_threads += 1
if marker_path not in changed_paths:
continue
decision = _latest_thread_decision(comments)
if decision == "reject":
try:
thread_id = int(thread.get("id"))
except Exception:
thread_id = -1
rejections.append((marker_path, thread_id))
if not rejections:
print(
"No /reject decisions found in auto policy threads "
f"(examined={examined_ticket_threads}, changed_paths={len(changed_paths)})."
)
return 0
print(
"Detected /reject decisions in auto policy threads: "
f"{len(rejections)} (examined={examined_ticket_threads})."
)
_run_git(args.repo_root, ["checkout", "--quiet", "--force", "-B", drift_branch, f"origin/{drift_branch}"])
changed = 0
baseline_tree = f"origin/{baseline_branch}"
for path, _thread_id in sorted(set(rejections)):
if _git_path_exists(args.repo_root, baseline_tree, path):
_run_git(args.repo_root, ["checkout", baseline_tree, "--", path])
_run_git(args.repo_root, ["add", "--", path])
changed += 1
else:
file_abs = os.path.join(args.repo_root, path)
if os.path.exists(file_abs):
_run_git(args.repo_root, ["rm", "-f", "--", path])
changed += 1
proc = subprocess.run(
["git", "diff", "--cached", "--quiet"],
cwd=args.repo_root,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
if proc.returncode == 0:
print("Reviewer /reject decisions found, but no effective diff remained after baseline reset.")
return 0
_configure_git_identity(args.repo_root)
commit_msg = f"Apply reviewer /reject decisions ({args.workload})"
_run_git(args.repo_root, ["commit", "-m", commit_msg])
_run_git(args.repo_root, ["push", "--force-with-lease", "origin", f"HEAD:{drift_branch}"])
for path, thread_id in rejections:
if thread_id <= 0:
continue
_post_thread_comment(
repo_api=repo_api,
pr_id=pr_id,
thread_id=thread_id,
token=token,
content=(
"Auto-action: /reject detected. This policy drift was reset to baseline on the rolling drift branch, "
"so it is removed from the PR diff.\n\n"
"If tenant rollback is required immediately, run restore pipeline as remediation."
),
)
print(
f"Applied reviewer /reject decisions for {changed} path(s) in PR #{pr_id}; "
f"drift branch '{drift_branch}' updated."
)
return 0
if __name__ == "__main__":
try:
raise SystemExit(main())
except Exception as exc:
print(f"WARNING: Failed to apply reviewer /reject decisions: {exc}", file=sys.stderr)
raise

View File

@@ -0,0 +1,395 @@
#!/usr/bin/env python3
"""Commit Entra drift changes with best-effort change-author attribution."""
from __future__ import annotations
import argparse
import datetime as dt
import json
import pathlib
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request
from collections import defaultdict
from dataclasses import dataclass
def _git_run(repo_root: pathlib.Path, args: list[str], check: bool = True) -> subprocess.CompletedProcess[str]:
proc = subprocess.run(
["git", *args],
cwd=str(repo_root),
check=False,
capture_output=True,
text=True,
)
if check and proc.returncode != 0:
stderr = (proc.stderr or "").strip()
raise RuntimeError(f"git {' '.join(args)} failed ({proc.returncode}): {stderr}")
return proc
def _set_output_var(name: str, value: str, is_output: bool = True) -> None:
suffix = ";isOutput=true" if is_output else ""
print(f"##vso[task.setvariable variable={name}{suffix}]{value}")
def _warning(message: str) -> None:
print(f"##vso[task.logissue type=warning]{message}")
def _parse_backup_start(value: str) -> dt.datetime:
candidate = value.strip()
if not candidate:
raise ValueError("Missing required --backup-start value. Ensure the pipeline sets BACKUP_START in the backup_entra job before invoking commit_entra_drift.py.")
parsed = dt.datetime.strptime(candidate, "%Y.%m.%d:%H.%M.%S")
return parsed.replace(tzinfo=dt.timezone.utc)
def _format_filter_datetime(value: dt.datetime) -> str:
return value.astimezone(dt.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
def _last_entra_commit_date(repo_root: pathlib.Path, depth: int = 30) -> dt.datetime | None:
_git_run(repo_root, ["fetch", f"--depth={depth}"], check=False)
proc = _git_run(
repo_root,
[
"--no-pager",
"log",
"--no-show-signature",
f"-{depth}",
"--format=%s%%%cI",
],
)
for raw in proc.stdout.splitlines():
line = raw.strip()
if not line or "%%%" not in line:
continue
subject, iso_date = line.split("%%%", 1)
if subject.endswith(" (Entra)") and len(subject) >= 18 and subject[4] == ".":
try:
return dt.datetime.fromisoformat(iso_date.replace("Z", "+00:00")).astimezone(dt.timezone.utc)
except ValueError:
continue
return None
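# _commit_group() below writes subjects like "YYYY.MM.DD_HH.MM -- <author> (Entra)",
# which is what the length and subject[4] == "." checks above recognize.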
def _request_json(url: str, token: str) -> dict:
req = urllib.request.Request(
url,
headers={
"Authorization": f"Bearer {token}",
"Accept": "application/json",
},
method="GET",
)
with urllib.request.urlopen(req, timeout=60) as resp:
return json.loads(resp.read().decode("utf-8"))
@dataclass(frozen=True)
class Identity:
key: str
value: str
name: str
def _display_or_localpart(display_name: str, principal_name: str) -> str:
display_name = (display_name or "").strip()
if display_name:
return display_name
principal_name = (principal_name or "").strip()
if "@" in principal_name:
return principal_name.split("@", 1)[0]
return principal_name
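# e.g. an empty display name with principal "jane.doe@contoso.com" (hypothetical)
# falls back to the local part "jane.doe".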
def _extract_identity_from_audit(entry: dict) -> Identity | None:
initiated_by = entry.get("initiatedBy")
if not isinstance(initiated_by, dict):
return None
user = initiated_by.get("user")
if isinstance(user, dict):
principal_name = str(user.get("userPrincipalName") or user.get("email") or "").strip()
display_name = str(user.get("displayName") or "").strip()
if principal_name:
return Identity(
key=f"user:{principal_name}",
value=principal_name,
name=_display_or_localpart(display_name, principal_name),
)
if display_name:
return Identity(
key=f"display:{display_name}",
value=display_name,
name=display_name,
)
app = initiated_by.get("app")
if isinstance(app, dict):
display_name = str(app.get("displayName") or "").strip()
if display_name:
return Identity(
key=f"sp:{display_name}",
value=f"{display_name} (SP)",
name=display_name,
)
return None
def _fetch_directory_audits(
token: str,
last_commit_date: dt.datetime | None,
backup_start: dt.datetime,
) -> list[dict]:
params = {
"$top": "999",
"$select": "activityDateTime,activityDisplayName,category,result,initiatedBy,targetResources",
}
filter_parts = [f"activityDateTime le {_format_filter_datetime(backup_start)}"]
if last_commit_date is not None:
filter_parts.append(f"activityDateTime ge {_format_filter_datetime(last_commit_date)}")
params["$filter"] = " and ".join(filter_parts)
url = f"https://graph.microsoft.com/v1.0/auditLogs/directoryAudits?{urllib.parse.urlencode(params)}"
results: list[dict] = []
while url:
payload = _request_json(url, token)
value = payload.get("value")
if isinstance(value, list):
results.extend(item for item in value if isinstance(item, dict))
next_link = payload.get("@odata.nextLink")
url = str(next_link).strip() if next_link else ""
return results
def _resource_id_from_path(path: str) -> str:
pure = pathlib.PurePosixPath(path)
if pure.suffix.lower() != ".json":
return ""
stem = pure.stem
if "__" not in stem:
return ""
return stem.rsplit("__", 1)[-1].lstrip("_").strip()
def _category_key(path: str) -> str:
pure = pathlib.PurePosixPath(path)
parts = pure.parts
if len(parts) < 3:
return ""
return "/".join(parts[:3])
def _fallback_identity(name: str, email: str) -> Identity:
return Identity(key=f"fallback:{email}", value=email, name=name)
def _effective_fallback_identity(
build_reason: str,
requested_for: str,
requested_for_email: str,
service_name: str,
service_email: str,
) -> Identity:
requested_for_email = requested_for_email.strip()
if build_reason.strip() != "Schedule" and "@" in requested_for_email:
requested_for = requested_for.strip() or requested_for_email.split("@", 1)[0]
return _fallback_identity(requested_for, requested_for_email)
return _fallback_identity(service_name.strip(), service_email.strip())
def _changed_files(repo_root: pathlib.Path, workload_root: str) -> list[str]:
proc = _git_run(repo_root, ["diff", "--cached", "--name-only", "--", workload_root])
return [line.strip() for line in proc.stdout.splitlines() if line.strip()]
def _remote_diff_is_empty(repo_root: pathlib.Path, drift_branch: str, workload_root: str) -> bool:
remote_ref = f"refs/remotes/origin/{drift_branch}"
if _git_run(repo_root, ["show-ref", "--verify", "--quiet", remote_ref], check=False).returncode != 0:
return False
return _git_run(repo_root, ["diff", "--quiet", f"origin/{drift_branch}", "--", workload_root], check=False).returncode == 0
def _build_author_groups(
changed_files: list[str],
audits: list[dict],
fallback: Identity,
) -> tuple[dict[str, dict[str, list[str] | list[Identity]]], int]:
identities_by_resource: dict[str, dict[str, Identity]] = defaultdict(dict)
for audit in audits:
result = str(audit.get("result") or "").strip().lower()
if result and result != "success":
continue
identity = _extract_identity_from_audit(audit)
if identity is None:
continue
target_resources = audit.get("targetResources")
if not isinstance(target_resources, list):
continue
for target in target_resources:
if not isinstance(target, dict):
continue
resource_id = str(target.get("id") or "").strip()
if resource_id:
identities_by_resource[resource_id][identity.key] = identity
resolved_by_category: dict[str, dict[str, Identity]] = defaultdict(dict)
file_identities: dict[str, list[Identity]] = {}
unresolved_count = 0
for path in changed_files:
resource_id = _resource_id_from_path(path)
identities = list(identities_by_resource.get(resource_id, {}).values())
if identities:
file_identities[path] = sorted(identities, key=lambda item: item.key)
for identity in file_identities[path]:
resolved_by_category[_category_key(path)][identity.key] = identity
else:
file_identities[path] = []
if resource_id:
unresolved_count += 1
for path in changed_files:
if file_identities[path]:
continue
category_identities = list(resolved_by_category.get(_category_key(path), {}).values())
if category_identities:
file_identities[path] = sorted(category_identities, key=lambda item: item.key)
else:
file_identities[path] = [fallback]
grouped: dict[str, dict[str, list[str] | list[Identity]]] = {}
for path in changed_files:
identities = file_identities[path] or [fallback]
group_key = "&".join(identity.key for identity in identities)
entry = grouped.setdefault(group_key, {"files": [], "identities": identities})
files = entry["files"]
assert isinstance(files, list)
files.append(path)
return grouped, unresolved_count
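# Attribution strategy: files are first matched to audit actors via the resource
# id embedded in the filename; files without a direct match inherit identities
# already resolved for their category folder; anything still unmatched falls back
# to the pipeline identity. Files sharing an identity set are committed together.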
def _commit_group(
repo_root: pathlib.Path,
files: list[str],
identities: list[Identity],
backup_start: dt.datetime,
) -> None:
for path in files:
print(f"\t- Adding {repo_root / path}")
_git_run(repo_root, ["add", "--all", "--", path])
author_name = ", ".join(identity.name for identity in identities)
author_email = ", ".join(identity.value for identity in identities)
print(f"\t- Setting commit author(s): {author_name}")
_git_run(repo_root, ["config", "user.name", author_name])
_git_run(repo_root, ["config", "user.email", author_email])
commit_date = backup_start.astimezone(dt.timezone.utc).strftime("%Y.%m.%d_%H.%M")
commit_name = f"{commit_date} -- {author_name} (Entra)"
print(f"\t- Creating commit '{commit_name}'")
_git_run(repo_root, ["commit", "-m", commit_name])
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--repo-root", required=True)
parser.add_argument("--workload-root", required=True)
parser.add_argument("--baseline-branch", required=True)
parser.add_argument("--drift-branch", required=True)
parser.add_argument("--access-token", required=True)
parser.add_argument("--service-name", required=True)
parser.add_argument("--service-email", required=True)
parser.add_argument("--build-reason", default="")
parser.add_argument("--requested-for", default="")
parser.add_argument("--requested-for-email", default="")
parser.add_argument("--backup-start", required=True)
args = parser.parse_args()
repo_root = pathlib.Path(args.repo_root).resolve()
workload_root = args.workload_root.strip().strip("/")
fallback = _effective_fallback_identity(
build_reason=args.build_reason,
requested_for=args.requested_for,
requested_for_email=args.requested_for_email,
service_name=args.service_name,
service_email=args.service_email,
)
_git_run(repo_root, ["config", "user.name", fallback.name])
_git_run(repo_root, ["config", "user.email", fallback.value])
_git_run(repo_root, ["add", "--all", "--", workload_root])
changed_files = _changed_files(repo_root, workload_root)
if not changed_files:
print("No Entra change detected")
_set_output_var("CHANGE_DETECTED", "0")
_set_output_var("ROLLING_PR_SYNC_REQUIRED", "0")
return 0
if _remote_diff_is_empty(repo_root, args.drift_branch, workload_root):
print("No Entra change detected (snapshot identical to existing drift branch)")
_set_output_var("CHANGE_DETECTED", "0")
_set_output_var("ROLLING_PR_SYNC_REQUIRED", "1")
return 0
backup_start = _parse_backup_start(args.backup_start)
last_commit_date = _last_entra_commit_date(repo_root)
if last_commit_date is None:
_warning("Unable to obtain date of the last Entra backup config commit. All Entra audit events in the current query window will be considered.")
audits: list[dict] = []
try:
print("Getting Entra directory audit logs")
print(f"\t- from: '{last_commit_date}' (UTC) to: '{backup_start}' (UTC)")
audits = _fetch_directory_audits(args.access_token, last_commit_date, backup_start)
except urllib.error.HTTPError as exc:
if exc.code in (401, 403):
_warning("Graph token cannot read Entra directory audit logs. Falling back to pipeline identity for unresolved Entra changes.")
else:
raise
except Exception as exc: # pragma: no cover - defensive path for pipeline runtime issues
_warning(f"Unable to query Entra directory audit logs ({exc}). Falling back to pipeline identity for unresolved Entra changes.")
groups, unresolved_count = _build_author_groups(changed_files, audits, fallback)
if unresolved_count > 0:
_warning(
f"Unable to resolve author from Entra audit logs for {unresolved_count} of {len(changed_files)} changed files. Fallback identity used where needed."
)
_git_run(repo_root, ["reset", "--quiet", "--", workload_root])
print("\nCommit changes")
for group in groups.values():
files = group["files"]
identities = group["identities"]
assert isinstance(files, list)
assert isinstance(identities, list)
_commit_group(repo_root, files, identities, backup_start)
unpushed = _git_run(repo_root, ["cherry", "-v", f"origin/{args.baseline_branch}"]).stdout.strip()
if not unpushed:
_warning("Nothing to commit?! This shouldn't happen.")
_set_output_var("CHANGE_DETECTED", "0")
_set_output_var("ROLLING_PR_SYNC_REQUIRED", "0")
return 0
_git_run(repo_root, ["push", "--force-with-lease", "origin", f"HEAD:{args.drift_branch}"])
commit_sha = _git_run(repo_root, ["rev-parse", "HEAD"]).stdout.strip()
modification_authors = sorted({identity.value for group in groups.values() for identity in group["identities"]}) # type: ignore[index]
_set_output_var("CHANGE_DETECTED", "1")
_set_output_var("ROLLING_PR_SYNC_REQUIRED", "1")
_set_output_var("COMMIT_SHA", commit_sha)
_set_output_var("COMMIT_DATE", backup_start.strftime("%Y.%m.%d_%H.%M"))
_set_output_var("MODIFICATION_AUTHOR", ", ".join(modification_authors))
return 0
if __name__ == "__main__":
try:
raise SystemExit(main())
except Exception as exc:
print(str(exc), file=sys.stderr)
raise

164
scripts/common.py Normal file
View File

@@ -0,0 +1,164 @@
#!/usr/bin/env python3
"""Shared utilities for Intune / Entra drift backup scripts."""
from __future__ import annotations
import json
import os
import re
import subprocess
import time
import urllib.error
import urllib.request
from typing import Any
def env_text(name: str, default: str = "") -> str:
"""Read and sanitize an environment variable, treating unresolved Azure DevOps
macros $(...) as empty.
"""
raw = os.environ.get(name)
if raw is None:
return default
value = raw.strip()
if re.fullmatch(r"\$\([^)]+\)", value):
return default
if not value:
return default
return value
def env_bool(name: str, default: bool = False) -> bool:
"""Interpret an environment variable as a boolean."""
raw = env_text(name, "")
if not raw:
return default
return raw.lower() in {"1", "true", "yes", "y", "on"}
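# e.g. env_bool("AUTO_REMEDIATE_DRY_RUN") is True for "1", "true", "Yes", or "on";
# an unresolved "$(...)" pipeline macro is treated as unset and yields the default.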
def normalize_exclude_csv(value: str) -> str:
"""Normalize an exclude CSV value, treating sentinel values as empty."""
normalized = str(value or "").strip()
if normalized.lower() in {"", "none", "null", "n/a", "-", "_none_"}:
return ""
return normalized
def normalize_merge_strategy(value: str) -> str:
"""Normalize a merge strategy string to an Azure DevOps API value."""
raw = (value or "").strip().lower().replace("-", "").replace("_", "")
aliases = {
"nofastforward": "noFastForward",
"mergecommit": "noFastForward",
"merge": "noFastForward",
"squash": "squash",
"rebase": "rebase",
"rebasefastforward": "rebase",
"rebaseff": "rebase",
"rebasemerge": "rebaseMerge",
}
return aliases.get(raw, "rebase")
def _get_retry_after_seconds(error: urllib.error.HTTPError) -> float | None:
try:
retry_after = error.headers.get("Retry-After")
if retry_after:
return float(retry_after)
except Exception:
pass
return None
def request_json(
url: str,
method: str = "GET",
body: dict[str, Any] | None = None,
headers: dict[str, str] | None = None,
token: str | None = None,
timeout: float = 60,
max_retries: int = 0,
) -> Any:
"""Make a JSON HTTP request and return the parsed response.
If *token* is provided, an Authorization header is added automatically.
If *max_retries* is greater than zero, transient HTTP errors (429, 500,
502, 503, 504) are retried with exponential back-off.
"""
req_headers: dict[str, str] = {
"Accept": "application/json",
}
if token is not None:
req_headers["Authorization"] = f"Bearer {token}"
if headers is not None:
req_headers.update(headers)
payload: bytes | None = None
if body is not None:
payload = json.dumps(body).encode("utf-8")
req_headers.setdefault("Content-Type", "application/json")
retry_codes = {429, 500, 502, 503, 504}
last_error: Exception | None = None
for attempt in range(max_retries + 1):
req = urllib.request.Request(
url,
data=payload,
method=method,
headers=req_headers,
)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
return json.loads(resp.read().decode("utf-8"))
except urllib.error.HTTPError as exc:
last_error = exc
if exc.code not in retry_codes or attempt == max_retries:
raise
retry_after = _get_retry_after_seconds(exc)
sleep = retry_after if retry_after is not None else (2 ** attempt)
time.sleep(sleep)
except urllib.error.URLError as exc:
last_error = exc
if attempt == max_retries:
raise
time.sleep(2 ** attempt)
# Should never be reached; satisfy type checker.
if last_error is not None:
raise last_error
raise RuntimeError("request_json exhausted all retries")
def run_git(repo_root: str | os.PathLike[str], args: list[str], check: bool = True) -> str:
"""Run a git command and return stdout as a stripped string."""
proc = subprocess.run(
["git", *args],
cwd=str(repo_root),
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
check=False,
)
if check and proc.returncode != 0:
stderr = (proc.stderr or "").strip()
raise RuntimeError(f"git {' '.join(args)} failed ({proc.returncode}): {stderr}")
return proc.stdout.strip()
def configure_git_identity(
repo_root: str | os.PathLike[str],
fallback_name: str | None = None,
fallback_email: str | None = None,
) -> None:
"""Configure git user.name and user.email from pipeline env vars."""
requested_for = (os.environ.get("BUILD_REQUESTEDFOR") or "").strip()
requested_for_email = (os.environ.get("BUILD_REQUESTEDFOREMAIL") or "").strip()
fallback_name = (fallback_name or os.environ.get("USER_NAME") or "ASTRAL Backup Service").strip()
fallback_email = (fallback_email or os.environ.get("USER_EMAIL") or "intune-backup@local.invalid").strip()
author_name = requested_for or fallback_name
author_email = requested_for_email if "@" in requested_for_email else fallback_email
run_git(repo_root, ["config", "user.name", author_name])
run_git(repo_root, ["config", "user.email", author_email])

View File

@@ -0,0 +1,203 @@
#!/usr/bin/env python3
"""
Lightweight Azure OpenAI availability precheck for pipeline diagnostics.
This script is intentionally non-blocking: it always exits 0.
"""
from __future__ import annotations
import json
import os
import sys
from urllib.error import HTTPError, URLError
from urllib.parse import quote, urlsplit
from urllib.request import Request, urlopen
def _env(name: str, default: str = "") -> str:
return os.environ.get(name, default).strip()
def _set_pipeline_var(name: str, value: str) -> None:
print(f"##vso[task.setvariable variable={name}]{value}")
def _normalize_aoai_endpoint(endpoint: str) -> str:
cleaned = endpoint.strip().rstrip("/")
if not cleaned:
return cleaned
parsed = urlsplit(cleaned)
if parsed.scheme and parsed.netloc:
cleaned = f"{parsed.scheme}://{parsed.netloc}"
marker = "/openai"
idx = cleaned.lower().find(marker)
if idx != -1:
return cleaned[:idx]
return cleaned
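# Example (hypothetical resource): "https://myres.openai.azure.com/openai/deployments/x"
# and "https://myres.openai.azure.com/" both normalize to "https://myres.openai.azure.com".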
def _preferred_aoai_token_param(deployment_name: str) -> str:
override = _env("AZURE_OPENAI_TOKEN_PARAM", "").lower()
if override in {"max_tokens", "max_completion_tokens"}:
return override
if deployment_name.strip().lower().startswith("gpt-5"):
return "max_completion_tokens"
return "max_tokens"
def _aoai_token_param_candidates(deployment_name: str) -> list[str]:
preferred = _preferred_aoai_token_param(deployment_name)
alternate = "max_completion_tokens" if preferred == "max_tokens" else "max_tokens"
return [preferred, alternate]
def _preferred_aoai_temperature(deployment_name: str) -> float | None:
override = _env("AZURE_OPENAI_TEMPERATURE", "").lower()
if override in {"default", "none", "omit"}:
return None
if override:
try:
return float(override)
except ValueError:
return None
if deployment_name.strip().lower().startswith("gpt-5"):
return None
return 0.0
def _aoai_temperature_candidates(deployment_name: str) -> list[float | None]:
preferred = _preferred_aoai_temperature(deployment_name)
if preferred is None:
return [None]
return [preferred, None]
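# gpt-5* deployments default to "max_completion_tokens" and omit temperature;
# other chat deployments default to "max_tokens" with temperature 0.0. Both
# defaults can be overridden via AZURE_OPENAI_TOKEN_PARAM / AZURE_OPENAI_TEMPERATURE.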
def main() -> int:
enabled = _env("ENABLE_PR_AI_SUMMARY", "true").lower() == "true"
if not enabled:
print("Azure OpenAI precheck skipped: ENABLE_PR_AI_SUMMARY=false")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
endpoint = _env("AZURE_OPENAI_ENDPOINT")
deployment = _env("AZURE_OPENAI_DEPLOYMENT")
api_key = _env("AZURE_OPENAI_API_KEY")
api_version = _env("AZURE_OPENAI_API_VERSION", "2024-12-01-preview")
if not endpoint or not deployment or not api_key:
print("Azure OpenAI precheck skipped: missing endpoint/deployment/api-key variable")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
endpoint_raw = endpoint
endpoint = _normalize_aoai_endpoint(endpoint_raw)
deployment_url = f"{endpoint}/openai/deployments/{quote(deployment)}/chat/completions?api-version={quote(api_version)}"
v1_url = f"{endpoint}/openai/v1/chat/completions"
print("Azure OpenAI precheck: starting")
print(f"- endpoint(raw): {endpoint_raw}")
print(f"- endpoint(normalized): {endpoint}")
print(f"- deployment: {deployment}")
print(f"- api_version: {api_version}")
prefer_v1 = endpoint.lower().endswith(".cognitiveservices.azure.com")
health_messages = [
{"role": "system", "content": "You are a health-check assistant."},
{"role": "user", "content": "Reply with: OK"},
]
for temperature in _aoai_temperature_candidates(deployment):
temperature_unsupported = False
for token_param in _aoai_token_param_candidates(deployment):
deployment_payload = {
"messages": health_messages,
token_param: 16,
}
v1_payload = {
"model": deployment,
"messages": health_messages,
token_param: 16,
}
if temperature is not None:
deployment_payload["temperature"] = temperature
v1_payload["temperature"] = temperature
routes = (
[("v1", v1_url, v1_payload), ("deployments", deployment_url, deployment_payload)]
if prefer_v1
else [("deployments", deployment_url, deployment_payload), ("v1", v1_url, v1_payload)]
)
token_param_unsupported = False
for route_name, route_url, payload in routes:
req = Request(
url=route_url,
method="POST",
data=json.dumps(payload).encode("utf-8"),
headers={
"Content-Type": "application/json",
"api-key": api_key,
},
)
try:
with urlopen(req, timeout=45) as resp:
_ = json.loads(resp.read().decode("utf-8"))
print(f"Azure OpenAI precheck: SUCCESS via {route_name} route")
_set_pipeline_var("AOAI_AVAILABLE", "1")
return 0
except HTTPError as exc:
raw = ""
try:
raw = exc.read().decode("utf-8", errors="replace")
except Exception:
raw = ""
print(f"Azure OpenAI precheck: HTTP {exc.code} via {route_name} route")
if raw:
print(raw)
if exc.code == 400:
raw_lower = raw.lower()
if "unsupported parameter" in raw_lower and f"'{token_param}'" in raw_lower:
token_param_unsupported = True
break
if "unsupported value" in raw_lower and "'temperature'" in raw_lower and temperature is not None:
temperature_unsupported = True
break
if exc.code == 404:
                        # 404 on this route: fall through to the alternate route in this pass.
continue
if exc.code in (401, 403):
print("Hint: Check AZURE_OPENAI_API_KEY and endpoint/resource pairing.")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
if exc.code == 400:
print("Hint: Check model/deployment name and API version compatibility.")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
except URLError as exc:
print(f"Azure OpenAI precheck: network error via {route_name} route: {exc}")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
except Exception as exc: # pragma: no cover
print(f"Azure OpenAI precheck: unexpected error via {route_name} route: {exc}")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
if temperature_unsupported:
break
if not token_param_unsupported:
break
if not temperature_unsupported:
break
print("Azure OpenAI precheck: no successful response from tested routes/token-params")
print("Hint: Verify AZURE_OPENAI_ENDPOINT points to the resource root, without /openai path suffix.")
print("Hint: Verify AZURE_OPENAI_DEPLOYMENT is the deployment name (for v1 this is passed as model).")
_set_pipeline_var("AOAI_AVAILABLE", "0")
return 0
if __name__ == "__main__":
sys.exit(main())

View File

@@ -0,0 +1,651 @@
#!/usr/bin/env python3
"""Create/update rolling drift PR and optionally queue remediation after rejection."""
from __future__ import annotations
import argparse
import hashlib
import os
import subprocess
import sys
import urllib.parse
from pathlib import Path
from typing import Any
# common.py lives in the same directory; ensure it can be imported when the
# script is executed directly.
_sys_path_inserted = False
if __file__:
_script_dir = str(Path(__file__).resolve().parent)
if _script_dir not in sys.path:
sys.path.insert(0, _script_dir)
_sys_path_inserted = True
import common
if _sys_path_inserted:
sys.path.pop(0)
_env_text = common.env_text
_env_bool = common.env_bool
_normalize_exclude_csv = common.normalize_exclude_csv
_normalize_merge_strategy = common.normalize_merge_strategy
_request_json = common.request_json
_run_git = common.run_git
def _query_prs(
repo_api: str,
headers: dict[str, str],
source_ref: str,
target_ref: str,
status: str,
) -> list[dict[str, Any]]:
query = urllib.parse.urlencode(
{
"searchCriteria.status": status,
"searchCriteria.sourceRefName": source_ref,
"searchCriteria.targetRefName": target_ref,
"api-version": "7.1",
},
quote_via=urllib.parse.quote,
safe="/",
)
url = f"{repo_api}/pullrequests?{query}"
payload = _request_json(url, headers=headers)
return payload.get("value", []) if isinstance(payload, dict) else []
def _normalize_branch(branch: str) -> str:
b = branch.strip()
if b.startswith("refs/heads/"):
return b[len("refs/heads/") :]
return b
def _ref_from_branch(branch: str) -> str:
return f"refs/heads/{_normalize_branch(branch)}"
def _pr_web_url(pr_payload: dict[str, Any]) -> str:
pr_id = pr_payload.get("pullRequestId")
return (
pr_payload.get("url", "")
.replace("_apis/git/repositories", "_git")
.replace(f"/pullRequests/{pr_id}", f"/pullrequest/{pr_id}")
)
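# Example (identifiers are placeholders): the REST URL
# ".../_apis/git/repositories/<repo>/pullRequests/42" becomes the browsable
# ".../_git/<repo>/pullrequest/42".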
def _current_tree_id(repo_root: str) -> str:
return _run_git(repo_root, ["rev-parse", "HEAD^{tree}"])
def _tree_id_for_commitish(repo_root: str, commitish: str) -> str:
return _run_git(repo_root, ["rev-parse", f"{commitish}^{{tree}}"])
def _ref_has_commit(repo_root: str, ref: str) -> bool:
proc = subprocess.run(
["git", "rev-parse", "--verify", "--quiet", f"{ref}^{{commit}}"],
cwd=repo_root,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
return proc.returncode == 0
def _commit_tree_id(repo_api: str, headers: dict[str, str], commit_id: str) -> str:
url = f"{repo_api}/commits/{commit_id}?api-version=7.1"
payload = _request_json(url, headers=headers)
tree_id = payload.get("treeId", "") if isinstance(payload, dict) else ""
return tree_id.strip()
def _latest_pr_by_creation(prs: list[dict[str, Any]]) -> list[dict[str, Any]]:
return sorted(prs, key=lambda x: x.get("creationDate", ""), reverse=True)
def _normalize_repo_path(path: str) -> str:
return str(path or "").replace("\\", "/").lstrip("./")
def _is_doc_like(path: str) -> bool:
lp = _normalize_repo_path(path).lower()
if lp.endswith((".md", ".html", ".htm", ".pdf", ".csv", ".txt")):
return True
return "/docs/" in f"/{lp}" or "/object inventory/" in f"/{lp}"
def _is_report_like(path: str) -> bool:
lp = _normalize_repo_path(path).lower()
return "/reports/" in f"/{lp}" or "/assignment report/" in f"/{lp}"
def _is_workload_config_path(path: str, workload_dir: str, backup_folder: str, reports_subdir: str) -> bool:
lp = _normalize_repo_path(path).lower()
backup_norm = _normalize_repo_path(backup_folder).lower().strip("/")
workload_norm = _normalize_repo_path(workload_dir).lower().strip("/")
reports_norm = _normalize_repo_path(reports_subdir).lower().strip("/")
if not backup_norm or not workload_norm:
return False
workload_prefix = f"{backup_norm}/{workload_norm}/"
if not lp.startswith(workload_prefix):
return False
if reports_norm and lp.startswith(f"{backup_norm}/{reports_norm}/"):
return False
if _is_doc_like(lp) or _is_report_like(lp):
return False
return True
def _config_fingerprint_from_local_tree(
repo_root: str, commitish: str, workload_dir: str, backup_folder: str, reports_subdir: str
) -> str:
backup_norm = _normalize_repo_path(backup_folder).strip("/")
workload_norm = _normalize_repo_path(workload_dir).strip("/")
path_prefix = f"{backup_norm}/{workload_norm}" if backup_norm and workload_norm else ""
if not path_prefix:
return ""
try:
out = _run_git(repo_root, ["ls-tree", "-r", "--full-tree", commitish, "--", path_prefix])
except Exception:
return ""
pairs: list[str] = []
for line in out.splitlines():
if "\t" not in line:
continue
left, rel_path = line.split("\t", 1)
parts = left.split()
if len(parts) < 3 or parts[1] != "blob":
continue
blob_id = parts[2].strip()
if not blob_id:
continue
if not _is_workload_config_path(rel_path, workload_dir, backup_folder, reports_subdir):
continue
pairs.append(f"{_normalize_repo_path(rel_path)}\t{blob_id}")
if not pairs:
return ""
pairs.sort(key=lambda item: item.lower())
joined = "\n".join(pairs).encode("utf-8")
return hashlib.sha256(joined).hexdigest()
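# The fingerprint is a sha256 over sorted "path<TAB>blob-id" pairs for workload
# config files only, so docs/report churn does not change it and two snapshots
# with identical policy content hash identically.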
def _config_fingerprint_from_tree_api(
repo_api: str, headers: dict[str, str], tree_id: str, workload_dir: str, backup_folder: str, reports_subdir: str
) -> str:
if not tree_id:
return ""
url = f"{repo_api}/trees/{tree_id}?recursive=true&api-version=7.1"
payload = _request_json(url, headers=headers)
entries = payload.get("treeEntries", []) if isinstance(payload, dict) else []
pairs: list[str] = []
for entry in entries:
if not isinstance(entry, dict):
continue
if str(entry.get("gitObjectType", "")).lower() != "blob":
continue
rel_path = str(entry.get("relativePath", ""))
if not _is_workload_config_path(rel_path, workload_dir, backup_folder, reports_subdir):
continue
blob_id = str(entry.get("objectId", "")).strip()
if not blob_id:
continue
pairs.append(f"{_normalize_repo_path(rel_path)}\t{blob_id}")
if not pairs:
return ""
pairs.sort(key=lambda item: item.lower())
joined = "\n".join(pairs).encode("utf-8")
return hashlib.sha256(joined).hexdigest()
def _workload_config_diff_exists(
repo_root: str,
baseline_commitish: str,
drift_commitish: str,
workload_dir: str,
backup_folder: str,
reports_subdir: str,
) -> bool:
baseline_fingerprint = _config_fingerprint_from_local_tree(
repo_root=repo_root,
commitish=baseline_commitish,
workload_dir=workload_dir,
backup_folder=backup_folder,
reports_subdir=reports_subdir,
)
drift_fingerprint = _config_fingerprint_from_local_tree(
repo_root=repo_root,
commitish=drift_commitish,
workload_dir=workload_dir,
backup_folder=backup_folder,
reports_subdir=reports_subdir,
)
if baseline_fingerprint and drift_fingerprint:
return baseline_fingerprint != drift_fingerprint
try:
return _tree_id_for_commitish(repo_root, baseline_commitish) != _tree_id_for_commitish(repo_root, drift_commitish)
except Exception:
return True
def _find_matching_abandoned_pr(
repo_api: str,
headers: dict[str, str],
abandoned_prs: list[dict[str, Any]],
drift_tree: str,
repo_root: str,
workload_dir: str,
backup_folder: str,
reports_subdir: str,
drift_commitish: str,
) -> tuple[dict[str, Any] | None, str]:
current_config_fingerprint = _config_fingerprint_from_local_tree(
repo_root=repo_root,
commitish=drift_commitish,
workload_dir=workload_dir,
backup_folder=backup_folder,
reports_subdir=reports_subdir,
)
tree_fingerprint_cache: dict[str, str] = {}
for pr in _latest_pr_by_creation(abandoned_prs):
commit_id = (
((pr.get("lastMergeSourceCommit") or {}).get("commitId"))
or ((pr.get("lastMergeCommit") or {}).get("commitId"))
or ""
).strip()
if not commit_id:
continue
try:
pr_tree = _commit_tree_id(repo_api, headers, commit_id)
except Exception:
continue
if pr_tree and pr_tree == drift_tree:
return pr, "exact-tree"
if current_config_fingerprint and pr_tree:
if pr_tree not in tree_fingerprint_cache:
try:
tree_fingerprint_cache[pr_tree] = _config_fingerprint_from_tree_api(
repo_api=repo_api,
headers=headers,
tree_id=pr_tree,
workload_dir=workload_dir,
backup_folder=backup_folder,
reports_subdir=reports_subdir,
)
except Exception:
tree_fingerprint_cache[pr_tree] = ""
if tree_fingerprint_cache[pr_tree] and tree_fingerprint_cache[pr_tree] == current_config_fingerprint:
return pr, "config-fingerprint"
return None, ""
def _pr_has_reject_vote(pr: dict[str, Any]) -> bool:
reviewers = pr.get("reviewers", [])
if not isinstance(reviewers, list):
return False
for reviewer in reviewers:
if not isinstance(reviewer, dict):
continue
try:
vote = int(reviewer.get("vote", 0))
except Exception:
vote = 0
if vote == -10:
return True
return False
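# Azure DevOps reviewer vote values: 10 approved, 5 approved with suggestions,
# 0 no vote, -5 waiting for author, -10 rejected; only an explicit Reject (-10)
# triggers the remediation handling below.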
def _current_pr_merge_strategy(pr: dict[str, Any]) -> str:
completion_options = pr.get("completionOptions")
if not isinstance(completion_options, dict):
return ""
raw = str(completion_options.get("mergeStrategy") or "").strip()
if not raw:
return ""
return _normalize_merge_strategy(raw)
def _build_description(workload: str, drift_branch: str, baseline_branch: str, build_number: str, build_id: str) -> str:
is_entra = workload.lower() == "entra"
lead = "Rolling Entra drift PR created by backup pipeline." if is_entra else "Rolling drift PR created by backup pipeline."
return (
f"{lead}\n\n"
f"- Source branch: `{drift_branch}`\n"
f"- Target branch: `{baseline_branch}`\n"
f"- Last pipeline run: `{build_number}` (BuildId: {build_id})\n\n"
"The automated review summary is generated immediately after PR creation and inserted "
"above the reviewer actions section.\n\n"
"## Reviewer Quick Actions\n\n"
"### 1) Accept all changes\n"
"- Merge PR to accept drift into baseline.\n\n"
"### 2) Reject whole PR and revert\n"
"- Set reviewer vote to **Reject**.\n"
"- Abandon PR.\n"
"- Auto-remediation queues restore (if `AUTO_REMEDIATE_ON_PR_REJECTION=true`).\n\n"
"### 3) Reject only selected policy changes\n"
"- In each `Change Needed` policy thread, comment `/reject` for changes you do not want.\n"
"- Optional: use `/accept` for changes you want to keep.\n"
"- Wait for review-sync pipeline (about 5 minutes) to update PR diff.\n"
"- Merge remaining accepted changes.\n"
"- Post-merge auto-remediation queues restore to reconcile tenant to merged baseline "
"(if `AUTO_REMEDIATE_AFTER_MERGE=true`)."
)
def _threads_with_marker(repo_api: str, headers: dict[str, str], pr_id: int, marker: str) -> bool:
url = f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1"
payload = _request_json(url, headers=headers)
threads = payload.get("value", []) if isinstance(payload, dict) else []
for thread in threads:
for comment in thread.get("comments", []):
content = str(comment.get("content", ""))
if marker in content:
return True
return False
def _queue_restore_pipeline(
collection_uri: str,
project: str,
headers: dict[str, str],
definition_id: int,
baseline_branch: str,
include_entra_update: bool,
dry_run: bool,
update_assignments: bool,
remove_unmanaged: bool,
max_workers: int,
exclude_csv: str,
) -> dict[str, Any]:
build_api = f"{collection_uri}/{project}/_apis/build/builds?api-version=7.1"
template_parameters = {
"dryRun": dry_run,
"updateAssignments": update_assignments,
"removeObjectsNotInBaseline": remove_unmanaged,
"includeEntraUpdate": include_entra_update,
"baselineBranch": baseline_branch,
"maxWorkers": max_workers,
}
exclude_csv = _normalize_exclude_csv(exclude_csv)
if exclude_csv:
template_parameters["excludeCsv"] = exclude_csv
body = {
"definition": {"id": definition_id},
"sourceBranch": _ref_from_branch(baseline_branch),
"templateParameters": template_parameters,
}
return _request_json(build_api, headers=headers, method="POST", body=body)
def _post_pr_thread(repo_api: str, headers: dict[str, str], pr_id: int, content: str) -> None:
url = f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1"
body = {
"comments": [{"parentCommentId": 0, "content": content, "commentType": 1}],
"status": "active",
}
_request_json(url, headers=headers, method="POST", body=body)
def main() -> int:
parser = argparse.ArgumentParser(description="Ensure rolling PR exists with optional remediation-on-rejection")
parser.add_argument("--repo-root", required=True)
parser.add_argument("--workload", required=True, choices=["intune", "entra"])
parser.add_argument("--drift-branch", required=True)
parser.add_argument("--baseline-branch", required=True)
parser.add_argument("--pr-title", required=True)
args = parser.parse_args()
token = os.environ.get("SYSTEM_ACCESSTOKEN", "").strip()
if not token:
raise SystemExit("SYSTEM_ACCESSTOKEN is empty. Enable OAuth token access for this pipeline.")
collection_uri = os.environ["SYSTEM_COLLECTIONURI"].rstrip("/")
project = os.environ["SYSTEM_TEAMPROJECT"]
repository_id = os.environ["BUILD_REPOSITORY_ID"]
build_number = os.environ.get("BUILD_BUILDNUMBER", "")
build_id = os.environ.get("BUILD_BUILDID", "")
auto_remediate = _env_bool("AUTO_REMEDIATE_ON_PR_REJECTION", False)
include_entra_update = _env_bool("AUTO_REMEDIATE_INCLUDE_ENTRA_UPDATE", False)
remediation_def_id_raw = _env_text("AUTO_REMEDIATE_RESTORE_PIPELINE_ID", "")
remediation_dry_run = _env_bool("AUTO_REMEDIATE_DRY_RUN", False)
remediation_update_assignments = _env_bool("AUTO_REMEDIATE_UPDATE_ASSIGNMENTS", True)
remediation_remove_unmanaged = _env_bool("AUTO_REMEDIATE_REMOVE_OBJECTS", False)
remediation_max_workers_raw = _env_text("AUTO_REMEDIATE_MAX_WORKERS", "10")
remediation_exclude_csv = _normalize_exclude_csv(_env_text("AUTO_REMEDIATE_EXCLUDE_CSV", ""))
pr_merge_strategy = _normalize_merge_strategy(_env_text("ROLLING_PR_MERGE_STRATEGY", "rebase"))
create_as_draft = _env_bool("ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS", False)
try:
remediation_max_workers = int(remediation_max_workers_raw)
except ValueError as exc:
raise SystemExit(f"Invalid AUTO_REMEDIATE_MAX_WORKERS value: {remediation_max_workers_raw}") from exc
if auto_remediate and not remediation_def_id_raw:
print(
"WARNING: AUTO_REMEDIATE_ON_PR_REJECTION=true but AUTO_REMEDIATE_RESTORE_PIPELINE_ID is empty; "
"remediation queueing disabled for this run.",
file=sys.stderr,
)
auto_remediate = False
try:
remediation_def_id = int(remediation_def_id_raw) if remediation_def_id_raw else 0
except ValueError as exc:
raise SystemExit(
f"Invalid AUTO_REMEDIATE_RESTORE_PIPELINE_ID value: {remediation_def_id_raw}"
) from exc
drift_branch = _normalize_branch(args.drift_branch)
baseline_branch = _normalize_branch(args.baseline_branch)
backup_folder = _env_text("BACKUP_FOLDER", "tenant-state")
reports_subdir = _env_text("REPORTS_SUBDIR", "reports")
workload_dir = _env_text(
"INTUNE_BACKUP_SUBDIR" if args.workload == "intune" else "ENTRA_BACKUP_SUBDIR",
args.workload,
)
source_ref = _ref_from_branch(drift_branch)
target_ref = _ref_from_branch(baseline_branch)
repo_api = f"{collection_uri}/{project}/_apis/git/repositories/{repository_id}"
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
"Accept": "application/json",
}
description = _build_description(args.workload, drift_branch, baseline_branch, build_number, build_id)
completion_options = {"mergeStrategy": pr_merge_strategy}
print(f"Rolling PR completion merge strategy: {pr_merge_strategy}")
active_prs = _query_prs(repo_api, headers, source_ref, target_ref, "active")
if active_prs:
pr = active_prs[0]
pr_id = pr.get("pullRequestId")
current_title = str(pr.get("title") or "")
current_description = str(pr.get("description") or "")
current_merge_strategy = _current_pr_merge_strategy(pr)
desired_description = current_description if current_description.strip() else description
needs_patch = (
current_title != args.pr_title
or not current_description.strip()
or current_merge_strategy != pr_merge_strategy
)
if needs_patch:
update_url = f"{repo_api}/pullrequests/{pr_id}?api-version=7.1"
_request_json(
update_url,
headers=headers,
method="PATCH",
body={
"title": args.pr_title,
"description": desired_description,
"completionOptions": completion_options,
},
)
web_url = _pr_web_url(pr)
if needs_patch:
print(f"Updated rolling {args.workload} PR #{pr_id}: {web_url}")
else:
print(f"Rolling {args.workload} PR #{pr_id} already up to date: {web_url}")
print(f"##vso[task.setvariable variable=DRIFT_PR_ID;isOutput=true]{pr_id}")
if web_url:
print(f"##vso[task.setvariable variable=DRIFT_PR_URL;isOutput=true]{web_url}")
print("##vso[task.setvariable variable=DRIFT_PR_SUPPRESSED;isOutput=true]0")
return 0
_run_git(args.repo_root, ["fetch", "--quiet", "origin", baseline_branch, drift_branch])
baseline_commitish = f"origin/{baseline_branch}" if _ref_has_commit(args.repo_root, f"origin/{baseline_branch}") else baseline_branch
drift_commitish = f"origin/{drift_branch}" if _ref_has_commit(args.repo_root, f"origin/{drift_branch}") else "HEAD"
if not _workload_config_diff_exists(
repo_root=args.repo_root,
baseline_commitish=baseline_commitish,
drift_commitish=drift_commitish,
workload_dir=workload_dir,
backup_folder=backup_folder,
reports_subdir=reports_subdir,
):
print(
"Suppressed PR recreation: drift branch has no effective workload configuration diff "
f"against {baseline_branch}."
)
print("##vso[task.setvariable variable=DRIFT_PR_SUPPRESSED;isOutput=true]1")
return 0
drift_tree = _tree_id_for_commitish(args.repo_root, drift_commitish)
abandoned_prs = _query_prs(repo_api, headers, source_ref, target_ref, "abandoned")
matching_abandoned, match_reason = _find_matching_abandoned_pr(
repo_api=repo_api,
headers=headers,
abandoned_prs=abandoned_prs,
drift_tree=drift_tree,
repo_root=args.repo_root,
workload_dir=workload_dir,
backup_folder=backup_folder,
reports_subdir=reports_subdir,
drift_commitish=drift_commitish,
)
if matching_abandoned:
if match_reason == "config-fingerprint":
print(
"Matched abandoned PR using configuration fingerprint "
"(ignoring docs/reports churn)."
)
pr_id = int(matching_abandoned["pullRequestId"])
if not _pr_has_reject_vote(matching_abandoned):
print(
"Matched abandoned PR without reviewer Reject vote; "
"skipping remediation and suppressing PR recreation for this unchanged drift snapshot."
)
print("##vso[task.setvariable variable=DRIFT_PR_SUPPRESSED;isOutput=true]1")
return 0
if not auto_remediate:
print(
"Suppressed PR recreation: latest drift matches a rejected PR, "
"but AUTO_REMEDIATE_ON_PR_REJECTION is disabled."
)
print("##vso[task.setvariable variable=DRIFT_PR_SUPPRESSED;isOutput=true]1")
return 0
marker = f"Automation marker: AUTO-REMEDIATE-TREE:{drift_tree}"
already_queued = _threads_with_marker(repo_api, headers, pr_id, marker)
if already_queued:
print(
"Suppressed PR recreation: latest drift matches a previously rejected PR and remediation was already queued."
)
else:
queued = _queue_restore_pipeline(
collection_uri=collection_uri,
project=project,
headers=headers,
definition_id=remediation_def_id,
baseline_branch=baseline_branch,
include_entra_update=include_entra_update,
dry_run=remediation_dry_run,
update_assignments=remediation_update_assignments,
remove_unmanaged=remediation_remove_unmanaged,
max_workers=remediation_max_workers,
exclude_csv=remediation_exclude_csv,
)
build_queued_id = queued.get("id")
build_url = ((queued.get("_links") or {}).get("web") or {}).get("href", "")
if not build_url and build_queued_id:
build_url = f"{collection_uri}/{project}/_build/results?buildId={build_queued_id}"
comment = (
"Auto-remediation queued because the latest drift matches a rejected PR.\n\n"
f"Workload: {args.workload}\n"
f"Rejected PR: #{pr_id}\n"
f"Drift tree: {drift_tree}\n"
f"Restore pipeline definition: {remediation_def_id}\n"
f"Restore run: {build_url or '(queued)'}\n\n"
f"{marker}"
)
try:
_post_pr_thread(repo_api, headers, pr_id, comment)
except Exception as exc:
print(f"WARNING: Remediation queued, but failed to post PR thread on #{pr_id}: {exc}")
print(
f"Queued remediation pipeline run (definition={remediation_def_id}, buildId={build_queued_id}) and suppressed PR recreation."
)
print("##vso[task.setvariable variable=DRIFT_PR_SUPPRESSED;isOutput=true]1")
return 0
if abandoned_prs:
print(
f"No abandoned PR snapshot match for current drift tree (checked {len(abandoned_prs)} abandoned PR(s)); creating/updating rolling PR."
)
create_url = f"{repo_api}/pullrequests?api-version=7.1"
created = _request_json(
create_url,
headers=headers,
method="POST",
body={
"sourceRefName": source_ref,
"targetRefName": target_ref,
"title": args.pr_title,
"description": description,
"isDraft": create_as_draft,
"completionOptions": completion_options,
},
)
pr_id = created.get("pullRequestId")
web_url = _pr_web_url(created)
print(f"Created rolling {args.workload} PR #{pr_id}: {web_url}")
print(f"##vso[task.setvariable variable=DRIFT_PR_ID;isOutput=true]{pr_id}")
if web_url:
print(f"##vso[task.setvariable variable=DRIFT_PR_URL;isOutput=true]{web_url}")
print("##vso[task.setvariable variable=DRIFT_PR_SUPPRESSED;isOutput=true]0")
return 0
if __name__ == "__main__":
try:
raise SystemExit(main())
except Exception as exc:
print(f"ERROR: Failed to ensure rolling PR: {exc}", file=sys.stderr)
raise

File diff suppressed because it is too large

View File

@@ -0,0 +1,171 @@
#!/usr/bin/env python3
"""Revert Entra JSON file edits when only enrichment metadata changed."""
from __future__ import annotations
import argparse
import json
import subprocess
import sys
from pathlib import Path, PurePosixPath
from typing import Any
ENRICHMENT_KEY_NAMES = {
"ownersresolved",
"approleassignmentsresolved",
"requiredresourceaccessresolved",
"appownerorganizationresolved",
"resolutionstatus",
}
def _to_bool(value: str) -> bool:
return str(value).strip().lower() in {"1", "true", "yes", "y", "on"}
def _run_git(repo_root: Path, args: list[str], check: bool = True) -> subprocess.CompletedProcess[bytes]:
proc = subprocess.run(
["git", *args],
cwd=str(repo_root),
check=False,
capture_output=True,
)
if check and proc.returncode != 0:
stderr = proc.stderr.decode("utf-8", errors="replace").strip()
raise RuntimeError(f"git {' '.join(args)} failed ({proc.returncode}): {stderr}")
return proc
def _strip_enrichment(value: Any) -> Any:
if isinstance(value, dict):
cleaned: dict[str, Any] = {}
for key, child in value.items():
if str(key).strip().lower() in ENRICHMENT_KEY_NAMES:
continue
cleaned[key] = _strip_enrichment(child)
return cleaned
if isinstance(value, list):
return [_strip_enrichment(item) for item in value]
return value
def _is_enrichment_only_change(old_text: str, new_text: str) -> bool:
if not old_text or not new_text:
return False
try:
old_payload = json.loads(old_text)
new_payload = json.loads(new_text)
except Exception:
return False
if not isinstance(old_payload, dict) or not isinstance(new_payload, dict):
return False
old_stripped = _strip_enrichment(old_payload)
new_stripped = _strip_enrichment(new_payload)
if old_stripped != new_stripped:
return False
return old_payload != new_payload
def _modified_paths(repo_root: Path, workload_root: str) -> list[str]:
proc = _run_git(
repo_root,
["diff", "--name-only", "-z", "--diff-filter=M", "--", workload_root],
check=True,
)
raw = proc.stdout.split(b"\x00")
paths: list[str] = []
for chunk in raw:
text = chunk.decode("utf-8", errors="replace").strip()
if text:
paths.append(text)
return paths
def _is_json_path(path: str) -> bool:
return PurePosixPath(path.replace("\\", "/")).suffix.lower() == ".json"
def filter_enrichment_only_files(repo_root: Path, workload_root: str) -> list[str]:
reverted: list[str] = []
for rel_path in _modified_paths(repo_root, workload_root):
if not _is_json_path(rel_path):
continue
head_proc = _run_git(repo_root, ["show", f"HEAD:{rel_path}"], check=False)
if head_proc.returncode != 0:
continue
old_text = head_proc.stdout.decode("utf-8", errors="replace")
abs_path = repo_root / rel_path
if not abs_path.is_file():
continue
new_text = abs_path.read_text(encoding="utf-8")
if _is_enrichment_only_change(old_text, new_text):
_run_git(repo_root, ["checkout", "--quiet", "--", rel_path], check=True)
reverted.append(rel_path)
return reverted
def find_enrichment_only_modified_files(repo_root: Path, workload_root: str) -> list[str]:
matches: list[str] = []
for rel_path in _modified_paths(repo_root, workload_root):
if not _is_json_path(rel_path):
continue
head_proc = _run_git(repo_root, ["show", f"HEAD:{rel_path}"], check=False)
if head_proc.returncode != 0:
continue
old_text = head_proc.stdout.decode("utf-8", errors="replace")
abs_path = repo_root / rel_path
if not abs_path.is_file():
continue
new_text = abs_path.read_text(encoding="utf-8")
if _is_enrichment_only_change(old_text, new_text):
matches.append(rel_path)
return matches
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--repo-root", required=True, help="Repository root path.")
parser.add_argument(
"--workload-root",
default="tenant-state/entra",
help="Path scope inside repo to inspect (default: tenant-state/entra).",
)
parser.add_argument(
"--fail-on-residual-enrichment-drift",
default="true",
help="Exit non-zero when enrichment-only modified files remain after filtering (true/false).",
)
return parser.parse_args()
def main() -> int:
args = parse_args()
repo_root = Path(args.repo_root).resolve()
reverted = filter_enrichment_only_files(repo_root=repo_root, workload_root=args.workload_root)
if reverted:
print(f"Reverted enrichment-only Entra file changes: {len(reverted)}")
for path in reverted:
print(f" - {path}")
else:
print("No enrichment-only Entra file changes detected.")
residual = find_enrichment_only_modified_files(repo_root=repo_root, workload_root=args.workload_root)
if residual:
print(f"Residual enrichment-only Entra file changes still present: {len(residual)}")
for path in residual:
print(f" - {path}")
if _to_bool(args.fail_on_residual_enrichment_drift):
return 2
return 0
if __name__ == "__main__":
sys.exit(main())
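
A quick self-check of the comparison logic above; the payloads are illustrative, and `json` is already imported at module scope:

```python
old = json.dumps({"displayName": "App", "ownersResolved": []})
new = json.dumps({"displayName": "App", "ownersResolved": [{"id": "0-0"}]})

# Only the ownersResolved enrichment key differs, so this counts as noise.
assert _is_enrichment_only_change(old, new)
# Identical payloads are not "changes" at all.
assert not _is_enrichment_only_change(old, old)
# A difference outside the enrichment keys is never treated as noise.
assert not _is_enrichment_only_change(old, json.dumps({"displayName": "App2"}))
```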

View File

@@ -0,0 +1,144 @@
#!/usr/bin/env python3
"""Revert Intune Settings Catalog partial exports where settings payload is missing."""
from __future__ import annotations
import argparse
import json
import subprocess
import sys
from pathlib import Path
from typing import Any
def _to_bool(value: str) -> bool:
return str(value).strip().lower() in {"1", "true", "yes", "y", "on"}
def _run_git_show(repo_root: Path, ref: str, rel_path: str) -> str | None:
proc = subprocess.run(
["git", "show", f"{ref}:{rel_path}"],
cwd=str(repo_root),
check=False,
capture_output=True,
)
if proc.returncode != 0:
return None
return proc.stdout.decode("utf-8", errors="replace")
def _is_settings_catalog_json(file_path: Path, backup_root: Path) -> bool:
if file_path.suffix.lower() != ".json":
return False
rel = file_path.relative_to(backup_root).as_posix().lower()
return rel.startswith("settings catalog/")
def _is_partial_settings_payload(payload: Any) -> bool:
if not isinstance(payload, dict):
return False
setting_count = payload.get("settingCount")
if not isinstance(setting_count, int) or setting_count <= 0:
return False
settings = payload.get("settings")
if not isinstance(settings, list):
return True
return len(settings) == 0
def restore_partial_settings_from_baseline(
repo_root: Path,
backup_root: Path,
baseline_ref: str,
) -> tuple[list[str], list[str]]:
restored: list[str] = []
unresolved: list[str] = []
for file_path in sorted(backup_root.rglob("*.json")):
if not _is_settings_catalog_json(file_path, backup_root):
continue
try:
current_payload = json.loads(file_path.read_text(encoding="utf-8"))
except Exception:
continue
if not _is_partial_settings_payload(current_payload):
continue
rel_path = file_path.relative_to(repo_root).as_posix()
baseline_text = _run_git_show(repo_root, baseline_ref, rel_path)
if not baseline_text:
unresolved.append(rel_path)
continue
try:
baseline_payload = json.loads(baseline_text)
except Exception:
unresolved.append(rel_path)
continue
baseline_settings = baseline_payload.get("settings")
if not isinstance(baseline_settings, list) or len(baseline_settings) == 0:
unresolved.append(rel_path)
continue
current_payload["settings"] = baseline_settings
file_path.write_text(json.dumps(current_payload, indent=5, ensure_ascii=False), encoding="utf-8")
restored.append(rel_path)
return restored, unresolved
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--repo-root", required=True, help="Repository root path.")
parser.add_argument(
"--backup-root",
default="tenant-state/intune",
help="Path to Intune backup root (default: tenant-state/intune).",
)
parser.add_argument(
"--baseline-ref",
default="HEAD",
help="Git ref used as baseline for restoration (default: HEAD).",
)
parser.add_argument(
"--fail-on-unresolved-partial-exports",
default="true",
help="Exit non-zero when partial exports cannot be restored from baseline (true/false).",
)
return parser.parse_args()
def main() -> int:
args = parse_args()
repo_root = Path(args.repo_root).resolve()
backup_root_arg = Path(args.backup_root)
backup_root = backup_root_arg if backup_root_arg.is_absolute() else repo_root / backup_root_arg
backup_root = backup_root.resolve()
restored, unresolved = restore_partial_settings_from_baseline(
repo_root=repo_root,
backup_root=backup_root,
baseline_ref=args.baseline_ref,
)
if restored:
print(f"Restored partial Intune Settings Catalog exports from baseline: {len(restored)}")
for path in restored:
print(f" - {path}")
else:
print("No partial Intune Settings Catalog exports detected.")
if unresolved:
print(f"Unresolved partial Intune Settings Catalog exports: {len(unresolved)}")
for path in unresolved:
print(f" - {path}")
if _to_bool(args.fail_on_unresolved_partial_exports):
return 2
return 0
if __name__ == "__main__":
sys.exit(main())
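
Illustrative checks of the partial-export detector above (payload shapes mirror Settings Catalog exports; values are made up):

```python
# settingCount > 0 but the settings list is empty: a partial export.
assert _is_partial_settings_payload({"settingCount": 12, "settings": []})
# settingCount > 0 and the settings key missing entirely: also partial.
assert _is_partial_settings_payload({"settingCount": 3})
# settingCount of zero, or a populated settings list, is left alone.
assert not _is_partial_settings_payload({"settingCount": 0, "settings": []})
assert not _is_partial_settings_payload({"settingCount": 2, "settings": [{"id": "0"}]})
```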

View File

@@ -0,0 +1,259 @@
#!/usr/bin/env python3
"""Generate a dedicated apps inventory CSV from Entra app exports."""
from __future__ import annotations
import argparse
import csv
import json
from pathlib import Path
from typing import Any
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--root", required=True, help="Path to the Entra workload backup root (tenant-state/entra).")
parser.add_argument(
"--output-dir",
required=True,
help="Directory where apps inventory report files will be written.",
)
parser.add_argument(
"--output-name",
default="apps-inventory.csv",
help="Output CSV filename (default: apps-inventory.csv).",
)
return parser.parse_args()
def safe_text(value: object) -> str:
if value is None:
return ""
return str(value).strip()
def summarize_owners(owners: object) -> tuple[int, str]:
if not isinstance(owners, list):
return 0, ""
labels: list[str] = []
for owner in owners:
if not isinstance(owner, dict):
continue
label = (
safe_text(owner.get("displayName"))
or safe_text(owner.get("userPrincipalName"))
or safe_text(owner.get("appId"))
or safe_text(owner.get("id"))
or "Unknown owner"
)
labels.append(label)
return len(labels), "; ".join(labels)
def summarize_required_resource_access(entries: object) -> tuple[int, str]:
if not isinstance(entries, list):
return 0, ""
summary: list[str] = []
total_permissions = 0
for entry in entries:
if not isinstance(entry, dict):
continue
resource_name = safe_text(entry.get("resourceDisplayName")) or "Unresolved resource"
resource_app_id = safe_text(entry.get("resourceAppId"))
permissions = entry.get("permissions")
permission_labels: list[str] = []
if isinstance(permissions, list):
for permission in permissions:
if not isinstance(permission, dict):
continue
total_permissions += 1
perm_type = safe_text(permission.get("type")) or "UnknownType"
perm_label = (
safe_text(permission.get("value"))
or safe_text(permission.get("displayName"))
or safe_text(permission.get("id"))
or "UnknownPermission"
)
permission_labels.append(f"{perm_label} [{perm_type}]")
resource_label = resource_name
if resource_app_id:
resource_label += f" ({resource_app_id})"
if permission_labels:
summary.append(f"{resource_label}: {', '.join(permission_labels)}")
else:
summary.append(resource_label)
return total_permissions, "; ".join(summary)
def summarize_enterprise_app_role_assignments(entries: object) -> tuple[int, str]:
if not isinstance(entries, list):
return 0, ""
summary: list[str] = []
count = 0
for entry in entries:
if not isinstance(entry, dict):
continue
count += 1
resource_name = safe_text(entry.get("resourceDisplayName")) or "Unresolved resource"
resource_id = safe_text(entry.get("resourceId"))
role_name = (
safe_text(entry.get("appRoleValue"))
or safe_text(entry.get("appRoleDisplayName"))
or safe_text(entry.get("appRoleId"))
or "Default access"
)
label = resource_name
if resource_id:
label += f" ({resource_id})"
summary.append(f"{label}: {role_name}")
return count, "; ".join(summary)
def verified_publisher_label(value: object) -> str:
if not isinstance(value, dict):
return ""
return (
safe_text(value.get("displayName"))
or safe_text(value.get("verifiedPublisherId"))
or safe_text(value.get("addedDateTime"))
)
def iter_exported_json(export_dir: Path) -> list[tuple[Path, dict[str, Any]]]:
if not export_dir.exists():
return []
items: list[tuple[Path, dict[str, Any]]] = []
for path in sorted(export_dir.rglob("*.json")):
try:
payload = json.loads(path.read_text(encoding="utf-8"))
except Exception:
continue
if isinstance(payload, dict):
items.append((path, payload))
return items
def main() -> int:
args = parse_args()
root = Path(args.root).resolve()
output_dir = Path(args.output_dir).resolve()
output_path = output_dir / args.output_name
if not root.exists():
raise SystemExit(f"Backup path does not exist: {root}")
app_reg_dir = root / "App Registrations"
ent_apps_dir = root / "Enterprise Applications"
app_reg_items = iter_exported_json(app_reg_dir)
ent_app_items = iter_exported_json(ent_apps_dir)
rows: list[dict[str, str]] = []
for source_path, payload in app_reg_items:
owner_count, owners = summarize_owners(payload.get("ownersResolved"))
perm_count, permissions = summarize_required_resource_access(
payload.get("requiredResourceAccessResolved")
)
rows.append(
{
"AppType": "AppRegistration",
"DisplayName": safe_text(payload.get("displayName")) or source_path.stem,
"ObjectId": safe_text(payload.get("id")),
"AppId": safe_text(payload.get("appId")),
"SignInAudience": safe_text(payload.get("signInAudience")),
"ServicePrincipalType": "",
"AccountEnabled": "",
"PublisherDomain": safe_text(payload.get("publisherDomain")),
"PublisherName": "",
"VerifiedPublisher": verified_publisher_label(payload.get("verifiedPublisher")),
"CreatedDateTime": safe_text(payload.get("createdDateTime")),
"OwnersCount": str(owner_count),
"OwnersResolved": owners,
"ResolvedPermissionCount": str(perm_count),
"ResolvedPermissions": permissions,
"ResolvedAppRoleAssignmentCount": "0",
"ResolvedAppRoleAssignments": "",
"SourceFile": source_path.relative_to(root).as_posix(),
}
)
for source_path, payload in ent_app_items:
owner_count, owners = summarize_owners(payload.get("ownersResolved"))
assignment_count, assignments = summarize_enterprise_app_role_assignments(
payload.get("appRoleAssignmentsResolved")
)
rows.append(
{
"AppType": "EnterpriseApplication",
"DisplayName": safe_text(payload.get("displayName")) or source_path.stem,
"ObjectId": safe_text(payload.get("id")),
"AppId": safe_text(payload.get("appId")),
"SignInAudience": "",
"ServicePrincipalType": safe_text(payload.get("servicePrincipalType")),
"AccountEnabled": safe_text(payload.get("accountEnabled")),
"PublisherDomain": "",
"PublisherName": safe_text(payload.get("publisherName")),
"VerifiedPublisher": verified_publisher_label(payload.get("verifiedPublisher")),
"CreatedDateTime": "",
"OwnersCount": str(owner_count),
"OwnersResolved": owners,
"ResolvedPermissionCount": "0",
"ResolvedPermissions": "",
"ResolvedAppRoleAssignmentCount": str(assignment_count),
"ResolvedAppRoleAssignments": assignments,
"SourceFile": source_path.relative_to(root).as_posix(),
}
)
rows.sort(
key=lambda row: (
row["AppType"].lower(),
row["DisplayName"].lower(),
row["ObjectId"].lower(),
)
)
output_dir.mkdir(parents=True, exist_ok=True)
fieldnames = [
"AppType",
"DisplayName",
"ObjectId",
"AppId",
"SignInAudience",
"ServicePrincipalType",
"AccountEnabled",
"PublisherDomain",
"PublisherName",
"VerifiedPublisher",
"CreatedDateTime",
"OwnersCount",
"OwnersResolved",
"ResolvedPermissionCount",
"ResolvedPermissions",
"ResolvedAppRoleAssignmentCount",
"ResolvedAppRoleAssignments",
"SourceFile",
]
with output_path.open("w", encoding="utf-8", newline="") as handle:
writer = csv.DictWriter(handle, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(
"Generated apps inventory report: "
+ f"{output_path} "
+ f"(rows={len(rows)}, appRegistrations={len(app_reg_items)}, enterpriseApps={len(ent_app_items)})"
)
return 0
if __name__ == "__main__":
raise SystemExit(main())
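
An illustrative call to the permission summarizer above. The entry shape mirrors the `requiredResourceAccessResolved` enrichment; the resource appId shown is the well-known Microsoft Graph appId:

```python
count, text = summarize_required_resource_access([
    {
        "resourceDisplayName": "Microsoft Graph",
        "resourceAppId": "00000003-0000-0000-c000-000000000000",
        "permissions": [{"type": "Application", "value": "Group.Read.All"}],
    }
])
assert count == 1
assert text == ("Microsoft Graph (00000003-0000-0000-c000-000000000000): "
                "Group.Read.All [Application]")
```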

View File

@@ -0,0 +1,419 @@
#!/usr/bin/env python3
"""Generate a policy assignment inventory report from Intune backup JSON files."""
from __future__ import annotations
import argparse
import csv
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable
GROUP_TARGET_TYPES = {
"#microsoft.graph.groupAssignmentTarget",
"#microsoft.graph.exclusionGroupAssignmentTarget",
}
DEFAULT_POLICY_TYPES = {
"app configuration",
"app protection",
"applications",
"compliance policies",
"conditional access",
"device configurations",
"enrollment configurations",
"enrollment profiles",
"filters",
"scripts",
"settings catalog",
}
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--root", required=True, help="Path to the workload backup root (for example tenant-state/intune).")
parser.add_argument(
"--output-dir",
required=True,
help="Directory where report files will be written.",
)
parser.add_argument(
"--policy-type",
action="append",
default=[],
help=(
"Optional filter for policy type (top-level backup folder name). "
"Repeat the flag or pass a comma-separated list."
),
)
parser.add_argument(
"--graph-type",
action="append",
default=[],
help=(
"Optional filter for Graph @odata.type values. "
"Repeat the flag or pass a comma-separated list."
),
)
return parser.parse_args()
@dataclass
class AssignmentRow:
category: str
policy_type: str
object_name: str
object_type: str
assignment_state: str
assignment_count: int
intent: str
assignment_target: str
target_type: str
assignment_filter: str
filter_type: str
source_file: str
def safe_text(value: object) -> str:
if value is None:
return ""
return str(value).strip()
def normalize_intent(intent: str) -> str:
normalized = safe_text(intent).lower()
if normalized in {"apply", "include"}:
return "Include"
if normalized in {"exclude"}:
return "Exclude"
if not normalized:
return "Include"
return normalized.capitalize()
def infer_intent(assignment: dict, target_type: str) -> str:
target_type_lower = safe_text(target_type).lower()
if "exclusion" in target_type_lower:
return "Exclude"
explicit = safe_text(assignment.get("intent"))
if explicit:
return normalize_intent(explicit)
return "Include"
def resolve_assignment_target(target: dict) -> str:
target_type = safe_text(target.get("@odata.type"))
if target_type == "#microsoft.graph.allDevicesAssignmentTarget":
return "All devices"
if target_type == "#microsoft.graph.allLicensedUsersAssignmentTarget":
return "All users"
if target_type in GROUP_TARGET_TYPES:
return (
safe_text(target.get("groupDisplayName"))
or safe_text(target.get("groupName"))
or safe_text(target.get("groupId"))
or "Unresolved group"
)
return (
safe_text(target.get("groupDisplayName"))
or safe_text(target.get("groupName"))
or safe_text(target.get("displayName"))
or safe_text(target.get("id"))
or "Unknown target"
)
def escape_md_cell(value: str) -> str:
return value.replace("\\", "\\\\").replace("|", "\\|").replace("\n", " ").strip()
def parse_filter_values(raw_values: list[str]) -> set[str]:
values = set()
for raw in raw_values:
for item in safe_text(raw).split(","):
normalized = safe_text(item)
if normalized:
values.add(normalized.lower())
return values
def iter_assignment_rows(
root: Path,
policy_type_filter: set[str],
graph_type_filter: set[str],
) -> Iterable[AssignmentRow]:
excluded_categories = {
"App Registrations",
"Enterprise Applications",
}
for path in sorted(root.rglob("*.json")):
try:
rel_path = path.relative_to(root)
except ValueError:
continue
if rel_path.parts and rel_path.parts[0] in {"reports"}:
continue
if "__archive__" in rel_path.parts:
continue
try:
payload = json.loads(path.read_text(encoding="utf-8"))
except Exception:
continue
if not isinstance(payload, dict):
continue
object_name = safe_text(payload.get("displayName")) or safe_text(payload.get("name"))
if not object_name:
object_name = path.stem.split("__")[0]
object_type = safe_text(payload.get("@odata.type"))
category = "/".join(rel_path.parent.parts)
policy_type = rel_path.parts[0] if rel_path.parts else ""
if any(
category == excluded or category.startswith(f"{excluded}/")
for excluded in excluded_categories
):
continue
if policy_type_filter and policy_type.lower() not in policy_type_filter:
continue
if graph_type_filter and object_type.lower() not in graph_type_filter:
continue
assignments = payload.get("assignments")
if not isinstance(assignments, list):
yield AssignmentRow(
category=category,
policy_type=policy_type,
object_name=object_name,
object_type=object_type,
assignment_state="NotExported",
assignment_count=0,
intent="None",
assignment_target="Not exported in backup",
target_type="",
assignment_filter="",
filter_type="",
source_file=rel_path.as_posix(),
)
continue
if not assignments:
yield AssignmentRow(
category=category,
policy_type=policy_type,
object_name=object_name,
object_type=object_type,
assignment_state="Unassigned",
assignment_count=0,
intent="None",
assignment_target="No assignments",
target_type="",
assignment_filter="",
filter_type="",
source_file=rel_path.as_posix(),
)
continue
assignment_count = len([item for item in assignments if isinstance(item, dict)])
if assignment_count == 0:
yield AssignmentRow(
category=category,
policy_type=policy_type,
object_name=object_name,
object_type=object_type,
assignment_state="Unassigned",
assignment_count=0,
intent="None",
assignment_target="No assignments",
target_type="",
assignment_filter="",
filter_type="",
source_file=rel_path.as_posix(),
)
continue
for assignment in assignments:
if not isinstance(assignment, dict):
continue
target = assignment.get("target") if isinstance(assignment.get("target"), dict) else {}
target_type = safe_text(target.get("@odata.type"))
intent = infer_intent(assignment, target_type)
assignment_target = resolve_assignment_target(target)
assignment_filter = safe_text(target.get("deviceAndAppManagementAssignmentFilterId"))
filter_type = safe_text(target.get("deviceAndAppManagementAssignmentFilterType"))
yield AssignmentRow(
category=category,
policy_type=policy_type,
object_name=object_name,
object_type=object_type,
assignment_state="Assigned",
assignment_count=assignment_count,
intent=intent,
assignment_target=assignment_target,
target_type=target_type,
assignment_filter=assignment_filter,
filter_type=filter_type,
source_file=rel_path.as_posix(),
)
def write_csv(rows: list[AssignmentRow], output_path: Path) -> None:
output_path.parent.mkdir(parents=True, exist_ok=True)
with output_path.open("w", encoding="utf-8", newline="") as handle:
writer = csv.writer(handle)
writer.writerow(
[
"Category",
"PolicyType",
"ObjectName",
"ObjectType",
"AssignmentState",
"AssignmentCount",
"Intent",
"AssignmentTarget",
"TargetType",
"AssignmentFilter",
"FilterType",
"SourceFile",
]
)
for row in rows:
writer.writerow(
[
row.category,
row.policy_type,
row.object_name,
row.object_type,
row.assignment_state,
row.assignment_count,
row.intent,
row.assignment_target,
row.target_type,
row.assignment_filter,
row.filter_type,
row.source_file,
]
)
def write_markdown(rows: list[AssignmentRow], output_path: Path) -> None:
output_path.parent.mkdir(parents=True, exist_ok=True)
generated = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
objects = {(row.category, row.object_name, row.source_file) for row in rows}
assigned_objects = {
(row.category, row.object_name, row.source_file)
for row in rows
if row.assignment_state == "Assigned"
}
unassigned_objects = {
(row.category, row.object_name, row.source_file)
for row in rows
if row.assignment_state == "Unassigned"
}
not_exported_objects = {
(row.category, row.object_name, row.source_file)
for row in rows
if row.assignment_state == "NotExported"
}
policy_type_counts = {}
for row in rows:
key = row.policy_type or "Unknown"
policy_type_counts[key] = policy_type_counts.get(key, 0) + 1
with output_path.open("w", encoding="utf-8") as handle:
handle.write("# Policy Assignment Inventory Report\n\n")
handle.write(f"Generated: `{generated}`\n\n")
handle.write(f"- Total objects in report: **{len(objects)}**\n")
handle.write(f"- Objects with assignments: **{len(assigned_objects)}**\n")
handle.write(f"- Objects without assignments: **{len(unassigned_objects)}**\n")
handle.write(f"- Objects with assignment field not exported: **{len(not_exported_objects)}**\n")
handle.write(f"- Total rows: **{len(rows)}**\n\n")
handle.write("## Rows by policy type\n\n")
handle.write("| Policy Type | Rows |\n")
handle.write("|---|---|\n")
for policy_type, count in sorted(policy_type_counts.items(), key=lambda item: item[0].lower()):
handle.write(f"| {escape_md_cell(policy_type)} | {count} |\n")
handle.write("\n")
handle.write(
"| Policy Type | Category | Object | Object Type | Assignment State | Assignment Count | Intent | Assignment Target | Target Type | Filter | Filter Type | Source |\n"
)
handle.write("|---|---|---|---|---|---|---|---|---|---|---|---|\n")
for row in rows:
handle.write(
"| "
+ " | ".join(
[
escape_md_cell(row.policy_type),
escape_md_cell(row.category),
escape_md_cell(row.object_name),
escape_md_cell(row.object_type),
escape_md_cell(row.assignment_state),
escape_md_cell(str(row.assignment_count)),
escape_md_cell(row.intent),
escape_md_cell(row.assignment_target),
escape_md_cell(row.target_type),
escape_md_cell(row.assignment_filter),
escape_md_cell(row.filter_type),
escape_md_cell(row.source_file),
]
)
+ " |\n"
)
def main() -> int:
args = parse_args()
root = Path(args.root).resolve()
output_dir = Path(args.output_dir).resolve()
policy_type_filter = parse_filter_values(args.policy_type)
graph_type_filter = parse_filter_values(args.graph_type)
using_default_policy_scope = False
if not policy_type_filter:
policy_type_filter = set(DEFAULT_POLICY_TYPES)
using_default_policy_scope = True
if not root.exists():
raise SystemExit(f"Backup path does not exist: {root}")
rows = sorted(
iter_assignment_rows(root, policy_type_filter, graph_type_filter),
key=lambda x: (
x.policy_type.lower(),
x.category.lower(),
x.object_name.lower(),
x.assignment_state,
x.intent.lower(),
x.assignment_target.lower(),
),
)
markdown_path = output_dir / "policy-assignments.md"
csv_path = output_dir / "policy-assignments.csv"
write_markdown(rows, markdown_path)
write_csv(rows, csv_path)
print(
f"Generated assignment report with {len(rows)} rows: "
f"{markdown_path} and {csv_path}"
)
if using_default_policy_scope:
print(
"Applied default policy scope: "
+ ", ".join(sorted(DEFAULT_POLICY_TYPES))
)
elif policy_type_filter:
print(f"Applied policy type filter: {', '.join(sorted(policy_type_filter))}")
if graph_type_filter:
print(f"Applied graph type filter: {', '.join(sorted(graph_type_filter))}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
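
Illustrative checks of target resolution and intent inference above (the group GUID is made up):

```python
excluded = {
    "@odata.type": "#microsoft.graph.exclusionGroupAssignmentTarget",
    "groupId": "11111111-1111-1111-1111-111111111111",  # illustrative GUID
}
# With no resolved display name available, the raw groupId is surfaced.
assert resolve_assignment_target(excluded) == "11111111-1111-1111-1111-111111111111"
# Exclusion targets always infer Exclude, regardless of explicit intent.
assert infer_intent({}, excluded["@odata.type"]) == "Exclude"
assert resolve_assignment_target(
    {"@odata.type": "#microsoft.graph.allDevicesAssignmentTarget"}
) == "All devices"
```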

View File

@@ -0,0 +1,231 @@
#!/usr/bin/env python3
"""Generate broad object inventory CSV reports from backup JSON files."""
from __future__ import annotations
import argparse
import csv
import json
import re
from pathlib import Path
GROUP_TARGET_TYPES = {
"#microsoft.graph.groupAssignmentTarget",
"#microsoft.graph.exclusionGroupAssignmentTarget",
}
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--root", required=True, help="Path to the workload backup root (for example tenant-state/intune).")
parser.add_argument(
"--output-dir",
required=True,
help="Directory where report files will be written.",
)
parser.add_argument(
"--per-type-dir",
default="Object Inventory",
help="Directory name under output-dir for per-policy-type CSVs.",
)
return parser.parse_args()
def safe_text(value: object) -> str:
if value is None:
return ""
return str(value).strip()
def slugify(value: str) -> str:
text = safe_text(value).lower()
text = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
return text or "unknown"
def infer_intent(assignment: dict, target_type: str) -> str:
if "exclusion" in target_type.lower():
return "Exclude"
explicit = safe_text(assignment.get("intent")).lower()
if explicit in {"exclude"}:
return "Exclude"
return "Include"
def resolve_assignment_target(target: dict) -> str:
target_type = safe_text(target.get("@odata.type"))
if target_type == "#microsoft.graph.allDevicesAssignmentTarget":
return "All devices"
if target_type == "#microsoft.graph.allLicensedUsersAssignmentTarget":
return "All users"
if target_type in GROUP_TARGET_TYPES:
return (
safe_text(target.get("groupDisplayName"))
or safe_text(target.get("groupName"))
or safe_text(target.get("groupId"))
or "Unresolved group"
)
return (
safe_text(target.get("groupDisplayName"))
or safe_text(target.get("groupName"))
or safe_text(target.get("displayName"))
or safe_text(target.get("id"))
or "Unknown target"
)
def summarize_assignments(payload: dict) -> dict[str, object]:
assignments = payload.get("assignments")
if not isinstance(assignments, list):
return {
"state": "NotExported",
"total": 0,
"include_targets": "",
"exclude_targets": "",
"all_users_assigned": "false",
"all_devices_assigned": "false",
}
include_targets: list[str] = []
exclude_targets: list[str] = []
all_users = False
all_devices = False
valid = [item for item in assignments if isinstance(item, dict)]
for assignment in valid:
target = assignment.get("target") if isinstance(assignment.get("target"), dict) else {}
target_type = safe_text(target.get("@odata.type"))
target_name = resolve_assignment_target(target)
intent = infer_intent(assignment, target_type)
if target_type == "#microsoft.graph.allLicensedUsersAssignmentTarget":
all_users = True
if target_type == "#microsoft.graph.allDevicesAssignmentTarget":
all_devices = True
if intent == "Exclude":
exclude_targets.append(target_name)
else:
include_targets.append(target_name)
state = "Assigned" if valid else "Unassigned"
if assignments == []:
state = "Unassigned"
return {
"state": state,
"total": len(valid),
"include_targets": "; ".join(sorted(set(include_targets))),
"exclude_targets": "; ".join(sorted(set(exclude_targets))),
"all_users_assigned": str(all_users).lower(),
"all_devices_assigned": str(all_devices).lower(),
}
def iter_rows(root: Path) -> list[dict[str, str]]:
rows: list[dict[str, str]] = []
for path in sorted(root.rglob("*.json")):
rel = path.relative_to(root)
if rel.parts and rel.parts[0] in {"reports"}:
continue
if "__archive__" in rel.parts:
continue
try:
payload = json.loads(path.read_text(encoding="utf-8"))
except Exception:
continue
if not isinstance(payload, dict):
continue
summary = summarize_assignments(payload)
policy_type = rel.parts[0] if rel.parts else ""
category = "/".join(rel.parent.parts)
object_name = safe_text(payload.get("displayName")) or safe_text(payload.get("name"))
if not object_name:
object_name = path.stem.split("__")[0]
rows.append(
{
"PolicyType": policy_type,
"Category": category,
"ObjectName": object_name,
"ObjectType": safe_text(payload.get("@odata.type")),
"ObjectId": safe_text(payload.get("id")),
"AppId": safe_text(payload.get("appId")),
"Description": safe_text(payload.get("description")),
"AssignmentState": safe_text(summary["state"]),
"AssignmentCount": str(summary["total"]),
"IncludeTargets": safe_text(summary["include_targets"]),
"ExcludeTargets": safe_text(summary["exclude_targets"]),
"AllUsersAssigned": safe_text(summary["all_users_assigned"]),
"AllDevicesAssigned": safe_text(summary["all_devices_assigned"]),
"SourceFile": rel.as_posix(),
}
)
rows.sort(
key=lambda row: (
row["PolicyType"].lower(),
row["Category"].lower(),
row["ObjectName"].lower(),
row["SourceFile"].lower(),
)
)
return rows
def write_csv(path: Path, rows: list[dict[str, str]]) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
headers = [
"PolicyType",
"Category",
"ObjectName",
"ObjectType",
"ObjectId",
"AppId",
"Description",
"AssignmentState",
"AssignmentCount",
"IncludeTargets",
"ExcludeTargets",
"AllUsersAssigned",
"AllDevicesAssigned",
"SourceFile",
]
with path.open("w", encoding="utf-8", newline="") as handle:
writer = csv.DictWriter(handle, fieldnames=headers)
writer.writeheader()
writer.writerows(rows)
def main() -> int:
args = parse_args()
root = Path(args.root).resolve()
output_dir = Path(args.output_dir).resolve()
per_type_root = output_dir / args.per_type_dir
if not root.exists():
raise SystemExit(f"Backup path does not exist: {root}")
rows = iter_rows(root)
all_report = output_dir / "object-inventory-all.csv"
write_csv(all_report, rows)
per_type_counts: dict[str, int] = {}
for policy_type in sorted({row["PolicyType"] for row in rows}):
type_rows = [row for row in rows if row["PolicyType"] == policy_type]
per_type_report = per_type_root / f"{slugify(policy_type)}-inventory.csv"
write_csv(per_type_report, type_rows)
per_type_counts[policy_type] = len(type_rows)
print(
f"Generated object inventory reports: all={all_report}, "
f"perTypeCount={len(per_type_counts)}, rows={len(rows)}"
)
for policy_type, count in sorted(per_type_counts.items(), key=lambda item: item[0].lower()):
print(f" - {policy_type}: {count} rows")
return 0
if __name__ == "__main__":
raise SystemExit(main())
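
An illustrative call to the assignment summarizer above (target shapes mirror Graph assignment targets; the group name is made up):

```python
summary = summarize_assignments({
    "assignments": [
        {"target": {"@odata.type": "#microsoft.graph.allLicensedUsersAssignmentTarget"}},
        {"target": {"@odata.type": "#microsoft.graph.exclusionGroupAssignmentTarget",
                    "groupDisplayName": "Pilot - Excluded"}},
    ]
})
assert summary["state"] == "Assigned"
assert summary["all_users_assigned"] == "true"
assert summary["include_targets"] == "All users"
assert summary["exclude_targets"] == "Pilot - Excluded"
```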

View File

@@ -0,0 +1,447 @@
#!/usr/bin/env python3
"""Queue restore automatically after merged rolling PR that contains /reject decisions."""
from __future__ import annotations
import argparse
import base64
import datetime as dt
import json
import os
import re
import sys
import urllib.parse
from pathlib import Path
from typing import Any
# common.py lives in the same directory; ensure it can be imported when the
# script is executed directly.
_sys_path_inserted = False
if __file__:
_script_dir = str(Path(__file__).resolve().parent)
if _script_dir not in sys.path:
sys.path.insert(0, _script_dir)
_sys_path_inserted = True
import common
if _sys_path_inserted:
sys.path.pop(0)
_env_text = common.env_text
_env_bool = common.env_bool
_request_json = common.request_json
REJECT_CMD_RE = re.compile(r"(?im)^\s*(?:/|#)?reject\b")
DECISION_RE = re.compile(r"(?im)^\s*(?:/|#)?(?P<decision>reject|accept)\b")
AUTO_TICKET_THREAD_PREFIX = "AUTO-CHANGE-TICKET:"
MERGE_MARKER_PREFIX = "AUTO-RESTORE-AFTER-MERGE:"
def _normalize_branch(branch: str) -> str:
b = branch.strip()
if b.startswith("refs/heads/"):
return b[len("refs/heads/") :]
return b
def _ref_from_branch(branch: str) -> str:
return f"refs/heads/{_normalize_branch(branch)}"
def _parse_iso_utc(value: str) -> dt.datetime | None:
text = (value or "").strip()
if not text:
return None
if text.endswith("Z"):
text = text[:-1] + "+00:00"
try:
parsed = dt.datetime.fromisoformat(text)
except ValueError:
return None
if parsed.tzinfo is None:
parsed = parsed.replace(tzinfo=dt.timezone.utc)
return parsed.astimezone(dt.timezone.utc)
def _query_completed_prs(
repo_api: str,
headers: dict[str, str],
source_ref: str,
target_ref: str,
) -> list[dict[str, Any]]:
query = urllib.parse.urlencode(
{
"searchCriteria.status": "completed",
"searchCriteria.sourceRefName": source_ref,
"searchCriteria.targetRefName": target_ref,
"api-version": "7.1",
},
quote_via=urllib.parse.quote,
safe="/",
)
payload = _request_json(f"{repo_api}/pullrequests?{query}", headers=headers)
items = payload.get("value", []) if isinstance(payload, dict) else []
return sorted(items, key=lambda x: x.get("closedDate", ""), reverse=True)
def _threads(repo_api: str, headers: dict[str, str], pr_id: int) -> list[dict[str, Any]]:
payload = _request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
headers=headers,
)
return payload.get("value", []) if isinstance(payload, dict) else []
def _thread_comment_contents(threads: list[dict[str, Any]]) -> list[str]:
out: list[str] = []
for thread in threads:
comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
for comment in comments:
out.append(str(comment.get("content", "") or ""))
return out
def _ticket_path_from_content(content: str) -> str | None:
marker_re = re.compile(
r"(?:^|\n)\s*(?:Automation marker:\s*)?"
+ re.escape(AUTO_TICKET_THREAD_PREFIX)
+ r"(?P<id>[A-Za-z0-9_-]+)\s*(?:$|\n)"
)
match = marker_re.search(content or "")
if not match:
return None
encoded = match.group("id")
padding = "=" * ((4 - len(encoded) % 4) % 4)
try:
return base64.urlsafe_b64decode((encoded + padding).encode("ascii")).decode("utf-8")
except Exception:
return None
def _latest_thread_decision(comments: list[dict[str, Any]]) -> str | None:
decision: str | None = None
def _comment_sort_key(comment: dict[str, Any]) -> tuple[int, int]:
try:
comment_id = int(comment.get("id", 0))
except Exception:
comment_id = 0
try:
parent_id = int(comment.get("parentCommentId", 0))
except Exception:
parent_id = 0
return (comment_id, parent_id)
for comment in sorted(comments, key=_comment_sort_key):
content = str(comment.get("content", "") or "")
match = DECISION_RE.search(content)
if match:
decision = match.group("decision").lower()
return decision
def _rejected_ticket_paths(threads: list[dict[str, Any]]) -> list[str]:
rejected: set[str] = set()
for thread in threads:
comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
marker_path: str | None = None
for comment in comments:
marker_path = _ticket_path_from_content(str(comment.get("content", "") or ""))
if marker_path:
break
if not marker_path:
continue
decision = _latest_thread_decision(comments)
if decision == "reject":
rejected.add(marker_path)
return sorted(rejected)
def _has_reject_signal(comments: list[str]) -> bool:
for content in comments:
if REJECT_CMD_RE.search(content):
return True
if "Auto-action: /reject detected." in content:
return True
return False
def _has_merge_marker(comments: list[str], merge_commit: str) -> bool:
marker = f"Automation marker: {MERGE_MARKER_PREFIX}{merge_commit}"
return any(marker in content for content in comments)
def _is_permission_error(exc: Exception) -> bool:
msg = str(exc).lower()
return "http 403" in msg or "forbidden" in msg
def _normalize_exclude_csv(value: str) -> str:
normalized = str(value or "").strip()
if normalized.lower() in {"", "none", "null", "n/a", "-", "_none_"}:
return ""
return normalized
def _diagnose_queue_permission(
collection_uri: str,
project: str,
headers: dict[str, str],
definition_id: int,
) -> None:
definition_url = (
f"{collection_uri}/{project}/_apis/build/definitions/{definition_id}"
"?api-version=7.1"
)
try:
payload = _request_json(definition_url, headers=headers)
definition_name = str(payload.get("name", "") or "").strip()
print(
"Diagnostic: restore pipeline definition is readable "
f"(id={definition_id}, name='{definition_name or 'n/a'}')."
)
print(
"Diagnostic: queue call was forbidden, so missing permission is likely "
"'Queue builds' on that restore pipeline (or pipeline is not authorized to use it)."
)
except Exception as diag_exc:
print(
"Diagnostic: unable to read restore pipeline definition "
f"id={definition_id}. Details: {diag_exc}"
)
print(
"Diagnostic: likely wrong definition ID, wrong project, or missing 'View builds' permission "
"for the calling pipeline identity."
)
def _queue_restore_pipeline(
collection_uri: str,
project: str,
headers: dict[str, str],
definition_id: int,
baseline_branch: str,
include_entra_update: bool,
dry_run: bool,
update_assignments: bool,
remove_unmanaged: bool,
max_workers: int,
exclude_csv: str,
restore_mode: str = "full",
restore_paths_csv: str = "",
) -> dict[str, Any]:
build_api = f"{collection_uri}/{project}/_apis/build/builds?api-version=7.1"
template_parameters = {
"dryRun": dry_run,
"updateAssignments": update_assignments,
"removeObjectsNotInBaseline": remove_unmanaged,
"includeEntraUpdate": include_entra_update,
"baselineBranch": baseline_branch,
"maxWorkers": max_workers,
"restoreMode": restore_mode,
}
if restore_mode == "selective" and restore_paths_csv.strip():
template_parameters["restorePathsCsv"] = restore_paths_csv.strip()
exclude_csv = _normalize_exclude_csv(exclude_csv)
if exclude_csv:
template_parameters["excludeCsv"] = exclude_csv
body = {
"definition": {"id": definition_id},
"sourceBranch": _ref_from_branch(baseline_branch),
"templateParameters": template_parameters,
}
return _request_json(build_api, headers=headers, method="POST", body=body)
def _post_pr_thread(repo_api: str, headers: dict[str, str], pr_id: int, content: str) -> None:
_request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
headers=headers,
method="POST",
body={
"comments": [
{
"parentCommentId": 0,
"content": content,
"commentType": 1,
}
],
"status": 1,
},
)
def main() -> int:
parser = argparse.ArgumentParser(description="Queue restore after merged rolling PR with /reject decisions")
parser.add_argument("--workload", required=True, choices=["intune", "entra"])
parser.add_argument("--drift-branch", required=True)
parser.add_argument("--baseline-branch", required=True)
args = parser.parse_args()
if not _env_bool("AUTO_REMEDIATE_AFTER_MERGE", False):
print("Post-merge auto-remediation disabled (set AUTO_REMEDIATE_AFTER_MERGE=true).")
return 0
token = os.environ.get("SYSTEM_ACCESSTOKEN", "").strip()
if not token:
raise SystemExit("SYSTEM_ACCESSTOKEN is empty.")
definition_raw = _env_text("AUTO_REMEDIATE_RESTORE_PIPELINE_ID", "")
if not definition_raw:
print(
"Post-merge auto-remediation queue skipped: "
"AUTO_REMEDIATE_RESTORE_PIPELINE_ID is empty."
)
return 0
try:
definition_id = int(definition_raw)
except ValueError as exc:
raise SystemExit(f"Invalid AUTO_REMEDIATE_RESTORE_PIPELINE_ID: {definition_raw}") from exc
max_workers_raw = _env_text("AUTO_REMEDIATE_MAX_WORKERS", "10")
try:
max_workers = int(max_workers_raw)
except ValueError as exc:
raise SystemExit(f"Invalid AUTO_REMEDIATE_MAX_WORKERS: {max_workers_raw}") from exc
lookback_hours_raw = _env_text("AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS", "168")
try:
lookback_hours = int(lookback_hours_raw)
except ValueError as exc:
raise SystemExit(f"Invalid AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS: {lookback_hours_raw}") from exc
collection_uri = os.environ["SYSTEM_COLLECTIONURI"].rstrip("/")
project = os.environ["SYSTEM_TEAMPROJECT"]
repository_id = os.environ["BUILD_REPOSITORY_ID"]
include_entra_update = _env_bool("AUTO_REMEDIATE_INCLUDE_ENTRA_UPDATE", False)
dry_run = _env_bool("AUTO_REMEDIATE_DRY_RUN", False)
update_assignments = _env_bool("AUTO_REMEDIATE_UPDATE_ASSIGNMENTS", True)
remove_unmanaged = _env_bool("AUTO_REMEDIATE_REMOVE_OBJECTS", False)
exclude_csv = _normalize_exclude_csv(_env_text("AUTO_REMEDIATE_EXCLUDE_CSV", ""))
source_ref = _ref_from_branch(args.drift_branch)
target_ref = _ref_from_branch(args.baseline_branch)
repo_api = f"{collection_uri}/{project}/_apis/git/repositories/{repository_id}"
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
"Accept": "application/json",
}
cutoff = dt.datetime.now(dt.timezone.utc) - dt.timedelta(hours=lookback_hours)
completed = _query_completed_prs(repo_api, headers, source_ref, target_ref)
candidate: dict[str, Any] | None = None
candidate_threads: list[dict[str, Any]] = []
candidate_comments: list[str] = []
for pr in completed:
closed_at = _parse_iso_utc(str(pr.get("closedDate", "") or ""))
if closed_at and closed_at < cutoff:
continue
merge_commit = (((pr.get("lastMergeCommit") or {}).get("commitId")) or "").strip()
if not merge_commit:
continue
pr_id = int(pr.get("pullRequestId"))
threads = _threads(repo_api, headers, pr_id)
comments = _thread_comment_contents(threads)
if not _has_reject_signal(comments):
continue
if _has_merge_marker(comments, merge_commit):
continue
candidate = pr
candidate_threads = threads
candidate_comments = comments
break
if not candidate:
print("No merged rolling PR requiring post-merge remediation was found.")
return 0
pr_id = int(candidate.get("pullRequestId"))
merge_commit = (((candidate.get("lastMergeCommit") or {}).get("commitId")) or "").strip()
rejected_paths = _rejected_ticket_paths(candidate_threads)
restore_mode = "full"
restore_paths_csv = ""
if args.workload == "intune" and rejected_paths:
restore_mode = "selective"
restore_paths_csv = ",".join(rejected_paths)
print(f"Post-merge remediation scope: selective ({len(rejected_paths)} rejected path(s)).")
for path in rejected_paths:
print(f" - {path}")
else:
print("Post-merge remediation scope: full.")
try:
queued = _queue_restore_pipeline(
collection_uri=collection_uri,
project=project,
headers=headers,
definition_id=definition_id,
baseline_branch=args.baseline_branch,
include_entra_update=include_entra_update,
dry_run=dry_run,
update_assignments=update_assignments,
remove_unmanaged=remove_unmanaged,
max_workers=max_workers,
exclude_csv=exclude_csv,
restore_mode=restore_mode,
restore_paths_csv=restore_paths_csv,
)
except Exception as exc:
if _is_permission_error(exc):
print(
"WARNING: Post-merge remediation queue skipped due permissions. "
f"Definition={definition_id}. Details: {exc}"
)
_diagnose_queue_permission(collection_uri, project, headers, definition_id)
print(
"Grant 'Queue builds' permission for this pipeline identity on the restore pipeline "
"and ensure the pipeline has access to run it."
)
return 0
raise
build_id = queued.get("id")
build_url = ((queued.get("_links") or {}).get("web") or {}).get("href", "")
if not build_url and build_id:
build_url = f"{collection_uri}/{project}/_build/results?buildId={build_id}"
marker = f"Automation marker: {MERGE_MARKER_PREFIX}{merge_commit}"
comment = (
"Auto-remediation queued after merged rolling PR with reviewer /reject decision(s).\n\n"
f"Workload: {args.workload}\n"
f"Merged PR: #{pr_id}\n"
f"Merge commit: {merge_commit}\n"
f"Restore pipeline definition: {definition_id}\n"
f"Restore run: {build_url or '(queued)'}\n\n"
f"{marker}"
)
try:
_post_pr_thread(repo_api, headers, pr_id, comment)
except Exception as exc:
print(f"WARNING: Restore queued, but failed posting merge marker comment on PR #{pr_id}: {exc}")
print(
f"Queued post-merge remediation for PR #{pr_id} (merge_commit={merge_commit}, buildId={build_id})."
)
return 0
if __name__ == "__main__":
try:
raise SystemExit(main())
except Exception as exc:
print(f"WARNING: Failed post-merge remediation check: {exc}", file=sys.stderr)
raise
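
Two illustrative checks of the parsing above: the newest comment in a thread wins the decision, and ticket markers round-trip through URL-safe base64 (the sample path is made up):

```python
# Latest decision in a thread wins (comments are ordered by comment id).
assert _latest_thread_decision([
    {"id": 1, "content": "/reject"},
    {"id": 2, "content": "/accept after re-review"},
]) == "accept"

# Ticket-path markers round-trip through URL-safe base64 without padding.
path = "tenant-state/intune/Settings Catalog/policy.json"  # illustrative path
encoded = base64.urlsafe_b64encode(path.encode("utf-8")).decode("ascii").rstrip("=")
assert _ticket_path_from_content(
    f"Automation marker: {AUTO_TICKET_THREAD_PREFIX}{encoded}"
) == path
```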

View File

@@ -0,0 +1,273 @@
#!/usr/bin/env python3
"""Resolve Conditional Access GUID references to display names in backup JSON."""
from __future__ import annotations
import argparse
import json
import pathlib
import urllib.error
import urllib.parse
import urllib.request
SPECIAL_APP_IDS = {
"All": "All applications",
"None": "None",
"Office365": "Office 365",
"MicrosoftAdminPortals": "Microsoft Admin Portals",
}
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--root", required=True, help="Path to workload backup root (for Entra: tenant-state/entra).")
parser.add_argument("--token", required=True, help="Microsoft Graph access token.")
return parser.parse_args()
class GraphResolver:
def __init__(self, token: str):
self.token = token.strip()
self.group_cache: dict[str, str | None] = {}
self.role_cache: dict[str, str | None] = {}
self.app_cache: dict[str, str | None] = {}
self.location_cache: dict[str, str | None] = {}
self.auth_strength_cache: dict[str, str | None] = {}
self._warned: set[str] = set()
def _warn_once(self, key: str, message: str) -> None:
if key in self._warned:
return
self._warned.add(key)
print(f"Warning: {message}")
def _get(self, url: str) -> dict | None:
req = urllib.request.Request(
url,
headers={
"Authorization": f"Bearer {self.token}",
"Accept": "application/json",
},
method="GET",
)
try:
with urllib.request.urlopen(req, timeout=30) as resp:
return json.loads(resp.read().decode("utf-8"))
except urllib.error.HTTPError as exc:
if exc.code == 404:
return None
self._warn_once(url, f"Graph lookup failed for {url} (HTTP {exc.code})")
return None
except Exception as exc: # noqa: BLE001
self._warn_once(url, f"Graph lookup failed for {url} ({exc})")
return None
def group_name(self, group_id: str) -> str | None:
if group_id in self.group_cache:
return self.group_cache[group_id]
url = (
"https://graph.microsoft.com/v1.0/groups/"
+ urllib.parse.quote(group_id)
+ "?$select=id,displayName"
)
payload = self._get(url)
name = payload.get("displayName") if isinstance(payload, dict) else None
self.group_cache[group_id] = name
return name
def role_name(self, role_template_id: str) -> str | None:
if role_template_id in self.role_cache:
return self.role_cache[role_template_id]
url = (
"https://graph.microsoft.com/v1.0/directoryRoleTemplates/"
+ urllib.parse.quote(role_template_id)
+ "?$select=id,displayName"
)
payload = self._get(url)
name = payload.get("displayName") if isinstance(payload, dict) else None
self.role_cache[role_template_id] = name
return name
def app_name(self, app_or_object_id: str) -> str | None:
if app_or_object_id in SPECIAL_APP_IDS:
return SPECIAL_APP_IDS[app_or_object_id]
if app_or_object_id in self.app_cache:
return self.app_cache[app_or_object_id]
# CA app conditions usually use appId; try appId lookup first.
url = (
"https://graph.microsoft.com/v1.0/servicePrincipals"
+ "?$select=id,appId,displayName"
+ "&$top=1"
+ "&$filter=appId eq '"
+ urllib.parse.quote(app_or_object_id)
+ "'"
)
payload = self._get(url)
name = None
if isinstance(payload, dict):
value = payload.get("value")
if isinstance(value, list) and value:
first = value[0]
if isinstance(first, dict):
name = first.get("displayName")
if not name:
# Fallback: treat value as service principal object id.
by_id_url = (
"https://graph.microsoft.com/v1.0/servicePrincipals/"
+ urllib.parse.quote(app_or_object_id)
+ "?$select=id,appId,displayName"
)
by_id = self._get(by_id_url)
if isinstance(by_id, dict):
name = by_id.get("displayName")
self.app_cache[app_or_object_id] = name
return name
def location_name(self, location_id: str) -> str | None:
if location_id in self.location_cache:
return self.location_cache[location_id]
if location_id in {"All", "AllTrusted"}:
name = "All locations" if location_id == "All" else "All trusted locations"
self.location_cache[location_id] = name
return name
url = (
"https://graph.microsoft.com/v1.0/identity/conditionalAccess/namedLocations/"
+ urllib.parse.quote(location_id)
+ "?$select=id,displayName"
)
payload = self._get(url)
name = payload.get("displayName") if isinstance(payload, dict) else None
self.location_cache[location_id] = name
return name
def auth_strength_name(self, auth_strength_id: str) -> str | None:
if auth_strength_id in self.auth_strength_cache:
return self.auth_strength_cache[auth_strength_id]
url = (
"https://graph.microsoft.com/beta/identity/conditionalAccess/authenticationStrength/policies/"
+ urllib.parse.quote(auth_strength_id)
+ "?$select=id,displayName"
)
payload = self._get(url)
name = payload.get("displayName") if isinstance(payload, dict) else None
self.auth_strength_cache[auth_strength_id] = name
return name
def resolve_id_list(
values: list,
lookup_fn,
) -> list[dict[str, str]]:
resolved: list[dict[str, str]] = []
for raw in values:
if not isinstance(raw, str) or not raw:
continue
resolved.append(
{
"id": raw,
"displayName": lookup_fn(raw) or "Unresolved",
}
)
return resolved
def main() -> int:
args = parse_args()
root = pathlib.Path(args.root).resolve()
token = args.token.strip()
if not token:
print("No Graph token provided. Skipping Conditional Access reference enrichment.")
return 0
ca_dir = root / "Conditional Access"
if not ca_dir.exists():
print(f"Conditional Access folder not found at {ca_dir}. Skipping.")
return 0
resolver = GraphResolver(token)
updated_files = 0
processed_files = 0
for file_path in sorted(ca_dir.glob("*.json")):
try:
payload = json.loads(file_path.read_text(encoding="utf-8"))
except Exception: # noqa: BLE001
continue
if not isinstance(payload, dict):
continue
processed_files += 1
changed = False
conditions = payload.get("conditions")
if not isinstance(conditions, dict):
conditions = {}
users = conditions.get("users")
if isinstance(users, dict):
for key, lookup in (
("includeGroups", resolver.group_name),
("excludeGroups", resolver.group_name),
("includeRoles", resolver.role_name),
("excludeRoles", resolver.role_name),
):
value = users.get(key)
if isinstance(value, list):
resolved_key = f"{key}Resolved"
resolved_value = resolve_id_list(value, lookup)
if users.get(resolved_key) != resolved_value:
users[resolved_key] = resolved_value
changed = True
apps = conditions.get("applications")
if isinstance(apps, dict):
for key in ("includeApplications", "excludeApplications"):
value = apps.get(key)
if isinstance(value, list):
resolved_key = f"{key}Resolved"
resolved_value = resolve_id_list(value, resolver.app_name)
if apps.get(resolved_key) != resolved_value:
apps[resolved_key] = resolved_value
changed = True
locations = conditions.get("locations")
if isinstance(locations, dict):
for key in ("includeLocations", "excludeLocations"):
value = locations.get(key)
if isinstance(value, list):
resolved_key = f"{key}Resolved"
resolved_value = resolve_id_list(value, resolver.location_name)
if locations.get(resolved_key) != resolved_value:
locations[resolved_key] = resolved_value
changed = True
grant_controls = payload.get("grantControls")
if isinstance(grant_controls, dict):
auth_strength = grant_controls.get("authenticationStrength")
if isinstance(auth_strength, dict):
auth_strength_id = auth_strength.get("id")
if isinstance(auth_strength_id, str) and auth_strength_id:
resolved = {
"id": auth_strength_id,
"displayName": resolver.auth_strength_name(auth_strength_id) or "Unresolved",
}
if grant_controls.get("authenticationStrengthResolved") != resolved:
grant_controls["authenticationStrengthResolved"] = resolved
changed = True
if changed:
file_path.write_text(json.dumps(payload, indent=5, ensure_ascii=False) + "\n", encoding="utf-8")
updated_files += 1
print(
"Conditional Access GUID enrichment complete. "
+ f"Processed files: {processed_files}. "
+ f"Updated files: {updated_files}."
)
return 0
if __name__ == "__main__":
raise SystemExit(main())
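
The resolver methods above all feed `resolve_id_list`, which falls back to `Unresolved` when a lookup returns nothing; an illustrative call with a stubbed lookup (no Graph traffic, made-up IDs):

```python
names = {"aaaa": "Break Glass Accounts"}  # stub lookup table, illustrative
resolved = resolve_id_list(["aaaa", "bbbb"], lambda guid: names.get(guid))
assert resolved == [
    {"id": "aaaa", "displayName": "Break Glass Accounts"},
    {"id": "bbbb", "displayName": "Unresolved"},
]
```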

File diff suppressed because it is too large

View File

@@ -0,0 +1,130 @@
#!/usr/bin/env python3
"""Validate backup outputs for Intune and Entra workloads."""
from __future__ import annotations
import argparse
from pathlib import Path
def to_bool(value: str) -> bool:
return str(value).strip().lower() in {"1", "true", "yes", "y", "on"}
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--workload", required=True, choices=["intune", "entra"])
parser.add_argument("--mode", default="light", choices=["light", "full"])
parser.add_argument("--root", required=True, help="Workload backup root path.")
parser.add_argument("--reports-root", required=True, help="Workload reports root path.")
parser.add_argument("--include-named-locations", default="false")
parser.add_argument("--include-authentication-strengths", default="false")
parser.add_argument("--include-conditional-access", default="false")
parser.add_argument("--include-enterprise-applications", default="false")
parser.add_argument("--include-enterprise-applications-effective", default="false")
parser.add_argument("--include-app-registrations", default="false")
parser.add_argument("--include-app-registrations-effective", default="false")
return parser.parse_args()
def _require_file(path: Path, label: str, errors: list[str]) -> None:
if not path.is_file():
errors.append(f"Missing {label}: {path}")
def _json_count(root: Path) -> int:
if not root.exists():
return 0
return sum(1 for _ in root.rglob("*.json"))
def _validate_intune(root: Path, reports_root: Path, errors: list[str]) -> None:
if not root.exists():
errors.append(f"Missing Intune backup root: {root}")
return
json_count = _json_count(root)
if json_count == 0:
errors.append(f"Intune backup root has no JSON exports: {root}")
_require_file(reports_root / "policy-assignments.md", "Intune assignment markdown report", errors)
_require_file(reports_root / "policy-assignments.csv", "Intune assignment CSV report", errors)
_require_file(reports_root / "object-inventory-all.csv", "Intune object inventory CSV", errors)
if errors:
return
print(f"Intune output validation passed: jsonFiles={json_count}")
def _validate_entra(root: Path, reports_root: Path, args: argparse.Namespace, errors: list[str]) -> None:
if not root.exists():
errors.append(f"Missing Entra backup root: {root}")
return
include_named_locations = to_bool(args.include_named_locations)
include_auth_strengths = to_bool(args.include_authentication_strengths)
include_conditional_access = to_bool(args.include_conditional_access)
include_enterprise_apps = to_bool(args.include_enterprise_applications)
include_enterprise_apps_effective = to_bool(args.include_enterprise_applications_effective)
include_app_registrations = to_bool(args.include_app_registrations)
include_app_registrations_effective = to_bool(args.include_app_registrations_effective)
expected_category_indexes: list[tuple[str, bool]] = [
("Named Locations", include_named_locations),
("Authentication Strengths", include_auth_strengths),
("Conditional Access", include_conditional_access),
("App Registrations", include_app_registrations_effective),
("Enterprise Applications", include_enterprise_apps_effective),
]
for category_name, is_required in expected_category_indexes:
if not is_required:
continue
index_path = root / category_name / f"{category_name}.md"
_require_file(index_path, f"Entra export index for '{category_name}'", errors)
_require_file(reports_root / "object-inventory-all.csv", "Entra object inventory CSV", errors)
if include_conditional_access:
_require_file(reports_root / "policy-assignments.md", "Entra assignment markdown report", errors)
_require_file(reports_root / "policy-assignments.csv", "Entra assignment CSV report", errors)
if include_app_registrations_effective or include_enterprise_apps_effective:
_require_file(reports_root / "apps-inventory.csv", "Entra apps inventory CSV", errors)
if errors:
return
json_count = _json_count(root)
print(
"Entra output validation passed: "
f"jsonFiles={json_count}, "
f"mode={args.mode}, "
f"enterpriseAppsConfigured={str(include_enterprise_apps).lower()}, "
f"enterpriseAppsEffective={str(include_enterprise_apps_effective).lower()}, "
f"appRegistrationsConfigured={str(include_app_registrations).lower()}, "
f"appRegistrationsEffective={str(include_app_registrations_effective).lower()}"
)
def main() -> int:
args = parse_args()
root = Path(args.root).resolve()
reports_root = Path(args.reports_root).resolve()
errors: list[str] = []
if args.workload == "intune":
_validate_intune(root=root, reports_root=reports_root, errors=errors)
else:
_validate_entra(root=root, reports_root=reports_root, args=args, errors=errors)
if errors:
print("Backup output validation failed:")
for item in errors:
print(f" - {item}")
return 1
return 0
if __name__ == "__main__":
raise SystemExit(main())

48
templates/variables-common.yml Normal File
View File

@@ -0,0 +1,48 @@
# Common variables shared across backup and review-sync pipelines.
# Include with:
#   variables:
#     - template: templates/variables-common.yml
variables:
- name: BASELINE_BRANCH
value: main
- name: DRIFT_BRANCH_INTUNE
value: drift/intune
- name: DRIFT_BRANCH_ENTRA
value: drift/entra
- name: BACKUP_FOLDER
value: tenant-state
- name: REPORTS_SUBDIR
value: reports
- name: ENABLE_WORKLOAD_INTUNE
value: true
- name: ENABLE_WORKLOAD_ENTRA
value: true
- name: ENABLE_PR_REVIEW_SUMMARY
value: true
- name: ENABLE_PR_REVIEWER_DECISIONS
value: true
- name: ENABLE_PR_AI_SUMMARY
value: true
- name: ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS
value: true
- name: REQUIRE_CHANGE_TICKETS
value: false
- name: CHANGE_TICKET_REGEX
value: "[A-Z][A-Z0-9]+-[0-9]+"
- name: DEBUG_CHANGE_TICKET_THREADS
value: false
- name: AZURE_OPENAI_API_VERSION
value: "2024-12-01-preview"
- name: AUTO_REMEDIATE_AFTER_MERGE
value: true
- name: AUTO_REMEDIATE_AFTER_MERGE_LOOKBACK_HOURS
value: 168
- name: AUTO_REMEDIATE_DRY_RUN
value: false
- name: AUTO_REMEDIATE_UPDATE_ASSIGNMENTS
value: true
- name: AUTO_REMEDIATE_REMOVE_OBJECTS
value: false
- name: AUTO_REMEDIATE_MAX_WORKERS
value: 10
- name: AUTO_REMEDIATE_EXCLUDE_CSV
value: ""

View File

@@ -0,0 +1,59 @@
# Tenant-specific variables for ASTRAL
#
# Copy these variables into an Azure DevOps Variable Group (e.g. vg-astral-tenant)
# and reference that group in your pipeline YAMLs. Do not commit secrets to Git.
#
# Example pipeline reference:
# variables:
# - group: vg-astral-tenant
# - template: templates/variables-common.yml
variables:
# Required: Microsoft 365 tenant domain
- name: TENANT_NAME
value: contoso.onmicrosoft.com
# Required: Azure DevOps service connection name (workload federated credential)
- name: SERVICE_CONNECTION_NAME
value: sc-astral-backup
# Required: Git commit identity used by the pipeline
- name: USER_NAME
value: ASTRAL Backup Service
# Required: Git commit email used by the pipeline
- name: USER_EMAIL
value: astral-backup@contoso.com
# Optional: Agent pool name. Default uses Azure-hosted agents.
- name: AGENT_POOL_NAME
value: Azure Pipelines
# Optional: Timezone for light/full run decisions. Must be a valid tz database name.
- name: BACKUP_TIMEZONE
value: Europe/Prague
# Optional: Full-run hour in BACKUP_TIMEZONE (24h format, zero-padded).
# The main pipeline runs hourly; only this hour triggers a full export.
- name: FULL_RUN_HOUR
value: "00"
# Optional: Cron schedule for the main backup pipeline.
- name: SCHEDULE_CRON
value: "0 * * * *"
# Optional but recommended: pipeline definition ID of azure-pipelines-restore.yml.
# Set this after you have imported the restore pipeline into Azure DevOps.
- name: AUTO_REMEDIATE_RESTORE_PIPELINE_ID
value: ""
# Optional: Azure OpenAI settings for AI-assisted PR summaries.
# Store AZURE_OPENAI_API_KEY as a secret variable.
- name: ENABLE_PR_AI_SUMMARY
value: false
- name: AZURE_OPENAI_ENDPOINT
value: ""
- name: AZURE_OPENAI_DEPLOYMENT
value: ""
- name: AZURE_OPENAI_API_KEY
value: ""

4
tenant-state/README.md Normal file
View File

@@ -0,0 +1,4 @@
# tenant-state
This directory is populated automatically by the ASTRAL pipeline.
Do not place files here manually; they will be overwritten on the next export.

View File

@@ -0,0 +1,342 @@
from __future__ import annotations
import importlib.util
import os
import subprocess
import sys
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch
MODULE_PATH = Path(__file__).resolve().parents[1] / "scripts" / "ensure_rolling_pr.py"
def load_module():
# Preload common helper so the script can import it.
common_path = MODULE_PATH.parent / "common.py"
common_spec = importlib.util.spec_from_file_location("common", common_path)
if common_spec is not None and common_spec.loader is not None:
common_mod = importlib.util.module_from_spec(common_spec)
sys.modules["common"] = common_mod
common_spec.loader.exec_module(common_mod)
module_name = "ensure_rolling_pr"
spec = importlib.util.spec_from_file_location(module_name, MODULE_PATH)
if spec is None or spec.loader is None:
raise RuntimeError(f"Unable to load module from {MODULE_PATH}")
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
def _run(cmd: list[str], cwd: Path) -> None:
subprocess.run(cmd, cwd=cwd, check=True, capture_output=True, text=True)
class EnsureRollingPrTests(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
cls.module = load_module()
def test_is_workload_config_path_filters_docs_and_reports(self) -> None:
is_path = self.module._is_workload_config_path
self.assertTrue(
is_path(
"tenant-state/entra/Conditional Access/policy.json",
workload_dir="entra",
backup_folder="tenant-state",
reports_subdir="reports",
)
)
self.assertFalse(
is_path(
"tenant-state/entra/Conditional Access/policy.md",
workload_dir="entra",
backup_folder="tenant-state",
reports_subdir="reports",
)
)
self.assertFalse(
is_path(
"tenant-state/reports/entra/assignment_report.md",
workload_dir="entra",
backup_folder="tenant-state",
reports_subdir="reports",
)
)
def test_config_fingerprint_ignores_docs_and_reports(self) -> None:
with tempfile.TemporaryDirectory() as tmp:
repo = Path(tmp)
_run(["git", "init"], repo)
_run(["git", "config", "user.name", "Test"], repo)
_run(["git", "config", "user.email", "test@example.com"], repo)
config_file = repo / "tenant-state" / "entra" / "Conditional Access" / "policy.json"
report_file = repo / "tenant-state" / "reports" / "entra" / "summary.md"
doc_file = repo / "tenant-state" / "entra" / "README.md"
config_file.parent.mkdir(parents=True, exist_ok=True)
report_file.parent.mkdir(parents=True, exist_ok=True)
doc_file.parent.mkdir(parents=True, exist_ok=True)
config_file.write_text('{"state":"enabled"}\n', encoding="utf-8")
report_file.write_text("report v1\n", encoding="utf-8")
doc_file.write_text("doc v1\n", encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "initial"], repo)
fp1 = self.module._config_fingerprint_from_local_tree(
repo_root=str(repo),
commitish="HEAD",
workload_dir="entra",
backup_folder="tenant-state",
reports_subdir="reports",
)
report_file.write_text("report v2\n", encoding="utf-8")
doc_file.write_text("doc v2\n", encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "doc/report only"], repo)
fp2 = self.module._config_fingerprint_from_local_tree(
repo_root=str(repo),
commitish="HEAD",
workload_dir="entra",
backup_folder="tenant-state",
reports_subdir="reports",
)
config_file.write_text('{"state":"disabled"}\n', encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "config change"], repo)
fp3 = self.module._config_fingerprint_from_local_tree(
repo_root=str(repo),
commitish="HEAD",
workload_dir="entra",
backup_folder="tenant-state",
reports_subdir="reports",
)
self.assertTrue(fp1)
self.assertEqual(fp1, fp2)
self.assertNotEqual(fp2, fp3)
def test_ref_has_commit_for_local_and_missing_ref(self) -> None:
with tempfile.TemporaryDirectory() as tmp:
repo = Path(tmp)
_run(["git", "init"], repo)
_run(["git", "config", "user.name", "Test"], repo)
_run(["git", "config", "user.email", "test@example.com"], repo)
(repo / "README.md").write_text("x\n", encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "init"], repo)
self.assertTrue(self.module._ref_has_commit(str(repo), "HEAD"))
self.assertFalse(self.module._ref_has_commit(str(repo), "origin/does-not-exist"))
def test_workload_config_diff_exists_ignores_docs_and_reports(self) -> None:
with tempfile.TemporaryDirectory() as tmp:
repo = Path(tmp)
_run(["git", "init"], repo)
_run(["git", "config", "user.name", "Test"], repo)
_run(["git", "config", "user.email", "test@example.com"], repo)
config_file = repo / "tenant-state" / "intune" / "Device Configurations" / "policy.json"
report_file = repo / "tenant-state" / "reports" / "intune" / "summary.md"
doc_file = repo / "tenant-state" / "intune" / "README.md"
config_file.parent.mkdir(parents=True, exist_ok=True)
report_file.parent.mkdir(parents=True, exist_ok=True)
doc_file.parent.mkdir(parents=True, exist_ok=True)
config_file.write_text('{"setting":"enabled"}\n', encoding="utf-8")
report_file.write_text("report v1\n", encoding="utf-8")
doc_file.write_text("doc v1\n", encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "baseline"], repo)
baseline_commit = subprocess.run(
["git", "rev-parse", "HEAD"],
cwd=repo,
check=True,
capture_output=True,
text=True,
).stdout.strip()
report_file.write_text("report v2\n", encoding="utf-8")
doc_file.write_text("doc v2\n", encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "doc only"], repo)
doc_only_commit = subprocess.run(
["git", "rev-parse", "HEAD"],
cwd=repo,
check=True,
capture_output=True,
text=True,
).stdout.strip()
config_file.write_text('{"setting":"disabled"}\n', encoding="utf-8")
_run(["git", "add", "."], repo)
_run(["git", "commit", "-m", "config change"], repo)
config_change_commit = subprocess.run(
["git", "rev-parse", "HEAD"],
cwd=repo,
check=True,
capture_output=True,
text=True,
).stdout.strip()
self.assertFalse(
self.module._workload_config_diff_exists(
repo_root=str(repo),
baseline_commitish=baseline_commit,
drift_commitish=doc_only_commit,
workload_dir="intune",
backup_folder="tenant-state",
reports_subdir="reports",
)
)
self.assertTrue(
self.module._workload_config_diff_exists(
repo_root=str(repo),
baseline_commitish=baseline_commit,
drift_commitish=config_change_commit,
workload_dir="intune",
backup_folder="tenant-state",
reports_subdir="reports",
)
)
def test_main_suppresses_pr_creation_when_drift_matches_baseline_config(self) -> None:
env = {
"SYSTEM_ACCESSTOKEN": "token",
"SYSTEM_COLLECTIONURI": "https://dev.azure.com/example",
"SYSTEM_TEAMPROJECT": "Project",
"BUILD_REPOSITORY_ID": "repo-id",
}
with patch.dict(os.environ, env, clear=False):
with patch.object(
sys,
"argv",
[
"ensure_rolling_pr.py",
"--repo-root",
"/tmp/repo",
"--workload",
"intune",
"--drift-branch",
"drift/intune",
"--baseline-branch",
"main",
"--pr-title",
"Intune drift review (rolling)",
],
):
with patch.object(self.module, "_query_prs", return_value=[]):
with patch.object(self.module, "_run_git"):
with patch.object(self.module, "_ref_has_commit", return_value=True):
with patch.object(self.module, "_workload_config_diff_exists", return_value=False):
with patch.object(self.module, "_request_json") as request_json:
result = self.module.main()
self.assertEqual(result, 0)
request_json.assert_not_called()
def test_main_creates_pr_as_draft_when_notification_delay_enabled(self) -> None:
env = {
"SYSTEM_ACCESSTOKEN": "token",
"SYSTEM_COLLECTIONURI": "https://dev.azure.com/example",
"SYSTEM_TEAMPROJECT": "Project",
"BUILD_REPOSITORY_ID": "repo-id",
"BUILD_BUILDNUMBER": "42",
"BUILD_BUILDID": "1001",
"ROLLING_PR_DELAY_REVIEWER_NOTIFICATIONS": "true",
}
created_bodies: list[dict[str, object]] = []
def request_json(url: str, headers: dict[str, str], method: str = "GET", body: dict[str, object] | None = None):
if method == "POST" and url.endswith("/pullrequests?api-version=7.1"):
created_bodies.append(body or {})
return {"pullRequestId": 123}
raise AssertionError(f"Unexpected request: {method} {url}")
with patch.dict(os.environ, env, clear=False):
with patch.object(
sys,
"argv",
[
"ensure_rolling_pr.py",
"--repo-root",
"/tmp/repo",
"--workload",
"intune",
"--drift-branch",
"drift/intune",
"--baseline-branch",
"main",
"--pr-title",
"Intune drift review (rolling)",
],
):
with patch.object(self.module, "_query_prs", side_effect=[[], []]):
with patch.object(self.module, "_run_git"):
with patch.object(self.module, "_ref_has_commit", return_value=True):
with patch.object(self.module, "_workload_config_diff_exists", return_value=True):
with patch.object(self.module, "_tree_id_for_commitish", return_value="tree123"):
with patch.object(self.module, "_find_matching_abandoned_pr", return_value=(None, "")):
with patch.object(self.module, "_request_json", side_effect=request_json):
result = self.module.main()
self.assertEqual(result, 0)
self.assertEqual(len(created_bodies), 1)
self.assertTrue(created_bodies[0]["isDraft"])
def test_main_skips_active_pr_patch_when_already_up_to_date(self) -> None:
env = {
"SYSTEM_ACCESSTOKEN": "token",
"SYSTEM_COLLECTIONURI": "https://dev.azure.com/example",
"SYSTEM_TEAMPROJECT": "Project",
"BUILD_REPOSITORY_ID": "repo-id",
}
with patch.dict(os.environ, env, clear=False):
with patch.object(
sys,
"argv",
[
"ensure_rolling_pr.py",
"--repo-root",
"/tmp/repo",
"--workload",
"intune",
"--drift-branch",
"drift/intune",
"--baseline-branch",
"main",
"--pr-title",
"Intune drift review (rolling)",
],
):
with patch.object(
self.module,
"_query_prs",
return_value=[
{
"pullRequestId": 123,
"title": "Intune drift review (rolling)",
"description": "Existing description with summary",
"completionOptions": {"mergeStrategy": "rebase"},
"url": "https://dev.azure.com/example/_apis/git/repositories/repo/pullRequests/123",
}
],
):
with patch.object(self.module, "_request_json") as request_json:
result = self.module.main()
self.assertEqual(result, 0)
request_json.assert_not_called()
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,252 @@
from __future__ import annotations
import importlib.util
import json
import tempfile
import unittest
from pathlib import Path
from types import SimpleNamespace
from unittest.mock import MagicMock, patch
MODULE_PATH = Path(__file__).resolve().parents[1] / "scripts" / "export_entra_baseline.py"
def load_module():
spec = importlib.util.spec_from_file_location("export_entra_baseline", MODULE_PATH)
if spec is None or spec.loader is None:
raise RuntimeError(f"Unable to load module from {MODULE_PATH}")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
class ExportEntraBaselineTests(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
cls.module = load_module()
def _namespace(self, root: Path, fail_on_export_error: str) -> SimpleNamespace:
return SimpleNamespace(
root=str(root),
token="token-value",
include_named_locations="true",
include_authentication_strengths="false",
include_conditional_access="false",
include_enterprise_applications="false",
include_app_registrations="false",
enterprise_app_workers=1,
fail_on_export_error=fail_on_export_error,
previous_snapshot_ref="",
)
def test_requested_export_error_is_fatal_by_default(self) -> None:
with tempfile.TemporaryDirectory() as td:
root = Path(td) / "entra"
root.mkdir(parents=True, exist_ok=True)
args = self._namespace(root=root, fail_on_export_error="true")
with (
patch.object(self.module, "parse_args", return_value=args),
patch.object(self.module, "GraphClient") as graph_client_cls,
):
graph_client = MagicMock()
graph_client.get_object.return_value = ({"value": []}, None)
graph_client.get_collection.return_value = ([], "HTTP 500")
graph_client_cls.return_value = graph_client
result = self.module.main()
self.assertEqual(result, 2)
def test_requested_export_error_can_be_non_fatal_when_disabled(self) -> None:
with tempfile.TemporaryDirectory() as td:
root = Path(td) / "entra"
root.mkdir(parents=True, exist_ok=True)
args = self._namespace(root=root, fail_on_export_error="false")
with (
patch.object(self.module, "parse_args", return_value=args),
patch.object(self.module, "GraphClient") as graph_client_cls,
):
graph_client = MagicMock()
graph_client.get_object.return_value = ({"value": []}, None)
graph_client.get_collection.return_value = ([], "HTTP 500")
graph_client_cls.return_value = graph_client
result = self.module.main()
self.assertEqual(result, 0)
def test_normalize_resolution_error_suppresses_transient_dns_variants(self) -> None:
transient_samples = [
"<urlopen error [Errno -3] Temporary failure in name resolution>",
"Temporary failure resolving 'graph.microsoft.com'",
"Failed to resolve host graph.microsoft.com",
"getaddrinfo failed",
]
for sample in transient_samples:
with self.subTest(sample=sample):
self.assertEqual(self.module.normalize_resolution_error(sample), "")
def test_normalize_resolution_error_keeps_non_transient_http_error(self) -> None:
self.assertEqual(self.module.normalize_resolution_error("HTTP 403"), "HTTP 403")
def test_normalize_branch_name_ignores_unresolved_macro(self) -> None:
self.assertEqual(self.module._normalize_branch_name("$(DRIFT_BRANCH_ENTRA)"), "")
def test_required_resource_resolution_backfills_unresolved_from_previous(self) -> None:
current = [
{
"resourceAppId": "00000003-0000-0000-c000-000000000000",
"resourceDisplayName": "Unresolved",
"permissions": [
{
"id": "perm-id-1",
"type": "Scope",
"value": "",
"displayName": "",
"description": "",
}
],
}
]
previous = [
{
"resourceAppId": "00000003-0000-0000-c000-000000000000",
"resourceDisplayName": "Microsoft Graph",
"permissions": [
{
"id": "perm-id-1",
"type": "Scope",
"value": "User.Read.All",
"displayName": "Read all users' full profiles",
"description": "Allows the app to read full profiles.",
}
],
}
]
merged = self.module._merge_required_resource_access_resolution(current, previous)
self.assertEqual(merged[0]["resourceDisplayName"], "Microsoft Graph")
self.assertEqual(merged[0]["permissions"][0]["value"], "User.Read.All")
self.assertEqual(merged[0]["permissions"][0]["displayName"], "Read all users' full profiles")
unresolved_resources, unresolved_permissions = self.module._count_unresolved_required_permissions(merged)
self.assertEqual(unresolved_resources, 0)
self.assertEqual(unresolved_permissions, 0)
def test_app_role_resolution_backfills_unresolved_from_previous(self) -> None:
current = [
{
"resourceId": "resource-1",
"resourceDisplayName": "Unresolved",
"appRoleId": "role-1",
"appRoleValue": "",
"appRoleDisplayName": "",
"principalType": "ServicePrincipal",
}
]
previous = [
{
"resourceId": "resource-1",
"resourceDisplayName": "Office 365 Exchange Online",
"appRoleId": "role-1",
"appRoleValue": "Exchange.ManageAsApp",
"appRoleDisplayName": "Manage Exchange as application",
"principalType": "ServicePrincipal",
}
]
merged = self.module._merge_app_role_assignments_resolution(current, previous)
self.assertEqual(merged[0]["resourceDisplayName"], "Office 365 Exchange Online")
self.assertEqual(merged[0]["appRoleValue"], "Exchange.ManageAsApp")
self.assertEqual(merged[0]["appRoleDisplayName"], "Manage Exchange as application")
unresolved_resources, unresolved_roles = self.module._count_unresolved_app_role_assignments(merged)
self.assertEqual(unresolved_resources, 0)
self.assertEqual(unresolved_roles, 0)
def test_required_resource_access_uses_direct_appid_fallback_when_filter_returns_empty(self) -> None:
app = {
"requiredResourceAccess": [
{
"resourceAppId": "00000003-0000-0000-c000-000000000000",
"resourceAccess": [
{
"id": "e1fe6dd8-ba31-4d61-89e7-88639da4683d",
"type": "Scope",
}
],
}
]
}
client = MagicMock()
client.get_object.side_effect = [
({"value": []}, None),
(
{
"id": "sp-graph",
"appId": "00000003-0000-0000-c000-000000000000",
"displayName": "Microsoft Graph",
"appRoles": [],
"oauth2PermissionScopes": [
{
"id": "e1fe6dd8-ba31-4d61-89e7-88639da4683d",
"value": "User.Read",
"adminConsentDisplayName": "Sign in and read user profile",
"adminConsentDescription": "Allows sign-in and profile read.",
}
],
},
None,
),
]
resolved, unresolved_resources, unresolved_permissions, lookup_errors = self.module.resolve_required_resource_access(
app=app,
client=client,
resource_sp_by_appid={},
)
self.assertEqual(unresolved_resources, 0)
self.assertEqual(unresolved_permissions, 0)
self.assertEqual(lookup_errors, [])
self.assertEqual(resolved[0]["resourceDisplayName"], "Microsoft Graph")
self.assertEqual(resolved[0]["permissions"][0]["value"], "User.Read")
def test_load_resource_sp_cache_from_export_reads_enterprise_apps(self) -> None:
with tempfile.TemporaryDirectory() as td:
root = Path(td) / "entra"
export_dir = root / "Enterprise Applications"
export_dir.mkdir(parents=True, exist_ok=True)
payload = {
"id": "sp-graph",
"appId": "00000003-0000-0000-c000-000000000000",
"displayName": "Microsoft Graph",
"appRoles": [{"id": "role-1", "value": "Directory.Read.All"}],
"oauth2PermissionScopes": [{"id": "scope-1", "value": "User.Read"}],
}
(export_dir / "Microsoft Graph__sp-graph.json").write_text(json.dumps(payload), encoding="utf-8")
cache = self.module._load_resource_sp_cache_from_export(root)
self.assertIn("00000003-0000-0000-c000-000000000000", cache)
graph = cache["00000003-0000-0000-c000-000000000000"]
self.assertEqual(graph["displayName"], "Microsoft Graph")
self.assertEqual(graph["appRoles"][0]["value"], "Directory.Read.All")
self.assertEqual(graph["oauth2PermissionScopes"][0]["value"], "User.Read")
def test_load_resource_sp_cache_from_export_ignores_invalid_files(self) -> None:
with tempfile.TemporaryDirectory() as td:
root = Path(td) / "entra"
export_dir = root / "Enterprise Applications"
export_dir.mkdir(parents=True, exist_ok=True)
(export_dir / "invalid.json").write_text("{", encoding="utf-8")
(export_dir / "missing-appid.json").write_text(json.dumps({"id": "sp-only"}), encoding="utf-8")
cache = self.module._load_resource_sp_cache_from_export(root)
self.assertEqual(cache, {})
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,164 @@
from __future__ import annotations
import importlib.util
import json
import subprocess
import tempfile
import unittest
from pathlib import Path
MODULE_PATH = Path(__file__).resolve().parents[1] / "scripts" / "filter_entra_enrichment_noise.py"
def load_module():
spec = importlib.util.spec_from_file_location("filter_entra_enrichment_noise", MODULE_PATH)
if spec is None or spec.loader is None:
raise RuntimeError(f"Unable to load module from {MODULE_PATH}")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
def _git(repo: Path, *args: str) -> None:
subprocess.run(
["git", *args],
cwd=str(repo),
check=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
class FilterEntraEnrichmentNoiseTests(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
cls.module = load_module()
def test_is_enrichment_only_change_true(self) -> None:
old_text = json.dumps(
{
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "00000003-0000-0000-c000-000000000000"}],
"requiredResourceAccessResolved": [{"resourceDisplayName": "Microsoft Graph"}],
"resolutionStatus": {"requiredResourceAccess": {"unresolvedPermissionCount": 0}},
}
)
new_text = json.dumps(
{
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "00000003-0000-0000-c000-000000000000"}],
"requiredResourceAccessResolved": [{"resourceDisplayName": "Unresolved"}],
"resolutionStatus": {"requiredResourceAccess": {"unresolvedPermissionCount": 6}},
}
)
self.assertTrue(self.module._is_enrichment_only_change(old_text, new_text))
def test_is_enrichment_only_change_false_when_config_changes(self) -> None:
old_text = json.dumps(
{
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "00000003-0000-0000-c000-000000000000"}],
}
)
new_text = json.dumps(
{
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "11111111-0000-0000-c000-000000000000"}],
}
)
self.assertFalse(self.module._is_enrichment_only_change(old_text, new_text))
def test_filter_reverts_only_enrichment_changes(self) -> None:
with tempfile.TemporaryDirectory() as td:
repo = Path(td)
_git(repo, "init")
_git(repo, "config", "user.email", "tester@example.com")
_git(repo, "config", "user.name", "Tester")
workload_dir = repo / "tenant-state" / "entra" / "App Registrations"
workload_dir.mkdir(parents=True, exist_ok=True)
file_path = workload_dir / "Test App__id.json"
baseline = {
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "00000003-0000-0000-c000-000000000000"}],
"requiredResourceAccessResolved": [{"resourceDisplayName": "Microsoft Graph"}],
"resolutionStatus": {"requiredResourceAccess": {"unresolvedPermissionCount": 0}},
}
file_path.write_text(json.dumps(baseline, indent=2) + "\n", encoding="utf-8")
_git(repo, "add", ".")
_git(repo, "commit", "-m", "baseline")
enrichment_only = {
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "00000003-0000-0000-c000-000000000000"}],
"requiredResourceAccessResolved": [{"resourceDisplayName": "Unresolved"}],
"resolutionStatus": {"requiredResourceAccess": {"unresolvedPermissionCount": 6}},
}
file_path.write_text(json.dumps(enrichment_only, indent=2) + "\n", encoding="utf-8")
residual_before = self.module.find_enrichment_only_modified_files(
repo_root=repo,
workload_root="tenant-state/entra",
)
self.assertEqual(residual_before, ["tenant-state/entra/App Registrations/Test App__id.json"])
reverted = self.module.filter_enrichment_only_files(repo_root=repo, workload_root="tenant-state/entra")
self.assertEqual(reverted, ["tenant-state/entra/App Registrations/Test App__id.json"])
residual_after = self.module.find_enrichment_only_modified_files(
repo_root=repo,
workload_root="tenant-state/entra",
)
self.assertEqual(residual_after, [])
status = subprocess.run(
["git", "status", "--short"],
cwd=str(repo),
check=True,
capture_output=True,
text=True,
).stdout.strip()
self.assertEqual(status, "")
def test_filter_keeps_real_config_changes(self) -> None:
with tempfile.TemporaryDirectory() as td:
repo = Path(td)
_git(repo, "init")
_git(repo, "config", "user.email", "tester@example.com")
_git(repo, "config", "user.name", "Tester")
workload_dir = repo / "tenant-state" / "entra" / "App Registrations"
workload_dir.mkdir(parents=True, exist_ok=True)
file_path = workload_dir / "Test App__id.json"
baseline = {
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "00000003-0000-0000-c000-000000000000"}],
"requiredResourceAccessResolved": [{"resourceDisplayName": "Microsoft Graph"}],
}
file_path.write_text(json.dumps(baseline, indent=2) + "\n", encoding="utf-8")
_git(repo, "add", ".")
_git(repo, "commit", "-m", "baseline")
config_changed = {
"displayName": "App",
"requiredResourceAccess": [{"resourceAppId": "11111111-0000-0000-c000-000000000000"}],
"requiredResourceAccessResolved": [{"resourceDisplayName": "Unresolved"}],
}
file_path.write_text(json.dumps(config_changed, indent=2) + "\n", encoding="utf-8")
reverted = self.module.filter_enrichment_only_files(repo_root=repo, workload_root="tenant-state/entra")
self.assertEqual(reverted, [])
status = subprocess.run(
["git", "status", "--short"],
cwd=str(repo),
check=True,
capture_output=True,
text=True,
).stdout
self.assertIn("Test App__id.json", status)
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,109 @@
from __future__ import annotations
import importlib.util
import json
import subprocess
import tempfile
import unittest
from pathlib import Path
MODULE_PATH = Path(__file__).resolve().parents[1] / "scripts" / "filter_intune_partial_settings_noise.py"
def load_module():
spec = importlib.util.spec_from_file_location("filter_intune_partial_settings_noise", MODULE_PATH)
if spec is None or spec.loader is None:
raise RuntimeError(f"Unable to load module from {MODULE_PATH}")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
def _git(repo: Path, *args: str) -> None:
subprocess.run(
["git", *args],
cwd=str(repo),
check=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
)
class FilterIntunePartialSettingsNoiseTests(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
cls.module = load_module()
def test_partial_payload_detection(self) -> None:
self.assertTrue(self.module._is_partial_settings_payload({"settingCount": 1}))
self.assertTrue(self.module._is_partial_settings_payload({"settingCount": 2, "settings": []}))
self.assertFalse(self.module._is_partial_settings_payload({"settingCount": 0, "settings": []}))
self.assertFalse(self.module._is_partial_settings_payload({"settingCount": 2, "settings": [{"id": "0"}]}))
def test_restore_partial_settings_from_baseline(self) -> None:
with tempfile.TemporaryDirectory() as td:
repo = Path(td)
_git(repo, "init")
_git(repo, "config", "user.email", "tester@example.com")
_git(repo, "config", "user.name", "Tester")
workload_dir = repo / "tenant-state" / "intune" / "Settings Catalog"
workload_dir.mkdir(parents=True, exist_ok=True)
file_path = workload_dir / "Policy__abc.json"
baseline = {
"name": "Policy",
"settingCount": 2,
"settings": [{"id": "0"}, {"id": "1"}],
}
file_path.write_text(json.dumps(baseline, indent=2) + "\n", encoding="utf-8")
_git(repo, "add", ".")
_git(repo, "commit", "-m", "baseline")
partial = {
"name": "Policy",
"settingCount": 2,
}
file_path.write_text(json.dumps(partial, indent=2) + "\n", encoding="utf-8")
restored, unresolved = self.module.restore_partial_settings_from_baseline(
repo_root=repo,
backup_root=repo / "tenant-state" / "intune",
baseline_ref="HEAD",
)
self.assertEqual(restored, ["tenant-state/intune/Settings Catalog/Policy__abc.json"])
self.assertEqual(unresolved, [])
payload = json.loads(file_path.read_text(encoding="utf-8"))
self.assertEqual(payload["settings"], [{"id": "0"}, {"id": "1"}])
def test_partial_settings_unresolved_without_baseline(self) -> None:
with tempfile.TemporaryDirectory() as td:
repo = Path(td)
_git(repo, "init")
_git(repo, "config", "user.email", "tester@example.com")
_git(repo, "config", "user.name", "Tester")
(repo / "README.md").write_text("test\n", encoding="utf-8")
_git(repo, "add", ".")
_git(repo, "commit", "-m", "init")
workload_dir = repo / "tenant-state" / "intune" / "Settings Catalog"
workload_dir.mkdir(parents=True, exist_ok=True)
file_path = workload_dir / "Policy__missing.json"
file_path.write_text(json.dumps({"settingCount": 4}, indent=2) + "\n", encoding="utf-8")
restored, unresolved = self.module.restore_partial_settings_from_baseline(
repo_root=repo,
backup_root=repo / "tenant-state" / "intune",
baseline_ref="HEAD",
)
self.assertEqual(restored, [])
self.assertEqual(unresolved, ["tenant-state/intune/Settings Catalog/Policy__missing.json"])
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,103 @@
from __future__ import annotations
import base64
import importlib.util
import sys
import unittest
from pathlib import Path
from unittest.mock import patch
MODULE_PATH = Path(__file__).resolve().parents[1] / "scripts" / "queue_post_merge_restore.py"
def load_module():
# Preload common helper so the script can import it.
common_path = MODULE_PATH.parent / "common.py"
common_spec = importlib.util.spec_from_file_location("common", common_path)
if common_spec is not None and common_spec.loader is not None:
common_mod = importlib.util.module_from_spec(common_spec)
sys.modules["common"] = common_mod
common_spec.loader.exec_module(common_mod)
module_name = "queue_post_merge_restore"
spec = importlib.util.spec_from_file_location(module_name, MODULE_PATH)
if spec is None or spec.loader is None:
raise RuntimeError(f"Unable to load module from {MODULE_PATH}")
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
return module
def _marker(path: str) -> str:
encoded = base64.urlsafe_b64encode(path.encode("utf-8")).decode("ascii").rstrip("=")
return f"Automation marker: AUTO-CHANGE-TICKET:{encoded}"
class QueuePostMergeRestoreTests(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
cls.module = load_module()
def test_ticket_path_from_content_decodes_marker(self) -> None:
path = "tenant-state/intune/Device Configurations/macOS - WiFi TEST_macOSWiFiConfiguration__id.json"
content = f"Header\n{_marker(path)}\nBody"
self.assertEqual(self.module._ticket_path_from_content(content), path)
def test_rejected_ticket_paths_uses_latest_decision(self) -> None:
accepted_path = "tenant-state/intune/Settings Catalog/A.json"
rejected_path = "tenant-state/intune/Settings Catalog/B.json"
threads = [
{
"comments": [
{"id": 1, "parentCommentId": 0, "content": _marker(accepted_path)},
{"id": 2, "parentCommentId": 0, "content": "/reject"},
{"id": 3, "parentCommentId": 0, "content": "/accept"},
]
},
{
"comments": [
{"id": 1, "parentCommentId": 0, "content": _marker(rejected_path)},
{"id": 2, "parentCommentId": 0, "content": "/accept"},
{"id": 3, "parentCommentId": 0, "content": "/reject"},
]
},
]
self.assertEqual(self.module._rejected_ticket_paths(threads), [rejected_path])
def test_queue_restore_pipeline_includes_selective_params(self) -> None:
captured: dict[str, object] = {}
def _fake_request(url: str, headers: dict[str, str], method: str = "GET", body: dict | None = None):
captured["url"] = url
captured["method"] = method
captured["body"] = body or {}
return {"id": 123}
with patch.object(self.module, "_request_json", side_effect=_fake_request):
self.module._queue_restore_pipeline(
collection_uri="https://dev.azure.com/org",
project="proj",
headers={"Authorization": "Bearer x"},
definition_id=42,
baseline_branch="main",
include_entra_update=False,
dry_run=False,
update_assignments=True,
remove_unmanaged=False,
max_workers=10,
exclude_csv="",
restore_mode="selective",
restore_paths_csv="tenant-state/intune/Device Configurations/macOS - WiFi TEST.json",
)
body = captured["body"]
self.assertIsInstance(body, dict)
template = body["templateParameters"]
self.assertEqual(template["restoreMode"], "selective")
self.assertIn("restorePathsCsv", template)
if __name__ == "__main__":
unittest.main()
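
The `_marker` helper above encodes the ticket path as URL-safe base64 with the `=` padding stripped, so a decoder has to restore the padding first. A minimal round-trip sketch under that assumption (`decode_marker` is hypothetical; the production parsing lives in `queue_post_merge_restore.py`):

```python
import base64

def decode_marker(encoded: str) -> str:
    # Restore the '=' padding that _marker strips, then decode.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")

path = "tenant-state/intune/Settings Catalog/A.json"
encoded = base64.urlsafe_b64encode(path.encode("utf-8")).decode("ascii").rstrip("=")
assert decode_marker(encoded) == path
```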

File diff suppressed because it is too large

View File

@@ -0,0 +1,157 @@
from __future__ import annotations
import json
import subprocess
import sys
import tempfile
import unittest
from pathlib import Path
SCRIPT_PATH = Path(__file__).resolve().parents[1] / "scripts" / "validate_backup_outputs.py"
def run_validator(*args: str) -> subprocess.CompletedProcess[str]:
cmd = [sys.executable, str(SCRIPT_PATH), *args]
return subprocess.run(cmd, check=False, text=True, capture_output=True)
class ValidateBackupOutputsTests(unittest.TestCase):
def test_intune_validation_passes_with_required_outputs(self) -> None:
with tempfile.TemporaryDirectory() as td:
base = Path(td)
root = base / "tenant-state" / "intune"
reports = base / "tenant-state" / "reports" / "intune"
(root / "Device Configurations").mkdir(parents=True, exist_ok=True)
reports.mkdir(parents=True, exist_ok=True)
(root / "Device Configurations" / "policy__id.json").write_text(
json.dumps({"id": "id-1", "displayName": "Policy"}) + "\n",
encoding="utf-8",
)
(reports / "policy-assignments.md").write_text("# report\n", encoding="utf-8")
(reports / "policy-assignments.csv").write_text("a,b\n", encoding="utf-8")
(reports / "object-inventory-all.csv").write_text("a,b\n", encoding="utf-8")
result = run_validator(
"--workload",
"intune",
"--mode",
"light",
"--root",
str(root),
"--reports-root",
str(reports),
)
self.assertEqual(result.returncode, 0, msg=result.stdout + result.stderr)
def test_intune_validation_fails_when_assignment_csv_missing(self) -> None:
with tempfile.TemporaryDirectory() as td:
base = Path(td)
root = base / "tenant-state" / "intune"
reports = base / "tenant-state" / "reports" / "intune"
(root / "Device Configurations").mkdir(parents=True, exist_ok=True)
reports.mkdir(parents=True, exist_ok=True)
(root / "Device Configurations" / "policy__id.json").write_text("{}", encoding="utf-8")
(reports / "policy-assignments.md").write_text("# report\n", encoding="utf-8")
(reports / "object-inventory-all.csv").write_text("a,b\n", encoding="utf-8")
result = run_validator(
"--workload",
"intune",
"--mode",
"full",
"--root",
str(root),
"--reports-root",
str(reports),
)
self.assertNotEqual(result.returncode, 0)
self.assertIn("Missing Intune assignment CSV report", result.stdout)
def test_entra_light_validation_allows_non_effective_enterprise_apps(self) -> None:
with tempfile.TemporaryDirectory() as td:
base = Path(td)
root = base / "tenant-state" / "entra"
reports = base / "tenant-state" / "reports" / "entra"
(root / "Named Locations").mkdir(parents=True, exist_ok=True)
reports.mkdir(parents=True, exist_ok=True)
(root / "Named Locations" / "Named Locations.md").write_text("# named\n", encoding="utf-8")
(reports / "object-inventory-all.csv").write_text("a,b\n", encoding="utf-8")
result = run_validator(
"--workload",
"entra",
"--mode",
"light",
"--root",
str(root),
"--reports-root",
str(reports),
"--include-named-locations",
"true",
"--include-enterprise-applications",
"true",
"--include-enterprise-applications-effective",
"false",
)
self.assertEqual(result.returncode, 0, msg=result.stdout + result.stderr)
def test_entra_light_validation_allows_non_effective_app_registrations(self) -> None:
with tempfile.TemporaryDirectory() as td:
base = Path(td)
root = base / "tenant-state" / "entra"
reports = base / "tenant-state" / "reports" / "entra"
(root / "Named Locations").mkdir(parents=True, exist_ok=True)
reports.mkdir(parents=True, exist_ok=True)
(root / "Named Locations" / "Named Locations.md").write_text("# named\n", encoding="utf-8")
(reports / "object-inventory-all.csv").write_text("a,b\n", encoding="utf-8")
result = run_validator(
"--workload",
"entra",
"--mode",
"light",
"--root",
str(root),
"--reports-root",
str(reports),
"--include-named-locations",
"true",
"--include-app-registrations",
"true",
"--include-app-registrations-effective",
"false",
)
self.assertEqual(result.returncode, 0, msg=result.stdout + result.stderr)
def test_entra_validation_fails_when_required_index_missing(self) -> None:
with tempfile.TemporaryDirectory() as td:
base = Path(td)
root = base / "tenant-state" / "entra"
reports = base / "tenant-state" / "reports" / "entra"
root.mkdir(parents=True, exist_ok=True)
reports.mkdir(parents=True, exist_ok=True)
(reports / "object-inventory-all.csv").write_text("a,b\n", encoding="utf-8")
result = run_validator(
"--workload",
"entra",
"--mode",
"full",
"--root",
str(root),
"--reports-root",
str(reports),
"--include-named-locations",
"true",
)
self.assertNotEqual(result.returncode, 0)
self.assertIn("Missing Entra export index for 'Named Locations'", result.stdout)
if __name__ == "__main__":
unittest.main()