1 Commits
v1.0.4 ... main

Author SHA1 Message Date
2c41eaca44 Sync from dev @ 497baf0
Source: main (497baf0)
Excluded: live tenant exports, generated artifacts, and dev-only tooling.
2026-04-21 22:21:43 +02:00
25 changed files with 2258 additions and 79 deletions

8
.gitignore vendored
View File

@@ -6,3 +6,11 @@ node_modules/
__pycache__/
*.py[cod]
*$py.class
package.json
package-lock.json
**/local.settings.json
# Azure Function deployment artifacts (copied/generated during zip build)
infra/change-probe/.python_packages/
infra/change-probe/scripts/
*.zip

View File

@@ -6,7 +6,7 @@ This repository tracks Git-based snapshots of Microsoft Intune and Entra ID conf
The implementation is centered on three Azure DevOps pipelines:
- `azure-pipelines.yml`: hourly backup/export pipeline with rolling PR management.
- `azure-pipelines.yml`: daily full backup/export pipeline with rolling PR management (previously hourly; now driven primarily by event-driven change probe).
- `azure-pipelines-review-sync.yml`: 20-minute reviewer-decision sync and post-merge remediation queue.
- `azure-pipelines-restore.yml`: manual or auto-queued restore pipeline for approved baseline rollback.
@@ -33,15 +33,18 @@ Workflow at a high level:
```
.
├── azure-pipelines.yml # Main hourly backup pipeline
├── azure-pipelines.yml # Main backup pipeline (daily snapshot + event-driven trigger)
├── azure-pipelines-review-sync.yml # 20-minute review sync
├── azure-pipelines-restore.yml # Baseline restore pipeline
├── scripts/ # Python automation helpers
├── tests/ # unittest coverage for scripts
├── tenant-state/ # Committed JSON exports and reports
├── tenant-state/ # Committed JSON exports and reports
│ ├── intune/
│ ├── entra/
│ └── reports/
├── infra/ # Azure Function App (change probe)
│ └── change-probe/
├── deploy/ # Infrastructure provisioning scripts
├── docs/ # Security review docs and roadmap
├── md2pdf/ # HTML/PDF styling and configs
├── prod-as-built.md # Generated as-built source
@@ -63,6 +66,8 @@ Workflow at a high level:
- `update_pr_review_summary.py`: refreshes PR descriptions with change counts, risk assessment, and optional AI narrative.
- `apply_reviewer_rejections.py`: processes `/reject` and `/accept` reviewer thread commands.
- `queue_post_merge_restore.py`: queues restore pipeline after merged PRs that contained `/reject` decisions.
- `probe_tenant_changes.py`: polls Intune/Entra audit logs via Graph, implements a debouncer (idle → armed → cooldown), and decides whether to trigger a backup.
- `trigger_backup_pipeline.py`: thin ADO REST API wrapper to queue the backup pipeline on demand.
## Code Style and Conventions
@@ -106,6 +111,17 @@ pip3 install "IntuneCD==2.5.0"
For local development, only a Python 3 interpreter is required; scripts use the standard library except for the optional IntuneCD package.
### Change Probe (Event-Driven Backup Trigger)
Because Microsoft Graph change notifications and delta queries do not support Intune device management or Conditional Access resources, an audit-log polling architecture is used instead:
- **Azure Function App** (`infra/change-probe/`):
- `probe_timer`: 5-minute timer trigger. Loads debouncer state from Azure Table Storage, runs `probe_tenant_changes.py`, writes state back, and emits a queue message when the debouncer triggers.
- `queue_consumer`: queue trigger. Dequeues messages and calls `trigger_backup_pipeline.py` to queue the ADO backup pipeline.
- **Debouncer**: 15-minute quiet window (idle → armed) + 30-minute cooldown. Prevents backup storms during bulk changes.
- **State**: stored in Azure Table Storage (`ProbeState` table).
- **Provisioning**: `deploy/provision-change-probe.ps1` creates the Entra app, grants admin consent, provisions Resource Group / Storage Account / Function App, and configures app settings.
### Pipeline Jobs
- **Intune backup job** (`backup_intune`):

View File

@@ -13,7 +13,7 @@ Quick start:
1. Fork or import this repository into an Azure DevOps project.
2. Review `templates/variables-tenant.yml` and create a matching Azure DevOps Variable Group in your project (e.g. `vg-astral-tenant`).
3. Uncomment the variable group reference in the three pipeline YAMLs.
4. Run `deploy/bootstrap-tenant.ps1` to create the Azure AD app registration, assign Graph permissions, and configure the federated credential.
4. Run `deploy/provision-change-probe.ps1` to create the Azure AD app registration, assign Graph permissions, create a client secret, and optionally provision the event-driven change probe (Azure Function App).
5. Create the Azure DevOps service connection using the app registration details from the bootstrap script.
6. Import the three pipelines (`azure-pipelines.yml`, `azure-pipelines-review-sync.yml`, `azure-pipelines-restore.yml`) into Azure DevOps.
7. Run `deploy/validate-deployment.yml` to verify connectivity and permissions.
@@ -25,7 +25,7 @@ See [`deploy/onboarding-runbook.md`](deploy/onboarding-runbook.md) for the full
The implementation is centered on three Azure DevOps pipelines:
- `azure-pipelines.yml`: hourly backup/export pipeline with rolling PR management.
- `azure-pipelines.yml`: backup/export pipeline with rolling PR management. Runs daily at 02:00 to generate a full tenant snapshot, reports, and documentation artifacts, and is also triggered on-demand by the event-driven change probe.
- `azure-pipelines-review-sync.yml`: 20-minute reviewer-decision sync and post-merge remediation queue.
- `azure-pipelines-restore.yml`: manual or auto-queued restore pipeline for approved baseline rollback.
@@ -39,6 +39,8 @@ The main workflow is:
6. Refresh the PR description with deterministic change/risk summary and optional Azure OpenAI narrative.
7. Apply reviewer `/reject` or `/accept` decisions and queue restore when needed.
An **event-driven change probe** monitors Intune and Entra audit logs and triggers the backup pipeline when actual changes are detected, replacing the previous hourly polling model.
This is an ex-post change-management model: admins can change settings in the Microsoft admin portals, and the repo turns those changes into auditable Git drift with a review and rollback path.
## Current Baseline Coverage
@@ -80,10 +82,12 @@ Current scope behavior:
- `azure-pipelines.yml`: backup/export, report generation, drift commit, rolling PR, and docs/artifact flow.
- `azure-pipelines-review-sync.yml`: reviewer decision sync and post-merge remediation helper.
- `azure-pipelines-restore.yml`: baseline restore pipeline with full or selective scope.
- `infra/change-probe/`: Azure Function App for event-driven change detection.
- `deploy/provision-change-probe.ps1`: unified provisioning script for the change probe infrastructure.
- `docs/m365-baseline-roadmap.md`: expansion roadmap beyond current workload scope.
- `docs/security-review-package.md`: implementation-focused security review package.
- `docs/security-review-questionnaire.md`: short-form security review answers.
- `scripts/`: export, reporting, PR automation, validation, and remediation helpers.
- `scripts/`: export, reporting, PR automation, validation, remediation helpers, and change probe logic.
- `tests/`: focused unit coverage for the Python helpers.
- `tenant-state/intune`: committed Intune JSON export.
- `tenant-state/entra`: committed Entra JSON export.
@@ -96,7 +100,7 @@ Current scope behavior:
### Main Backup Pipeline
`azure-pipelines.yml` runs hourly on `main`.
`azure-pipelines.yml` runs daily at 02:00 on `main` to generate a full tenant snapshot, reports, and documentation artifacts. It is also triggered on-demand by the change probe when drift is detected.
For Intune it:
@@ -143,10 +147,21 @@ It also supports optional Entra update when restore automation is triggered for
## Schedule And Run Modes
- Main backup schedule: hourly, `0 * * * *`, on `main`
- Main backup schedule: daily at 02:00, `0 2 * * *`, on `main` (full snapshot, reports, and docs)
- Change probe trigger: event-driven, on-demand via Azure Function App
- Review sync schedule: every 20 minutes, `*/20 * * * *`, on `main`
- Full mode: configured full-run hour (default 00:00) or manual queue with `forceFullRun=true`
- Light mode: every other scheduled hour
- Light mode: probe-triggered runs (the daily 02:00 scheduled run is the full run)
### Change Probe (Event-Driven Backup Trigger)
Because Microsoft Graph change notifications and delta queries do not support Intune device management or Conditional Access resources, an audit-log polling architecture is used:
- **`probe_timer`** (5-minute timer trigger): polls Intune and Entra audit logs via Microsoft Graph, evaluates a debouncer state machine (idle → armed → cooldown), and emits a queue message when the quiet window elapses.
- **`queue_consumer`** (queue trigger): dequeues messages and calls the Azure DevOps REST API to queue the backup pipeline.
- **Debouncer**: 15-minute quiet window + 30-minute cooldown prevents backup storms during bulk changes.
- **State**: stored in Azure Table Storage (`ProbeState` table).
- **Provisioning**: `deploy/provision-change-probe.ps1` creates the Entra app, grants admin consent, provisions Resource Group / Storage Account / Function App, and configures app settings.
Full mode adds:
@@ -251,6 +266,14 @@ Auto-remediation:
- `AUTO_REMEDIATE_MAX_WORKERS`
- `AUTO_REMEDIATE_EXCLUDE_CSV`
Change probe settings:
- `PROBE_APP_ID`
- `PROBE_APP_SECRET`
- `PROBE_QUIET_WINDOW_MINUTES` (default: 15)
- `PROBE_COOLDOWN_MINUTES` (default: 30)
- `GRAPH_TOKEN` (optional passthrough)
Azure OpenAI integration:
- `ENABLE_PR_AI_SUMMARY`
@@ -408,6 +431,28 @@ python3 ./scripts/validate_backup_outputs.py \
--reports-root ./tenant-state/reports/intune
```
Run the change probe locally:
```bash
python3 ./scripts/probe_tenant_changes.py \
--app-id "$PROBE_APP_ID" \
--app-secret "$PROBE_APP_SECRET" \
--tenant-id "$TENANT_ID" \
--state-file ./probe-state.json \
--output ./probe-result.json
```
Trigger the backup pipeline manually:
```bash
python3 ./scripts/trigger_backup_pipeline.py \
--organization cqre \
--project Intune \
--pipeline-id 1 \
--token "$ADO_TOKEN" \
--branch refs/heads/main
```
## Tests
The repository includes focused unit tests for:

View File

@@ -6,8 +6,8 @@ parameters:
default: false
schedules:
- cron: "0 * * * *"
displayName: "Hourly backup (full run at configured timezone)"
- cron: "0 2 * * *"
displayName: "Daily full backup and report generation"
branches:
include:
- main
@@ -369,6 +369,19 @@ jobs:
workingDirectory: "$(Build.SourcesDirectory)"
failOnStderr: true
- task: Bash@3
displayName: Revert formatting-only Intune JSON exports
inputs:
targetType: inline
script: |
set -euo pipefail
python3 "$(Build.SourcesDirectory)/scripts/filter_intune_formatting_noise.py" \
--repo-root "$(Build.SourcesDirectory)" \
--backup-root "$(Build.SourcesDirectory)/$(BACKUP_FOLDER)/$(INTUNE_BACKUP_SUBDIR)" \
--baseline-ref "origin/$(BASELINE_BRANCH)"
workingDirectory: "$(Build.SourcesDirectory)"
failOnStderr: true
- task: Bash@3
displayName: Resolve assignment group names
inputs:
@@ -1017,13 +1030,14 @@ jobs:
}
$backupStart = [DateTime]::ParseExact("$(BACKUP_START)", "yyyy.MM.dd:HH.mm.ss", $null).ToUniversalTime()
$filterDateTimeTo = Get-Date -Date $backupStart -Format "yyyy-MM-ddTHH:mm:ss"
$auditQueryEnd = $backupStart.AddMinutes(-10)
$filterDateTimeTo = Get-Date -Date $auditQueryEnd -Format "yyyy-MM-ddTHH:mm:ss"
$filter += "ActivityDateTime le $filterDateTimeTo`Z"
$eventFilter = $filter -join " and "
"`nGetting Intune event logs"
"`t- from: '$lastCommitDate' (UTC) to: '$backupStart' (UTC)"
"`t- from: '$lastCommitDate' (UTC) to: '$auditQueryEnd' (UTC)"
"`t- filter: $eventFilter"
$modificationEvent = Get-MgDeviceManagementAuditEvent -Filter $eventFilter -All

View File

@@ -24,6 +24,8 @@ Expected result: **zero matches** outside of this release checklist.
- [ ] `azure-pipelines-restore.yml` contains no hardcoded tenant domain, email, or service connection name.
- [ ] `azure-pipelines-review-sync.yml` contains no hardcoded tenant-specific values.
- [ ] `scripts/common.py` uses a generic fallback name (not `CQRE_Intune_Backupper`).
- [ ] `infra/change-probe/` contains no tenant-specific IDs, secrets, or connection strings.
- [ ] `infra/change-probe/local.settings.json` is excluded (only `.example` should exist).
- [ ] `tenant-state/` contains only placeholder files (`.gitkeep`, `README.md`).
- [ ] `prod-as-built.md` has been deleted.
- [ ] All markdown documentation uses generic examples (`contoso.onmicrosoft.com`, `astral-backup@contoso.com`, `sc-astral-backup`).

View File

@@ -130,6 +130,64 @@ After importing `azure-pipelines-restore.yml`, find its definition ID:
2. Set `forceFullRun=true` to get a complete initial snapshot.
3. Verify that `tenant-state/` is populated and a rolling PR is created.
## Step 11: Provision the event-driven change probe (optional but recommended)
The change probe replaces the previous hourly polling model with responsive, event-driven backup triggers.
### Option A: Automated provisioning
Run the unified provisioning script:
```powershell
.\deploy\provision-change-probe.ps1 `
    -ResourceGroup "rg-astral-probe" `
    -Location "westeurope" `
    -AdoOrganization "contoso" `
    -AdoProject "ASTRAL" `
    -AdoPipelineId "42"
```
The script will create an Entra app, grant admin consent, provision Azure resources, and optionally deploy the Function App code when Azure Functions Core Tools are installed.
### Option B: Manual provisioning
If you prefer manual setup:
1. **Create an app registration** in Entra ID for the probe.
2. **Grant admin consent** for:
- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementApps.Read.All`
- `AuditLog.Read.All`
- `Directory.Read.All`
3. **Create a client secret** and note the value.
4. **Provision Azure resources**:
- Resource Group
- Storage Account (Standard LRS)
- Function App (Linux Consumption, Python 3.11)
5. **Configure Function App settings**:
| Setting | Value |
|---|---|
| `AzureWebJobsStorage` | Storage account connection string |
| `PROBE_APP_ID` | App registration client ID |
| `PROBE_APP_SECRET` | App registration client secret |
| `TENANT_ID` | Your Microsoft 365 tenant ID |
| `ADO_ORGANIZATION` | Your Azure DevOps org name |
| `ADO_PROJECT` | Your Azure DevOps project name |
| `ADO_PIPELINE_ID` | Definition ID of `azure-pipelines.yml` |
| `ADO_TOKEN` | Azure DevOps PAT with **Build (read & execute)** |
| `ADO_BRANCH` | `main` (or your baseline branch) |
6. **Deploy the function package** using `WEBSITE_RUN_FROM_PACKAGE` (see `infra/change-probe/README.md`).
### Verify the probe
1. Make a test change in Intune (e.g., create a temporary device configuration profile).
2. Wait 5–20 minutes for the audit log to propagate.
3. Check the `ProbeState` table in your Storage Account — the `singleton/default` entity should show `debouncer.state = armed`.
4. After the quiet window (default 15 min) elapses, a queue message will be emitted.
5. The `queue_consumer` will dequeue it and queue the backup pipeline.
6. Verify the pipeline run appears in Azure DevOps with reason `manual` (API-triggered runs show as manual).
> **Note:** By default the probe uses its own Entra app registration, separate from the main backup pipeline identity. You can instead reuse the app registration created by `bootstrap-tenant.ps1` if you add the `AuditLog.Read.All` permission and create a client secret for it.
## Optional: progressive feature rollout
| Phase | What to enable |

578
deploy/provision-change-probe.ps1 Executable file
View File

@@ -0,0 +1,578 @@
#requires -Version 5.1
<#
.SYNOPSIS
One-stop provisioning script for the ASTRAL change probe.
.DESCRIPTION
This script handles the entire probe deployment in one pass:
1. Creates (or updates) a dedicated Entra app registration with Graph permissions.
2. Grants admin consent.
3. Provisions Azure resources (Resource Group, Storage Account, Function App).
4. Configures Function App settings.
5. Optionally deploys the function code if the Azure Functions Core Tools (func) are installed.
Any parameter omitted on the command line is prompted for interactively.
.PARAMETER AppDisplayName
Display name for the Entra app registration. Default: "ASTRAL Change Probe".
.PARAMETER ResourceGroup
Azure resource group name. Default: "rg-astral-probe".
.PARAMETER Location
Azure region. Default: "westeurope".
.PARAMETER SubscriptionId
Azure subscription ID. If omitted, the current default subscription is used.
.PARAMETER AdoOrganization
Azure DevOps organization name (e.g. "contoso").
.PARAMETER AdoProject
Azure DevOps project name.
.PARAMETER AdoPipelineId
Azure DevOps pipeline ID (numeric).
.PARAMETER AdoToken
Azure DevOps Personal Access Token with Build (Read & Execute) scope.
.PARAMETER AdoBranch
Git branch the pipeline should run against. Default: "main".
.PARAMETER QuietWindowMinutes
Debouncer quiet window. Default: 15.
.PARAMETER CooldownMinutes
Debouncer cooldown. Default: 30.
.EXAMPLE
.\provision-change-probe.ps1
.EXAMPLE
.\provision-change-probe.ps1 -AdoOrganization "cqre" -AdoProject "ASTRAL" -AdoPipelineId "42"
#>
[CmdletBinding()]
param (
[string]$AppDisplayName = "ASTRAL Change Probe",
[string]$ResourceGroup = "rg-astral-probe",
[string]$Location = "westeurope",
[string]$SubscriptionId = "",
[string]$AdoOrganization = "",
[string]$AdoProject = "",
[string]$AdoPipelineId = "",
[string]$AdoToken = "",
[string]$AdoBranch = "main",
[int]$QuietWindowMinutes = 15,
[int]$CooldownMinutes = 30
)
$ErrorActionPreference = "Stop"
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
function Get-OrPrompt {
param ([string]$Value, [string]$Prompt, [switch]$Sensitive)
if ($Value) { return $Value }
if ($Sensitive) {
return Read-Host -Prompt $Prompt -AsSecureString | ForEach-Object { [PSCredential]::New("x", $_).GetNetworkCredential().Password }
}
return Read-Host -Prompt $Prompt
}
function Test-Command {
param ([string]$Name)
return [bool](Get-Command $Name -ErrorAction SilentlyContinue)
}
function Invoke-AzCli {
param (
[string[]]$ArgumentList,
[switch]$NoRetry
)
# Clone the array so recursive calls don't double-append --subscription.
$argsCopy = @() + $ArgumentList
if ($SubscriptionId) {
$argsCopy += @("--subscription", $SubscriptionId)
}
# Suppress Python SyntaxWarnings that leak from the Azure CLI into stderr/stdout.
$env:PYTHONWARNINGS = "ignore"
$output = & az @argsCopy 2>&1
$env:PYTHONWARNINGS = ""
if ($LASTEXITCODE -ne 0) {
$outputStrings = @()
$hasSubNotFound = $false
foreach ($line in $output) {
$str = if ($line -is [string]) { $line } else { $line.ToString() }
$outputStrings += $str
if ($str -match "SubscriptionNotFound") { $hasSubNotFound = $true }
}
$outputString = $outputStrings -join "`n"
if ((-not $NoRetry) -and $hasSubNotFound) {
Write-Host "`nARM returned SubscriptionNotFound. Clearing token cache and re-authenticating..." -ForegroundColor Yellow
$subTenantId = Get-SubscriptionTenantId -SubId $SubscriptionId
$promptTenant = if ($subTenantId) { $subTenantId } else { $tenantId }
& az account clear | Out-Null
& az login --tenant $promptTenant | Out-Host
if ($LASTEXITCODE -ne 0) { throw "az login --tenant $promptTenant failed." }
# Explicitly set subscription and give token cache time to settle.
& az account set --subscription $SubscriptionId | Out-Null
Start-Sleep -Seconds 2
Invoke-AzCli -ArgumentList $ArgumentList -NoRetry
return
}
throw "az command failed: az $($argsCopy -join ' ')`n$outputString"
}
return $output
}
function Test-ModuleInstalled {
param ([string]$Name)
$mod = Get-Module -ListAvailable -Name $Name | Select-Object -First 1
if (-not $mod) {
Write-Host "Installing module: $Name" -ForegroundColor Cyan
Install-Module $Name -Scope CurrentUser -Force -AllowClobber
}
}
# ---------------------------------------------------------------------------
# Prerequisites
# ---------------------------------------------------------------------------
Write-Host "=== ASTRAL Change Probe Provisioning ===" -ForegroundColor Green
if (-not (Test-Command "az")) {
throw "Azure CLI (az) is not installed or not in PATH. Install from https://aka.ms/installazurecli"
}
Write-Host "Checking Microsoft Graph modules..." -ForegroundColor Cyan
Test-ModuleInstalled "Microsoft.Graph.Applications"
Import-Module Microsoft.Graph.Applications
# ---------------------------------------------------------------------------
# Interactive prompts
# ---------------------------------------------------------------------------
Write-Host "`n--- Azure DevOps Settings ---" -ForegroundColor Cyan
$AdoOrganization = Get-OrPrompt -Value $AdoOrganization -Prompt "Azure DevOps Organization (e.g. 'cqre')"
$AdoProject = Get-OrPrompt -Value $AdoProject -Prompt "Azure DevOps Project"
$AdoPipelineId = Get-OrPrompt -Value $AdoPipelineId -Prompt "Azure DevOps Pipeline ID (numeric)"
$AdoToken = Get-OrPrompt -Value $AdoToken -Prompt "Azure DevOps PAT (Build Read & Execute)" -Sensitive
# ---------------------------------------------------------------------------
# Graph authentication
# ---------------------------------------------------------------------------
Write-Host "`nConnecting to Microsoft Graph..." -ForegroundColor Cyan
Connect-MgGraph -Scopes "Application.ReadWrite.All","AppRoleAssignment.ReadWrite.All","Directory.Read.All" -NoWelcome
$tenant = Get-MgOrganization | Select-Object -First 1
Write-Host "Tenant: $($tenant.DisplayName) ($($tenant.Id))" -ForegroundColor Green
# ---------------------------------------------------------------------------
# App registration
# ---------------------------------------------------------------------------
$requiredPermissions = @(
"AuditLog.Read.All",
"DeviceManagementApps.Read.All",
"DeviceManagementConfiguration.Read.All",
"DeviceManagementManagedDevices.Read.All",
"DeviceManagementScripts.Read.All",
"DeviceManagementServiceConfig.Read.All"
)
$graphSp = Get-MgServicePrincipal -Filter "appId eq '00000003-0000-0000-c000-000000000000'"
if (-not $graphSp) { throw "Microsoft Graph service principal not found." }
$appRoles = @()
foreach ($permName in $requiredPermissions) {
$appRole = $graphSp.AppRoles | Where-Object { $_.Value -eq $permName } | Select-Object -First 1
if (-not $appRole) {
Write-Warning "Permission '$permName' not found. Skipping."
continue
}
$appRoles += $appRole
}
$resourceAccess = @()
foreach ($ar in $appRoles) {
$resourceAccess += @{ id = $ar.Id; type = "Role" }
}
$requiredResourceAccess = @(
@{
resourceAppId = $graphSp.AppId
resourceAccess = $resourceAccess
}
)
$existingApp = Get-MgApplication -Filter "displayName eq '$AppDisplayName'" | Select-Object -First 1
if ($existingApp) {
Write-Host "Found existing app registration: $($existingApp.AppId)" -ForegroundColor Yellow
$app = $existingApp
Update-MgApplication -ApplicationId $app.Id -RequiredResourceAccess $requiredResourceAccess
Write-Host "Updated required resource access." -ForegroundColor Green
} else {
Write-Host "Creating app registration: $AppDisplayName" -ForegroundColor Cyan
$app = New-MgApplication -DisplayName $AppDisplayName -SignInAudience "AzureADMyOrg" -RequiredResourceAccess $requiredResourceAccess
Write-Host "Created app registration. AppId: $($app.AppId)" -ForegroundColor Green
}
$sp = Get-MgServicePrincipal -Filter "appId eq '$($app.AppId)'" | Select-Object -First 1
if (-not $sp) {
Write-Host "Creating service principal..." -ForegroundColor Cyan
$sp = New-MgServicePrincipal -AppId $app.AppId
}
Write-Host "Granting admin consent..." -ForegroundColor Cyan
foreach ($ar in $appRoles) {
$existingAssignment = Get-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id | Where-Object { $_.AppRoleId -eq $ar.Id }
if (-not $existingAssignment) {
New-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id -PrincipalId $sp.Id -ResourceId $graphSp.Id -AppRoleId $ar.Id | Out-Null
}
}
Write-Host "Admin consent granted." -ForegroundColor Green
# Client secret
$secretDescription = "ChangeProbeSecret"
$appWithCreds = Get-MgApplication -ApplicationId $app.Id -Property "id,passwordCredentials"
$existingSecrets = $appWithCreds.PasswordCredentials | Where-Object { $_.DisplayName -eq $secretDescription }
foreach ($cred in $existingSecrets) {
Write-Host "Removing old client secret ($($cred.KeyId))..." -ForegroundColor Yellow
Remove-MgApplicationPassword -ApplicationId $app.Id -BodyParameter @{ "keyId" = $cred.KeyId }
}
Write-Host "Creating new client secret (valid 1 year)..." -ForegroundColor Cyan
$passwordCred = @{
displayName = $secretDescription
endDateTime = (Get-Date).AddYears(1).ToString("o")
}
$secret = Add-MgApplicationPassword -ApplicationId $app.Id -BodyParameter $passwordCred
$probeAppId = $app.AppId
$probeAppSecret = $secret.SecretText
$tenantId = $tenant.Id
# ---------------------------------------------------------------------------
# Azure authentication
# ---------------------------------------------------------------------------
Write-Host "`n--- Azure Resources ---" -ForegroundColor Cyan
function Ensure-AzLogin {
param ([string]$TenantId)
try {
$null = Invoke-AzCli -ArgumentList @("account", "show", "--output", "none")
} catch {
if ($_ -match "az login") {
$answer = Read-Host -Prompt "You are not logged in to Azure CLI. Run 'az login' now? [Y/n]"
if ($answer -eq "" -or $answer -match "^[Yy]") {
if ($TenantId) {
& az login --tenant $TenantId | Out-Host
} else {
& az login | Out-Host
}
if ($LASTEXITCODE -ne 0) {
throw "az login failed. Please run 'az login' manually and retry."
}
} else {
throw "Azure login required. Run 'az login' and retry."
}
} else {
throw
}
}
}
Ensure-AzLogin -TenantId $tenantId
function Select-Subscription {
param ([string]$CurrentId)
# Run az directly and filter out stderr warning objects so only stdout strings reach ConvertFrom-Json.
$lines = & az account list --output json 2>&1
$stringLines = $lines | Where-Object { $_ -is [string] }
if ($LASTEXITCODE -ne 0) {
$errorLines = $lines | Where-Object { $_ -is [System.Management.Automation.ErrorRecord] } | ForEach-Object { $_.ToString() }
throw "az account list failed:`n$($errorLines -join "`n")"
}
$subs = ($stringLines -join "`n") | ConvertFrom-Json
if ($subs.Count -eq 0) {
throw "No Azure subscriptions found. Ensure your account has access to at least one subscription."
}
if ($subs.Count -eq 1) {
$sub = $subs[0]
Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $sub.id)
return $sub
}
Write-Host "`nAvailable subscriptions:" -ForegroundColor Cyan
for ($i = 0; $i -lt $subs.Count; $i++) {
$marker = if ($subs[$i].id -eq $CurrentId) { " (*)" } else { "" }
Write-Host " [$i] $($subs[$i].name) ($($subs[$i].id))$marker"
}
$selection = Read-Host -Prompt "Select subscription by number"
if (-not [int]::TryParse($selection, [ref]$null)) {
throw "Invalid selection. Aborting."
}
$chosen = $subs[[int]$selection]
if (-not $chosen) {
throw "Invalid selection. Aborting."
}
Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $chosen.id)
return $chosen
}
$azLines = & az account show --output json 2>&1
$azStringLines = $azLines | Where-Object { $_ -is [string] }
if ($LASTEXITCODE -ne 0) {
$azErrorLines = $azLines | Where-Object { $_ -is [System.Management.Automation.ErrorRecord] } | ForEach-Object { $_.ToString() }
throw "az account show failed:`n$($azErrorLines -join "`n")"
}
$azAccount = ($azStringLines -join "`n") | ConvertFrom-Json
$currentSubId = $azAccount.id
function Get-SubscriptionTenantId {
param ([string]$SubId)
$lines = & az account list --output json 2>&1
$stringLines = $lines | Where-Object { $_ -is [string] }
$subs = ($stringLines -join "`n") | ConvertFrom-Json
$sub = $subs | Where-Object { $_.id -eq $SubId } | Select-Object -First 1
if ($sub) { return $sub.tenantId } else { return $null }
}
if ($SubscriptionId) {
Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $SubscriptionId)
$subTenantId = Get-SubscriptionTenantId -SubId $SubscriptionId
$azTenantLines = & az account show --query tenantId --output tsv 2>&1 | Where-Object { $_ -is [string] }
$azTenantId = ($azTenantLines -join "").Trim()
if ($subTenantId -and $azTenantId -ne $subTenantId) {
Write-Host "`nSubscription '$SubscriptionId' belongs to tenant '$subTenantId' but current az context is '$azTenantId'." -ForegroundColor Yellow
Write-Host "Re-authenticating to the subscription's tenant..." -ForegroundColor Yellow
& az account clear | Out-Null
& az login --tenant $subTenantId | Out-Host
if ($LASTEXITCODE -ne 0) { throw "az login --tenant $subTenantId failed." }
Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $SubscriptionId)
}
Write-Host "Using specified subscription: $SubscriptionId" -ForegroundColor Green
} else {
$chosenSub = Select-Subscription -CurrentId $currentSubId
$SubscriptionId = $chosenSub.id
$subTenantId = $chosenSub.tenantId
$azTenantLines = & az account show --query tenantId --output tsv 2>&1 | Where-Object { $_ -is [string] }
$azTenantId = ($azTenantLines -join "").Trim()
if ($subTenantId -and $azTenantId -ne $subTenantId) {
Write-Host "`nSubscription '$SubscriptionId' belongs to tenant '$subTenantId' but current az context is '$azTenantId'." -ForegroundColor Yellow
Write-Host "Re-authenticating to the subscription's tenant..." -ForegroundColor Yellow
& az account clear | Out-Null
& az login --tenant $subTenantId | Out-Host
if ($LASTEXITCODE -ne 0) { throw "az login --tenant $subTenantId failed." }
$chosenSub = Select-Subscription -CurrentId $SubscriptionId
$SubscriptionId = $chosenSub.id
}
Write-Host "Using subscription: $SubscriptionId" -ForegroundColor Green
}
# Validate the subscription is accessible for ARM operations (catches tenant mismatches).
try {
$null = Invoke-AzCli -ArgumentList @("group", "list", "--output", "none")
} catch {
if ($_ -match "SubscriptionNotFound") {
Write-Host "`nThe selected subscription is listed but ARM operations fail with 'SubscriptionNotFound'." -ForegroundColor Yellow
Write-Host "This usually means the subscription belongs to a different Entra tenant." -ForegroundColor Yellow
$subTenantId = Get-SubscriptionTenantId -SubId $SubscriptionId
$promptTenant = if ($subTenantId) { $subTenantId } else { $tenantId }
$answer = Read-Host -Prompt "Run 'az login --tenant $promptTenant' now and retry? [Y/n]"
if ($answer -eq "" -or $answer -match "^[Yy]") {
& az account clear | Out-Null
& az login --tenant $promptTenant | Out-Host
if ($LASTEXITCODE -ne 0) {
throw "az login --tenant failed. Please run it manually and retry."
}
$chosenSub = Select-Subscription -CurrentId $SubscriptionId
$SubscriptionId = $chosenSub.id
Write-Host "Using subscription: $SubscriptionId" -ForegroundColor Green
# Validate again
$null = Invoke-AzCli -ArgumentList @("group", "list", "--output", "none")
} else {
throw "Subscription validation failed. Run 'az login --tenant $promptTenant' and retry."
}
} else {
throw
}
}
# ---------------------------------------------------------------------------
# Resource Group
# ---------------------------------------------------------------------------
Write-Host "Ensuring resource group '$ResourceGroup'..." -ForegroundColor Cyan
Invoke-AzCli -ArgumentList @("group", "create", "--name", $ResourceGroup, "--location", $Location, "--output", "none")
# Quick diagnostic: confirm ARM can read back the RG in this subscription.
try {
$diag = Invoke-AzCli -ArgumentList @("group", "show", "--name", $ResourceGroup, "--query", "id", "--output", "tsv")
Write-Host "ARM context OK (RG id: $diag)" -ForegroundColor Green
} catch {
Write-Host "WARNING: ARM diagnostic failed: $_" -ForegroundColor Yellow
}
# ---------------------------------------------------------------------------
# Storage Account
# ---------------------------------------------------------------------------
$randomSuffix = [System.Guid]::NewGuid().ToString("n").Substring(0, 8)
$StorageName = "stastralprobe$randomSuffix"
$FunctionAppName = "func-astral-probe-$randomSuffix"
function Wait-ProviderRegistration {
param ([string]$Namespace)
$state = ""
$attempts = 0
while ($state -ne "Registered" -and $attempts -lt 30) {
$state = Invoke-AzCli -ArgumentList @("provider", "show", "--namespace", $Namespace, "--query", "registrationState", "--output", "tsv")
if ($state -eq "Registered") { break }
Start-Sleep -Seconds 10
$attempts++
}
if ($state -ne "Registered") {
throw "Timed out waiting for $Namespace provider to register."
}
}
Write-Host "Creating storage account '$StorageName'..." -ForegroundColor Cyan
# Ensure Microsoft.Storage provider is registered (required for new subscriptions).
$storageProv = Invoke-AzCli -ArgumentList @("provider", "show", "--namespace", "Microsoft.Storage", "--query", "registrationState", "--output", "tsv")
if ($storageProv -ne "Registered") {
Write-Host "Registering Microsoft.Storage provider..." -ForegroundColor Yellow
Invoke-AzCli -ArgumentList @("provider", "register", "--namespace", "Microsoft.Storage")
Wait-ProviderRegistration -Namespace "Microsoft.Storage"
Write-Host "Microsoft.Storage registered." -ForegroundColor Green
}
Invoke-AzCli -ArgumentList @(
"storage", "account", "create",
"--name", $StorageName,
"--resource-group", $ResourceGroup,
"--location", $Location,
"--sku", "Standard_LRS",
"--kind", "StorageV2",
"--output", "none"
)
$storageConnection = Invoke-AzCli -ArgumentList @(
"storage", "account", "show-connection-string",
"--name", $StorageName,
"--resource-group", $ResourceGroup,
"--query", "connectionString",
"--output", "tsv"
)
# ---------------------------------------------------------------------------
# Table and Queue
# ---------------------------------------------------------------------------
Write-Host "Creating Table and Queue..." -ForegroundColor Cyan
Invoke-AzCli -ArgumentList @("storage", "table", "create", "--name", "ProbeState", "--connection-string", $storageConnection, "--output", "none")
Invoke-AzCli -ArgumentList @("storage", "queue", "create", "--name", "backup-trigger-queue", "--connection-string", $storageConnection, "--output", "none")
# ---------------------------------------------------------------------------
# Function App
# ---------------------------------------------------------------------------
# Ensure Microsoft.Web provider is registered (required for Function Apps).
$webProv = Invoke-AzCli -ArgumentList @("provider", "show", "--namespace", "Microsoft.Web", "--query", "registrationState", "--output", "tsv")
if ($webProv -ne "Registered") {
Write-Host "Registering Microsoft.Web provider..." -ForegroundColor Yellow
Invoke-AzCli -ArgumentList @("provider", "register", "--namespace", "Microsoft.Web")
Wait-ProviderRegistration -Namespace "Microsoft.Web"
Write-Host "Microsoft.Web registered." -ForegroundColor Green
}
Write-Host "Creating Function App '$FunctionAppName'..." -ForegroundColor Cyan
Invoke-AzCli -ArgumentList @(
"functionapp", "create",
"--name", $FunctionAppName,
"--resource-group", $ResourceGroup,
"--storage-account", $StorageName,
"--consumption-plan-location", $Location,
"--os-type", "Linux",
"--runtime", "python",
"--runtime-version", "3.11",
"--functions-version", "4",
"--output", "none"
)
# ---------------------------------------------------------------------------
# App Settings
# ---------------------------------------------------------------------------
Write-Host "Configuring Function App settings..." -ForegroundColor Cyan
Invoke-AzCli -ArgumentList @(
"functionapp", "config", "appsettings", "set",
"--name", $FunctionAppName,
"--resource-group", $ResourceGroup,
"--settings",
"AzureWebJobsStorage=$storageConnection",
"FUNCTIONS_EXTENSION_VERSION=~4",
"FUNCTIONS_WORKER_RUNTIME=python",
"WEBSITE_RUN_FROM_PACKAGE=1",
"PROBE_APP_ID=$probeAppId",
"PROBE_APP_SECRET=$probeAppSecret",
"TENANT_ID=$tenantId",
"GRAPH_TOKEN=",
"ADO_ORGANIZATION=$AdoOrganization",
"ADO_PROJECT=$AdoProject",
"ADO_PIPELINE_ID=$AdoPipelineId",
"ADO_TOKEN=$AdoToken",
"ADO_BRANCH=$AdoBranch",
"PROBE_QUIET_WINDOW_MINUTES=$QuietWindowMinutes",
"PROBE_COOLDOWN_MINUTES=$CooldownMinutes",
"REPO_ROOT=/home/site/wwwroot",
"--output", "none"
)
# ---------------------------------------------------------------------------
# Optional: code deployment
# ---------------------------------------------------------------------------
$funcAvailable = Test-Command "func"
if ($funcAvailable) {
$repoRoot = Split-Path -Parent $PSScriptRoot
$probePath = Join-Path (Join-Path $repoRoot "infra") "change-probe"  # nested Join-Path keeps Windows PowerShell 5.1 compatibility
if (Test-Path $probePath) {
$deployNow = Read-Host -Prompt "`nDeploy function code now? [Y/n]"
if ($deployNow -eq "" -or $deployNow -match "^[Yy]") {
Write-Host "Deploying function code..." -ForegroundColor Cyan
Push-Location $probePath
try {
& func azure functionapp publish $FunctionAppName
if ($LASTEXITCODE -ne 0) {
Write-Warning "Function deployment returned exit code $LASTEXITCODE. You can retry manually later."
}
} finally {
Pop-Location
}
}
}
} else {
Write-Host "`nAzure Functions Core Tools (func) not found. Skipping code deployment." -ForegroundColor Yellow
Write-Host "Install from https://github.com/Azure/azure-functions-core-tools#installing" -ForegroundColor Yellow
}
# ---------------------------------------------------------------------------
# Summary
# ---------------------------------------------------------------------------
Write-Host "`n=== Provisioning Complete ===" -ForegroundColor Green
Write-Host "Subscription: $SubscriptionId"
Write-Host "Resource Group: $ResourceGroup"
Write-Host "Storage Account: $StorageName"
Write-Host "Function App: $FunctionAppName"
Write-Host "App Registration: $probeAppId"
Write-Host "`nNext steps:"
Write-Host " - Verify the timer trigger in the Azure Portal or with:"
Write-Host " az functionapp function show --name $FunctionAppName --resource-group $ResourceGroup --function-name probe_timer"
Write-Host " - To redeploy code later:"
Write-Host " cd infra/change-probe && func azure functionapp publish $FunctionAppName"

View File

@@ -2,7 +2,7 @@
# ASTRAL Security Review Package
Prepared: 2026-03-27
Prepared: 2026-04-20
## Purpose
@@ -59,34 +59,43 @@ Important clarifications:
| Azure DevOps pipeline `azure-pipelines.yml` | Scheduled backup, drift commit, rolling PR management, documentation artifact publishing | Main execution path |
| Azure DevOps pipeline `azure-pipelines-review-sync.yml` | Processes reviewer `/reject` and `/accept` decisions and refreshes PR summaries | Uses Azure DevOps API token |
| Azure DevOps pipeline `azure-pipelines-restore.yml` | Restores approved baseline to tenant | Write-capable path |
| Azure Function App (`infra/change-probe`) | Event-driven probe: polls audit logs, debounces, triggers backup pipeline on demand | Outbound-only; uses separate Entra app registration |
| Azure Table Storage | Persists probe debouncer state (`ProbeState` table) | No sensitive tenant data |
| Azure Queue Storage | Receives trigger messages from probe timer for queue consumer | No sensitive tenant data |
| Azure DevOps Git repository | Stores approved baseline, drift branches, JSON exports, reports, docs | Primary configuration store |
| Microsoft Graph | Source of Intune and Entra configuration; optional target for restore | Production tenant access |
| Azure DevOps REST APIs | PR creation/update, review thread sync, restore queueing | Change-management control plane |
| Microsoft Graph | Source of Intune and Entra configuration; optional target for restore; audit log source for probe | Production tenant access |
| Azure DevOps REST APIs | PR creation/update, review thread sync, restore queueing, pipeline trigger | Change-management control plane |
| Optional Azure OpenAI | PR summary generation only | Optional data egress path |
### High-Level Flow
```mermaid
flowchart LR
A["Azure DevOps scheduled pipeline"] --> B["Federated service connection"]
B --> C["Microsoft Graph"]
A --> D["Git repo: main + drift branches"]
A --> E["Azure DevOps PR and thread APIs"]
A --> F["Build artifacts: markdown / HTML / PDF"]
A -. optional .-> G["Azure OpenAI"]
H["Reviewer in Azure DevOps"] --> E
E --> I["Rolling PR approval / rejection"]
I -. optional remediation .-> J["Restore pipeline"]
J --> C
A["Azure Function App<br/>probe_timer"] --> B["Microsoft Graph<br/>audit logs"]
A --> C["Azure Table Storage<br/>ProbeState"]
A --> D["Azure Queue Storage<br/>backup-trigger-queue"]
E["Azure Function App<br/>queue_consumer"] --> D
E --> F["Azure DevOps REST API<br/>queue pipeline run"]
G["Azure DevOps scheduled pipeline<br/>daily snapshot + reports"] --> H["Federated service connection"]
H --> B
G --> I["Git repo: main + drift branches"]
G --> J["Azure DevOps PR and thread APIs"]
G --> K["Build artifacts: markdown / HTML / PDF"]
G -. optional .-> L["Azure OpenAI"]
M["Reviewer in Azure DevOps"] --> J
J --> N["Rolling PR approval / rejection"]
N -. optional remediation .-> O["Restore pipeline"]
O --> B
```
## Deployment Model
### Backup and Review
The main pipeline runs hourly on `main`.
The main pipeline runs daily at 02:00 on `main` to generate a full tenant snapshot, reports, and documentation artifacts. The primary trigger is the event-driven change probe, which queues the pipeline on demand when drift is detected.
- Every hour: export Intune and Entra configuration, generate reports, commit drift to rolling workload branches, and update one rolling PR per workload.
- On change detection: the probe timer polls audit logs every 5 minutes. After a 15-minute quiet window with no new events, it queues the backup pipeline.
- Daily at 02:00: export Intune and Entra configuration, generate reports, commit drift to rolling workload branches, and update one rolling PR per workload.
- When delayed reviewer notifications are enabled, newly created rolling PRs are opened as Azure DevOps draft PRs, the automated summary is inserted, and the PR is then published for reviewer notification.
- At the configured full-run hour: perform the same work plus documentation artifact generation (Markdown, and optionally HTML/PDF if browser dependencies are available).
@@ -129,6 +138,7 @@ It supports:
| Generated reports | assignment inventories, object inventories, app inventories | Derived from exported configuration | `tenant-state/reports/**` and build artifacts |
| Documentation artifacts | split markdown, optional HTML/PDF | Derived from exported configuration | build artifacts |
| Review metadata | PR descriptions, review threads, accept/reject commands | Azure DevOps reviewers | Azure DevOps PR APIs |
| Probe state | debouncer state (timestamps, enum values) | Derived from audit log evaluation | Azure Table Storage (`ProbeState`) |
| Optional AI summary payload | sampled changed paths, semantic change descriptions, deterministic summary, fingerprints | Derived from repo diff | Azure OpenAI request payload |
### Data Sensitivity Notes
@@ -145,6 +155,8 @@ It supports:
The pipelines obtain a Microsoft Graph access token at runtime using the Azure DevOps service connection configured in `SERVICE_CONNECTION_NAME` (e.g. `sc-astral-backup`).
The change probe uses a **separate Entra app registration** (`ASTRAL Change Probe`) with its own client credentials to authenticate to Microsoft Graph for audit log polling. This app is created by `deploy/provision-change-probe.ps1` and is distinct from the pipeline service connection identity.
Observed controls in the implementation:
- token acquisition is performed at runtime with `Get-AzAccessToken`,
@@ -196,6 +208,20 @@ Read-oriented Graph application permissions documented in the repository:
- `RoleManagement.Read.Directory` or `Directory.Read.All` for richer enrichment
- `AuditLog.Read.All` if commit author attribution is desired
#### Change Probe Mode
The probe app registration requires these read-only Graph application permissions:
- `AuditLog.Read.All` (reads directory and Intune audit logs)
- `DeviceManagementApps.Read.All`
- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementManagedDevices.Read.All`
- `Policy.Read.All`
- `Policy.Read.ConditionalAccess`
- `Application.Read.All`
The probe does **not** require write permissions. It only polls audit logs and queues the backup pipeline.
#### Restore Mode
Write-capable Graph application permissions documented in the repository:
@@ -219,6 +245,8 @@ Write-capable Graph application permissions documented in the repository:
- Required outbound destinations are:
- `graph.microsoft.com`
- Azure DevOps organization APIs
- Azure Table Storage (for probe state)
- Azure Queue Storage (for probe trigger messages)
- optional Azure OpenAI endpoint
- Python package registry for `IntuneCD`
- npm registry for `md-to-pdf`
@@ -229,6 +257,7 @@ Write-capable Graph application permissions documented in the repository:
- Graph tokens are obtained just-in-time rather than stored in the repository.
- The pipeline marks the Graph token as a secret variable.
- The implementation logs token claims and roles for diagnostics, but not the token value itself.
- The change probe app secret is stored as an Azure Function App setting (`PROBE_APP_SECRET`), not in the repository.
- Azure OpenAI uses a pipeline secret variable when enabled.
- The pipeline logic itself does not depend on repository-stored application secrets; separate secret scanning of exported tenant content is still recommended.
@@ -329,6 +358,7 @@ The following items are not fully solved by the repository alone and should be a
| --- | --- | --- |
| Restore capability | Supported by design; can change production tenant state | Keep restore manual only, or disable auto-remediation by default until operational controls are approved |
| Backup vs restore identity separation | Sample config uses the same service connection name in backup and restore pipelines | Use separate service principals: read-only for backup/review, write-enabled only for restore |
| Change probe identity separation | Probe uses a separate Entra app registration from the pipeline service connection | Keep probe app read-only; do not grant write permissions to the probe identity |
| Azure OpenAI egress | Optional and customer-configurable | Enable only when the organization approves the payload scope and Azure OpenAI deployment model |
| Artifact retention | Not defined in repo; inherited from Azure DevOps settings | Set explicit retention for builds, logs, and artifacts |
| Repo access model | Not defined in repo | Restrict repo and artifact access to administrators/reviewers only |
@@ -400,3 +430,7 @@ The statements in this document are based on the implementation in:
- `scripts/apply_reviewer_rejections.py`
- `scripts/queue_post_merge_restore.py`
- `scripts/export_entra_baseline.py`
- `scripts/probe_tenant_changes.py`
- `scripts/trigger_backup_pipeline.py`
- `infra/change-probe/probe_timer/__init__.py`
- `infra/change-probe/queue_consumer/__init__.py`

View File

@@ -2,7 +2,7 @@
# ASTRAL Security Review Questionnaire
Prepared: 2026-03-27
Prepared: 2026-04-20
This appendix is a shorter, copy/paste-friendly companion to the full ASTRAL security review package.
@@ -12,21 +12,21 @@ This appendix is a shorter, copy/paste-friendly companion to the full ASTRAL sec
| What deployment modes are supported? | The same repository can be operated in progressive modes: backup-only, review package, or full package with restore/remediation. AI is optional in all modes. |
| Is it a public-facing application? | No. It is an administrative pipeline workflow with no public UI or inbound application endpoint created by this repository. |
| Does it require inbound network access from the internet? | No. The implemented workflow is outbound-only over HTTPS. |
| What production systems does it access? | Microsoft Graph for Intune and Entra configuration, plus Azure DevOps APIs for pull request and pipeline operations. |
| What production systems does it access? | Microsoft Graph for Intune and Entra configuration and audit logs; Azure DevOps APIs for pull request and pipeline operations; Azure Storage (Table and Queue) for probe state and trigger messages. |
| Does it make production changes? | Backup and review pipelines are read-oriented against Microsoft Graph. The restore pipeline is write-capable and can apply approved baseline configuration back to the tenant when explicitly enabled and authorized. |
| What data is processed? | Administrative configuration data such as Intune policies, device configuration, enrollment profiles, apps, scripts, conditional access, named locations, authentication strengths, app registrations, and enterprise application metadata. |
| Does it process end-user business content? | It is not designed for business content. However, exported admin-authored scripts or custom payloads can contain sensitive operational data if the tenant already stores it there. |
| Where is data stored? | In the Azure DevOps Git repository, Azure DevOps pull requests/threads, build logs, and optional build artifacts such as markdown, HTML, and PDF documentation. |
| How does it authenticate to Microsoft Graph? | By obtaining a Microsoft Graph token at runtime through an Azure DevOps Azure service connection using workload identity / federated credential flow. |
| How does it authenticate to Azure DevOps APIs? | With `System.AccessToken` scoped to the pipeline identity. |
| Are long-lived secrets stored in the repository? | The pipeline logic does not require repository-stored application secrets. Runtime tokens are acquired during pipeline execution, but exported tenant content should still be treated as potentially sensitive and reviewed for embedded secrets in admin-authored scripts or custom payloads. |
| Are long-lived secrets stored in the repository? | The pipeline logic does not require repository-stored application secrets. The change probe app secret is stored in Azure Function App settings, not in the repository. Runtime tokens are acquired during pipeline execution, but exported tenant content should still be treated as potentially sensitive and reviewed for embedded secrets in admin-authored scripts or custom payloads. |
| How are secrets handled in the pipeline? | The Graph access token is set as a secret pipeline variable. The implementation logs token claims and granted roles for diagnostics, but not the token value. |
| What minimum permissions are required? | Read-only Microsoft Graph application permissions for backup/review, and additional write permissions only for restore. Exact permissions are listed in the full package. |
| Is there separation between read and write access? | The code supports a safe separation model. For production, create separate read-only and write-enabled service principals/connections so backup and restore use different identities. |
| What change-control mechanism exists? | Drift is committed to dedicated workload branches and reviewed through rolling pull requests into `main`. New rolling PRs can be created as drafts until the automated summary is inserted, and optional per-file change-ticket threads and reviewer `/reject` commands are supported. |
| Can reviewers block or scope changes? | Yes. Reviewers can approve the rolling PR, reject it, or reject individual file-level drift items through PR threads when that feature is enabled. |
| Is rollback supported? | Yes. The restore pipeline supports full restore, selective restore by file path, historical restore by Git ref, and dry-run mode. |
| What external network destinations are required? | Microsoft Graph, Azure DevOps APIs, optional Azure OpenAI, Python package registry for `IntuneCD`, npm registry for `md-to-pdf`, and optionally OS package repositories when browser dependencies are installed for HTML/PDF generation. |
| What external network destinations are required? | Microsoft Graph, Azure DevOps APIs, Azure Storage (Table and Queue), optional Azure OpenAI, Python package registry for `IntuneCD`, npm registry for `md-to-pdf`, and optionally OS package repositories when browser dependencies are installed for HTML/PDF generation. |
| Does the system send data to AI services? | Only if Azure OpenAI summary generation is explicitly configured. It is optional for the platform overall. |
| What AI service is intended? | A customer-controlled Azure OpenAI deployment configured through the Azure OpenAI endpoint and deployment variables, rather than an unrelated public AI service. |
| What data is sent to Azure OpenAI when enabled? | A reduced change-review payload containing changed paths, semantic summaries, deterministic summary text, and fingerprints derived from the repo diff. This is intended to support review summarization, not raw tenant-wide export ingestion. |

View File

@@ -0,0 +1,245 @@
# ASTRAL Change Probe
Event-driven backup trigger for ASTRAL. Monitors Intune and Entra ID audit logs via Microsoft Graph, debounces change bursts, and queues the Azure DevOps backup pipeline only when actual drift is detected.
## Why this exists
Microsoft Graph change notifications and delta queries do **not** support Intune device management or Conditional Access resources. The only viable event-driven approach is polling the Graph audit log APIs, which have a 5–15 minute propagation delay. This probe implements a debouncer on top of that polling to avoid backup storms during bulk changes.
## Architecture
```
┌─────────────────┐  5 min  ┌──────────────┐  quiet window    ┌─────────────────┐
│ Timer Trigger   │ ───────►│ probe_timer  │ ────────────────►│ backup-trigger  │
│ (probe_timer)   │         │ (debouncer)  │ (15 min armed)   │ -queue          │
└─────────────────┘         └──────┬───────┘                  └────────┬────────┘
                                   │                                   │
                                   │ load/save state                   │ dequeue
                                   │ (Azure Table Storage)             ▼
                                   │                          ┌─────────────────┐
                                   │                          │ queue_consumer  │
                                   └─────────────────────────►│ (ADO REST API)  │
                                                              └────────┬────────┘
                                                                       │
                                                                       ▼
                                                              ┌─────────────────┐
                                                              │ Azure DevOps    │
                                                              │ backup pipeline │
                                                              └─────────────────┘
```
## Components
### `probe_timer` (Timer Trigger)
- **Schedule**: every 5 minutes (`0 */5 * * * *`)
- **Input**: `TimerRequest` from Functions runtime
- **Output**: queue message to `backup-trigger-queue` (via `func.Out[str]`)
- **Actions**:
1. Load debouncer state from Azure Table Storage (`ProbeState` / `singleton` / `default`).
2. Run `scripts/probe_tenant_changes.py` via subprocess.
3. Save updated state back to Table Storage.
4. If `trigger=true`, emit a queue message.
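A minimal sketch of this flow, assuming the Python v1 programming model, the `azure-data-tables` package, and a single JSON `State` property on the `singleton`/`default` entity; binding names and the script invocation are illustrative, not the actual function body:
```python
import json
import os
import subprocess
import tempfile

import azure.functions as func
from azure.data.tables import TableServiceClient

TABLE, PARTITION, ROW = "ProbeState", "singleton", "default"

def main(timer: func.TimerRequest, outmsg: func.Out[str]) -> None:
    tables = TableServiceClient.from_connection_string(os.environ["AzureWebJobsStorage"])
    table = tables.get_table_client(TABLE)

    # 1. Load debouncer state; a missing entity means first run, i.e. idle.
    try:
        state = json.loads(table.get_entity(partition_key=PARTITION, row_key=ROW)["State"])
    except Exception:
        state = {"state": "idle"}

    # 2. Run the probe script against a temporary state file
    #    (credential arguments omitted for brevity).
    with tempfile.TemporaryDirectory() as tmp:
        state_file, result_file = f"{tmp}/probe-state.json", f"{tmp}/probe-result.json"
        with open(state_file, "w") as fh:
            json.dump(state, fh)
        subprocess.run(["python3", "scripts/probe_tenant_changes.py",
                        "--state-file", state_file, "--output", result_file], check=True)
        with open(result_file) as fh:
            result = json.load(fh)

    # 3. Persist the updated debouncer state back to Table Storage.
    table.upsert_entity({"PartitionKey": PARTITION, "RowKey": ROW,
                         "State": json.dumps(result["new_state"])})

    # 4. Emit a queue message only when the debouncer triggered.
    if result.get("trigger"):
        outmsg.set(json.dumps({"reason": result.get("reason"),
                               "checked_at": result.get("checked_at")}))
```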
### `queue_consumer` (Queue Trigger)
- **Input**: `QueueMessage` from `backup-trigger-queue`
- **Actions**:
1. Parse JSON payload (`reason`, `checked_at`).
2. Call Azure DevOps REST API to queue the backup pipeline run.
3. Raise on failure so the Functions runtime handles retry and poison-queue logic.
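A minimal sketch of that call, assuming PAT-based `Basic` authentication against the Azure DevOps Pipelines `runs` endpoint; the deployed consumer delegates this to `scripts/trigger_backup_pipeline.py`:
```python
import base64
import json
import os
import urllib.request

import azure.functions as func

def main(msg: func.QueueMessage) -> None:
    payload = json.loads(msg.get_body().decode("utf-8"))  # {"reason": ..., "checked_at": ...}

    org, project = os.environ["ADO_ORGANIZATION"], os.environ["ADO_PROJECT"]
    pipeline_id = os.environ["ADO_PIPELINE_ID"]
    branch = os.environ.get("ADO_BRANCH", "main")
    ref = branch if branch.startswith("refs/") else f"refs/heads/{branch}"

    url = (f"https://dev.azure.com/{org}/{project}/_apis/pipelines/"
           f"{pipeline_id}/runs?api-version=7.1-preview.1")
    body = json.dumps({"resources": {"repositories": {"self": {"refName": ref}}}}).encode()
    auth = base64.b64encode(f":{os.environ['ADO_TOKEN']}".encode()).decode()

    req = urllib.request.Request(url, data=body, method="POST", headers={
        "Content-Type": "application/json",
        "Authorization": f"Basic {auth}",
    })
    # Any HTTP error raises, so the Functions runtime retries the message and
    # eventually moves it to the poison queue.
    with urllib.request.urlopen(req) as resp:
        run = json.loads(resp.read())
    print(f"Queued pipeline run {run.get('id')} (reason: {payload.get('reason')})")
```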
### `scripts/probe_tenant_changes.py`
Standalone CLI script that can also be run locally. It:
- Queries Intune (`deviceManagement/auditEvents`) and Entra (`directoryAudits`) audit logs.
- Implements a three-state debouncer: `idle` → `armed` → `cooldown`.
- Returns JSON with `trigger`, `reason`, and `new_state`.
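A sketch of the audit-log polling step, assuming an app-only Graph token is already in hand; the real script folds the query results into the debouncer state described below:
```python
import datetime
import json
import urllib.parse
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def count_recent_audit_events(token: str, minutes: int = 30) -> int:
    """Count Intune + Entra audit events recorded in the last `minutes` minutes."""
    since = (datetime.datetime.now(datetime.timezone.utc)
             - datetime.timedelta(minutes=minutes)).strftime("%Y-%m-%dT%H:%M:%SZ")
    endpoints = {
        f"{GRAPH}/deviceManagement/auditEvents": f"activityDateTime ge {since}",
        f"{GRAPH}/auditLogs/directoryAudits": f"activityDateTime ge {since}",
    }
    total = 0
    for url, flt in endpoints.items():
        query = urllib.parse.urlencode({"$filter": flt, "$top": 50},
                                       quote_via=urllib.parse.quote)
        req = urllib.request.Request(f"{url}?{query}",
                                     headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            total += len(json.loads(resp.read()).get("value", []))
    return total
```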
### `scripts/trigger_backup_pipeline.py`
Standalone CLI script that queues an Azure DevOps pipeline run via REST API. Can be used locally or from the queue consumer.
## Debouncer State Machine
| State | Condition to transition | Output |
|---|---|---|
| **idle** | Audit log shows a new change | → `armed` |
| **armed** | Quiet window elapsed (default 15 min) with no newer events | → `cooldown`, `trigger=true` |
| **armed** | Newer event arrives while armed | Stay `armed`, extend quiet window |
| **cooldown** | Cooldown elapsed (default 30 min) | → `idle` |
| **cooldown** | New event arrives | Stay `cooldown` (change is buffered until cooldown ends) |
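Expressed as code, the transitions look roughly like this (state field names are illustrative; the authoritative logic lives in `scripts/probe_tenant_changes.py`):
```python
from datetime import datetime, timedelta, timezone

QUIET_WINDOW = timedelta(minutes=15)
COOLDOWN = timedelta(minutes=30)

def step(state: dict, latest_event: datetime | None, now: datetime) -> tuple[dict, bool]:
    """Advance the debouncer one tick; returns (new_state, trigger)."""
    if state["state"] == "idle":
        if latest_event:
            return {"state": "armed", "last_event": latest_event.isoformat()}, False
        return state, False

    if state["state"] == "armed":
        last = datetime.fromisoformat(state["last_event"])
        if latest_event and latest_event > last:
            # Newer event: stay armed and restart the quiet window.
            return {"state": "armed", "last_event": latest_event.isoformat()}, False
        if now - last >= QUIET_WINDOW:
            # Burst settled: trigger a backup and enter cooldown.
            return {"state": "cooldown", "until": (now + COOLDOWN).isoformat()}, True
        return state, False

    # cooldown: sit out new events until the cooldown elapses.
    if now >= datetime.fromisoformat(state["until"]):
        return {"state": "idle"}, False
    return state, False

# Example: an armed state with no newer events for 20 minutes triggers a backup.
armed = {"state": "armed", "last_event": "2026-04-20T10:00:00+00:00"}
print(step(armed, None, datetime(2026, 4, 20, 10, 20, tzinfo=timezone.utc)))
```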
## Configuration
All settings are provided via Function App application settings (environment variables):
| Setting | Required | Default | Description |
|---|---|---|---|
| `AzureWebJobsStorage` | Yes | — | Storage account connection string (tables + queues) |
| `PROBE_APP_ID` | Yes* | — | Entra app registration client ID |
| `PROBE_APP_SECRET` | Yes* | — | Entra app client secret |
| `TENANT_ID` | Yes* | — | Microsoft 365 tenant ID |
| `GRAPH_TOKEN` | No | — | Optional passthrough token (skips the client credentials flow) |
| `ADO_ORGANIZATION` | Yes | — | Azure DevOps organization name |
| `ADO_PROJECT` | Yes | — | Azure DevOps project name |
| `ADO_PIPELINE_ID` | Yes | — | Backup pipeline definition ID |
| `ADO_TOKEN` | Yes | — | Azure DevOps PAT with **Build (read & execute)** |
| `ADO_BRANCH` | No | `main` | Git ref to queue the pipeline against |
| `PROBE_QUIET_WINDOW_MINUTES` | No | `15` | Minutes to wait for change burst to settle |
| `PROBE_COOLDOWN_MINUTES` | No | `30` | Minutes between successive triggers |
\* Required unless `GRAPH_TOKEN` is provided.
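As a sketch of how these settings interact: when `GRAPH_TOKEN` is set it can be used directly, otherwise the probe falls back to the client credentials flow with `PROBE_APP_ID` / `PROBE_APP_SECRET` (helper name is illustrative):
```python
import json
import os
import urllib.parse
import urllib.request

def resolve_graph_token() -> str:
    """Prefer the optional GRAPH_TOKEN passthrough, else use client credentials."""
    if os.environ.get("GRAPH_TOKEN"):
        return os.environ["GRAPH_TOKEN"]
    body = urllib.parse.urlencode({
        "client_id": os.environ["PROBE_APP_ID"],
        "client_secret": os.environ["PROBE_APP_SECRET"],
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials",
    }).encode()
    url = f"https://login.microsoftonline.com/{os.environ['TENANT_ID']}/oauth2/v2.0/token"
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.loads(resp.read())["access_token"]

quiet_window = int(os.environ.get("PROBE_QUIET_WINDOW_MINUTES", "15"))
cooldown = int(os.environ.get("PROBE_COOLDOWN_MINUTES", "30"))
```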
## Local Development
### Prerequisites
- Python 3.11+
- [Azure Functions Core Tools](https://learn.microsoft.com/en-us/azure/azure-functions/functions-run-local)
- An Azure Storage account (or Azurite for local emulation)
### Install dependencies
```bash
cd infra/change-probe
pip install -r requirements.txt
```
### Copy shared scripts
The probe reuses scripts from the repository root. Copy them into this directory before building or running locally:
```bash
cp ../../scripts/common.py scripts/
cp ../../scripts/probe_tenant_changes.py scripts/
cp ../../scripts/trigger_backup_pipeline.py scripts/
```
### Run locally
```bash
# Start Azurite (Storage emulator)
azurite --silent --location ./azurite --debug ./azurite/debug.log
# Copy local settings template
cp local.settings.json.example local.settings.json
# Edit local.settings.json with your values
# Start the Functions host
func start
```
### Run the probe script standalone
```bash
cd ../..
python3 scripts/probe_tenant_changes.py \
--client-id "$PROBE_APP_ID" \
--client-secret "$PROBE_APP_SECRET" \
--tenant-id "$TENANT_ID" \
  --state-path ./probe-state.json \
  > ./probe-result.json
```
### Trigger the backup pipeline standalone
```bash
python3 scripts/trigger_backup_pipeline.py \
--organization "contoso" \
--project "Intune" \
--pipeline-id 1 \
--token "$ADO_TOKEN" \
--branch refs/heads/main
```
## Deployment
Use the unified provisioning script:
```powershell
.\deploy\provision-change-probe.ps1 `
-TenantName "contoso.onmicrosoft.com" `
-ResourceGroupName "rg-astral-probe" `
-Location "westeurope" `
-DeployFunctionApp
```
The script will:
1. Register an Entra app (or reuse an existing one).
2. Grant admin consent for Graph permissions.
3. Create a client secret.
4. Provision Resource Group, Storage Account, and Function App (Linux Consumption, Python 3.11).
5. Configure application settings.
6. Build and deploy the function package.
### Manual deployment (zip package)
If you prefer to deploy manually:
```bash
cd infra/change-probe
# Copy shared scripts into the package directory
cp ../../scripts/common.py scripts/
cp ../../scripts/probe_tenant_changes.py scripts/
cp ../../scripts/trigger_backup_pipeline.py scripts/
# Install production dependencies into the package
pip install -r requirements.txt --target .python_packages/lib/site-packages
# Build the zip (Linux Consumption requires .python_packages/lib/site-packages, NOT python3.11/)
zip -r function-package.zip \
probe_timer/ queue_consumer/ scripts/ .python_packages/ \
host.json requirements.txt \
-x "*.pyc" -x "__pycache__/*"
# Upload and set WEBSITE_RUN_FROM_PACKAGE
az functionapp deployment source config-zip \
--resource-group rg-astral-probe \
--name func-astral-probe \
--src function-package.zip
```
## Permissions
### Entra App (Graph access)
The probe requires the same read permissions as the main backup pipeline:
- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementApps.Read.All`
- `AuditLog.Read.All`
- `Directory.Read.All`
### Azure DevOps PAT
The `ADO_TOKEN` must have:
- **Build** → *Read & execute*
## Monitoring
Check the `ProbeState` table for current debouncer state:
```bash
az storage entity query --table-name ProbeState --account-name <storage>
```
List the storage queues (including any poison queue):
```bash
az storage queue list --account-name <storage>
```
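To inspect individual messages, including any the runtime has moved to the poison queue (named `<queue>-poison` by convention):

```bash
az storage message peek --queue-name backup-trigger-queue --account-name <storage>
az storage message peek --queue-name backup-trigger-queue-poison --account-name <storage>
```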
## Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Timer fires but no state update | `schedule_status["last"]` case mismatch (fixed in current version) | Ensure deployed code uses `.get("Last")` |
| Probe script `ModuleNotFoundError` | Bundled packages in wrong path | Use `.python_packages/lib/site-packages`, not `python3.11/site-packages` |
| Queue message lands in poison queue | `ADO_TOKEN` missing or invalid | Verify token in Function App settings and restart |
| Probe never triggers | No audit events in Graph window | Normal if tenant is idle; verify `AuditLog.Read.All` permission |
| Duplicate pipeline runs | Multiple messages queued | Check debouncer state; cooldown should prevent this |
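Function execution logs land in Application Insights (see `host.json`). Assuming the component shares the Function App name and the `application-insights` CLI extension is installed, a query along these lines surfaces recent traces:

```bash
az monitor app-insights query \
  --app func-astral-probe \
  --resource-group rg-astral-probe \
  --analytics-query "traces | where timestamp > ago(1h) | order by timestamp desc | take 50"
```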

View File

@@ -0,0 +1,15 @@
{
"version": "2.0",
"logging": {
"applicationInsights": {
"samplingSettings": {
"isEnabled": true,
"excludedTypes": "Request"
}
}
},
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[4.*, 5.0.0)"
}
}

View File

@@ -0,0 +1,19 @@
{
"IsEncrypted": false,
"Values": {
"AzureWebJobsStorage": "UseDevelopmentStorage=true",
"FUNCTIONS_WORKER_RUNTIME": "python",
"PROBE_APP_ID": "",
"PROBE_APP_SECRET": "",
"TENANT_ID": "",
"GRAPH_TOKEN": "",
"ADO_ORGANIZATION": "",
"ADO_PROJECT": "",
"ADO_PIPELINE_ID": "",
"ADO_TOKEN": "",
"ADO_BRANCH": "main",
"PROBE_QUIET_WINDOW_MINUTES": "15",
"PROBE_COOLDOWN_MINUTES": "30",
"REPO_ROOT": "../../"
}
}

View File

@@ -0,0 +1,137 @@
#!/usr/bin/env python3
"""Azure Function timer trigger that probes tenant audit logs and queues a backup run when changes are detected."""
from __future__ import annotations
import json
import logging
import os
import subprocess
import sys
from typing import Any
import azure.functions as func
from azure.data.tables import TableServiceClient
_TABLE_NAME = "ProbeState"
_PARTITION_KEY = "singleton"
_ROW_KEY = "default"
def _repo_root() -> str:
"""Resolve the repository root so we can invoke scripts/probe_tenant_changes.py."""
env_root = os.environ.get("REPO_ROOT", "").strip()
if env_root:
return os.path.abspath(env_root)
return os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
def _load_state(connection_string: str) -> dict[str, Any]:
"""Load persisted probe state from Azure Table Storage."""
try:
service = TableServiceClient.from_connection_string(conn_str=connection_string)
table = service.get_table_client(table_name=_TABLE_NAME)
entity = table.get_entity(partition_key=_PARTITION_KEY, row_key=_ROW_KEY)
raw = entity.get("state", "{}")
return json.loads(raw) if isinstance(raw, str) else dict(raw)
except Exception as exc:
logging.warning(f"Unable to load state from Table Storage ({exc}); starting fresh.")
return {}
def _save_state(connection_string: str, state: dict[str, Any]) -> None:
"""Persist probe state to Azure Table Storage."""
service = TableServiceClient.from_connection_string(conn_str=connection_string)
table = service.get_table_client(table_name=_TABLE_NAME)
table.upsert_entity(
{
"PartitionKey": _PARTITION_KEY,
"RowKey": _ROW_KEY,
"state": json.dumps(state),
}
)
def main(mytimer: func.TimerRequest, msg: func.Out[str]) -> None:
utc_now = mytimer.schedule_status.get("Last", "n/a") if mytimer.schedule_status else "n/a"
logging.info(f"Probe timer triggered at {utc_now}")
client_id = os.environ.get("PROBE_APP_ID", "").strip()
client_secret = os.environ.get("PROBE_APP_SECRET", "").strip()
tenant_id = os.environ.get("TENANT_ID", "").strip()
token = os.environ.get("GRAPH_TOKEN", "").strip()
auth_args: list[str] = []
if token:
auth_args = ["--token", token]
elif client_id and client_secret and tenant_id:
auth_args = [
"--client-id", client_id,
"--client-secret", client_secret,
"--tenant-id", tenant_id,
]
else:
logging.error("No Graph authentication configured (PROBE_APP_ID/SECRET/TENANT_ID or GRAPH_TOKEN).")
return
connection_string = os.environ.get("AzureWebJobsStorage", "").strip()
if not connection_string:
logging.error("AzureWebJobsStorage connection string is missing.")
return
state = _load_state(connection_string)
state_json = json.dumps(state) if state else ""
quiet_window = os.environ.get("PROBE_QUIET_WINDOW_MINUTES", "15")
cooldown = os.environ.get("PROBE_COOLDOWN_MINUTES", "30")
probe_script = os.path.join(_repo_root(), "scripts", "probe_tenant_changes.py")
if not os.path.exists(probe_script):
logging.error(f"Probe script not found at {probe_script}")
return
cmd = [
sys.executable,
probe_script,
*auth_args,
"--quiet-window-minutes", quiet_window,
"--cooldown-minutes", cooldown,
]
if state_json:
cmd.extend(["--state-json", state_json])
logging.info(f"Running probe script: {probe_script}")
try:
result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
except subprocess.TimeoutExpired:
logging.error("Probe script timed out after 60 seconds.")
return
except Exception as exc:
logging.error(f"Failed to run probe script ({exc}).")
return
if result.returncode != 0:
logging.error(f"Probe script failed (exit {result.returncode}): {result.stderr}")
return
try:
output = json.loads(result.stdout)
except json.JSONDecodeError as exc:
logging.error(f"Probe script returned invalid JSON ({exc}): {result.stdout[:500]}")
return
new_state = output.get("new_state", state)
_save_state(connection_string, new_state)
trigger = output.get("trigger", False)
reason = output.get("reason", "no reason given")
logging.info(f"Probe result: trigger={trigger}, reason={reason}")
if trigger:
queue_payload = json.dumps(
{
"reason": reason,
"checked_at": output.get("checked_at", ""),
}
)
msg.set(queue_payload)
logging.info("Queued backup trigger message.")

View File

@@ -0,0 +1,18 @@
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "mytimer",
"type": "timerTrigger",
"direction": "in",
"schedule": "0 */5 * * * *"
},
{
"name": "msg",
"type": "queue",
"direction": "out",
"queueName": "backup-trigger-queue",
"connection": "AzureWebJobsStorage"
}
]
}

View File

@@ -0,0 +1,77 @@
#!/usr/bin/env python3
"""Azure Function queue trigger that calls the Azure DevOps REST API to queue a backup pipeline run."""
from __future__ import annotations
import json
import logging
import os
import subprocess
import sys
import azure.functions as func
def _repo_root() -> str:
"""Resolve the repository root so we can invoke scripts/trigger_backup_pipeline.py."""
env_root = os.environ.get("REPO_ROOT", "").strip()
if env_root:
return os.path.abspath(env_root)
return os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
def main(msg: func.QueueMessage) -> None:
body = msg.get_body().decode("utf-8")
logging.info(f"Queue consumer received message: {body}")
org = os.environ.get("ADO_ORGANIZATION", "").strip()
project = os.environ.get("ADO_PROJECT", "").strip()
pipeline_id = os.environ.get("ADO_PIPELINE_ID", "").strip()
token = os.environ.get("ADO_TOKEN", "").strip()
branch = os.environ.get("ADO_BRANCH", "main").strip()
if not all([org, project, pipeline_id, token]):
logging.error("Missing one or more ADO configuration variables (ADO_ORGANIZATION, ADO_PROJECT, ADO_PIPELINE_ID, ADO_TOKEN).")
# Re-raising causes the Functions runtime to retry the message after the visibility timeout.
raise RuntimeError("Incomplete ADO configuration")
trigger_script = os.path.join(_repo_root(), "scripts", "trigger_backup_pipeline.py")
if not os.path.exists(trigger_script):
logging.error(f"Trigger script not found at {trigger_script}")
raise RuntimeError("Trigger script missing")
cmd = [
sys.executable,
trigger_script,
"--organization",
org,
"--project",
project,
"--pipeline-id",
pipeline_id,
"--token",
token,
"--branch",
branch,
]
logging.info(f"Triggering ADO pipeline {pipeline_id} ...")
try:
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=60,
)
except subprocess.TimeoutExpired:
logging.error("Trigger script timed out after 60 seconds.")
raise
except Exception as exc:
logging.error(f"Failed to run trigger script ({exc}).")
raise
if result.returncode != 0:
logging.error(f"Trigger script failed (exit {result.returncode}): {result.stderr}")
raise RuntimeError(f"Trigger script failed: {result.stderr}")
logging.info(f"Trigger script succeeded: {result.stdout.strip()}")

View File

@@ -0,0 +1,12 @@
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "msg",
"type": "queueTrigger",
"direction": "in",
"queueName": "backup-trigger-queue",
"connection": "AzureWebJobsStorage"
}
]
}

View File

@@ -0,0 +1,3 @@
azure-functions
azure-data-tables
azure-storage-queue

View File

@@ -150,7 +150,8 @@ def _fetch_directory_audits(
"$top": "999",
"$select": "activityDateTime,activityDisplayName,category,result,initiatedBy,targetResources",
}
filter_parts = [f"activityDateTime le {_format_filter_datetime(backup_start)}"]
audit_end = backup_start - dt.timedelta(minutes=10)
filter_parts = [f"activityDateTime le {_format_filter_datetime(audit_end)}"]
if last_commit_date is not None:
filter_parts.append(f"activityDateTime ge {_format_filter_datetime(last_commit_date)}")
params["$filter"] = " and ".join(filter_parts)

View File

@@ -114,6 +114,15 @@ def request_json(
except urllib.error.HTTPError as exc:
last_error = exc
if exc.code not in retry_codes or attempt == max_retries:
body = ""
try:
body = exc.read().decode("utf-8", errors="replace")[:2048]
except Exception:
pass
if body:
raise RuntimeError(
f"{method} {url} failed: HTTP Error {exc.code}: {exc.reason}{body}"
) from exc
raise
retry_after = _get_retry_after_seconds(exc)
sleep = retry_after if retry_after is not None else (2 ** attempt)

View File

@@ -325,28 +325,10 @@ def _current_pr_merge_strategy(pr: dict[str, Any]) -> str:
def _build_description(workload: str, drift_branch: str, baseline_branch: str, build_number: str, build_id: str) -> str:
is_entra = workload.lower() == "entra"
lead = "Rolling Entra drift PR created by backup pipeline." if is_entra else "Rolling drift PR created by backup pipeline."
lead = "Rolling Entra drift PR backup pipeline" if is_entra else "Rolling drift PR backup pipeline"
return (
f"{lead}\n\n"
f"- Source branch: `{drift_branch}`\n"
f"- Target branch: `{baseline_branch}`\n"
f"- Last pipeline run: `{build_number}` (BuildId: {build_id})\n\n"
"The automated review summary is generated immediately after PR creation and inserted "
"above the reviewer actions section.\n\n"
"## Reviewer Quick Actions\n\n"
"### 1) Accept all changes\n"
"- Merge PR to accept drift into baseline.\n\n"
"### 2) Reject whole PR and revert\n"
"- Set reviewer vote to **Reject**.\n"
"- Abandon PR.\n"
"- Auto-remediation queues restore (if `AUTO_REMEDIATE_ON_PR_REJECTION=true`).\n\n"
"### 3) Reject only selected policy changes\n"
"- In each `Change Needed` policy thread, comment `/reject` for changes you do not want.\n"
"- Optional: use `/accept` for changes you want to keep.\n"
"- Wait for review-sync pipeline (about 5 minutes) to update PR diff.\n"
"- Merge remaining accepted changes.\n"
"- Post-merge auto-remediation queues restore to reconcile tenant to merged baseline "
"(if `AUTO_REMEDIATE_AFTER_MERGE=true`)."
f"{lead} run `{build_number}` (build {build_id})\n\n"
f"Source: `{drift_branch}` → Target: `{baseline_branch}`\n"
)

View File

@@ -0,0 +1,102 @@
#!/usr/bin/env python3
"""Revert Intune JSON exports that differ from baseline only in formatting or key ordering."""
from __future__ import annotations
import argparse
import json
import subprocess
import sys
from pathlib import Path
def _run_git_show(repo_root: Path, ref: str, rel_path: str) -> str | None:
proc = subprocess.run(
["git", "show", f"{ref}:{rel_path}"],
cwd=str(repo_root),
check=False,
capture_output=True,
)
if proc.returncode != 0:
return None
return proc.stdout.decode("utf-8", errors="replace")
def revert_formatting_only_changes(
repo_root: Path,
backup_root: Path,
baseline_ref: str,
) -> tuple[list[str], list[str]]:
reverted: list[str] = []
kept: list[str] = []
for file_path in sorted(backup_root.rglob("*.json")):
rel_path = file_path.relative_to(repo_root).as_posix()
baseline_text = _run_git_show(repo_root, baseline_ref, rel_path)
if not baseline_text:
# New file — nothing to revert against
continue
try:
current_text = file_path.read_text(encoding="utf-8")
current_payload = json.loads(current_text)
baseline_payload = json.loads(baseline_text)
except Exception:
kept.append(rel_path)
continue
if current_payload == baseline_payload:
file_path.write_text(baseline_text, encoding="utf-8")
reverted.append(rel_path)
else:
kept.append(rel_path)
return reverted, kept
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--repo-root", required=True)
parser.add_argument(
"--backup-root",
default="tenant-state/intune",
help="Path to Intune backup root (default: tenant-state/intune).",
)
parser.add_argument(
"--baseline-ref",
default="HEAD",
help="Git ref used as baseline for comparison (default: HEAD).",
)
args = parser.parse_args()
repo_root = Path(args.repo_root).resolve()
backup_root = Path(args.backup_root)
if not backup_root.is_absolute():
backup_root = repo_root / backup_root
backup_root = backup_root.resolve()
if not backup_root.exists():
print(f"Backup root not found: {backup_root}")
return 0
reverted, kept = revert_formatting_only_changes(
repo_root=repo_root,
backup_root=backup_root,
baseline_ref=args.baseline_ref,
)
if reverted:
print(f"Reverted {len(reverted)} formatting-only Intune JSON export(s) to baseline:")
for path in reverted:
print(f" - {path}")
else:
print("No formatting-only Intune JSON exports detected.")
if kept:
print(f"Files with actual semantic changes (kept): {len(kept)}")
return 0
if __name__ == "__main__":
sys.exit(main())

View File

@@ -0,0 +1,444 @@
#!/usr/bin/env python3
"""Probe tenant audit logs to detect configuration changes and decide whether to trigger a backup pipeline.
This script is designed to run inside an Azure Function timer trigger or locally for testing.
It queries Microsoft Graph audit endpoints for the cheapest possible signal that a configuration
change occurred since the last check, then applies a debouncer so that a burst of changes during
an admin sprint results in a single backup run after a configurable quiet window.
Usage (local testing):
python3 scripts/probe_tenant_changes.py \
--token "$GRAPH_TOKEN" \
--state-path ./probe-state.json \
--quiet-window-minutes 15 \
--cooldown-minutes 30
Usage (Azure Function wrapper):
python3 scripts/probe_tenant_changes.py \
--token "$GRAPH_TOKEN" \
--state-json '{"intune":{"last_check":"2026-04-20T10:00:00+00:00"},...}' \
--quiet-window-minutes 15 \
--cooldown-minutes 30
"""
from __future__ import annotations
import argparse
import datetime as dt
import json
import os
import pathlib
import sys
import urllib.parse
import urllib.request
from typing import Any
# scripts/ is not guaranteed to be on PYTHONPATH when loaded by the Function wrapper,
# so we tolerate a relative import failure and fall back to an absolute import.
try:
from scripts.common import request_json
except ImportError:
from common import request_json # type: ignore[no-redef]
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
_INTUNE_AUDIT_URL = "https://graph.microsoft.com/beta/deviceManagement/auditEvents"
_ENTRA_AUDIT_URL = "https://graph.microsoft.com/v1.0/auditLogs/directoryAudits"
# Target resource types in Entra that map to the categories exported by export_entra_baseline.py.
_ENTRA_TARGET_TYPES = (
"ConditionalAccessPolicy",
"NamedLocation",
"AuthenticationStrengthPolicy",
"Application",
"ServicePrincipal",
)
_DEFAULT_STATE: dict[str, Any] = {
"intune": {"last_check": None},
"entra": {"last_check": None},
"debouncer": {
"state": "idle",
"first_event_at": None,
"trigger_after": None,
"cooldown_until": None,
},
}
# ---------------------------------------------------------------------------
# Token acquisition
# ---------------------------------------------------------------------------
def _acquire_graph_token(client_id: str, client_secret: str, tenant_id: str) -> str:
"""Acquire a Graph access token via client credentials flow."""
url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
body = urllib.parse.urlencode(
{
"client_id": client_id,
"client_secret": client_secret,
"scope": "https://graph.microsoft.com/.default",
"grant_type": "client_credentials",
}
).encode("utf-8")
headers = {"Content-Type": "application/x-www-form-urlencoded"}
req = urllib.request.Request(url, data=body, headers=headers, method="POST")
with urllib.request.urlopen(req, timeout=30) as resp:
payload = json.loads(resp.read().decode("utf-8"))
access_token = payload.get("access_token")
if not access_token:
raise RuntimeError("Token endpoint did not return an access_token.")
return str(access_token)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--token", default="", help="Microsoft Graph bearer token (direct).")
parser.add_argument("--client-id", default="", help="Entra app client ID (alternative to --token).")
parser.add_argument("--client-secret", default="", help="Entra app client secret (alternative to --token).")
parser.add_argument("--tenant-id", default="", help="Entra tenant ID (alternative to --token).")
parser.add_argument(
"--state-path",
default="",
help="Path to a local JSON state file (used for local testing).",
)
parser.add_argument(
"--state-json",
default="",
help="Raw JSON state string (used when the caller manages persistence, e.g. Azure Table Storage).",
)
parser.add_argument(
"--quiet-window-minutes",
type=int,
default=15,
help="Minutes of silence after the last detected change before triggering a backup.",
)
parser.add_argument(
"--cooldown-minutes",
type=int,
default=30,
help="Minimum minutes between two triggered backup runs.",
)
parser.add_argument(
"--now",
default="",
help="Override the current time (ISO 8601). Useful for tests.",
)
return parser.parse_args()
# ---------------------------------------------------------------------------
# State helpers
# ---------------------------------------------------------------------------
def _load_state(path: str, json_str: str) -> dict[str, Any]:
if json_str:
return json.loads(json_str)
if path:
p = pathlib.Path(path)
if p.exists():
return json.loads(p.read_text(encoding="utf-8"))
return json.loads(json.dumps(_DEFAULT_STATE))
def _save_state(path: str, state: dict[str, Any]) -> None:
if path:
pathlib.Path(path).write_text(
json.dumps(state, indent=2, ensure_ascii=False) + "\n",
encoding="utf-8",
)
def _parse_iso(value: str | None) -> dt.datetime | None:
if not value:
return None
try:
parsed = dt.datetime.fromisoformat(value.replace("Z", "+00:00"))
return parsed.astimezone(dt.timezone.utc)
except ValueError:
return None
def _format_iso(value: dt.datetime) -> str:
return value.astimezone(dt.timezone.utc).isoformat().replace("+00:00", "Z")
# ---------------------------------------------------------------------------
# Graph queries
# ---------------------------------------------------------------------------
def _build_intune_filter(since: dt.datetime, until: dt.datetime) -> str:
since_str = since.strftime("%Y-%m-%dT%H:%M:%SZ")
until_str = until.strftime("%Y-%m-%dT%H:%M:%SZ")
return (
f"activityDateTime ge {since_str}"
f" and activityDateTime le {until_str}"
f" and activityResult eq 'Success'"
f" and ActivityOperationType ne 'Get'"
)
def _build_entra_filter(since: dt.datetime, until: dt.datetime) -> str:
since_str = since.strftime("%Y-%m-%dT%H:%M:%SZ")
until_str = until.strftime("%Y-%m-%dT%H:%M:%SZ")
type_clauses = " or ".join(
f"targetResources/any(t: t/type eq '{t}')" for t in _ENTRA_TARGET_TYPES
)
return (
f"activityDateTime ge {since_str}"
f" and activityDateTime le {until_str}"
f" and result eq 'success'"
f" and ({type_clauses})"
)
def _fetch_latest_event(url: str, token: str) -> dict[str, Any] | None:
"""Return the single latest matching audit event, or None if nothing found."""
try:
payload = request_json(url, token=token, timeout=30, max_retries=2)
except Exception as exc:
# Defensive: log and treat as no event so a transient Graph failure does
# not wedge the debouncer in an armed state forever.
print(f"Warning: Graph query failed ({exc})", file=sys.stderr)
return None
value = payload.get("value")
if isinstance(value, list) and value:
event = value[0]
if isinstance(event, dict):
return event
return None
def _get_latest_intune_event(
token: str, since: dt.datetime, until: dt.datetime
) -> dict[str, Any] | None:
filter_str = _build_intune_filter(since, until)
params = {
"$filter": filter_str,
"$orderby": "activityDateTime desc",
"$top": "1",
"$select": "id,activityDateTime,activityType,activityOperationType",
}
url = f"{_INTUNE_AUDIT_URL}?{urllib.parse.urlencode(params)}"
return _fetch_latest_event(url, token)
def _get_latest_entra_event(
token: str, since: dt.datetime, until: dt.datetime
) -> dict[str, Any] | None:
filter_str = _build_entra_filter(since, until)
params = {
"$filter": filter_str,
"$orderby": "activityDateTime desc",
"$top": "1",
"$select": "id,activityDateTime,activityDisplayName",
}
url = f"{_ENTRA_AUDIT_URL}?{urllib.parse.urlencode(params)}"
return _fetch_latest_event(url, token)
# ---------------------------------------------------------------------------
# Debouncer
# ---------------------------------------------------------------------------
def _evaluate_debouncer(
state: dict[str, Any],
intune_event: dict[str, Any] | None,
entra_event: dict[str, Any] | None,
now: dt.datetime,
quiet_window: dt.timedelta,
cooldown: dt.timedelta,
) -> tuple[bool, dict[str, Any], str]:
"""Return (should_trigger, updated_state, human_readable_reason)."""
deb = dict(state.get("debouncer") or {})
deb_state = str(deb.get("state") or "idle")
# Extract event timestamps if present
intune_time: dt.datetime | None = None
entra_time: dt.datetime | None = None
if intune_event:
intune_time = _parse_iso(intune_event.get("activityDateTime"))
if entra_event:
entra_time = _parse_iso(entra_event.get("activityDateTime"))
latest_event_time = max(
(t for t in (intune_time, entra_time) if t is not None), default=None
)
# ------------------------------------------------------------------
# Cooldown check
# ------------------------------------------------------------------
if deb_state == "cooldown":
cooldown_until = _parse_iso(deb.get("cooldown_until"))
if cooldown_until is not None and now < cooldown_until:
reason = (
f"In cooldown until {_format_iso(cooldown_until)}; "
f"{int(intune_event is not None) + int(entra_event is not None)} event(s) ignored."
)
return False, state, reason
# Cooldown expired → fall through to idle logic
deb = {
"state": "idle",
"first_event_at": None,
"trigger_after": None,
"cooldown_until": None,
}
deb_state = "idle"
# ------------------------------------------------------------------
# Idle or armed
# ------------------------------------------------------------------
if latest_event_time is None:
# No changes in this window
if deb_state == "armed":
trigger_after = _parse_iso(deb.get("trigger_after"))
if trigger_after is not None and now >= trigger_after:
# Quiet window satisfied — fire
deb = {
"state": "cooldown",
"first_event_at": None,
"trigger_after": None,
"cooldown_until": _format_iso(now + cooldown),
}
reason = "Quiet window satisfied; no new events since last check."
state["debouncer"] = deb
return True, state, reason
            # Still waiting; handle a missing trigger_after defensively
            wait_until = _format_iso(trigger_after) if trigger_after is not None else "unknown"
            reason = f"Armed, waiting for quiet window until {wait_until}."
state["debouncer"] = deb
return False, state, reason
# Idle, no changes
reason = "No changes detected."
state["debouncer"] = deb
return False, state, reason
# There is at least one new event
if deb_state == "idle":
# First change in a while — arm the debouncer
trigger_after = now + quiet_window
deb = {
"state": "armed",
"first_event_at": _format_iso(latest_event_time),
"trigger_after": _format_iso(trigger_after),
"cooldown_until": None,
}
reason = (
f"Change detected at {_format_iso(latest_event_time)}; "
f"armed, trigger scheduled for {_format_iso(trigger_after)}."
)
state["debouncer"] = deb
return False, state, reason
if deb_state == "armed":
# Extend the quiet window because activity is still ongoing
trigger_after = now + quiet_window
first_event = deb.get("first_event_at") or _format_iso(latest_event_time)
deb = {
"state": "armed",
"first_event_at": first_event,
"trigger_after": _format_iso(trigger_after),
"cooldown_until": None,
}
workloads: list[str] = []
if intune_event:
workloads.append("intune")
if entra_event:
workloads.append("entra")
reason = (
f"Additional change detected at {_format_iso(latest_event_time)} "
f"({'/'.join(workloads)}); quiet window extended to {_format_iso(trigger_after)}."
)
state["debouncer"] = deb
return False, state, reason
# Defensive fallback
reason = f"Unexpected debouncer state '{deb_state}'; resetting to idle."
state["debouncer"] = {
"state": "idle",
"first_event_at": None,
"trigger_after": None,
"cooldown_until": None,
}
return False, state, reason
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main() -> int:
args = parse_args()
token = args.token.strip()
if not token:
if args.client_id and args.client_secret and args.tenant_id:
token = _acquire_graph_token(args.client_id, args.client_secret, args.tenant_id)
else:
print(
"ERROR: Provide --token, or all three of --client-id, --client-secret, --tenant-id.",
file=sys.stderr,
)
raise SystemExit(1)
quiet_window = dt.timedelta(minutes=args.quiet_window_minutes)
cooldown = dt.timedelta(minutes=args.cooldown_minutes)
now = _parse_iso(args.now) or dt.datetime.now(dt.timezone.utc)
# Truncate to second for cleaner output
now = now.replace(microsecond=0)
state = _load_state(args.state_path, args.state_json)
# Initialise missing last_check values to a safe default (24 hours ago).
# This prevents a brand-new state file from scanning the entire audit log history.
default_since = now - dt.timedelta(hours=24)
intune_since = _parse_iso(state.get("intune", {}).get("last_check")) or default_since
entra_since = _parse_iso(state.get("entra", {}).get("last_check")) or default_since
# ------------------------------------------------------------------
# Query Graph
# ------------------------------------------------------------------
intune_event = _get_latest_intune_event(token, intune_since, now)
entra_event = _get_latest_entra_event(token, entra_since, now)
# ------------------------------------------------------------------
# Debounce
# ------------------------------------------------------------------
trigger, state, reason = _evaluate_debouncer(
state, intune_event, entra_event, now, quiet_window, cooldown
)
# ------------------------------------------------------------------
# Advance watermarks regardless of trigger decision so the next run
# does not re-scan the same window.
# ------------------------------------------------------------------
state.setdefault("intune", {})["last_check"] = _format_iso(now)
state.setdefault("entra", {})["last_check"] = _format_iso(now)
_save_state(args.state_path, state)
# ------------------------------------------------------------------
# Emit decision
# ------------------------------------------------------------------
result = {
"trigger": trigger,
"reason": reason,
"checked_at": _format_iso(now),
"intune_event": intune_event,
"entra_event": entra_event,
"new_state": state,
}
print(json.dumps(result, indent=2, ensure_ascii=False))
return 0
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""Trigger an Azure DevOps pipeline run via REST API.
Intended to be invoked from the queue-consumer Azure Function or locally for testing.
Usage:
python3 scripts/trigger_backup_pipeline.py \
--organization "my-org" \
--project "my-project" \
--pipeline-id 123 \
--token "$ADO_PAT" \
--branch "main" \
--parameters '{"forceFullRun": false}'
"""
from __future__ import annotations
import argparse
import json
import sys
from typing import Any
try:
from scripts.common import request_json
except ImportError:
from common import request_json # type: ignore[no-redef]
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--organization", required=True)
parser.add_argument("--project", required=True)
parser.add_argument("--pipeline-id", type=int, required=True)
parser.add_argument("--token", required=True, help="Azure DevOps PAT or OAuth token.")
parser.add_argument("--branch", default="main", help="Git ref to run against.")
parser.add_argument(
"--parameters",
default="{}",
help='JSON object of pipeline template parameters (e.g. \'{"forceFullRun": true}\').',
)
return parser.parse_args()
def main() -> int:
args = parse_args()
base_url = (
f"https://dev.azure.com/{args.organization}/{args.project}"
f"/_apis/pipelines/{args.pipeline_id}/runs?api-version=7.1"
)
    # Normalise the branch into a full Git ref (accepts both "main" and "refs/heads/main").
    ref_name = args.branch if args.branch.startswith("refs/") else f"refs/heads/{args.branch}"
    body: dict[str, Any] = {
        "resources": {
            "repositories": {
                "self": {"refName": ref_name}
            }
        },
    }
params = json.loads(args.parameters)
if isinstance(params, dict) and params:
body["templateParameters"] = params
# ADO REST API accepts Basic auth with an empty username and the PAT as password.
import base64
encoded = base64.b64encode(f":{args.token}".encode("utf-8")).decode("utf-8")
auth_header = f"Basic {encoded}"
print(f"Triggering pipeline {args.pipeline_id} on branch {args.branch} ...")
response = request_json(
base_url,
method="POST",
body=body,
headers={"Authorization": auth_header},
timeout=30,
max_retries=2,
)
run_id = response.get("id")
run_url = response.get("url")
print(f"Queued run id={run_id} url={run_url}")
return 0
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -54,6 +54,14 @@ TICKET_BLOCK_END = "<!-- AUTO-CHANGE-TICKETS:END -->"
AUTO_TICKET_THREAD_PREFIX = "AUTO-CHANGE-TICKET:"
AUTO_AI_REVIEW_THREAD_PREFIX = "AUTO-AI-REVIEW:"
COMPACT_AI_THREAD_NOTE = "_Full AI reviewer narrative is posted in a dedicated PR thread due PR description limits._"
AUTO_DETERMINISTIC_THREAD_PREFIX = "AUTO-DETERMINISTIC-SUMMARY:"
COMPACT_DETERMINISTIC_THREAD_NOTE = (
"_Full deterministic summary (including Top Risk Items) is posted in a dedicated PR thread "
"due to Azure DevOps description size limits._"
)
ADO_PR_DESCRIPTION_MAX_LEN = 4000
AUTO_REVIEWER_GUIDE_THREAD_PREFIX = "AUTO-REVIEWER-GUIDE:"
COMPACT_REVIEWER_GUIDE_NOTE = "> 📋 Full **reviewer guide** is posted in a dedicated PR thread."
THREAD_STATUS_ACTIVE = 1
THREAD_STATUS_FIXED = 2
@@ -2035,6 +2043,29 @@ def _compact_deterministic_summary(deterministic_summary: str) -> str:
return deterministic_summary[:idx].strip()
def _compact_reviewer_guide(description: str) -> str:
"""Replace the legacy long reviewer guide with a compact reference."""
description = description or ""
marker = "## Reviewer Quick Actions"
idx = description.find(marker)
if idx == -1:
return description
prefix = description[:idx].rstrip()
if not prefix:
return COMPACT_REVIEWER_GUIDE_NOTE + "\n"
return prefix + "\n\n" + COMPACT_REVIEWER_GUIDE_NOTE + "\n"
def _append_reviewer_guide_note(description: str) -> str:
"""Append the compact reviewer guide note if not already present."""
description = description or ""
if COMPACT_REVIEWER_GUIDE_NOTE in description:
return description
if description.endswith("\n"):
return description + COMPACT_REVIEWER_GUIDE_NOTE + "\n"
return description + "\n\n" + COMPACT_REVIEWER_GUIDE_NOTE + "\n"
def _remove_marked_block(description: str, start_marker: str, end_marker: str) -> str:
description = description or ""
pattern = re.compile(
@@ -2273,6 +2304,185 @@ def _sync_full_ai_review_thread(
return True
def _deterministic_thread_marker(workload: str) -> str:
return f"Automation marker: {AUTO_DETERMINISTIC_THREAD_PREFIX}{workload.strip().lower()}"
def _build_full_deterministic_thread_content(workload: str, deterministic_summary: str) -> str:
marker = _deterministic_thread_marker(workload)
return (
"Automated review summary (full)\n\n"
"PR description uses a compact review summary because of Azure DevOps description size limits.\n\n"
f"{deterministic_summary}\n\n"
f"{marker}"
).strip()
def _create_deterministic_thread(
repo_api: str,
pr_id: int,
token: str,
workload: str,
deterministic_summary: str,
) -> None:
content = _build_full_deterministic_thread_content(workload, deterministic_summary)
_request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
token=token,
method="POST",
body={
"comments": [
{
"parentCommentId": 0,
"content": content,
"commentType": 1,
}
],
"status": THREAD_STATUS_ACTIVE,
},
)
def _sync_deterministic_thread(
repo_api: str,
pr_id: int,
token: str,
workload: str,
deterministic_summary: str,
) -> bool:
marker = _deterministic_thread_marker(workload)
desired_content = _build_full_deterministic_thread_content(workload, deterministic_summary)
threads_payload = _request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
token=token,
)
threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
thread = _find_marked_thread(threads, marker)
if thread is None:
_create_deterministic_thread(repo_api, pr_id, token, workload, deterministic_summary)
return True
comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
if _thread_has_matching_comment(comments, desired_content):
return False
thread_id = _thread_id(thread)
if thread_id <= 0:
_create_deterministic_thread(repo_api, pr_id, token, workload, deterministic_summary)
return True
if _is_thread_resolved(thread):
_set_thread_status(repo_api, pr_id, thread_id, token, THREAD_STATUS_ACTIVE)
_add_thread_comment(repo_api, pr_id, thread_id, token, desired_content)
return True
def _close_deterministic_thread(
repo_api: str,
pr_id: int,
token: str,
workload: str,
) -> bool:
marker = _deterministic_thread_marker(workload)
threads_payload = _request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
token=token,
)
threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
thread = _find_marked_thread(threads, marker)
if thread is None:
return False
thread_id = _thread_id(thread)
if thread_id <= 0:
return False
if _is_thread_resolved(thread):
return False
_set_thread_status(repo_api, pr_id, thread_id, token, THREAD_STATUS_CLOSED)
return True
def _reviewer_guide_thread_marker(workload: str) -> str:
return f"Automation marker: {AUTO_REVIEWER_GUIDE_THREAD_PREFIX}{workload.strip().lower()}"
def _build_full_reviewer_guide_thread_content(workload: str) -> str:
marker = _reviewer_guide_thread_marker(workload)
return (
"## Reviewer Quick Actions\n\n"
"### 1) Accept all changes\n"
"- Merge PR to accept drift into baseline.\n\n"
"### 2) Reject whole PR and revert\n"
"- Set reviewer vote to **Reject**.\n"
"- Abandon PR.\n"
"- Auto-remediation queues restore (if `AUTO_REMEDIATE_ON_PR_REJECTION=true`).\n\n"
"### 3) Reject only selected policy changes\n"
"- In each `Change Needed` policy thread, comment `/reject` for changes you do not want.\n"
"- Optional: use `/accept` for changes you want to keep.\n"
"- Wait for review-sync pipeline (about 5 minutes) to update PR diff.\n"
"- Merge remaining accepted changes.\n"
"- Post-merge auto-remediation queues restore to reconcile tenant to merged baseline "
"(if `AUTO_REMEDIATE_AFTER_MERGE=true`).\n\n"
f"{marker}"
).strip()
def _create_reviewer_guide_thread(
repo_api: str,
pr_id: int,
token: str,
workload: str,
) -> None:
content = _build_full_reviewer_guide_thread_content(workload)
_request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
token=token,
method="POST",
body={
"comments": [
{
"parentCommentId": 0,
"content": content,
"commentType": 1,
}
],
"status": THREAD_STATUS_ACTIVE,
},
)
def _sync_reviewer_guide_thread(
repo_api: str,
pr_id: int,
token: str,
workload: str,
) -> bool:
marker = _reviewer_guide_thread_marker(workload)
desired_content = _build_full_reviewer_guide_thread_content(workload)
threads_payload = _request_json(
f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
token=token,
)
threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
thread = _find_marked_thread(threads, marker)
if thread is None:
_create_reviewer_guide_thread(repo_api, pr_id, token, workload)
return True
comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
if _thread_has_matching_comment(comments, desired_content):
return False
thread_id = _thread_id(thread)
if thread_id <= 0:
_create_reviewer_guide_thread(repo_api, pr_id, token, workload)
return True
if _is_thread_resolved(thread):
_set_thread_status(repo_api, pr_id, thread_id, token, THREAD_STATUS_ACTIVE)
_add_thread_comment(repo_api, pr_id, thread_id, token, desired_content)
return True
def _set_thread_status(
repo_api: str,
pr_id: int,
@@ -2530,12 +2740,16 @@ def main() -> int:
)
full_pr = _request_json(f"{repo_api}/pullrequests/{pr_id}?api-version=7.1", token=token)
current_description = full_pr.get("description", "")
current_description = full_pr.get("description") or ""
pr_is_draft = bool(full_pr.get("isDraft"))
existing_fingerprint = _existing_change_fingerprint(current_description)
existing_summary_version = _existing_summary_version(current_description)
current_auto_body = _auto_block_body(current_description)
deterministic_already_present = deterministic in current_auto_body if current_auto_body else False
compact_deterministic = _compact_deterministic_summary(deterministic)
deterministic_already_present = (
(deterministic in current_auto_body)
or (compact_deterministic in current_auto_body)
) if current_auto_body else False
ai_fallback_in_current_block = _auto_block_contains_ai_fallback(current_auto_body)
refresh_on_fallback = _env_bool("PR_AI_FORCE_REFRESH_ON_FALLBACK", default=True)
if existing_fingerprint and existing_fingerprint == changes_fingerprint:
@@ -2549,7 +2763,7 @@ def main() -> int:
repo_api=repo_api,
token=token,
pr_id=int(pr_id),
title=full_pr.get("title", pr.get("title", f"{args.workload} drift review (rolling)")),
title=full_pr.get("title") or pr.get("title") or f"{args.workload} drift review (rolling)",
description=current_description,
is_draft=pr_is_draft,
)
@@ -2625,29 +2839,44 @@ def main() -> int:
updated_description = _upsert_auto_block(current_description, auto_block)
# Cleanup legacy description-based ticket checklist if present.
updated_description = _remove_marked_block(updated_description, TICKET_BLOCK_START, TICKET_BLOCK_END)
# Strip legacy long reviewer guide and ensure compact note is present.
updated_description = _compact_reviewer_guide(updated_description)
updated_description = _append_reviewer_guide_note(updated_description)
patch_url = f"{repo_api}/pullrequests/{pr_id}?api-version=7.1"
patch_title = full_pr.get("title", pr.get("title", f"{args.workload} drift review (rolling)"))
patch_title = full_pr.get("title") or pr.get("title") or f"{args.workload} drift review (rolling)"
summary_updated = False
final_description = current_description
description_compacted = False
print(
f"DEBUG summary: pr_id={pr_id} workload={args.workload} "
f"status={full_pr.get('status')} isDraft={full_pr.get('isDraft')} "
f"mergeStatus={full_pr.get('mergeStatus')} title_len={len(patch_title)} "
f"current_desc_len={len(current_description or '')} updated_desc_len={len(updated_description or '')}"
)
# Proactively compact if we are near the Azure DevOps PR description limit.
if len(updated_description) > (ADO_PR_DESCRIPTION_MAX_LEN - 100):
description_compacted = True
if updated_description != current_description:
try:
_request_json(
patch_url,
token=token,
method="PATCH",
body={
"title": patch_title,
"description": updated_description,
},
)
summary_updated = True
final_description = updated_description
except RuntimeError as exc:
if not _is_description_limit_error(exc):
raise
description_compacted = True
if not description_compacted:
try:
_request_json(
patch_url,
token=token,
method="PATCH",
body={
"title": patch_title,
"description": updated_description,
},
)
summary_updated = True
final_description = updated_description
except RuntimeError as exc:
if not _is_description_limit_error(exc):
raise
description_compacted = True
if description_compacted:
compact_ai_block = ""
if ai_summary:
compact_ai_block = "\n### AI Reviewer Narrative\n" + COMPACT_AI_THREAD_NOTE
@@ -2660,6 +2889,8 @@ def main() -> int:
"",
f"- **Summary Version:** `{AUTO_SUMMARY_VERSION}`",
_compact_deterministic_summary(deterministic),
"",
COMPACT_DETERMINISTIC_THREAD_NOTE,
compact_ai_block,
AUTO_BLOCK_END,
]
@@ -2670,10 +2901,11 @@ def main() -> int:
)
if compact_description == updated_description:
raise
print(
"WARNING: Full PR summary update failed; retrying with compact summary block. "
f"Reason: {exc}"
)
if not summary_updated:
print(
"INFO: Full PR summary exceeds Azure DevOps description limit; "
"using compact summary in description and posting full details to a PR thread."
)
try:
_request_json(
patch_url,
@@ -2697,6 +2929,7 @@ def main() -> int:
f"- **Summary Version:** `{AUTO_SUMMARY_VERSION}`",
_compact_deterministic_summary(deterministic),
"",
COMPACT_DETERMINISTIC_THREAD_NOTE,
COMPACT_AI_THREAD_NOTE,
AUTO_BLOCK_END,
]
@@ -2720,6 +2953,34 @@ def main() -> int:
else:
final_description = updated_description
if description_compacted:
try:
thread_updated = _sync_deterministic_thread(
repo_api=repo_api,
pr_id=int(pr_id),
token=token,
workload=args.workload,
deterministic_summary=deterministic,
)
if thread_updated:
print(f"Updated full deterministic summary thread for PR #{pr_id} ({args.workload}).")
else:
print(f"Full deterministic summary thread already up to date for PR #{pr_id} ({args.workload}).")
except Exception as exc:
print(f"WARNING: Failed to sync full deterministic summary thread for PR #{pr_id}: {exc}")
else:
try:
closed = _close_deterministic_thread(
repo_api=repo_api,
pr_id=int(pr_id),
token=token,
workload=args.workload,
)
if closed:
print(f"Closed full deterministic summary thread for PR #{pr_id} ({args.workload}) because description now fits.")
except Exception as exc:
print(f"WARNING: Failed to close deterministic summary thread for PR #{pr_id}: {exc}")
if summary_updated:
print(f"Updated automated review summary for PR #{pr_id} ({args.workload}).")
else:
@@ -2739,6 +3000,19 @@ def main() -> int:
print(f"Full AI reviewer narrative thread already up to date for PR #{pr_id} ({args.workload}).")
except Exception as exc:
print(f"WARNING: Failed to sync full AI reviewer narrative thread for PR #{pr_id}: {exc}")
try:
guide_updated = _sync_reviewer_guide_thread(
repo_api=repo_api,
pr_id=int(pr_id),
token=token,
workload=args.workload,
)
if guide_updated:
print(f"Updated reviewer guide thread for PR #{pr_id} ({args.workload}).")
else:
print(f"Reviewer guide thread already up to date for PR #{pr_id} ({args.workload}).")
except Exception as exc:
print(f"WARNING: Failed to sync reviewer guide thread for PR #{pr_id}: {exc}")
if _publish_draft_pr(
repo_api=repo_api,
token=token,

View File

@@ -1,4 +1,4 @@
# tenant-state
This directory is populated automatically by the ASTRAL pipeline.
This directory is populated automatically by the ASTRAL backup pipeline.
Do not place manual files here; they will be overwritten on the next export.