Sync from dev @ 497baf0
Source: main (497baf0) Excluded: live tenant exports, generated artifacts, and dev-only tooling.
.gitignore (8 changes, vendored)

@@ -6,3 +6,11 @@ node_modules/
 __pycache__/
 *.py[cod]
 *$py.class
+package.json
+package-lock.json
+**/local.settings.json
+
+# Azure Function deployment artifacts (copied/generated during zip build)
+infra/change-probe/.python_packages/
+infra/change-probe/scripts/
+*.zip
AGENTS.md (22 changes)

@@ -6,7 +6,7 @@ This repository tracks Git-based snapshots of Microsoft Intune and Entra ID conf
 
 The implementation is centered on three Azure DevOps pipelines:
 
-- `azure-pipelines.yml`: hourly backup/export pipeline with rolling PR management.
+- `azure-pipelines.yml`: daily full backup/export pipeline with rolling PR management (previously hourly; now driven primarily by the event-driven change probe).
 - `azure-pipelines-review-sync.yml`: 20-minute reviewer-decision sync and post-merge remediation queue.
 - `azure-pipelines-restore.yml`: manual or auto-queued restore pipeline for approved baseline rollback.

@@ -33,15 +33,18 @@ Workflow at a high level:
 
 ```
 .
-├── azure-pipelines.yml             # Main hourly backup pipeline
+├── azure-pipelines.yml             # Main backup pipeline (daily snapshot + event-driven trigger)
 ├── azure-pipelines-review-sync.yml # 20-minute review sync
 ├── azure-pipelines-restore.yml     # Baseline restore pipeline
 ├── scripts/                        # Python automation helpers
 ├── tests/                          # unittest coverage for scripts
 ├── tenant-state/                   # Committed JSON exports and reports
 │   ├── intune/
 │   ├── entra/
 │   └── reports/
+├── infra/                          # Azure Function App (change probe)
+│   └── change-probe/
 ├── deploy/                         # Infrastructure provisioning scripts
 ├── docs/                           # Security review docs and roadmap
 ├── md2pdf/                         # HTML/PDF styling and configs
 ├── prod-as-built.md                # Generated as-built source

@@ -63,6 +66,8 @@ Workflow at a high level:
 - `update_pr_review_summary.py`: refreshes PR descriptions with change counts, risk assessment, and optional AI narrative.
 - `apply_reviewer_rejections.py`: processes `/reject` and `/accept` reviewer thread commands.
 - `queue_post_merge_restore.py`: queues restore pipeline after merged PRs that contained `/reject` decisions.
+- `probe_tenant_changes.py`: polls Intune/Entra audit logs via Graph, implements the debouncer (idle → armed → cooldown), and decides whether to trigger a backup.
+- `trigger_backup_pipeline.py`: thin ADO REST API wrapper to queue the backup pipeline on demand.
 
 ## Code Style and Conventions

@@ -106,6 +111,17 @@ pip3 install "IntuneCD==2.5.0"
 
 For local development, only a Python 3 interpreter is required; scripts use the standard library except for the optional IntuneCD package.
 
+### Change Probe (Event-Driven Backup Trigger)
+
+Because Microsoft Graph change notifications and delta queries do not support Intune device management or Conditional Access resources, an audit-log polling architecture is used instead:
+
+- **Azure Function App** (`infra/change-probe/`):
+  - `probe_timer`: 5-minute timer trigger. Loads debouncer state from Azure Table Storage, runs `probe_tenant_changes.py`, writes state back, and emits a queue message when the debouncer triggers.
+  - `queue_consumer`: queue trigger. Dequeues messages and calls `trigger_backup_pipeline.py` to queue the ADO backup pipeline.
+- **Debouncer**: 15-minute quiet window (idle → armed) + 30-minute cooldown. Prevents backup storms during bulk changes.
+- **State**: stored in Azure Table Storage (`ProbeState` table).
+- **Provisioning**: `deploy/provision-change-probe.ps1` creates the Entra app, grants admin consent, provisions Resource Group / Storage Account / Function App, and configures app settings.
+
 ### Pipeline Jobs
 
 - **Intune backup job** (`backup_intune`):
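The idle → armed → cooldown debouncer described in the AGENTS.md hunk above can be sketched as a small state machine. This is a minimal illustration of the documented behavior (quiet window re-armed by each new audit event, cooldown after firing); the actual state handling in `probe_tenant_changes.py` may differ in detail.

```python
from dataclasses import dataclass

QUIET_WINDOW_MIN = 15  # PROBE_QUIET_WINDOW_MINUTES default
COOLDOWN_MIN = 30      # PROBE_COOLDOWN_MINUTES default

@dataclass
class Debouncer:
    state: str = "idle"            # idle | armed | cooldown
    last_change_min: float = 0.0   # minute of the most recent observed audit event
    cooldown_until: float = 0.0

    def observe(self, now_min: float, changes_seen: bool) -> bool:
        """Advance the state machine; return True when a backup should fire."""
        if self.state == "cooldown":
            if now_min >= self.cooldown_until:
                self.state = "idle"
            else:
                return False  # suppress triggers during cooldown
        if changes_seen:
            # Any new change (re)arms the debouncer and restarts the quiet window.
            self.state = "armed"
            self.last_change_min = now_min
            return False
        if self.state == "armed" and now_min - self.last_change_min >= QUIET_WINDOW_MIN:
            # Quiet window elapsed with no further changes: fire once, then cool down.
            self.state = "cooldown"
            self.cooldown_until = now_min + COOLDOWN_MIN
            return True
        return False
```

A burst of changes therefore produces exactly one trigger, 15 minutes after the last change, and at most one trigger per 30-minute cooldown.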
README.md (57 changes)

@@ -13,7 +13,7 @@ Quick start:
 1. Fork or import this repository into an Azure DevOps project.
 2. Review `templates/variables-tenant.yml` and create a matching Azure DevOps Variable Group in your project (e.g. `vg-astral-tenant`).
 3. Uncomment the variable group reference in the three pipeline YAMLs.
-4. Run `deploy/bootstrap-tenant.ps1` to create the Azure AD app registration, assign Graph permissions, and configure the federated credential.
+4. Run `deploy/provision-change-probe.ps1` to create the Azure AD app registration, assign Graph permissions, configure the federated credential, and optionally provision the event-driven change probe (Azure Function App).
 5. Create the Azure DevOps service connection using the app registration details from the bootstrap script.
 6. Import the three pipelines (`azure-pipelines.yml`, `azure-pipelines-review-sync.yml`, `azure-pipelines-restore.yml`) into Azure DevOps.
 7. Run `deploy/validate-deployment.yml` to verify connectivity and permissions.

@@ -25,7 +25,7 @@ See [`deploy/onboarding-runbook.md`](deploy/onboarding-runbook.md) for the full
 
 The implementation is centered on three Azure DevOps pipelines:
 
-- `azure-pipelines.yml`: hourly backup/export pipeline with rolling PR management.
+- `azure-pipelines.yml`: backup/export pipeline with rolling PR management. Runs daily at 02:00 to generate a full tenant snapshot, reports, and documentation artifacts, and is also triggered on demand by the event-driven change probe.
 - `azure-pipelines-review-sync.yml`: 20-minute reviewer-decision sync and post-merge remediation queue.
 - `azure-pipelines-restore.yml`: manual or auto-queued restore pipeline for approved baseline rollback.

@@ -39,6 +39,8 @@ The main workflow is:
 6. Refresh the PR description with deterministic change/risk summary and optional Azure OpenAI narrative.
 7. Apply reviewer `/reject` or `/accept` decisions and queue restore when needed.
 
+An **event-driven change probe** monitors Intune and Entra audit logs and triggers the backup pipeline when actual changes are detected, replacing the previous hourly polling model.
+
 This is an ex-post change-management model: admins can change settings in the Microsoft admin portals, and the repo turns those changes into auditable Git drift with a review and rollback path.
 
 ## Current Baseline Coverage

@@ -80,10 +82,12 @@ Current scope behavior:
 - `azure-pipelines.yml`: backup/export, report generation, drift commit, rolling PR, and docs/artifact flow.
 - `azure-pipelines-review-sync.yml`: reviewer decision sync and post-merge remediation helper.
 - `azure-pipelines-restore.yml`: baseline restore pipeline with full or selective scope.
+- `infra/change-probe/`: Azure Function App for event-driven change detection.
+- `deploy/provision-change-probe.ps1`: unified provisioning script for the change probe infrastructure.
 - `docs/m365-baseline-roadmap.md`: expansion roadmap beyond current workload scope.
 - `docs/security-review-package.md`: implementation-focused security review package.
 - `docs/security-review-questionnaire.md`: short-form security review answers.
-- `scripts/`: export, reporting, PR automation, validation, and remediation helpers.
+- `scripts/`: export, reporting, PR automation, validation, remediation helpers, and change probe logic.
 - `tests/`: focused unit coverage for the Python helpers.
 - `tenant-state/intune`: committed Intune JSON export.
 - `tenant-state/entra`: committed Entra JSON export.

@@ -96,7 +100,7 @@ Current scope behavior:
 
 ### Main Backup Pipeline
 
-`azure-pipelines.yml` runs hourly on `main`.
+`azure-pipelines.yml` runs daily at 02:00 on `main` to generate a full tenant snapshot, reports, and documentation artifacts. It is also triggered on demand by the change probe when drift is detected.
 
 For Intune it:

@@ -143,10 +147,21 @@ It also supports optional Entra update when restore automation is triggered for
 
 ## Schedule And Run Modes
 
-- Main backup schedule: hourly, `0 * * * *`, on `main`
+- Main backup schedule: daily at 02:00, `0 2 * * *`, on `main` (full snapshot, reports, and docs)
+- Change probe trigger: event-driven, on demand via the Azure Function App
 - Review sync schedule: every 20 minutes, `*/20 * * * *`, on `main`
 - Full mode: configured full-run hour (default 00:00) or manual queue with `forceFullRun=true`
-- Light mode: every other scheduled hour
+- Light mode: all probe-triggered runs except the daily full run
 
+### Change Probe (Event-Driven Backup Trigger)
+
+Because Microsoft Graph change notifications and delta queries do not support Intune device management or Conditional Access resources, an audit-log polling architecture is used:
+
+- **`probe_timer`** (5-minute timer trigger): polls Intune and Entra audit logs via Microsoft Graph, evaluates a debouncer state machine (idle → armed → cooldown), and emits a queue message when the quiet window elapses.
+- **`queue_consumer`** (queue trigger): dequeues messages and calls the Azure DevOps REST API to queue the backup pipeline.
+- **Debouncer**: a 15-minute quiet window + 30-minute cooldown prevents backup storms during bulk changes.
+- **State**: stored in Azure Table Storage (`ProbeState` table).
+- **Provisioning**: `deploy/provision-change-probe.ps1` creates the Entra app, grants admin consent, provisions Resource Group / Storage Account / Function App, and configures app settings.
+
 Full mode adds:

@@ -251,6 +266,14 @@ Auto-remediation:
 - `AUTO_REMEDIATE_MAX_WORKERS`
 - `AUTO_REMEDIATE_EXCLUDE_CSV`
 
+Change probe settings:
+
+- `PROBE_APP_ID`
+- `PROBE_APP_SECRET`
+- `PROBE_QUIET_WINDOW_MINUTES` (default: 15)
+- `PROBE_COOLDOWN_MINUTES` (default: 30)
+- `GRAPH_TOKEN` (optional passthrough)
+
 Azure OpenAI integration:
 
 - `ENABLE_PR_AI_SUMMARY`

@@ -408,6 +431,28 @@ python3 ./scripts/validate_backup_outputs.py \
   --reports-root ./tenant-state/reports/intune
 ```
 
+Run the change probe locally:
+
+```bash
+python3 ./scripts/probe_tenant_changes.py \
+  --app-id "$PROBE_APP_ID" \
+  --app-secret "$PROBE_APP_SECRET" \
+  --tenant-id "$TENANT_ID" \
+  --state-file ./probe-state.json \
+  --output ./probe-result.json
+```
+
+Trigger the backup pipeline manually:
+
+```bash
+python3 ./scripts/trigger_backup_pipeline.py \
+  --organization cqre \
+  --project Intune \
+  --pipeline-id 1 \
+  --token "$ADO_TOKEN" \
+  --branch refs/heads/main
+```
+
 ## Tests
 
 The repository includes focused unit tests for:
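The README describes `trigger_backup_pipeline.py` as a thin wrapper over the Azure DevOps REST API. The sketch below shows the underlying "Runs - Run Pipeline" request it presumably builds (endpoint and body shape are from the public ADO REST API; the script's actual internals are not shown in this diff):

```python
import base64
import json

def build_run_request(organization: str, project: str, pipeline_id: int,
                      token: str, branch: str = "refs/heads/main"):
    """Build URL, headers, and body for the ADO 'Runs - Run Pipeline' call (api-version 7.0)."""
    url = (f"https://dev.azure.com/{organization}/{project}"
           f"/_apis/pipelines/{pipeline_id}/runs?api-version=7.0")
    # A PAT is sent as HTTP Basic auth with an empty username.
    auth = base64.b64encode(f":{token}".encode()).decode()
    headers = {"Authorization": f"Basic {auth}", "Content-Type": "application/json"}
    # The branch to run against goes under resources.repositories.self.refName.
    body = json.dumps({"resources": {"repositories": {"self": {"refName": branch}}}})
    return url, headers, body
```

POSTing this request queues a run of the given pipeline definition; runs queued this way show up in Azure DevOps with reason `manual`, as the onboarding runbook notes.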
@@ -6,8 +6,8 @@ parameters:
     default: false
 
 schedules:
-  - cron: "0 * * * *"
-    displayName: "Hourly backup (full run at configured timezone)"
+  - cron: "0 2 * * *"
+    displayName: "Daily full backup and report generation"
     branches:
       include:
         - main

@@ -369,6 +369,19 @@ jobs:
       workingDirectory: "$(Build.SourcesDirectory)"
       failOnStderr: true
 
+  - task: Bash@3
+    displayName: Revert formatting-only Intune JSON exports
+    inputs:
+      targetType: inline
+      script: |
+        set -euo pipefail
+        python3 "$(Build.SourcesDirectory)/scripts/filter_intune_formatting_noise.py" \
+          --repo-root "$(Build.SourcesDirectory)" \
+          --backup-root "$(Build.SourcesDirectory)/$(BACKUP_FOLDER)/$(INTUNE_BACKUP_SUBDIR)" \
+          --baseline-ref "origin/$(BASELINE_BRANCH)"
+      workingDirectory: "$(Build.SourcesDirectory)"
+      failOnStderr: true
+
   - task: Bash@3
     displayName: Resolve assignment group names
    inputs:

@@ -1017,13 +1030,14 @@ jobs:
           }
 
           $backupStart = [DateTime]::ParseExact("$(BACKUP_START)", "yyyy.MM.dd:HH.mm.ss", $null).ToUniversalTime()
-          $filterDateTimeTo = Get-Date -Date $backupStart -Format "yyyy-MM-ddTHH:mm:ss"
+          $auditQueryEnd = $backupStart.AddMinutes(-10)
+          $filterDateTimeTo = Get-Date -Date $auditQueryEnd -Format "yyyy-MM-ddTHH:mm:ss"
           $filter += "ActivityDateTime le $filterDateTimeTo`Z"
 
           $eventFilter = $filter -join " and "
 
           "`nGetting Intune event logs"
-          "`t- from: '$lastCommitDate' (UTC) to: '$backupStart' (UTC)"
+          "`t- from: '$lastCommitDate' (UTC) to: '$auditQueryEnd' (UTC)"
           "`t- filter: $eventFilter"
           $modificationEvent = Get-MgDeviceManagementAuditEvent -Filter $eventFilter -All
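The last hunk above backs the audit-event query window off by 10 minutes from the backup start, so that late-ingested audit events are not silently skipped between runs. The same arithmetic in Python (the timestamp is illustrative; the real value comes from the `BACKUP_START` pipeline variable):

```python
from datetime import datetime, timedelta

# Parse a BACKUP_START-style value ("yyyy.MM.dd:HH.mm.ss" in the PowerShell above).
backup_start = datetime.strptime("2024.06.01:02.00.00", "%Y.%m.%d:%H.%M.%S")

# Query end is 10 minutes before backup start, to tolerate audit-log ingestion lag.
audit_query_end = backup_start - timedelta(minutes=10)
filter_to = audit_query_end.strftime("%Y-%m-%dT%H:%M:%S")

# Graph OData clause, matching the "ActivityDateTime le ...Z" filter built above.
clause = f"ActivityDateTime le {filter_to}Z"
```

Events in the 10-minute buffer are picked up by the next run, since the next query starts from the last commit date.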
@@ -24,6 +24,8 @@ Expected result: **zero matches** outside of this release checklist.
 - [ ] `azure-pipelines-restore.yml` contains no hardcoded tenant domain, email, or service connection name.
 - [ ] `azure-pipelines-review-sync.yml` contains no hardcoded tenant-specific values.
 - [ ] `scripts/common.py` uses a generic fallback name (not `CQRE_Intune_Backupper`).
+- [ ] `infra/change-probe/` contains no tenant-specific IDs, secrets, or connection strings.
+- [ ] `infra/change-probe/local.settings.json` is excluded (only `.example` should exist).
 - [ ] `tenant-state/` contains only placeholder files (`.gitkeep`, `README.md`).
 - [ ] `prod-as-built.md` has been deleted.
 - [ ] All markdown documentation uses generic examples (`contoso.onmicrosoft.com`, `astral-backup@contoso.com`, `sc-astral-backup`).
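The checklist above amounts to a grep for tenant-specific strings across the repo. A minimal sketch of such a scan (the `FORBIDDEN` patterns are illustrative examples taken from the checklist, not the project's actual tooling):

```python
import re

# Example patterns from the release checklist; extend with tenant domains, emails, etc.
FORBIDDEN = [
    re.compile(r"CQRE_Intune_Backupper"),
    re.compile(r"\bcqre\b", re.IGNORECASE),
]

def scan_text(name: str, text: str):
    """Return (file, line_no, line) hits for tenant-specific strings."""
    hits = []
    for no, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in FORBIDDEN):
            hits.append((name, no, line.strip()))
    return hits
```

Running something like this over every tracked file before release gives the "zero matches" result the checklist expects.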
@@ -130,6 +130,64 @@ After importing `azure-pipelines-restore.yml`, find its definition ID:
 2. Set `forceFullRun=true` to get a complete initial snapshot.
 3. Verify that `tenant-state/` is populated and a rolling PR is created.
 
+## Step 11: Provision the event-driven change probe (optional but recommended)
+
+The change probe replaces the previous hourly polling model with responsive, event-driven backup triggers.
+
+### Option A: Automated provisioning
+
+Run the unified provisioning script:
+
+```powershell
+.\deploy\provision-change-probe.ps1 `
+    -TenantName "contoso.onmicrosoft.com" `
+    -ResourceGroupName "rg-astral-probe" `
+    -Location "westeurope" `
+    -DeployFunctionApp
+```
+
+The script creates an Entra app, grants admin consent, provisions Azure resources, and deploys the Function App.
+
+### Option B: Manual provisioning
+
+If you prefer manual setup:
+
+1. **Create an app registration** in Entra ID for the probe.
+2. **Grant admin consent** for:
+   - `DeviceManagementConfiguration.Read.All`
+   - `DeviceManagementApps.Read.All`
+   - `AuditLog.Read.All`
+   - `Directory.Read.All`
+3. **Create a client secret** and note the value.
+4. **Provision Azure resources**:
+   - Resource Group
+   - Storage Account (Standard LRS)
+   - Function App (Linux Consumption, Python 3.11)
+5. **Configure Function App settings**:
+
+   | Setting | Value |
+   |---|---|
+   | `AzureWebJobsStorage` | Storage account connection string |
+   | `PROBE_APP_ID` | App registration client ID |
+   | `PROBE_APP_SECRET` | App registration client secret |
+   | `TENANT_ID` | Your Microsoft 365 tenant ID |
+   | `ADO_ORGANIZATION` | Your Azure DevOps org name |
+   | `ADO_PROJECT` | Your Azure DevOps project name |
+   | `ADO_PIPELINE_ID` | Definition ID of `azure-pipelines.yml` |
+   | `ADO_TOKEN` | Azure DevOps PAT with **Build (read & execute)** |
+   | `ADO_BRANCH` | `main` (or your baseline branch) |
+
+6. **Deploy the function package** using `WEBSITE_RUN_FROM_PACKAGE` (see `infra/change-probe/README.md`).
+
+### Verify the probe
+
+1. Make a test change in Intune (e.g., create a temporary device configuration profile).
+2. Wait 5–20 minutes for the audit log to propagate.
+3. Check the `ProbeState` table in your Storage Account: the `singleton/default` entity should show `debouncer.state = armed`.
+4. After the quiet window (default 15 min) elapses, a queue message is emitted.
+5. The `queue_consumer` dequeues it and queues the backup pipeline.
+6. Verify that the pipeline run appears in Azure DevOps with reason `manual` (API-triggered runs show as manual).
+
+> **Note:** The probe uses the same Entra app as the main backup pipeline. You can reuse the app registration created by `bootstrap-tenant.ps1` if you add the `AuditLog.Read.All` permission and create a client secret for it.
+
 ## Optional: progressive feature rollout
 
 | Phase | What to enable |
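The probe's polling step boils down to a Microsoft Graph query for audit events newer than the last check. The endpoint below is the real Graph Intune audit-event collection (Entra uses `auditLogs/directoryAudits` analogously); the exact query options the probe sends are an assumption for illustration:

```python
from datetime import datetime, timezone
from urllib.parse import quote

def audit_events_url(since: datetime) -> str:
    """Build the Intune audit-event poll URL used conceptually by probe_timer."""
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # OData filter: only events after the last successful probe.
    flt = f"activityDateTime ge {stamp}"
    return ("https://graph.microsoft.com/v1.0/deviceManagement/auditEvents"
            f"?$filter={quote(flt)}&$top=1")
```

If the response contains any event, the debouncer is (re)armed; an empty response during the armed state lets the quiet window run down toward a trigger.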
deploy/provision-change-probe.ps1 (578 lines, new executable file)

@@ -0,0 +1,578 @@
#requires -Version 5.1
<#
.SYNOPSIS
    One-stop provisioning script for the ASTRAL change probe.

.DESCRIPTION
    This script handles the entire probe deployment in one pass:
    1. Creates (or updates) a dedicated Entra app registration with Graph permissions.
    2. Grants admin consent.
    3. Provisions Azure resources (Resource Group, Storage Account, Function App).
    4. Configures Function App settings.
    5. Optionally deploys the function code if the Azure Functions Core Tools (func) are installed.

    Any parameter omitted on the command line is prompted for interactively.

.PARAMETER AppDisplayName
    Display name for the Entra app registration. Default: "ASTRAL Change Probe".

.PARAMETER ResourceGroup
    Azure resource group name. Default: "rg-astral-probe".

.PARAMETER Location
    Azure region. Default: "westeurope".

.PARAMETER SubscriptionId
    Azure subscription ID. If omitted, the current default subscription is used.

.PARAMETER AdoOrganization
    Azure DevOps organization name (e.g. "contoso").

.PARAMETER AdoProject
    Azure DevOps project name.

.PARAMETER AdoPipelineId
    Azure DevOps pipeline ID (numeric).

.PARAMETER AdoToken
    Azure DevOps Personal Access Token with Build (Read & Execute) scope.

.PARAMETER AdoBranch
    Git branch the pipeline should run against. Default: "main".

.PARAMETER QuietWindowMinutes
    Debouncer quiet window. Default: 15.

.PARAMETER CooldownMinutes
    Debouncer cooldown. Default: 30.

.EXAMPLE
    .\provision-change-probe.ps1

.EXAMPLE
    .\provision-change-probe.ps1 -AdoOrganization "cqre" -AdoProject "ASTRAL" -AdoPipelineId "42"
#>
[CmdletBinding()]
param (
    [string]$AppDisplayName = "ASTRAL Change Probe",
    [string]$ResourceGroup = "rg-astral-probe",
    [string]$Location = "westeurope",
    [string]$SubscriptionId = "",
    [string]$AdoOrganization = "",
    [string]$AdoProject = "",
    [string]$AdoPipelineId = "",
    [string]$AdoToken = "",
    [string]$AdoBranch = "main",
    [int]$QuietWindowMinutes = 15,
    [int]$CooldownMinutes = 30
)

$ErrorActionPreference = "Stop"

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

function Get-OrPrompt {
    param ([string]$Value, [string]$Prompt, [switch]$Sensitive)
    if ($Value) { return $Value }
    if ($Sensitive) {
        return Read-Host -Prompt $Prompt -AsSecureString | ForEach-Object { [PSCredential]::New("x", $_).GetNetworkCredential().Password }
    }
    return Read-Host -Prompt $Prompt
}

function Test-Command {
    param ([string]$Name)
    return [bool](Get-Command $Name -ErrorAction SilentlyContinue)
}

function Invoke-AzCli {
    param (
        [string[]]$ArgumentList,
        [switch]$NoRetry
    )
    # Clone the array so recursive calls don't double-append --subscription.
    $argsCopy = @() + $ArgumentList
    if ($SubscriptionId) {
        $argsCopy += @("--subscription", $SubscriptionId)
    }
    # Suppress Python SyntaxWarnings that leak from the Azure CLI into stderr/stdout.
    $env:PYTHONWARNINGS = "ignore"
    $output = & az @argsCopy 2>&1
    $env:PYTHONWARNINGS = ""
    if ($LASTEXITCODE -ne 0) {
        $outputStrings = @()
        $hasSubNotFound = $false
        foreach ($line in $output) {
            $str = if ($line -is [string]) { $line } else { $line.ToString() }
            $outputStrings += $str
            if ($str -match "SubscriptionNotFound") { $hasSubNotFound = $true }
        }
        $outputString = $outputStrings -join "`n"
        if ((-not $NoRetry) -and $hasSubNotFound) {
            Write-Host "`nARM returned SubscriptionNotFound. Clearing token cache and re-authenticating..." -ForegroundColor Yellow
            $subTenantId = Get-SubscriptionTenantId -SubId $SubscriptionId
            $promptTenant = if ($subTenantId) { $subTenantId } else { $tenantId }
            & az account clear | Out-Null
            & az login --tenant $promptTenant | Out-Host
            if ($LASTEXITCODE -ne 0) { throw "az login --tenant $promptTenant failed." }
            # Explicitly set the subscription and give the token cache time to settle.
            & az account set --subscription $SubscriptionId | Out-Null
            Start-Sleep -Seconds 2
            Invoke-AzCli -ArgumentList $ArgumentList -NoRetry
            return
        }
        throw "az command failed: az $($argsCopy -join ' ')`n$outputString"
    }
    return $output
}

function Test-ModuleInstalled {
    param ([string]$Name)
    $mod = Get-Module -ListAvailable -Name $Name | Select-Object -First 1
    if (-not $mod) {
        Write-Host "Installing module: $Name" -ForegroundColor Cyan
        Install-Module $Name -Scope CurrentUser -Force -AllowClobber
    }
}

# ---------------------------------------------------------------------------
# Prerequisites
# ---------------------------------------------------------------------------

Write-Host "=== ASTRAL Change Probe Provisioning ===" -ForegroundColor Green

if (-not (Test-Command "az")) {
    throw "Azure CLI (az) is not installed or not in PATH. Install from https://aka.ms/installazurecli"
}

Write-Host "Checking Microsoft Graph modules..." -ForegroundColor Cyan
Test-ModuleInstalled "Microsoft.Graph.Applications"
Import-Module Microsoft.Graph.Applications

# ---------------------------------------------------------------------------
# Interactive prompts
# ---------------------------------------------------------------------------

Write-Host "`n--- Azure DevOps Settings ---" -ForegroundColor Cyan
$AdoOrganization = Get-OrPrompt -Value $AdoOrganization -Prompt "Azure DevOps Organization (e.g. 'cqre')"
$AdoProject = Get-OrPrompt -Value $AdoProject -Prompt "Azure DevOps Project"
$AdoPipelineId = Get-OrPrompt -Value $AdoPipelineId -Prompt "Azure DevOps Pipeline ID (numeric)"
$AdoToken = Get-OrPrompt -Value $AdoToken -Prompt "Azure DevOps PAT (Build Read & Execute)" -Sensitive

# ---------------------------------------------------------------------------
# Graph authentication
# ---------------------------------------------------------------------------

Write-Host "`nConnecting to Microsoft Graph..." -ForegroundColor Cyan
Connect-MgGraph -Scopes "Application.ReadWrite.All","AppRoleAssignment.ReadWrite.All","Directory.Read.All" -NoWelcome

$tenant = Get-MgOrganization | Select-Object -First 1
Write-Host "Tenant: $($tenant.DisplayName) ($($tenant.Id))" -ForegroundColor Green

# ---------------------------------------------------------------------------
# App registration
# ---------------------------------------------------------------------------

$requiredPermissions = @(
    "AuditLog.Read.All",
    "DeviceManagementApps.Read.All",
    "DeviceManagementConfiguration.Read.All",
    "DeviceManagementManagedDevices.Read.All",
    "DeviceManagementScripts.Read.All",
    "DeviceManagementServiceConfig.Read.All"
)

$graphSp = Get-MgServicePrincipal -Filter "appId eq '00000003-0000-0000-c000-000000000000'"
if (-not $graphSp) { throw "Microsoft Graph service principal not found." }

$appRoles = @()
foreach ($permName in $requiredPermissions) {
    $appRole = $graphSp.AppRoles | Where-Object { $_.Value -eq $permName } | Select-Object -First 1
    if (-not $appRole) {
        Write-Warning "Permission '$permName' not found. Skipping."
        continue
    }
    $appRoles += $appRole
}

$resourceAccess = @()
foreach ($ar in $appRoles) {
    $resourceAccess += @{ id = $ar.Id; type = "Role" }
}

$requiredResourceAccess = @(
    @{
        resourceAppId  = $graphSp.AppId
        resourceAccess = $resourceAccess
    }
)

$existingApp = Get-MgApplication -Filter "displayName eq '$AppDisplayName'" | Select-Object -First 1
if ($existingApp) {
    Write-Host "Found existing app registration: $($existingApp.AppId)" -ForegroundColor Yellow
    $app = $existingApp
    Update-MgApplication -ApplicationId $app.Id -RequiredResourceAccess $requiredResourceAccess
    Write-Host "Updated required resource access." -ForegroundColor Green
} else {
    Write-Host "Creating app registration: $AppDisplayName" -ForegroundColor Cyan
    $app = New-MgApplication -DisplayName $AppDisplayName -SignInAudience "AzureADMyOrg" -RequiredResourceAccess $requiredResourceAccess
    Write-Host "Created app registration. AppId: $($app.AppId)" -ForegroundColor Green
}

$sp = Get-MgServicePrincipal -Filter "appId eq '$($app.AppId)'" | Select-Object -First 1
if (-not $sp) {
    Write-Host "Creating service principal..." -ForegroundColor Cyan
    $sp = New-MgServicePrincipal -AppId $app.AppId
}

Write-Host "Granting admin consent..." -ForegroundColor Cyan
foreach ($ar in $appRoles) {
    $existingAssignment = Get-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id | Where-Object { $_.AppRoleId -eq $ar.Id }
    if (-not $existingAssignment) {
        New-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $sp.Id -PrincipalId $sp.Id -ResourceId $graphSp.Id -AppRoleId $ar.Id | Out-Null
    }
}
Write-Host "Admin consent granted." -ForegroundColor Green

# Client secret
$secretDescription = "ChangeProbeSecret"
$appWithCreds = Get-MgApplication -ApplicationId $app.Id -Property "id,passwordCredentials"
$existingSecrets = $appWithCreds.PasswordCredentials | Where-Object { $_.DisplayName -eq $secretDescription }
foreach ($cred in $existingSecrets) {
    Write-Host "Removing old client secret ($($cred.KeyId))..." -ForegroundColor Yellow
    Remove-MgApplicationPassword -ApplicationId $app.Id -BodyParameter @{ "keyId" = $cred.KeyId }
}

Write-Host "Creating new client secret (valid 1 year)..." -ForegroundColor Cyan
$passwordCred = @{
    displayName = $secretDescription
    endDateTime = (Get-Date).AddYears(1).ToString("o")
}
$secret = Add-MgApplicationPassword -ApplicationId $app.Id -BodyParameter $passwordCred
$probeAppId = $app.AppId
$probeAppSecret = $secret.SecretText
$tenantId = $tenant.Id

# ---------------------------------------------------------------------------
# Azure authentication
# ---------------------------------------------------------------------------

Write-Host "`n--- Azure Resources ---" -ForegroundColor Cyan

function Ensure-AzLogin {
    param ([string]$TenantId)
    try {
        $null = Invoke-AzCli -ArgumentList @("account", "show", "--output", "none")
    } catch {
        if ($_ -match "az login") {
            $answer = Read-Host -Prompt "You are not logged in to Azure CLI. Run 'az login' now? [Y/n]"
            if ($answer -eq "" -or $answer -match "^[Yy]") {
                if ($TenantId) {
                    & az login --tenant $TenantId | Out-Host
                } else {
                    & az login | Out-Host
                }
                if ($LASTEXITCODE -ne 0) {
                    throw "az login failed. Please run 'az login' manually and retry."
                }
            } else {
                throw "Azure login required. Run 'az login' and retry."
            }
        } else {
            throw
        }
    }
}

Ensure-AzLogin -TenantId $tenantId

function Select-Subscription {
    param ([string]$CurrentId)
    # Run az directly and filter out stderr warning objects so only stdout strings reach ConvertFrom-Json.
    $lines = & az account list --output json 2>&1
    $stringLines = $lines | Where-Object { $_ -is [string] }
    if ($LASTEXITCODE -ne 0) {
        $errorLines = $lines | Where-Object { $_ -is [System.Management.Automation.ErrorRecord] } | ForEach-Object { $_.ToString() }
        throw "az account list failed:`n$($errorLines -join "`n")"
    }
    $subs = ($stringLines -join "`n") | ConvertFrom-Json
    if ($subs.Count -eq 0) {
        throw "No Azure subscriptions found. Ensure your account has access to at least one subscription."
    }
    if ($subs.Count -eq 1) {
        $sub = $subs[0]
        Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $sub.id)
        return $sub
    }
    Write-Host "`nAvailable subscriptions:" -ForegroundColor Cyan
    for ($i = 0; $i -lt $subs.Count; $i++) {
        $marker = if ($subs[$i].id -eq $CurrentId) { " (*)" } else { "" }
        Write-Host "  [$i] $($subs[$i].name) ($($subs[$i].id))$marker"
    }
    $selection = Read-Host -Prompt "Select subscription by number"
    if (-not [int]::TryParse($selection, [ref]$null)) {
        throw "Invalid selection. Aborting."
    }
    $chosen = $subs[[int]$selection]
    if (-not $chosen) {
        throw "Invalid selection. Aborting."
    }
    Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $chosen.id)
    return $chosen
}

$azLines = & az account show --output json 2>&1
$azStringLines = $azLines | Where-Object { $_ -is [string] }
if ($LASTEXITCODE -ne 0) {
    $azErrorLines = $azLines | Where-Object { $_ -is [System.Management.Automation.ErrorRecord] } | ForEach-Object { $_.ToString() }
    throw "az account show failed:`n$($azErrorLines -join "`n")"
}
$azAccount = ($azStringLines -join "`n") | ConvertFrom-Json
$currentSubId = $azAccount.id

function Get-SubscriptionTenantId {
    param ([string]$SubId)
    $lines = & az account list --output json 2>&1
    $stringLines = $lines | Where-Object { $_ -is [string] }
    $subs = ($stringLines -join "`n") | ConvertFrom-Json
    $sub = $subs | Where-Object { $_.id -eq $SubId } | Select-Object -First 1
    if ($sub) { return $sub.tenantId } else { return $null }
}

if ($SubscriptionId) {
    Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $SubscriptionId)
    $subTenantId = Get-SubscriptionTenantId -SubId $SubscriptionId
    $azTenantLines = & az account show --query tenantId --output tsv 2>&1 | Where-Object { $_ -is [string] }
    $azTenantId = ($azTenantLines -join "").Trim()
    if ($subTenantId -and $azTenantId -ne $subTenantId) {
        Write-Host "`nSubscription '$SubscriptionId' belongs to tenant '$subTenantId' but the current az context is '$azTenantId'." -ForegroundColor Yellow
        Write-Host "Re-authenticating to the subscription's tenant..." -ForegroundColor Yellow
        & az account clear | Out-Null
        & az login --tenant $subTenantId | Out-Host
        if ($LASTEXITCODE -ne 0) { throw "az login --tenant $subTenantId failed." }
        Invoke-AzCli -ArgumentList @("account", "set", "--subscription", $SubscriptionId)
    }
    Write-Host "Using specified subscription: $SubscriptionId" -ForegroundColor Green
} else {
    $chosenSub = Select-Subscription -CurrentId $currentSubId
    $SubscriptionId = $chosenSub.id
    $subTenantId = $chosenSub.tenantId
    $azTenantLines = & az account show --query tenantId --output tsv 2>&1 | Where-Object { $_ -is [string] }
    $azTenantId = ($azTenantLines -join "").Trim()
    if ($subTenantId -and $azTenantId -ne $subTenantId) {
        Write-Host "`nSubscription '$SubscriptionId' belongs to tenant '$subTenantId' but the current az context is '$azTenantId'." -ForegroundColor Yellow
        Write-Host "Re-authenticating to the subscription's tenant..." -ForegroundColor Yellow
        & az account clear | Out-Null
        & az login --tenant $subTenantId | Out-Host
        if ($LASTEXITCODE -ne 0) { throw "az login --tenant $subTenantId failed." }
        $chosenSub = Select-Subscription -CurrentId $SubscriptionId
|
||||
$SubscriptionId = $chosenSub.id
|
||||
}
|
||||
Write-Host "Using subscription: $SubscriptionId" -ForegroundColor Green
|
||||
}
|
||||
|
||||
# Validate the subscription is accessible for ARM operations (catches tenant mismatches).
|
||||
try {
|
||||
$null = Invoke-AzCli -ArgumentList @("group", "list", "--output", "none")
|
||||
} catch {
|
||||
if ($_ -match "SubscriptionNotFound") {
|
||||
Write-Host "`nThe selected subscription is listed but ARM operations fail with 'SubscriptionNotFound'." -ForegroundColor Yellow
|
||||
Write-Host "This usually means the subscription belongs to a different Entra tenant." -ForegroundColor Yellow
|
||||
$subTenantId = Get-SubscriptionTenantId -SubId $SubscriptionId
|
||||
$promptTenant = if ($subTenantId) { $subTenantId } else { $tenantId }
|
||||
$answer = Read-Host -Prompt "Run 'az login --tenant $promptTenant' now and retry? [Y/n]"
|
||||
if ($answer -eq "" -or $answer -match "^[Yy]") {
|
||||
& az account clear | Out-Null
|
||||
& az login --tenant $promptTenant | Out-Host
|
||||
if ($LASTEXITCODE -ne 0) {
|
||||
throw "az login --tenant failed. Please run it manually and retry."
|
||||
}
|
||||
$chosenSub = Select-Subscription -CurrentId $SubscriptionId
|
||||
$SubscriptionId = $chosenSub.id
|
||||
Write-Host "Using subscription: $SubscriptionId" -ForegroundColor Green
|
||||
# Validate again
|
||||
$null = Invoke-AzCli -ArgumentList @("group", "list", "--output", "none")
|
||||
} else {
|
||||
throw "Subscription validation failed. Run 'az login --tenant $promptTenant' and retry."
|
||||
}
|
||||
} else {
|
||||
throw
|
||||
}
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Resource Group
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Host "Ensuring resource group '$ResourceGroup'..." -ForegroundColor Cyan
|
||||
Invoke-AzCli -ArgumentList @("group", "create", "--name", $ResourceGroup, "--location", $Location, "--output", "none")
|
||||
|
||||
# Quick diagnostic: confirm ARM can read back the RG in this subscription.
|
||||
try {
|
||||
$diag = Invoke-AzCli -ArgumentList @("group", "show", "--name", $ResourceGroup, "--query", "id", "--output", "tsv")
|
||||
Write-Host "ARM context OK (RG id: $diag)" -ForegroundColor Green
|
||||
} catch {
|
||||
Write-Host "WARNING: ARM diagnostic failed: $_" -ForegroundColor Yellow
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Storage Account
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
$randomSuffix = [System.Guid]::NewGuid().ToString("n").Substring(0, 8)
|
||||
$StorageName = "stastralprobe$randomSuffix"
|
||||
$FunctionAppName = "func-astral-probe-$randomSuffix"
|
||||
|
||||
function Wait-ProviderRegistration {
|
||||
param ([string]$Namespace)
|
||||
$state = ""
|
||||
$attempts = 0
|
||||
while ($state -ne "Registered" -and $attempts -lt 30) {
|
||||
$state = Invoke-AzCli -ArgumentList @("provider", "show", "--namespace", $Namespace, "--query", "registrationState", "--output", "tsv")
|
||||
if ($state -eq "Registered") { break }
|
||||
Start-Sleep -Seconds 10
|
||||
$attempts++
|
||||
}
|
||||
if ($state -ne "Registered") {
|
||||
throw "Timed out waiting for $Namespace provider to register."
|
||||
}
|
||||
}
|
||||
|
||||
Write-Host "Creating storage account '$StorageName'..." -ForegroundColor Cyan
|
||||
|
||||
# Ensure Microsoft.Storage provider is registered (required for new subscriptions).
|
||||
$storageProv = Invoke-AzCli -ArgumentList @("provider", "show", "--namespace", "Microsoft.Storage", "--query", "registrationState", "--output", "tsv")
|
||||
if ($storageProv -ne "Registered") {
|
||||
Write-Host "Registering Microsoft.Storage provider..." -ForegroundColor Yellow
|
||||
Invoke-AzCli -ArgumentList @("provider", "register", "--namespace", "Microsoft.Storage")
|
||||
Wait-ProviderRegistration -Namespace "Microsoft.Storage"
|
||||
Write-Host "Microsoft.Storage registered." -ForegroundColor Green
|
||||
}
|
||||
|
||||
Invoke-AzCli -ArgumentList @(
|
||||
"storage", "account", "create",
|
||||
"--name", $StorageName,
|
||||
"--resource-group", $ResourceGroup,
|
||||
"--location", $Location,
|
||||
"--sku", "Standard_LRS",
|
||||
"--kind", "StorageV2",
|
||||
"--output", "none"
|
||||
)
|
||||
|
||||
$storageConnection = Invoke-AzCli -ArgumentList @(
|
||||
"storage", "account", "show-connection-string",
|
||||
"--name", $StorageName,
|
||||
"--resource-group", $ResourceGroup,
|
||||
"--query", "connectionString",
|
||||
"--output", "tsv"
|
||||
)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Table and Queue
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Host "Creating Table and Queue..." -ForegroundColor Cyan
|
||||
Invoke-AzCli -ArgumentList @("storage", "table", "create", "--name", "ProbeState", "--connection-string", $storageConnection, "--output", "none")
|
||||
Invoke-AzCli -ArgumentList @("storage", "queue", "create", "--name", "backup-trigger-queue", "--connection-string", $storageConnection, "--output", "none")
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Function App
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Ensure Microsoft.Web provider is registered (required for Function Apps).
|
||||
$webProv = Invoke-AzCli -ArgumentList @("provider", "show", "--namespace", "Microsoft.Web", "--query", "registrationState", "--output", "tsv")
|
||||
if ($webProv -ne "Registered") {
|
||||
Write-Host "Registering Microsoft.Web provider..." -ForegroundColor Yellow
|
||||
Invoke-AzCli -ArgumentList @("provider", "register", "--namespace", "Microsoft.Web")
|
||||
Wait-ProviderRegistration -Namespace "Microsoft.Web"
|
||||
Write-Host "Microsoft.Web registered." -ForegroundColor Green
|
||||
}
|
||||
|
||||
Write-Host "Creating Function App '$FunctionAppName'..." -ForegroundColor Cyan
|
||||
Invoke-AzCli -ArgumentList @(
|
||||
"functionapp", "create",
|
||||
"--name", $FunctionAppName,
|
||||
"--resource-group", $ResourceGroup,
|
||||
"--storage-account", $StorageName,
|
||||
"--consumption-plan-location", $Location,
|
||||
"--os-type", "Linux",
|
||||
"--runtime", "python",
|
||||
"--runtime-version", "3.11",
|
||||
"--functions-version", "4",
|
||||
"--output", "none"
|
||||
)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# App Settings
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Host "Configuring Function App settings..." -ForegroundColor Cyan
|
||||
Invoke-AzCli -ArgumentList @(
|
||||
"functionapp", "config", "appsettings", "set",
|
||||
"--name", $FunctionAppName,
|
||||
"--resource-group", $ResourceGroup,
|
||||
"--settings",
|
||||
"AzureWebJobsStorage=$storageConnection",
|
||||
"FUNCTIONS_EXTENSION_VERSION=~4",
|
||||
"FUNCTIONS_WORKER_RUNTIME=python",
|
||||
"WEBSITE_RUN_FROM_PACKAGE=1",
|
||||
"PROBE_APP_ID=$probeAppId",
|
||||
"PROBE_APP_SECRET=$probeAppSecret",
|
||||
"TENANT_ID=$tenantId",
|
||||
"GRAPH_TOKEN=",
|
||||
"ADO_ORGANIZATION=$AdoOrganization",
|
||||
"ADO_PROJECT=$AdoProject",
|
||||
"ADO_PIPELINE_ID=$AdoPipelineId",
|
||||
"ADO_TOKEN=$AdoToken",
|
||||
"ADO_BRANCH=$AdoBranch",
|
||||
"PROBE_QUIET_WINDOW_MINUTES=$QuietWindowMinutes",
|
||||
"PROBE_COOLDOWN_MINUTES=$CooldownMinutes",
|
||||
"REPO_ROOT=/home/site/wwwroot",
|
||||
"--output", "none"
|
||||
)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Optional: code deployment
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
$funcAvailable = Test-Command "func"
|
||||
if ($funcAvailable) {
|
||||
$repoRoot = Split-Path -Parent $PSScriptRoot
|
||||
$probePath = Join-Path $repoRoot "infra" "change-probe"
|
||||
if (Test-Path $probePath) {
|
||||
$deployNow = Read-Host -Prompt "`nDeploy function code now? [Y/n]"
|
||||
if ($deployNow -eq "" -or $deployNow -match "^[Yy]") {
|
||||
Write-Host "Deploying function code..." -ForegroundColor Cyan
|
||||
Push-Location $probePath
|
||||
try {
|
||||
& func azure functionapp publish $FunctionAppName
|
||||
if ($LASTEXITCODE -ne 0) {
|
||||
Write-Warning "Function deployment returned exit code $LASTEXITCODE. You can retry manually later."
|
||||
}
|
||||
} finally {
|
||||
Pop-Location
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
Write-Host "`nAzure Functions Core Tools (func) not found. Skipping code deployment." -ForegroundColor Yellow
|
||||
Write-Host "Install from https://github.com/Azure/azure-functions-core-tools#installing" -ForegroundColor Yellow
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Summary
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Host "`n=== Provisioning Complete ===" -ForegroundColor Green
|
||||
Write-Host "Subscription: $SubscriptionId"
|
||||
Write-Host "Resource Group: $ResourceGroup"
|
||||
Write-Host "Storage Account: $StorageName"
|
||||
Write-Host "Function App: $FunctionAppName"
|
||||
Write-Host "App Registration: $probeAppId"
|
||||
Write-Host "`nNext steps:"
|
||||
Write-Host " - Verify the timer trigger in the Azure Portal or with:"
|
||||
Write-Host " az functionapp function show --name $FunctionAppName --resource-group $ResourceGroup --function-name probe_timer"
|
||||
Write-Host " - To redeploy code later:"
|
||||
Write-Host " cd infra/change-probe && func azure functionapp publish $FunctionAppName"
|
||||
@@ -2,7 +2,7 @@

# ASTRAL Security Review Package

Prepared: 2026-03-27
Prepared: 2026-04-20

## Purpose

@@ -59,34 +59,43 @@ Important clarifications:
| Azure DevOps pipeline `azure-pipelines.yml` | Scheduled backup, drift commit, rolling PR management, documentation artifact publishing | Main execution path |
| Azure DevOps pipeline `azure-pipelines-review-sync.yml` | Processes reviewer `/reject` and `/accept` decisions and refreshes PR summaries | Uses Azure DevOps API token |
| Azure DevOps pipeline `azure-pipelines-restore.yml` | Restores approved baseline to tenant | Write-capable path |
| Azure Function App (`infra/change-probe`) | Event-driven probe: polls audit logs, debounces, triggers backup pipeline on demand | Outbound-only; uses separate Entra app registration |
| Azure Table Storage | Persists probe debouncer state (`ProbeState` table) | No sensitive tenant data |
| Azure Queue Storage | Receives trigger messages from probe timer for queue consumer | No sensitive tenant data |
| Azure DevOps Git repository | Stores approved baseline, drift branches, JSON exports, reports, docs | Primary configuration store |
| Microsoft Graph | Source of Intune and Entra configuration; optional target for restore | Production tenant access |
| Azure DevOps REST APIs | PR creation/update, review thread sync, restore queueing | Change-management control plane |
| Microsoft Graph | Source of Intune and Entra configuration; optional target for restore; audit log source for probe | Production tenant access |
| Azure DevOps REST APIs | PR creation/update, review thread sync, restore queueing, pipeline trigger | Change-management control plane |
| Optional Azure OpenAI | PR summary generation only | Optional data egress path |

### High-Level Flow

```mermaid
flowchart LR
    A["Azure DevOps scheduled pipeline"] --> B["Federated service connection"]
    B --> C["Microsoft Graph"]
    A --> D["Git repo: main + drift branches"]
    A --> E["Azure DevOps PR and thread APIs"]
    A --> F["Build artifacts: markdown / HTML / PDF"]
    A -. optional .-> G["Azure OpenAI"]
    H["Reviewer in Azure DevOps"] --> E
    E --> I["Rolling PR approval / rejection"]
    I -. optional remediation .-> J["Restore pipeline"]
    J --> C
    A["Azure Function App<br/>probe_timer"] --> B["Microsoft Graph<br/>audit logs"]
    A --> C["Azure Table Storage<br/>ProbeState"]
    A --> D["Azure Queue Storage<br/>backup-trigger-queue"]
    E["Azure Function App<br/>queue_consumer"] --> D
    E --> F["Azure DevOps REST API<br/>queue pipeline run"]
    G["Azure DevOps scheduled pipeline<br/>daily snapshot + reports"] --> H["Federated service connection"]
    H --> B
    G --> I["Git repo: main + drift branches"]
    G --> J["Azure DevOps PR and thread APIs"]
    G --> K["Build artifacts: markdown / HTML / PDF"]
    G -. optional .-> L["Azure OpenAI"]
    M["Reviewer in Azure DevOps"] --> J
    J --> N["Rolling PR approval / rejection"]
    N -. optional remediation .-> O["Restore pipeline"]
    O --> B
```

## Deployment Model

### Backup and Review

The main pipeline runs hourly on `main`.
The main pipeline runs daily at 02:00 on `main` to generate a full tenant snapshot, reports, and documentation artifacts. The primary trigger is the event-driven change probe, which queues the pipeline on demand when drift is detected.

- Every hour: export Intune and Entra configuration, generate reports, commit drift to rolling workload branches, and update one rolling PR per workload.
- On change detection: the probe timer polls audit logs every 5 minutes. After a 15-minute quiet window with no new events, it queues the backup pipeline.
- Daily at 02:00: export Intune and Entra configuration, generate reports, commit drift to rolling workload branches, and update one rolling PR per workload.
- When delayed reviewer notifications are enabled, newly created rolling PRs are opened as Azure DevOps draft PRs, the automated summary is inserted, and the PR is then published for reviewer notification.
- At the configured full-run hour: perform the same work plus documentation artifact generation (Markdown, and optionally HTML/PDF if browser dependencies are available).

@@ -129,6 +138,7 @@ It supports:
| Generated reports | assignment inventories, object inventories, app inventories | Derived from exported configuration | `tenant-state/reports/**` and build artifacts |
| Documentation artifacts | split markdown, optional HTML/PDF | Derived from exported configuration | build artifacts |
| Review metadata | PR descriptions, review threads, accept/reject commands | Azure DevOps reviewers | Azure DevOps PR APIs |
| Probe state | debouncer state (timestamps, enum values) | Derived from audit log evaluation | Azure Table Storage (`ProbeState`) |
| Optional AI summary payload | sampled changed paths, semantic change descriptions, deterministic summary, fingerprints | Derived from repo diff | Azure OpenAI request payload |

### Data Sensitivity Notes
@@ -145,6 +155,8 @@ It supports:

The pipelines obtain a Microsoft Graph access token at runtime using the Azure DevOps service connection configured in `SERVICE_CONNECTION_NAME` (e.g. `sc-astral-backup`).

The change probe uses a **separate Entra app registration** (`ASTRAL Change Probe`) with its own client credentials to authenticate to Microsoft Graph for audit log polling. This app is created by `deploy/provision-change-probe.ps1` and is distinct from the pipeline service connection identity.

Observed controls in the implementation:

- token acquisition is performed at runtime with `Get-AzAccessToken`,
@@ -196,6 +208,20 @@ Read-oriented Graph application permissions documented in the repository:
- `RoleManagement.Read.Directory` or `Directory.Read.All` for richer enrichment
- `AuditLog.Read.All` if commit author attribution is desired

#### Change Probe Mode

The probe app registration requires these read-only Graph application permissions:

- `AuditLog.Read.All` (reads directory and Intune audit logs)
- `DeviceManagementApps.Read.All`
- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementManagedDevices.Read.All`
- `Policy.Read.All`
- `Policy.Read.ConditionalAccess`
- `Application.Read.All`

The probe does **not** require write permissions. It only polls audit logs and queues the backup pipeline.

#### Restore Mode

Write-capable Graph application permissions documented in the repository:
@@ -219,6 +245,8 @@ Write-capable Graph application permissions documented in the repository:
- Required outbound destinations are:
  - `graph.microsoft.com`
  - Azure DevOps organization APIs
  - Azure Table Storage (for probe state)
  - Azure Queue Storage (for probe trigger messages)
  - optional Azure OpenAI endpoint
  - Python package registry for `IntuneCD`
  - npm registry for `md-to-pdf`
@@ -229,6 +257,7 @@ Write-capable Graph application permissions documented in the repository:
- Graph tokens are obtained just-in-time rather than stored in the repository.
- The pipeline marks the Graph token as a secret variable.
- The implementation logs token claims and roles for diagnostics, but not the token value itself.
- The change probe app secret is stored as an Azure Function App setting (`PROBE_APP_SECRET`), not in the repository.
- Azure OpenAI uses a pipeline secret variable when enabled.
- The pipeline logic itself does not depend on repository-stored application secrets; separate secret scanning of exported tenant content is still recommended.

@@ -329,6 +358,7 @@ The following items are not fully solved by the repository alone and should be a
| --- | --- | --- |
| Restore capability | Supported by design; can change production tenant state | Keep restore manual only, or disable auto-remediation by default until operational controls are approved |
| Backup vs restore identity separation | Sample config uses the same service connection name in backup and restore pipelines | Use separate service principals: read-only for backup/review, write-enabled only for restore |
| Change probe identity separation | Probe uses a separate Entra app registration from the pipeline service connection | Keep probe app read-only; do not grant write permissions to the probe identity |
| Azure OpenAI egress | Optional and customer-configurable | Enable only when the organization approves the payload scope and Azure OpenAI deployment model |
| Artifact retention | Not defined in repo; inherited from Azure DevOps settings | Set explicit retention for builds, logs, and artifacts |
| Repo access model | Not defined in repo | Restrict repo and artifact access to administrators/reviewers only |
@@ -400,3 +430,7 @@ The statements in this document are based on the implementation in:
- `scripts/apply_reviewer_rejections.py`
- `scripts/queue_post_merge_restore.py`
- `scripts/export_entra_baseline.py`
- `scripts/probe_tenant_changes.py`
- `scripts/trigger_backup_pipeline.py`
- `infra/change-probe/probe_timer/__init__.py`
- `infra/change-probe/queue_consumer/__init__.py`

@@ -2,7 +2,7 @@

# ASTRAL Security Review Questionnaire

Prepared: 2026-03-27
Prepared: 2026-04-20

This appendix is a shorter, copy/paste-friendly companion to the full ASTRAL security review package.

@@ -12,21 +12,21 @@ This appendix is a shorter, copy/paste-friendly companion to the full ASTRAL sec
| What deployment modes are supported? | The same repository can be operated in progressive modes: backup-only, review package, or full package with restore/remediation. AI is optional in all modes. |
| Is it a public-facing application? | No. It is an administrative pipeline workflow with no public UI or inbound application endpoint created by this repository. |
| Does it require inbound network access from the internet? | No. The implemented workflow is outbound-only over HTTPS. |
| What production systems does it access? | Microsoft Graph for Intune and Entra configuration, plus Azure DevOps APIs for pull request and pipeline operations. |
| What production systems does it access? | Microsoft Graph for Intune and Entra configuration and audit logs; Azure DevOps APIs for pull request and pipeline operations; Azure Storage (Table and Queue) for probe state and trigger messages. |
| Does it make production changes? | Backup and review pipelines are read-oriented against Microsoft Graph. The restore pipeline is write-capable and can apply approved baseline configuration back to the tenant when explicitly enabled and authorized. |
| What data is processed? | Administrative configuration data such as Intune policies, device configuration, enrollment profiles, apps, scripts, conditional access, named locations, authentication strengths, app registrations, and enterprise application metadata. |
| Does it process end-user business content? | It is not designed for business content. However, exported admin-authored scripts or custom payloads can contain sensitive operational data if the tenant already stores it there. |
| Where is data stored? | In the Azure DevOps Git repository, Azure DevOps pull requests/threads, build logs, and optional build artifacts such as markdown, HTML, and PDF documentation. |
| How does it authenticate to Microsoft Graph? | By obtaining a Microsoft Graph token at runtime through an Azure DevOps Azure service connection using workload identity / federated credential flow. |
| How does it authenticate to Azure DevOps APIs? | With `System.AccessToken` scoped to the pipeline identity. |
| Are long-lived secrets stored in the repository? | The pipeline logic does not require repository-stored application secrets. Runtime tokens are acquired during pipeline execution, but exported tenant content should still be treated as potentially sensitive and reviewed for embedded secrets in admin-authored scripts or custom payloads. |
| Are long-lived secrets stored in the repository? | The pipeline logic does not require repository-stored application secrets. The change probe app secret is stored in Azure Function App settings, not in the repository. Runtime tokens are acquired during pipeline execution, but exported tenant content should still be treated as potentially sensitive and reviewed for embedded secrets in admin-authored scripts or custom payloads. |
| How are secrets handled in the pipeline? | The Graph access token is set as a secret pipeline variable. The implementation logs token claims and granted roles for diagnostics, but not the token value. |
| What minimum permissions are required? | Read-only Microsoft Graph application permissions for backup/review, and additional write permissions only for restore. Exact permissions are listed in the full package. |
| Is there separation between read and write access? | The code supports a safe separation model. For production, create separate read-only and write-enabled service principals/connections so backup and restore use different identities. |
| What change-control mechanism exists? | Drift is committed to dedicated workload branches and reviewed through rolling pull requests into `main`. New rolling PRs can be created as drafts until the automated summary is inserted, and optional per-file change-ticket threads and reviewer `/reject` commands are supported. |
| Can reviewers block or scope changes? | Yes. Reviewers can approve the rolling PR, reject it, or reject individual file-level drift items through PR threads when that feature is enabled. |
| Is rollback supported? | Yes. The restore pipeline supports full restore, selective restore by file path, historical restore by Git ref, and dry-run mode. |
| What external network destinations are required? | Microsoft Graph, Azure DevOps APIs, optional Azure OpenAI, Python package registry for `IntuneCD`, npm registry for `md-to-pdf`, and optionally OS package repositories when browser dependencies are installed for HTML/PDF generation. |
| What external network destinations are required? | Microsoft Graph, Azure DevOps APIs, Azure Storage (Table and Queue), optional Azure OpenAI, Python package registry for `IntuneCD`, npm registry for `md-to-pdf`, and optionally OS package repositories when browser dependencies are installed for HTML/PDF generation. |
| Does the system send data to AI services? | Only if Azure OpenAI summary generation is explicitly configured. It is optional for the platform overall. |
| What AI service is intended? | A customer-controlled Azure OpenAI deployment configured through the Azure OpenAI endpoint and deployment variables, rather than an unrelated public AI service. |
| What data is sent to Azure OpenAI when enabled? | A reduced change-review payload containing changed paths, semantic summaries, deterministic summary text, and fingerprints derived from the repo diff. This is intended to support review summarization, not raw tenant-wide export ingestion. |
245
infra/change-probe/README.md
Normal file
@@ -0,0 +1,245 @@

# ASTRAL Change Probe

Event-driven backup trigger for ASTRAL. Monitors Intune and Entra ID audit logs via Microsoft Graph, debounces change bursts, and queues the Azure DevOps backup pipeline only when actual drift is detected.

## Why this exists

Microsoft Graph change notifications and delta queries do **not** support Intune device management or Conditional Access resources. The only viable event-driven approach is polling the Graph audit log APIs, which have a 5–15 minute propagation delay. This probe implements a debouncer on top of that polling to avoid backup storms during bulk changes.

## Architecture

```
┌─────────────────┐    5 min     ┌──────────────┐   quiet window    ┌─────────────────┐
│  Timer Trigger  │ ───────────► │  probe_timer │ ────────────────► │  backup-trigger │
│  (probe_timer)  │              │  (debouncer) │  (15 min armed)   │     -queue      │
└─────────────────┘              └──────────────┘                   └────────┬────────┘
                                        │                                    │
                                        │ load/save state                    │ dequeue
                                        │ (Azure Table Storage)              ▼
                                        │                           ┌─────────────────┐
                                        │                           │  queue_consumer │
                                        └─────────────────────────► │  (ADO REST API) │
                                                                    └─────────────────┘
                                                                             │
                                                                             ▼
                                                                    ┌─────────────────┐
                                                                    │  Azure DevOps   │
                                                                    │ backup pipeline │
                                                                    └─────────────────┘
```

## Components

### `probe_timer` (Timer Trigger)

- **Schedule**: every 5 minutes (`0 */5 * * * *`)
- **Input**: `TimerRequest` from Functions runtime
- **Output**: queue message to `backup-trigger-queue` (via `func.Out[str]`)
- **Actions**:
  1. Load debouncer state from Azure Table Storage (`ProbeState` / `singleton` / `default`).
  2. Run `scripts/probe_tenant_changes.py` via subprocess.
  3. Save updated state back to Table Storage.
  4. If `trigger=true`, emit a queue message.

### `queue_consumer` (Queue Trigger)

- **Input**: `QueueMessage` from `backup-trigger-queue`
- **Actions**:
  1. Parse JSON payload (`reason`, `checked_at`).
  2. Call Azure DevOps REST API to queue the backup pipeline run.
  3. Raise on failure so the Functions runtime handles retry and poison-queue logic.

### `scripts/probe_tenant_changes.py`

Standalone CLI script that can also be run locally. It:

- Queries Intune (`deviceManagement/auditEvents`) and Entra (`directoryAudits`) audit logs.
- Implements a three-state debouncer: `idle` → `armed` → `cooldown`.
- Returns JSON with `trigger`, `reason`, and `new_state`.

### `scripts/trigger_backup_pipeline.py`

Standalone CLI script that queues an Azure DevOps pipeline run via REST API. Can be used locally or from the queue consumer.

## Debouncer State Machine

| State | Condition to transition | Output |
|---|---|---|
| **idle** | Audit log shows a new change | → `armed` |
| **armed** | Quiet window elapsed (default 15 min) with no newer events | → `cooldown`, `trigger=true` |
| **armed** | Newer event arrives while armed | Stay `armed`, extend quiet window |
| **cooldown** | Cooldown elapsed (default 30 min) | → `idle` |
| **cooldown** | New event arrives | Stay `cooldown` (change is buffered until cooldown ends) |
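The table above can be expressed as a pure transition function. This is a simplified sketch for illustration (not the actual `probe_tenant_changes.py` implementation; names and the minute-based clock are assumptions):

```python
from dataclasses import dataclass

QUIET_WINDOW_MIN = 15
COOLDOWN_MIN = 30


@dataclass
class ProbeState:
    phase: str = "idle"           # idle | armed | cooldown
    last_event_min: float = 0.0   # time of the newest audit event seen
    phase_since_min: float = 0.0  # when the current phase was entered


def step(state: ProbeState, now_min: float, new_event: bool) -> tuple[ProbeState, bool]:
    """Advance the debouncer one poll tick; returns (new state, trigger?)."""
    if state.phase == "idle":
        if new_event:
            return ProbeState("armed", now_min, now_min), False
        return state, False
    if state.phase == "armed":
        if new_event:  # burst still in progress: extend the quiet window
            return ProbeState("armed", now_min, state.phase_since_min), False
        if now_min - state.last_event_min >= QUIET_WINDOW_MIN:
            return ProbeState("cooldown", state.last_event_min, now_min), True
        return state, False
    # cooldown: new events are buffered; they surface on the next poll after cooldown
    if now_min - state.phase_since_min >= COOLDOWN_MIN:
        return ProbeState("idle", state.last_event_min, now_min), False
    return state, False
```

Only the `armed → cooldown` edge emits `trigger=true`, which is why a bulk change that keeps producing audit events never fires more than one backup run per quiet window.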

## Configuration

All settings are provided via Function App application settings (environment variables):

| Setting | Required | Default | Description |
|---|---|---|---|
| `AzureWebJobsStorage` | Yes | — | Storage account connection string (tables + queues) |
| `PROBE_APP_ID` | Yes* | — | Entra app registration client ID |
| `PROBE_APP_SECRET` | Yes* | — | Entra app client secret |
| `TENANT_ID` | Yes* | — | Microsoft 365 tenant ID |
| `GRAPH_TOKEN` | No | — | Optional passthrough token (skips the client credentials flow) |
| `ADO_ORGANIZATION` | Yes | — | Azure DevOps organization name |
| `ADO_PROJECT` | Yes | — | Azure DevOps project name |
| `ADO_PIPELINE_ID` | Yes | — | Backup pipeline definition ID |
| `ADO_TOKEN` | Yes | — | Azure DevOps PAT with **Build (read & execute)** |
| `ADO_BRANCH` | No | `main` | Git ref to queue the pipeline against |
| `PROBE_QUIET_WINDOW_MINUTES` | No | `15` | Minutes to wait for a change burst to settle |
| `PROBE_COOLDOWN_MINUTES` | No | `30` | Minutes between successive triggers |

\* Required unless `GRAPH_TOKEN` is provided.

## Local Development

### Prerequisites

- Python 3.11+
- [Azure Functions Core Tools](https://learn.microsoft.com/en-us/azure/azure-functions/functions-run-local)
- An Azure Storage account (or Azurite for local emulation)

### Install dependencies

```bash
cd infra/change-probe
pip install -r requirements.txt
```

### Copy shared scripts

The probe reuses scripts from the repository root. Copy them into this directory before building or running locally:

```bash
cp ../../scripts/common.py scripts/
cp ../../scripts/probe_tenant_changes.py scripts/
cp ../../scripts/trigger_backup_pipeline.py scripts/
```

### Run locally

```bash
# Start Azurite (Storage emulator)
azurite --silent --location ./azurite --debug ./azurite/debug.log

# Copy the local settings template
cp local.settings.json.example local.settings.json
# Edit local.settings.json with your values

# Start the Functions host
func start
```

### Run the probe script standalone

The script persists its state via `--state-path` and writes its JSON result to stdout:

```bash
cd ../..
python3 scripts/probe_tenant_changes.py \
  --client-id "$PROBE_APP_ID" \
  --client-secret "$PROBE_APP_SECRET" \
  --tenant-id "$TENANT_ID" \
  --state-path ./probe-state.json \
  > ./probe-result.json
```

### Trigger the backup pipeline standalone

```bash
python3 scripts/trigger_backup_pipeline.py \
  --organization "contoso" \
  --project "Intune" \
  --pipeline-id 1 \
  --token "$ADO_TOKEN" \
  --branch refs/heads/main
```

## Deployment

Use the unified provisioning script:

```powershell
.\deploy\provision-change-probe.ps1 `
  -TenantName "contoso.onmicrosoft.com" `
  -ResourceGroupName "rg-astral-probe" `
  -Location "westeurope" `
  -DeployFunctionApp
```

The script will:

1. Register an Entra app (or reuse an existing one).
2. Grant admin consent for the Graph permissions.
3. Create a client secret.
4. Provision the Resource Group, Storage Account, and Function App (Linux Consumption, Python 3.11).
5. Configure application settings.
6. Build and deploy the function package.

### Manual deployment (zip package)

If you prefer to deploy manually:

```bash
cd infra/change-probe

# Copy shared scripts into the package directory
cp ../../scripts/common.py scripts/
cp ../../scripts/probe_tenant_changes.py scripts/
cp ../../scripts/trigger_backup_pipeline.py scripts/

# Install production dependencies into the package
pip install -r requirements.txt --target .python_packages/lib/site-packages

# Build the zip (Linux Consumption requires .python_packages/lib/site-packages, NOT python3.11/)
zip -r function-package.zip \
  probe_timer/ queue_consumer/ scripts/ .python_packages/ \
  host.json requirements.txt \
  -x "*.pyc" -x "__pycache__/*"

# Upload and set WEBSITE_RUN_FROM_PACKAGE
az functionapp deployment source config-zip \
  --resource-group rg-astral-probe \
  --name func-astral-probe \
  --src function-package.zip
```

## Permissions

### Entra App (Graph access)

The probe requires the same read permissions as the main backup pipeline:

- `DeviceManagementConfiguration.Read.All`
- `DeviceManagementApps.Read.All`
- `AuditLog.Read.All`
- `Directory.Read.All`

### Azure DevOps PAT

The `ADO_TOKEN` must have:

- **Build** → *Read & execute*

## Monitoring

Check the `ProbeState` table for the current debouncer state:

```bash
az storage entity query --table-name ProbeState --account-name <storage>
```

Check the queue depth:

```bash
az storage queue list --account-name <storage>
```
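
The `state` column of the `ProbeState` entity holds the debouncer state as a JSON string (as written by `probe_timer`). A small helper, hypothetical but matching that entity shape, can turn a queried entity into a readable one-liner:

```python
import json

def summarize_probe_entity(entity: dict) -> str:
    """Render the ProbeState entity's JSON 'state' column as a one-line summary."""
    state = json.loads(entity.get("state", "{}"))
    deb = state.get("debouncer", {})
    return (
        f"debouncer={deb.get('state', 'idle')} "
        f"cooldown_until={deb.get('cooldown_until')} "
        f"intune_last_check={state.get('intune', {}).get('last_check')}"
    )

# Example with a hypothetical entity as returned by `az storage entity query`:
entity = {
    "PartitionKey": "singleton",
    "RowKey": "default",
    "state": '{"intune": {"last_check": "2026-04-20T10:00:00Z"}, '
             '"debouncer": {"state": "cooldown", "cooldown_until": "2026-04-20T11:00:00Z"}}',
}
print(summarize_probe_entity(entity))
```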

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| Timer fires but no state update | `schedule_status["last"]` case mismatch (fixed in current version) | Ensure deployed code uses `.get("Last")` |
| Probe script `ModuleNotFoundError` | Bundled packages in wrong path | Use `.python_packages/lib/site-packages`, not `python3.11/site-packages` |
| Queue message lands in poison queue | `ADO_TOKEN` missing or invalid | Verify token in Function App settings and restart |
| Probe never triggers | No audit events in Graph window | Normal if tenant is idle; verify `AuditLog.Read.All` permission |
| Duplicate pipeline runs | Multiple messages queued | Check debouncer state; cooldown should prevent this |
15
infra/change-probe/host.json
Normal file
@@ -0,0 +1,15 @@
{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.*, 5.0.0)"
  }
}
19
infra/change-probe/local.settings.json.example
Normal file
@@ -0,0 +1,19 @@
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "PROBE_APP_ID": "",
    "PROBE_APP_SECRET": "",
    "TENANT_ID": "",
    "GRAPH_TOKEN": "",
    "ADO_ORGANIZATION": "",
    "ADO_PROJECT": "",
    "ADO_PIPELINE_ID": "",
    "ADO_TOKEN": "",
    "ADO_BRANCH": "main",
    "PROBE_QUIET_WINDOW_MINUTES": "15",
    "PROBE_COOLDOWN_MINUTES": "30",
    "REPO_ROOT": "../../"
  }
}
137
infra/change-probe/probe_timer/__init__.py
Normal file
@@ -0,0 +1,137 @@
#!/usr/bin/env python3
"""Azure Function timer trigger that probes tenant audit logs and queues a backup run when changes are detected."""

from __future__ import annotations

import json
import logging
import os
import subprocess
import sys
from typing import Any

import azure.functions as func
from azure.data.tables import TableServiceClient

_TABLE_NAME = "ProbeState"
_PARTITION_KEY = "singleton"
_ROW_KEY = "default"


def _repo_root() -> str:
    """Resolve the repository root so we can invoke scripts/probe_tenant_changes.py."""
    env_root = os.environ.get("REPO_ROOT", "").strip()
    if env_root:
        return os.path.abspath(env_root)
    return os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))


def _load_state(connection_string: str) -> dict[str, Any]:
    """Load persisted probe state from Azure Table Storage."""
    try:
        service = TableServiceClient.from_connection_string(conn_str=connection_string)
        table = service.get_table_client(table_name=_TABLE_NAME)
        entity = table.get_entity(partition_key=_PARTITION_KEY, row_key=_ROW_KEY)
        raw = entity.get("state", "{}")
        return json.loads(raw) if isinstance(raw, str) else dict(raw)
    except Exception as exc:
        logging.warning(f"Unable to load state from Table Storage ({exc}); starting fresh.")
        return {}


def _save_state(connection_string: str, state: dict[str, Any]) -> None:
    """Persist probe state to Azure Table Storage."""
    service = TableServiceClient.from_connection_string(conn_str=connection_string)
    table = service.get_table_client(table_name=_TABLE_NAME)
    table.upsert_entity(
        {
            "PartitionKey": _PARTITION_KEY,
            "RowKey": _ROW_KEY,
            "state": json.dumps(state),
        }
    )


def main(mytimer: func.TimerRequest, msg: func.Out[str]) -> None:
    utc_now = mytimer.schedule_status.get("Last", "n/a") if mytimer.schedule_status else "n/a"
    logging.info(f"Probe timer triggered at {utc_now}")

    client_id = os.environ.get("PROBE_APP_ID", "").strip()
    client_secret = os.environ.get("PROBE_APP_SECRET", "").strip()
    tenant_id = os.environ.get("TENANT_ID", "").strip()
    token = os.environ.get("GRAPH_TOKEN", "").strip()

    auth_args: list[str] = []
    if token:
        auth_args = ["--token", token]
    elif client_id and client_secret and tenant_id:
        auth_args = [
            "--client-id", client_id,
            "--client-secret", client_secret,
            "--tenant-id", tenant_id,
        ]
    else:
        logging.error("No Graph authentication configured (PROBE_APP_ID/SECRET/TENANT_ID or GRAPH_TOKEN).")
        return

    connection_string = os.environ.get("AzureWebJobsStorage", "").strip()
    if not connection_string:
        logging.error("AzureWebJobsStorage connection string is missing.")
        return

    state = _load_state(connection_string)
    state_json = json.dumps(state) if state else ""
    quiet_window = os.environ.get("PROBE_QUIET_WINDOW_MINUTES", "15")
    cooldown = os.environ.get("PROBE_COOLDOWN_MINUTES", "30")

    probe_script = os.path.join(_repo_root(), "scripts", "probe_tenant_changes.py")
    if not os.path.exists(probe_script):
        logging.error(f"Probe script not found at {probe_script}")
        return

    cmd = [
        sys.executable,
        probe_script,
        *auth_args,
        "--quiet-window-minutes", quiet_window,
        "--cooldown-minutes", cooldown,
    ]
    if state_json:
        cmd.extend(["--state-json", state_json])

    logging.info(f"Running probe script: {probe_script}")
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    except subprocess.TimeoutExpired:
        logging.error("Probe script timed out after 60 seconds.")
        return
    except Exception as exc:
        logging.error(f"Failed to run probe script ({exc}).")
        return

    if result.returncode != 0:
        logging.error(f"Probe script failed (exit {result.returncode}): {result.stderr}")
        return

    try:
        output = json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        logging.error(f"Probe script returned invalid JSON ({exc}): {result.stdout[:500]}")
        return

    new_state = output.get("new_state", state)
    _save_state(connection_string, new_state)

    trigger = output.get("trigger", False)
    reason = output.get("reason", "no reason given")
    logging.info(f"Probe result: trigger={trigger}, reason={reason}")

    if trigger:
        queue_payload = json.dumps(
            {
                "reason": reason,
                "checked_at": output.get("checked_at", ""),
            }
        )
        msg.set(queue_payload)
        logging.info("Queued backup trigger message.")
18
infra/change-probe/probe_timer/function.json
Normal file
@@ -0,0 +1,18 @@
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "mytimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 */5 * * * *"
    },
    {
      "name": "msg",
      "type": "queue",
      "direction": "out",
      "queueName": "backup-trigger-queue",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
77
infra/change-probe/queue_consumer/__init__.py
Normal file
@@ -0,0 +1,77 @@
#!/usr/bin/env python3
"""Azure Function queue trigger that calls the Azure DevOps REST API to queue a backup pipeline run."""

from __future__ import annotations

import json
import logging
import os
import subprocess
import sys

import azure.functions as func


def _repo_root() -> str:
    """Resolve the repository root so we can invoke scripts/trigger_backup_pipeline.py."""
    env_root = os.environ.get("REPO_ROOT", "").strip()
    if env_root:
        return os.path.abspath(env_root)
    return os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))


def main(msg: func.QueueMessage) -> None:
    body = msg.get_body().decode("utf-8")
    logging.info(f"Queue consumer received message: {body}")

    org = os.environ.get("ADO_ORGANIZATION", "").strip()
    project = os.environ.get("ADO_PROJECT", "").strip()
    pipeline_id = os.environ.get("ADO_PIPELINE_ID", "").strip()
    token = os.environ.get("ADO_TOKEN", "").strip()
    branch = os.environ.get("ADO_BRANCH", "main").strip()

    if not all([org, project, pipeline_id, token]):
        logging.error("Missing one or more ADO configuration variables (ADO_ORGANIZATION, ADO_PROJECT, ADO_PIPELINE_ID, ADO_TOKEN).")
        # Re-raising causes the Functions runtime to retry the message after the visibility timeout.
        raise RuntimeError("Incomplete ADO configuration")

    trigger_script = os.path.join(_repo_root(), "scripts", "trigger_backup_pipeline.py")
    if not os.path.exists(trigger_script):
        logging.error(f"Trigger script not found at {trigger_script}")
        raise RuntimeError("Trigger script missing")

    cmd = [
        sys.executable,
        trigger_script,
        "--organization",
        org,
        "--project",
        project,
        "--pipeline-id",
        pipeline_id,
        "--token",
        token,
        "--branch",
        branch,
    ]

    logging.info(f"Triggering ADO pipeline {pipeline_id} ...")
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=60,
        )
    except subprocess.TimeoutExpired:
        logging.error("Trigger script timed out after 60 seconds.")
        raise
    except Exception as exc:
        logging.error(f"Failed to run trigger script ({exc}).")
        raise

    if result.returncode != 0:
        logging.error(f"Trigger script failed (exit {result.returncode}): {result.stderr}")
        raise RuntimeError(f"Trigger script failed: {result.stderr}")

    logging.info(f"Trigger script succeeded: {result.stdout.strip()}")
12
infra/change-probe/queue_consumer/function.json
Normal file
@@ -0,0 +1,12 @@
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "msg",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "backup-trigger-queue",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
3
infra/change-probe/requirements.txt
Normal file
@@ -0,0 +1,3 @@
azure-functions
azure-data-tables
azure-storage-queue
@@ -150,7 +150,8 @@ def _fetch_directory_audits(
        "$top": "999",
        "$select": "activityDateTime,activityDisplayName,category,result,initiatedBy,targetResources",
    }
    filter_parts = [f"activityDateTime le {_format_filter_datetime(backup_start)}"]
    audit_end = backup_start - dt.timedelta(minutes=10)
    filter_parts = [f"activityDateTime le {_format_filter_datetime(audit_end)}"]
    if last_commit_date is not None:
        filter_parts.append(f"activityDateTime ge {_format_filter_datetime(last_commit_date)}")
    params["$filter"] = " and ".join(filter_parts)

@@ -114,6 +114,15 @@ def request_json(
        except urllib.error.HTTPError as exc:
            last_error = exc
            if exc.code not in retry_codes or attempt == max_retries:
                body = ""
                try:
                    body = exc.read().decode("utf-8", errors="replace")[:2048]
                except Exception:
                    pass
                if body:
                    raise RuntimeError(
                        f"{method} {url} failed: HTTP Error {exc.code}: {exc.reason} — {body}"
                    ) from exc
                raise
            retry_after = _get_retry_after_seconds(exc)
            sleep = retry_after if retry_after is not None else (2 ** attempt)

@@ -325,28 +325,10 @@ def _current_pr_merge_strategy(pr: dict[str, Any]) -> str:

def _build_description(workload: str, drift_branch: str, baseline_branch: str, build_number: str, build_id: str) -> str:
    is_entra = workload.lower() == "entra"
    lead = "Rolling Entra drift PR created by backup pipeline." if is_entra else "Rolling drift PR created by backup pipeline."
    lead = "Rolling Entra drift PR — backup pipeline" if is_entra else "Rolling drift PR — backup pipeline"
    return (
        f"{lead}\n\n"
        f"- Source branch: `{drift_branch}`\n"
        f"- Target branch: `{baseline_branch}`\n"
        f"- Last pipeline run: `{build_number}` (BuildId: {build_id})\n\n"
        "The automated review summary is generated immediately after PR creation and inserted "
        "above the reviewer actions section.\n\n"
        "## Reviewer Quick Actions\n\n"
        "### 1) Accept all changes\n"
        "- Merge PR to accept drift into baseline.\n\n"
        "### 2) Reject whole PR and revert\n"
        "- Set reviewer vote to **Reject**.\n"
        "- Abandon PR.\n"
        "- Auto-remediation queues restore (if `AUTO_REMEDIATE_ON_PR_REJECTION=true`).\n\n"
        "### 3) Reject only selected policy changes\n"
        "- In each `Change Needed` policy thread, comment `/reject` for changes you do not want.\n"
        "- Optional: use `/accept` for changes you want to keep.\n"
        "- Wait for review-sync pipeline (about 5 minutes) to update PR diff.\n"
        "- Merge remaining accepted changes.\n"
        "- Post-merge auto-remediation queues restore to reconcile tenant to merged baseline "
        "(if `AUTO_REMEDIATE_AFTER_MERGE=true`)."
        f"{lead} run `{build_number}` (build {build_id})\n\n"
        f"Source: `{drift_branch}` → Target: `{baseline_branch}`\n"
    )
102
scripts/filter_intune_formatting_noise.py
Normal file
@@ -0,0 +1,102 @@
#!/usr/bin/env python3
"""Revert Intune JSON exports that differ from baseline only in formatting or key ordering."""

from __future__ import annotations

import argparse
import json
import subprocess
import sys
from pathlib import Path


def _run_git_show(repo_root: Path, ref: str, rel_path: str) -> str | None:
    proc = subprocess.run(
        ["git", "show", f"{ref}:{rel_path}"],
        cwd=str(repo_root),
        check=False,
        capture_output=True,
    )
    if proc.returncode != 0:
        return None
    return proc.stdout.decode("utf-8", errors="replace")


def revert_formatting_only_changes(
    repo_root: Path,
    backup_root: Path,
    baseline_ref: str,
) -> tuple[list[str], list[str]]:
    reverted: list[str] = []
    kept: list[str] = []

    for file_path in sorted(backup_root.rglob("*.json")):
        rel_path = file_path.relative_to(repo_root).as_posix()
        baseline_text = _run_git_show(repo_root, baseline_ref, rel_path)
        if not baseline_text:
            # New file — nothing to revert against
            continue

        try:
            current_text = file_path.read_text(encoding="utf-8")
            current_payload = json.loads(current_text)
            baseline_payload = json.loads(baseline_text)
        except Exception:
            kept.append(rel_path)
            continue

        if current_payload == baseline_payload:
            file_path.write_text(baseline_text, encoding="utf-8")
            reverted.append(rel_path)
        else:
            kept.append(rel_path)

    return reverted, kept


def main() -> int:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--repo-root", required=True)
    parser.add_argument(
        "--backup-root",
        default="tenant-state/intune",
        help="Path to Intune backup root (default: tenant-state/intune).",
    )
    parser.add_argument(
        "--baseline-ref",
        default="HEAD",
        help="Git ref used as baseline for comparison (default: HEAD).",
    )
    args = parser.parse_args()

    repo_root = Path(args.repo_root).resolve()
    backup_root = Path(args.backup_root)
    if not backup_root.is_absolute():
        backup_root = repo_root / backup_root
    backup_root = backup_root.resolve()

    if not backup_root.exists():
        print(f"Backup root not found: {backup_root}")
        return 0

    reverted, kept = revert_formatting_only_changes(
        repo_root=repo_root,
        backup_root=backup_root,
        baseline_ref=args.baseline_ref,
    )

    if reverted:
        print(f"Reverted {len(reverted)} formatting-only Intune JSON export(s) to baseline:")
        for path in reverted:
            print(f"  - {path}")
    else:
        print("No formatting-only Intune JSON exports detected.")

    if kept:
        print(f"Files with actual semantic changes (kept): {len(kept)}")

    return 0


if __name__ == "__main__":
    sys.exit(main())
444
scripts/probe_tenant_changes.py
Normal file
@@ -0,0 +1,444 @@
#!/usr/bin/env python3
"""Probe tenant audit logs to detect configuration changes and decide whether to trigger a backup pipeline.

This script is designed to run inside an Azure Function timer trigger or locally for testing.
It queries Microsoft Graph audit endpoints for the cheapest possible signal that a configuration
change occurred since the last check, then applies a debouncer so that a burst of changes during
an admin sprint results in a single backup run after a configurable quiet window.

Usage (local testing):
    python3 scripts/probe_tenant_changes.py \
        --token "$GRAPH_TOKEN" \
        --state-path ./probe-state.json \
        --quiet-window-minutes 15 \
        --cooldown-minutes 30

Usage (Azure Function wrapper):
    python3 scripts/probe_tenant_changes.py \
        --token "$GRAPH_TOKEN" \
        --state-json '{"intune":{"last_check":"2026-04-20T10:00:00+00:00"},...}' \
        --quiet-window-minutes 15 \
        --cooldown-minutes 30
"""

from __future__ import annotations

import argparse
import datetime as dt
import json
import os
import pathlib
import sys
import urllib.parse
import urllib.request
from typing import Any

# scripts/ is not guaranteed to be on PYTHONPATH when loaded by the Function wrapper,
# so we tolerate a relative import failure and fall back to an absolute import.
try:
    from scripts.common import request_json
except ImportError:
    from common import request_json  # type: ignore[no-redef]


# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------

_INTUNE_AUDIT_URL = "https://graph.microsoft.com/beta/deviceManagement/auditEvents"
_ENTRA_AUDIT_URL = "https://graph.microsoft.com/v1.0/auditLogs/directoryAudits"

# Target resource types in Entra that map to the categories exported by export_entra_baseline.py.
_ENTRA_TARGET_TYPES = (
    "ConditionalAccessPolicy",
    "NamedLocation",
    "AuthenticationStrengthPolicy",
    "Application",
    "ServicePrincipal",
)

_DEFAULT_STATE: dict[str, Any] = {
    "intune": {"last_check": None},
    "entra": {"last_check": None},
    "debouncer": {
        "state": "idle",
        "first_event_at": None,
        "trigger_after": None,
        "cooldown_until": None,
    },
}


# ---------------------------------------------------------------------------
# Token acquisition
# ---------------------------------------------------------------------------

def _acquire_graph_token(client_id: str, client_secret: str, tenant_id: str) -> str:
    """Acquire a Graph access token via client credentials flow."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode(
        {
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "https://graph.microsoft.com/.default",
            "grant_type": "client_credentials",
        }
    ).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    access_token = payload.get("access_token")
    if not access_token:
        raise RuntimeError("Token endpoint did not return an access_token.")
    return str(access_token)


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--token", default="", help="Microsoft Graph bearer token (direct).")
    parser.add_argument("--client-id", default="", help="Entra app client ID (alternative to --token).")
    parser.add_argument("--client-secret", default="", help="Entra app client secret (alternative to --token).")
    parser.add_argument("--tenant-id", default="", help="Entra tenant ID (alternative to --token).")
    parser.add_argument(
        "--state-path",
        default="",
        help="Path to a local JSON state file (used for local testing).",
    )
    parser.add_argument(
        "--state-json",
        default="",
        help="Raw JSON state string (used when the caller manages persistence, e.g. Azure Table Storage).",
    )
    parser.add_argument(
        "--quiet-window-minutes",
        type=int,
        default=15,
        help="Minutes of silence after the last detected change before triggering a backup.",
    )
    parser.add_argument(
        "--cooldown-minutes",
        type=int,
        default=30,
        help="Minimum minutes between two triggered backup runs.",
    )
    parser.add_argument(
        "--now",
        default="",
        help="Override the current time (ISO 8601). Useful for tests.",
    )
    return parser.parse_args()


# ---------------------------------------------------------------------------
# State helpers
# ---------------------------------------------------------------------------

def _load_state(path: str, json_str: str) -> dict[str, Any]:
    if json_str:
        return json.loads(json_str)
    if path:
        p = pathlib.Path(path)
        if p.exists():
            return json.loads(p.read_text(encoding="utf-8"))
    return json.loads(json.dumps(_DEFAULT_STATE))


def _save_state(path: str, state: dict[str, Any]) -> None:
    if path:
        pathlib.Path(path).write_text(
            json.dumps(state, indent=2, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )


def _parse_iso(value: str | None) -> dt.datetime | None:
    if not value:
        return None
    try:
        parsed = dt.datetime.fromisoformat(value.replace("Z", "+00:00"))
        return parsed.astimezone(dt.timezone.utc)
    except ValueError:
        return None


def _format_iso(value: dt.datetime) -> str:
    return value.astimezone(dt.timezone.utc).isoformat().replace("+00:00", "Z")


# ---------------------------------------------------------------------------
# Graph queries
# ---------------------------------------------------------------------------

def _build_intune_filter(since: dt.datetime, until: dt.datetime) -> str:
    since_str = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    until_str = until.strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"activityDateTime ge {since_str}"
        f" and activityDateTime le {until_str}"
        f" and activityResult eq 'Success'"
        f" and ActivityOperationType ne 'Get'"
    )


def _build_entra_filter(since: dt.datetime, until: dt.datetime) -> str:
    since_str = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    until_str = until.strftime("%Y-%m-%dT%H:%M:%SZ")
    type_clauses = " or ".join(
        f"targetResources/any(t: t/type eq '{t}')" for t in _ENTRA_TARGET_TYPES
    )
    return (
        f"activityDateTime ge {since_str}"
        f" and activityDateTime le {until_str}"
        f" and result eq 'success'"
        f" and ({type_clauses})"
    )


def _fetch_latest_event(url: str, token: str) -> dict[str, Any] | None:
    """Return the single latest matching audit event, or None if nothing found."""
    try:
        payload = request_json(url, token=token, timeout=30, max_retries=2)
    except Exception as exc:
        # Defensive: log and treat as no event so a transient Graph failure does
        # not wedge the debouncer in an armed state forever.
        print(f"Warning: Graph query failed ({exc})", file=sys.stderr)
        return None

    value = payload.get("value")
    if isinstance(value, list) and value:
        event = value[0]
        if isinstance(event, dict):
            return event
    return None


def _get_latest_intune_event(
    token: str, since: dt.datetime, until: dt.datetime
) -> dict[str, Any] | None:
    filter_str = _build_intune_filter(since, until)
    params = {
        "$filter": filter_str,
        "$orderby": "activityDateTime desc",
        "$top": "1",
        "$select": "id,activityDateTime,activityType,activityOperationType",
    }
    url = f"{_INTUNE_AUDIT_URL}?{urllib.parse.urlencode(params)}"
    return _fetch_latest_event(url, token)


def _get_latest_entra_event(
    token: str, since: dt.datetime, until: dt.datetime
) -> dict[str, Any] | None:
    filter_str = _build_entra_filter(since, until)
    params = {
        "$filter": filter_str,
        "$orderby": "activityDateTime desc",
        "$top": "1",
        "$select": "id,activityDateTime,activityDisplayName",
    }
    url = f"{_ENTRA_AUDIT_URL}?{urllib.parse.urlencode(params)}"
    return _fetch_latest_event(url, token)


# ---------------------------------------------------------------------------
# Debouncer
# ---------------------------------------------------------------------------

def _evaluate_debouncer(
    state: dict[str, Any],
    intune_event: dict[str, Any] | None,
    entra_event: dict[str, Any] | None,
    now: dt.datetime,
    quiet_window: dt.timedelta,
    cooldown: dt.timedelta,
) -> tuple[bool, dict[str, Any], str]:
    """Return (should_trigger, updated_state, human_readable_reason)."""

    deb = dict(state.get("debouncer") or {})
    deb_state = str(deb.get("state") or "idle")

    # Extract event timestamps if present
    intune_time: dt.datetime | None = None
    entra_time: dt.datetime | None = None
    if intune_event:
        intune_time = _parse_iso(intune_event.get("activityDateTime"))
    if entra_event:
        entra_time = _parse_iso(entra_event.get("activityDateTime"))

    latest_event_time = max(
        (t for t in (intune_time, entra_time) if t is not None), default=None
    )

    # ------------------------------------------------------------------
    # Cooldown check
    # ------------------------------------------------------------------
    if deb_state == "cooldown":
        cooldown_until = _parse_iso(deb.get("cooldown_until"))
        if cooldown_until is not None and now < cooldown_until:
            reason = (
                f"In cooldown until {_format_iso(cooldown_until)}; "
                f"{int(intune_event is not None) + int(entra_event is not None)} event(s) ignored."
            )
            return False, state, reason
        # Cooldown expired → fall through to idle logic
        deb = {
            "state": "idle",
            "first_event_at": None,
            "trigger_after": None,
            "cooldown_until": None,
        }
        deb_state = "idle"

    # ------------------------------------------------------------------
    # Idle or armed
    # ------------------------------------------------------------------
    if latest_event_time is None:
|
||||
# No changes in this window
|
||||
if deb_state == "armed":
|
||||
trigger_after = _parse_iso(deb.get("trigger_after"))
|
||||
if trigger_after is not None and now >= trigger_after:
|
||||
# Quiet window satisfied — fire
|
||||
deb = {
|
||||
"state": "cooldown",
|
||||
"first_event_at": None,
|
||||
"trigger_after": None,
|
||||
"cooldown_until": _format_iso(now + cooldown),
|
||||
}
|
||||
reason = "Quiet window satisfied; no new events since last check."
|
||||
state["debouncer"] = deb
|
||||
return True, state, reason
|
||||
# Still waiting
|
||||
reason = f"Armed, waiting for quiet window until {_format_iso(trigger_after)}."
|
||||
state["debouncer"] = deb
|
||||
return False, state, reason
|
||||
# Idle, no changes
|
||||
reason = "No changes detected."
|
||||
state["debouncer"] = deb
|
||||
return False, state, reason
|
||||
|
||||
# There is at least one new event
|
||||
if deb_state == "idle":
|
||||
# First change in a while — arm the debouncer
|
||||
trigger_after = now + quiet_window
|
||||
deb = {
|
||||
"state": "armed",
|
||||
"first_event_at": _format_iso(latest_event_time),
|
||||
"trigger_after": _format_iso(trigger_after),
|
||||
"cooldown_until": None,
|
||||
}
|
||||
reason = (
|
||||
f"Change detected at {_format_iso(latest_event_time)}; "
|
||||
f"armed, trigger scheduled for {_format_iso(trigger_after)}."
|
||||
)
|
||||
state["debouncer"] = deb
|
||||
return False, state, reason
|
||||
|
||||
if deb_state == "armed":
|
||||
# Extend the quiet window because activity is still ongoing
|
||||
trigger_after = now + quiet_window
|
||||
first_event = deb.get("first_event_at") or _format_iso(latest_event_time)
|
||||
deb = {
|
||||
"state": "armed",
|
||||
"first_event_at": first_event,
|
||||
"trigger_after": _format_iso(trigger_after),
|
||||
"cooldown_until": None,
|
||||
}
|
||||
workloads: list[str] = []
|
||||
if intune_event:
|
||||
workloads.append("intune")
|
||||
if entra_event:
|
||||
workloads.append("entra")
|
||||
reason = (
|
||||
f"Additional change detected at {_format_iso(latest_event_time)} "
|
||||
f"({'/'.join(workloads)}); quiet window extended to {_format_iso(trigger_after)}."
|
||||
)
|
||||
state["debouncer"] = deb
|
||||
return False, state, reason
|
||||
|
||||
# Defensive fallback
|
||||
reason = f"Unexpected debouncer state '{deb_state}'; resetting to idle."
|
||||
state["debouncer"] = {
|
||||
"state": "idle",
|
||||
"first_event_at": None,
|
||||
"trigger_after": None,
|
||||
"cooldown_until": None,
|
||||
}
|
||||
return False, state, reason
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Main
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def main() -> int:
|
||||
args = parse_args()
|
||||
|
||||
token = args.token.strip()
|
||||
if not token:
|
||||
if args.client_id and args.client_secret and args.tenant_id:
|
||||
token = _acquire_graph_token(args.client_id, args.client_secret, args.tenant_id)
|
||||
else:
|
||||
print(
|
||||
"ERROR: Provide --token, or all three of --client-id, --client-secret, --tenant-id.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
raise SystemExit(1)
|
||||
|
||||
quiet_window = dt.timedelta(minutes=args.quiet_window_minutes)
|
||||
cooldown = dt.timedelta(minutes=args.cooldown_minutes)
|
||||
|
||||
now = _parse_iso(args.now) or dt.datetime.now(dt.timezone.utc)
|
||||
# Truncate to second for cleaner output
|
||||
now = now.replace(microsecond=0)
|
||||
|
||||
state = _load_state(args.state_path, args.state_json)
|
||||
|
||||
# Initialise missing last_check values to a safe default (24 hours ago).
|
||||
# This prevents a brand-new state file from scanning the entire audit log history.
|
||||
default_since = now - dt.timedelta(hours=24)
|
||||
intune_since = _parse_iso(state.get("intune", {}).get("last_check")) or default_since
|
||||
entra_since = _parse_iso(state.get("entra", {}).get("last_check")) or default_since
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Query Graph
|
||||
# ------------------------------------------------------------------
|
||||
intune_event = _get_latest_intune_event(token, intune_since, now)
|
||||
entra_event = _get_latest_entra_event(token, entra_since, now)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Debounce
|
||||
# ------------------------------------------------------------------
|
||||
trigger, state, reason = _evaluate_debouncer(
|
||||
state, intune_event, entra_event, now, quiet_window, cooldown
|
||||
)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Advance watermarks regardless of trigger decision so the next run
|
||||
# does not re-scan the same window.
|
||||
# ------------------------------------------------------------------
|
||||
state.setdefault("intune", {})["last_check"] = _format_iso(now)
|
||||
state.setdefault("entra", {})["last_check"] = _format_iso(now)
|
||||
|
||||
_save_state(args.state_path, state)
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Emit decision
|
||||
# ------------------------------------------------------------------
|
||||
result = {
|
||||
"trigger": trigger,
|
||||
"reason": reason,
|
||||
"checked_at": _format_iso(now),
|
||||
"intune_event": intune_event,
|
||||
"entra_event": entra_event,
|
||||
"new_state": state,
|
||||
}
|
||||
print(json.dumps(result, indent=2, ensure_ascii=False))
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
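The idle → armed → fire → cooldown cycle implemented by `_evaluate_debouncer` can be exercised with a self-contained sketch. The names here (`step`, `QUIET`, `COOLDOWN`, the `phase` key) are illustrative stand-ins, not the real module's API:

```python
import datetime as dt

QUIET = dt.timedelta(minutes=15)    # quiet window before firing
COOLDOWN = dt.timedelta(minutes=60)  # suppression window after firing

def step(state: dict, event_seen: bool, now: dt.datetime) -> tuple[bool, dict]:
    """One debouncer evaluation: idle -> armed -> (fire) -> cooldown -> idle."""
    phase = state.get("phase", "idle")
    if phase == "cooldown":
        if now < state["cooldown_until"]:
            return False, state  # everything is ignored during cooldown
        phase, state = "idle", {"phase": "idle"}
    if event_seen:
        # First event arms; later events extend the quiet window.
        return False, {"phase": "armed", "trigger_after": now + QUIET}
    if phase == "armed" and now >= state["trigger_after"]:
        # Quiet window satisfied: fire once, then enter cooldown.
        return True, {"phase": "cooldown", "cooldown_until": now + COOLDOWN}
    return False, state

t0 = dt.datetime(2024, 1, 1, tzinfo=dt.timezone.utc)
state = {"phase": "idle"}
fired, state = step(state, True, t0)  # change seen -> armed
assert (fired, state["phase"]) == (False, "armed")
fired, state = step(state, True, t0 + dt.timedelta(minutes=10))  # window extended
assert state["trigger_after"] == t0 + dt.timedelta(minutes=25)
fired, state = step(state, False, t0 + dt.timedelta(minutes=30))  # quiet -> fire
assert (fired, state["phase"]) == (True, "cooldown")
fired, state = step(state, True, t0 + dt.timedelta(minutes=40))  # ignored in cooldown
```

Note the extension behavior: a burst of edits keeps pushing `trigger_after` forward, so one backup run captures the whole burst rather than one run per edit.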
86 scripts/trigger_backup_pipeline.py Normal file

@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""Trigger an Azure DevOps pipeline run via REST API.

Intended to be invoked from the queue-consumer Azure Function or locally for testing.

Usage:
    python3 scripts/trigger_backup_pipeline.py \
        --organization "my-org" \
        --project "my-project" \
        --pipeline-id 123 \
        --token "$ADO_PAT" \
        --branch "main" \
        --parameters '{"forceFullRun": false}'
"""

from __future__ import annotations

import argparse
import json
import sys
from typing import Any

try:
    from scripts.common import request_json
except ImportError:
    from common import request_json  # type: ignore[no-redef]


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--organization", required=True)
    parser.add_argument("--project", required=True)
    parser.add_argument("--pipeline-id", type=int, required=True)
    parser.add_argument("--token", required=True, help="Azure DevOps PAT or OAuth token.")
    parser.add_argument("--branch", default="main", help="Git ref to run against.")
    parser.add_argument(
        "--parameters",
        default="{}",
        help='JSON object of pipeline template parameters (e.g. \'{"forceFullRun": true}\').',
    )
    return parser.parse_args()


def main() -> int:
    args = parse_args()

    base_url = (
        f"https://dev.azure.com/{args.organization}/{args.project}"
        f"/_apis/pipelines/{args.pipeline_id}/runs?api-version=7.1"
    )

    body: dict[str, Any] = {
        "resources": {
            "repositories": {
                # removeprefix (not lstrip, which strips a character set) so an
                # already-qualified ref is not double-prefixed.
                "self": {"refName": f"refs/heads/{args.branch.removeprefix('refs/heads/')}"}
            }
        },
    }

    params = json.loads(args.parameters)
    if isinstance(params, dict) and params:
        body["templateParameters"] = params

    # ADO REST API accepts Basic auth with an empty username and the PAT as password.
    import base64

    encoded = base64.b64encode(f":{args.token}".encode("utf-8")).decode("utf-8")
    auth_header = f"Basic {encoded}"

    print(f"Triggering pipeline {args.pipeline_id} on branch {args.branch} ...")
    response = request_json(
        base_url,
        method="POST",
        body=body,
        headers={"Authorization": auth_header},
        timeout=30,
        max_retries=2,
    )

    run_id = response.get("id")
    run_url = response.get("url")
    print(f"Queued run id={run_id} url={run_url}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@@ -54,6 +54,14 @@ TICKET_BLOCK_END = "<!-- AUTO-CHANGE-TICKETS:END -->"
AUTO_TICKET_THREAD_PREFIX = "AUTO-CHANGE-TICKET:"
AUTO_AI_REVIEW_THREAD_PREFIX = "AUTO-AI-REVIEW:"
COMPACT_AI_THREAD_NOTE = "_Full AI reviewer narrative is posted in a dedicated PR thread due to PR description limits._"
AUTO_DETERMINISTIC_THREAD_PREFIX = "AUTO-DETERMINISTIC-SUMMARY:"
COMPACT_DETERMINISTIC_THREAD_NOTE = (
    "_Full deterministic summary (including Top Risk Items) is posted in a dedicated PR thread "
    "due to Azure DevOps description size limits._"
)
ADO_PR_DESCRIPTION_MAX_LEN = 4000
AUTO_REVIEWER_GUIDE_THREAD_PREFIX = "AUTO-REVIEWER-GUIDE:"
COMPACT_REVIEWER_GUIDE_NOTE = "> 📋 Full **reviewer guide** is posted in a dedicated PR thread."

THREAD_STATUS_ACTIVE = 1
THREAD_STATUS_FIXED = 2
@@ -2035,6 +2043,29 @@ def _compact_deterministic_summary(deterministic_summary: str) -> str:
    return deterministic_summary[:idx].strip()


def _compact_reviewer_guide(description: str) -> str:
    """Replace the legacy long reviewer guide with a compact reference."""
    description = description or ""
    marker = "## Reviewer Quick Actions"
    idx = description.find(marker)
    if idx == -1:
        return description
    prefix = description[:idx].rstrip()
    if not prefix:
        return COMPACT_REVIEWER_GUIDE_NOTE + "\n"
    return prefix + "\n\n" + COMPACT_REVIEWER_GUIDE_NOTE + "\n"


def _append_reviewer_guide_note(description: str) -> str:
    """Append the compact reviewer guide note if not already present."""
    description = description or ""
    if COMPACT_REVIEWER_GUIDE_NOTE in description:
        return description
    if description.endswith("\n"):
        return description + COMPACT_REVIEWER_GUIDE_NOTE + "\n"
    return description + "\n\n" + COMPACT_REVIEWER_GUIDE_NOTE + "\n"
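The two guide helpers compose into an idempotent pair: compaction removes the legacy guide text, the note-append is a no-op once the note exists. A standalone sketch (`compact_guide`/`append_note` are illustrative stand-ins for `_compact_reviewer_guide`/`_append_reviewer_guide_note`):

```python
NOTE = "> 📋 Full **reviewer guide** is posted in a dedicated PR thread."
MARKER = "## Reviewer Quick Actions"

def compact_guide(description: str) -> str:
    """Drop everything from the legacy guide heading onward."""
    description = description or ""
    idx = description.find(MARKER)
    if idx == -1:
        return description
    prefix = description[:idx].rstrip()
    return (prefix + "\n\n" + NOTE + "\n") if prefix else NOTE + "\n"

def append_note(description: str) -> str:
    """Idempotently append the compact note."""
    description = description or ""
    if NOTE in description:
        return description
    sep = "" if description.endswith("\n") else "\n\n"
    return description + sep + NOTE + "\n"

legacy = "Summary of drift...\n\n## Reviewer Quick Actions\n- old guide"
out = append_note(compact_guide(legacy))
assert "old guide" not in out        # legacy guide stripped
assert out.endswith(NOTE + "\n")     # note added exactly once
```

Because both steps are idempotent, the sync pipeline can rerun them against an already-cleaned description without accumulating duplicate notes.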
def _remove_marked_block(description: str, start_marker: str, end_marker: str) -> str:
    description = description or ""
    pattern = re.compile(
@@ -2273,6 +2304,185 @@ def _sync_full_ai_review_thread(
    return True


def _deterministic_thread_marker(workload: str) -> str:
    return f"Automation marker: {AUTO_DETERMINISTIC_THREAD_PREFIX}{workload.strip().lower()}"


def _build_full_deterministic_thread_content(workload: str, deterministic_summary: str) -> str:
    marker = _deterministic_thread_marker(workload)
    return (
        "Automated review summary (full)\n\n"
        "PR description uses a compact review summary because of Azure DevOps description size limits.\n\n"
        f"{deterministic_summary}\n\n"
        f"{marker}"
    ).strip()


def _create_deterministic_thread(
    repo_api: str,
    pr_id: int,
    token: str,
    workload: str,
    deterministic_summary: str,
) -> None:
    content = _build_full_deterministic_thread_content(workload, deterministic_summary)
    _request_json(
        f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
        token=token,
        method="POST",
        body={
            "comments": [
                {
                    "parentCommentId": 0,
                    "content": content,
                    "commentType": 1,
                }
            ],
            "status": THREAD_STATUS_ACTIVE,
        },
    )


def _sync_deterministic_thread(
    repo_api: str,
    pr_id: int,
    token: str,
    workload: str,
    deterministic_summary: str,
) -> bool:
    marker = _deterministic_thread_marker(workload)
    desired_content = _build_full_deterministic_thread_content(workload, deterministic_summary)
    threads_payload = _request_json(
        f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
        token=token,
    )
    threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
    thread = _find_marked_thread(threads, marker)
    if thread is None:
        _create_deterministic_thread(repo_api, pr_id, token, workload, deterministic_summary)
        return True

    comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
    if _thread_has_matching_comment(comments, desired_content):
        return False

    thread_id = _thread_id(thread)
    if thread_id <= 0:
        _create_deterministic_thread(repo_api, pr_id, token, workload, deterministic_summary)
        return True

    if _is_thread_resolved(thread):
        _set_thread_status(repo_api, pr_id, thread_id, token, THREAD_STATUS_ACTIVE)
    _add_thread_comment(repo_api, pr_id, thread_id, token, desired_content)
    return True


def _close_deterministic_thread(
    repo_api: str,
    pr_id: int,
    token: str,
    workload: str,
) -> bool:
    marker = _deterministic_thread_marker(workload)
    threads_payload = _request_json(
        f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
        token=token,
    )
    threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
    thread = _find_marked_thread(threads, marker)
    if thread is None:
        return False
    thread_id = _thread_id(thread)
    if thread_id <= 0:
        return False
    if _is_thread_resolved(thread):
        return False
    _set_thread_status(repo_api, pr_id, thread_id, token, THREAD_STATUS_CLOSED)
    return True

def _reviewer_guide_thread_marker(workload: str) -> str:
    return f"Automation marker: {AUTO_REVIEWER_GUIDE_THREAD_PREFIX}{workload.strip().lower()}"


def _build_full_reviewer_guide_thread_content(workload: str) -> str:
    marker = _reviewer_guide_thread_marker(workload)
    return (
        "## Reviewer Quick Actions\n\n"
        "### 1) Accept all changes\n"
        "- Merge PR to accept drift into baseline.\n\n"
        "### 2) Reject whole PR and revert\n"
        "- Set reviewer vote to **Reject**.\n"
        "- Abandon PR.\n"
        "- Auto-remediation queues restore (if `AUTO_REMEDIATE_ON_PR_REJECTION=true`).\n\n"
        "### 3) Reject only selected policy changes\n"
        "- In each `Change Needed` policy thread, comment `/reject` for changes you do not want.\n"
        "- Optional: use `/accept` for changes you want to keep.\n"
        "- Wait for review-sync pipeline (about 5 minutes) to update PR diff.\n"
        "- Merge remaining accepted changes.\n"
        "- Post-merge auto-remediation queues restore to reconcile tenant to merged baseline "
        "(if `AUTO_REMEDIATE_AFTER_MERGE=true`).\n\n"
        f"{marker}"
    ).strip()


def _create_reviewer_guide_thread(
    repo_api: str,
    pr_id: int,
    token: str,
    workload: str,
) -> None:
    content = _build_full_reviewer_guide_thread_content(workload)
    _request_json(
        f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
        token=token,
        method="POST",
        body={
            "comments": [
                {
                    "parentCommentId": 0,
                    "content": content,
                    "commentType": 1,
                }
            ],
            "status": THREAD_STATUS_ACTIVE,
        },
    )


def _sync_reviewer_guide_thread(
    repo_api: str,
    pr_id: int,
    token: str,
    workload: str,
) -> bool:
    marker = _reviewer_guide_thread_marker(workload)
    desired_content = _build_full_reviewer_guide_thread_content(workload)
    threads_payload = _request_json(
        f"{repo_api}/pullrequests/{pr_id}/threads?api-version=7.1",
        token=token,
    )
    threads = threads_payload.get("value", []) if isinstance(threads_payload, dict) else []
    thread = _find_marked_thread(threads, marker)
    if thread is None:
        _create_reviewer_guide_thread(repo_api, pr_id, token, workload)
        return True

    comments = thread.get("comments", []) if isinstance(thread.get("comments"), list) else []
    if _thread_has_matching_comment(comments, desired_content):
        return False

    thread_id = _thread_id(thread)
    if thread_id <= 0:
        _create_reviewer_guide_thread(repo_api, pr_id, token, workload)
        return True

    if _is_thread_resolved(thread):
        _set_thread_status(repo_api, pr_id, thread_id, token, THREAD_STATUS_ACTIVE)
    _add_thread_comment(repo_api, pr_id, thread_id, token, desired_content)
    return True

def _set_thread_status(
    repo_api: str,
    pr_id: int,
@@ -2530,12 +2740,16 @@ def main() -> int:
    )

    full_pr = _request_json(f"{repo_api}/pullrequests/{pr_id}?api-version=7.1", token=token)
    current_description = full_pr.get("description", "")
    current_description = full_pr.get("description") or ""
    pr_is_draft = bool(full_pr.get("isDraft"))
    existing_fingerprint = _existing_change_fingerprint(current_description)
    existing_summary_version = _existing_summary_version(current_description)
    current_auto_body = _auto_block_body(current_description)
    deterministic_already_present = deterministic in current_auto_body if current_auto_body else False
    compact_deterministic = _compact_deterministic_summary(deterministic)
    deterministic_already_present = (
        (deterministic in current_auto_body)
        or (compact_deterministic in current_auto_body)
    ) if current_auto_body else False
    ai_fallback_in_current_block = _auto_block_contains_ai_fallback(current_auto_body)
    refresh_on_fallback = _env_bool("PR_AI_FORCE_REFRESH_ON_FALLBACK", default=True)
    if existing_fingerprint and existing_fingerprint == changes_fingerprint:
@@ -2549,7 +2763,7 @@ def main() -> int:
        repo_api=repo_api,
        token=token,
        pr_id=int(pr_id),
        title=full_pr.get("title", pr.get("title", f"{args.workload} drift review (rolling)")),
        title=full_pr.get("title") or pr.get("title") or f"{args.workload} drift review (rolling)",
        description=current_description,
        is_draft=pr_is_draft,
    )
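The description-update path that follows compacts proactively near the 4000-character Azure DevOps cap and also falls back reactively when the PATCH is rejected. A minimal sketch of that two-tier strategy (`patch_description` is hypothetical, and `ValueError` stands in for the real size-limit API error):

```python
ADO_PR_DESCRIPTION_MAX_LEN = 4000
HEADROOM = 100  # compact proactively when within 100 chars of the cap

def needs_compaction(description: str) -> bool:
    return len(description) > (ADO_PR_DESCRIPTION_MAX_LEN - HEADROOM)

def patch_description(description: str, send) -> tuple[str, bool]:
    """Try the full description first; fall back to a truncated compact form."""
    if not needs_compaction(description):
        try:
            send(description)
            return description, False
        except ValueError:  # reactive fallback on a size-limit rejection
            pass
    compact = description[: ADO_PR_DESCRIPTION_MAX_LEN - HEADROOM]
    send(compact)
    return compact, True

sent: list[str] = []
final, compacted = patch_description("x" * 5000, sent.append)
assert compacted is True and len(final) == 3900   # proactive path
final, compacted = patch_description("short summary", sent.append)
assert compacted is False and final == "short summary"
```

When compaction happens, the full deterministic summary is not lost: the real code reposts it into a dedicated marked PR thread.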
@@ -2625,29 +2839,44 @@ def main() -> int:
    updated_description = _upsert_auto_block(current_description, auto_block)
    # Cleanup legacy description-based ticket checklist if present.
    updated_description = _remove_marked_block(updated_description, TICKET_BLOCK_START, TICKET_BLOCK_END)
    # Strip legacy long reviewer guide and ensure compact note is present.
    updated_description = _compact_reviewer_guide(updated_description)
    updated_description = _append_reviewer_guide_note(updated_description)

    patch_url = f"{repo_api}/pullrequests/{pr_id}?api-version=7.1"
    patch_title = full_pr.get("title", pr.get("title", f"{args.workload} drift review (rolling)"))
    patch_title = full_pr.get("title") or pr.get("title") or f"{args.workload} drift review (rolling)"
    summary_updated = False
    final_description = current_description
    description_compacted = False
    print(
        f"DEBUG summary: pr_id={pr_id} workload={args.workload} "
        f"status={full_pr.get('status')} isDraft={full_pr.get('isDraft')} "
        f"mergeStatus={full_pr.get('mergeStatus')} title_len={len(patch_title)} "
        f"current_desc_len={len(current_description or '')} updated_desc_len={len(updated_description or '')}"
    )
    # Proactively compact if we are near the Azure DevOps PR description limit.
    if len(updated_description) > (ADO_PR_DESCRIPTION_MAX_LEN - 100):
        description_compacted = True

    if updated_description != current_description:
        try:
            _request_json(
                patch_url,
                token=token,
                method="PATCH",
                body={
                    "title": patch_title,
                    "description": updated_description,
                },
            )
            summary_updated = True
            final_description = updated_description
        except RuntimeError as exc:
            if not _is_description_limit_error(exc):
                raise
            description_compacted = True
        if not description_compacted:
            try:
                _request_json(
                    patch_url,
                    token=token,
                    method="PATCH",
                    body={
                        "title": patch_title,
                        "description": updated_description,
                    },
                )
                summary_updated = True
                final_description = updated_description
            except RuntimeError as exc:
                if not _is_description_limit_error(exc):
                    raise
                description_compacted = True
    if description_compacted:
        compact_ai_block = ""
        if ai_summary:
            compact_ai_block = "\n### AI Reviewer Narrative\n" + COMPACT_AI_THREAD_NOTE
@@ -2660,6 +2889,8 @@ def main() -> int:
            "",
            f"- **Summary Version:** `{AUTO_SUMMARY_VERSION}`",
            _compact_deterministic_summary(deterministic),
            "",
            COMPACT_DETERMINISTIC_THREAD_NOTE,
            compact_ai_block,
            AUTO_BLOCK_END,
        ]
@@ -2670,10 +2901,11 @@ def main() -> int:
            )
            if compact_description == updated_description:
                raise
            print(
                "WARNING: Full PR summary update failed; retrying with compact summary block. "
                f"Reason: {exc}"
            )
            if not summary_updated:
                print(
                    "INFO: Full PR summary exceeds Azure DevOps description limit; "
                    "using compact summary in description and posting full details to a PR thread."
                )
            try:
                _request_json(
                    patch_url,
@@ -2697,6 +2929,7 @@ def main() -> int:
            f"- **Summary Version:** `{AUTO_SUMMARY_VERSION}`",
            _compact_deterministic_summary(deterministic),
            "",
            COMPACT_DETERMINISTIC_THREAD_NOTE,
            COMPACT_AI_THREAD_NOTE,
            AUTO_BLOCK_END,
        ]
@@ -2720,6 +2953,34 @@ def main() -> int:
    else:
        final_description = updated_description

    if description_compacted:
        try:
            thread_updated = _sync_deterministic_thread(
                repo_api=repo_api,
                pr_id=int(pr_id),
                token=token,
                workload=args.workload,
                deterministic_summary=deterministic,
            )
            if thread_updated:
                print(f"Updated full deterministic summary thread for PR #{pr_id} ({args.workload}).")
            else:
                print(f"Full deterministic summary thread already up to date for PR #{pr_id} ({args.workload}).")
        except Exception as exc:
            print(f"WARNING: Failed to sync full deterministic summary thread for PR #{pr_id}: {exc}")
    else:
        try:
            closed = _close_deterministic_thread(
                repo_api=repo_api,
                pr_id=int(pr_id),
                token=token,
                workload=args.workload,
            )
            if closed:
                print(f"Closed full deterministic summary thread for PR #{pr_id} ({args.workload}) because description now fits.")
        except Exception as exc:
            print(f"WARNING: Failed to close deterministic summary thread for PR #{pr_id}: {exc}")

    if summary_updated:
        print(f"Updated automated review summary for PR #{pr_id} ({args.workload}).")
    else:
@@ -2739,6 +3000,19 @@ def main() -> int:
            print(f"Full AI reviewer narrative thread already up to date for PR #{pr_id} ({args.workload}).")
        except Exception as exc:
            print(f"WARNING: Failed to sync full AI reviewer narrative thread for PR #{pr_id}: {exc}")
        try:
            guide_updated = _sync_reviewer_guide_thread(
                repo_api=repo_api,
                pr_id=int(pr_id),
                token=token,
                workload=args.workload,
            )
            if guide_updated:
                print(f"Updated reviewer guide thread for PR #{pr_id} ({args.workload}).")
            else:
                print(f"Reviewer guide thread already up to date for PR #{pr_id} ({args.workload}).")
        except Exception as exc:
            print(f"WARNING: Failed to sync reviewer guide thread for PR #{pr_id}: {exc}")
        if _publish_draft_pr(
            repo_api=repo_api,
            token=token,
@@ -1,4 +1,4 @@
# tenant-state

This directory is populated automatically by the ASTRAL pipeline.
This directory is populated automatically by the ASTRAL backup pipeline.
Do not place manual files here; they will be overwritten on the next export.