corvus,
an audit of who you are, across systems.
A read-only PowerShell module that triages duplicate Active Directory user objects in environments federating Oracle HCM, Workday, UKG, SailPoint, CyberArk, AD, and Entra ID.
corvus is one PowerShell module file (~1500 lines, 29 exported functions) that runs against an existing AD environment and writes JSON / CSV / summary artifacts. It is read-only by design. Detectors investigate and report; nothing in the module ever calls Set-AD* or Remove-AD*. The wrapper script Invoke-Corvus.ps1 runs preflight plus a full sweep and lands a run folder under %TEMP%\Corvus\<RunId>\.
I. Tech scope
- Detectors named Find-Corvus* follow a strict contract: each takes -Inventory (the output of Get-CorvusInventory) and emits findings as a side effect via New-CorvusFinding. Adding a new detector means writing the function, exporting it via Export-ModuleMember -Function, and inserting one call in the orchestrator. The contract is uniform on purpose: a contributor adding a vendor-specific check should not have to learn a new function shape.
- Module-scoped state lives in four variables ($script:RunContext, $script:Findings, $script:Inventory, $script:Config) that are mutated across calls within a run. Initialize-CorvusRun resets all four; detectors append findings via New-CorvusFinding; the orchestrator reads $script:Findings at the end. Module scope is a deliberate choice over a pipeline of return values because PowerShell pipelines flatten nested objects in ways that complicate "one run, many findings, many sources" without an external accumulator.
- Configuration ships with vendor-account name patterns for SailPoint, Workday, Oracle HCM, UKG, CyberArk, and Entra Connect; the patterns are regex, not glob, because vendor-account naming conventions change with major release upgrades and a regex is easier to ratchet without breaking older environments. AD attribute mappings live in the same config: extensionAttribute1 carries the source system code, extensionAttribute14 carries the source record id. Override per-environment with Set-CorvusConfiguration; never edit the in-module defaults, because the defaults are what makes a fresh deploy reproduce the same triage as the reference environment.
- Reads the Security event log (event id 4720, "A user account was created") from domain controllers when available. The event-log pull is what lets corvus detect a freshly-provisioned shadow account before it shows up in the next AD inventory snapshot. -SkipEventLog falls back to inventory-only mode for environments where the operator does not have Read-EventLog rights on the DCs; the run is still useful, just narrower.
- RSAT’s ActiveDirectory module is the only hard dependency. Everything else (HCM, Workday, UKG, SailPoint, CyberArk, Entra) is consumed via REST or LDAP from inside Get-CorvusInventory, with provider-specific connectors that fail soft: if Workday is unreachable, the run continues with the other six sources and the report calls out the missing source explicitly.
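The detector contract and the module-scoped accumulator described above can be illustrated with a small sketch. Everything here is a stand-in written for illustration: the body of New-CorvusFinding, its parameter names, the detector name Find-CorvusExampleVendorAccount, the config key VendorAccountPatterns, and the two regex patterns are assumptions, not the module's real internals.

```powershell
# Hypothetical stand-ins for corvus internals; the real shapes may differ.
$script:Findings = [System.Collections.Generic.List[object]]::new()
$script:Config = @{
    # Regex (not glob) patterns. These two patterns are invented examples.
    VendorAccountPatterns = @{
        SailPoint = '^svc[-_]?sailpoint'
        Workday   = '^svc[-_]?wd'
    }
}

function New-CorvusFinding {
    param(
        [Parameter(Mandatory)] [string] $DetectorId,
        [Parameter(Mandatory)] [string] $SamAccountName,
        [Parameter(Mandatory)] [string] $Detail
    )
    # Side effect only: append to the script-scoped accumulator, emit nothing.
    $script:Findings.Add([pscustomobject]@{
        DetectorId     = $DetectorId
        SamAccountName = $SamAccountName
        Detail         = $Detail
    })
}

function Find-CorvusExampleVendorAccount {
    param(
        [Parameter(Mandatory)] [object[]] $Inventory
    )
    foreach ($user in $Inventory) {
        foreach ($vendor in $script:Config.VendorAccountPatterns.Keys) {
            $pattern = $script:Config.VendorAccountPatterns[$vendor]
            # -match is case-insensitive by default in PowerShell.
            if ($user.SamAccountName -match $pattern) {
                $finding = @{
                    DetectorId     = 'ExampleVendorAccount'
                    SamAccountName = $user.SamAccountName
                    Detail         = "Matches $vendor account pattern"
                }
                New-CorvusFinding @finding
            }
        }
    }
}

# Invented inventory rows; a real inventory would come from Get-CorvusInventory.
$inventory = @(
    [pscustomobject]@{ SamAccountName = 'svc-sailpoint01' }
    [pscustomobject]@{ SamAccountName = 'jdoe' }
)
Find-CorvusExampleVendorAccount -Inventory $inventory
```

The detector returns nothing; the caller reads $script:Findings afterwards, which mirrors the "orchestrator reads the accumulator at the end" shape described in the list.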
II. What “read-only” means here
The module never calls a write cmdlet against any directory. Detectors investigate and report; remediation is the operator’s responsibility, performed in their own change-control process under their own approvals. This is preserved as an invariant: every new detector is reviewed against the rule, the test harness uses a mock directory whose write methods throw NotImplementedException, and a contributor adding a Set-AD* or Remove-AD* call would have to do so deliberately and against active resistance. The "RO" plate above is a contract, not a description.
The deeper reason is that auditors and IGA leads have to be able to point at corvus and say "this can’t make the problem worse." A triage tool that holds write credentials introduces a new compromise vector by existing; corvus eliminates it by construction.
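One way a reviewer could check the invariant mechanically is to parse the module source and list any statically named call to a write cmdlet. The sketch below uses PowerShell's built-in AST parser; the function name Find-WriteCmdletCall and the sample script text are hypothetical, and nothing in the source says corvus ships such a check. It would complement, not replace, the mock directory in the test harness.

```powershell
function Find-WriteCmdletCall {
    param(
        [Parameter(Mandatory)] [string] $ScriptText,
        [string[]] $ForbiddenPatterns = @('Set-AD*', 'Remove-AD*')
    )
    $tokens = $null
    $errors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($ScriptText, [ref] $tokens, [ref] $errors)

    # Collect every command invocation, including those inside nested script blocks.
    $isCommand = { param($node) $node -is [System.Management.Automation.Language.CommandAst] }
    $commands = $ast.FindAll($isCommand, $true)

    foreach ($command in $commands) {
        $name = $command.GetCommandName()
        # Dynamic invocations such as "& $cmd" have no static name and are skipped.
        if (-not $name) { continue }
        foreach ($pattern in $ForbiddenPatterns) {
            if ($name -like $pattern) {
                [pscustomobject]@{
                    Command = $name
                    Line    = $command.Extent.StartLineNumber
                }
            }
        }
    }
}

# Invented sample text; in practice this would be the module file's contents.
$sample = @'
function Find-CorvusExample {
    Get-ADUser -Filter *
    Set-ADUser -Identity jdoe -Enabled $false
}
'@
$violations = @(Find-WriteCmdletCall -ScriptText $sample)
```

Because the scan only sees statically named commands, a deliberately obfuscated call would slip past it, which is why human review remains part of the rule.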
III. Identity normalization
Every connector has its own definition of "the same person." Workday’s worker id is canonical inside Workday and meaningless to AD; SailPoint’s identityName may match Entra’s userPrincipalName, but only after a normalization pass that strips domain suffixes and applies Unicode case folding. corvus does the normalization explicitly and keeps the rules in $script:Config.IdentityKeys as data, not as code, so a triage-time rule change is a config edit rather than a module re-publish. Idempotency is non-negotiable: the same input across the same seven sources yields a byte-identical duplicate ledger across runs, which is what makes corvus reports comparable week over week and trustworthy as evidence in an audit conversation.
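A minimal sketch of such a normalization pass follows. The function names are hypothetical, the rules are hard-coded here rather than read from $script:Config.IdentityKeys (whose shape the text does not show), and ToLowerInvariant is used as an approximation of Unicode case folding because .NET has no dedicated case-fold API.

```powershell
function Get-NormalizedIdentityKey {
    param([Parameter(Mandatory)] [string] $Value)
    # Drop a domain suffix such as "@contoso.com" when one is present.
    $local = ($Value -split '@', 2)[0]
    # NFKC unifies equivalent Unicode sequences; lowercasing approximates case folding.
    $local.Trim().Normalize([System.Text.NormalizationForm]::FormKC).ToLowerInvariant()
}

function Get-IdentityHash {
    param([Parameter(Mandatory)] [string] $Value)
    $key = Get-NormalizedIdentityKey -Value $Value
    $sha = [System.Security.Cryptography.SHA256]::Create()
    try {
        $bytes = $sha.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($key))
    }
    finally {
        $sha.Dispose()
    }
    # Lowercase hex, no separators: the same key always yields the same string.
    -join ($bytes | ForEach-Object { $_.ToString('x2') })
}

# A UPN-style value and a bare account name collapse to the same key.
$fromEntra     = Get-IdentityHash -Value 'Jane.Doe@Contoso.com'
$fromSailPoint = Get-IdentityHash -Value 'jane.doe'
```

Both functions are pure: they depend only on their input, which is the property the idempotency requirement relies on.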
IV. Output
One run produces three artifacts in %TEMP%\Corvus\<RunId>\, where RunId is an ISO-8601-ish stamp plus a four-character entropy suffix so concurrent runs from a shared host don’t collide:
- JSON dump of every finding. Machine-readable, schema-stable, suitable for feeding into a SIEM, a ticketing system, or a downstream PowerBI report. Every finding carries a FindingHash derived from (detector_id, identity_hash, source_system), which is what lets the operator dedupe across runs without inventing a join key.
- CSV with one row per finding. Spreadsheet-friendly for the analysts who triage by sort-and-filter rather than by SQL.
- Markdown summary written for a human to read first. The summary leads with the count of "duplicate identity clusters needing review", names the top three vendor-account collisions, and links every cluster back to a section of the JSON for the deeper drill.
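The RunId and FindingHash described above might be produced along these lines. This is a sketch under assumptions: the function names, the exact stamp format, the use of a GUID prefix as the entropy suffix, and SHA-256 as the hash are all illustrative choices, since the text specifies only the inputs and the overall shape.

```powershell
function New-ExampleRunId {
    # Sortable UTC stamp plus four random hex characters to avoid collisions.
    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $stamp   = [DateTime]::UtcNow.ToString("yyyyMMdd'T'HHmmss'Z'", $culture)
    $suffix  = [Guid]::NewGuid().ToString('N').Substring(0, 4)
    "$stamp-$suffix"
}

function Get-ExampleFindingHash {
    param(
        [Parameter(Mandatory)] [string] $DetectorId,
        [Parameter(Mandatory)] [string] $IdentityHash,
        [Parameter(Mandatory)] [string] $SourceSystem
    )
    # A newline separator (assumed absent from the fields) keeps ('a','bc') distinct from ('ab','c').
    $material = ($DetectorId, $IdentityHash, $SourceSystem) -join "`n"
    $sha = [System.Security.Cryptography.SHA256]::Create()
    try {
        $bytes = $sha.ComputeHash([System.Text.Encoding]::UTF8.GetBytes($material))
    }
    finally {
        $sha.Dispose()
    }
    -join ($bytes | ForEach-Object { $_.ToString('x2') })
}

$runId = New-ExampleRunId
$hashA = Get-ExampleFindingHash -DetectorId 'ExampleVendorAccount' -IdentityHash 'aaa111' -SourceSystem 'AD'
$hashB = Get-ExampleFindingHash -DetectorId 'ExampleVendorAccount' -IdentityHash 'aaa111' -SourceSystem 'AD'
```

The hash takes no timestamp or RunId as input, so the same finding observed in two different runs hashes to the same value; that is the property that makes cross-run deduplication work.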
V. Surface
The surface is a PowerShell module: Import-Module corvus; Invoke-CorvusTriage -Sources @(...). Read-only by contract — the module is configured with read-only credentials for the seven sources, so the worst it can do operationally is exhaust a connector’s rate budget. Output is a normalized inventory table per source (one row per identity record, with a stable schema across vendors) plus a cross-source duplicate ledger keyed on a deterministic identity hash. The hash is the join key that lets a duplicate cluster span Workday + AD + Entra without depending on any one vendor’s notion of canonical id.
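To make the ledger idea concrete, here is a small sketch that groups normalized inventory rows by identity hash and keeps the clusters containing more than one record. The row schema (Source, RecordId, IdentityHash), the sample values, and the "more than one record" threshold are assumptions for illustration; the real inventory schema is not shown in the text.

```powershell
# Invented rows standing in for the per-source normalized inventory tables.
$rows = @(
    [pscustomobject]@{ Source = 'Workday'; RecordId = 'W-1001';           IdentityHash = 'aaa111' }
    [pscustomobject]@{ Source = 'AD';      RecordId = 'jdoe';             IdentityHash = 'aaa111' }
    [pscustomobject]@{ Source = 'Entra';   RecordId = 'jdoe@example.com'; IdentityHash = 'aaa111' }
    [pscustomobject]@{ Source = 'AD';      RecordId = 'asmith';           IdentityHash = 'bbb222' }
)

$ledger = @(
    $rows |
        Group-Object -Property IdentityHash |
        Where-Object { $_.Count -gt 1 } |
        Sort-Object -Property Name |
        ForEach-Object {
            [pscustomobject]@{
                IdentityHash = $_.Name
                # Sorting inside each cluster keeps the output stable across runs.
                Sources      = @($_.Group.Source | Sort-Object -Unique)
                Records      = @($_.Group | Sort-Object -Property Source, RecordId)
            }
        }
)
```

Because the grouping key is the hash rather than any vendor id, the Workday, AD, and Entra rows land in one cluster without any source being treated as canonical. The explicit sorts are there so that the same input produces the same ledger ordering on every run.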