Data Factory Data Validation Gate
After an Azure Data Factory load completes, the flow runs validation queries (row counts, null checks, referential integrity, reconciliation to source) against the target, and only on pass promotes the data or signals downstream; on fail it quarantines and alerts. Adds a quality gate to ADF loads.
Provided as-is, without warranty of any kind. Review and test each pattern in a non-production environment before deploying it to live automations. See our Terms.
Overview
This flow adds a post-load quality gate to Azure Data Factory data loads. When an ADF load pipeline finishes, ADF's tail Web activity calls this flow with the completed run id. The flow confirms the load succeeded, runs a SQL validation query against the load target (row counts, null checks, referential integrity, reconciliation), and then gates: if the data is clean it starts a promote pipeline to publish downstream; if not it starts a quarantine pipeline and alerts. Every evaluation is logged to Dataverse and announced in Teams.
Why it matters: loading bad data silently corrupts downstream reports and consumers. A post-load validation gate stops bad batches from propagating and gives a full audit trail of every gate decision.
Ships Off (demo).
Use Case
A data engineering team runs nightly/continuous ADF loads into a SQL target. Before downstream models, dashboards, or exports consume a freshly loaded batch, they want it automatically validated and either promoted (clean) or quarantined (dirty) - with no manual checking and a record of every decision.
Flow Architecture
When an ADF Load Completes
Request (HTTP)Receives the completed load's runId and pipelineName from ADF's tail Web activity.
Initialize Trace & Config
Initialize variableMints a correlation id and binds the ADF subscription/resource group/factory, promote and quarantine pipelines, SQL server/database, validation query, Teams ids, and fail threshold from env vars.
Confirm Load Run Status
Azure Data Factory - GetPipelineRunReads the completed load run to confirm its status before trusting the data.
Run Validation Query
SQL - ExecutePassThroughNativeQuery_V2Runs the data-quality validation query against the target and extracts a FailCount.
Evaluate Validation Gate
ConditionPasses when the load status is Succeeded AND FailCount is within the fail threshold.
Promote (Pass)
Azure Data Factory - CreatePipelineRunStarts the promote pipeline, logs a Passed row to Dataverse, and posts a success message to Teams.
Quarantine (Fail)
Azure Data Factory - CreatePipelineRunStarts the quarantine pipeline, logs a Failed row to Dataverse, and alerts Teams.
Environment Variables
| Schema name | Type | Default | Description |
|---|---|---|---|
| flowlibs_AdfSubscriptionId | String | <your-subscription-id> | Azure subscription hosting the Data Factory. |
| flowlibs_AdfResourceGroup | String | rg-data-platform | Resource group of the Data Factory. |
| flowlibs_AdfFactoryName | String | adf-analytics | The Data Factory name. |
| flowlibs_PromotePipelineName | String | pl_PromoteToProduction | ADF pipeline run on PASS. |
| flowlibs_QuarantinePipelineName | String | pl_QuarantineLoad | ADF pipeline run on FAIL. |
| flowlibs_SqlServerName | String | your-sqlserver.database.windows.net | SQL server hosting the load target. |
| flowlibs_SqlDatabaseName | String | AnalyticsDb | SQL database hosting the load target. |
| flowlibs_ValidationQuery | String | SELECT COUNT(*) AS FailCount FROM dbo.SalesStaging WHERE Amount IS NULL OR CustomerId IS NULL | Validation query returning a single FailCount column. |
| flowlibs_AdminTeamsGroupId | String | <your-team-id> | Teams team (group) id for notifications. |
Connectors & Connections
| Connector | API name | Actions used |
|---|---|---|
| Azure Data Factory | shared_azuredatafactory | GetPipelineRun CreatePipelineRun |
| SQL Server | shared_sql | ExecutePassThroughNativeQuery_V2 |
| Microsoft Teams | shared_teams | PostMessageToConversation |
| Microsoft Dataverse | shared_commondataserviceforapps | CreateRecord |
Note — All connections are referenced as solution connection references; the flow is portable between environments as long as a connection is mapped at import time.
Customization Guide
Almost every realistic variant of this flow can be implemented by changing environment variable values. A few cases require small edits inside the flow definition — those are called out explicitly below.
- Validation query
- Replace flowlibs_ValidationQuery with your own checks. It must return a single column named FailCount. For reconciliation, return the count of mismatched/missing rows.
- Soft vs hard gate
- Raise the fail threshold above 0 to tolerate a small number of exceptions (soft gate) instead of blocking on any defect (hard gate).
- Outcome pipelines
- Point the promote and quarantine pipeline names at your real ADF pipelines. Promote can publish to production or refresh a dataset; quarantine can move the batch aside and open a ticket.
- Multiple checks
- Split the single query into several SQL actions (nulls, RI, reconciliation) and sum their fail counts before the gate, or branch per check type.
- Trend reporting
- The Dataverse audit table makes pass-rate-over-time and quarantine-frequency reports trivial in Power BI.
Key Expressions
The flow is intentionally light on Power Fx / WDL gymnastics — the heaviest expressions are the branch-name concatenation and the approval outcome check. They are listed below in the order they appear in the flow.
EXPR.01Correlation id
Reuse an upstream correlation id or mint a new one.
EXPR.02Fail count (robust parse)
Safely extracts FailCount from the first result row.
EXPR.03Pipeline run id
Run id returned by CreatePipelineRun for the audit log.
Comments
Sign in to join the conversation.
Sign inNo comments yet. Be the first to share your experience with this flow.