Data Factory Source Arrival Watchdog
On a schedule, the flow checks whether expected source files have arrived in a Blob landing folder before their ADF pipeline is due; if present it triggers the pipeline, and if a source is late it posts a within-grace notice or, past the grace period, a high-importance escalation naming the data provider. Prevents pipelines from running on missing inputs.
Provided as-is, without warranty of any kind. Review and test each pattern in a non-production environment before deploying it to live automations. See our Terms.
Overview
This solution prevents Azure Data Factory pipelines from running on missing inputs. On a schedule it checks whether expected source files have arrived in an Azure Blob landing folder before the ADF load pipeline is due. If the expected files are present, it triggers the pipeline and posts a success notice to Microsoft Teams. If they are missing, it computes how overdue the source is against a configurable daily deadline and either posts an informational within-grace notice or - once the grace period is exceeded - posts a high-importance escalation naming the upstream data provider to chase.
Why it matters: running a pipeline on incomplete inputs produces bad or partial data. A source watchdog ensures the load only fires when inputs are ready, and gives the team early, actionable warning when a provider is late.
Ships Off (demo).
Use Case
A data team depends on partner/source files landing in a blob folder before a nightly ADF load. The watchdog runs on an interval through the load window: it confirms arrival, kicks off the pipeline when ready, and escalates to the provider if a source is overdue beyond an agreed grace period.
Flow Architecture
Recurrence Check Window
RecurrenceRe-checks source arrival hourly within the grace window.
Initialize Config (x12)
Initialize variableLoads the expected prefix, grace minutes, deadline hour, storage account, source folder, ADF subscription/RG/factory/pipeline, Teams ids, and provider email.
List & Filter Source Files
Azure Blob ListFolder_V4 + Filter arrayLists the landing folder and keeps only files whose name starts with the expected prefix.
Sources Ready?
Conditionlength(filtered) > 0 gates whether the load pipeline runs.
Trigger Load + Notify (ready)
ADF CreatePipelineRun + TeamsStarts the ADF load pipeline and posts a success notice with the new run id.
Compute Overdue & Escalate (not ready)
Compose + Condition + TeamsComputes today's deadline and minutes overdue; posts a within-grace notice, or once past the grace window a high-importance escalation naming the data provider.
Environment Variables
| Schema name | Type | Default | Description |
|---|---|---|---|
| flowlibs_WatchdogExpectedPrefix | String | sales_export_ | File-name prefix an arriving source file must match. |
| flowlibs_WatchdogGraceMinutes | String | 120 | Minutes past the deadline before escalating. |
| flowlibs_WatchdogDeadlineHourUtc | String | 6 | Hour of day (UTC, 0-23) by which sources must arrive. |
| flowlibs_WatchdogStorageAccount | String | stsourcefilesprod | Azure Blob storage account (dataset) holding source files. |
| flowlibs_WatchdogSourceFolder | String | /incoming | Blob folder path scanned for arriving files. |
| flowlibs_AdfSubscriptionId | String | <your-subscription-id> | Azure subscription GUID hosting the Data Factory. |
| flowlibs_AdfResourceGroup | String | rg-data-prod | Resource group containing the Data Factory. |
| flowlibs_AdfFactoryName | String | adf-enterprise-prod | Name of the Azure Data Factory. |
| flowlibs_AdfLoadPipeline | String | pl_nightly_load | Pipeline triggered once all sources have arrived. |
Connectors & Connections
| Connector | API name | Actions used |
|---|---|---|
| Azure Data Factory | shared_azuredatafactory | CreatePipelineRun |
| Azure Blob Storage | shared_azureblob | ListFolder_V4 |
| Microsoft Teams | shared_teams | PostMessageToConversation |
Note — All connections are referenced as solution connection references; the flow is portable between environments as long as a connection is mapped at import time.
Customization Guide
Almost every realistic variant of this flow can be implemented by changing environment variable values. A few cases require small edits inside the flow definition — those are called out explicitly below.
- Per-source SLA / multiple files
- Extend the expected prefix into a list and loop the filter/condition per expected file, tracking a missing array.
- Different deadline cadence
- Change the recurrence frequency and the deadline hour / grace minutes values.
- Partial run
- In the ready branch, pass pipeline parameters to CreatePipelineRun to load only the sources that have arrived.
- Provider ping by email
- Add an Outlook SendEmailV2 in the escalation branch using the provider email.
- Run-status follow-up
- Add ADF GetPipelineRun after CreatePipelineRun to confirm the run started, using the returned runId.
Key Expressions
The flow is intentionally light on Power Fx / WDL gymnastics — the heaviest expressions are the branch-name concatenation and the approval outcome check. They are listed below in the order they appear in the flow.
EXPR.01Sources arrived
Gate for triggering the pipeline.
EXPR.02Match expected file
Filter predicate for expected source files.
EXPR.03Today's deadline (UTC)
Arrival deadline for the day.
EXPR.04Minutes overdue
How late the source is right now.
Comments
Sign in to join the conversation.
Sign inNo comments yet. Be the first to share your experience with this flow.