Cosmos DB Document Validation and Enrichment
On a schedule, the flow picks up unprocessed documents, validates them against rules, enriches missing fields (geocode, currency, lookups), writes the enriched version back to Cosmos, and routes invalid documents to a dead-letter container with reasons. Keeps Cosmos data clean and complete.
Provided as-is, without warranty of any kind. Review and test each pattern in a non-production environment before deploying it to live automations. See our Terms.
Overview
This flow keeps Azure Cosmos DB data clean and complete. On a schedule it polls a source container for documents that have not yet been processed and validates each against a configurable required-fields ruleset. Valid documents are enriched via an external API (geocode / FX / lookups) and written back; invalid documents are dead-lettered, with the list of missing fields, into a separate container. A Microsoft Teams summary reports the counts for each run. A single correlationId (minted with guid() in the first action) is stamped on every written document and on the Teams summary so a run can be traced end to end. The flow ships turned off; going live requires only authorizing the connections and setting the environment variable values.
Use Case
A Cosmos-backed application accepts documents from many sources, with inconsistent completeness. The data team wants every document validated against a minimum schema, missing reference data filled in automatically, and anything that cannot be validated quarantined with a reason so it can be fixed and replayed — all without touching the app.
The flow:
- Validates every document against a configurable required-fields ruleset
- Enriches valid documents via an external API and writes them back to Cosmos
- Quarantines invalid documents to a dead-letter container with the missing-field reasons
- Stays loop-safe: each document is flagged so subsequent runs never reprocess it
Flow Architecture
Recurrence
Recurrence (Hour/1): Polls hourly for unprocessed documents.
Initialize varCorrelationId
InitializeVariable: @guid() trace ID stamped on every written document and the Teams summary.
Initialize binding + counter variables
InitializeVariable (x14): Binds the Cosmos account/db/container, dead-letter container, enrichment API base + key, validation rules, status field, enriched-flag field, and Teams group/channel; plus the integer counters varValidCount and varInvalidCount.
Compose Cosmos Query
Compose: SELECT * FROM c WHERE (NOT IS_DEFINED(c[enrichedField]) OR c[enrichedField] = false).
Query Unprocessed Documents
Azure Cosmos DB - QueryDocuments_V5: Reads all unprocessed documents from the source container.
Apply to each Document - Is Document Valid
Foreach (concurrency = 1) + If condition: Composes the required fields, computes the missing fields, then branches on whether the document has zero missing required fields. For a valid document:
- Enrich Via HTTP — GET to the enrichment API.
- Compose Enriched Document — setProperty chain: status = validated, enriched flag = true, enrichment payload, enrichedAt, correlationId.
- Upsert Enriched Document — Cosmos CreateDocument_V3 (upsert) back into the source container.
- Increment Valid Count — IncrementVariable varValidCount.
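The per-document logic above can be sketched outside Power Automate. The Python below is only an illustration of the control flow, not the flow's definition: `run_once`, the `enrich` callable (a stand-in for the HTTP GET), and the property names `status`, `_enriched`, `enrichment`, and `missingFields` are assumptions made for this example.

```python
import uuid


def run_once(documents, required, enrich):
    """Process one batch sequentially and return what would be written."""
    correlation_id = str(uuid.uuid4())  # stands in for guid() in the first action
    enriched, dead_letters = [], []
    for doc in documents:  # one document at a time, like concurrency = 1
        # A field counts as missing here when it is absent, null, or an empty string.
        missing = [f for f in required if doc.get(f) in (None, "")]
        if not missing:
            enriched.append({
                **doc,  # keep every original field
                "status": "validated",
                "_enriched": True,
                "enrichment": enrich(doc),  # hypothetical stand-in for the API call
                "correlationId": correlation_id,
            })
        else:
            dead_letters.append({
                **doc,
                "missingFields": missing,  # the reason the document was quarantined
                "correlationId": correlation_id,
            })
    return {
        "correlationId": correlation_id,
        "validCount": len(enriched),
        "invalidCount": len(dead_letters),
        "enriched": enriched,
        "deadLetters": dead_letters,
    }
```

Because the loop is sequential, the two counts are simply the lengths of the two result lists, which mirrors why the flow can rely on plain counter variables.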
Environment Variables
| Schema name | Type | Default | Description |
|---|---|---|---|
| flowlibs_CosmosAccountName | String | flowlibs-cosmos | Cosmos DB (SQL API) account name. |
| flowlibs_CosmosDatabaseId | String | ReferenceDb | Cosmos database id. |
| flowlibs_CosmosContainerId | String | ReferenceData | Source container holding incoming documents. |
| flowlibs_TeamsGroupId | String | <configure> | Microsoft Teams team (group) ID for the run summary. |
| flowlibs_TeamsChannelId | String | <configure> | Microsoft Teams channel ID for the run summary. |
| flowlibs_DeadLetterContainer | String | deadletter | Container that receives invalid documents. |
| flowlibs_EnrichApiBase | String | https://api.contoso-enrichment.example | Base URL of the enrichment API. |
| flowlibs_EnrichApiKey | String | REPLACE_WITH_ENRICHMENT_API_KEY | API key sent as the x-api-key header to the enrichment API. |
| flowlibs_ValidationRules | String | {"required":["docType","country","amount"]} | Required-fields ruleset (JSON). |
Connectors & Connections
| Connector | API name | Actions used |
|---|---|---|
| Azure Cosmos DB | shared_documentdb | QueryDocuments_V5, CreateDocument_V3 (upsert) |
| Microsoft Teams | shared_teams | PostMessageToConversation |
| HTTP | (built-in action) | GET (enrichment API call) |
Note — All connections are referenced as solution connection references; the flow is portable between environments as long as a connection is mapped at import time.
Customization Guide
Almost every realistic variant of this flow can be implemented by changing environment variable values. A few cases require small edits inside the flow definition — those are called out explicitly below.
- Validation rules: Edit flowlibs_ValidationRules to add or remove required fields; no flow edit is needed. Extend Compose Missing Fields for format/type checks.
- Enrichment: Point flowlibs_EnrichApiBase at the real API and map its response fields into Compose Enriched Document. If the API returns a confidence score, write back only high-confidence enrichment.
- Replay: After fixing a dead-lettered document, clear its _enriched flag (or copy it back to the source container) and the next run reprocesses it.
- Trigger: Swap the hourly recurrence for the Cosmos change-feed trigger if near-real-time processing is required.
- Concurrency: The loop runs sequentially (concurrency = 1) so the counters are exact; raise it for throughput only if you replace the counters with a different aggregation.
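As one way to picture the format/type extension mentioned under validation rules, here is a small Python sketch. The specific checks are assumptions chosen for illustration (a two-letter uppercase country code and a non-negative numeric amount); the source does not define them, and `CHECKS` and `format_errors` are hypothetical names.

```python
import re

# Hypothetical per-field checks; adjust to whatever your documents require.
CHECKS = {
    "country": lambda v: isinstance(v, str) and re.fullmatch(r"[A-Z]{2}", v) is not None,
    "amount": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool) and v >= 0,
}


def format_errors(document):
    """Return one reason per present field that fails its format/type check."""
    errors = []
    for field, check in CHECKS.items():
        # Absent fields are left to the required-fields rule, so skip them here.
        if field in document and not check(document[field]):
            errors.append(f"{field}: invalid format")
    return errors
```

The returned reasons could be appended to the missing-field list so that a dead-lettered document carries both kinds of failure.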
Key Expressions
The flow is intentionally light on Power Fx / WDL gymnastics; the heaviest expressions are the missing-fields filter and the setProperty chain for the enriched write-back. They are listed below in the order they appear in the flow.
EXPR.01 Unprocessed query
Builds the Cosmos SQL query for documents not yet enriched.
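The query text can be reproduced with a short Python sketch. This is not the flow's Compose expression; `build_unprocessed_query` is a hypothetical helper, and `_enriched` is used only as an example value for the configurable enriched-flag field.

```python
def build_unprocessed_query(enriched_field):
    """Build the Cosmos SQL text selecting documents that are not yet enriched."""
    # Reject characters that would break out of the quoted property name.
    if '"' in enriched_field or "\\" in enriched_field:
        raise ValueError("enriched field name must not contain quotes or backslashes")
    prop = f'c["{enriched_field}"]'  # bracket notation with a quoted property name
    return f"SELECT * FROM c WHERE (NOT IS_DEFINED({prop}) OR {prop} = false)"


print(build_unprocessed_query("_enriched"))
```

Both conditions matter: a document that never had the flag and a document whose flag was cleared for replay are each picked up on the next run.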
EXPR.02 Missing required fields
Filters the required-field list down to the ones missing/empty on the document.
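A Python sketch of the same filter, assuming the ruleset has the shape of the flowlibs_ValidationRules default. Treating null, empty strings, and empty arrays or objects as "empty" is an assumption of this example; `missing_required_fields` is a hypothetical name.

```python
import json


def missing_required_fields(document, rules_json):
    """Return the required fields that are absent or empty on the document."""
    required = json.loads(rules_json).get("required", [])
    missing = []
    for field in required:
        value = document.get(field)
        # Empty means null or a zero-length string/list/dict; 0 and False still count as present.
        is_empty = value is None or (isinstance(value, (str, list, dict)) and len(value) == 0)
        if is_empty:
            missing.append(field)
    return missing


rules = '{"required":["docType","country","amount"]}'
print(missing_required_fields({"docType": "invoice", "country": "", "amount": 0}, rules))
```

Note that a numeric amount of 0 is kept as a present value here, so only `country` is reported for the sample document.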
EXPR.03 Validity test
Document is valid when there are zero missing required fields.
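Expressed as a Python sketch, the test is a length check on the missing-field list. The names `is_valid` and `route` and the branch labels are hypothetical and only illustrate the branching.

```python
def is_valid(missing_fields):
    """A document is valid when no required field is missing."""
    return len(missing_fields) == 0


def route(missing_fields):
    """Pick the branch a document would take based on the validity test."""
    return "enrich" if is_valid(missing_fields) else "dead-letter"
```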
EXPR.04 Enriched write-back (preserves all original fields)
setProperty chain stamps status, enriched flag, enrichment payload, timestamp, and correlation id.
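The effect of the chain can be sketched in Python as a copy followed by five property writes. This mirrors the behavior described above rather than the flow's actual expression; the `status`, `_enriched`, and `enrichment` property names are assumptions (the first two are configurable in the flow), and `compose_enriched_document` is a hypothetical helper.

```python
from datetime import datetime, timezone


def compose_enriched_document(document, enrichment, correlation_id,
                              status_field="status", enriched_field="_enriched"):
    """Return a copy of the document with the enrichment stamps applied."""
    enriched = dict(document)  # shallow copy, so every original field is preserved
    enriched[status_field] = "validated"
    enriched[enriched_field] = True  # flag that keeps later runs from reprocessing
    enriched["enrichment"] = enrichment  # payload returned by the enrichment API
    enriched["enrichedAt"] = datetime.now(timezone.utc).isoformat()
    enriched["correlationId"] = correlation_id
    return enriched
```

Working on a copy leaves the input untouched, which matches the way setProperty returns a new object instead of modifying its argument.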