Org Public Repo Inventory
Takes a weekly inventory of all public org repos via the GitHub connector's "Lists all public repositories for an organization" action and writes it to a Dataverse table with stars, forks, language, and last push date; the table drives a Power BI dashboard of the org's open-source footprint.
Provided as-is, without warranty of any kind. Review and test each pattern in a non-production environment before deploying it to live automations. See our Terms.
Overview
FlowLibs - Org Public Repo Inventory is a weekly scheduled cloud flow that takes a full snapshot of every public repository under a GitHub organization and writes each repo as a row in a custom Dataverse table (flowlibs_githubrepoinventory). Each snapshot captures name, description, language, stars, forks, open issue count, last push date, default branch, and repository URL — so Power BI can chart the org's open-source footprint over time without ever calling GitHub directly.
The flow is a canonical example of the "GitHub → Dataverse → Power BI" reporting pipeline: a premium connector replaces HTTP calls, an environment-variable-driven org name makes the flow portable across tenants, and a Dataverse table replaces fragile SharePoint lists as the analytical store.
Use Case
Without this flow, someone manually exports repo data from GitHub every week and pastes it into Excel. The flow removes that toil and gives the data team a persistent, queryable, versioned store.
The flow serves three audiences:
- Executives — a dashboard that quantifies the organization's open-source footprint (star count trend, top languages, most active repos), refreshed every Monday morning.
- Marketing / DevRel — a weekly data point to measure community growth (stars, forks) without manual GitHub exports.
- Engineering leads — visibility into which public repos have stale pushed_at dates or unresolved open_issues, useful for grooming the public surface area.
Flow Architecture
Recurrence
Recurrence: Runs weekly on Monday at 06:00 Eastern; first run scheduled 2026-04-20T06:00:00.
Initialize variable: varOrgName
Initialize variable: Reads the `flowlibs_GitHubOrganization` environment variable into a string variable used as the org slug for all GitHub calls.
Initialize variable: varMaxPages
Initialize variable: Reads the `flowlibs_RepoInventoryMaxPages` environment variable; caps how many pages of GetOrgRepos the flow will request (100 repos per page).
Initialize variable: varInventoryTable
Initialize variable: Reads the `flowlibs_RepoInventoryTable` environment variable, the logical collection name of the destination Dataverse table.
Initialize variable: varSnapshotDate
Initialize variable: Captures `utcNow()` once per run so every row written in this run shares one snapshot timestamp.
Initialize variable: varRepoCount
Initialize variable: Integer counter, starts at 0; incremented inside the Apply to each loop to feed the run-history summary.
GitHub - List organization public repos
GetOrgRepos: Calls `GetOrgRepos` (Lists all public repositories for an organization) with `repositoryOwner: @variables('varOrgName')`, `type: public`, `per_page: 100`, `page: 1`. The pagination runtime setting is enabled with a threshold of `@variables('varMaxPages')` × 100.
Apply to each repo
Apply to each: For each item in `body('List_Organization_Public_Repos')`, add a new row to the `flowlibs_githubrepoinventories` Dataverse table with the 11 mapped fields, then increment `varRepoCount` by 1.
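The loop above can be sketched in Python to make the per-row mapping concrete. This is an illustration of the logic only, not the flow definition. The source keys follow GitHub's REST repository object (`stargazers_count`, `forks_count`, `open_issues_count`, `pushed_at`, `default_branch`, `html_url`), and it is an assumption that the connector returns the same names. Apart from `flowlibs_lastpush`, every Dataverse column name below is hypothetical, and treating the org name and snapshot date as the tenth and eleventh mapped fields is a guess.

```python
from datetime import datetime, timezone


def to_inventory_row(repo: dict, org: str, snapshot: str) -> dict:
    """Map one GitHub repo object to a Dataverse-style row payload.

    Column names are hypothetical except flowlibs_lastpush.
    """
    return {
        "flowlibs_name": repo["name"],
        # Empty-string fallback so a missing description never writes null.
        "flowlibs_description": repo.get("description") or "",
        "flowlibs_language": repo.get("language") or "",
        "flowlibs_stars": repo.get("stargazers_count", 0),
        "flowlibs_forks": repo.get("forks_count", 0),
        "flowlibs_openissues": repo.get("open_issues_count", 0),
        "flowlibs_lastpush": repo.get("pushed_at"),
        "flowlibs_defaultbranch": repo.get("default_branch"),
        "flowlibs_repourl": repo.get("html_url"),
        # Assumed extra fields (not confirmed by the flow documentation).
        "flowlibs_organization": org,
        "flowlibs_snapshotdate": snapshot,
    }


def build_snapshot(repos, org, snapshot=None):
    """Return (rows, repo_count) for one run.

    The snapshot timestamp is captured once, before the loop, so every
    row in the run shares it (the role varSnapshotDate plays in the flow).
    """
    snapshot = snapshot or datetime.now(timezone.utc).isoformat()
    rows = []
    repo_count = 0  # plays the role of varRepoCount
    for repo in repos:
        rows.append(to_inventory_row(repo, org, snapshot))
        repo_count += 1
    return rows, repo_count
```

Capturing the timestamp outside the loop is what lets Power BI group rows by snapshot: a per-row timestamp would drift by a few seconds across a large org and split one run into many apparent snapshots.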
Environment Variables
| Schema name | Type | Default | Description |
|---|---|---|---|
| flowlibs_GitHubOrganization | String | flowlibs-demo-org | The GitHub org slug to inventory. Change in the target environment to your real org name. |
| flowlibs_RepoInventoryMaxPages | Number | 10 | Ceiling on paged `GetOrgRepos` calls (100 repos per page → 1,000 repos max at the default of 10). |
| flowlibs_RepoInventoryTable | String | flowlibs_githubrepoinventories | Logical collection name of the destination Dataverse table. |
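The ceiling arithmetic behind `flowlibs_RepoInventoryMaxPages` is simple enough to check by hand, but a small Python sketch makes it easy to test whether a given org would be truncated. The function names are hypothetical; only the page size of 100 and the default of 10 pages come from the table above.

```python
import math

PER_PAGE = 100  # page size used by the flow's GetOrgRepos call


def repo_ceiling(max_pages, per_page: int = PER_PAGE) -> int:
    """Maximum number of repos one run can inventory."""
    # Cast first: the env var is a Number, the threshold must be an integer.
    return int(max_pages) * per_page


def pages_needed(repo_total: int, per_page: int = PER_PAGE) -> int:
    """Pages required to cover an org with repo_total public repos."""
    return math.ceil(repo_total / per_page)


def would_truncate(repo_total: int, max_pages) -> bool:
    """True if the org has more public repos than the ceiling allows."""
    return repo_total > repo_ceiling(max_pages)
```

For example, `repo_ceiling(10)` gives 1,000, so an org with 1,001 public repos needs `pages_needed(1001) == 11` pages and would lose one repo at the default setting.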
Connectors & Connections
| Connector | API name | Actions used |
|---|---|---|
| GitHub | shared_github | GetOrgRepos (Lists all public repositories for an organization) |
| Microsoft Dataverse | shared_commondataserviceforapps | AddRecord (Add a new row) |
Note — All connections are referenced as solution connection references; the flow is portable between environments as long as a connection is mapped at import time.
Customization Guide
Almost every realistic variant of this flow can be implemented by changing environment variable values. A few cases require small edits inside the flow definition — those are called out explicitly below.
- Change the GitHub org
  - Solution → Environment variables → flowlibs_GitHubOrganization → Current value → paste your GitHub org slug, then re-enable the flow.
- Increase the repo ceiling
  - Raise flowlibs_RepoInventoryMaxPages (default 10 = 1,000 repos). Above ~50 pages, consider paging on a cursor instead of bumping this.
- Add columns (e.g. archived, license.spdx_id)
  - Add the column in Dataverse (e.g. flowlibs_archived, Two Options, default No). In the designer, open the Add_Repo_Row_to_Dataverse action → Show all advanced parameters → bind archived → @{items('For_Each_Repo')?['archived']}.
- Inventory private repos instead of public
  - Change the GetOrgRepos type parameter from public to private (or all) and grant the GitHub connection account private-repo scope.
- Switch to daily cadence
  - Edit the recurrence trigger to Daily. At that cadence, add a Dataverse 'delete old snapshots' cleanup step (or trust Power BI's incremental refresh); otherwise the table will grow linearly.
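To see why a daily cadence calls for cleanup, the sketch below estimates row growth and picks which snapshots a retention step would delete. It is a rough Python illustration under stated assumptions: the 90-day retention window and both function names are hypothetical, and an actual cleanup step would run against Dataverse rather than an in-memory list.

```python
from datetime import datetime, timedelta, timezone


def rows_per_year(repo_count: int, runs_per_year: int) -> int:
    """Each run writes one row per repo, so growth is linear in both."""
    return repo_count * runs_per_year


def snapshots_to_delete(snapshot_dates, now, keep_days: int = 90):
    """Return the distinct snapshot timestamps older than the window.

    keep_days=90 is an assumed retention period, not a flow setting.
    A snapshot exactly keep_days old is kept.
    """
    cutoff = now - timedelta(days=keep_days)
    return sorted(d for d in set(snapshot_dates) if d < cutoff)


# A 200-repo org: 52 weekly runs vs. 365 daily runs.
weekly_rows = rows_per_year(200, 52)    # 10,400 rows per year
daily_rows = rows_per_year(200, 365)    # 73,000 rows per year
```

Deleting by snapshot timestamp rather than row by row keeps each remaining snapshot complete, which matters for trend charts that compare whole runs.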
Key Expressions
The flow is intentionally light on Workflow Definition Language (WDL) gymnastics: the heaviest expressions are the max-pages integer cast and the null-guarded description. All seven are listed below.
EXPR.01 Snapshot timestamp
Captured once into `varSnapshotDate` so every row in a single run shares the same snapshot date.
EXPR.02 Repo pushed_at → Dataverse UTC
Maps GitHub's `pushed_at` ISO timestamp directly into the `flowlibs_lastpush` DateTime column.
EXPR.03 Org env var
Reads the org slug environment variable; used to initialize `varOrgName`.
EXPR.04 Max-pages cast to int
Casts the numeric env var to an integer before using it as the pagination threshold multiplier.
EXPR.05 Safe description (null-guarded)
Null-safe fallback so repos with no GitHub description don't fail the Dataverse Memo column write.
EXPR.06 Org env var
Reads the org slug environment variable; used to initialize `varOrgName`.
EXPR.07 Summary output
Builds the human-readable summary string rendered by the Compose action into the run history.
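The `pushed_at` mapping works without conversion because GitHub's REST API emits ISO 8601 timestamps in UTC with a trailing `Z`. Outside the flow, for example when checking the stale-repo use case against exported rows, the same value can be parsed as in the Python sketch below. The function names are hypothetical, and it is an assumption that the connector passes the timestamp through unchanged.

```python
from datetime import datetime, timezone


def parse_pushed_at(value: str) -> datetime:
    """Parse a GitHub-style ISO 8601 timestamp into an aware UTC datetime.

    datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    so it is rewritten as an explicit +00:00 offset first. Input without
    any offset would be interpreted as local time by astimezone().
    """
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc)


def days_since_push(pushed_at: str, snapshot: datetime) -> int:
    """Whole days between the last push and the snapshot timestamp."""
    return (snapshot - parse_pushed_at(pushed_at)).days
```

A dashboard measure built on `days_since_push` can then flag repos past whatever staleness threshold the team chooses.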