📊 Agentic Workflow Lock File Statistics - November 2025 #4908

2025-11-27T03:32:27Z

github-actions[bot]
bot Nov 27, 2025

This report provides a comprehensive statistical analysis of all 95 agentic workflow lock files (.lock.yml) in the repository, revealing usage patterns, popular configurations, structural characteristics, and interesting insights about how gh-aw workflows are designed and deployed.

The analysis examined workflow triggers, safe outputs, permissions, file sizes, job complexity, and tool configurations to understand the landscape of agentic workflows in this repository.

Full Statistical Report

Executive Summary

Total Lock Files: 95
Total Size: 25.43 MB (26,660,619 bytes)
Average File Size: 274.06 KB (280,638 bytes)
Analysis Date: 2025-11-27
Repository: githubnext/gh-aw
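
The MB and KB figures above are consistent with binary units (1 KB = 1,024 bytes). A quick check, using only the byte count and file count from the summary; the variable names are illustrative:

```python
# Reproduce the summary figures from the raw byte count.
# Assumes binary units (1 KB = 1024 bytes), which matches the reported values.
TOTAL_BYTES = 26_660_619
LOCK_FILES = 95

total_mb = TOTAL_BYTES / 1024**2
avg_bytes = TOTAL_BYTES / LOCK_FILES
avg_kb = avg_bytes / 1024

print(f"Total: {total_mb:.2f} MB")
print(f"Average: {avg_kb:.2f} KB ({avg_bytes:,.0f} bytes)")
```

With decimal units (1 MB = 1,000,000 bytes) the total would instead come out near 26.66 MB, so the report appears to use the binary convention throughout.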

File Size Distribution

Size Range	Count	Percentage
< 100 KB	5	5.3%
100-200 KB	5	5.3%
200-300 KB	47	49.5%
300-400 KB	37	38.9%
400+ KB	1	1.1%

Key Statistics:

Smallest: arxiv.lock.yml (80.23 KB)
Largest: poem-bot.lock.yml (460.65 KB)
Median Size: 280.92 KB
Standard Deviation: 70.51 KB

Analysis: The majority of lock files (88.4%) fall within the 200-400 KB range, indicating relatively consistent workflow complexity. Only one outlier (poem-bot) exceeds 400 KB, suggesting it has significantly more features or safe output configurations.
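
The bucketing behind the size table can be sketched as follows. This is a minimal illustration, not the script the analysis actually used; in particular, placing a file of exactly 100 KB in the "100-200 KB" row is an assumption, since the report does not state how boundaries are handled.

```python
from bisect import bisect_right

# Upper edges (in KB) and labels taken from the size table above.
EDGES_KB = [100, 200, 300, 400]
LABELS = ["< 100 KB", "100-200 KB", "200-300 KB", "300-400 KB", "400+ KB"]

def size_bucket(size_kb: float) -> str:
    """Return the table row a file of the given size falls into.

    Assumption: a size equal to an edge goes into the higher bucket.
    """
    return LABELS[bisect_right(EDGES_KB, size_kb)]

# Counts per bucket as reported in the table.
counts = dict(zip(LABELS, [5, 5, 47, 37, 1]))
total = sum(counts.values())
mid_range_share = 100 * (counts["200-300 KB"] + counts["300-400 KB"]) / total

print(size_bucket(80.23), "|", size_bucket(280.92), "|", size_bucket(460.65))
print(f"{mid_range_share:.1f}% of {total} files are in the 200-400 KB range")
```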

Trigger Analysis

Most Popular Triggers

Trigger Type	Count	Percentage	Description
workflow_dispatch	72	75.8%	Manual trigger capability
schedule	54	56.8%	Cron-based scheduled runs
command	14	14.7%	Slash command triggers
reaction	10	10.5%	Emoji reaction triggers
pull_request	7	7.4%	PR event triggers
push	4	4.2%	Push event triggers
issues	2	2.1%	Issue event triggers
workflow_run	2	2.1%	Dependent workflow triggers
workflow_call	1	1.1%	Reusable workflow call

Insights:

Manual Control: 75.8% of workflows support manual triggering via workflow_dispatch, enabling on-demand execution
Automation First: 56.8% use scheduled triggers, showing strong preference for automated, periodic execution
Interactive Workflows: 14.7% support command-based triggers (e.g., /review, /tidy), enabling conversational interaction
Creative Triggers: 10.5% use reaction-based triggers, allowing users to trigger workflows with emojis

Common Trigger Combinations

Combination	Count	Use Case
schedule + workflow_dispatch	54	Automated daily/weekly runs with manual override
pull_request + workflow_dispatch	7	PR automation with manual testing
reaction + workflow_dispatch	6	Interactive with manual fallback
pull_request + reaction	5	PR interaction via emojis
pull_request + schedule	5	PR checks plus periodic audits

Pattern: The most common pattern is schedule + workflow_dispatch (54 workflows), enabling both automated periodic execution and manual on-demand runs - a best practice for operational flexibility.
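
One plausible way to tally combinations like these is to count every unordered pair of triggers within each workflow. The sketch below assumes that reading; the workflow names and trigger sets are hypothetical and the report's own counting method is not documented.

```python
from collections import Counter
from itertools import combinations

def count_trigger_pairs(workflows: dict[str, set[str]]) -> Counter:
    """Count how many workflows contain each unordered pair of triggers."""
    pairs = Counter()
    for triggers in workflows.values():
        # Sorting makes ("a", "b") and ("b", "a") the same key.
        for pair in combinations(sorted(triggers), 2):
            pairs[pair] += 1
    return pairs

# Hypothetical workflows, for illustration only.
example = {
    "workflow-a": {"schedule", "workflow_dispatch"},
    "workflow-b": {"schedule", "workflow_dispatch"},
    "workflow-c": {"pull_request", "reaction", "workflow_dispatch"},
    "workflow-d": set(),
}

pair_counts = count_trigger_pairs(example)
for pair, n in pair_counts.most_common():
    print(" + ".join(pair), n)
```

Under this scheme a workflow with three triggers contributes to three pairs, which would explain why the rows in the combinations table can overlap.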

Trigger Distribution by Workflow Count

Number of Triggers	Workflows	Percentage
0 triggers	9	9.5%
1 trigger	20	21.1%
2 triggers	58	61.1%
3 triggers	2	2.1%
4 triggers	6	6.3%

Finding: 61.1% of workflows use exactly 2 triggers, typically combining automation with manual control. Only 9 workflows have no triggers (likely reusable workflows or shared configurations).

Safe Outputs Analysis

Safe Output Types Distribution

Safe Output Type	Count	Percentage	Use Case
create-discussion	33	34.7%	Post reports and findings
add-comment	17	17.9%	Add comments to issues/PRs
create-issue	14	14.7%	Create tracking issues
upload-assets	14	14.7%	Attach artifacts (charts, reports)
create-pull-request	14	14.7%	Automated code changes
push-to-pull-request-branch	6	6.3%	Update PR branches directly
assign-to-agent	3	3.2%	Route to specialized agents
create-pull-request-review-comment	3	3.2%	Line-specific PR feedback
add-labels	3	3.2%	Label management
close-discussion	3	3.2%	Cleanup old discussions
close-issue	2	2.1%	Close resolved issues
threat-detection	2	2.1%	Security monitoring
missing-tool	2	2.1%	Report missing capabilities
update-issue	1	1.1%	Update issue metadata
update-release	1	1.1%	Modify release notes
create-code-scanning-alert	1	1.1%	Security alerts
update-project	1	1.1%	Project board updates
staged	1	1.1%	Staged safe outputs

Key Insights:

Discussion-First: create-discussion is the most popular (34.7%), indicating a preference for threaded conversations over issues
Multi-Modal Output: create-issue, create-pull-request, and upload-assets are each used by 14.7% of workflows (14 each), showing diverse output capabilities
Automation Depth: 6.3% directly push to PR branches, indicating high-trust automation scenarios
Security Focus: A few workflows use threat-detection (2) and create-code-scanning-alert (1) outputs for security monitoring

Safe Output Combinations

Number of Outputs	Workflows	Percentage	Example Use Case
0 outputs	19	20.0%	Read-only workflows or shared configs
1 output	45	47.4%	Single-purpose workflows
2 outputs	24	25.3%	Multi-output workflows (e.g., discussion + assets)
3 outputs	6	6.3%	Complex workflows with multiple targets
10 outputs	1	1.1%	poem-bot (highly versatile)

Notable: The poem-bot workflow stands out with 10 different safe output types, making it the most versatile workflow in the repository.
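
The two safe output tables can be cross-checked against each other. The sketch below expands the per-workflow distribution into one value per workflow and compares the implied total with the sum of the per-type counts; all numbers come from the tables above, and only the variable names are invented.

```python
import statistics

# Workflows per number of safe output types, from the table above.
distribution = {0: 19, 1: 45, 2: 24, 3: 6, 10: 1}

# Workflows per safe output type, from the "Safe Output Types Distribution" table.
per_type_counts = [33, 17, 14, 14, 14, 6, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1]

# Expand the distribution into one value per workflow.
outputs_per_workflow = [
    n for n, workflows in distribution.items() for _ in range(workflows)
]

total_from_distribution = sum(outputs_per_workflow)
total_from_types = sum(per_type_counts)
mean_outputs = statistics.mean(outputs_per_workflow)
median_outputs = statistics.median(outputs_per_workflow)

print(len(outputs_per_workflow), total_from_distribution, total_from_types)
print(round(mean_outputs, 2), median_outputs)
```

Both tables imply the same total number of output-type assignments, and the mean of roughly 1.3 with a median of 1 matches the "1-2 safe output types" figure given for a typical lock file.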

Discussion Categories

For workflows using create-discussion, the most popular categories are:

Category	Count	Use Case
audits	13	Audit reports and analysis
General	5	General discussions
Audits	3	Alternative casing
artifacts	2	Artifact summaries
dev	2	Development discussions
Other	10	Various categories

Standardization Opportunity: There are two variants of "audits" (lowercase and capitalized), indicating a potential for category name standardization.
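
A minimal sketch of what such standardization might look like when tallying categories, assuming lowercase is chosen as the canonical form (the report does not say which casing should win):

```python
from collections import Counter

def merge_categories(counts: dict[str, int]) -> Counter:
    """Merge category counts that differ only by letter case."""
    merged = Counter()
    for name, count in counts.items():
        # casefold() is a stricter lower() intended for caseless matching.
        merged[name.casefold()] += count
    return merged

# Named categories from the table above ("Other" is omitted as a catch-all).
reported = {"audits": 13, "General": 5, "Audits": 3, "artifacts": 2, "dev": 2}

merged = merge_categories(reported)
print(merged.most_common())
```

Merged this way, the two "audits" variants account for 16 workflows in total.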

Structural Characteristics

Job Complexity

Average Jobs per Workflow: 12.1
Median Jobs: 12
Min Jobs: 5
Max Jobs: 21 (in cloclo.lock.yml)

Distribution: Most workflows have 10-15 jobs, with cloclo being an outlier at 21 jobs, suggesting it handles multiple complex tasks or has extensive orchestration requirements.

Step Complexity

Average Steps per Workflow: 60.9
Median Steps: 62
Min Steps: 27
Max Steps: 99 (in poem-bot.lock.yml)

Insights:

The typical workflow has ~61 steps distributed across ~12 jobs (about 5 steps per job average)
poem-bot with 99 steps is the most step-intensive workflow, correlating with its 10 safe output types
Minimal workflows (tests, shared configs) have as few as 27 steps

Average Lock File Structure

Based on statistical medians, a typical .lock.yml file has:

Size: ~281 KB
Jobs: 12 jobs
Steps per Workflow: 62 steps (~5 per job)
Triggers: 2 triggers (usually schedule + workflow_dispatch)
Safe Outputs: 1-2 safe output types
Permissions: 3-4 permissions (contents, issues, pull-requests)

Permission Patterns

Most Common Permissions

Permission	Count	Percentage of Workflows	Typical Access Level
contents	83	87.4%	read or write
pull-requests	78	82.1%	read or write
issues	77	81.1%	read or write
actions	37	38.9%	read
discussions	12	12.6%	read or write
security-events	6	6.3%	write
repository-projects	3	3.2%	write

Key Findings:

Core Trio: Most workflows (>80%) request contents, issues, and pull-requests permissions - the essential trio for repository interaction
Action Awareness: 38.9% request actions permission, likely for workflow metadata and run information
Security First: Only 6 workflows request security-events permission, indicating careful scoping to security-specific use cases
Average Permission Count: 3.1 permissions per workflow (total: 296 grants across 95 workflows)

Security Posture: Workflows follow the principle of least privilege, with most requesting only the 3-4 core permissions needed for their specific tasks.
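
The percentages and the average permission count can be reproduced directly from the counts in the table. A short check, with illustrative variable names:

```python
# Workflows requesting each permission, from the table above.
permission_counts = {
    "contents": 83,
    "pull-requests": 78,
    "issues": 77,
    "actions": 37,
    "discussions": 12,
    "security-events": 6,
    "repository-projects": 3,
}
WORKFLOWS = 95

total_grants = sum(permission_counts.values())
average_per_workflow = total_grants / WORKFLOWS
shares = {
    name: round(100 * n / WORKFLOWS, 1) for name, n in permission_counts.items()
}

print(total_grants, round(average_per_workflow, 1))
print(shares)
```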

Tool & MCP Server Patterns

Most Used MCP Servers

MCP Server	Usage Count	Percentage	Primary Use Case
github	3,491	99.8%	GitHub API operations
playwright	210	6.0%	Browser automation & web scraping
arxiv	6	0.2%	Research paper access
deepwiki	6	0.2%	Repository documentation and knowledge base queries
context7	4	0.1%	Context retrieval

Insights:

GitHub Dominance: The github MCP server appears 3,491 times across workflows, making it by far the most essential integration
Web Automation: 210 uses of playwright indicate significant browser automation for testing docs, scraping, or UI verification
Research Workflows: Specialized servers like arxiv and deepwiki enable knowledge retrieval workflows
Emerging Patterns: Low usage of context7 suggests it's either new or experimental

Tool Configuration Patterns

Based on workflow analysis:

Standard Toolset: Most workflows use the default GitHub toolset (issues, PRs, discussions, commits)
Extended Capabilities: ~20% of workflows enable extended toolsets for specialized operations
Browser Automation: Workflows using Playwright typically involve documentation testing or multi-device compatibility checks
Firewall Policies: Most workflows operate within the default network firewall, with explicit exceptions for specific use cases

Interesting Findings

1. The "Poem Bot" Powerhouse

poem-bot.lock.yml is the most complex workflow by multiple metrics:

Largest file size: 460.65 KB
Most steps: 99 steps
Most safe output types: 10 different output methods
Among the highest job counts: 19 jobs

This suggests it's a highly versatile, multi-purpose workflow capable of interacting with the repository in numerous ways.

2. Schedule + Manual: The Winning Combination

54 out of 95 workflows (56.8%) use the schedule + workflow_dispatch trigger combination, establishing this as the de facto standard pattern for agentic workflows. This enables:

Automated periodic execution for regular tasks
Manual override for testing and one-off runs
Operational flexibility without separate workflow definitions

3. Discussion-Driven Reporting

With 33 workflows creating discussions (vs. 14 creating issues), there's a clear preference for discussion-based reporting. Discussions provide:

Threaded conversations for follow-up
Better organization for ongoing monitoring
Less noise in the issue tracker
Categorization for different report types

4. Minimal Variance in Structure

The modest standard deviation in file sizes (70.51 KB, roughly 26% of the 274.06 KB mean) and consistent job counts (avg: 12.1, median: 12) indicate that workflows follow similar architectural patterns, likely due to the gh-aw framework's opinionated structure. This consistency benefits:

Maintainability
Predictable resource usage
Easier debugging

5. Security-Conscious Permission Model

Only 7 permission types are used across all workflows, with most workflows requesting just 3-4. This minimalism indicates:

Strong adherence to least-privilege principles
Clear understanding of required permissions
Reduced attack surface

6. Shared Configuration Pattern

9 workflows have 0 triggers and are located in the shared/ and tests/ directories. These appear to be reusable workflow components, promoting:

Code reuse
Consistent behavior across workflows
Easier testing

7. Command-Driven Interactivity

14 workflows support command triggers, enabling conversational interaction:

/review - Code review workflows
/tidy - Cleanup operations
/plan - Planning workflows
Custom commands for domain-specific tasks

This pattern transforms workflows from passive automation to interactive assistants.
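
To illustrate the idea, here is a minimal sketch of how a slash command might be recognized at the start of a comment. It is not gh-aw's actual implementation; the function name, the regular expression, and the rule that the command must open the comment are all assumptions made for this example.

```python
import re
from typing import Optional

# Matches a lowercase slash command at the very start of a comment, e.g. "/review".
COMMAND_RE = re.compile(r"^/([a-z][a-z0-9-]*)(?:\s|$)")

def parse_command(comment_body: str) -> Optional[str]:
    """Return the command name if the comment starts with a slash command."""
    match = COMMAND_RE.match(comment_body.strip())
    return match.group(1) if match else None

for body in ["/review please check the tests", "/tidy", "see /plan for details"]:
    print(repr(body), "->", parse_command(body))
```

Requiring the command at the start of the comment avoids triggering on paths or on commands that are merely mentioned in passing.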

Historical Trends

This is the first comprehensive analysis - future runs will compare against this baseline.

Baseline Metrics for Future Comparison:

Lock file count: 95
Average file size: 274.06 KB
Total lock file size: 25.43 MB
Average jobs per workflow: 12.1
Average steps per workflow: 60.9

Recommendations

1. Standardize Discussion Categories

Consolidate category naming (e.g., "audits" vs "Audits") to improve organization and discoverability of reports.

2. Document the "Standard Workflow Pattern"

The schedule + workflow_dispatch trigger combination, with 12 jobs and ~60 steps, represents the canonical workflow structure. Documenting this pattern would help new contributors understand best practices.

3. Consider Workflow Size Monitoring

With files ranging from 80 KB to 460 KB, consider implementing size warnings when workflows exceed 400 KB, as this may indicate over-complexity or opportunity for refactoring.
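
A size warning of this kind could be sketched as below. The function name and the choice of 1 KB = 1,024 bytes are assumptions; the 400 KB threshold comes from the recommendation above.

```python
from pathlib import Path

LIMIT_BYTES = 400 * 1024  # 400 KB, using 1 KB = 1024 bytes

def oversized_lock_files(root, limit_bytes=LIMIT_BYTES):
    """Return (relative path, size in bytes) for lock files above the limit."""
    root = Path(root)
    if not root.is_dir():
        return []
    hits = []
    for path in sorted(root.rglob("*.lock.yml")):
        size = path.stat().st_size
        if size > limit_bytes:
            hits.append((path.relative_to(root).as_posix(), size))
    return hits

if __name__ == "__main__":
    # Scans the directory named under Data Sources in the Methodology section.
    for name, size in oversized_lock_files(".github/workflows"):
        print(f"WARNING: {name} is {size / 1024:.2f} KB (limit 400 KB)")
```

A check like this could run in CI and emit warnings rather than fail the build, since a large lock file is a hint to review rather than an error.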

4. Expand MCP Server Usage

Only 5 MCP servers are actively used. Evaluating additional integrations (e.g., Slack, Notion, Jira) could enhance workflow capabilities.

5. Leverage Shared Configurations

With strong structural consistency across workflows, extracting more common patterns into shared configurations could reduce duplication and maintenance burden.

6. Safe Output Combinations

20% of workflows have no safe outputs. Evaluate whether these are purely observational or if they could benefit from notification/reporting capabilities.

Methodology

Analysis Tool: Python scripts with YAML parsing and statistical analysis
Lock Files Analyzed: 95 (100% coverage)
Cache Memory: Used /tmp/gh-aw/cache-memory/ for script persistence and historical data tracking
Data Sources: .github/workflows/**/*.lock.yml files
Parsing Method: Frontmatter extraction with regex pattern matching
Statistical Tools: Python statistics module for mean, median, standard deviation
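
As a rough illustration of the statistical step, the sketch below computes the same kinds of summary metrics with the statistics module. The input values are hypothetical, and the report does not say whether it used the sample or population standard deviation, so the choice of stdev here is an assumption.

```python
import statistics

def summarize(values: list[float]) -> dict[str, float]:
    """Compute summary metrics of the kind reported above."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation; statistics.pstdev gives the population form.
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }

# Hypothetical file sizes in KB, not the real data set.
sizes_kb = [80.0, 250.0, 280.0, 310.0, 460.0]
summary = summarize(sizes_kb)
print(summary)
```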

Analysis Scripts Available in Cache

The following scripts are available for future runs:

/tmp/gh-aw/cache-memory/scripts/analyze_lockfiles.sh - Bash-based lock file data extraction
/tmp/gh-aw/cache-memory/scripts/analyze.py - Python YAML parser and data collector
/tmp/gh-aw/cache-memory/scripts/compute_stats.py - Statistical analysis and metric computation

Historical Data

Baseline data for this analysis is stored at:

/tmp/gh-aw/cache-memory/history/lockfile-stats-2025-11-27.json

Future analyses can compare against this baseline to identify trends in workflow growth, complexity changes, and pattern evolution.
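
A comparison against the baseline might look like the following sketch. The JSON key names and the "current" values are hypothetical, since the schema of the stored history file is not shown here; only the baseline numbers are taken from this report.

```python
import json

def compare_to_baseline(baseline: dict, current: dict) -> dict:
    """Return the change per metric for metrics present in both snapshots."""
    changes = {}
    for key, old in baseline.items():
        if key in current:
            changes[key] = round(current[key] - old, 2)
    return changes

# Baseline values from this report; the key names are hypothetical.
baseline_json = """
{
  "lock_file_count": 95,
  "avg_file_size_kb": 274.06,
  "total_size_mb": 25.43,
  "avg_jobs": 12.1,
  "avg_steps": 60.9
}
"""
baseline = json.loads(baseline_json)

# Made-up values standing in for a future run.
current = {"lock_file_count": 97, "avg_file_size_kb": 276.5, "avg_jobs": 12.1}

changes = compare_to_baseline(baseline, current)
print(changes)
```

Metrics missing from the newer snapshot are skipped rather than treated as zero, so a partial run does not show up as a spurious drop.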

Generated by Lockfile Statistics Analysis Agent on 2025-11-27

2025-12-04T00:22:40Z

github-actions[bot]
bot Dec 4, 2025
Author

This discussion was automatically closed because it was created by an agentic workflow more than 3 days ago.
