🧠 AI-Powered GitHub Contribution Summarizer & Skill Tagger

Phase: Proof of Concept (PoC)
Scope: Single GitHub repository, closed issues only
Review Model: Human-in-the-loop (HITL) review for all AI outputs

📘 Table of Contents

  • Overview
  • What is the STAR Method?
  • Feature 1: Contribution Summarization
  • Feature 2: Skill Label Suggestion
  • Feature 3: Skill Coverage Dashboard
  • Human-in-the-Loop (HITL) Review
  • Implementation Phase: Proof of Concept (PoC)
  • Next Phase (Post-PoC)

🌟 Overview

This tool uses AI and the GitHub API to:

  1. Summarize contributor impact for each closed issue and its associated PRs in resume-style bullets using the STAR method.
  2. Suggest skill labels from a predefined CSV taxonomy, posting suggestions for human review before they are applied.
  3. Populate a skill dashboard for each GitHub user to show earned vs. unearned skills based on tagged issues they’ve contributed to.

⭐ What is the STAR Method?

The STAR method is a structured format used to describe accomplishments clearly and concisely. It’s especially effective in resumes, performance reviews, and AI-generated contribution summaries.

STAR stands for:

  • S – Situation: What was the context or background?
  • T – Task: What responsibility or challenge did the person face?
  • A – Action: What steps did they take to address it?
  • R – Result: What was the outcome or impact?

Example: “During a repository consolidation (Situation), tasked with streamlining the team’s workflow (Task), coordinated the migration of 50+ issues to GitHub Projects (Action), improving sprint planning efficiency by 30% (Result).”
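
For instance, the summarization prompt could name the four components explicitly so the model returns bullets in this shape. A minimal sketch; the prompt wording and the `build_star_prompt` helper are illustrative, not the tool’s actual prompt:

```python
# Hypothetical prompt builder for STAR-style resume bullets; the exact
# wording used by the tool is an assumption for illustration.
def build_star_prompt(contributor: str, activity_log: str) -> str:
    return (
        f"Summarize {contributor}'s contribution below as one resume-style "
        "bullet using the STAR method. State the Situation, Task, Action, "
        "and Result explicitly, and keep the bullet under 40 words.\n\n"
        f"Activity:\n{activity_log}"
    )
```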


🧩 Feature 1: Contribution Summarization

Goal: Provide a concise, resume-style summary of each person’s contributions to an issue, including activity on any associated PRs.

Inputs:

  • GitHub Issue + Timeline (authors, comments, assignment history)
  • Associated PRs (commits, comments, reviews)

Contributors Considered:

  • Issue/PR authors
  • Current and historical assignees
  • Supervisors (e.g., people requesting updates)
  • Coaches (e.g., users giving feedback or guidance)
  • Anyone who commented or interacted meaningfully

Rules:

  • Contributions are traced across the issue and all related PRs.
  • Bot comments are ignored using a predefined list of known bot handles.
  • Each user receives one or more STAR-style resume bullets describing their contribution.
  • The final summary is posted directly to the issue as a comment (see the sketch below).
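
A minimal sketch of this flow against the GitHub REST API, using `requests`: fetch an issue’s comments, drop known bot handles, and post the finished summary back as a comment. The `GITHUB_TOKEN` environment variable and the contents of `BOT_HANDLES` are assumptions for illustration:

```python
import os
import requests

API = "https://api.github.com"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
# Predefined list of known bot handles (illustrative; the real list may differ).
BOT_HANDLES = {"dependabot[bot]", "github-actions[bot]"}

def fetch_issue_comments(owner: str, repo: str, number: int) -> list[dict]:
    """Return the issue's comments with known bot accounts filtered out."""
    url = f"{API}/repos/{owner}/{repo}/issues/{number}/comments"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    return [c for c in resp.json() if c["user"]["login"] not in BOT_HANDLES]

def post_summary(owner: str, repo: str, number: int, summary: str) -> None:
    """Post the finished STAR summary back to the issue as a comment."""
    url = f"{API}/repos/{owner}/{repo}/issues/{number}/comments"
    resp = requests.post(url, headers=HEADERS, json={"body": summary})
    resp.raise_for_status()
```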

🧩 Feature 2: Skill Label Suggestion

Goal: Recommend relevant skill labels for the issue, based on its content and the work done, using a pre-curated skill taxonomy CSV.

Skill Taxonomy:

  • CSV format with skill names grouped by category (illustrative example below)
  • No explicit proficiency levels are defined in the taxonomy (yet)
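
For illustration only, the taxonomy file might look like this; the categories and skills shown are placeholders, not the actual taxonomy:

```csv
category,skill
Backend,API design
Backend,Database migrations
Frontend,Accessibility
Process,Code review
Process,Sprint planning
```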

Labeling Process:

  1. The tool analyzes the issue and associated PRs.
  2. Suggested skill labels are posted as a comment on the issue, optionally with a rationale.
  3. A human reviewer verifies and refines the suggested labels.
  4. Final labels are applied using the GitHub API (see the sketch below).
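
A minimal sketch of steps 2 and 4, assuming the `requests` setup (`API`, `HEADERS`, `post_summary`) from the Feature 1 sketch above; `load_taxonomy` and the comment format are illustrative, and `apply_labels` is intended to run only after a reviewer confirms:

```python
import csv
# requests, API, HEADERS, and post_summary are defined in the Feature 1 sketch.

def load_taxonomy(path: str) -> dict[str, list[str]]:
    """Read the skill taxonomy CSV (category,skill) into a dict by category."""
    taxonomy: dict[str, list[str]] = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            taxonomy.setdefault(row["category"], []).append(row["skill"])
    return taxonomy

def post_suggestions(owner: str, repo: str, number: int,
                     skills: list[str], rationale: str = "") -> None:
    """Step 2: post suggested labels as a comment for human review."""
    body = "Suggested skill labels (pending review):\n" + "\n".join(
        f"- {s}" for s in skills)
    if rationale:
        body += f"\n\nRationale: {rationale}"
    post_summary(owner, repo, number, body)

def apply_labels(owner: str, repo: str, number: int, labels: list[str]) -> None:
    """Step 4: apply reviewer-confirmed labels via the GitHub API."""
    url = f"{API}/repos/{owner}/{repo}/issues/{number}/labels"
    resp = requests.post(url, headers=HEADERS, json={"labels": labels})
    resp.raise_for_status()
```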

🧩 Feature 3: Skill Coverage Dashboard

Goal: Generate a visual dashboard (e.g., in Looker or a similar BI tool) for each GitHub user that shows:

  • Earned skills: Skill labels associated with issues the user contributed to
  • Unearned skills: Skill labels that have not yet appeared on any of their issues
  • 📌 Linked issues: All issues they’ve contributed to, grouped by skill

Data Infrastructure:

To support this, the tool must export structured data to a spreadsheet, database, or flat file (e.g., Google Sheets, BigQuery, or CSV). This structured export will contain:

  • Contributor GitHub handle
  • Issue/PR IDs
  • Skill labels per issue
  • Type and extent of contribution (e.g., authored, reviewed, coached)
  • Inferred proficiency level (optional/experimental)

This dataset will serve as the source of truth for the dashboard and allow filtering by contributor, skill category, repo area, or date.
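
As a sketch of what one exported row could look like, matching the fields above; the names and types are assumptions, since the schema is not yet fixed:

```python
from dataclasses import dataclass

@dataclass
class ContributionRow:
    contributor: str            # GitHub handle
    issue_id: int               # issue or PR ID
    skills: list[str]           # skill labels on the issue
    contribution_type: str      # e.g., "authored", "reviewed", "coached"
    proficiency: str | None = None  # optional/experimental inferred level
```

Rows in this shape serialize cleanly with `dataclasses.asdict()` for writing to CSV or loading into Google Sheets or BigQuery.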


🔎 Human-in-the-Loop (HITL) Review

  • During the proof of concept, all AI outputs (resume bullets and skill labels) will be reviewed by a human for accuracy.
  • In future phases (e.g., MVP):
    • AI will suggest skill labels immediately after an issue is created and deemed ready to prioritize.
    • A human reviewer will confirm or adjust these labels before they are finalized.
    • Open questions remain about when and how the final label application is triggered post-review.

🛠️ Implementation Phase: Proof of Concept (PoC)

Scope:

  • Closed issues only
  • Predefined list of GitHub user handles
  • One repository
  • All summaries and labels are applied retroactively

Goals:

  • Understand AI token usage and cost (see the estimate sketch after this list)
  • Validate prompt quality and labeling logic
  • Identify pain points in automated summarization and labeling
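
For the token-usage goal, a rough per-prompt estimate can be logged before each model call. A sketch using `tiktoken` for counting; the encoding name and the price constant are assumptions to be replaced with real values:

```python
import tiktoken

# Hypothetical per-token price; replace with the current rate for the chosen model.
PRICE_PER_1K_INPUT_TOKENS = 0.005  # USD

def estimate_prompt_cost(prompt: str) -> tuple[int, float]:
    """Count prompt tokens and estimate input cost at the assumed rate."""
    enc = tiktoken.get_encoding("cl100k_base")  # encoding choice is an assumption
    n_tokens = len(enc.encode(prompt))
    return n_tokens, n_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
```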

🔁 Next Phase (Post-PoC)

  • Transition to live issues and open issue monitoring
  • Skill labels applied during triage (post-creation, pre-prioritization)
  • More automation of post-review label application
  • Potential integration with project workflows (e.g., issue templates, status labels)