feat: improve repository recommendations using GitHub topic matching by tejeshvenkat · Pull Request #12 · OWASP-BLT/BLT-OSSH

tejeshvenkat · 2026-03-05T19:40:51Z

This PR improves the repository recommendation system by incorporating GitHub repository topics into the ranking algorithm.

Key improvements:

• Fetch repository topics using the GitHub API preview header
• Extract contributor interests from repository topics
• Introduce topicScore to evaluate topic relevance
• Update ranking formula to include topicScore

New ranking formula:
(stars * 0.5) + (activityScore * 0.2) + (languageScore * 0.2) + (topicScore * 0.1)
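
As a minimal sketch of how the formula above could be applied, assuming each candidate object already carries the four signals as numeric fields. The names `WEIGHTS`, `rankingScore`, and `rankRepos` are hypothetical and are not taken from `js/app.js`:

```javascript
// Weights copied from the formula in the PR description.
const WEIGHTS = { stars: 0.5, activity: 0.2, language: 0.2, topic: 0.1 };

// Hypothetical helper: combines the four signals into a single ranking score.
// Missing signals default to 0 so incomplete data does not produce NaN.
function rankingScore({ stars = 0, activityScore = 0, languageScore = 0, topicScore = 0 }) {
  return (
    stars * WEIGHTS.stars +
    activityScore * WEIGHTS.activity +
    languageScore * WEIGHTS.language +
    topicScore * WEIGHTS.topic
  );
}

// Sorts candidates from highest to lowest score without mutating the input.
function rankRepos(candidates) {
  return [...candidates].sort((a, b) => rankingScore(b) - rankingScore(a));
}
```

In this sketch the star count is not normalized, so a repository with a large star count can outweigh the other three terms combined.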

Additional improvements:
• Safe handling of missing topic data
• Display repository topics in recommendation cards
• Improved recommendation accuracy using contributor interests

Summary by CodeRabbit

New Features
- Contributor Activity Score and Top Languages are displayed in results; recommendations incorporate contributor topics, activity and language relevance.
Performance
- Short-term response caching (10-minute TTL) reduces network calls.
Documentation
- Contributing guide replaced with a structured workflow, branching conventions, code style, testing and clearer PR steps.

Jayant2908 · 2026-03-05T20:19:36Z

Hey, really good changes. I have some similar changes in progress, along with one major change. I'll push them soon and you can iterate on that. Thank you!

tejeshvenkat · 2026-03-05T21:01:04Z

Thanks for the feedback!

That sounds great. I’ll wait for the upcoming changes and then update this PR to align with the new implementation. Happy to iterate on it further.

owasp-blt · 2026-03-19T05:07:54Z

👋 Hi @tejeshvenkat!

This pull request needs a peer review before it can be merged. Please request a review from a team member who is not:

The PR author
coderabbitai
copilot

Once a valid peer review is submitted, this check will pass automatically. Thank you!

⚠️ Peer review enforcement is active.

coderabbitai · 2026-03-19T05:08:04Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7c9d783b-3ac6-4d6c-abda-317f9547c149

📥 Commits

Reviewing files that changed from the base of the PR and between bcef24c and 2e9c152.

📒 Files selected for processing (2)

index.html
js/app.js

✅ Files skipped from review due to trivial changes (1)

index.html

🚧 Files skipped from review as they are similar to previous changes (1)

js/app.js

Walkthrough

Added client-side caching (localStorage, 10-minute TTL), extended recommendation logic to use public events and repo languages/topics (activity score, top languages), updated UI to show contributor activity score and top languages, and revised README contributing instructions and API endpoint documentation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation Updates `README.md` | Removed a documented GitHub REST API endpoint and replaced the previous CONTRIBUTING instructions with a structured guide: branching conventions, files to edit, `npm test`, code style, PR steps targeting `main`, and guidance on contributions. |
| UI Additions `index.html` | Added DOM elements for contributor metrics: `#contributor-activity-score`, `#contributor-activity-label`, and `#top-languages-list`; inserted placeholders and adjusted markup ordering. |
| Caching & Recommendation Engine `js/app.js` | Introduced `CACHE_TTL`, `getCachedData()`, `setCachedData()` using `localStorage` (10-minute TTL) with safe parsing. Modified submit flow to use cache-first for repos/events, added error handling/fallbacks, changed `buildRecommendations(userData, repos)` → `buildRecommendations(userData, repos, eventsData = [])`, computed `activity_score` and `activity_breakdown`, derived `top_languages`, integrated contributor topics into scoring, augmented `github_stats` with new fields, and updated `displayResults` and generated Markdown to show activity and top languages. |
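
To illustrate the caching row above, here is a minimal sketch of TTL-based helpers with safe parsing. The names `CACHE_TTL`, `getCachedData`, and `setCachedData` come from the summary; the signatures, the `{ timestamp, data }` entry shape, and the injectable `storage` and `now` parameters are assumptions made so the sketch can also run outside a browser:

```javascript
const CACHE_TTL = 10 * 60 * 1000; // 10 minutes, in milliseconds

// Returns the cached value, or null when the entry is missing, expired, or unparsable.
function getCachedData(key, storage = globalThis.localStorage, now = Date.now()) {
  try {
    const raw = storage.getItem(key);
    if (raw == null) return null; // nothing stored under this key
    const entry = JSON.parse(raw);
    if (!entry || typeof entry.timestamp !== "number") return null;
    if (now - entry.timestamp > CACHE_TTL) {
      storage.removeItem(key); // drop the stale entry
      return null;
    }
    return entry.data;
  } catch (err) {
    return null; // corrupted JSON or unavailable storage counts as a miss
  }
}

// Stores the value together with the time it was written.
function setCachedData(key, data, storage = globalThis.localStorage, now = Date.now()) {
  try {
    storage.setItem(key, JSON.stringify({ timestamp: now, data }));
  } catch (err) {
    // Quota exceeded or storage disabled: caching is best-effort, so ignore.
  }
}
```

In the browser the defaults apply, so a call such as `getCachedData("github_repos_" + username)` would read from `localStorage` directly.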

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant App as App (js/app.js)
    participant Cache as localStorage
    participant API as GitHub API
    participant Recommend as buildRecommendations()
    participant UI as DOM

    User->>App: Submit username form
    App->>Cache: Check `github_repos_${username}` cache
    alt Cache hit
        Cache-->>App: Return cached repos
    else Cache miss
        App->>API: GET /users/{username}/repos (Accept: topics)
        API-->>App: Repos data
        App->>Cache: Store repos with TTL
    end

    App->>Cache: Check `github_events_${username}` cache
    alt Cache hit
        Cache-->>App: Return cached events
    else Cache miss
        App->>API: GET /users/{username}/events
        alt API success
            API-->>App: Events data
            App->>Cache: Store events with TTL
        else API error
            App-->>App: Use default events = []
        end
    end

    App->>Recommend: buildRecommendations(userData, repos, events)
    Recommend->>Recommend: Compute activity_score & activity_breakdown
    Recommend->>Recommend: Extract top_languages from repos
    Recommend->>Recommend: Score topics & rank recommendations
    Recommend-->>App: Recommendations + github_stats (activity_score, top_languages)
    App->>UI: Display activity_score, top_languages, and recommendations
    UI-->>User: Show results
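
The cache-first branches in the diagram can be sketched as a single helper. This is an illustration only: `fetchArrayCacheFirst`, its parameters, and the `cache` object with `get`/`set` are hypothetical, and the sketch deliberately stores only successful array responses, which is what the review in this thread asks for rather than a confirmed description of `js/app.js`:

```javascript
// Hypothetical helper: try the cache first, then the network.
// `cache` is any object with get(key) and set(key, value), for example a Map.
// `fetchFn` defaults to the global fetch and can be replaced in tests.
async function fetchArrayCacheFirst(url, cacheKey, cache, fetchFn = globalThis.fetch) {
  const cached = cache.get(cacheKey);
  if (Array.isArray(cached)) return cached; // cache hit

  try {
    const response = await fetchFn(url, {
      headers: { Accept: "application/vnd.github+json" },
    });
    if (!response.ok) return []; // API error: fall back to [] without caching
    const data = await response.json();
    if (!Array.isArray(data)) return []; // unexpected payload: do not cache
    cache.set(cacheKey, data); // store only valid, successful results
    return data;
  } catch (err) {
    return []; // network failure: same empty fallback as in the diagram
  }
}
```

A call for the events branch might look like `fetchArrayCacheFirst("https://api.github.com/users/" + username + "/events", "github_events_" + username, cache)`.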

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels: quality: high

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: improve repository recommendations using GitHub topic matching' accurately reflects the main change: integrating GitHub topics into the recommendation algorithm to improve ranking. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

js/app.js (1)
174-208: ⚠️ Potential issue | 🟠 Major

topicScore is self-referential with the current candidate set.

contributorTopics is built from the same repos array that you later rank, so every repo with topics already matches its own topics. In practice this is mostly a bonus for “has more topics,” not a relevance signal for contributor interests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@js/app.js` around lines 174 - 208, The topic scoring is biased because
contributorTopics is built from the same repos being scored, so each repo
matches its own topics; change getTopicScore to compute contributor interest
topics excluding the candidate repo (or accept a precomputed interest set keyed
by repo id) and compare the candidate repo.topics against that exclusion set;
keep the fallback relevantTopics list logic for when the exclusion set is empty,
and update references to contributorTopics/relevantTopics in getTopicScore so it
uses the exclusion set (use repo.id or another unique repo identifier to exclude
the candidate).
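
One possible shape for the suggested fix, as a sketch rather than the code in `js/app.js`: build the interest set from every repo except the candidate, then fall back to a fixed list when that set is empty. The `getTopicScore` signature, the `topicsOf` helper, and the contents of `FALLBACK_TOPICS` are assumptions for illustration:

```javascript
// Lowercased topic list for a repo; tolerates a missing or malformed `topics` field.
function topicsOf(repo) {
  return Array.isArray(repo.topics) ? repo.topics.map((t) => String(t).toLowerCase()) : [];
}

// Hypothetical fallback list, used only when no contributor interests can be derived.
const FALLBACK_TOPICS = ["security", "owasp"];

// Counts distinct candidate topics that also appear in the contributor's OTHER repos.
function getTopicScore(candidate, contributorRepos, fallbackTopics = FALLBACK_TOPICS) {
  const interests = new Set();
  for (const repo of contributorRepos) {
    if (repo.id === candidate.id) continue; // exclude the candidate itself
    for (const topic of topicsOf(repo)) interests.add(topic);
  }
  const reference = interests.size > 0 ? interests : new Set(fallbackTopics);

  let score = 0;
  for (const topic of new Set(topicsOf(candidate))) {
    if (reference.has(topic)) score += 1;
  }
  return score;
}
```

With the exclusion in place, a repo only scores on topics it shares with the contributor's other repos, so simply having many topics no longer raises the score.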

🧹 Nitpick comments (1)

README.md (1)
306-312: Don’t present a placeholder npm test as required validation.

This asks contributors to run npm test and then immediately says the command is only a placeholder. Until there is a real automated check behind it, I’d rename this step to manual validation and point contributors at the smoke-test workflow above.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 306 - 312, Update the README section titled "### 3.
Run Tests Before Submitting" to stop presenting `npm test` as a required
validation step when it’s just a placeholder: rename the section to something
like "Manual validation before submitting", remove or de-emphasize the `npm
test` command as an automated check, and instead reference the existing
smoke-test workflow (mentioned earlier in the README) as the authoritative
pre-submit check and/or provide explicit manual steps to perform; ensure the
text around the "npm test" snippet clarifies it is a placeholder and not an
automated gate.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@index.html`:
- Around line 282-300: The Contributor Activity section is using inline analysis
that is out-of-sync with the main app bundle, so update the page to use the
single source of truth: remove or disable the inline analyzer and instead import
and invoke the analysis/rendering from js/app.js (or if you prefer to keep
inline, call the same exported functions from js/app.js). Specifically, ensure
the js/app.js logic that computes activity_score and top_languages updates the
DOM elements with IDs contributor-activity-score and top-languages-list (or
expose and call renderActivityScore() and renderTopLanguages() from js/app.js),
so the values replace the placeholder "0" and "—" after a real analysis runs.

In `@js/app.js`:
- Around line 153-159: In buildRecommendations, activityScore is computed once
for the whole user (eventsData → activityScore) and then added to every repo, so
it doesn't affect ordering; either compute a per-repo activity score by
filtering eventsData by event.repo.name for each repo (use event.repo.name to
attribute PushEvent/PullRequestEvent/IssuesEvent weights when calculating
repoActivityScore inside the loop that processes repos) and use that
repoActivityScore in the ranking formula, or remove activityScore from the
ranking and only include the global activityScore as display-only metadata in
github_stats; update any references to activityScore in buildRecommendations
(and the similar block around lines 210-216) accordingly.
- Around line 104-128: The code currently converts any non-OK GitHub fetch into
an empty array and persists it, which can poison the cache on transient errors;
update both repo and event fetch flows (symbols: getCachedData, setCachedData,
reposResponse, eventsResponse, reposData, eventsData) to only call setCachedData
when the HTTP response is ok and the parsed JSON is a valid array, and avoid
persisting to or overwriting the cache when responses are not ok (instead
return/keep undefined or the previous cache and optionally log the response
status/error); ensure the events try/catch also doesn’t call setCachedData on
fetch failures or non-ok responses.
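
For the first `js/app.js` comment above (lines 153-159), a per-repo activity score could be sketched as follows. The event types are the ones named in the comment; the numeric weights and the `repoActivityScore` name are hypothetical, since the actual weights are not shown in this thread. The sketch matches on `event.repo.name`, which the GitHub events API reports in `owner/name` form:

```javascript
// Hypothetical weights per event type (illustrative values only).
const EVENT_WEIGHTS = new Map([
  ["PushEvent", 3],
  ["PullRequestEvent", 5],
  ["IssuesEvent", 2],
]);

// Sums the event weights attributed to one repository.
// `repoFullName` is expected in "owner/name" form, like `full_name` on a repo object.
function repoActivityScore(eventsData, repoFullName) {
  if (!Array.isArray(eventsData)) return 0;
  return eventsData
    .filter((event) => event && event.repo && event.repo.name === repoFullName)
    .reduce((total, event) => total + (EVENT_WEIGHTS.get(event.type) ?? 0), 0);
}
```

Because the score now differs between repos, it can change their relative order, which a single user-wide `activityScore` added to every repo cannot do.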

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 1c9aacd2-0032-42ad-a272-758523934595

📥 Commits

Reviewing files that changed from the base of the PR and between ebec537 and 75ec1cd.

📒 Files selected for processing (3)

README.md
index.html
js/app.js

karunarapolu

Suggestion: The ranking formula might work better if topics and languages were given more weight than stars.

tejeshvenkat · 2026-03-22T17:35:32Z

Good idea. I can tune the weights in a follow-up or explain the catalog scoring.

owasp-blt Bot mentioned this pull request Mar 9, 2026

fix: action button ui inconsistency #22

Merged

tejeshvenkat added 4 commits March 19, 2026 10:35

docs: add architecture overview, API usage, local development, and contribution guidelines

1e3fc75

feat: add contributor activity signals using GitHub events API

20abca0

feat: add contributor language detection and language match ranking

b8902bb

feat: add GitHub API caching to reduce rate limits and improve performance

75ec1cd

tejeshvenkat force-pushed the feature-topic-matching branch from 5d484a2 to 75ec1cd on March 19, 2026 05:07

owasp-blt Bot added the needs-peer-review label (PR needs peer review) Mar 19, 2026

coderabbitai Bot added the quality: high label Mar 19, 2026

coderabbitai Bot requested changes Mar 19, 2026

Comment thread index.html

Comment thread js/app.js Outdated

Comment thread js/app.js Outdated

refactor: move sanitizeExternalUrl to module scope and apply to all external links

bcef24c

coderabbitai Bot removed the quality: high label Mar 19, 2026

merge upstream/main; resolve index.html + js/app.js - external script, catalog, cache, events, topic matching

2e9c152

coderabbitai Bot added the quality: high label Mar 22, 2026

coderabbitai Bot approved these changes Mar 22, 2026

karunarapolu reviewed Mar 22, 2026

Tejas-Ladhani approved these changes Mar 22, 2026

owasp-blt Bot added the has-peer-review label (PR has received peer review) and removed the needs-peer-review label (PR needs peer review) Mar 22, 2026

tejeshvenkat wants to merge 6 commits into OWASP-BLT:main from tejeshvenkat:feature-topic-matching

4 participants
